Lab Solution 5: rwlocks and Event Barriers

The lab checkoff sheet for all students can be found right here.

Get Started

Before starting, go ahead and clone the lab5 folder, which contains the test framework for the EventBarrier class discussed in Problem 2.

git clone /usr/class/cs110/repos/lab5/shared lab5

Problem 1: Read-Write Locks

The following was a final exam question from a few years ago. The students were expected to write the code that follows.

The read-write lock (implemented by the rwlock class) is a mutex-like class with three public methods:

class rwlock {
 public:
  rwlock();
  void acquireAsReader();
  void acquireAsWriter();
  void release();

 private:
  // object state omitted
};

Any number of threads can acquire the lock as a reader without blocking one another. However, if a thread acquires the lock as a writer, then all other acquireAsReader and acquireAsWriter requests block until the writer releases the lock. Waiting for the write lock will block until all readers release the lock so that the writer is guaranteed exclusive access to the resource being protected. This is useful if, say, you use some shared data structure that only very periodically needs to be modified. All reads from the data structure require you to hold the reader lock (so as many threads as you want can read the data structure at once), but any writes require you to hold the writer lock (giving the writing thread exclusive access).

The implementation ensures that as soon as one thread tries to get the writer lock, all other threads trying to acquire the lock—either as a reader or a writer—block until that writer gets the locks and releases it. That means the state of the lock can be one of three things:

Ready, meaning that no one is trying to get the write lock.
Pending, meaning that someone is trying to get the write lock but is waiting for all the readers to finish.
Writing, meaning that someone is writing.

The following solution (written by Jerry Cain) relies on two mutexes and two condition_variable_anys. Here is the full interface for the rwlock class:

class rwlock {
 public:
  rwlock(): numReaders(0), writeState(Ready) {}
  void acquireAsReader();
  void acquireAsWriter();
  void release();

 private:
  int numReaders;
  enum { Ready, Pending, Writing } writeState;
  mutex readLock, stateLock;
  condition_variable_any readCond, stateCond;
};

And here are the implementations of the three public methods:

void rwlock::acquireAsReader() {
  lock_guard<mutex> lgs(stateLock);
  stateCond.wait(stateLock, [this]{ return writeState == Ready; });
  lock_guard<mutex> lgr(readLock);
  numReaders++;
}

void rwlock::acquireAsWriter() {
  stateLock.lock();
  stateCond.wait(stateLock, [this]{ return writeState == Ready; });
  writeState = Pending;
  stateLock.unlock();
  lock_guard<mutex> lgr(readLock);
  readCond.wait(readLock, [this]{ return numReaders == 0; });
  writeState = Writing;
}

void rwlock::release() {
  stateLock.lock();
  if (writeState == Writing) {
    writeState = Ready;
    stateLock.unlock();
    stateCond.notify_all();
    return;
  }

  stateLock.unlock();
  lock_guard<mutex> lgr(readLock);
  numReaders--;
  if (numReaders == 0) readCond.notify_one();
}

Very carefully study the implementation of the three methods, and answer the questions that appear below. This lab problem is designed to force you to really internalize the condition_variable_any—one of the more difficult concepts in the entire threading segment of the course, and understand how it works.

The implementation of acquireAsReader acquires the stateLock (via the lock_guard) before it does anything else, and it doesn’t release the stateLock until the method exits. Why can’t the implementation be this instead?

void rwlock::acquireAsReader() {
  stateLock.lock();
  stateCond.wait(stateLock, [this]{ return writeState == Ready; });
  stateLock.unlock();
  lock_guard<mutex> lgr(readLock);
  numReaders++;
}

The implementation of acquireAsWriter acquires the stateLock before it does anything else and it releases the stateLock just before it acquires the readLock. Why can’t acquireAsWriter adopt the same approach as acquireAsReader and just hold onto stateLock until the method returns?
Notice that we have a single release method instead of releaseAsReader and releaseAsWriter methods. How does the implementation know if the thread acquired the rwlock as a writer instead of a reader (assuming proper use of the class)?
The implementation of release relies on notify_all in one place and notify_one in another. Why are those the correct versions of notify to call in each case?
A thread that owns the lock as a reader might want to upgrade its ownership of the lock to that of a writer without releasing the lock first. Besides the fact that it’s a waste of time, what’s the advantage of not releasing the read lock before re-acquiring it as a writer, and how could the implementation of acquireAsWriter be updated so it can be called after acquireAsReader without an intervening release call?

Solution

Why can't the implementation be the suggested alternative? Assume just two threads:
- Thread 1 calls acquireAsReader and is swapped off after third of five lines.
- Thread 2 calls and progresses through all of acquireAsWriter.
- Thread 1 progresses through rest of acquireAsReader.
- We have one reader and one writer, and that’s forbidden.
Why can't acquireAsWriter adopt the same approach as acquireAsReader? If the writer doesn’t release stateLock before waiting for the number of readers to fall to 0, it blocks readers trying to release their locks from decrementing numReaders.
How does the implementation know if the thread acquired the rwlock as a writer instead of a reader? The implementation is such that the write state can only be Writing when there’s one write lock and zero read locks. When the write state is Writing, then only one thread could possibly be calling release, unless the class is being used improperly.
notify_one vs. notify_all answer: Any number of threads might be waiting for a Ready state so they can advance to acquire read locks. The writer needs to notify all of them when it releases the write lock. At most one thread—a writer—can be waiting for the number of readers to be 0, so calling notify_one is sufficient.
Not releasing the read lock before re-acquiring it as a writer, and updating the implementation:
- One advantage: fewer mutexes and condition_variable_anys need to be waited on, so the chance that the threads trying to upgrade the lock are forced to yield the processor is much, much smaller.
- Another advantage: using the underlying thread (which are pthreads) and its support for priorities, you can give threads that are trying to upgrade higher priority so they get the processor before lower priority threads do.
- Implementation idea: change the acquireAsWriter to accept a bool to state whether it holds a read lock already.
- Better implementation idea: update the rwlock to maintain a set of thread ids (which are all the underlying pthread_t’s really are) that hold a read lock, and if the thread trying to upgrade finds its thread id in the set, then it knows it’s upgrading. It can then wait until numReaders == numUpgraders instead of numReaders == 0. Couple this with higher thread priorities and you can ensure that exactly one upgrading thread succeeds while all others wait. It’s true that all of the other upgraders technically have a read lock, but they’re blocked inside acquireAsWriter, so they’re not actually reading whatever data structure is being accessed, and that's okay.

Problem 2: Event Barriers

An event barrier allows a group of one or more threads—we call them consumers—to efficiently wait until an event occurs (i.e. the barrier is lifted by another thread, called the producer). The barrier is eventually restored by the producer, but only after consumers have detected the event, executed what they could only execute because the barrier was lifted, and notified the producer they’ve done what they need to do and moved past the barrier. In fact, consumers and producers efficiently block (in lift and past, respectively) until all consumers have moved past the barrier. We say an event is in progress while consumers are responding to and moving past it.

The EventBarrier implements this idea via a constructor and three zero-argument methods called wait, lift, and past. The EventBarrier requires no external synchronization, and maintains enough internal state to track the number of waiting consumers and whether an event is in progress. If a consumer arrives at the barrier while an event is in progress, wait returns immediately without blocking.

The following test program and sample run illustrate how the EventBarrier works. The backstory for the sample run: weary travelers approach a castle only to be blocked by a castle gate. The travelers wait until the gatekeeper lifts the gate, allowing the travelers to pass through. The gatekeeper only lowers the gate after all travelers have passed through the gate, and the travelers only proceed toward the castle once all have passed through the gate.

// A list of the names of the "traveler threads" and its length
static const string kTravelerNames[] = { "Peter", "Paul", "Mary", "Manny", "Moe", "Jack" };
static const int kTravelerNamesLength = sizeof(kTravelerNames) / sizeof(kTravelerNames[0]);


/* The gatekeeper sleeps for a random amount of time, then opens the castle gate
 * by lifting the event barrier.  That will block until all travelers have passed
 * through, at which point this function returns.
 */
static void gatekeeper(EventBarrier& castleGate) {
  sleep(random() % 5 + 7);
  cout << oslock << "Gatekeeper opens the gate." << endl << osunlock;
  castleGate.lift();
  cout << oslock << "Gatekeeper closes the gate, knowing all have passed." 
       << endl << osunlock;
}

/* A traveler sleeps for some amount of time to simulate approaching the
 * the castle gate.  Then it waits for the castle gate to open.  Once the 
 * gate is opened, the traveler sleeps for some amount of time to simulate 
 * passing through the gate.  Then, it waits for all others to pass through
 * before returning.
 */
static void traveler(const string& name, EventBarrier& castleGate) {
  cout << oslock << name << " walks toward the castle gate." << endl << osunlock;
  sleep(random() % 3 + 3);
  cout << oslock << name << " arrives at the castle gate, must wait." << endl << osunlock;
  castleGate.wait();
  cout << oslock << name << " detects castle gate lifted, starts entering." << endl << osunlock;
  sleep(random() % 3 + 2);
  cout << oslock << name << " has passed through the castle gate." << endl << osunlock;
  castleGate.past();
  cout << oslock << name << " carries on, knowing all others have passed." << endl << osunlock;
}

int main(int argc, char *argv[]) {
  // An EventBarrier to model the gate.  The "barrier" represents the gate.
  EventBarrier castleGate;

  // Spawn the traveler and gatekeeper threads
  thread travelers[kTravelerNamesLength];
  for (size_t i = 0; i < kTravelerNamesLength; i++) {
    travelers[i] = thread(traveler, kTravelerNames[i], ref(castleGate));
  }
  thread gatekeeperThread(gatekeeper, ref(castleGate));

  // Wait for all threads to finish
  for (thread& traveler : travelers) traveler.join();
  gatekeeperThread.join();

  return 0;
}

$ ./ebtest
Peter walks toward the castle gate.
Paul walks toward the castle gate.
Mary walks toward the castle gate.
Manny walks toward the castle gate.
Moe walks toward the castle gate.
Jack walks toward the castle gate.
Mary arrives at the castle gate, must wait.
Peter arrives at the castle gate, must wait.
Paul arrives at the castle gate, must wait.
Manny arrives at the castle gate, must wait.
Jack arrives at the castle gate, must wait.
Moe arrives at the castle gate, must wait.
Gatekeeper opens the gate.
Mary detects castle gate lifted, starts entering.
Peter detects castle gate lifted, starts entering.
Paul detects castle gate lifted, starts entering.
Manny detects castle gate lifted, starts entering.
Jack detects castle gate lifted, starts entering.
Moe detects castle gate lifted, starts entering.
Mary has passed through the castle gate.
Peter has passed through the castle gate.
Paul has passed through the castle gate.
Jack has passed through the castle gate.
Manny has passed through the castle gate.
Moe has passed through the castle gate.
Moe carries on, knowing all others have passed.
Gatekeeper closes the gate, knowing all have passed.
Mary carries on, knowing all others have passed.
Peter carries on, knowing all others have passed.
Paul carries on, knowing all others have passed.
Jack carries on, knowing all others have passed.
Manny carries on, knowing all others have passed.

Your lab5 folder includes event-barrier.h, event-barrier.cc, and ebtest.cc, and typing make should generate an executable called ebtest that you can run to ensure that the EventBarrier class you'll flesh out in event-barrier.h and .cc are working properly. The one exercise in this lab that has you do any coding is this one, as it expects you complete the implementation stub you've been supplied with.

Solution

// .h
class EventBarrier {
public:  
   * This constructor uses an initialization list to initialize its current private fields.
   * An initialization list is a list of instance variables between the () and the { and
   * how they should be initialized.  Here, numWaiting is initialized to 0, and
   * currentlyLifted to false.
   */
  EventBarrier(): numWaiting(0), currentlyLifted(false) {}
  void wait();
  void lift();
  void past();

private:
  // The number of threads currently blocked in a call to `wait`, waiting for the barrier to lift.
  size_t numWaiting;

  // true if the barrier is lifted, false otherwise.
  bool currentlyLifted;

  // A lock around both numWaiting and currentlyLifted
  std::mutex metadataLock;

  // waited on both for the barrier to be lifted, and for all those waiting to pass.
  std::condition_variable_any barrierCV;
}

// .cc

void EventBarrier::wait() {
  // acquire the lock before modifying numWaiting or accessing currentlyLifted
  lock_guard<mutex> metadataLockGuard(metadataLock);

  // We are now waiting for the barrier to be lifted
  numWaiting++;

  // Wait for the barrier to be lifted.  If already lifted, this returns immediately.
  barrierCV.wait(metadataLock, [this] { return currentlyLifted; });
}

void EventBarrier::lift() {
  // acquire the lock before modifying numWaiting or accessing currentlyLifted
  lock_guard<mutex> metadataLockGuard(metadataLock);

  // Lift the barrier
  currentlyLifted = true;

  // Notify everyone waiting for the barrier that it is lifted
  barrierCV.notify_all();

  /* Wait for all those waiting to pass through the barrier.
   * It's fine if more people call `wait` after we have started
   * sleeping here - we will just wait for them to pass as well.
   */ 
  barrierCV.wait(metadataLock, [this] { return numWaiting == 0; });

  // Lower the barrier
  currentlyLifted = false;
}

void EventBarrier::past() {
  // acquire the lock before modifying numWaiting or accessing currentlyLifted
  lock_guard<mutex> metadataLockGuard(metadataLock);

  // We have passed the barrier
  numWaiting--;

  if (numWaiting > 0) {
    // If others are still waiting, wait for them to pass as well
    barrierCV.wait(metadataLock, [this] { return numWaiting == 0; });
  } else {
    // If noone else is waiting, signal the lifting thread that all have passed
    barrierCV.notify_all();
  }
}

Checkoff Questions Solutions

Problem 1a: The implementation of acquireAsReader acquires the stateLock (via the lock_guard) before it does anything else, and it doesn't release the stateLock until the method exits. The lab handout presents an alternate implementation of acquireAsReader where the stateLock is released even sooner. Why doesn't that work?
- Thread 1 calls acquireAsReader and is swapped off after third of five lines.
- Thread 2 calls and progresses through all of acquireAsWriter.
- Thread 1 progresses through rest of acquireAsReader.
- We have one reader and one writer, and that’s forbidden.
Problem 1b: The implementation of rwlock::release relies on notify_all in one place and notify_one in another. Why are those the correct versions of notify to call in each case? Any number of threads might be waiting for a Ready state so they can advance to acquire read locks. The writer needs to notify all of them when it releases the write lock. At most one thread—a writer—can be waiting for the number of readers to be 0, so calling notify_one is sufficient.
Problem 2a: An EventBarrier causes threads to block in various places - in wait, if the block has not yet been lifted, in lift, if not all waiting have passed yet, and in past, if all others waiting have not yet passed. How are each of these blocks implemented in code? Each of these is implemented via a condition variable. For wait, until currentlyLifted is true, we wait to be signaled by the barrier being lifted. For lift, until numWaiting is 0, we wait to be signaled by a thread going past the barrier. And for past, until numWaiting is 0, we wait to be signaled by the last thread that goes past the barrier.
Problem 2b: There are cases where wait and lift never block. For example, wait will not block if the event is already occurring. lift will never block if there are none waiting for the event to happen. How is it possible for these to not block, given that we always call cv.wait in those places? cv.wait with a lambda function calls the actual cv.wait(mutex) in a while loop with that lambda as the condition to break. So if the condition is true when we initially call cv.wait, we will never actually wait, since we will never loop.