Topic 3 Multithreading (4)

Leedehai
Monday, May 8, 2017
Monday, May 15, 2017

3.6 The semaphore class

3.6.1 Reimplement eat() in 3.5.3

For exams (KOB):
(1) State the advantages of this approach over the busy-waiting approach we initially used to avert the threat of deadlock in 3.5.2.
(2) Can you think of any situations when busy-waiting might be the right approach?
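To make the contrast with busy waiting concrete, here is a minimal sketch of a counting semaphore built from a mutex and a condition variable. It is an assumption about how such a class could look, not the course library's actual implementation, and handoffDemo is a hypothetical helper added only for illustration. The point to notice is that wait() puts the thread to sleep until a permit shows up instead of spinning on the CPU.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

/* Hypothetical stand-in for the course's semaphore class. */
class semaphore {
 public:
  explicit semaphore(int initial = 0) : value(initial) { assert(initial >= 0); }
  semaphore(const semaphore&) = delete;
  semaphore& operator=(const semaphore&) = delete;

  /* Sleep (no spinning) until a permit is available, then take it. */
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }

  /* Return one permit and wake one sleeping waiter, if any. */
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }

 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

/* Hypothetical demo: the child sleeps in wait() until the parent signals. */
static bool handoffDemo() {
  semaphore ready(0);
  bool flag = false;
  std::thread child([&] {
    ready.wait();   /* blocks until ready.signal() has run */
    flag = true;
  });
  ready.signal();
  child.join();
  return flag;
}
```

A busy-waiting version would instead loop on "is value > 0 yet?" and burn a core while doing so.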

3.7 Reader–writer problem (KOB)

This problem describes a situation where threads try to access the same shared resource at the same time. The bounded-buffer version shown here is more commonly known as the producer-consumer problem.
It is different from the dining philosophers problem, where threads compete for resource instances in a cyclic pattern.

Say there is a webpage content generator (the writer) and a browser that parses and renders the webpage's content (the reader). There is a buffer whose capacity is 8 characters, and there are 40 characters to transmit in total. Obviously, the content cannot be delivered in a single transmission.

The main function:

/* main.cc */
#include ...
using namespace std;

/* thread routines, defined in routines.cc (not static, so they can link) */
void writer();
void reader();

int main() {
    /* spawn off two child threads */
    thread w(writer);
    thread r(reader);
    /* join the child threads - don't forget */
    w.join();
    r.join();
    return 0;
}

The implementation of the thread routines:

/* routines.cc */
#include ...
using namespace std;

static const size_t kNumTotalChars = 40;
static const size_t kBufferSize = 8;
static char buffer[kBufferSize];
semaphore full(0);
semaphore empty(kBufferSize); /* or pass in a smaller positive number */

void writer() {
    for (size_t i = 0; i < kNumTotalChars; i++) {
        /* lock one empty slot. If none is empty, wait */
        empty.wait();
        buffer[i % kBufferSize] = generateOneChar();
        /* increment the number of full slots by 1 */
        full.signal();
    }
}

void reader() {
    for (size_t i = 0; i < kNumTotalChars; i++) {
        /* lock one full slot. If none is full, wait */
        full.wait();
        char ch = buffer[i % kBufferSize];
        parseOneChar(ch);
        /* increment the number of empty slots by 1 */
        empty.signal();
    }
}
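The two routines above can be tried out in a single file. The sketch below is one possible self-contained version, under several assumptions: the semaphore class is a hypothetical stand-in for the course's, the writer simply emits the letters a, b, c, ... in place of generateOneChar(), and the reader appends to a string in place of parseOneChar(). The function name transmit is also made up for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>

/* Hypothetical stand-in for the course's semaphore class. */
class semaphore {
 public:
  explicit semaphore(int initial = 0) : value(initial) { assert(initial >= 0); }
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

static const std::size_t kNumTotalChars = 40;
static const std::size_t kBufferSize = 8;

/* Pushes kNumTotalChars characters through a kBufferSize-slot ring buffer
 * and returns what the reader received, in order. */
std::string transmit() {
  char buffer[kBufferSize];
  semaphore full(0);                               /* slots holding unread data */
  semaphore empty(static_cast<int>(kBufferSize));  /* slots free to write into */
  std::string received;

  std::thread writer([&] {
    for (std::size_t i = 0; i < kNumTotalChars; i++) {
      empty.wait();   /* claim a free slot, sleeping if none is free */
      buffer[i % kBufferSize] = static_cast<char>('a' + i % 26);
      full.signal();  /* one more slot is ready to read */
    }
  });
  std::thread reader([&] {
    for (std::size_t i = 0; i < kNumTotalChars; i++) {
      full.wait();    /* claim a filled slot, sleeping if none is filled */
      received.push_back(buffer[i % kBufferSize]);
      empty.signal(); /* the slot may be overwritten now */
    }
  });

  writer.join();
  reader.join();
  assert(received.size() == kNumTotalChars);
  return received;
}
```

Because the writer can never get more than kBufferSize slots ahead of the reader, no unread character is overwritten, and the reader sees the characters in the order they were written.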

3.8 myth-buster: an example of a load balancer

3.8.1 Sequential version: slow (~1 min)

/* Sequential: slow */
#include ...
using namespace std;

static unsigned short kMinMythMachine = 1;
static unsigned short kMaxMythMachine = 32;

static void compileCS110ProcessCountMap(
        const unordered_set<string>& cs110studentIDs,
        map<unsigned short, size_t>& counts) {
    for (unsigned short num = kMinMythMachine; num <= kMaxMythMachine; num++) {
        int numProcesses = getNumProcesses(num, cs110studentIDs);
        if (numProcesses >= 0) { /* -1 expresses networking failure */
            counts[num] = numProcesses;
            cout << "myth" << num << " has this many CS110-student processes: "
                 << numProcesses << endl;
        }
    }
}

Nothing interesting here. The program just traverses all the machines, one after another. It is understandably slow (~1 min), since each query has to wait on communication over the Internet.

3.8.2 Parallel version: faster (~9 sec)

Parallelism can be implemented with multiprocessing or multithreading. We go for the latter here.

When you ssh to a myth machine, you certainly don't like waiting ~1 minute for the load balancer to assign you a machine.

/* Multithreading: faster */
#include ...
using namespace std;

/* globals (in the data segment, not on any thread's stack) to share across threads */
const unordered_set<string> cs110studentIDs;
map<unsigned short, size_t> processCountMap;
static mutex processCountMapLock;

static void countCS110Processes(unsigned short num, semaphore& s) {
    int numProcesses = getNumProcesses(num, cs110studentIDs);
    if (numProcesses >= 0) {
        processCountMapLock.lock();   /* write to the shared: lock it first! */
        processCountMap[num] = numProcesses;
        processCountMapLock.unlock(); /* done writing: unlock */
        cout << oslock << "myth" << num
             << " has this many CS110-student processes: "
             << numProcesses << endl << osunlock;
    }
    /* the thread completes and returns its permission slip back */
    s.signal(on_thread_exit);
    /* s.signal(on_thread_exit) signals the semaphore ON the thread's
     * exit; s.signal() does so BEFORE the thread's exit,
     * when this thread may not be truly done yet. */
}

static unsigned short kMinMythMachine = 1;
static unsigned short kMaxMythMachine = 32;
static int kMaxNumThrds = 8; /* max. number of child threads at a time */

static void compileCS110ProcessCountMap() {
    vector<thread> threads;
    /* used to limit number of threads */
    semaphore numAllowed(kMaxNumThrds);
    for (unsigned short num = kMinMythMachine; num <= kMaxMythMachine; num++) {
        /* wait for a permission to create a new child thread */
        numAllowed.wait();
        threads.push_back(thread(countCS110Processes, num, ref(numAllowed)));
    }
    for (thread& t: threads) t.join(); /* remember to join every child thread */
}

Also, the lines below ...

processCountMapLock.lock();   /* write to the shared: lock it first! */
processCountMap[num] = numProcesses;
processCountMapLock.unlock(); /* done writing: unlock */

... could be replaced with

{
    lock_guard<mutex> lg(processCountMapLock); /* the constructor locks it */
    processCountMap[num] = numProcesses;
    /* the lock is unlocked when the "lg" object is destroyed, i.e. at the
     * closing brace; without the braces it would stay locked until the end
     * of the enclosing block */
}
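As a quick check of the lock_guard pattern, the following self-contained sketch has several threads write into one shared map, each under a lock_guard. The function name recordAll and the stored values (ten times the machine number) are made up for illustration; they stand in for the real process counts.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

/* Spawns one thread per number in [first, last]; each thread records a
 * made-up value for its number in a shared map, guarded by a lock_guard. */
std::map<unsigned short, std::size_t> recordAll(unsigned short first,
                                                unsigned short last) {
  assert(first <= last);
  std::map<unsigned short, std::size_t> table;
  std::mutex tableLock;
  std::vector<std::thread> threads;

  for (unsigned short num = first; num <= last; num++) {
    threads.push_back(std::thread([&table, &tableLock, num] {
      std::lock_guard<std::mutex> lg(tableLock); /* locked here */
      table[num] = num * 10u;                    /* placeholder value */
    }));                                         /* unlocked when lg dies */
  }
  for (std::thread& t : threads) t.join();
  return table;
}
```

Calling recordAll(1, 32) yields a map with 32 entries. The lock is released even if the guarded statement throws, which is the main reason to prefer lock_guard over paired lock()/unlock() calls.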

A good practice: limit the maximum number of threads allowed at a time, to avoid exhausting the process's memory with thread stacks, overwhelming the kernel's thread manager, or overwhelming the web server your program is connecting to.
The OS may also impose such a limit: when there are too many threads, thread-creation requests are denied.
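The permission-slip idea can be checked with a small experiment. The sketch below is an illustration under assumptions: the semaphore class is a hypothetical stand-in for the course's (it has only the plain signal(), not the on_thread_exit variant), the workers just sleep briefly instead of contacting a machine, and peakConcurrency is a made-up helper that reports the largest number of workers that were ever running their work at the same moment.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

/* Hypothetical stand-in for the course's semaphore class. */
class semaphore {
 public:
  explicit semaphore(int initial = 0) : value(initial) { assert(initial >= 0); }
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

/* Runs numTasks short tasks with at most `limit` of them working at once;
 * returns the highest number of simultaneously working tasks observed. */
int peakConcurrency(int numTasks, int limit) {
  semaphore permits(limit);
  std::atomic<int> running(0);
  std::atomic<int> peak(0);
  std::vector<std::thread> threads;

  for (int i = 0; i < numTasks; i++) {
    permits.wait(); /* wait for a permission slip before spawning */
    threads.push_back(std::thread([&] {
      int now = ++running;
      int seen = peak.load();
      /* raise peak to `now` unless another thread already recorded more */
      while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
      std::this_thread::sleep_for(std::chrono::milliseconds(2)); /* fake work */
      --running;
      permits.signal(); /* hand the slip back once the work is done */
    }));
  }
  for (std::thread& t : threads) t.join();
  return peak.load();
}
```

With 32 tasks and a limit of 8, the reported peak never exceeds 8. Note that with a plain signal() a finished worker may still be winding down when the next thread is spawned, which is exactly the gap the notes' s.signal(on_thread_exit) is meant to close.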

SIDE NOTE:
In fact, however, for myth and big networks like Google, Facebook, Microsoft, etc., users are still not satisfied with this performance: they don't want to wait even for a second! So a common practice is to have the load balancer run the program periodically and pre-compute the assignment.

C++ syntax for object-oriented multithreading, i.e. passing a member function as a thread routine: say that, inside one method of class A, you want to install another method of A, void foobar(int number, semaphore &s), as the thread routine. It can be handed to the thread constructor in any one of the three ways listed below:

/* 1st way: capture everything in the lambda */
thread t([this, n, &sem] { this->foobar(n, sem); });

/* 2nd way: lambda with parameters, arguments passed to the constructor */
thread t([this](int number, semaphore &s) { this->foobar(number, s); },
         n, ref(sem));

/* 3rd way: pointer to member function, followed by the object and arguments */
thread t(&A::foobar, this, n, ref(sem));

/* The template function ref() explicitly says "I'm passing by reference,
 * not by value!". It is needed because the thread constructor copies its
 * variadic arguments by default, whatever the routine's parameter types are. */

/* mutex and semaphore objects are not copyable. Their copy constructors
 * are explicitly deleted in their class declarations. */
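The three forms can be compared side by side in a compilable sketch. Everything beyond the three constructor calls is an assumption made for the example: the semaphore class is a hypothetical stand-in for the course's, foobar just adds its number to a running total, and the method name run is made up.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

/* Hypothetical stand-in for the course's semaphore class. */
class semaphore {
 public:
  explicit semaphore(int initial = 0) : value(initial) { assert(initial >= 0); }
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

class A {
 public:
  /* Starts foobar() on three threads, one per syntax, and returns the total. */
  int run() {
    semaphore sem(0);
    int n = 5;
    /* 1st way: the lambda captures this, n (by value) and sem (by reference) */
    std::thread t1([this, n, &sem] { this->foobar(n, sem); });
    /* 2nd way: arguments go through the constructor; ref() keeps sem a reference */
    std::thread t2([this](int number, semaphore& s) { this->foobar(number, s); },
                   n, std::ref(sem));
    /* 3rd way: pointer to member function, then the object, then the arguments */
    std::thread t3(&A::foobar, this, n, std::ref(sem));

    for (int i = 0; i < 3; i++) sem.wait(); /* one signal per routine */
    t1.join();
    t2.join();
    t3.join();
    return total;
  }

 private:
  void foobar(int number, semaphore& s) {
    {
      std::lock_guard<std::mutex> lg(totalLock);
      total += number;
    }
    s.signal();
  }

  std::mutex totalLock;
  int total = 0;
};
```

A().run() returns 15, since each of the three threads adds 5. Dropping ref() in the 2nd or 3rd form fails to compile, because the semaphore cannot be copied into the new thread.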