Lab Handout 7: ThreadPool and Proxy
The lab checkoff sheet for all students can be found right here.
This week, we'll revisit the previous thread pool assignment and, if there's time, also dive into some questions about your current proxy assignment.
Problem 1: ThreadPool Thought Questions
Presented below are partial implementations of possible ThreadPool::wait and ThreadPool::worker designs:
void ThreadPool::wait() {
  lock_guard<mutex> lg(m);
  done.wait(m, [this]{ return <condition>; });
}

void ThreadPool::worker(size_t workerID) {
  while (true) {
    // code here waits for a thunk to be supplied and then executes it
    lock_guard<mutex> lg(m);
    <update to variables>
    if (<update made condition true>) done.notify_all();
  }
}
- Briefly describe a simple `ThreadPool` test program that would have deadlocked had `ThreadPool::worker` called `done.notify_one` instead of `done.notify_all`.
- Assume `ThreadPool::worker` gets through the call to `done.notify_all` and gets swapped off the processor immediately after the `lock_guard` is destroyed. Briefly describe a situation where a thread that called `ThreadPool::wait` still won't advance past the `done.wait` call.
- Had `done.notify_all` been called unconditionally, the `ThreadPool` would have still worked correctly. Why is this true, and why is the `if` test still the right thing to include?
Consider the two server implementations below, where the sequential `handleRequest` function always takes exactly 1.5 seconds to execute. The two servers would respond very differently if 1000 clients were to connect, one per second, over a 1000-second window. What would the 500th client experience when it tried to connect to the first server? What would the 500th client experience when it tried to connect to the second?
// Server Implementation 1
int main(int argc, char *argv[]) {
  int server = createServerSocket(12345); // sets the backlog to 128
  while (true) {
    int client = accept(server, NULL, NULL);
    handleRequest(client);
  }
}

// Server Implementation 2
int main(int argc, char *argv[]) {
  ThreadPool pool(1);
  int server = createServerSocket(12346); // sets the backlog to 128
  while (true) {
    int client = accept(server, NULL, NULL);
    pool.schedule([client] { handleRequest(client); });
  }
}
Problem 2: proxy Thought Questions
- Some students suggested that your `proxy` implementation open one persistent connection to the secondary proxy when a secondary proxy is used, and that all requests be forwarded over that one connection. Explain why this would interfere with the second proxy's ability to handle the primary proxy's forwarded requests.
- Long polling is a technique where new data may be pushed from server to client by relying on a long-standing HTTP request with a protracted, never-quite-finished HTTP response. Long polling can be used to provide live updates as they happen (e.g. push notifications to your smartphone) or stream large files (e.g. videos via Netflix, YouTube, Vimeo). Unfortunately, the long polling concept doesn't play well with your `proxy` implementation. Why is that?
- HTTP pipelining is a technique where a client issues multiple HTTP requests over a single persistent connection instead of just one. The server can process the requests in parallel and issue its responses in the same order the requests were received. Relying on your understanding of HTTP, briefly explain why `GET` and `HEAD` requests can be safely pipelined but `POST` requests can't be.
- When implementing `proxy`, we could have relied on multiprocessing instead of multithreading to support concurrent transactions. Very briefly describe one advantage of the multiprocessing approach over the multithreading approach, and briefly describe one disadvantage.
- We interact with socket descriptors more or less the same way we interact with traditional file descriptors. Identify one thing you can't do with socket descriptors that you can do with traditional file descriptors, and briefly explain why not.
- The `createClientSocket` function for the `proxy` assignment has a call to `gethostbyname_r`, which has the prototype listed below. This is a reentrant version that is thread-safe, because the caller shares the location of a locally allocated `struct hostent` via argument 2, where the result can be placed, thereby circumventing the caller's dependence on shared, statically allocated, global data. Note, however, that the caller is expected to pass in a large character buffer (as with a locally declared `char buffer[1 << 16]`) and its size via arguments 3 and 4 (e.g. `buffer` and `sizeof(buffer)`). What purpose does this buffer serve?
struct hostent {
  char *h_name;       // real canonical host name
  char **h_aliases;   // NULL-terminated list of host name aliases
  int h_addrtype;     // result's address type, typically AF_INET
  int h_length;       // length of the addresses in bytes (typically 4, for IPv4)
  char **h_addr_list; // NULL-terminated list of host's IP addresses
};
int gethostbyname_r(const char *name, struct hostent *ret,
char *buf, size_t buflen,
struct hostent **result, int *h_errnop);
- Part of the `man` page on the `bind` system call is below. `bind` is used to assign an IP address to a socket, but it is generic enough to be able to assign IPv4 or IPv6 addresses:
NAME
bind - bind a name to a socket
SYNOPSIS
#include <sys/types.h> /* See NOTES */
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr,
socklen_t addrlen);
DESCRIPTION
       When a socket is created with socket(2), it exists in a name space
       (address family) but has no address assigned to it.  bind() assigns
       the address specified by addr to the socket referred to by the file
       descriptor sockfd.  addrlen specifies the size, in bytes, of the
       address structure pointed to by addr.  Traditionally, this operation
       is called "assigning a name to a socket".
It is normally necessary to assign a local address using bind()
before a SOCK_STREAM socket may receive connections (see
accept(2)).
There are actually three different siblings of the sockaddr record family, as shown below (also see the slides here). The first one is a generic socket address structure, the second is specific to traditional IPv4 addresses (e.g. 171.64.64.131), and the third is specific to IPv6 addresses (e.g. 4371:f0dd:1023:5::259), which aren't in widespread use just yet. The addresses of socket address structures like those above are cast to `struct sockaddr *` when passed to all of the various socket-oriented system calls (e.g. `accept`, `connect`, and so forth). How can these system calls tell what the true socket address record type really is (after all, they need to know how the record is laid out) if everything is expressed as a generic `struct sockaddr *`?
struct sockaddr {        // generic
  short sa_family;
  char sa_data[14];
};

struct sockaddr_in {     // IPv4
  short sin_family;
  short sin_port;
  struct in_addr sin_addr;
  char sin_zero[8];
};

struct sockaddr_in6 {    // IPv6
  short sin6_family;
  short sin6_port;
  // other fields;
};