Today, Chris Gregg finishes up nonblocking I/O with two hip examples:
expensive-server.cc
and efficient-server.cc.
Both methods use a single thread to support a server able to quickly respond to multiple client requests,
though the efficient server does so using the epoll system calls to
leverage the OS's ability to support waiting for changes to file descriptors.
Today and Wednesday, we'll introduce the idea of I/O event-driven programming and how it can be used to complement nonblocking I/O techniques. Linux provides I/O event detection facilities via its epoll (short for event polling) library. We'll spend today's lecture discussing three key epoll functions and the various ways they can be combined with nonblocking descriptors to build singly-threaded web servers that can handle tens of thousands of connections. We will discuss the pros and cons of nonblocking I/O while comparing it to blocking I/O approaches relying on multithreading and multiprocessing.
proxy Thought Questionsproxy Thought Questionsproxy Thought Questions Solution
This week's section is completely optional and you get credit for just being you whether or not you participate. We'll adopt a hybrid model where the students decide how much traditional discussion section they want, and how much should operate as traditional office hours. I'm posting the solutions immediately, as they might very well inform how you approach your final programming assignment.
This is the last week of section, so you're off the hook for next week. :)
Today's lecture has Chris Gregg front and center, and it's a different type of lecture than you normally see in CS110. There's no code at all. Rather, it's an exhaustive recap of all of the general systems design principles we've been exposed to over the course of the quarter. Chris makes it clear that a system's design is generally more intellectually captivating than its implementation. In fact, at some point, the coding part becomes easier, but the design of novel, nontrivial systems is never easy.
Today lecture video is still me, and I work through the implementation of createClientSocket and createServerSocket,
so you have some sense of how Linux system calls can be leveraged to provide simple, easy-to-use networking abstractions.
Once I do that, I'll pivot in a short overview of how MapReduce works. I do so because it's an easily described distributed system that relies on every major topic we've relied on this quarter, and it's a huge victory to fully understand how something you've almost certainly heard of before can be implemented using everything you've learned in CS110.
Our final assignment would normally have been to implement a reduced (har) version of MapReduce, and I'll even make it available in mid-June as an unofficial programming exercise that you can work on just for fun. I'm around for the vast majority of the summer, and can even answer questions if you have them while working on it.
I'm back today and lead off with a discussion of how networking is really just a distributed form of traditional function call and return. And in spite of the fact that function calls across the internet are inherently slow, there are many software engineering advantages to hosting the implementation of some core service on a remote server in one location.
I speak a bit about HTTP to the extent it's needed to back true web services. And once I do that, I'll go on to discuss the various data
structures, system calls, and Linux library functions needed to implement createClientSocket
and createServerSocket.
Good morning, everyone! Chris Gregg is still in charge in today's lecture video, and he'll continue his discussion of networking. He explains what it means to be a client application and how to write one, and he even discusses HTTP and how it's used to guide so many conversations taking place over the web—particularly those between browers and virtually all remote web servers.
Chris continues to speak about how to build a simple server, and what it means to build an API.
We'll spend this week's lab answering any general questions you have about threads, conditional variables, semaphores, and so forth. Time permitting, we'll pose several questions about how a server's architecture influences its behavior, and we'll pose additional questions about some various of the various Web products you're familiar with and the protocols they follow.
Today's lecture video breaks into our last major topic: networking. Chris Gregg takes the stage for the next two weeks or so, and today he's introduces us to the idea of a network, the client-server idiom, and how the two endpoints exchange information. In the process, he illustrates why a network connection, once it's been set up, is little more than a bidirectional pipe between processes on different machines. This descriptor abstraction is so much bigger than you previously thought!
You'll receive your very first multithreading assignment this coming Monday, so you'll get plenty of opportunity to code with threads, mutexes, semaphores, and so forth. The lab handout, however, asks that you answer some high level questions about threads and the various directives we've discussed to solve synchronization problems.
Time permitting, we'll work through the implementation of a read-write lock, which allows multiple threads to access shared data in a read-only manner provided no threads are permitted to change that data. If a thread needs to change the data, it can acquire the rwlock, but only fully grab the lock once all reader threads have released their interest in the data.
During today's lecture video, we code through the final implementation of
the myth-buster example introduced this past Monday. We then
wrap up all things multithreading with a program so large that it gets its
very own handout.
Today's lecture video is all me again, and I further discuss the semaphore by re-implementing it, and then using it upgrade the dining philosophers simulation to use it. I'll then move on to a new collection of examples that relies on the mutex and the semaphore to support a few different inter-thread communication patterns you've not seen prior.
assign3 Redux, Threadsassign3 Redux, Threadsassign3 Redux, Threads SolutionI'd like to spend this week's lab reviewing a few things about Assignment 3 that made it challenging. By now, you're well aware how difficult multiprocessing and concurrency can be, but it's not unusual for students to arrive at working Assignment 3 solutions even though they didn't quite understand everything they wrote. Problem 1 has you revisit those programs and work through why they needed to be written the way they were.
The remaining questions both deal with multithreading, and you'll work through one or both of them with the CA leading your discussion section.
Aside: There are a few questions that would normally be included under Problem 1 that I couldn't include this quarter, because they expose partial solutions to Assignment 3, and some people are still working on it. I will, however, add those parts to the handout on Saturday and will include answers to them when I release the solutions on Monday morning.
In the first few minutes or today, I finish up the ticket agents example from last Wednesday so we more fully
understand how the mutex can be used to mark the boundaries of critical regions
so that at most one thread is within it at any one moment.
I then move on to a new example—the dining philosophers simulation—and illustrate how a system can arrive at deadlock and forever starve all threads of from CPU time. That'll be our opportunity to introduce the condition variable and the semaphore as a means of preventing deadlock from ever happening.
First, we'll want to work through a file descriptor and forking example so that we understand how file redirection can be implemented. We actually present a broken implementation, understand why it's broken, and then fix it!
We'll spend the rest of our time scrutinizing a parallel, theoretically interesting implementation of mergesort so you understand how it's architectured and ultimately believe that it works. The discussion section even includes extra problems that you can work through in your spare time. :) They'll help solidify your understanding of virtual memory and process scheduling.
I started in on threading this past Monday, and in today's lecture video, I continue talking about it by working through a collection of introductory examples that illustrate how threads work and the types of concurrency issues that sometimes present when multiple threads are running within a process at the same time.
We do precisely this by working through the remaining examples from this past Monday's slide deck, and then transition over to C++, which in my opinion provides much better support for threading and concurrency than pure C does. (I teach C threads first, however, because C++ threads are implemented in terms of C threads, and you may see pure C threading directives in gdb backtraces.)
I'm back on lecture duty today! Today's video recording spend some time explaining the various OS components that allow multiple processes to be (seemingly) running at the same time, even when there's just one CPU. Also, discuss how each process operates as if it owns all of memory, even though it clearly can't. Key phrases include virtual memory, translation lookaside buffers, process control blocks, and OS scheduler.
The video sneaks in a little bit of time at the end introducing multithreading, which will be the focus of the next three weeks.
This week we'll investigate the use of fork, execvp, and waitpid to build a command line utility called exargs. We'll then work through some short answer questions to prompt you to think more holistically about how multiprocessing and all of the system calls we use to support it work behind the scenes.
The lab handout includes a collection of addtional laptop exercises to elevate your gdb and valgrind acumen, and they are designed to be worked on independently after you've read through them. Both additional problems are really, really neat.
Chris returns to the Disneyland example and works through the series of fixes needed if it's to always work properly, regardless of how and when the child processes exit.
He then speaks to the types of concurrency issues that can present themselves when a signal arrives at the worst possible time and forces a signal handler to execute and manipulate data that was being manipulated by the main program before it was interrupted. System programmers rely of directives called signal sets, signal set masks, and signal blocks to install just enough process and signal coordination to ensure a process can't possibly suffer from any concurrency issues.
Finally, he revisits this past Monday's simplesh example to implant the collection of improvements needed to cull the resources of processes running in the background. The ultimate, A+ solution that fixes all problems relies on a system call named sigsuspend, which in my opinion is the most difficult single system call we need to cover this quarter.
Last Wednesday's lecture introduced execvp as the one system call that reboots a process to execute the main function of a new program. Today's lecture advances that discussion even further by working through the implementation of a tiny shell, much like the one you're using whenever you're using Unix. We'll then introduce the notion of a pipe as a way to set up communication channels between multiple processes.
Your first lab will have you discuss the various ways you can exploit a file system design to implement other functions. You'll also learn some fun valgrind tricks and answer a good number of big-picture, short answer questions about file systems.
This past Monday's lecture introduced fork in its final minutes. Today's lecture works through a collection of etudes—all included in the various Lecture 04 slide decks—to help us establish some expertise on how fork works and how to best manage the concurrency and synchronization concerns that present themselves. We'll then move on to discuss the execvp that's used to effectively reboot a recently forked child process to run a different program from scratch. Today's slide deck is very short, but it contains one substantial example that illustrates how fork, execvp, and waitpid often contribute to a program that relies on helper processes to gets its overall job done.
We work through the rest of our discussion of how a filesystem is layered over raw hardware. Last time, we explain how basic files—be they text files, audio files, executable, videos, etc— are segmented into blocks and distributed across free sectors of the hardware. Today's lecture advances that discussion to be clear how filesystem supports directories, and how an absolute path name like /usr/class/cs110/WWW/index.html can be parsed and resoved to an inode number, which is the location within the inode table that stores meta-information about that file, who can access it, how large it is, and where its payload lives.
The lecture then discusses a few things as a way of transitioning to multiprocessing: how open file sessions are maintained by the OS on a per-process basis, and how system calls differ from traditional function calls. And the last few minutes introduce a new system call that's used to spawn new processes, and this new system call will be the focus on the next two weeks.
We'll work through a few more UNIX Filesystem API examples before advancing on to a discussion of how the raw hardware of a disk is leveraged to look like the filesystem you've taken for granted for the entirety of your UNIX and programming careers. My hope is that by the end of the day you'll have gained a familiarity with the filesystem APIs and begin to understand how filesystems work under the hood.
Good stuff, these filesystems.
We need to invest some time learning some C and some UNIX/Linux libraries that offer us programmatic access to the file system. There are a few data structures and a collection of Linux library functions that allow us to crawl over the tree of files and directories, and there are even more functions that grant us raw, low-level access to the file contents. We'll invest some time discussing these functions and data structures—enough that we can implement a collection of obvious and not-so-obvious programs that emulate some of the terminal/shell builtins you've been using your entire UNIX lives.
Introductions, course administration and expectations. Standard first day stuff.
I'm happy to announce that CS110's very own Ryan Eberhardt is receiving the 2020 Centennial Teaching Assistant Award for his work as a CS106AJ, CS106AX, CS110, and CS110L teaching assistant and instructor. He could have easily graduated a year ago had he wanted to, but he's been sticking around in part because he wants to teach as much as possible before departing. Hit Ryan up on Slack or over email to congratulate him, because he deserves to be congratulated.
Another happy announcement: Three CS110 graduates (one from 2018, but two from this quarter!) have been working on a Mac and Windows desktop application—and no, Glen, no Linux version yet, and don't tell me you weren't about to ask—called Sesame. Maddie Wang, Lainey Wang, and Rachel Naidich have been working on various projects over the course of the year, and Sesame represents the accumulation of all that work so far.
Here's Rachel's elevator pitch:
Sesame recreates the feeling of being with your friends all day, every day, even when we're apart. You're in a house filled with all your closest friends, some with their doors open and some with their doors closed. If it's open, click to hang out instantly. If it's closed, knock to talk later.
CS110 Relevance Alert: they couldn't wait to tell me they needed execvp to get a key portion of the platform working. Cool. Even Doris (@doris on Sesame) was impressed.
CS110 is recognizing #ShutDownSTEM, #ShutDownAcademia, and #Strike4BlackLives with the goal of prompting Stanford and all other higher education institutions to pursue immediate solutions to their own anti-Black racism that don't rely solely on the demands and activism of their communities but also the demands and activism of their top administrators.
There is no daily briefing today, and whatever materials would have been presented today will be presented tomorrow on June 11th. We will still provide support via CS110 channels where students opt in for help, e.g. Piazza, Slack, and office hours.
If you didn't read the two announcements posted to Piazza earlier today, please read them when you get a chance. Those announcements are right here and right here.
The first message mentioned I was relaxing the requirement that you get at least half the points on Assignments 5 or 6. Provided your overall avearge in the course is a 70% or higher, you still meet the requirements needed to pass the course. The second message, posted about an hour later, mentioned that I decided to drop the lowest assignment score and retain the other five. Your top five assignments are weighted equally, and your lowest assignment—even if it's a 0—will be dropped.
I've shared my own answers for each of the questions I posed in the multithreading self-assessment I posted last Friday. Feel free to read through the questions and my answers if and when you have the time and energy.
Forgive the redundant announcement, but it's too important to not repeat everywhere. Because of the current state of the world and the multitude of crises that are currently happening, we've decided to change the late policies for Assignments 5 and 6. We've also changed the section attendance policy for Week 9, which is the final week of sections. Lastly, we've linked some campus support resources that are available to everyone.
There's much more information in this pinned Piazza post, which you should read to get the full details of the new policies and how they apply to you. If you're not able to access the post for whatever reason, contact any member of the course staff and we'll send you the information.
I'll include the last portion of that post here: Understand that CS110 shouldn't be a priority right now. I will do everything I can to help when you're ready to come back. In fact, the entire CS110 staff is here to support you, and we will do everything we can to make things easier.
The multithreading self-assessment is now available, and can be found right here. Many of you are still in full-on thread pool mode, so I'll hold off on posting solutions until Wednesday night so everyone has more than a few days to mull over the problems.
Happy Monday, CS110 community! Three of your fellow students are also in CS278 this quarter, and for their final project they are building on top of
Gabby Delos Reyes, Kavita Selva, and Thomas Hsieh comprise the team, and in their own words: "We are trying to recreate the social aspects and fill the voids that are missing from living in Huang Basement while grinding on—their words, I prefer luxuriate in—CS. This is through a Discord server moderated by our group dedicated to assignment five and six. Feel free to browse the Discord if you'd like to see what the layout looks like."
If you're curious, you should definitely check it out. Gabby, Kavita, and Thomas are all planning on being present during office hours, but the server will be up and running for the duration of the quarter.
I've spent the past few days typing up an Assignment 3 Redux handout, and it's finally done! Feel free to read through it as you have time, and hit me up on Slack or Piazza if you have questions.
Hey there, happy CS110 aggregators!
Assignment 4 grades are being published later tonight and should be available by the time you read this. If while reading over your grade report you notice tests failed because some stray debug statements caused a mismatch that wasn't caught by the sanitycheck, then reach out to Jerry directly so I can fix it for you and rerun the tests. I don't want silly errors like that to impact your grade. 🙃
If you have questions about why a test failed and it can't be attributed to errant debug statements, then engage the CA leading your discussion section first, and he or she can rope me in if they need to. The autograder test names are fairly elaborate, and they all come with detailed descriptions of what was tested, but that still might not enough to fully understand what went wrong when something did, in fact, go wrong. So, of course, reach out to your section CA if you'd like to investigate further.
I've gone ahead and shared my own solutions to the multiprocessing self-assessment I posted this past Friday. As always, you're more than welcome to engage me or the CA leading your discussion section if you have any questions about my answers and/or whether yours would have been equally acceptable.
The multiprocessing self-assessment is now available, and can be found right here. Given that many of you are still working on Assignment 4 until Sunday, I'll refrain from posting answers until Wednesday night. That way those who submit by Sunday will still have some down time to focus on the self-assessment without a looming assignment deadline.
Forgive the redundant announcement, but it's too important to not repeat everywhere. Because of the current myth cluster meltdown and its impact on our ability to effectively work remotely, I've extended the Assignment 4 deadline to this coming Sunday night at 11:59pm. Operate as if that were the deadline all along, so that submitting Monday is one day late, etc. Those who manage to submit by tomorrow's original deadline will be awarded a 5% honorarium. It's a borderline hollow gesture, since there are no S+'s, but it's the bonus I would give in any other querter where I needed to extend the deadline because of infrastructure issues.
There's much more information in this pinned Piazza post.
Your Assignemnt 3 grade reports have just been published, and as always, you can head over to your CS110 Gradebook and drill into your assign3 entry to arrive your report. Overall, everyone did brilliant work. I spent the majority of the weekend coming up with clever but fair tests to exercise your submissions, and the sheer number of you who got everything working perfectly was much higher than I expected. Very, very, very, very, very, very good work, everyone.
If you have a question about why you failed a test and you can't quite figure it out based on the test description and the diff provided in the grade report, feel free to reach out to the CA leading your discussion section, and he or she will reach out to me if they can't answer your questions.
I mentioned this during this morning's CS110 Briefing email, but I'll mention it here as well: I typed up an Assignment 2 Redux handout over the weekend, and that handout can be viewed by clicking right here. I code reviewed a bunch of assign2 submissions, and I saw mostly very good code. But I did see some things that I've also seen in prior quarters, so I wanted to document those observations and get them down in handout form.
Please give it a read, and reach out over Piazza, Slack, or email if you have any questions I can answer.
Here are my thoughts on how you should have been thinking about the self-assessment questions. Feel free to reach out to me or the CA leading your discussion section if you have questions or are curious whether a different answer of yours is also a good one.
As promised, I've put together the equivalent of a take-home midterm with the added bonus that you don't have to turn it in. I'm assuming you're all here to learn, and that you're interested in being challenged to tackle specific systems problems you're more fully outfitted to think about after three weeks of CS110. Take the weekend to read over the problems and think about what you'd have to say if any or all of these appeared on an actual test.
The self-assessment handout is right here. I'll give you the weekend to work through everything, and I'll post my own answers on Monday evening around 7:00pm, after you've had the 72 hours I promised you'd have to work on it.
Assignment 1 submissions have been fully graded, and functionality grade reports can be viewed by visiting your friendly Gradebook and clicking on your assign1 entry to see how you did. Remember that we're not doing code reviews this quarter, but we are willing to help you figure out why something isn't working when your code is failing a test and you can't figure out why. In that case, you can reach out the CA leading your discussion section to see what may have gone wrong.
In general, everyone did terrific work on this first assignment, and I'm optimistic that the remaining five assignments will go equally well over the course of the quarter!
I'm genuinely thrilled to see how diligent and resourceful you all are in your collective pursuit to tame this Assignment 1 monster. Keep doing whatever it is you're doing and you will all prevail!
Also, remember to sign up for labs, which start this week! It looks like a large fraction of you haven't signed up yet, but fortunately, most times are still available. We've even added a few more sections at the times that look to be very popular, so you have all kinds of options!
You can sign up for lab right here!
I just received this in an email from Alex Ayers, who directs the Peer Learning and Tutoring Programs within the Center for Teaching and Learning. He writes:
Want to meet with an experienced peer to discuss course concepts, think through a problem set, or prepare for an upcoming self-assessment? CTL offers appointment tutoring for CS 107 and CS 110, in addition to tutoring for a number of other courses. For more information and to schedule an appointment, visit our tutoring appointments and drop-in schedule page. We also have a variety of remote learning resources and academic coaching available to assist with all of your learning needs!
Everyone, we're still working on a permanent office hours grid for the quarter, but the CAs have teamed up to improvise some office hours at various times today and tomorrow. Check out our nifty office hours page and inspect the Google calendar within it to see when we'll be online.
Greetings from San Francisco, where I'll be hunkered down and working from my kitchen island for the foreseeable future. I've taken the last week to revisit the course to optimize it for remote instruction, and I think I've arrived at a solution that will work equally for all students in all time zones.
First things first! When you get a moment, please fill out this brief survey so we can get to know you better and more easily cater to the needs of our students.
Current enrollment is at 450 students, which is about 275 more students than we expected this quarter. Neat!
To best support everyone across at least eight different time zones, I'll be relying on CS110 lecture videos that were recorded during the Winter of 2019, when Chris Gregg and I last taught the course for SCPD. Those lecture videos are 80 minutes long, so I'll generally post two videos per week--one by 1:30pm CA time on Monday and a second at the same time on Wednesdays--and those two videos will serve as lecture material for the week. The one exception: I'll post a lecture video this coming Friday at 1:30pm as well (but no other Fridays, since those will be set aside for some of the online discussion sections.)
In past quarters, there have been eight assignments, but this quarter there will only be six. I'm excited about the opportunity to scale back the assignment set to cater to the demands of an unpredictable timeline while still meeting the post conditions of the course.
As opposed to, you know, every other quarter in the history of quarters, there will be no CS110 exams. I'm optimizing for a version of the class that avoids high-stakes exams at all costs. I will, however, allow students to work on what I call self-assessments every few weeks over the course of a weekend when no assignment is due. These self-assessments are completely optional and won't be graded, and you get no credit for participating. The assessments will be written so that they could be completed in about 45 minutes by someone who is completely on top of the material, but I'll give you a full weekend to think about them before I distribute answers.
The only synchronous part of the entire course will be the discussion sections. Discussion section participation is required, so we'll accommodate everyone across all time zones by providing discussion section times that work for everyone. We'll publish discussion section times on Monday, April 13th at 3:00pm CA time, at which point you can sign up for a section on a first-come, first-served basis.
That's it for the moment, everyone. The CS110 staff and I are investing as much time and energy as we can to instill some sense of normalcy into your otherwise upended lives.
Until tomorrow!