http://cs106b.stanford.edu/class/cs106b/about_assessments
We know that you work hard on completing your programming assignments, and that work forms the primary mechanism for growing your coding practice skills and learning the theory concepts in CS106B. Growing and learning are messy processes full of ups and downs, and the weeklong format of assignments allows time for this. Exams, on the other hand, are a way for you to demonstrate that by the end of those assignment ups and downs, you did finally reach a place of mastery at the level we expect for this course. We will have two exam assessments in this course.

## Mid-Quarter Diagnostic Exam

The mid-quarter diagnostic will be a short, loosely-timed online assessment of the core, fundamental topics from the first half of CS106B. __The diagnostic is designed to take about 90 minutes, and you will complete it during a timed 3-hour window of your choosing between 5pm on Tuesday, October 26 and 5pm on Wednesday, October 27.__ The 24-hr window means that you need to turn in your exam by 5pm on Wednesday, so you should start your 3-hr window no later than 2pm. By enrolling in this course, you are confirming that you will be able to take the exam in that window. We understand that most of you have other commitments such as jobs, sports, other exams, etc., and that is why we are allowing the flexibility of a 24-hr window; but you are required to meet that schedule __without further exceptions__. (If you have OAE accommodations that affect exam scheduling, our staff will work with you to meet those.)

Watch this space for a link to more details about the exam topics and practice problems, to be released as the date of the exam approaches.

{% comment %} More [details on the mid-quarter diagnostic](assessments/1-diagnostic). {% endcomment %}

## End-Quarter Diagnostic Exam

The end-quarter diagnostic will have the same format as the mid-quarter diagnostic, and will focus on the fundamental topics from the second half of CS106B.
Students often ask if the final will be "cumulative," and the answer is somewhat yes and somewhat no. The focus is very much on the second half of the course; but on the other hand, it would be impossible not to touch topics from the first half at all, since the course content builds on itself. __The diagnostic is designed to take about 90 minutes, and you will complete it during a timed 3-hour window of your choosing on Monday, December 6 (from 12:01am to 11:59pm PDT on December 6).__ December 6 is our registrar-scheduled final exam day (they have us for 8:30-11:30am on December 6), so you are certainly welcome to take it in that exact timeslot, but we are allowing some extra flexibility. The 24-hr window means that you need to turn in your exam by 11:59pm, so you should start your 3-hr window no later than 9pm. As with the mid-quarter diagnostic, by enrolling in this course, you are confirming that you will be able to take the exam in that window. (If you have OAE accommodations that affect exam scheduling, our staff will work with you to meet those.) Watch this space for a link to more details about the exam topics and practice problems, to be released as the date of the exam approaches. {% comment %} More [details on the final project](assessments/2-diagnostic). {% endcomment %}
http://cs106b.stanford.edu/class/cs106b/about_assignments
Programming is a skill best learned by doing, and the programming assignments in CS106B form the central skill development part of your experience in the course. We have a great set of assignments planned that we hope you will find fun, challenging, illuminating, and rewarding. There are 7 assignments, about one each week with breaks around the two exams (see the course [schedule] for tentative assignment dates). Students self-report spending between 10 and 20 hours on each assignment. If you find yourself heading towards the upper end of that range for an assignment, please reach out to course staff for tips. Our workload is challenging because we want to foster the most growth possible for you in our 10 weeks together, but we do want the total hours to stay within reason for a 5-unit course and are happy to work with you towards that goal. In CS106B, we write programs in the C++ language and use the Qt Creator IDE for editing, compiling, and debugging. Please visit the [Qt Installation Guide][qt] for install instructions. ## Common questions about assignments --- ### What is the policy on late assignments? Students are granted a penalty-free grace period for submission on each assignment, the length of which depends on the specific assignment. Read our [course late policy](late) for the details. ### What is the assignment collaboration policy? Since this is essentially a beginning writing course, but with code instead of essays, we have a policy that assignments are done individually (no partners/groups). In later CS courses, you will often encounter teamwork-based programming, and we believe strongly in the value of that---for programmers who have already developed some individual skills in the fundamentals. We adhere to the Stanford and CS department Honor Code policies. Please review our [Honor Code policy][honor_code] for guidelines __specific to this course__ (i.e., do __not__ assume that what is ok in other classes is necessarily ok in this one). 
### How can we get help on our assignments?

Your main starting points for help are the [online forum][forum] and the nightly virtual drop-in "Lair" help hours with the section leaders (undergraduate TAs). The online forum allows 24-hr access to discussion with your peers, and quick (though not 24-hr, we do sleep!) answers from course staff. The instructors will also hold weekly office hours where you are welcome to discuss assignments or other topics.

### How are assignments evaluated?

Programs will be graded on "functionality" (is the program's behavior correct?) and "style" (is the code well written and elegant?). We use a bucket grading scale to focus attention on the qualitative feedback comments graders leave you (rather than just points):

* __++__ An absolutely fantastic submission that will only come along a few times during the quarter across the entire class. To ensure that this score is given only rarely, any grade of ++ must be approved by the instructors and head TA.
* __+__ A submission that is "perfect" or exceeds our standard expectation for the assignment. To receive this grade, a program often reflects additional work beyond the requirements or gets the job done in a particularly elegant way.
* A submission that satisfies all the requirements for the assignment, showing solid functionality as well as good style. It reflects a job well done.
* A submission that meets the requirements for the assignment, possibly with a few small problems.
* A submission that has problems serious enough to fall short of the requirements for the assignment.
* __-__ A submission that has extremely serious problems, but nonetheless shows some effort and understanding.
* __--__ A submission that shows little effort and does not represent passing work.
{:.list-unstyled}

From past experience, we expect most grades to be and

### How do we receive feedback from our grader?
Assignment grades and qualitative commented feedback from your section leader will be made available via [Paperless](cs198.stanford.edu/paperless). You will also meet with your section leader to discuss the grading feedback in a short conference.
http://cs106b.stanford.edu/class/cs106b/about_lectures
Our lectures are scheduled for Mondays, Wednesdays, and Fridays from 11-11:50am PDT, in Bishop Auditorium (in the Lathrop building). Most students should plan to attend in person so you may participate in class discussion, but it is not required to do so. All lectures will be video recorded for asynchronous viewing. It usually takes about 2-3 hours after the end of class for the videos to post to [Canvas][canvas], where they will appear under Panopto Course Videos (note that there is no live remote viewing). If you have a question during lecture, you may of course raise your hand to ask it at any time. Since we have two co-instructors for this course, Cynthia Lee and Julie Zelenski, you also have the option to ask questions during lecture via our [online forum][forum] in a special megathread that will be continuously monitored by one instructor while the other lectures.
http://cs106b.stanford.edu/class/cs106b/about_section
Starting in Week 2, you'll have weekly hour-long small group discussion section meetings hosted by your section leader (an undergraduate TA), who is also your mentor and grader in the course. The section materials for each week consist of a set of problems for further practice with recent lecture topics. For that reason, to attend section __you should be up to date on lecture viewing__. You don't need to have understood everything in lecture perfectly, but your section leader won't be able to effectively guide you and the other students in your section through the problems unless everyone is at least caught up on viewing. Solutions to the section exercises will be added under the website "Sections" tab at the end of the week, for your use as assignment code templates or exam studying.

Active participation in section is required for all students. Choose a participation style that is comfortable for you, including asking questions, contributing answers, and participating in pair discussions with fellow students.

## Common questions about sections

---

### How do I sign up for section?

Section signups are conducted on the [CS198 section portal][cs198] (do not sign up for sections on Axess). The section portal will __open signups on {{ page.signups_open |date: "%A, %B %-d at %l:%M %P %Z" }} and close on {{ page.signups_close | date: "%A, %B %-d at %l:%M %P %Z" }}.__ Section sign-ups are not first-come, first-served; rather, you submit your preferences, and after submissions close the staff will construct a schedule and email you your assigned section.

### How is section participation graded?

Section participation will be graded on this scale:

+ 2 : Showed up to section on time, followed section norms, participated in an engaged manner
+ 1 : Showed up to section late, minimal participation
+ 0 : Did not show up to section, or did not follow established section norms and policies

### What should I do if I must miss a section meeting?
If you miss your section in a given week, you can attend another section that week to make up your absence. Make sure to let the section leader whose section you attend know that you're there. That way, they can let your section leader know that you went to an alternate section for the week. A list of all section times can be found on the [CS198 section portal][cs198]. ### How do I become a section leader someday? You can apply during/after completing 106B. Come join us! Information about applying can be found on the [CS 198 Website](https://cs198.stanford.edu).
http://cs106b.stanford.edu/class/cs106b/about_staff
{% for p in site.data.course.staff %} {% assign photo = p.name | headshot %} {% include captioned_img.html img=photo %} Email Office Hours {% for office_hour in p.office_hours %} {{ office_hour }} {% endfor %} {% endfor %}
http://cs106b.stanford.edu/class/cs106b/course_placement
Not sure if CS106B is right for you? Wondering if you should start with CS106A or CS107 instead? This is a collection of our usual advice to students who ask about selecting the course that's right for them. Of course, we are happy to provide further guidance; just reach out on the [forum][forum] or in office hours.

## CS106A: Start here!

CS106A is our first-quarter programming course. It teaches the widely-used Python programming language, with an emphasis on conceptual understanding of the fundamental building blocks of coding (in any language) and principles of good coding style. If you're interested in learning how to program a computer, this is the place to start. CS106A has no prerequisites - it's open to everyone! The course is designed to appeal to everyone from humanists and social scientists to aspiring hard-core techies.

If you've had some experience with coding, it can be hard to decide if CS106A or CS106B is the right starting point for you. If you've taken AP CS Principles (but not AP CS A / Java), we recommend that you start with CS106A. If you completed AP CS A, then CS106B is most likely the best match for you, although some students who feel unsatisfied with their high school AP CS A experience do start with CS106A.

We recommend that you take CS106A if

- You are interested in learning the first fundamentals of how to program computers.

We recommend that you __not__ take CS106A if

- You have prior programming experience at a level comparable to an introductory college course (for example, if you scored a 4 or 5 on the AP CS A / Java exam).
- You have significant prior programming experience and just want to learn Python as a second+ programming language.

We sometimes hear that students are concerned that starting in CS106A means being too far "behind" their peers to successfully complete a CS major, but this is not true at all.
Nearly half of the CS department's bachelor's degree graduates each June got their start in CS106A, so you'll be in good company! In the 2021-2022 academic year, CS106A is offered every quarter. Visit the [CS106A website](https://cs106a.stanford.edu). ## CS106B: Next step CS106B is our second course in computer programming. It focuses on techniques for solving more complex problems than those covered in CS106A and for analyzing program efficiency. Specifically, it explores fundamental data types and data structures, recursive problem solving, and basic algorithmic analysis. It's taught in C++, but the focus is on conceptual understanding of algorithms. If you'd like a focused study of C++ the language itself, consider taking CS106L. CS106B assumes you have programming experience at the level of CS106A, though you don't necessarily have to have taken __our__ CS106A course. If you are experienced with basic control structures (conditions, loops), variables, arrays/lists, maps, and program decomposition, then you should be ready to take CS106B. If you've had some experience with coding, it can be hard to decide if CS106A or CS106B is the right starting point for you. If you've taken AP CS Principles (but not AP CS A / Java), we recommend that you start with CS106A. If you completed AP CS A, then CS106B is most likely the best match for you, although some students who feel unsatisfied with their high school AP CS A experience do start with CS106A. We recommend that you take CS106B if - You have prior programming experience at the level of CS106A. - You are interested in learning more about problem-solving with computers. - You've programmed before but have not seen recursion, data structures, or algorithmic analysis. We recommend that you __not__ take CS106B if - You already have completed equivalent coursework elsewhere. - You have taken AP CS Principles, but not AP CS A / Java. In the 2021-2022 academic year, CS106B is offered every quarter. 
Visit the [CS106B website](https://cs106b.stanford.edu). ## Optional add-ons to CS106B We offer several courses that are designed to complement CS106B with additional material. None of these courses are required, and they do not count toward the CS major or CS minor requirements. However, if you're interested in going deeper for your own enrichment, you may find them worth checking out! ### Additional foundation support: CS100B CS100B is an optional 1-unit companion course to CS106B that provides extra support to students from under-resourced backgrounds. It meets for an additional weekly section where students receive access to additional mentoring, in depth content review, and other study resources. It is part of the Pathfinders/ACE program jointly sponsored by the CS department and the School of Engineering. Enrollment is by application, read more at We accept applications from all students who believe they may benefit from participating in small active-learning sessions led by a highly trained graduate student. ### C++ Language: CS106L CS106L is an optional 1-unit companion course to CS106B that focuses purely on the C++ programming language. Unlike CS106A and CS106B, which focus more on general programming skills and fundamental programming concepts, CS106L is specifically designed to focus on language features particular to C++ and how to use the C++ programming language to solve problems. Although CS106L is designed as a companion course to CS106B, it's open to anyone with a comparable background. We recommend that you take CS106L if - You have prior programming experience at the level of CS106B (or are currently enrolled.) - You are interested in learning more about the C++ programming language and the standard libraries. - You are willing to put in more work than is necessary for CS106B. We recommend that you __not__ take CS106L if - You want a deeper understanding of topics like recursion, data structures, or big-O notation. 
- You want to learn programming at the level of CS106B, but don't have the time to take those courses. In the 2021-2022 academic year, CS106L is offered in Autumn, Winter, and Spring quarters. Visit the [CS106L website](https://cs106l.stanford.edu). ### More adventures: CS106M CS106M is an optional 1 unit add-on course to CS106B that explores supplemental material in a small discussion setting. For example, this year's offering will likely cover topics like data compression, error-correcting codes, and digital signatures. The topics covered in CS106M will not be required by later CS courses, even if you are planning to major in CS. We recommend that you take CS106M in addition to CS106B if: - You are currently enrolled in CS106B. - You are interested in exploring additional topics and deepening your study of the course material in a small discussion setting. - You are willing to put in more work than is necessary for CS106B. We recommend that you __not__ take CS106M if - You are concerned that you "need" to take CS106M to avoid falling behind everyone else. In the 2021-2022 academic year, CS106M is offered only in Fall quarter. ### Social Good: CS106S CS106S is an optional 1 unit add-on course to CS106B that gives you a chance to work on programs for social good. The class brings in student groups, nonprofits, and local tech companies and is a mix of a speaker series and small project course. The course also teaches basic web development, but is not meant to be a stand-alone web development course. We recommend that you take CS106S in addition to CS106B if - You are interested in exploring social good applications of computer science. - You are willing to put in more work than is necessary for CS106B. In the 2021-2022 academic year, CS106S is offered in Autumn, Winter, and Spring quarters. Visit the [CS106S website](https://cs106s.stanford.edu). ## Can I skip the intro courses altogether? 
Many students entering Stanford today have had considerable programming experience in high school or from their own independent work with computers. If you are in that position, the idea of starting with a beginning programming course, even an intensive one like CS 106B, seems like a waste of time. Your perception may in fact be correct. In our experience, there are somewhere between 10 and 15 students in each entering class who should start at a more advanced point in the sequence. Below we talk about some of these more advanced classes (CS107 and CS107E).

For most of you, however, the right place to start is with the CS 106 series. Most high-school computing courses are somewhat weak and provide little background in modern software engineering techniques. By taking CS 106, you will learn how the CS department at Stanford approaches programming and get a solid foundation for more advanced work. If you're unsure where you should start the programming sequence, please talk with us.

## CS107: How it all works

After completing the intro programming sequence, CS107 takes you under the hood to learn the ins and outs of computer systems. It explores how high-level programming constructs are represented internally inside the computer and how those internal representations affect program behavior and performance. Along the way, it provides programming maturity and exposure to developing software in a Unix environment.

CS107 has CS106B as a prerequisite and assumes an understanding of fundamental programming techniques and good programming style. As a result, it's rare for incoming students to jump directly into CS107 and to skip the CS106 series entirely. Typically, we'd only recommend this to students with a background comparable to CS106A/B and who already have good programming style. Most students, even those who go on to be CS majors, begin in the CS106 sequence.
We recommend that you take CS107 if

- You have completed CS106B or have the equivalent programming background, including familiarity with recursion and fundamental data structures (binary trees, dynamic arrays, linked lists, graphs, etc.).
- You have experience writing readable code: writing comments, decomposing problems into smaller pieces, etc.

We recommend that you __not__ take CS107 if

- You have never before taken a class in computer programming.
- You have prior programming experience but have not met the postconditions of CS106B.

Visit the [CS107 website](https://cs107.stanford.edu).

## CS107E: How it works, embedded

CS107E is a version of CS107 that covers similar topics but focuses on programming a small computer that can easily fit into the palm of your hand. The class is smaller and more project-oriented than CS107 and lets you play around with small embedded devices to see how low-level systems concepts directly let you control physical devices. The [CS107E FAQ](https://cs107e.stanford.edu) offers perspective and advice on choosing between 107 and 107E.

We recommend that you take CS107E if

- You meet all the requirements for CS107.
- You enjoy working on open-ended projects.

We recommend that you __not__ take CS107E if

- You're nervous about taking CS107 and want to satisfy that requirement in a different way.

Visit the [CS107E website](https://cs107e.github.io).
http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/ethics.html
This week in lecture you learned about Big-O and Asymptotic Analysis to determine runtime and performance characteristics of algorithms and data structures. Some of the algorithms we learn in this class underlie ubiquitous technologies like search engines, so tools that can help us analyze and improve their speed and performance are crucial. As we discussed in Wednesday's lecture, improving performance by decreasing runtime also can be a "green" software engineering practice. {% question 7 %} In a short paragraph, describe a real or plausible scenario not previously presented in lecture in which using techniques like Big-O or Asymptotic Analysis to improve the performance of an algorithm might benefit the environment. Include your thoughts on how a software engineer working on this piece of code might identify such potential benefits and take them into consideration when designing the code. {% endquestion %} As ethical and socially-conscious computer programmers, we also know that many considerations other than the speed and runtime are important in choosing the appropriate algorithm for a particular use. Dr. Gillian Smith, an associate professor at the Worcester Polytechnic Institute, identifies an interesting fallacy that computer scientists often fall into when applying algorithmic analysis techniques like Big-O analysis: _If it's more efficient, it's better. The quality of a program, independent of how people interact with it, should be evaluated only with respect to how well it minimizes cost._ The following case study illustrates the importance of supplementing efficiency and performance analyses with human-centric evaluation. In 2006 the state of Indiana awarded IBM a contract for more than $1 billion to modernize Indiana's welfare case management system and manage and process the State's applications for food stamps, Medicaid and other welfare benefits for its residents. 
The program sought to ___increase efficiency and reduce fraud___ by moving to an automated case management process. After only 19 months into the relationship, while still in the transition period, it became clear to Indiana that the relationship was not going as planned. In particular here are some "lowlights" of the system's failures to provide important and necessary services for those in need: - "Applicants waited 20 or 30 minutes on hold, only to be denied benefits for 'failure to cooperate in establishing eligibility' if they were unable to receive a callback after having burned through their limited cellphone minutes." - "Applicants faxed millions of pages of photocopied driver's licenses, Social Security cards, and other supporting documents to a processing center in Marion, Indiana; so many of the documents disappeared that advocates started calling it "the black hole in Marion" [...] Any application missing just one of tens to hundreds of pieces of necessary information or paperwork were automatically denied." - "By February 2008, the number of households receiving food stamps in Delaware County, which includes Muncie, Indiana, dropped more than 7 percent, though requests for food assistance had climbed 4 percent in Indiana overall." (Quotations from [this article written by Virginia Eubanks](https://www.thenation.com/article/archive/want-cut-welfare-theres-app/).) In light of these failures, the State of Indiana cancelled its contract with IBM and sued the company for breach of contract, stating that the company had failed to deliver a system that was supposed to help people get the services they needed. In court, IBM argued that they were not responsible for issues related to wait times, appeals, wrongful denials, lost documents, etc. as the contract only stated that a successful system would succeed by reducing costs and fraud. IBM's system did reduce costs, but did so by denying people the benefits they needed. 
In light of this, we would like you to consider the following questions: {% question %} According to the contract that IBM struck with the state of Indiana, the criteria for optimization were improving efficiency of the overall welfare system and reducing fraud. Criteria for reducing wait times and wrongful denials were not included. However, wrongfully denying benefits has a huge negative impact on the citizens who rely on the system. If criteria like minimizing wrongful denials were not included in the contract, should engineers have included them in their optimization algorithm? Why or why not? {% endquestion %} {% question %} Imagine that after completing CS106B you are hired at IBM as an engineer working on this system. How might you have approached designing and setting the goals of this system? How might you apply algorithmic analysis tools to build a system that achieved the desired goals? Could you do so in a way that avoids the severe negative impacts on users of the system that are outlined in the case study? {% endquestion %} If you're interested in reading more about this case study, we highly recommend reading the [linked article above](https://www.thenation.com/article/archive/want-cut-welfare-theres-app/). The author, Virginia Eubanks, also wrote a great book titled _Automating Inequality_ that can make for interesting further exploration!
http://cs106b.stanford.edu/class/cs106b/index.html
{{ site.data.course.id }} {{ site.data.course.title }} {{ site.data.course.quarter}} {{ site.data.course.lecture_schedule }} Announcements {% include announcements.html limit=5 %}
http://cs106b.stanford.edu/class/cs106b/honor_code
Since 1921, academic conduct for students at Stanford has been governed by the Honor Code, which reads as follows: THE STANFORD UNIVERSITY HONOR CODE 1. The Honor Code is an undertaking of the students, individually and collectively: 1. that they will not give or receive aid in examinations; that they will not give or receive unpermitted aid in class work, in the preparation of reports, or in any other work that is to be used by the instructor as the basis of grading; 2. that they will do their share and take an active part in seeing to it that others as well as themselves uphold the spirit and letter of the Honor Code. 1. The faculty on its part manifests its confidence in the honor of its students by refraining from proctoring examinations and from taking unusual and unreasonable precautions to prevent the forms of dishonesty mentioned above. The faculty will also avoid, as far as practicable, academic procedures that create temptations to violate the Honor Code. 1. While the faculty alone has the right and obligation to set academic requirements, the students and faculty will work together to establish optimal conditions for honorable academic work. {: type="A" .alert .alert-danger} The purpose of this handout is to make our expectations as clear as possible regarding the Honor Code. The basic principle under which we operate is that each of you is expected to submit your own work in this course. In particular, attempting to take credit for someone else's work by turning it in as your own constitutes plagiarism, which is a serious violation of basic academic standards. Under the Honor Code you are obligated to follow all of the following rules in this course: ## Rule 1: You must not look at assignment solutions that are not your own. It is an act of plagiarism to take work that is copied or derived from the work of others and submit it as your own. 
For example, using a solution from the Internet, a solution from another student (past or present), a solution taken from an answer set released in past quarters, or some other source, in part or in whole, that is not your own work is a violation of the Honor Code. Many Honor Code infractions we see make use of past solution sets. The best way to steer clear of this possibility is simply to not search for solutions to the assignments.

Moreover, looking at someone else's solution in order to determine how to solve the problem yourself is also an infraction of the Honor Code. In essence, you should not be looking at someone else's answers in order to solve the problems in this class. This is not an appropriate way to "check your work," "get a hint," or "see alternative approaches."

Additionally, you must not solicit solutions from anyone. For example, it is a violation of the Stanford Honor Code to ask another student to share their answers with you, to ask a tutor to share other students' solutions with you, or to ask for solutions on sites like Stack Overflow or Chegg.

## Rule 2: You must not share your solutions with other students.

In particular, you should not ask anyone to give you a copy of their answers or, conversely, give your answers to another student who asks you for them. Similarly, you should not discuss your solution strategies to such an extent that you and your collaborators end up turning in the same answers. Moreover, you are expected to take reasonable measures to maintain the privacy of your solutions. For example, you should not leave copies of your work on public computers nor post your solutions on a public website.

## Rule 3: You must properly cite any assistance you received.

If you received aid while producing your solution, you must mention who you got help from (if that person is *not* a TA or the instructor) and what specifically they helped you with.
A proper citation should specifically identify the source (e.g., person's name, book title, website URL, etc.) and a clear indication of how this assistance influenced your work. For example, you might write "Student *X* mentioned the idea of having the base case be *Y* and the recursive step work in way *Z*." If you make use of such assistance without giving proper credit - or, if you provide a misleading or inaccurate statement describing the help you received - you may be guilty of plagiarism. It is also important to make sure that the assistance you receive consists of general advice that does not cross the boundary into having someone else write the actual solutions or show you their solutions. It is fine to discuss ideas and strategies, but you should be careful to write your solutions on your own, as indicated in Rules 1 and 2. ## Rule 4: You may only reuse past work in certain, limited situations. We tend to reuse assignments from quarter to quarter. Following the general principle that the names affixed to a submission should accurately represent its authorship, you may only resubmit work from prior quarters provided that the exact same set of people who initially turned in the assignment resubmit. This means, in particular, that - if you completed an assignment individually in a previous quarter, you may only resubmit that assignment if you do so individually; and - if you completed an assignment with a partner in a previous quarter, you may only resubmit that assignment if you submit with that exact same partner. To elaborate on that last point, if you worked with a partner in a previous quarter, you are retaking the course or resolving an incomplete, and your partner is not also retaking the class or resolving an incomplete, you may not resubmit the past work you did on that assignment in any circumstance. The policies above apply equally to reading, copying, or adapting solutions you submitted in previous quarters. 
For example, if you submitted an assignment individually in a previous quarter, you should not refer to your submission on that assignment if you are planning on redoing the assignment in a pair. Similarly, if in a previous quarter you worked with a partner who is not retaking the class, you must not reread or copy anything from that previous submission in the course of redoing the assignment. ## Note: all submissions are subject to automated plagiarism detection. Stanford employs powerful automated plagiarism detection tools that compare assignment submissions with other submissions from current and previous quarters. The tools also compare submissions against a wide variety of online solutions. These tools are effective at detecting unusual resemblances in programs, which are then further examined by the course staff. The staff then make the determination as to whether submissions are deemed to be potential infractions of the Honor Code and referred to Stanford's Community Standards office.
http://cs106b.stanford.edu/class/cs106b/lectures/05-grid-stack-queue/
Today we will discuss use of `Stack` and `Queue`. These containers store data in an ordered format and are used to solve many problems. {% include common/lecture.md %}
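To make the ordering difference between the two containers concrete, here is a minimal sketch. It uses the C++ standard library's `std::stack` and `std::queue` as stand-ins, on the assumption that the Stanford `Stack` and `Queue` order their elements the same way; the helper names `drainStack` and `drainQueue` are made up for this illustration.

```cpp
#include <queue>
#include <stack>
#include <vector>

// Hypothetical helper: push every value onto a stack, then pop until empty.
// A stack is last-in, first-out, so the values come back in reverse order.
std::vector<int> drainStack(const std::vector<int>& values) {
    std::stack<int> s;
    for (int v : values) {
        s.push(v);
    }
    std::vector<int> order;
    while (!s.empty()) {
        order.push_back(s.top());  // look at the most recently pushed value
        s.pop();                   // then remove it
    }
    return order;
}

// Hypothetical helper: enqueue every value, then dequeue until empty.
// A queue is first-in, first-out, so the original order is preserved.
std::vector<int> drainQueue(const std::vector<int>& values) {
    std::queue<int> q;
    for (int v : values) {
        q.push(v);
    }
    std::vector<int> order;
    while (!q.empty()) {
        order.push_back(q.front());  // look at the oldest value
        q.pop();                     // then remove it
    }
    return order;
}
```

Calling `drainStack({1, 2, 3})` yields 3, 2, 1, while `drainQueue({1, 2, 3})` yields 1, 2, 3.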
http://cs106b.stanford.edu/class/cs106b/lectures/03-strings/
In today's lecture, we explore operations on the C++ `string` datatype and start to talk about what testing fundamentals look like in CS106B. {% include common/lecture.md %}
http://cs106b.stanford.edu/class/cs106b/lectures/01-welcome/
Introduction to CS106B {% include common/lecture.md %}
http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/easter_egg/
Here is a hidden file on the course website, buried deep among the many other pages containing important course information. For your entertainment, it also contains an easter egg, which can only be found by using the "Search" feature in the top left corner of the website... --- Your quote for today: ## Have a GREAT day! --- Presenting... __Bad Trombone Guy__ Latest release from indie band _College Avenue Lockdown_ (mixed by sons Rein & Kalev, husband Matt on trombone, Rein on drum kit, vocals by Julie) --- A classic XKCD comic that we can all relate to: [![comic about dead end searches on internet](https://imgs.xkcd.com/comics/wisdom_of_the_ancients.png){:.center-block}](https://xkcd.com/979/) ---
http://cs106b.stanford.edu/class/cs106b/section/section1/
This week's section exercises explore the very fundamentals of programming in C++. We'll be exploring the material from Week 1 and the beginning of Week 2 (functions, parameters, return, decomposition, strings and basic data structures). Have fun! Each week, we will also be releasing a Qt Creator project containing starter code and testing infrastructure for that week's section problems. When a problem name is followed by the name of a `.cpp` file, that means you can practice writing the code for that problem in the named file of the Qt Creator project. Here is the zip of the section starter code: [Starter code](section1_starter.zip) __Note: `Map`s will be covered in lecture on Friday, Oct. 1. For those that have section Wednesday or Thursday, these problems can be useful practice to cement your understanding after lecture!__ ## 1) Returning and Printing _Topics: Function call and return, return types_ Below is a series of four `printLyrics_v#` functions, each of which has a blank where the return type should be. For each function, determine * what the return type of the function should be, * what value, if any, is returned, and * what output, if any, will be produced if that function is called. Is it appropriate for each of these functions to be named `printLyrics`? Why or why not? 
```c++ _____ printLyrics_v1() { cout & values) { for (int elem: values) { cout values) { for (int i = 0; i & values) { for (int elem: values) { elem *= 137; } } void heihei(Vector<int>& values) { for (int& elem: values) { elem++; } } Vector<int> teFiti(const Vector<int>& values) { Vector<int> result; for (int elem: values) { result += (elem * 137); } return result; } int main() { Vector<int> values = { 1, 3, 7 }; maui(values); printVector(values); moana(values); printVector(values); heihei(values); printVector(values); teFiti(values); printVector(values); return 0; } ``` {% solution %} Here's the output from the program: ```c++ 1 3 7 1 3 7 2 4 8 2 4 8 ``` Here's a breakdown of where this comes from: * The `maui` function takes its argument by value, so it's making changes to a copy of the original vector, not the vector itself. That means that the values are unchanged back in main. * The `moana` function uses a range-based for loop to access the elements of the vector. This makes a copy of each element of the vector, so the changes made in the loop only change the temporary copy and not the elements of the vector. That means that the values are unchanged back in main. * `heihei`, on the other hand, uses `int&` as its type for the range-based for loop, so in a sense it's really iterating over the elements of the underlying vector. Therefore, its changes stick. * The `teFiti` function creates and returns a new vector with a bunch of updated values, but the return value isn't captured back in main. {% endsolution %} ## 3) SumNumbers (sum.cpp) _Topics: Vectors, strings, file reading_ The function `sumNumbers` reads a text file and sums the numbers found within the text. 
Here are some library functions that will be useful for this task: - [`readEntireFile`](https://web.stanford.edu/dept/cs_edu/resources/cslib_docs/filelib.html#Function:readEntireFile), to read all lines from a file stream into a Vector - [`stringSplit`](https://web.stanford.edu/dept/cs_edu/resources/cslib_docs/strlib.html#Function:stringSplit), to divide a string into tokens - [`isdigit`](https://en.cppreference.com/w/cpp/string/byte/isdigit), to determine whether char is a digit - [`stringToInteger`](https://web.stanford.edu/dept/cs_edu/resources/cslib_docs/strlib.html#Function:stringToInteger), to convert a string of digits to integer value In particular you will be asked to write the following function `int sumNumbers(string filename)` When given the following file, named `numbers.txt`, as input, your function should return 42. ```output 42 is the Answer to the Ultimate Question of Life, the Universe, and Everything This is a negative number: -9 Welcome to CS106B! I want to own 9 cats. 
``` {% solution %} ```c++ bool isNumber(string s) { // strip negative sign off negative numbers if (s.length() > 0 && s[0] == '-'){ s = s.substr(1); } for (char ch : s) if (!isdigit(ch)) return false; return s.length() > 0; } int sumNumbers(string filepath) { ifstream in; Vector<string> lines; int sum = 0; if (!openFile(in, filepath)) return 0; readEntireFile(in, lines); for (string line : lines) { Vector<string> tokens = stringSplit(line, " "); for (string t : tokens) { if (isNumber(t)) { sum += stringToInteger(t); } } } return sum; } ``` {% endsolution %} ## 4) Debugging Deduplicating (`deduplicate.cpp`) _Topics: Vector, strings, debugging_ Consider the following ___incorrect___ C++ function, which accepts as input a `Vector<string>` and tries to modify it by removing adjacent duplicate elements: ```c++ void deduplicate(Vector vec) { for (int i = 0; i hiddenFigures = { "Katherine Johnson", "Katherine Johnson", "Katherine Johnson", "Mary Jackson", "Dorothy Vaughan", "Dorothy Vaughan" }; deduplicate(hiddenFigures); // hiddenFigures = ["Katherine Johnson", "Mary Jackson", "Dorothy Vaughan"] ``` The problem is that the above implementation of `deduplicate` does not work correctly. In particular, it contains three bugs. First, find these bugs by writing test cases that pinpoint potentially erroneous situations in which the provided code might fail, then explain what the problems are, and finally fix those errors in code. {% solution %} There are three errors here: 1. Calling `.remove()` on the `Vector` while iterating over it doesn't work particularly nicely. Specifically, if you remove the element at index `i` and then increment `i` in the for loop, you'll skip over the element that shifted into the position you were previously in. 2. There's an off-by-one error here: when `i = vec.size() - 1`, the indexing `vec[i + 1]` reads off the end of the `Vector`. 3. The `Vector` is passed in by value, not by reference, so none of the changes made to it will persist to the caller. 
Here's a corrected version of the code: ```c++ void deduplicate(Vector& vec) { for (int i = 0; i & vec) { for (int i = vec.size() - 1; i > 0; i--) { if (vec[i] == vec[i - 1]) { vec.remove(i); } } } ``` {% endsolution %} ## 5) Pig-Latin (`piglatin.cpp`) _Topics: Strings, reference parameters, return types_ Write two functions, `pigLatinReturn` and `pigLatinReference`, that accept a string and convert said string into its pig-Latin form. To convert a string into pig-Latin, you must follow these steps: - Split the input string into 2 strings: a string of characters BEFORE the first vowel, and a string of characters AFTER (and including) the first vowel. - Append the first string (letters before the first vowel) to the second string. - Append the string "ay" to the resulting string. Here are a few examples... `nick -> icknay` `chase -> asechay` `chris -> ischray` You will need to write this routine in `two ways`: once as a function that `returns` the pig-Latin string to the caller, and once as a function that `modifies` the supplied parameter string and uses it to store the resulting pig-Latin string. These will be done in `pigLatinReturn` and `pigLatinReference`, respectively. You may assume that your input is always a one-word, all lowercase string with at least one vowel. Here's a code example of how these functions differ... 
```c++ string name = "julie"; string str1 = pigLatinReturn(name); cout & grid) { for (int r = 0;r & grid) { Grid result(grid.numCols(), grid.numRows()); for (int r = 0; r 9) { foo(a(2)); }") // returns -1 (balanced) // index 01234567890123456789012345678901 checkBalance("for (i=0;i parens; for (int i = 0; i & s) { Queue q; Stack s2; while (!s.isEmpty()) { if (s.peek() % 2 == 0) { q.enqueue(s.pop()); } else { s2.push(s.pop()); } } while (!q.isEmpty()) { s.push(q.dequeue()); } while(!s2.isEmpty()) { s.push(s2.pop()); } cout > friendList(String filename)` {% solution %} ```c++ Map > friendList(string filename) { ifstream in; Vector<string> lines; if (openFile(in, filename)) { readEntireFile(in, lines); } Map > friends; for (string line: lines) { Vector<string> people = stringSplit(line, " "); string s1 = people[0]; string s2 = people[1]; friends[s1] += s2; friends[s2] += s1; } return friends; } ``` {% endsolution %} ## 10) Twice (`twice.cpp`) _Topic: Sets_ Write a function named `twice` that takes a vector of integers and returns a set containing all the numbers in the vector that appear exactly twice. Example: passing `{1, 3, 1, 4, 3, 7, -2, 0, 7, -2, -2, 1}` returns `{3, 7}`. Bonus: do the same thing, but you are not allowed to declare any kind of data structure other than sets. {% solution %} ```c++ // solution Set<int> twice(Vector<int>& v) { Map<int, int> counts; for (int i : v) { counts[i]++; } Set<int> twice; for (int i : counts) { if (counts[i] == 2) { twice += i; } } return twice; } // bonus Set<int> twice(Vector<int>& v) { Set<int> once; Set<int> twice; Set<int> more; for (int i : v) { if (once.contains(i)) { once.remove(i); twice.add(i); } else if (twice.contains(i)) { twice.remove(i); more.add(i); } else if (!more.contains(i)) { once.add(i); } } return twice; } ``` {% endsolution %}
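For readers working outside the Stanford library, the counting approach in the `twice` solution above can be sketched with standard containers. This is an illustration under that assumption, not a replacement for the section solution, and `twiceStd` is a hypothetical name. One difference worth noticing: iterating a `std::map` yields key/value pairs, whereas the solution above iterates over the keys of the Stanford `Map` directly.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical standard-library variant of the section's `twice` function.
// Returns every number that appears exactly twice in v.
std::set<int> twiceStd(const std::vector<int>& v) {
    // First pass: count how many times each number occurs.
    std::map<int, int> counts;
    for (int n : v) {
        counts[n]++;  // operator[] creates a zero count on first access
    }

    // Second pass: keep only the numbers whose count is exactly 2.
    std::set<int> result;
    for (const auto& entry : counts) {
        if (entry.second == 2) {
            result.insert(entry.first);
        }
    }
    return result;
}
```

With the example input from the problem statement, `twiceStd({1, 3, 1, 4, 3, 7, -2, 0, 7, -2, -2, 1})` produces the set containing 3 and 7.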
http://cs106b.stanford.edu/class/cs106b/lectures/04-console-vector/
In this lecture, we will introduce use of `Vector` and `Grid`. {% include common/lecture.md %}
http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/
{% include common/assignment.md %} This week's lectures have introduced you to some of the classic Abstract Data Types (ADTs), and now it's time to put those handy collections to use! With the low-level details of how these data structures work abstracted away, your attention is free to solve more interesting problems. This assignment asks you to write client code that leverages these ADTs to implement some nifty algorithms and systems. The tasks may sound a little daunting at first, but given the powerful tools in your arsenal, each requires a very manageable amount of code. Let's hear it for abstraction! _This assignment is to be completed __individually__. Working in pairs/groups is not permitted._ ## Learning goals - To more fully experience the joy of using pre-written classes. Most of the heavy-lifting is handled by the collection ADTs. - To stress the notion of abstraction as a mechanism for managing data and providing functionality without revealing the representational details. - To learn how to model and solve problems using classic data structures such as vectors, grids, stacks, queues, sets, and maps. ## Assignment parts This assignment consists of a short warmup exercise using the debugger and two coding tasks featuring use of different ADTs. Finally, you'll finish off by answering a couple of embedded ethics questions related to this week's assignment content. - ### [Warmup](warmup) Practice with testing and debugging on different abstract data types. __Do the warmup first!__ - ### [Maze](maze) A `Grid` of walls and corridors is used to represent a maze, and the `Stack`, `Queue`, and `Set` ADTs are used in the implementation of a famous algorithm that can efficiently find a solution path to escape the maze. - ### [Search Engine](searchengine) A `Map` is used to associate words with a `Set` of documents containing that word. 
Using the map, you can find matching entries that contain terms from simple or compound queries, and construct a mini search engine. - ### [Beyond Algorithmic Analysis](ethics) In this section, you will explore beyond traditional Big-O analysis to study some of the potential human and societal impacts of designing and optimizing efficient software systems. The two coding tasks are roughly comparable to each other in size and scope, so pace yourself to complete each in about three days. Note: The code you will write for Assignment 2 is considerably more complex than Assignment 1, so __be sure to get an early start!__ ## Getting started We provide a ZIP of the starter project. Download the zip, extract the files, and double-click the `.pro` file to open the project in Qt Creator. [Starter code](../../restricted/starter-assign2.zip) The two source files you will edit are `maze.cpp` and `search.cpp`. Additionally, you will answer the questions in `short_answer.txt`. ## Resources - The CS106B [Guide to Testing][testing_guide] - The CS106B [Style Guide][style_guide] - Resolving [Common Build/Run Errors][build_run_issues_link], compiled by section leader Jillian Tang. - Stanford library documentation for [Vector], [Grid], [Stack], [Map], [Set] ## Getting Help Keep an eye on the [Ed forum][forum] for an announcement of the __YEAH__ (YEAH = Your Early Assignment Help) group session where our veteran section leaders will answer your questions and share pro tips. We know it can be daunting to sit down and break the barrier of starting on a substantial programming assignment -- come to YEAH for advice and confidence to get you on your way! We are also here to help if you run into issues along the way! The [Ed forum][forum] is open 24/7 for general discussion about the assignment, lecture topics, the C++ language, using Qt, and more. Always start by first searching Ed to see if your question has already been asked and answered before making a new post. 
To troubleshoot a problem with your specific code, your best bet is to bring it to the [LaIR] helper hours or [office hours][office_hours]. ## Submit Before you call it done, run through our [submit checklist][submit_checklist] to be sure all your `t`s are crossed and `i`s dotted. Then upload your completed files to Paperless for grading. Please submit only the files you edited; for this assignment, these files will be: * `maze.cpp` * `search.cpp` * `short_answer.txt` You don't need to submit any of the other files in the project folder. [Submit to Paperless][paperless] _Note: On Paperless, all due dates and submission times are expressed in Pacific time._
http://cs106b.stanford.edu/class/cs106b/lectures/02-cpp/
Introduction to Fundamentals of C++ Programming {% include common/lecture.md %}
http://cs106b.stanford.edu/class/cs106b/assignments/1-cpp/
{% include common/assignment.md %} Here it is - the first programming assignment of the quarter! Completing this assignment will get you up and running with the C++ language and the tools used in CS106B. The work involves a mix of coding, testing, and debugging tasks. By the end of this assignment, you'll have fully gotten your C++ legs under you! (our apologies for the bad pun...) The code you will write involves expressions, control structures, functions, and string processing. You have prior experience with these concepts, but the tricky part is figuring how to map what you already know to the strange new world of C++. The transition is what this assignment is all about. In addition to giving you practice with C++ syntax and libraries, the assignment will guide you through the tools and approaches you can use to test and debug your code. By the time you've completed it, you'll be a lot more comfortable working in C++ and will be ready to start building larger projects! _This assignment is to be completed __individually__. Working in pairs/groups is not permitted._ ## Learning goals - To become comfortable using the Qt Creator IDE to edit, build, run, and debug simple C++ programs. - To practice writing C++ functions that manipulate numbers and strings. - To learn basic use of the SimpleTest framework for unit tests and time trials. ## Assignment parts This assignment consists of two parts. Click on the links below for the full instructions. - ### [Perfect Numbers](perfect) is a fun warmup exercise involving number theory, algorithms, and optimization. It gives you a guided transition into C++, as well as the CS106B testing and debugging tools. You can start on this task right away &mdash; and we recommend doing so! Completing this warmup in the first few days reserves the better part of the week for the bigger second part. 
- ### [Soundex Search](soundex) is a complete program that demonstrates a nifty algorithm for matching and grouping names based on their pronunciation. This program uses C++ strings, console I/O, and the `Vector` class. There is a substantial chunk of code for you to write, so get an early start to give yourself sufficient time to work through issues and reach out for help if you hit any snags. ## Getting started We provide a ZIP of the starter project. Download the zip, extract the files, and double-click the `.pro` file to open the project in Qt Creator. [Starter code](../../restricted/starter-assign1.zip) The two source files you will edit are: - `perfect.cpp` - `soundex.cpp` Additionally, you will write short answers to some questions in `short_answer.txt`. ## Resources - The CS106B [Style Guide][style_guide] reviews the coding standards in the rubric applied to grading the style of your submission. - The CS106B [guide to testing your code][testing_guide] explains the use of `SimpleTest`. - This [guide to transitioning from Python to C++][python_to_cpp] points out syntactical and functional differences between the two languages. Thank you to section leaders Jillian Tang and Ethan Chi for this wonderful resource! - Resolving [Common Build/Run Errors][build_run_issues_link], compiled by section leader Jillian Tang. ## Getting Help Keep an eye on the [Ed forum][forum] for an announcement of the Assignment 1 __YEAH__ (YEAH = Your Early Assignment Help) group session where our veteran section leaders will answer your questions and share pro tips. We know it can be daunting to sit down and break the barrier of starting on a substantial programming assignment -- come to YEAH for advice and confidence to get you on your way! We are also here to help if you run into issues along the way! The [Ed forum][forum] is open 24/7 for general discussion about the assignment, lecture topics, the C++ language, using Qt, and more. 
Always start by searching first to see if your question has already been asked and answered before making a new post. To troubleshoot a problem with your specific code, your best bet is to bring it to the [LaIR] helper hours or [office hours][office_hours]. ## Submit Before you call it done, run through our [submit checklist][submit_checklist] to be sure all your `t`s are crossed and `i`s are dotted. Then upload your completed files to Paperless for grading. Please submit only the files you edited; for this assignment, these files will be: * `perfect.cpp` * `soundex.cpp` * `short_answer.txt` You don't need to submit any of the other files in the project folder. [Submit to Paperless][paperless] That's it; you're done! Congratulations on finishing your first CS106B assignment!
http://cs106b.stanford.edu/class/cs106b/assignments/0-namehash/
## Due {{ page.duedate | date: "%A, %B %-d at %l:%M %P Pacific" }} - In CS106B, we build flexibility into the assignment deadlines by extending a short grace period (typically 48 hours) beyond the deadline where late submissions are accepted without penalty. Read more about the [late policy][late]. - While we strongly encourage you to complete Assignment 0 by the deadline so that you are able to start on the first programming assignment without delay (Assignment 1 will be released Friday), if you need extra time, you may submit up through Sunday 11:59pm with no penalty. Welcome to CS106B! In this assignment, you will first install the Qt tools and CS106-specific package and then work through compiling, running, and debugging a sample program. This confirms you and your development environment are ready for the awesome adventures to come this quarter! ## Step 1) Install Qt Creator You will first need to install Qt Creator, the development environment that we use in CS106B. Follow the instructions in the [Qt Installation Guide][qt] for your operating system. If you run into an install snag, don't panic! The course staff will hold a __Qt Creator install help session 7PM - 9PM PDT on Thursday, September 23 in the basement of the Huang building.__ You can also ask for help with a post to the [Ed forum][forum] or by coming to [office hours][office_hours]. ## Step 2) Download starter project We will configure a starter project with the files needed for each assignment and post it in the form of a ZIP archive. The starter project for Assignment 0 contains the files for the `NameHash` program. - [Starter code](starter-assign0.zip) Download the starter code archive and extract all. Double-click the `NameHash.pro` file to open the project in Qt Creator and configure to use the default kit. ## Step 3) Hash your name In Qt Creator, build and run the `NameHash` program. When prompted, enter your preferred first and last names. The program will compute the hash code for your name. 
The hash code is like a unique "fingerprint". __Write down the hash code for your name__; you'll need it when you submit your work. At this point, you don't need to deeply understand what the code is doing, just consider it a teaser for ideas we will explore together later in the quarter. ## Step 4) Practice with debugger Knowing your debugger is a key component in your programming tool belt. Our colleague Keith Schwarz wrote a wonderful tutorial to introduce students to the Qt debugger. The tutorial guides you through using the debugger to inspect the `NameHash` program. Open the [debugger tutorial](DebuggerTutorial.pdf) and follow along step-by-step. At some point, you'll be asked to remember a special value. __Write this special value down__; you'll need it when you submit. ## Step 5) Read course policies (Syllabus and Honor Code) Please read the handouts on the website that detail the course policies for the [syllabus][syllabus] and [Honor Code][honor_code]. We want to ensure that you know what to expect from us and what we will expect from you. If you have any questions or concerns about the course policies, make a post on [Ed][forum] or via private email to clarify or resolve issues before choosing to enroll. ## Step 6) Submit Once you've finished everything, fill out this form: - Submit Google form: Enter the numbers from Steps 3 and 4, confirm that you understand the course policies, and submit the form. __Congrats and welcome to CS106B!__
http://cs106b.stanford.edu/class/cs106b/lair
## What are LaIR helper hours? LaIR helper hours are an additional set of office hours staffed by our awesome fleet of section leaders. At the LaIR, students can get individual help with debugging and conceptual questions. ## Logistics 1. The LaIR help queue is open Sunday-Thursday this quarter. To sign up for help, add your request to the queue using the [LaIR signup page](https://cs198.stanford.edu/lair). 2. You will use the "ohyay" video platform to join your 1-on-1 meeting with the section leader that is assigned to service your help request. In order to get help, you should navigate to the link for LaIR ([cs198.stanford.edu/lair][lair_signup]) and log in with your Stanford credentials. This will lead you to a video platform called "ohyay," where all LaIR sessions will be hosted this quarter. When you click on the link, there will be a landing page that explains how to use ohyay and how to sign up in the queue to get help. ## Weekly schedule LaIR runs __Sunday-Thursday, 7-11PM Pacific Time__ ## Common questions about LaIR --- ### I tried to sign up for the LaIR helper queue, but the queue was closed, even though now is within the open hours for the LaIR. In times of peak demand, the helpers may need to close the queue to new requests before the end of the open hours. We do this to ensure that we have sufficient resources to assist all students in the queue before the end of the shift.
http://cs106b.stanford.edu/class/cs106b/late
_Hofstadter's Law: "It always takes longer than you think, even when you take Hofstadter's Law into account."_ The assignment deadline policy this quarter has been designed with built-in flexibility in mind. With that being said, here are the details of the assignment deadline and late policy this quarter: * All assignments this quarter will have a published deadline date, which means that the assignment must be submitted by 11:59pm in the [Pacific time zone][PT] on that date. __Submission by this deadline will earn a small on-time bonus.__ These bonuses will be added into your course grade at the end of the quarter and will not be reflected in your individual bucket grades on the assignment. * __All students will be granted a penalty-free 24- or 48-hour "grace period" for submission on all assignments (except possibly the final one).__ The grace period allows you to submit the assignment after the original deadline, with no impact on your final grade. For example, for an assignment with a July 10 submission deadline and a 24-hour grace period, all students may submit until 11:59pm on July 11 in the [Pacific time zone][PT] and still receive full credit for their work. This grace period is meant to give built-in flexibility for any unexpected snags - however, we strongly recommend that students submit by the original deadline if possible, in order to avoid falling behind on the class cadence. * __Late submissions are not accepted after the grace period expires__, unless there have been arrangements made for an exceptional situation as described below. The grace period is effectively an automatic extension that has been approved in advance, to accommodate you when minor life emergencies arise (laptop dies, get a cold, etc). Given that, additional extensions beyond the grace period are generally not granted, except in cases of truly exceptional major emergencies. 
Such needs will be considered on a case-by-case basis and __handled by the Head TA (do not email your section leader)__. ## Common questions about late policy --- ### Does the deadline change depending on my time zone? No, deadlines are the same for all students. For this course, we express all times in the [Pacific time zone][PT]. You can assume any time in our course materials, website, and tools (Paperless, etc) refers to Pacific time. ### If I make a first submission by the deadline, can I use the grace period to make a later submission or add an extension? If you make multiple submissions, we grade the last one. If the last submission is during the grace period, you forfeit any on-time bonus. The grace period is intended to provide flexibility when unexpected circumstances prevent you from finishing the assignment by the deadline. Late submissions put extra pressure on your section leader to return feedback to you in a timely manner. We ask that you use the grace period only when your situation requires it.
http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/maze.html
A _maze_ is a twisty and convoluted arrangement of corridors that challenges the solver to find a path from the entry to the exit. This assignment task is about using ADTs to represent, process, and solve mazes.

![animation of BFS search on maze](img/solve_animating.gif)
{: .w-50 .mx-auto .border}

## An introduction to mazes

Labyrinths and mazes have fascinated humans since ancient times (remember [Theseus and the Minotaur](https://www.youtube.com/watch?v=8qrZ1clEp-Y)?), but mazes can be more than just recreation. The mathematician Leonhard Euler was one of the first to analyze mazes mathematically, and in doing so, he founded the branch of mathematics known as topology. Many algorithms that operate on mazes are closely related to graph theory and have applications to diverse tasks such as designing circuit boards, routing network traffic, motion planning, and social networking.

Your goal for this part of the assignment is to implement a neat algorithm to solve a maze, while gaining practice with ADTs. Before you jump into code, please carefully read this background information on how we will represent mazes, which ADTs to use, and the format of the provided maze data files.

### A Grid represents a maze

A maze can be modeled as a two-dimensional array where each element is either a wall or a corridor.

- The `Grid` from the Stanford library is an ideal data structure for this. A maze is represented as a `Grid<bool>`.
- Each grid element is one "cell" of the maze. The boolean value for each element indicates whether that cell is a corridor.
- The element `grid[row][col]` is `true` when there is an open corridor at location (row, col) and `false` if it is a wall.

A maze is read from a text file. Each line of the file corresponds to one row of the maze. Within a row, the character `@` is used for walls, and `-` is used for corridors.
Here is a sample 5x7 maze file (5 rows by 7 columns):

```output
-------
-@@@@@-
-----@-
-@@@-@-
-@---@-
```

Our starter code provides the `readMazeFile` function that reads the above text file into a `Grid<bool>`. In the project "`Other files/res`" folder are a number of provided maze files.

### A GridLocation is a row/col pair

The `GridLocation` struct is a companion type used to represent a row/col location in a `Grid`. A `GridLocation` has two fields, `row` and `col`, that are packaged together into one aggregate type. The sample code below demonstrates using a `GridLocation` variable and assigning and accessing its `row` and `col` fields.

```
// Declare a new GridLocation
GridLocation chosen;   // default initialized to 0,0
chosen.row = 3;        // assign row of chosen
chosen.col = 4;        // assign col of chosen

// Initialize row,col of exit as part of its declaration
GridLocation exit = { maze.numRows()-1, maze.numCols()-1 }; // last row, last col

// You can use a GridLocation to index into a Grid (maze is a Grid<bool>)
if (maze[chosen])      // chosen was set to {3, 4} so this accesses maze[3][4]
    ...

// You can directly compare two GridLocations
if (chosen == exit)
    ...

// You can also access a GridLocation's row,col separately
if (chosen.row == 0 && chosen.col == 0)
    ...
```

### A Stack of GridLocations is a path

A path through the maze is a sequence of connecting `GridLocation`s. The sequence is stored in a `Stack<GridLocation>`. To extend a path, push an additional `GridLocation` onto the stack. Peeking at the top of the stack accesses the end location of the path. A valid maze solution path must start at the maze entrance (bottom of stack), travel through connecting `GridLocation`s, and end at the maze exit (top of stack).

Along with the maze text files in the "`Other files/res`" folder, there are `.soln` text files that contain maze solutions. For example, this is the solution file corresponding to the 5x7 maze shown previously.
(The elements in a stack are written in order from bottom to top.)

```output
{r0c0, r0c1, r0c2, r0c3, r0c4, r0c5, r0c6, r1c6, r2c6, r3c6, r4c6}
```

Our starter code provides the `readSolutionFile` function that reads a solution text file into a `Stack<GridLocation>`. Not every maze will have a corresponding `.soln` file, and part of your job is to write code to generate solutions!

### ADT resources

- Documentation for Stanford collections: [Grid], [GridLocation], [Stack], [Queue]
- Section 6.1 of the [textbook] introduces `struct` types
- In the lower-left corner of your Qt Creator window, there is a magnifying glass by a search field. If you type the name of a header file such as `grid.h` or `gridlocation.h`, Qt will display the corresponding header file.

### Notes on Stanford collection types

- The assignment operator for our ADTs makes a deep copy. Assigning from one Stack, Vector, Set, ... to another creates a copy of the ADT and a copy of all its elements.
- Take care to properly specialize a template type and ensure all types match. Using `Vector` without specifying the element type just won't fly, and a `Vector<int>` is not the same thing as a `Vector<string>`. The error messages you receive when you have mismatches can be cryptic and hard to interpret. Bring your template woes to LaIR or the Ed forum, and we can help untangle them with you.

## On to the code!

Your work is structured in three tasks: two helper functions that process grids and paths and a final function to solve the maze. You will also be writing comprehensive test cases for your functions.

All of the functions and tests for mazes are to be implemented in the file `maze.cpp`.
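To make the "stack as path" and deep-copy notes concrete, here is a minimal sketch. It assumes standard-library stand-ins rather than the Stanford types: `std::stack` plays the role of `Stack`, and the `Loc` struct and `extendPath` function are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <stack>

// Hypothetical stand-in for the Stanford GridLocation struct.
struct Loc {
    int row;
    int col;
};

// Returns a copy of `path` extended by one location. Because `path` is passed
// by value, the caller's stack is copied element by element and stays unchanged.
std::stack<Loc> extendPath(std::stack<Loc> path, Loc next) {
    assert(!path.empty());  // a path always contains at least the entry location
    path.push(next);        // the top of the stack is the end of the path
    return path;
}
```

Calling `extendPath` on a one-element path gives back a two-element path while the original still has one element, which is the same copy behavior the notes above describe for assigning one ADT to another.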
## 1) Write `generateValidMoves()`

Your first task is to implement the helper function to generate the neighbors for a given location:

```
Set<GridLocation> generateValidMoves(Grid<bool>& maze, GridLocation cur)
```

Given a maze represented as a `Grid` of `bool` and a current cell represented by the `GridLocation cur`, this function returns all valid moves from `cur` as a `Set<GridLocation>`. Valid moves are those GridLocations that are:

- Exactly one "step" away from `cur` in one of the four cardinal directions (N, S, E, W)
- Within bounds for the `Grid`
- An open corridor, not a wall

There are a few provided tests for `generateValidMoves`, but these tests are not fully comprehensive. __Write at least 3-4 additional tests to make sure your helper function works correctly.__ Remember to label your tests as `STUDENT_TEST`.

## 2) Write `validatePath()`

Next you'll write a function to confirm that a path is a valid maze solution:

```
void validatePath(Grid<bool>& maze, Stack<GridLocation> path)
```

![rectangular maze with green dotted path leading from entry to exit](img/solved.png)
{: .w-50 .mx-auto}

The image above displays a valid solution; each green dot marks a GridLocation along the path. A path is a valid solution to a maze if it meets the following criteria:

- The path must start at the entry (upper left corner) of the maze
- The path must end at the exit (lower right corner) of the maze
- Each location in the path is a valid move from the previous location in the path
    - Hint: rather than re-implement the same logic you already did for `generateValidMoves`, simply call that function and check whether a move is contained in the set of valid moves.
- The path must not contain a loop, i.e. the same location cannot appear more than once in the path
    - Hint: a `Set<GridLocation>` is a good data structure for tracking seen locations and avoiding a repeat.

If `validatePath` detects that a path violates one of the above constraints, use the `error` function from [error.h] to report what is wrong.
When you call `error`, it stops the program and reports the message you passed as an argument:

```
error("Here is my message about what has gone wrong");
```

If all of the criteria for a valid solution are met, then `validatePath` completes normally.

{% question 4 %}
In lecture, Cynthia noted the convention is to pass large data structures by reference for reasons of efficiency. Why then do you think `validatePath` passes `path` by value instead of by reference?
{% endquestion %}

This function has quite a few different things to confirm and will need rigorous tests to verify all of its functionality. We've provided some initial tests to get you started, but you will need to write tests of your own. __Write at least 3-4 student test cases for `validatePath`.__

After writing `validatePath`, not only will you be familiar with using the ADTs to represent mazes, but you will also have a function that makes testing the next milestone much easier. Having thoroughly tested your `validatePath` on a variety of invalid paths means you can be confident that it is the oracle of truth when it comes to confirming a solution. Your future self will thank you!

Note the use of the new test types `EXPECT_ERROR` and `EXPECT_NO_ERROR` in our provided tests. An `EXPECT_ERROR` test case evaluates an expression, expecting that the operation will raise an error. While the test is executing, the SimpleTest framework will _catch_ the error, record that it happened, and resume. Because the error was expected, the test case is marked as `Correct`. If an error was expected but didn't materialize, the test case fails. This is the opposite of an `EXPECT_NO_ERROR` test case, which expects the code to run without errors and treats a raised error as a test failure. More information on the different test macros and how they all work can be found in the [CS106B Testing Guide][testing_guide].
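If the "catch the error, record it, and resume" idea feels abstract, the sketch below shows the general pattern with plain C++ exceptions. It is only an analogy under stated assumptions: `reportError` and `raisesError` are hypothetical names, and this is not how SimpleTest or the library's `error` function is actually written.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for error(): reports a problem by throwing an exception.
void reportError(const std::string& msg) {
    throw std::runtime_error(msg);
}

// Runs `op` and reports whether it raised an error. The exception is caught
// here, so the caller keeps running either way.
bool raisesError(const std::function<void()>& op) {
    assert(op);  // the caller must supply something to run
    try {
        op();
    } catch (const std::runtime_error&) {
        return true;   // an error was raised and caught
    }
    return false;      // op completed normally
}
```

In this analogy, an "expect error" check passes when `raisesError` returns `true`, and an "expect no error" check passes when it returns `false`.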
{% question %}
After you have written your tests, describe your testing strategy to determine that your `validatePath` works as intended.
{% endquestion %}

## 3) Write `solveMaze()`

Now you're ready to tackle the final task, writing the function to find a solution path for a maze:

```
Stack<GridLocation> solveMaze(Grid<bool>& maze)
```

Solving a maze can be seen as a specific instance of a shortest path problem, where the challenge is to find the shortest route from the entrance to the exit. Shortest path problems come up in a variety of situations such as packet routing, robot motion planning, analyzing gene mutations, spell correction, and more.

__Breadth-first search__ (BFS) is a classic and elegant algorithm for finding a shortest path. A breadth-first search reaches outward from the entry location in a radial fashion until it finds the exit. The first paths examined take one hop from the entry. If any of these reach the exit location, you're done. If not, the search expands to those paths that are two hops long. At each subsequent step, the search expands radially, examining all paths of length three, then of length four, etc., stopping at the first path that reaches the exit.

Breadth-first search is typically implemented using a queue. The queue stores partial paths that represent possibilities to explore. The first paths enqueued are all length one, followed by enqueuing the length two paths, and so on. Because of the queue's FIFO behavior, all shorter paths are dequeued and processed before the longer paths make their way to the front of the queue. This means the paths are tried in order of increasing length.

At each step, the algorithm considers the current path at the front of the queue. If the current path ends at the exit, it must be a completed solution path. If not, the algorithm takes the current path, extends it to reach locations that are one hop further away in the maze, and enqueues those extended paths to be examined in later rounds.
Here are the steps followed by a breadth-first search:

1. Create a queue of paths. Each path is a `Stack<GridLocation>` and the queue of paths is of type `Queue<Stack<GridLocation>>`.
    - A nested ADT type looks a little scary at first, but it is just the right tool for this job!
2. Create a length-one path containing just the entry location. Enqueue that path.
    - In our mazes, the entry is always the location in the upper-left corner, and the exit is the lower-right.
3. While there are still more paths to explore:
    - Dequeue the current path from the queue.
    - If the current path ends at the exit:
        + You're done. The current path is the solution!
    - Otherwise:
        - Determine the viable neighbors from the end location of the current path. A viable neighbor is a valid move that has not yet been visited during the search.
        - For each viable neighbor, make a copy of the current path, extend it by adding the neighbor, and enqueue it.

A couple of notes to keep in mind as you're implementing BFS:

- You may make the following assumptions:
    + The maze is well-formed, rectangular, and the grid locations at upper-left (entry) and lower-right (exit) are both open corridors.
    + The maze has a solution.
- The search should not revisit previously visited locations or create a path with a cycle. For example, if the current path leads from location `r0c0` to `r1c0`, you should not extend the path by moving back to location `r0c0`.
- You should not call `validatePath` within your `solveMaze` function, but you can call it in your test cases to confirm the validity of paths found by `solveMaze`.
- A strategy that will help with visualizing and debugging `solveMaze` is to add calls to our provided graphic animation functions, explained below.

We have provided some existing tests that put your `solveMaze` function to the test. __Add 2-3 more tests that verify the correct functionality of the `solveMaze` function.__ Note there are a number of additional maze files in the `res` folder that can be useful. You can also create your own files.
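The queue-of-paths idea in the steps above can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not the assignment's required form: `std::vector`, `std::queue`, and `std::set` stand in for the Stanford `Grid`, `Stack`, `Queue`, and `Set`, and the names `Loc`, `Maze`, `Path`, and `bfsSketch` are made up for this sketch.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <utility>
#include <vector>

using Loc = std::pair<int, int>;              // (row, col), stand-in for GridLocation
using Maze = std::vector<std::vector<bool>>;  // true = open corridor, stand-in for Grid<bool>
using Path = std::vector<Loc>;                // entry first, current end last

// Breadth-first search from the upper-left cell to the lower-right cell.
// Returns a shortest path, or an empty path if the exit cannot be reached.
Path bfsSketch(const Maze& maze) {
    assert(!maze.empty() && !maze[0].empty());
    const int rows = static_cast<int>(maze.size());
    const int cols = static_cast<int>(maze[0].size());
    const Loc entry{0, 0};
    const Loc exitLoc{rows - 1, cols - 1};

    std::queue<Path> toExplore;
    std::set<Loc> visited{entry};
    toExplore.push(Path{entry});              // length-one path holding just the entry

    const int dRow[] = {-1, 1, 0, 0};         // N, S, E, W offsets
    const int dCol[] = {0, 0, 1, -1};

    while (!toExplore.empty()) {
        Path current = toExplore.front();     // shortest unexplored path so far
        toExplore.pop();
        const Loc last = current.back();
        if (last == exitLoc) {
            return current;
        }
        for (int i = 0; i < 4; i++) {
            const Loc next{last.first + dRow[i], last.second + dCol[i]};
            const bool inBounds = next.first >= 0 && next.first < rows &&
                                  next.second >= 0 && next.second < cols;
            if (inBounds && maze[next.first][next.second] && visited.count(next) == 0) {
                visited.insert(next);         // never enqueue the same cell twice
                Path extended = current;      // copy the path, then extend by one hop
                extended.push_back(next);
                toExplore.push(extended);
            }
        }
    }
    return {};
}
```

One design choice worth noting: this sketch marks a cell as visited when it is enqueued rather than when it is dequeued, which keeps duplicate paths to the same cell out of the queue. On the sample 5x7 maze shown earlier it finds the 11-location path that runs along the top row and down the last column.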
### Add graphics functions to draw the maze

The Stanford C++ library includes extensive functionality for drawing and user interaction, but we generally won't ask you to dig into those features. Instead, we will supply any needed graphics routines pre-written in a simple, easy-to-use form.

For this program, the provided `mazegraphics.cpp` module has functions to display a maze and highlight a path through the maze. Read the `mazegraphics.h` header file in the starter project for details of these functions:

- `MazeGraphics::drawGrid(Grid<bool>& grid)`
- `MazeGraphics::highlightPath(Stack<GridLocation> path, string color, int mSecsToPause = 0)`

To call these functions, you must refer to them using their full name (including the weird looking `MazeGraphics::` prefix). We will talk more about what this notation means a little later in the quarter!

Call `MazeGraphics::drawGrid` once at the start of `solveMaze` to draw the base maze and then repeatedly call `MazeGraphics::highlightPath` to mark the cells along the path currently being explored. You can use the optional pause argument to slow down the animation to help you understand the algorithm and debug its operation. This gives you time to watch the animation, see how the path is updated, and follow along with the algorithm as it searches for a solution.

Now sit back and watch the show as your clever algorithm finds the path to escape any maze you throw at it! Congratulations!

## References

There are many interesting facets to mazes and much fascinating mathematics underlying them. Mazes will come up again several times this quarter. Chapter 9 of the [textbook] uses a recursive depth-first search as a path-finding algorithm. At the end of the quarter when we talk about graphs, we'll explore the equivalence between mazes and graphs and note how many of the interesting results in mazes are actually based on work done in terms of graph theory.

- Walter Pullen, _Maze Classification_.
Website with lots of great info on mazes and maze algorithms
- Jamis Buck. _Maze Algorithms_. Fun animations of maze algorithms. He also wrote the excellent book [Mazes for Programmers: Code Your Own Twisty Little Passages](https://www.amazon.com/gp/product/1680500554/).

## Extensions

If you have completed the assignment and want to explore further, here are some ideas for extensions.

- Instead of reading pre-written mazes from a file, you could instead generate a new random maze on demand. There is an amazing (I could not resist...) variety of algorithms for maze construction, ranging from the simple to the sublime. Here are a few names to get you started: backtracking, depth-first, growing tree, sidewinder, along with algorithms named for their inventor: Aldous-Broder, Eller, Prim, Kruskal, Wilson, and many others.
- Try out other maze solving algorithms. How does BFS stack up against the other options in terms of solving power or runtime efficiency? Take a look at random mouse, wall following, depth-first search (either manually using a Stack or using recursion once you get the hang of it), Bellman-Ford, or others.
- There are many other neat maze-based games out there that make fun extensions. You might gather ideas from Robert Abbott's mazes or design a maze game board after inspiration from [Chris Smith on mazes](https://medium.com/analytics-vidhya/optimizing-a-maze-with-graph-theory-genetic-algorithms-and-haskell-e3702dd6439f).
http://cs106b.stanford.edu/class/cs106b/assignments/1-cpp/perfect.html
This warmup task gives you practice with C++ expressions, control structures, and functions, as well as testing and debugging your code.

Throughout this exercise, we have posed thought questions (in the highlighted yellow boxes) for you to answer. The starter project includes the file `short_answer.txt` (located under "Other Files" in the Qt Project pane) where you are to fill in your answers. Keep in mind that we ask these questions not so as to judge your response as "right" or "wrong", but instead to hear your own reflection and reasoning about your work.

## Perfect numbers

This exercise explores a type of number called a _perfect number_. Before we jump into the coding, let's begin with a little math and history.

A _perfect number_ is an integer that is equal to the sum of its proper divisors. A number's proper divisors are all of the positive numbers that evenly divide it, excluding itself. The first perfect number is `6` because its proper divisors are `1`, `2`, and `3`, and `1 + 2 + 3 = 6`. The next perfect number is `28`, which equals the sum of its proper divisors: `1 + 2 + 4 + 7 + 14`.

Perfect numbers are an interesting case study at the intersection of mathematics, number theory, and history. The [rich history of perfect numbers](https://mathshistory.st-andrews.ac.uk/HistTopics/Perfect_numbers/) is a testament to how much these numbers have fascinated humankind through the ages. Now with our coding superpowers, we can explore several different algorithms for calculating these beloved numbers!

## An exhaustive algorithm

One approach to finding perfect numbers is using an _exhaustive_ algorithm. An exhaustive search operates by brute force, trying every possible option until finding an answer or exhausting all the possibilities. For perfect numbers, the search tries every integer starting at 1 and counting upwards, testing each value to find those that are perfect.
Testing whether a value is perfect involves another loop to find those numbers which divide the value and add those divisors to a running sum. If that sum and the original number are equal, then you've found a perfect number!

Here is some Python code that performs an exhaustive search for perfect numbers:

```python
def divisor_sum(n):
    total = 0
    for divisor in range(1, n):
        if n % divisor == 0:
            total += divisor
    return total

def is_perfect(n):
    return n != 0 and n == divisor_sum(n)

def find_perfects(stop):
    for num in range(1, stop):
        if is_perfect(num):
            print("Found perfect number: ", num)
        if num % 10000 == 0:
            print('.', end='',flush=True) # progress bar
    print("Done searching up to ", stop)
```

The Python code from above is re-expressed in C++ below. If your CS106A was taught in Python, comparing and contrasting these two may be a helpful way to start adapting to the language differences. If instead your prior course/experience was with Java/Javascript, just sit back and enjoy how C++ already seems familiar to what you know!

```c++
/* This function takes one argument `n` and calculates the sum
 * of all proper divisors of `n` excluding itself. To find divisors
 * a loop iterates over all numbers from 1 to n-1, testing for a
 * zero remainder from the division.
 *
 * Note: long is a C++ type that is a variant of int and allows for a
 * larger range of values. For all intents and purposes, you can
 * treat it like you would an int.
 */
long divisorSum(long n) {
    long total = 0;
    for (long divisor = 1; divisor < n; divisor++) {
        if (n % divisor == 0) {
            total += divisor;
        }
    }
    return total;
}
```

... N, since we only have to inspect &radic;N divisors for every number along the way. If you plot runtimes on the same graph as before, you will see that they grow much less steeply than the runtimes from our previous experiment.

{% question %}
Make a prediction: how long will `findPerfectsSmarter` take to reach the fifth perfect number?
{% endquestion %}

## Mersenne primes and Euclid

Back to story time: in 2018, there was [a rare finding of a new Mersenne prime](https://www.mersenne.org/primes/press/M82589933.html). A _Mersenne number_ is a number that is one less than a power of two, i.e., of the form 2<sup>k</sup>-1 for some integer `k`. A _prime number_ is one whose only divisors are 1 and the number itself. A Mersenne number that is prime is called a _Mersenne prime_. The Mersenne number 2<sup>5</sup>-1 is 31 and 31 is prime, making it a Mersenne prime.

Mersenne primes are quite elusive; the most recent one found is only the 51st known and has almost 25 million digits! Verifying that the found number was indeed prime required almost two weeks of non-stop computing. The quest to find further Mersenne primes is driven by the [Great Internet Mersenne Prime Search (GIMPS)](https://www.mersenne.org/), a cooperative, distributed effort that taps into the spare cycles (computing power) of a vast community of volunteer machines. Besides being hard to find, Mersenne primes are of great interest because they are some of the [largest known prime numbers](https://en.wikipedia.org/wiki/Largest_known_prime_number) and because they show up in interesting ways in games like the [Tower of Hanoi](https://en.wikipedia.org/wiki/Tower_of_Hanoi) and the [wheat and chessboard problem](https://en.wikipedia.org/wiki/Wheat_and_chessboard_problem).

Around 300 BCE, Euclid discovered an intriguing one-to-one relationship between perfect numbers and the Mersenne primes. Specifically, if the Mersenne number 2<sup>k</sup>-1 is prime, then 2<sup>(k-1)</sup> * (2<sup>k</sup>-1) is a perfect number. The number 31 is a Mersenne prime where k = 5, so Euclid's relation applies and leads us to the number 2<sup>4</sup> * (2<sup>5</sup>-1) = 496, which is a perfect number. Neat!

Those of you enrolled in CS103 will appreciate this lovely [proof of the Euclid-Euler theorem](https://primes.utm.edu/notes/proofs/EvenPerfect.html).
## Turbo-charging with Euclid

The final task of the warmup is to leverage the cleverness of Euclid to implement a blazingly-fast alternative algorithm to find perfect numbers that will beat the pants off of exhaustive search. Buckle up!

Your job is to implement the function:

`long findNthPerfectEuclid(long n)`
{: .text-center}

In our previous exhaustive search algorithm, we examined every number from 1 upwards, testing each to identify those that are perfect. Taking Euclid's approach, we instead hop through the numbers by powers of two, checking each Mersenne number to see if it is prime, and if so, calculate its corresponding perfect number. The general strategy is outlined below:

1. Start by setting `k = 1`.
2. Calculate m = 2<sup>k</sup>-1 (Note: C++ has no exponentiation operator, instead use library function [`pow`][pow])
3. Determine whether `m` is prime or composite. (Hint: a prime number has a `divisorSum` equal to one. Code reuse is your friend!)
4. If `m` is prime, then the value 2<sup>(k-1)</sup> * (2<sup>k</sup>-1) is a perfect number. If this is the `n`th perfect number you have found, stop here.
5. Increment `k` and repeat from step 2.

The call `findNthPerfectEuclid(n)` should return the `n`th perfect number. What will be your testing strategy to verify that this function returns the correct result? You may find this [table of perfect numbers](https://en.wikipedia.org/wiki/List_of_perfect_numbers) to be helpful. One possibility is a test case that confirms each number is perfect according to your earlier function, e.g. `EXPECT(isPerfect(findNthPerfectEuclid(n)))`.

__Note:__ The `findNthPerfectEuclid` function can assume that all inputs (values of `n`) will be positive (that is, greater than 0). In particular, this means that you don't have to worry about negative values of `n` or the case where `n` is zero.
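To see the five steps of the strategy in motion, here is one possible sketch. It makes a few assumptions that differ from the handout: it computes powers of two with a bit shift instead of `pow`, tests primality by trial division rather than with `divisorSum`, and uses the hypothetical names `isPrimeSketch` and `nthPerfectSketch`. It also restricts `n` to 1 through 8 so that every value fits in a 64-bit `long long`.

```cpp
#include <cassert>

// Trial-division primality test, adequate for the small Mersenne numbers used here.
bool isPrimeSketch(long long m) {
    if (m < 2) {
        return false;
    }
    for (long long d = 2; d * d <= m; d++) {
        if (m % d == 0) {
            return false;
        }
    }
    return true;
}

// Walks k upward, tests the Mersenne number 2^k - 1 for primality, and counts
// each perfect number produced by Euclid's relation until the nth is reached.
long long nthPerfectSketch(int n) {
    assert(n >= 1 && n <= 8);  // keeps every intermediate value inside 64 bits
    int found = 0;
    for (int k = 1; ; k++) {
        const long long m = (1LL << k) - 1;    // m = 2^k - 1
        if (isPrimeSketch(m)) {
            found++;
            if (found == n) {
                return (1LL << (k - 1)) * m;   // 2^(k-1) * (2^k - 1)
            }
        }
    }
}
```

For example, `nthPerfectSketch(3)` stops at k = 5, where m = 31 is prime, and returns 16 * 31 = 496.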
__Add at least 4 new student test cases of your own to verify that your `findNthPerfectEuclid` works correctly.__

{% question %}
Explain how you chose your specific test cases and why they lead you to be confident `findNthPerfectEuclid` is working correctly.
{% endquestion %}

Try a call to `findNthPerfectEuclid(5)` and it will near instantaneously find the fifth perfect number. Woah! Quite an improvement over the exhaustive algorithm, eh?

But if you try for the sixth, seventh, eighth, and beyond, you can run into a different problem of scale. The super-fast algorithm works in the blink of an eye, but the numbers begin to have so many digits that they eventually exceed the capability of the `long` data type. (`long` can hold a maximum of 10 to 20 digits depending on your system. When a number gets too large, the value will unexpectedly go negative &mdash; how wacky is that? Take CS107 to learn more.) So much for the invincibility of modern computers... As a point of comparison, Euclid worked out his rule around 300 BCE, and the first four perfect numbers were already known in antiquity - not too shabby for folks with no electronics!

## Warmup conclusions

Hooray for algorithms! One of the themes for CS106B is the tradeoffs between algorithm choices and program efficiency. The difference between the exhaustive search and Euclid's approach is striking. Although there are tweaks (such as the square root trick) that will improve each algorithm relative to itself, the biggest bang for the buck comes from starting with a better overall approach. This result foreshadows many of the interesting things to come this quarter.

Before moving on to the second part of the assignment, confirm that you have completed all tasks from the warmup. You should have answers to questions 1 to 9 in `short_answer.txt`.
You should have implemented the following functions in `perfect.cpp`

* `smarterSum`
* `isPerfectSmarter`
* `findPerfectsSmarter`
* `findNthPerfectEuclid`

as well as added the requisite number of tests for each of these functions. Your code should also have thoughtful comments, including an overview header comment at the top of the file.

In the header comment for `perfect.cpp`, share with your section leader a little something about yourself and offer an interesting tidbit you learned in doing this warmup (be it something about C++, algorithms, number theory, how spunky your computer is, or some other neat insight).
http://cs106b.stanford.edu/class/cs106b/assessments/refsheet.html
We often distribute a reference sheet during exams that gives the syntax of common operations so that students do not need to memorize these details. Below is a sample such reference sheet. You may want to have this on hand when practicing or taking an exam.

This list isn't comprehensive - for that, read the header files (available in Qt Creator) or view the online docs for [standard C++ functions][cpp_doc] or the [Stanford C++ libraries][stanford_doc].

### Standard string

```
s.at(index) or s[index]         // access single char
s + text OR s += text
s.substr(start)                 // return new substring
s.substr(start, length)
s.find(key)                     // returns index or string::npos
s.erase(index, length)          // erase, insert, replace destructively modify s
s.insert(index, text)
s.replace(index, length, text)
for (char ch: s) { ... }        // chars visited in order
```

### Stanford strlib.h

```
if (equalsIgnoreCase(s1, s2)) { ... }
if (endsWith(str, suffix)) { ... }
if (startsWith(str, prefix)) { ... }
if (stringContains(str, key)) { ... }
integerToString(int)
stringToInteger(str)
realToString(double)
stringToReal(str)
stringSplit(str, delimiter)     // returns Vector of tokens
toLowerCase(str)
toUpperCase(str)
trim(str)                       // remove leading/trailing whitespace
```

### Vector

```
Vector<T> v = {elem, elem, ... , elem};
v.add(elem) OR v += elem
v.clear()
v.get(index) OR v[index]
v.insert(index, elem)
if (v.isEmpty()) { ... }
v.remove(index)
v.set(index, elem) OR v[index] = elem
v.size()
for (T elem: v) { ... }         // visited in order of index
```

### Grid

```
Grid<T> grid(nRows, nCols);
int numCells = grid.numRows() * grid.numCols()
if (grid.inBounds(row, col)) { ... }
grid[row][col] = value
for (T elem: grid) { ... }      // visited left-to-right, top-to-bottom
```

### GridLocation

```
GridLocation loc = {row, col};
grid[loc] = value
if (grid.inBounds(loc)) { ... }
GridLocation neighbor = {loc.row + 1, loc.col - 1};
if (loc == neighbor) { ... }
```

### Stack

```
Stack<T> s = {bottom, ... , top};
s.push(elem)
s.peek()                        // Look at top
s.pop()                         // Removes top
s.size()
if (s.isEmpty()) { ... }
s.clear()
// no iterator, no direct access to elements other than top
```

### Queue

```
Queue<T> q = {front, ... , back};
q.enqueue(elem)
q.peek()                        // Look at front
q.dequeue()                     // Removes front
q.size()
if (q.isEmpty()) { ... }
q.clear()
// no iterator, no direct access to elements other than front
```

### Set

```
Set<T> s = {elem1, elem2, ... , elemN};
if (s.contains(elem)) { ... }
s.add(elem)     s + elem;  s + set2;    // union
s.remove(elem)  s - elem;  s - set2;    // difference
intersectSet = s1 * s2;
s.size()
if (s.isEmpty()) { ... }
if (s.isSubsetOf(s2)) { ... }
s.clear()
// Visits elements in sorted order
for (T elem: set) { ... }
```

### Map

```
Map<K, V> m = { { key1, val1}, ... {keyN, valN } };
m.put(key, value) OR m[key] = value
m.get(key) OR m[key]            // bracket form auto-inserts default value if not present
if (m.containsKey(key)) { ... }
m.size()
if (m.isEmpty()) { ... }
m.remove(key)
m.clear()
Vector<K> keys = m.keys();
// Visits keys in sorted order
for (K key: m) { ... }
```

### Lexicon

```
Lexicon lex(filename);
if (lex.contains(word)) { ... }
if (lex.containsPrefix(prefix)) { ... }
```
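As a quick illustration of how a few of the standard string operations from the top of this sheet combine, here is a small sketch. The helper names `textAfter` and `replaceFirst` are hypothetical and exist only for this example; the calls they make (`find`, `substr`, `replace`, `string::npos`) are the standard ones listed above.

```cpp
#include <cassert>
#include <string>

// Returns the text after the first occurrence of `key`, or "" if `key` is absent.
std::string textAfter(const std::string& s, const std::string& key) {
    assert(!key.empty());                    // an empty key would match at index 0
    const std::size_t pos = s.find(key);
    if (pos == std::string::npos) {
        return "";
    }
    return s.substr(pos + key.size());       // substr(start) runs to the end of s
}

// Replaces the first occurrence of `from` with `to`. The parameter `s` is a
// copy, so replace() modifies the copy and leaves the caller's string alone.
std::string replaceFirst(std::string s, const std::string& from, const std::string& to) {
    const std::size_t pos = s.find(from);
    if (pos != std::string::npos) {
        s.replace(pos, from.size(), to);
    }
    return s;
}
```

For instance, `textAfter("key=value", "=")` yields `"value"`, and `replaceFirst("maze.txt", ".txt", ".soln")` yields `"maze.soln"`.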
http://cs106b.stanford.edu/class/cs106b/schedule.html
This is a preview of our planned schedule. We will update this schedule as we go. This page should faithfully describe the past, but it won't always accurately predict the future. {% include schedule.html %}
http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/searchengine.html
## An introduction to search engines

![google search home page](img/search.png)
{: .w-50 .mx-auto }

Search engines are one of the most influential developments of the modern internet age and have completely revolutionized how users interact with the web. What was once an intractable jumble of data with no semblance of organization has become a treasure trove of useful information due to the magic of search. Armed with the power of `Map` and `Set`, you can recreate this phenomenal achievement yourself. Once completed, you will understand the core technology underpinning Spotlight, Google, and Bing, and have solid practice in the use of the classic ADTs `Map` and `Set`.

Want to see the power of search right now? Click Search in the top-right of the page navigation bar and search for __easter egg__ to see what lurks deep in the recesses of the course website...

In your search engine, each web page has a _URL_ ("Uniform Resource Locator") that serves as its unique id and a string containing the body text of the page. You will first write functions that pre-process the body text and populate the data structure. Next you'll implement the function to search for pages matching a search query. Finally, you will write a console program that allows the user to enter many search queries and retrieve the matching web pages. Put all together, you will have built your own mini search engine!

### Understanding the web page data

We have harvested the body text from the pages of our course website into a database file. The format of the database file is as follows:

- The lines of the file are grouped into pairs with the following structure:
    - The first line of a pair is a page URL.
    - The second line of a pair is the body text of that page, with all newlines removed (the entire text of the page in a single string).
- The first two lines in the file form the first pair.
The third and fourth lines form another pair, the fifth and sixth another, and so on, with alternating lines of page URL and page content.

To view an example database, open the file `tiny.txt` or `website.txt` in the folder `Other files/res` of the starter project.

### Using an inverted index for searching

To make our search efficient, we need to be thoughtful about how we structure and store the data. A poor choice in data structures would make search painfully slow, but clever use of our wonderful ADTs can avoid that fate.

A search engine typically uses a nifty arrangement known as an _inverted index_. An inverted index is akin to the typical index in the back of a book. If you look up the keyword "internet" in the index of the CS106B [textbook], it lists two page numbers, 18 and 821. The word internet occurs on page number 18 and again on page number 821. Thus, an inverted index is a mapping from word to locations. You look up a keyword in an inverted index to find all the locations where that keyword occurs.

In contrast, a _forward index_ operates in the other direction. For our book example, the forward index would be a mapping from a location (page number) to all the words that occur on that page. To build an inverted index, you typically start with the data in the form of a forward index and then process to convert to the inverted form. Inverting the index takes time up front, but once complete, the inverted index supports extremely fast operations to find query matches.

## On to the code!

Decomposition is one of your best tools for tackling a complex problem. We'll guide you along by breaking the problem down into a sequence of steps. Follow our steps to success!

All of the functions and tests for the search engine are to be implemented in the file `search.cpp`.
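The forward-versus-inverted distinction can be sketched in a few lines, using the book example from above. This is an illustration with assumed names: `std::map` and `std::set` stand in for the Stanford `Map` and `Set`, and `ForwardIndex`, `InvertedIndex`, and `invert` are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>

using ForwardIndex = std::map<int, std::set<std::string>>;   // page number -> words on that page
using InvertedIndex = std::map<std::string, std::set<int>>;  // word -> pages where it occurs

// Converts a forward index into an inverted index by visiting every
// (page, word) pairing once and recording it under the word instead.
InvertedIndex invert(const ForwardIndex& forward) {
    InvertedIndex inverted;
    for (const auto& entry : forward) {
        for (const std::string& word : entry.second) {
            inverted[word].insert(entry.first);  // operator[] creates an empty set on first use
        }
    }
    return inverted;
}
```

If the forward index records that "internet" appears on pages 18 and 821, then looking up "internet" in the inverted result gives back the set {18, 821} in a single map lookup, with no scan over the pages.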
## 1) Write helper function `cleanToken()`

Start by writing a helper function for this small but important string-processing task:

```
string cleanToken(string token)
```

`cleanToken` takes in a string of characters that appears in the body text and returns a "cleaned" version of that token, ready to be stored in the index in its canonical form. The cleaned version trims off any leading or trailing punctuation and converts all letters to lowercase. `cleanToken` also checks to see if the token contains only non-letters, in which case it returns the empty string to indicate this token is to be discarded.

More precisely, `cleanToken(str)` should:

- __Trim all punctuation from the _beginning_ and _end_ of `str`, but not from inside `str`__. The inputs `section` and `section!` and `<<section>>` all trim to the same result `section`. An input such as `section's` or `section-10` is unchanged by trimming, since the punctuation characters are in the middle of the token.
    - Treat as punctuation exactly and only those characters for which [`ispunct`][cctype] returns true.
    - You may encounter a few oddball characters (curly quotes, bullets, and the like) that are not recognized as punctuation by `ispunct`. Do not make a special case of this; trim exactly according to `ispunct`.
- __Confirm the token contains at least one letter character__. You can test whether a character is a letter by using [`isalpha`][cctype]. If none of the characters is a letter according to `isalpha`, `cleanToken` returns the empty string to indicate that this token is not a word and should not be entered into the index.
    + Hint: this check does not remove non-letters or modify the string, it just examines the characters to find whether at least one is a letter
- __Convert the token to lowercase__. Standardizing on a canonical form allows search queries to operate case-insensitively.

The return value from `cleanToken` is the trimmed lowercase version, or the empty string if the token is to be discarded.
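Before writing the full function, it can help to poke at how `ispunct` and `isalpha` classify characters. The sketch below shows two small probes, not a complete `cleanToken`; the names `leadingPunctCount` and `hasLetter` are hypothetical, and the code assumes the default "C" locale.

```cpp
#include <cctype>
#include <string>

// Counts how many characters at the start of `token` ispunct treats as punctuation.
std::size_t leadingPunctCount(const std::string& token) {
    std::size_t n = 0;
    // The cast to unsigned char avoids undefined behavior for bytes outside
    // the ASCII range, such as the bytes of a curly quote.
    while (n < token.size() && std::ispunct(static_cast<unsigned char>(token[n]))) {
        n++;
    }
    return n;
}

// True if at least one character of `token` is a letter according to isalpha.
// The token itself is only examined, never modified.
bool hasLetter(const std::string& token) {
    for (char ch : token) {
        if (std::isalpha(static_cast<unsigned char>(ch))) {
            return true;
        }
    }
    return false;
}
```

With these probes, `<<section>>` reports two leading punctuation characters, `section-10` reports none (its hyphen sits in the middle), and `106` has no letter at all.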
Writing this function will be right up your alley after your great work on soundex last week. The specification of this function is a bit quirky, so pay close attention to the details and ask questions if anything is unclear. __Write comprehensive test cases to confirm that `cleanToken` works correctly in all situations before moving on!__ You will make heavy use of `cleanToken` when building the inverted index, and if this helper mishandles tokens, it will throw off your results and lead to sad times. Covering all your bases now will take quite a few test cases, but we assure you that it is well worth your time. Remember to label your tests as `STUDENT_TEST`. ## 2) Write helper function `gatherTokens()` The helper function `gatherTokens` extracts the set of unique tokens from the body text. ``` Set<string> gatherTokens(string bodytext) ``` The argument to `gatherTokens` is a string containing the body text from a single web page. The function returns a Set of the unique cleaned tokens that appear in the body text. The function first _tokenizes_ the body text &mdash; this means to separate the string into words using the space character as delimiter. Call the [`stringSplit`][strlib.h] function from the Stanford library for this task. Then call your `cleanToken` helper function on each token and store the cleaned tokens into a Set. No matter how many repeat occurrences of a given word are in the body text, repeatedly adding it to a set stores just the one copy, which makes this ADT ideal for gathering the unique tokens. Time to test! Add test cases that confirm the output from `gatherTokens`, so you will later be able to call on this function with confidence that it does its job correctly. ## 3) Create inverted index in `buildIndex()` The function `buildIndex` reads the content from the database file and processes it into the form of an inverted index.
``` int buildIndex(string dbfile, Map<string, Set<string>>& index) ``` The first argument to `buildIndex` is the name of the database file of the web page data; the second argument is the Map to be populated with data for the inverted index. The return value of `buildIndex` is the number of documents processed from the database file. Before starting to code, work through a small example on paper to ensure you understand the data structure you are trying to build. Open the `res/tiny.txt` database file in Qt and manually process the content to build the inverted index. {% question 6 %} Sketch the contents of the inverted index built from the `res/tiny.txt` database file. {% endquestion %} When you are ready to start writing code, read the previous section [Understanding the web page data](#understanding-the-web-page-data) to review the format of the database file. Look at the code we provided for `readMazeFile` in `maze.cpp` for an example of C++ code to open a file and read the contents into a Vector of lines. Feel free to reuse that code for `buildIndex`. For each page, you will call `gatherTokens` to extract the set of unique tokens. For each token in the page, you record the match to the page's URL by adding to its entry in the inverted index. The index is of type `Map<string, Set<string>>`; this map associates each keyword with the set of the URLs where that word occurs. The function returns the count of web pages that were processed. Our starter code contains some provided tests to get you started; add student tests of your own to ensure your coverage is comprehensive. Although the process to build the inverted index is complex for a larger database file, the operation can be done in reasonable time, i.e. generally no more than X seconds. Use a `TIME_OPERATION` test case to confirm this for your function.
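As a rough sketch of how steps 2 and 3 fit together, the code below gathers tokens per page and folds them into an index. Several things here are assumptions made to keep the example self-contained: it uses `std::set`/`std::map` instead of the Stanford `Set`/`Map`, it splits on whitespace with `std::istringstream` rather than calling `stringSplit`, it includes a compact stand-in for `cleanToken`, and the hypothetical `buildIndexFromLines` takes the already-read lines instead of opening `dbfile`.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

using Index = std::map<std::string, std::set<std::string>>;

// Compact stand-in for cleanToken (trim punctuation, lowercase, need a letter).
std::string cleanToken(const std::string& token) {
    size_t start = 0;
    size_t end = token.size();
    while (start < end && std::ispunct(static_cast<unsigned char>(token[start]))) { start++; }
    while (end > start && std::ispunct(static_cast<unsigned char>(token[end - 1]))) { end--; }
    std::string cleaned;
    bool hasLetter = false;
    for (size_t i = start; i < end; i++) {
        unsigned char ch = static_cast<unsigned char>(token[i]);
        hasLetter = hasLetter || std::isalpha(ch);
        cleaned += static_cast<char>(std::tolower(ch));
    }
    return hasLetter ? cleaned : std::string();
}

// Returns the unique cleaned tokens in one page's body text.
std::set<std::string> gatherTokens(const std::string& bodytext) {
    std::set<std::string> tokens;
    std::istringstream stream(bodytext);
    std::string raw;
    while (stream >> raw) {
        std::string cleaned = cleanToken(raw);
        if (!cleaned.empty()) {
            tokens.insert(cleaned);   // a set keeps one copy of each token
        }
    }
    return tokens;
}

// Hypothetical variant of buildIndex: lines alternate URL, body, URL, body, ...
// Returns the number of pages processed.
int buildIndexFromLines(const std::vector<std::string>& lines, Index& index) {
    int pages = 0;
    for (size_t i = 0; i + 1 < lines.size(); i += 2) {
        const std::string& url = lines[i];
        std::set<std::string> tokens = gatherTokens(lines[i + 1]);
        for (const std::string& token : tokens) {
            index[token].insert(url);
        }
        pages++;
    }
    return pages;
}
```

Keeping the file reading separate from the indexing loop also makes the indexing logic easy to test on small hand-written inputs.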
## 4) Search using `findQueryMatches()` Next up is to implement the function: ```c++ Set<string> findQueryMatches(Map<string, Set<string>>& index, string query) ``` The `query` string can either be a single search term or a compound sequence of multiple terms. A search term is a single word, and a sequence of search terms is multiple consecutive terms, each of which (besides the first one) may or may not be preceded by a modifier like `+` or `-` (see below for details). The same [`stringSplit`][strlib.h] function used to divide the body text into tokens will be used to divide the query sentence into search terms. When finding the matches for a given query, follow these rules: - For a single search term, the result is all web pages that contain that term. - A sequence of terms is handled as a compound query, where the matches from the individual terms are synthesized into one combined result. - A search term has a slightly altered meaning when the term is prefaced by certain modifiers: - By default when not prefaced with a `+` or `-`, the matches are unioned across search terms. (any result matching either term is included) + Hint: use the [Set] operation `unionWith` - If the user prefaces a search term with `+`, then matches for this term are intersected with the existing results. (results must match both terms) + Hint: use the [Set] operation `intersect` - If the user prefaces a search term with `-`, then matches for this term are removed from the existing result. (results must match one term without matching the other) + Hint: use the [Set] operation `difference` - __The same token cleaning applied to the body text is also applied to query terms.__ Call your helper `cleanToken` to process each search term to strip punctuation, convert to lowercase, and discard non-words before performing the search for matches. Note that __searching is case-insensitive__, that is, a search for "binky" returns the same results as "Binky" or "BINKY".
Be sure to consider what implications this has for how you create and search the index. Here are some example queries and how they are interpreted - `quokka` - matches all pages containing the term "quokka" - `simple cheap` - means `simple OR cheap` - matches pages that contain either "simple" or "cheap" or both - `tasty +healthy` + means `tasty AND healthy` + matches pages that contain both "tasty" and "healthy" - `tasty -mushrooms` - means `tasty WITHOUT mushrooms` - matches pages that contain "tasty" but do not contain "mushrooms" - `tasty -mushrooms simple +cheap` - means `tasty WITHOUT mushrooms OR simple AND cheap` - matches pages that match ((("tasty" without "mushrooms") or "simple") and "cheap") There is no precedence for the operators, the query is simply processed from left to right. The matches for the first term are combined with matches for second, then combined with matches for third term and so on. In the last query shown above, the matches for `tasty` are first filtered to remove all pages containing `mushrooms`, then unioned with all matches for `simple` and lastly intersected with all matches for `cheap`. You may assume that the query sentence is well-formed, which means: - The query sentence is non-empty and contains at least one search term - If a search term has a modifier, it will be the first character + A modifier will not appear on its own as a search term + A `+` or `-` character within a search term that is not the first character is not considered a modifier - The first search term in the query sentence will never have a modifier - You can assume that no search term will clean to the empty string (i.e. has at least one letter) There is a lot of functionality to test in query processing, be sure you add an appropriate range of student tests to be sure you're catching all the cases. 
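The left-to-right processing described above can be sketched as below. This is an illustration under assumptions rather than the expected solution: it uses `std::set` with the `<algorithm>` set operations in place of the Stanford `Set` methods `unionWith`, `intersect`, and `difference`, splits the query on whitespace with `std::istringstream`, and carries its own compact stand-in for `cleanToken`.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>

using Index = std::map<std::string, std::set<std::string>>;

// Compact stand-in for cleanToken (trim punctuation, lowercase, need a letter).
std::string cleanToken(const std::string& token) {
    size_t start = 0;
    size_t end = token.size();
    while (start < end && std::ispunct(static_cast<unsigned char>(token[start]))) { start++; }
    while (end > start && std::ispunct(static_cast<unsigned char>(token[end - 1]))) { end--; }
    std::string cleaned;
    bool hasLetter = false;
    for (size_t i = start; i < end; i++) {
        unsigned char ch = static_cast<unsigned char>(token[i]);
        hasLetter = hasLetter || std::isalpha(ch);
        cleaned += static_cast<char>(std::tolower(ch));
    }
    return hasLetter ? cleaned : std::string();
}

std::set<std::string> findQueryMatches(const Index& index, const std::string& query) {
    std::set<std::string> result;
    std::istringstream stream(query);
    std::string term;
    while (stream >> term) {
        char modifier = term[0];   // read the modifier before cleaning strips it
        std::set<std::string> matches;
        auto found = index.find(cleanToken(term));
        if (found != index.end()) {
            matches = found->second;
        }

        std::set<std::string> combined;
        auto out = std::inserter(combined, combined.begin());
        if (modifier == '+') {          // AND: keep pages present in both sets
            std::set_intersection(result.begin(), result.end(),
                                  matches.begin(), matches.end(), out);
        } else if (modifier == '-') {   // WITHOUT: drop pages that match this term
            std::set_difference(result.begin(), result.end(),
                                matches.begin(), matches.end(), out);
        } else {                        // OR: include pages from either set
            std::set_union(result.begin(), result.end(),
                           matches.begin(), matches.end(), out);
        }
        result = combined;
    }
    return result;
}
```

Because the first term never carries a modifier, it is unioned with the initially empty result, so no special case is needed for it.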
## 5) Put it all together with `searchEngine()` Thus far, your amazing code has re-arranged a mass of unstructured text data into a highly-organized inverted index with instantaneous retrieval and fancy query-matching capability. Now take it over the finish line to build your own search engine! The final function to implement is: ```c++ void searchEngine(string dbfile) ``` This function implements a console program that works as follows: - It first constructs an inverted index from the contents of the database file. - It prints how many web pages were processed to build the index and how many distinct words were found across all pages. - It enters a loop that prompts the user to enter a query. - For each query entered by the user, it finds the matching pages and prints the URLs. - When the user enters the empty string (`""`), this indicates they are done and the program finishes. After you have completed this function, your program should behave as shown in the transcript below. Example program run (executed by running `searchEngine("res/website.txt")` in `main.cpp`): ``` output Stand by while building index... 
Indexed 32 pages containing 3883 unique terms Enter query sentence (RETURN/ENTER to quit): llama Found 1 matching pages {"http://cs106b.stanford.edu/assignments/2-adt/searchengine.html"} Enter query sentence (RETURN/ENTER to quit): expect_error +testing Found 3 matching pages {"http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/maze.html", "http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/warmup.html", "http://cs106b.stanford.edu/class/cs106b/resources/testing_guide.html"} Enter query sentence (RETURN/ENTER to quit): cs106a cs106b -qt Found 5 matching pages {"http://cs106b.stanford.edu/class/cs106b/about_assessments", "http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/ethics.html", "http://cs106b.stanford.edu/class/cs106b/course_placement", "http://cs106b.stanford.edu/class/cs106b/lectures/01-welcome/", "http://cs106b.stanford.edu/class/cs106b/lectures/03-strings/"} Enter query sentence (RETURN/ENTER to quit): All done! ``` Way to go, you're well on your way to becoming the next internet search pioneer! ## Notes - The `res` folder of the starter project includes two database files: `tiny.txt` is the small example used in the writeup and `website.txt` is the body text extracted from all of the pages in our course website (as of Oct 1). You can open these files in Qt Creator to view their contents. The project resource files are listed under `Other files` -> `res`. Your program can open a resource file by specifying the path as `"res/myfilename"`. ## References - [Inverted Index on GeeksForGeeks](https://www.geeksforgeeks.org/inverted-index/) - [Wikipedia article on Inverted Indexes](https://en.wikipedia.org/wiki/Inverted_index) - [Stanford Natural Language Processing Group on Tokenization](https://nlp.stanford.edu/IR-book/html/htmledition/tokenization-1.html) ## Extensions If you have completed the basic assignment requirements and want to go further, we encourage you to try adding an extension! 
A non-exhaustive list of potential extensions is listed below: - Weights - When you get the results from a Google search, they are ranked so that the more relevant results are first on the list. The current Google algorithm is a well-kept trade secret (though it was originally the [_Page Rank_](https://en.wikipedia.org/wiki/PageRank) algorithm, named for its creator, then-Stanford graduate student Larry Page), but a simple approach is to give higher rank to pages with more occurrences of the search term. For this extension, you would need to re-think how you create your index to include the number of matches. - Phrase search - The assignment does not allow a search for multi-word terms, such as `section leader`. Searching for phrases is not trivial, as you cannot simply keep a mapping of all possible phrases in the document. You could, however, keep track of _where_ in each document a word is, and then use that information to determine if words in a phrase are next to each other in any particular document. - Stop Words - The English language has many, many words that show up in text but are not particularly important for content, such as `the`, `and`, `if`, `a`, etc. These words are called _Stop Words_, and it would make your index smaller if you removed such stop words from the index. Here is more [info about stop words](https://kavita-ganesan.com/what-are-stop-words/#.XptrFy-ZM0o). - Stemming - In the current design, if a user searches for `section` they won't find matches for `sections`, even though pages that mention either might be a relevant match. [_Stemming_](https://en.wikipedia.org/wiki/Stemming) is the process of reducing words to their base form, so that (for example) both `section` and `sections` would become, simply, `section` in the index. If you have other creative ideas for extensions, run them by the course staff, and we'd be happy to give you guidance!
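As a starting point for the Weights extension listed above, here is a minimal sketch of an index that records occurrence counts and ranks matches by them. Everything in it is a hypothetical design: the names `WeightedIndex`, `addPage`, and `rankedMatches` are not part of the assignment, standard-library containers stand in for the Stanford ADTs, and counting raw occurrences is only the simple ranking idea mentioned in the text, not PageRank.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical weighted index: word -> (URL -> occurrences of the word on that page).
using WeightedIndex = std::map<std::string, std::map<std::string, int>>;

// Records every occurrence of each (already cleaned) token found on one page.
void addPage(const std::string& url, const std::vector<std::string>& tokens,
             WeightedIndex& index) {
    for (const std::string& token : tokens) {
        index[token][url]++;
    }
}

// Returns the URLs that contain `term`, most occurrences first; ties are broken by URL.
std::vector<std::string> rankedMatches(const WeightedIndex& index, const std::string& term) {
    std::vector<std::pair<std::string, int>> hits;
    auto found = index.find(term);
    if (found != index.end()) {
        hits.assign(found->second.begin(), found->second.end());
    }
    std::sort(hits.begin(), hits.end(),
              [](const std::pair<std::string, int>& a, const std::pair<std::string, int>& b) {
                  if (a.second != b.second) {
                      return a.second > b.second;
                  }
                  return a.first < b.first;
              });
    std::vector<std::string> urls;
    for (const auto& hit : hits) {
        urls.push_back(hit.first);
    }
    return urls;
}
```

Note that this requires keeping repeated tokens while scanning a page, so the per-page tokens are passed as a vector here rather than as a set of unique tokens.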
http://cs106b.stanford.edu/class/cs106b/assignments/1-cpp/soundex.html
This task is mostly C++ string processing, with a little bit of file reading and use of `Vector`. For this program, you will be writing code and making changes in `soundex.cpp`, as well as answering a few short answer questions in `short_answer.txt`. You can find the text file under "Other Files" in the Qt Creator project pane. Here is a summary of what you can expect to complete in this part of the assignment: * You will study a real-world algorithm used by the U.S. Census to encode the phonetic pronunciation of surnames. * You will implement the algorithm, developing a function that can take surnames as input and produce phonetic encodings as output. * You will implement a console program that allows users to input a surname and then find all matches in a database of Stanford surnames that have the same encoding. * You will respond to a few reflective questions on the efficacy and limitations of this algorithm. __Note: Before getting started on this portion of the assignment, please watch [this short video](https://stanford-pilot.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=6b63e33c-a562-4266-bd96-adad0002cdf2) put together by Katie, where she introduces the Soundex algorithm that you will be implementing. You will need the content presented in this video to complete the ethical reasoning short answer questions.__ ## Why phonetic name matching is useful One of the more pesky features of the English language is the lack of consistency between phonetics and spelling. Matching surnames is a particularly vexing problem because many common surnames come in a variety of spellings and continue to change over time and distance. Many of these variations are the result of incorrectly inputted data, cultural differences in spelling, and transliteration errors. Traditional string matching algorithms that use exact match or partial/overlap match perform poorly in this messy milieu of real world data. 
In contrast, the Soundex system groups names by phonetic structure to enable matching by pronunciation rather than literal character match. This makes tasks like tracking genealogy or searching for a spoken surname easier. Soundex was patented by Margaret O'Dell and Robert C. Russell in 1918, and the [U.S. Census](https://www.archives.gov/research/census/soundex) is a big consumer of Soundex along with genealogical researchers, directory assistance, and background investigators. The Soundex algorithm is a coded index based on the way a name sounds rather than the way it is spelled. Surnames that sound the same but are spelled differently, like "Vaska," "Vasque," and "Vussky," have the same code and are classified together.

## How Soundex codes are generated

The Soundex algorithm operates on an input surname and converts the name into its Soundex code. A Soundex code is a four-character string in the form of an initial letter followed by three digits, such as `Z452`. The initial letter is the first letter of the surname, and the three digits are drawn from the sounds within the surname using the following algorithm:

1. Discard all non-letter characters from the surname: dashes, spaces, apostrophes, and so on.
2. Encode each letter as a digit using the table below.

| __Digit__ | represents the letters |
|-----------|------------------------|
| __0__ | A E I O U H W Y |
| __1__ | B F P V |
| __2__ | C G J K Q S X Z |
| __3__ | D T |
| __4__ | L |
| __5__ | M N |
| __6__ | R |
{:.table-condensed .table-striped .table-bordered}

3. Coalesce adjacent duplicate digits from the code (e.g. `222025` becomes `2025`).
4. Replace the first digit of the code with the first letter of the original name, converting to uppercase.
5. Remove all zeros from the code.
6. Make the code exactly length 4 by padding with zeros or truncating the excess.

Note that the Soundex algorithm does not distinguish case in the input; letters can be lower, upper, or mixed case. The first character in the result code is always in upper case.
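The six steps above can be sketched in one function as follows. This is a compact illustration under assumptions, not the decomposed solution the assignment asks you to build: it uses `std::string` only, the lookup string `kDigitForLetter` is an illustrative encoding of the digit table, and returning an empty string for input with no letters is a choice made here for the sketch.

```cpp
#include <cctype>
#include <string>

std::string soundex(const std::string& name) {
    // Digit for each letter A..Z, taken from the table above.
    static const std::string kDigitForLetter = "01230120022455012623010202";

    // Steps 1 and 2: keep only letters and encode each one as a digit.
    std::string letters;
    std::string digits;
    for (char raw : name) {
        unsigned char ch = static_cast<unsigned char>(raw);
        if (std::isalpha(ch)) {
            char upper = static_cast<char>(std::toupper(ch));
            letters += upper;
            digits += kDigitForLetter[upper - 'A'];
        }
    }
    if (letters.empty()) {
        return "";   // assumption: no letters means no code
    }

    // Step 3: coalesce adjacent duplicate digits.
    std::string coalesced;
    for (char digit : digits) {
        if (coalesced.empty() || coalesced.back() != digit) {
            coalesced += digit;
        }
    }

    // Step 4: replace the first digit with the uppercase first letter.
    coalesced[0] = letters[0];

    // Step 5: remove all zeros.
    std::string code;
    for (char ch : coalesced) {
        if (ch != '0') {
            code += ch;
        }
    }

    // Step 6: pad with zeros or truncate to exactly four characters.
    code.resize(4, '0');
    return code;
}
```

The order of steps 3 to 5 matters: zeros must survive until after coalescing, because a zero between two equal digits keeps them from being merged.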
To ensure you understand the construction, get a piece of scratch paper and manually compute a few names, such as "Curie" (`C600`) and "O'Conner" (`O256`). {% question 10 %} What is the Soundex code for "Angelou"? What is the code for your own surname? {% endquestion %} ## Decomposing the problem Your best strategy for approaching a complex algorithm like this is to decompose the problem into smaller, more manageable tasks and proceed step by step, testing as you go. {% question %} Before writing any code, brainstorm your plan of attack and sketch how you might decompose the work into smaller tasks. Briefly describe your decomposition strategy. {% endquestion %} To get you started, we're going to walk you through what it might look like to decompose and implement the first step of the Soundex algorithm. Decomposition is important here because if you tried to implement a single function that accomplished the whole Soundex algorithm all in one go, you could end up with one big, unwieldy piece of code. However, if you break down the problem into a number of different steps, each corresponding to its own helper function, you can develop these helper functions one at a time and test each one as you go. For example, Step 1 of the Soundex algorithm might correspond to a helper function to remove non-letters from the string. The C++ library function [`isalpha`][cctype] will report whether a character is alphabetic (i.e. is a letter). Here is a starting point (provided for you in `soundex.cpp`): ``` // WARNING: Code is buggy! Add test cases to identify which inputs are mishandled string removeNonLetters(string s) { string result = charToString(s[0]); for (int i = 1; i `. 2. Prompt the user to enter a surname. - The function [`getLine`](https://web.stanford.edu/dept/cs_edu/resources/cslib_docs/simpio.html#Function:getLine) from `"simpio.h"` will be helpful here. 3. Compute the Soundex code of the surname. 4. 
Iterate over the database, compute Soundex code of each name, and gather a vector of those surnames with a matching code. 5. Print the matches in sorted order. - The Vector has a handy `sort()` operation (you can use `vec.sort()` where `vec` is the name of your vector), and you can print a vector using the ` - Online Soundex code calculator: ## Extending your phonetic algorithm If you have completed the assignment and want to go further, we encourage you to try working on an extension! There are many other phonetic systems out there besides Soundex. Here is a non-exhaustive list: - [Daitch-Mokotoff](https://en.wikipedia.org/wiki/Daitch%E2%80%93Mokotoff_Soundex) - [Beider-Morse](https://stevemorse.org/phonetics/bmpm.htm) - [Metaphone](https://en.wikipedia.org/wiki/Metaphone) - [New York State Identification and Intelligence System](https://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System) Try implementing one of these other systems and see if you can get better or more intuitive surname matches! When implementing an extension, add a new .cpp file to your project that contains the extension code, keeping it separate from the regular Soundex implementation. If you have other creative ideas for extensions, run them by the course staff, and we'd be happy to give you guidance!
http://cs106b.stanford.edu/class/cs106b/resources/style_guide.html
You may think the motivation for good style is earning that from your section leader, but the most important beneficiary of your efforts is __you yourself__. Committing yourself to writing tidy, well-structured code from the start sets you up for good times to come. Your code will be easier to test, will have fewer bugs, and what bugs there are will be more isolated and easier to track down. You finish faster, your results are better, and your life is more pleasant. What's not to like? The guidelines below identify some of style qualities we will be looking for when grading your programs. As with any complex activity, there is no one "best" style, nor a definitive checklist that covers every situation. That said, there are better and worse choices and we want to guide you toward the better choices. In grading, we will expect that you make a concerted effort to follow these practices. While it is possible to write code that violates these guidelines and yet still exhibits good style, we recommend that you adopt our habits for practical purposes of grading. If you have theoretical points of disagreement, come hash that out with us in person. In most professional work environments you are expected to follow that company's style standards. Learning to carefully obey a style guide, and writing code with a group of other developers where the style is consistent among them, are valuable job skills. This guide gives our general philosophy and priorities, but even more valuable will be the guidance on your own particular style choices. Interactive grading with your section leader is your chance to receive one-on-one feedback, ask questions, and learn about areas for improvement. Don't miss out on this opportunity! ------------------------------------------------------------------------ ## Layout, Indentation, and Whitespace - __Indentation:__ Consistent use of whitespace/indentation always! 
Proper whitespace/indentation illuminates the structure of your program and makes it easier to follow the code and find errors. - Increment indent level on each opening brace `{`, and decrement on each closing brace `}`. - Choose an increment of 2-4 spaces per indent level. Be consistent. - Do not place more than one statement on the same line. ``` // confusing, hard to follow while (x > str) { ... } ``` - __break and continue in loops:__ Wherever possible, a loop should be structured in the ordinary way with clear loop start, stop, advance and no disruptive loop control. That said, there are limited uses of `break` that are okay, such as loop-and-a-half (`while (true)` with `break`) or the need to exit a loop mid-iteration. Use of `continue` is quite rare and often confusing to the reader; it is better to avoid it. - __Use of fallthrough in switch cases:__ A switch case should almost always end with a `break` or `return` that prevents continuing into the subsequent case. In the very rare case that you intend to fall through, add a comment to make that clear. Accidental fallthrough is the source of many a difficult bug. ``` switch (val) { case 1: handleOne(); break; case 2: handleTwo(); // NOTE: fallthrough *** case 3: handleTwoOrThree(); } ``` - __return statements__ Although it is allowed for a function to have multiple `return` statements, in most situations it is preferable to funnel through a single `return` statement at the end of the function. An early `return` can be a clean option for a recursive base case or error handled at the beginning of a function. `return` can also serve as a loop exit. However, scattering other `return` statements throughout the function is not a good idea; experience shows they are responsible for a disproportionate number of bugs. It is easy to overlook the early-return case and mistakenly assume the function runs all the way to its end. 
- __Always include `{}` on control statements:__ The body of an `if/else`, `for`, `while`, etc., should always be wrapped in `{}` and have proper line breaks, even if the body is only a single line. Using braces prevents accidents like the one shown below on the left. ``` // ugh if (count == 0) error("not found"); for (int i = 0; i 0) { return true; } else { return false; } ``` {:.bad .col-sm-6} ``` // better if (isWall) { ... } return (matches > 0); ``` {:.good .col-sm-6} - __Favor `&&`, `||`, and `!` over `and`, `or`, and `not`:__ For various reasons mostly related to international compatibility, C++ has two ways of representing the logical connectives AND, OR, and NOT. Traditionally, the operators `&&`, `||`, and `!` are used for AND, OR, and NOT, respectively, and these operators are the preferred way of expressing compound booleans. The words `and`, `or`, and `not` can be used instead, but it would be highly unusual to do so and a bit jarring for C++ programmers used to the traditional operators. ``` // non-standard if ((even and positive) or not zero) { ... } ``` {:.bad .col-sm-6} ``` // preferred if ((even && positive) || !zero) { ... } ``` {:.good .col-sm-6} - __Use error to report fatal conditions:__ The `error` function from the Stanford library can be used to report a fatal error with your custom message. The use of `error` is preferred over throwing a raw C++ exception because it plays nicely with the debugger and our SimpleTest framework. ``` // raw exception if (arg = 0) { remove(reallySlowSearch(term)); } ``` {:.bad .col-sm-6} ``` // avoid recompute int index = reallySlowSearch(term); if (index >= 0) { remove(index); } ``` {:.good .col-sm-6} - __Avoid copying large objects:__ When passing an object as a parameter or returning an object from a function, the entire object must be copied. Copying large objects, such as collection ADTs, can be expensive. Pass the object by reference to avoid this expense. 
The client and the function then share access to the single instance. ``` // slow because of copying void process(Set data) { ... } Vector fillVector() { Vector v; // add data to v ... return v; // makes copy } ``` {:.bad .col-sm-6} ``` // improved efficiency void process(Set& data) { ... } // shares vector without making copy void fillVector(Vector& v) { // add data to v ... } ``` {:.good .col-sm-6} ## Unify common code, avoid redundancy When drafting code, you may find that you repeat yourself or copy/paste blocks of code when you need to repeatedly perform the same or similar tasks. Unifying that repeated code into one passage simplifies your design and means only one piece of code to write, test, debug, update, and comment. - __Decompose to helper function:__ Extract common code and move to helper function. ``` // repeated code if (g.inBounds(left) && g[left] && left != g[0][0]) { return true; } else if (g.inBounds(right) && g[right] && right != g[0][0]) { return true; } ``` {:.bad .col-sm-6} ``` // unify common into helper bool isViable(GridLocation loc, Grid& g) { return g.inBounds(loc) && g[loc] && loc != g[0][0]; } ... return isViable(left, g) || isViable(right, g); ``` {:.good .col-sm-6} - __Factoring out common code:__ Factor out common code from different cases of a chained if-else or switch. ``` // repeated code if (tool == CIRCLE) { setColor("black"); drawCircle(); waitForClick(); } else if (tool == SQUARE) { setColor("black"); drawSquare(); waitForClick(); } else if (tool == LINE) { setColor("black"); drawLine(); waitForClick(); } ``` {:.bad .col-sm-6} ``` // factor out common setColor("black"); if (tool == CIRCLE) { drawCircle(); } else if (tool == SQUARE) { drawSquare(); } else if (tool == LINE) { drawLine(); } waitForClick(); ``` {:.good .col-sm-6} ## Function design A __well-designed function__ exhibits properties such as the following: - Performs a single independent, coherent task. - Does not do too large a share of the work. 
- Is not unnecessarily entangled with other functions. - Uses parameters for flexibility/re-use (rather than being a one-task tool). - Has a clear relationship between information in (parameters) and out (return value). __Function structure:__ An overly long function (say more than 20-30 lines) is unwieldy and should be decomposed into smaller sub-functions. If you try to describe the function's purpose and find yourself using the word "and" a lot, that probably means the function does too many things and should be subdivided. __Value vs. reference parameters:__ Use reference parameters when you need to modify the value of the parameter passed in, or to send information out from a function. Don't use reference parameters when it is not necessary or beneficial. Notice that `a`, `b`, and `c` are not reference parameters in the following function because they don't need to be. ``` /* * Solves a quadratic equation ax^2 + bx + c = 0, * storing the results in output parameters root1 and root2. * Assumes that the given equation has two real roots. */ void quadratic(double a, double b, double c, double& root1, double& root2) { double discr = sqrt((b * b) - (4 * a * c)); root1 = (-b + discr) / (2 * a); root2 = (-b - discr) / (2 * a); } ``` {:.good} __Prefer return value over reference 'out' parameter for single value return:__ If a single value needs to be sent back from a function, it is cleaner to do so with a return value than with a reference `out` parameter. ``` // harder to follow void max(int a, int b, int& result) { if (a > b) { result = a; } else { result = b; } } ``` {:.bad .col-sm-6} ``` // better as int max(int a, int b) { if (a > b) { return a; } else { return b; } } ``` {:.good .col-sm-6} __Avoid "chaining" calls__, where many functions call each other in a chain without ever returning to `main`. 
Here is a diagram of call flow with (left) and without (right) chaining: ``` // chained control flow main | +-- doGame | +-- initAndPlay | +-- configureAndPlay | +-- readCubes | +-- playGame | +-- doOneTurn ``` {:.bad .col-sm-6} ``` // better structured as main | +-- welcome | +-- initializeGame | | | +-- configureBoard | | | +-- readCubes | +-- playGame | | | +-- doOneTurn ``` {:.good .col-sm-6} {% comment %} ## Class Design - __Encapsulation:__ Properly encapsulate your objects by making any data fields in your class `private`. ``` class Student { private: int homeworkScore; ... ``` - __.h vs. .cpp:__ Always place the declaration of a class and its members into its own file, `ClassName.h`. Place the implementation bodies of those members into their own file, `ClassName.cpp`. Use `#pragma once` to avoid multiple declarations of the same class. ``` // Point.h #pragma once class Point { public: Point(int x, int y); int getX() const; int getY() const; void translate(int dx, int dy); private: int m_x; int m_y; }; ``` ``` // Point.cpp #include "Point.h" Point::Point(int x, int y) { m_x = x; m_y = y; } void Point::translate(int dx, int dy) { m_x += dx; m_y += dy; } ... ``` - __class vs. struct:__ Always favor using a `class` unless you are creating a very small and simple data type that just needs a few public member variables. Examples of such small `struct` types might be `Point` or `LinkedListNode`. - __Avoid unnecessary fields__: use fields to store important data of your objects but not to store temporary values only used within a single call to one function. - __Helper functions:__ If you add a member function to a class that is not part of the homework spec, make it `private` so that other external code cannot call it. ``` class Student { ... private: double computeTuitionHelper(); ``` - __`const` members:__ If a given member function does not modify the state of the object upon which it is called, declare it `const`. 
``` class Student { public: int getID() const; double getGPA(int year) const; void payTuition(Course& course); string toString() const; ... ``` {% endcomment %} ## Commenting Some of the best documentation comes from giving types, variables, functions, etc. meaningful names to begin and using straightforward and clear algorithms so the code speaks for itself. Certainly you will need comments where things get complex but don't bother writing a large number of low-content comments to explain self-evident code. The audience for all commenting is a C++-literate programmer. Therefore you should not explain the workings of C++ or basic programming techniques. Some programmers like to comment before writing any code, as it helps them establish what the program is going to do or how each function will be used. Others choose to comment at the end, now that all has been revealed. Some choose a combination of the two, commenting some at the beginning, some along the way, some at the end. You can decide what works best for you. But do watch that your final comments do match your final result. It's particularly unhelpful if the comment says one thing but the code does another thing. It's easy for such inconsistencies to creep in the course of developing and changing a function. Be careful to give your comments a once-over at the end to make sure they are still accurate to the final version of the program. - __File/class header:__ Each file should have an overview comment describing that file's purpose. For an assignment, this header should include your name, course/section, and a brief description of this file's relationship to the assignment. - __Citing sources:__ If your code was materially influenced by consulting an external resource (web page, book, another person, etc.), __the source must be cited__. Add citations in a comment at the top of the file. Be explicit about what assistance was received and how/where it influenced your code. 
- __Function header:__ Each function should have a header comment that describes the function's behavior at a high level, as well as information about:
    - __Parameters/return:__ Give the type and purpose of each parameter going into the function and the type and purpose of the return value.
    - __Preconditions/assumptions:__ Constraints/expectations that the client should be aware of (e.g. "this function expects the file to be open for reading").
    - __Errors:__ List any special cases or error conditions the function handles (e.g. "...raises error if divisor is 0", or "...returns the constant NOT_FOUND if the word doesn't exist").
- __Inline comments:__ Inline comments should be used sparingly, where code is complex or unusual enough to warrant such explanation. A good rule of thumb is: explain what the code accomplishes rather than repeat what the code says. If what the code accomplishes is obvious, then don't bother.

```
// inline babbling just repeats what code already says, don't!
int counter;            // declare a counter variable
counter++;              // increment counter
while (index < length)  // while index less than length
```
{:.bad}

- __TODOs:__ Remove any `// TODO:` comments from a program before turning it in.
- __Commented-out code:__ It is considered bad style to submit a program with large chunks of code "commented out". It's fine to comment out code as you are working on a program, but if the program is done and such code is not needed, just remove it.
- __Doc comments:__ You can use the "doc" comment style (`/** ... */`) if you like, but it is not required for this class. Commenting with `//` or `/* ... */` is just fine.
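
To make the function header guidance above concrete, here is a minimal sketch of a header comment that covers behavior, parameters/return, preconditions, and errors. The function `findWord` and the value chosen for `NOT_FOUND` are hypothetical and exist only for this illustration; the exact layout of the comment is one reasonable choice, not a required format.

```cpp
#include <string>
#include <vector>

// Sentinel returned when a word is absent (value assumed for this sketch).
const int NOT_FOUND = -1;

/*
 * Function: findWord (hypothetical example)
 * -----------------------------------------
 * Searches `words` for the first entry equal to `word`.
 *
 * Parameters:
 *   words - the list of words to search; it is not modified
 *   word  - the word to look for; comparison is case-sensitive
 * Returns: the zero-based index of the first match
 * Preconditions: none; an empty list is allowed
 * Errors: returns the constant NOT_FOUND if the word doesn't exist
 */
int findWord(const std::vector<std::string>& words, const std::string& word) {
    for (int i = 0; i < static_cast<int>(words.size()); i++) {
        if (words[i] == word) {
            return i;
        }
    }
    return NOT_FOUND;
}
```

Note that the body needs no inline comments: the names and the header already say what the loop accomplishes.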
http://cs106b.stanford.edu/class/cs106b/resources/submit_checklist.html
This handy checklist is designed to help confirm your code is fully ready to go before you submit it. Before you send it in for grading, take a few minutes to work through this checklist. ## Functionality - I've re-read the assignment writeup and verified my program matches the required specification. - I've reviewed my code and understand each line of code I've written, why it's there, and why it's necessary. ## Test cases - I've run all of the provided test cases against my code and resolved any test failures. - I've supplemented the provided test cases with student tests of my own. For each part of the assignment, I've written at least one test case to make sure it works well in the common case. - For each part of the assignment, I've thought of at least one edge case that could cause problems and written a student test for it. (And ideally, I've done this for multiple edge cases!) ## Style - I've read the [CS106B Style Guide][style_guide] at least once in its entirety and asked questions about any parts I don't understand. - My code follows the guidelines from the [Style Guide][style_guide]. My code meets the expectations for readability, decomposition, and program design and adheres to restrictions such as no global variables, no use of obscure language features, and so on. - Each of my functions, conceptually, performs a single task. - All of my variables and functions use clear and descriptive names. - The dense parts of my code are commented, and those comments describe what the code accomplishes rather than restating the logic in plain English. - Every function I've written has a comment preceding it that explains what the function does, what its parameters are, and what its return value (if any) means. - Each file I am submitting has a header overview comment that clearly identifies the code authorship, including any necessary [citations][citation]. ## Final Submission - I've auto-indented my code in Qt Creator. The final layout is clean and readable. 
- I've completed the tasks from all `TODO` comments in the starter code and removed those comments. - I've removed all print statements left in the code that I used for debugging purposes. - I've removed all commented-out blocks of code that are no longer necessary. - I've read the __Submit__ section of the assignment handout and confirmed that I'm submitting all of the required files.
http://cs106b.stanford.edu/class/cs106b/syllabus
> __Hi there and welcome to CS106B!__ {:.alert .alert-success .text-center} __CS106B Programming Abstractions__ is the second course in our introductory programming sequence. The prerequisite, CS106A, establishes a solid foundation in programming methodology and problem-solving in Python. CS106B will give you the tools to solve more complex computational problems while focusing on the theme of abstraction, all using the C++ programming language. We're excited to share this great material with you and have a superb team of section leaders that will support you through the challenges to come. We hope you will find the time worth your investment and that you enjoy your growing mastery of the art of programming! ## Teaching Team {% for p in site.data.course.staff %} {% assign photo = p.name | headshot %} {% include captioned_img.html img=photo %} {% endfor %} Our wonderful undergraduate [section leaders](https://cs198.stanford.edu) lead [sections][about_section] and staff [LaIR][lair] starting week 2. ## I) Online Course Essentials The central place for all CS106B resources is the course website. The site is located at . You should regularly check the class website for handouts, announcements, and other information, including the most up-to-the-date information on assignments and errata. All lectures and other course meetings will be recorded and posted on the "Course Videos" tab of the [course Canvas page][canvas]. In CS106B, we will only use the Canvas page to distribute recorded materials - all other material will be published on the course website. There will be an online Ed Discussion forum available to all students, where you can ask questions about lecture, section, assignments, and course logistics. Please join the forum using [this link][forum] at your earliest convenience. Finally, all assignment submissions, feedback, grading, and virtual office hours will be conducted using the [CS198 Paperless][paperless] website. 
You will be able to use this website to submit assignments, view graded assignments, and sign up for [LaIR][lair] (virtual office hours staffed by the section leader community). ## II) Course Topics ### Learning Goals After you're finished with CS106B, we hope you'll have achieved the following learning goals: * I am excited to use programming to solve real-world problems I encounter outside class. * I recognize and understand common abstractions in computer science. * I can identify programmatic concepts present in everyday technologies because I understand how computers process and organize information. * I can break down complex problems into smaller subproblems by applying my algorithmic reasoning and recursive problem-solving skills. * I can evaluate design tradeoffs when creating data structures and algorithms or utilizing them to implement technological solutions. We'll also be giving you tools to tackle the following questions (note that these don't have single right or wrong answers!): 1. What is possible with technology and code? What isn't possible? 2. How can I use programming to solve problems that I otherwise would not be able to? 3. What makes for a "good" algorithm or data structure? Why? 4. Which problems should I solve with algorithms and data structures? What does a responsible programmer do when using data about real people? ### Lecture Schedule While the below schedule is subject to change over the course of the quarter, we will cover the following topics (in approximate order): 1. C++ basics 2. Abstract data structures 3. Recursion 4. Classes and object-oriented programming 5. Memory management and implementation-level abstractions 6. Linked data structures 7. Advanced algorithms ### Prerequisites The prerequisite for CS106B is completion of CS106A and readiness to move on to advanced programming topics. 
A comparable introductory programming course or experience (including high school AP courses) is often a reasonable substitute for Stanford's CS106A. If you are unsure whether this course is right for you, read [more about course placement][course_placement].

## III) Course Structure

### Units

If you are an undergraduate, you __must enroll in CS106B for 5 units__ (this is by department and university policy, no exceptions). If you are a graduate student, you may enroll in CS106B for 3 or 4 units to reduce your units for administrative reasons. Taking the course for reduced units does not change the course workload.

### Lectures

Lectures will take place on Monday, Wednesday, and Friday from 11am-12pm PT in Bishop Auditorium. If there is a day when you cannot attend a lecture live, recordings of the sessions will be available later on Canvas. Read more [about lectures](about_lectures).

### Sections

In addition to lecture, you'll also attend a weekly, 50-minute small group discussion section. Each discussion section will be led by an assigned section leader, who will act as your mentor, grader, and personal connection to the greater CS106B course staff.

You'll be asked to submit your section preferences between 5:00 PM on Thursday, September 23, 2021 and 5:00 PM on Sunday, September 26, 2021. The sign-up form will be available on the web at the URL , and after a matching process, your section assignments will be emailed out to you by the morning of Wednesday, September 29. __Note that you should only sign up for sections at the URL indicated previously (you should not sign up for sections on Axess).__

Sections begin the second week of classes, and attendance and participation will be mandatory for all students. Your section leader will be grading your participation in section on a weekly basis; this participation contributes to your course grade.
Participation during section can take many forms, including asking questions, contributing answers, and participating in discussions with fellow students. Read [more about section][about_section].

### Assignments

There will be regular assignments, about one per week. An assignment may include written problems, hands-on exercises with the tools, coding tasks and/or a larger complete program. All assignments are done on an individual basis.

The assignment deadline policy has been designed to build in flexibility. Assignments submitted by the due date earn a small on-time bonus. After the due date, there is a "grace period" (typically 48 hours) where we will accept late submissions without penalty. Read [more about the late policy][late].

Programs will be graded on "functionality" (is the program's behavior correct?) and "style" (is the code well-written and designed cleanly?). We use a bucket grading scale to focus attention on the qualitative rather than quantitative feedback. Read [more about assignments and assignment grading][about_assignments].

### Assessments

There will be assessments at mid-quarter and end-quarter. Read [more about assessments][about_assessments].

### Course Grades

Final grades for the course will be determined using the following weights:

* __60%__ Programming assignments
* __15%__ Mid-quarter assessment
* __20%__ End-quarter assessment
* __5%__ Section participation

### Incompletes

The university "I" grade ("incomplete") is appropriate for circumstances of significant personal or family emergency disruption that prevent a student from finishing course requirements on schedule. To be considered for an incomplete, you must have completed all of the assignments up until your "incomplete" request at a passing level. You must also have an extenuating circumstance that warrants an extension of time beyond the end of the quarter. Approval for an incomplete is at the instructors' discretion.
Incompletes will not be considered for reasons such as low performance in the course or workload difficulties.

## IV) Course Resources

### Textbook

Roberts, Eric. *Programming Abstractions in C++*. ISBN 978-0133454840. You can either purchase a physical copy or use the [course reader][textbook]. Recommended readings for each lecture will be posted on our lecture schedule.

### Software

The official CS106 programming environment is **Qt Creator**, which is an editor bundled with a C++ compiler and libraries. The software runs on Windows, Mac, and Linux. The [Qt Installation Guide][qt] has instructions for installing the tools onto your computer.

### Getting help

We want to enable everyone to succeed in this course and offer different paths to help. The instructors and Head CA will hold office hours in person, with a Zoom option available by request. The course helpers and section leaders staff regular [LaIR][lair] helper hours on OhYay. The [CS106B Ed Discussion forum][forum] allows public Q&A and discussion with your peers. Here is the [Quick Start Guide to using Ed](https://us.edstem.org/help).

### Accommodations

Students who need academic accommodations based on the impact of a disability should initiate a request with the Office of Accessible Education. Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter dated in the current quarter. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations. The OAE has contact information on their web page: . Once you obtain your OAE letter, please [send it to the head TA](mailto:neelk@stanford.edu).

## V) Honor Code

As a student taking a Stanford course, you agree to abide by the Stanford Honor Code, and we expect you to read over and follow the [CS-specific Honor Code expectations](honor_code) detailed on the CS106B website.
Your programs should be your own original, independent effort and must not be based on, guided by, or jointly developed with the work of others. Stanford employs powerful automated plagiarism detection tools that compare assignment submissions with other submissions from the current and previous quarters, as well as related online resources. The tools also analyze your intermediate work, and we will run the tools on every assignment you submit. The vast majority of you are here to learn and will do honest work for an honest grade. We celebrate and honor your commitment. Because it's important that all cases of academic dishonesty are identified for the sake of those playing by the rules, we will refer all cases of concern to the Office of Community Standards. If we find that you have violated Stanford's Honor Code, you will automatically fail the course. No exceptions can be made to this policy.
http://cs106b.stanford.edu/class/cs106b/resources/testing_guide.html
## Why testing?

Anybody who writes code for some purpose (whether as a researcher, a software engineer, or in any other profession) will get to the point where others are relying on their code. Bugs in software can be [dangerous or even deadly](https://royal.pingdom.com/10-historical-software-bugs-with-extreme-consequences/). Additionally, users do not enjoy using software that is buggy and crashes, and fixing bugs once the software is in production is very costly. Most importantly, __good engineers take pride in building things that work well and are robust.__

The key to writing working software is developing good tests. In this course we follow an approach called test-driven development. As you write code, you will also write companion tests. These tests are used to verify that the code you just finished writing works as intended.

This strategy is sometimes called "_test-as-you-go_." You work in small steps, being sure to test thoroughly, and only move on after you have confirmed the correctness and fixed all issues. The beauty of this approach is that each step is relatively straightforward and easy to debug. Imagine the opposite approach: you write hundreds of lines of code, the code does not work, and now you need to figure out which one of those hundreds of lines of code isn't working as expected! That is the sort of frustration that we want to help you all avoid as you continue to develop your skills as programmers.

## SimpleTest

For CS106B, we provide a unit-test framework called `SimpleTest` that you will use to test your code. This framework was pioneered by our ace colleague Keith Schwarz. `SimpleTest` provides a simple, clean approach to writing and running test cases.

Here is an example of how you might see the `SimpleTest` framework used in the starter code of an assignment.

```c++
// reversed(str) returns copy of str with characters in reverse order.
string reversed(string s) {
    string result;
    for (int i = s.length() - 1; i >= 0; i--) {
        result += s[i];
    }
    return result;
}

/* * * * * * Test Cases * * * * * */

PROVIDED_TEST("Demonstrate different SimpleTest use cases") {
    EXPECT_EQUAL(reversed("but"), "tub");
    EXPECT_EQUAL(reversed("stanford"), "drofnats");
}
```

When we provide tests for you in the starter code, each test case is wrapped in the special macro `PROVIDED_TEST`. The string argument in parentheses describes the purpose of the test, and the code block that follows (enclosed in curly braces) defines the actual test behavior.

When you add your own test cases, you will wrap your test code blocks in the `STUDENT_TEST` macro instead. The `STUDENT_TEST` functionality and structure are exactly the same as `PROVIDED_TEST`; it simply distinguishes the tests you've written yourself from those we provide for the benefit of your grader. You will see many examples of this in the following sections.

### EXPECT_EQUAL

The test macro `EXPECT_EQUAL(your_result, expected_result)` tests whether your result matches the expected result. A typical use for `EXPECT_EQUAL` compares a value produced by your code (e.g. the return value from a call to one of your functions) to the expected result and confirms they are equal. As an example, consider the first test case from the code above:

```EXPECT_EQUAL(reversed("but"), "tub");```

This test case compares the result of the call `reversed("but")` to the expected answer `"tub"`. If the two are indeed equal, the test will be reported as `Correct`. If they do not match, the test is reported as a failure.

See below the added `STUDENT_TEST` code block with three tests of your own. These test cases use `EXPECT_EQUAL` to try out further scenarios not covered by the provided tests.
```c++
/* * * * * * Test Cases * * * * * */

PROVIDED_TEST("Demonstrate different SimpleTest use cases") {
    EXPECT_EQUAL(reversed("but"), "tub");
    EXPECT_EQUAL(reversed("stanford"), "drofnats");
}

STUDENT_TEST("my added cases not covered by the provided tests") {
    EXPECT_EQUAL(reversed("racecar"), "racecar");
    EXPECT_EQUAL(reversed(""), "");
    EXPECT_EQUAL(reversed("123456789"), "987654321");
}
```

>Important note: You should never modify the provided tests &mdash; these are the same tests that will be used for grading, so it is not in your best interest to modify them. When adding tests, put them in a new `STUDENT_TEST` block of your own.
{:.alert .alert-danger}

### EXPECT

The `EXPECT(expression)` test case confirms the truth of a single expression. If the expression evaluates to true, the test is reported as `Correct`. If false, it reports a test failure. For example, if you added the `isPalindrome` function to the above program, you could add a test case that uses `EXPECT` to confirm the correct result from `isPalindrome`, as shown below.

```c++
// reversed(str) returns copy of str with characters in reverse order.
string reversed(string s) {
    string result;
    for (int i = s.length() - 1; i >= 0; i--) {
        result += s[i];
    }
    return result;
}

bool isPalindrome(string s) {
    return s == reversed(s);
}

/* * * * * * Test Cases * * * * * */

PROVIDED_TEST("Demonstrate different SimpleTest use case") {
    EXPECT_EQUAL(reversed("but"), "tub");
    EXPECT_EQUAL(reversed("stanford"), "drofnats");
}

STUDENT_TEST("test additional cases not covered by the provided tests") {
    EXPECT_EQUAL(reversed("racecar"), "racecar");
    EXPECT_EQUAL(reversed(""), "");
    EXPECT_EQUAL(reversed("123456789"), "987654321");
}

STUDENT_TEST("test my isPalindrome function") {
    EXPECT(isPalindrome("racecar"));
    EXPECT(!isPalindrome("stanford"));
}
```

When would you use `EXPECT` instead of `EXPECT_EQUAL`? `EXPECT_EQUAL` is appropriate when you have a result that can be compared for equality to an expected result (e.g.
two numbers, two strings, two Vectors, etc.). For most situations, confirming that your code "got the right answer" is exactly what you need. On the other hand, `EXPECT` allows you to express a wider variety of conditions beyond simple equality. For example, you could confirm the truth of a complex set of conditions by using a compound expression such as `EXPECT(x > y && y != z || y == 0);`

### EXPECT_ERROR

The `EXPECT_ERROR(expression)` test macro is used to verify that evaluating the given expression raises an error (i.e. calls the `error()` function). If an error is raised, the test is reported as `Correct`. If not, the test is reported as a failure. As an example, `EXPECT_ERROR(stringToInteger("cow"));` would confirm that an error is raised when trying to convert the non-numeric string to a number value. `EXPECT_ERROR` is used in the specific situation of confirming expected handling of errors within your code.

### EXPECT_NO_ERROR

The `EXPECT_NO_ERROR(expression)` macro is the opposite of the above. If the expression successfully runs to completion without raising an error, then the test is reported as `Correct`. The test is reported as a failure if the `error()` function is called. `EXPECT_NO_ERROR` is used in situations where you want to confirm that functions run to completion on correct input.

### TIME_OPERATION

`SimpleTest` also has support for simple execution timing. The macro `TIME_OPERATION(size, expression)` is used to measure the time it takes to evaluate an expression on an input of the specified size.

```
STUDENT_TEST("Time operation vector sort on tiny input") {
    Vector<int> v = {3, 7, 2, 45, 2, 6, 3, 56, 12};
    TIME_OPERATION(v.size(), v.sort());
}
```

The first argument to `TIME_OPERATION` is the input size; this is used to label this timing result in the output. The second argument is the expression to evaluate. `TIME_OPERATION` will start a new timer, evaluate the expression, stop the timer, and report the elapsed time.
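
The start/evaluate/stop/report sequence described above can be sketched in standard C++ with `std::chrono`. This is only an illustration of the idea and makes no claim about how `SimpleTest` implements `TIME_OPERATION`; the helper names `timeOperationSeconds` and `timeSort` are hypothetical, and `std::vector` with `std::sort` stands in for the Stanford `Vector` and its `sort()`.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Hypothetical helper: runs `operation` once and returns elapsed wall-clock seconds.
template <typename Operation>
double timeOperationSeconds(Operation operation) {
    auto start = std::chrono::steady_clock::now();  // start the timer
    operation();                                    // evaluate the expression
    auto stop = std::chrono::steady_clock::now();   // stop the timer
    return std::chrono::duration<double>(stop - start).count();
}

// Sorts a private copy of `values` under the timer and returns the elapsed seconds.
double timeSort(std::vector<int> values) {
    return timeOperationSeconds([&values]() {
        std::sort(values.begin(), values.end());
    });
}
```

A single measurement on a tiny input like this mostly reflects timer overhead and noise, which is one reason timing a range of larger sizes tends to be more informative.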
It is often useful to have a sequence of `TIME_OPERATION` calls on different sizes to see the larger pattern. Each operation is individually evaluated and timed. The example below demonstrates the use of `TIME_OPERATION` in a loop to time how long it takes to sort the items in successively larger vectors.

```
STUDENT_TEST("Time operation vector sort over a range of input sizes") {
    // the upper bound, step, and fill values here are illustrative
    for (int size = 50000; size < 1000000; size *= 2) {
        Vector<int> v;
        for (int i = 0; i < size; i++) {
            v.add(randomInteger(1, 1000));
        }
        TIME_OPERATION(v.size(), v.sort());
    }
}

// TIME_OPERATION can be combined with EXPECT to also check the result
STUDENT_TEST("Time operation vector sort and confirm the result is sorted") {
    Vector<int> v = {3, 7, 2, 45, 2, 6, 3, 56, 12};
    TIME_OPERATION(v.size(), v.sort());
    EXPECT(checkIsSorted(v));
}
```

### runSimpleTests

The `main` function of our projects will begin by offering the user a choice in what to execute: run all the tests, select which tests to run, or run no tests and proceed with normal execution. It does this by calling the `runSimpleTests` function as shown below:

```
int main() {
    if (runSimpleTests(SELECTED_TESTS)) return 0;  // argument is ALL_TESTS or SELECTED_TESTS
    ... // rest of normal main() here
}
```

The argument to `runSimpleTests` is either:

- `ALL_TESTS` (run all tests for all files)
- `SELECTED_TESTS` (provide menu for user to select which tests to run)
    + The user can enter zero for "no tests", which causes the program to continue with the rest of `main()`

## Debugging a failing test

Your goal when testing your code should be to get all of your tests to pass. However, if you get a failed test result, don't look at this as sad times; this test result is news you can use. The failing test case indicates an operation that behaved unexpectedly. This means you know where to focus your attention.

Dig into that test case under the debugger to analyze how it has gone astray. Set a breakpoint inside the test code block, and choose to stop at the line that is at or before the failing `EXPECT/EXPECT_EQUAL` statement.

![screenshot of setting breakpoint in debugger on EXPECT statement](img/break_on_test.png){: width="80%"}

Now run the tests using the debugger.
When the program stops at the breakpoint, single-step through the code while watching the variables pane to observe how the state of your program changes, using a technique just like you did in the [debugging tutorial][debugger_tutorial] in Assignment 0.

After you understand the failure and apply a fix, run that test again. When you see the test pass, you can [celebrate having squashed that bug!](http://phdcomics.com/comics/archive.php?comicid=180)

## Debugging your test cases

Your test cases are implemented as code, which means that they, too, can have bugs of their own. Having a bug in your test case can truly be a maddening experience! A test case that produces a false negative can lead you to investigate a non-existent defect, and a false positive lulls you into overlooking a lurking one. You attribute the erroneous test result to the code being tested, yet the real issue is within the test case itself. Unlike hackneyed sitcom plots, hilarity does _not_ ensue from this misunderstanding.

For example, suppose you have written a function that returns the square of a number. You write some tests for it:

```
int square(int n) {
    return n * n;
}

STUDENT_TEST("confirm my square function works correctly for 5, 10, and 15") {
    EXPECT_EQUAL(square(5), 25);
    EXPECT_EQUAL(square(10), 100);
    EXPECT_EQUAL(square(15), 275); // this test case is BUGGY!
}
```

The first two tests pass but the third will fail. The square of 15 is actually 225, not 275. The problem isn't with the `square()` function, but with the buggy test case that produces a false negative. Every programmer can relate to a time when a buggy test case reported an erroneous failure that led to a wild goose chase to find a non-existent flaw in code that was correct all along, argh!

There can also be tests that produce a false positive, i.e. report that code is `Correct` when it has a defect. This could be due to a buggy test case that compares to the wrong expected value, such as shown above.
Another source of false positives is when your test cases are not sufficiently robust or comprehensive to surface the problem. If `square()` returned the wrong value only for negative inputs and your test cases only tested positive inputs, you would receive all `Correct` results and no mention of the lurking defect. Or perhaps you took a shortcut and wrote your test cases to only confirm that `square()` returned a non-negative value (e.g. `EXPECT(square(15) >= 0)`) without checking the specific value. These test cases are not buggy per se, but they are not thorough enough to fully vet the code being tested.

A key takeaway is that your test results are meaningful if and only if your test cases are accurate and robust. Put extra care into verifying that each test case is properly constructed and produces accurate results. Ensure your suite of test cases covers a comprehensive range of scenarios, including unusual inputs and edge conditions. Now when your program earns its clean sweep of `Correct` results, you can celebrate that success with confidence!

## Test-driven development

We highly recommend employing test-driven development when working on your assignments. To do so, follow these steps:

- identify a small, concrete task (bug to fix, feature to add, desired change in behavior)
- construct tests for the desired outcome, add them to the file in which you're currently working, and verify the current code fails these tests
- implement the changes in your code to complete the task
- re-run your newly added tests and verify they now succeed
- test the rest of the system (by running all tests) to verify you didn't inadvertently break something else

This process allows you to change only a small amount of code at once and validate your results with carefully constructed tests before and after. It keeps your development moving forward while ensuring you have a functional program at each step!
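
As a sketch of one pass through the steps above, suppose the small task is to add a function that counts vowels. The function `countVowels` is hypothetical, and plain `assert` stands in for `EXPECT_EQUAL` so the example compiles without the course libraries; in an assignment you would put the same checks in a `STUDENT_TEST` block. The tests are written first and fail against an empty stub (for instance one that always returns 0), and the implementation is then filled in until they pass.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Written last: the implementation, filled in until the tests below pass.
// countVowels is a hypothetical function used only for this illustration.
int countVowels(const std::string& s) {
    const std::string vowels = "aeiou";
    int count = 0;
    for (char ch : s) {
        char lower = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
        if (vowels.find(lower) != std::string::npos) {
            count++;
        }
    }
    return count;
}

// Written first: tests that pin down the desired outcome before any code exists.
void testCountVowels() {
    assert(countVowels("stanford") == 2);  // common case
    assert(countVowels("") == 0);          // edge case: empty string
    assert(countVowels("AEIOU") == 5);     // edge case: uppercase letters
    assert(countVowels("rhythm") == 0);    // edge case: no vowels at all
}
```

After these pass, the last step is to re-run every other test in the project to confirm nothing else broke.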
## Test cases and grading The `SimpleTest` framework will be supplied with each assignment, and there will be some initial test cases provided in the starter project, but you will also be expected to add your own tests. You will submit your tests along with the code, and the grader's review will consider the quality of your tests. We will also provide comments on your tests to help you improve your testing approach. Please incorporate our feedback into future assignments; it will improve your grade and, more importantly, your effectiveness as a programmer. We guarantee future employers will appreciate your ability to write good tests and well-tested code! Here are some things we look for in good tests. - Are the tests comprehensive? Is all the functionality tested? - Where possible, are the tests self-contained and independent? - Did you anticipate potential problems, tricky cases, and boundary conditions? - Did you develop the tests in a good order? Did you test basic functionality before more advanced functionality? Did you take small, carefully chosen steps? ## Common questions {:.faq} ### Should each `EXPECT`/`EXPECT_EQUAL` be in a `STUDENT_TEST` code block of its own or can I list several within one code block? For tests that are closely related, it may be convenient to group them together in the same code block under one test name. The tests will operate as one combined group and show up in the report as one aggregate success (if all pass) or one failure (if at least one fails). However, there are advantages to separating each individual test case into its own code block. You will be able to choose a clear, specific name for this block. The separation isolates each test so you can easily identify exactly which cases are passing and which are failing. For example, if you have, ```c++ STUDENT_TEST("Many tests together"){ EXPECT(... Test A ...) EXPECT(... Test B ...) EXPECT(... Test C ...) 
} ``` then if Test B fails, Test C will never run and you won't be able to see the output - you won't know if Test C passed or failed. On the other hand, if you structure your tests like this ```c++ STUDENT_TEST("Test A"){ EXPECT(... Test A ...) } STUDENT_TEST("Test B"){ EXPECT(... Test B ...) } STUDENT_TEST("Test C"){ EXPECT(... Test C ...) } ``` then all the tests will run individually, and even if Test B fails, you will still get independent information about Tests A and C. Having this sort of isolated behavior might make debugging any problems you encounter a little bit easier! ### When an assignment requirement says to "add 2 tests," do we count each `STUDENT_TEST` or each `EXPECT_EQUAL`? Each use of `EXPECT`/`EXPECT_EQUAL` is counted as one test case. Read the answer to the previous question for some things to consider when deciding whether to group multiple test cases under a single `STUDENT_TEST` group or keep separated. ### The font/sizes/colors in the Simple Test result window are not pleasing to me. Can I customize the display? Yes! Look in the Qt project browser under `Other files->testing` for a file named `styles.css`. This file is the CSS stylesheet for the Simple Test window. Edit this file to change the display styles. Each project has its own copy of the stylesheet. Copy the edited stylesheet from this project into a new project to carry those customizations forward.
http://cs106b.stanford.edu/class/cs106b/assignments/2-adt/warmup.html
![How I got better at debugging by Julia Evans](https://drawings.jvns.ca/drawings/better-debugging.png) {: .w-50 .mx-auto } During this course, you will write quite a lot of C++ code. But writing the code is only the first step; you also need strong skills in testing and debugging to bring a program to successful completion. Knowing your way around the debugger is key. Our assignments will feature warmup exercises that are designed to give you guided practice with the skills and tools for effective testing and debugging. This warmup exercise demonstrates use of the debugger and testing on the ADT types. You are to answer the questions posed below by writing your answers in the file `short_answer.txt`. This file is submitted with your assignment. ## 1) View ADTs in debugger (manually configure if needed) Look over the provided code in `adtwarmup.cpp` that operates on Stacks, Queues, and Sets. Some of this code is buggy and provided for the purpose of practicing your testing and debugging skills. The `reverse` function is implemented correctly and does not have bugs. It uses a `Stack` to reverse the contents of a `Queue`. Build and run the program. When prompted, enter the choice to run tests from `adtwarmup.cpp`. The `reverse` function should pass its single test case. (Ignore the test failures reported for the other functions, we'll get to those in a bit). Use the `reverse` function to practice debugging an ADT. Set a breakpoint on the first `while` loop in `reverse` and run the program in Debug mode. When the debugger stops at your breakpoint, look at the `Variables` pane in the upper-right of the Qt window. You should see the variables `q` , `s` and `val`. Expand `q` by clicking the triangle to the left of its name. 
The expanded contents of the Queue should look like this:

![screenshot of upper right pane of debugger which shows values of variables](img/debugger-adts.png)
{: .w-50 .border .mx-auto .mb-2}

> __IMPORTANT:__ If the Queue contents in your debugger look very different from the screenshot above, this likely means that your debugger is not configured to properly display variables of Stanford collection type. Stop here and follow the [instructions to configure your Qt debugger](http://web.stanford.edu/dept/cs_edu/resources/qt/debugging-helper).
{: .alert .alert-danger}

The Queue `q` was passed as an argument to `reverse`; its contents were initialized in the calling function to `{1, 2, 3, 4, 5}`. The Stack `s` and integer `val` were declared as local variables in `reverse`. Neither of these variables was assigned an initial value.

In the debugger variables pane, `s` is displayed as an empty Stack. The Stack class has a "default" initializer that configures a new stack that is not otherwise initialized. The default initializer for Stack makes an empty stack. In fact, __all__ of the Stanford collection classes have a default initializer that creates an empty collection.

Compare that behavior to what you see for the integer variable `val`. Like `s`, the variable `val` was declared without an explicit initialization. Unlike the collection types, the int type __does not__ have a default initializer. This means the int variable is left uninitialized. The debugger displays the variable's "value," but that value is merely the leftover contents in the memory location where `val` is being stored. In the screenshot above, the leftover value happened to be `28672`, but you may see something different on your system. Using the value of a variable that has not been initialized will lead to erratic results. For now, just file this fact away.
If at some later point in debugging, you observe a variable holding a nonsensical value, check whether a missing initialization is the cause.

Use the __Step Over__ button to single-step through the first iteration of the while loop. After executing the assignment to `val`, the uninitialized garbage value is replaced by a sensible one. Single-stepping the next line pushes the value onto the stack. Expand the Stack `s` by clicking the triangle to the left of its name. You should now have both the stack and queue expanded. Continue single-stepping through the loop. As you step, keep your eye on the Variables pane and watch values being moved on and off the stack and queue.

{% question %}
The display of the Stack in the debugger uses the labels `top` and `bottom` to mark the two ends of the stack. How are the contents labeled when the Stack contains only one element?
{% endquestion %}

You now know how to inspect and interpret ADTs in the debugger. We hope that you will find this a useful skill when working on the rest of the assignment.

## 2) Test duplicateNegatives

Testing and debugging are closely related. After writing a piece of code, you will want to test it to verify its behavior. If you uncover a problem, you run your test case under the debugger to further diagnose the bug.

The intention of the function `duplicateNegatives` is to modify a `Queue` to duplicate each negative number, for example turning the queue `{3, -5, 10}` into `{3, -5, -5, 10}`. The given code is buggy and does not behave correctly on all inputs.

The provided test cases include an input with no negative numbers, one with a single negative number, and another with mixed negative numbers. Some of these inputs are problematic. Run those tests and observe which tests pass and which do not. Do you see any pattern to which inputs work correctly and which do not?

There is no test for input of all negative numbers, so you decide to add one. Write your own `STUDENT_TEST` for this case. Run the tests again.
This new case seems to be taking a really, really long time to run. In fact what has happened is that the program has entered an _infinite loop_. An infinite loop is one that never reaches its stopping condition. Use the interrupt button on the debugger ![interrupt button](img/interrupt.png) to stop the program.

__Pro tip: dealing with an infinite loop__

When a program is stuck, your only recourse is to manually intervene to stop it.

- If not running in Debug mode, you can stop the program by closing the console window or choosing "Quit" from the menu bar. These actions forcibly exit the program.
- If running under the debugger, your better option is to _interrupt_ the program using the interrupt button ![interrupt button](img/interrupt.png). Interrupting the program will pause its execution and return control to the debugger. The program state is preserved and can be examined in the debugger. This will allow you to gather more information to diagnose what's gone wrong.

What have we learned so far from testing `duplicateNegatives`? The function works correctly for some inputs. For other inputs, the function completes but produces the wrong result. Your new test shows there is also a third category, where some inputs go into an infinite loop.

Precisely identifying what kind of inputs trigger a problem is very helpful, as this will focus the debugging process. You have observed that an input of all negative numbers triggers an infinite loop. Is there more to the pattern? What if only half of the input numbers are negative or if the input starts or ends with a negative number? Add more student test cases and re-run until you narrow in on the precise trigger for the infinite loop.

Gather the results of your observations and answer the following question in `short_answer.txt`:

{% question %}
For which type of inputs does the function go into an infinite loop?
{% endquestion %}

Rather than identify one specific input, describe the general characteristics of all such inputs.

## 3) Debug duplicateNegatives

Now that you've observed the buggy behavior and know what kind of input triggers it, let's use the debugger to diagnose the flaw. (You may have already seen the bug when reading over the code; if so, great! But the purpose of this exercise is to show you a methodology for using the debugger that will help you in later times when you cannot spot the bug just from reading the code.)

Start with your test case that goes into an infinite loop. Set a breakpoint on the call to `duplicateNegatives` inside the test case and run in Debug mode. When the breakpoint is hit, __Step Into__ the call to `duplicateNegatives` and then use __Step Over__ to single-step through a few iterations of the `for` loop. Expand the variable `q` to see its contents and pay attention to the changing values for `i` and `q`. Trace out what is happening and work through why the loop never reaches its intended termination condition.

Answer the following question in `short_answer.txt`:

{% question %}
What is the bug within `duplicateNegatives` that causes it to get stuck in an infinite loop?
{% endquestion %}

Given the above detective work, come up with a fix to apply to the `duplicateNegatives` code. Try out that fix and see that it resolves the problem with the infinite loop inputs.

As a followup, re-test the inputs that terminated but produced incorrect results. You should find they are also working correctly now. In this case, the same underlying flaw was producing two seemingly unrelated symptoms. Debugger use for the win!

## 4) Recognize a common ADT error in `absoluteValues`

The last part of the warmup is learning how to recognize when a test case fails due to raising an error. The function `absoluteValues` is intended to modify a `Set` to remove any element that is negative and replace it with the absolute value of the element.
Run the provided test cases. The function passes the first test, which makes no change, but the subsequent test goes down in flames. An error is raised during the test execution. The error message reports that __it is disallowed to add/remove elements in the midst of iterating over that collection__. (If you think it through, you can see why this would be problematic....) Students very commonly run afoul of this restriction, so we thought we'd get it on your radar before it trips you up.

When an error is raised in the middle of a test, SimpleTest reports it like this:

```output
Test failed due to the program triggering an ErrorException. This means that the test did not fail because of a call to EXPECT() or EXPECT_ERROR() failing, but rather because some code explicitly called the error() function.
```

When you see this message, it means a fatal error was raised when running the test and that the error prevented the test from completing. The error was not expected. It is due to a bug in your code that attempts an illegal operation. Sometimes there is additional commentary which further explains what made the operation illegal, e.g. index out of bounds, attempt to read a non-existent file, or modification of a collection while iterating over it.

You follow the same debugging process for an error as for a failing test case: set a breakpoint in the test case and step through the test to see what has gone wrong. There is one added twist - you can step up to, but cannot step over, the actual crashing operation. If you step over an operation that crashes, a clever bit of C++ "hyperjumps" control to an error-handling routine (or your program may terminate if the error is more catastrophic). You must restart the program and step up to the crash again to regain the context at the point of the crash.

That's it. Now that you've made it through the debugging warmup, you should be prepared with all the skills you need to attack the rest of this assignment. Go to it!
