9. Fast Fourier Transform
Thu Nov 12
Our goal for this week will be to take the code for a naive Fourier transform and team up to optimize it through recursive divide-and-conquer, taking key insight from the Cooley-Tukey Fast Fourier Transform algorithm. Here is my derivation of the math behind the very clever Cooley-Tukey decomposition; this math forms the basis for how to rework the code.
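For a taste of where we're headed, here is a minimal recursive radix-2 sketch of the Cooley-Tukey idea. This is my own illustrative code (the `fft` name and structure are just a sketch, and it assumes the input length is a power of two); the real session code will build up from our naive transform.

```cpp
#include <complex>
#include <vector>

// Minimal recursive radix-2 Cooley-Tukey FFT sketch.
// Assumes n is a power of two. O(n log n) instead of the naive O(n^2).
std::vector<std::complex<double>> fft(const std::vector<std::complex<double>>& x) {
    int n = x.size();
    if (n == 1) return x;
    std::vector<std::complex<double>> even(n / 2), odd(n / 2);
    for (int i = 0; i < n / 2; i++) {   // split into even/odd-indexed halves
        even[i] = x[2 * i];
        odd[i]  = x[2 * i + 1];
    }
    std::vector<std::complex<double>> e = fft(even), o = fft(odd);
    std::vector<std::complex<double>> result(n);
    const double PI = 3.14159265358979323846;
    for (int k = 0; k < n / 2; k++) {   // combine using twiddle factors
        std::complex<double> t = std::polar(1.0, -2.0 * PI * k / n) * o[k];
        result[k]         = e[k] + t;
        result[k + n / 2] = e[k] - t;
    }
    return result;
}
```

Each level splits the input into even/odd halves and stitches the two half-size transforms together with the twiddle factors, which is exactly where the O(n log n) savings comes from.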
Pre Todo
I'd like to ask each of you to put in a little prep work so you come in with a basic understanding of the Fourier transform. I've been hunting down introductory resources and can highly recommend these two:
 A superb visual explanation from 3Blue1Brown https://www.youtube.com/watch?v=spUNpyF58BY
 Grant's ability to communicate deep concepts with illuminating visualization is unsurpassed. This video is well worth 20 minutes of your time!
 A fun interactive guide from Better Explained https://betterexplained.com/articles/an-interactive-guide-to-the-fourier-transform/
 Helpful use of analogy, motivating context, and great examples and interactive tools that build understanding through intuition and exploration. I had great fun playing with the visualization of time <-> frequency. Try it out!
Fourier transforms visualized through periodic motion are a popular approach in YouTube videos. I am not sure these work as an introduction all on their own, but after getting the gist of things from the sites above, these animations can reinforce your understanding and are mesmerizing to watch:
 Harmonic circles https://www.youtube.com/watch?v=LznjC4Lo7lE
 Square wave https://www.youtube.com/watch?v=k8FXF1KjzY0
 Smarter Every Day https://www.youtube.com/watch?v=ds0cmAVYek
 The famous Fourier of Homer Simpson https://youtu.be/QVuU2YCwHjw
Later follow-up
I found many good analytic resources, but most dive straight into the math and rarely surface for air. If you're intrigued to learn more, exploring these heftier resources could be a project for winter break.
 FFTW Fastest Fourier Transform in the West http://www.fftw.org/
 Hyper-optimized, full-featured C library implementation developed by Frigo/Johnson at MIT
 Julius Smith III, "Mathematics of the Discrete Fourier Transform" https://ccrma.stanford.edu/~jos/log/
 The definitive reference on the DFT, written by a professor at Stanford CCRMA (and former colleague of mine at NeXT!)
 Lovely Digital Signal Processing primer from Jack Schaedler https://jackschaedler.github.io/circles-sines-signals/
 Brad Osgood teaches an entire Stanford course, EE 261, on the Fourier transform and its applications.
 All course materials/lectures https://see.stanford.edu/Course/EE261
 Lecture 22 works through the algebra for FFT reduction of DFT https://see.stanford.edu/Course/EE261/142
 Norwood Russell Hanson, "The Mathematical Power of Epicyclic Astronomy", 1960. A volume of sheer insight packed into 10 pages, wow!
 Visual Microphone https://www.youtube.com/watch?v=FKXOucXB4a8
 Research using Fourier analysis to reconstruct sound from video, amazing (and a little creepy)
 Abe Davis TED Talk https://www.youtube.com/watch?v=npNYP2vzaPo
8. Data compression
Thu Nov 05
For today's meeting, I'm planning to give a quick introduction to the Lempel-Ziv-Welch compression algorithm and then quickly move us into breakouts so we can get coding. If we hunker down and apply the power of teamwork, we may be able to pull off a complete LZW encoder-decoder, which would be a cool accomplishment! We can then compare/contrast Huffman to LZW in terms of effectiveness, performance, ease of implementation, and so on.
 Wikipedia https://en.wikipedia.org/wiki/LZW
 Note there are a number of different variants in the family. Abraham Lempel and Jacob Ziv published the original LZ77 and LZ78 in 1977 and 1978. We are going to look specifically at LZW, an improved variant from Welch in 1984.
 Eric Roberts taught a very popular Sophomore College seminar "The Intellectual Excitement of Computer Science". Here are his resources on data compression and LZW https://cs.stanford.edu/people/eroberts/courses/soco/projects/datacompression/lossless/lz78/index.htm
 Algorithm walkthrough with examples http://michael.dipperstein.com/lzw/
 Note that the decoder runs one step behind the encoder in building the dictionary, making it possible for the decoder to encounter a "surprise" code (which needs special-case handling)
 I remember reading Mark Nelson's LZW article in Dr. Dobb's in 1989 (I'm old!) https://marknelson.us/posts/1989/10/01/lzw-data-compression.html
 Mark's page links to a more recent update with code for an implementation in C++ https://marknelson.us/posts/2011/11/08/lzw-revisited.htm
 (Let's plan to not refer to this when writing our own, save for later review/reference)
 Duke University CS Prof Owen Astrachan is a giant in the field (and a longtime friend!). Owen is always up to something cool. Check out this Duke project to develop an assignment where students explore different compression algorithms. https://www2.cs.duke.edu/csed/curious/compression/ Should we do something like this in our CS106B?
 A video by our very own Niveditha on the Shannon limit https://www.youtube.com/watch?v=r7GK4pMODmI
 Claude Shannon, "A Mathematical Theory of Communication", 1948
 His 55-page 1948 paper is the seminal work of information theory. (Scientific American dubbed it the "Magna Carta of the Information Age")
 "The Joy of Data": mathematician Hannah Fry explains information theory to a lay audience in this documentary film.
 The Burrows-Wheeler transform is another cool and accessible algorithm for data compression – check it out! https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform
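To make the encoder/decoder mechanics concrete, including the "surprise" case where the decoder sees a code it hasn't built yet, here is a minimal illustrative sketch. The function names are my own, and this is deliberately bare-bones; per the note above, save comparing against the published implementations for later review.

```cpp
#include <map>
#include <string>
#include <vector>

// Minimal LZW over byte strings. Codes 0-255 are single characters;
// new codes are assigned as longer phrases are seen.
std::vector<int> lzwEncode(const std::string& input) {
    std::map<std::string, int> dict;
    for (int i = 0; i < 256; i++) dict[std::string(1, char(i))] = i;
    int nextCode = 256;
    std::vector<int> output;
    std::string current;
    for (char c : input) {
        if (dict.count(current + c)) {
            current += c;                        // keep extending the match
        } else {
            output.push_back(dict[current]);
            dict[current + c] = nextCode++;      // learn the new phrase
            current = std::string(1, c);
        }
    }
    if (!current.empty()) output.push_back(dict[current]);
    return output;
}

std::string lzwDecode(const std::vector<int>& codes) {
    std::map<int, std::string> dict;
    for (int i = 0; i < 256; i++) dict[i] = std::string(1, char(i));
    int nextCode = 256;
    std::string result, previous;
    for (int code : codes) {
        std::string entry;
        if (dict.count(code)) {
            entry = dict[code];
        } else {
            entry = previous + previous[0];      // the "surprise" case: code not yet known
        }
        result += entry;
        if (!previous.empty()) dict[nextCode++] = previous + entry[0];
        previous = entry;
    }
    return result;
}
```

Try tracing the input "abababab": the encoder emits code 258 one step before the decoder has added entry 258, which is exactly the surprise case the special branch handles.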
7. Simulation and Visualization
Thu Oct 29
We'll explore how to generate results via simulation and consider how data visualization can aid our understanding of a simulated process. The simulation algorithms we will look at belong to the family of "Monte Carlo" methods, which use repeated random trials or statistical sampling. We will use the GWindow class from Stanford's CS106B library for simple drawing.
 Wikipedia https://en.wikipedia.org/wiki/Monte_Carlo_method
 A classic example to estimate pi by throwing darts
 Concise writeup with visualization video https://felixdmr.com/2020/09/20/pifrommontecarlo/
 If you distrust computational pseudorandomness, get true randomness from nature: raindrops! https://www.youtube.com/watch?v=IBC_vI4CAE
 Random walk on roulette wheel
 Configuration of colors on a roulette wheel http://www.fouroulette.com/roulettewheel.htm
 Quorans respond to the prompt "best red/black betting system?" https://www.quora.com/What-is-the-best-way-to-win-roulette-in-red-and-black
 Try proposed strategies in simulation – do they perform any better than a random bet?
 Application of Monte Carlo to fractals and stock market https://www.youtube.com/watch?v=x3b_DAEmZnU
 What is the probability that a random chord of a circle has length longer than the radius?
 Bertrand's Paradox https://en.wikipedia.org/wiki/Bertrand_paradox_(probability)
 Arguments can be made for 1/2, 2/3, and 3/4, each "confirmed" by a Monte Carlo simulation. How is that possible? (afterwards, there's an argument for yet another option, sqrt(3)/2!)
 How can visualization help illuminate what is different about the random sampling?
 Extend idea to sequence of events to get Markov Chain Monte Carlo https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo
 Generating random text https://blog.demofox.org/2019/05/11/markov-chain-text-generation/
 Image synthesis http://graphics.cs.cmu.edu/people/efros/research/EfrosLeung.html
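As a warm-up for the simulation theme, here is the dart-throwing pi estimator in miniature. This is an illustrative sketch using the C++ `<random>` facilities (`estimatePi` is a made-up name, and the seed default is arbitrary):

```cpp
#include <random>

// Estimate pi by sampling points uniformly in the unit square and
// counting how many land inside the quarter circle of radius 1.
double estimatePi(int trials, unsigned seed = 106) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> coord(0.0, 1.0);
    int hits = 0;
    for (int i = 0; i < trials; i++) {
        double x = coord(gen), y = coord(gen);
        if (x * x + y * y <= 1.0) hits++;   // dart landed inside the quarter circle
    }
    return 4.0 * hits / trials;             // area ratio is pi/4
}
```

Notice how slowly the estimate converges: the error shrinks roughly with the square root of the number of trials, which is characteristic of Monte Carlo methods.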
6. Sorting
Thu Oct 22
We'll get a head start on CS106B by doing our own exploration of sorting in advance of Friday's lecture. The competition to be "the" standard sort for a language/library is stiff. What does it take to be "The One"? Let's take a deep dive into Quicksort, one of the most celebrated of all sorting algorithms.
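To ground the discussion, here is a bare-bones quicksort sketch (Lomuto partitioning with the last element as pivot; illustrative only, with none of the engineering the references below dig into):

```cpp
#include <utility>
#include <vector>

// Lomuto partition: put the pivot (last element) into its final position
// and return that index. Elements left of it are smaller, right are >=.
int partition(std::vector<int>& a, int lo, int hi) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    }
    std::swap(a[i], a[hi]);   // pivot lands in its final position
    return i;
}

void quicksort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;     // ranges of 0 or 1 elements are already sorted
    int p = partition(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}
```

Everything interesting in the readings (pivot choice, three-way partitioning, small-range cutoffs, adversarial inputs) is about hardening this simple core.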
 Wikipedia https://en.wikipedia.org/wiki/Quicksort
 Robert Sedgewick's dissertation on Quicksort
 Bob's was the first CS PhD thesis I ever read cover to cover. You couldn't check theses out of the library, and I remember sitting on the library floor until my legs fell asleep. It really was that riveting!
 Jon Bentley & Doug McIlroy, "Engineering a Sort Function", Software Practice and Experience, Nov 1993
 This highly-engineered sort function was the gold standard of sorting for nearly 20 years.
 I think everything about this paper is awesome – packed with algorithmic know-how, beautiful math, performance tuning, testing strategies, and solid engineering practice.
 In 2009, the "Dual-Pivot Quicksort" displaced the previous king and became the standard sort in JDK >= v8
 Video with algorithm animation and helpful voiceover https://www.youtube.com/watch?v=yRJ1Zb4pCM4
 Java code https://algs4.cs.princeton.edu/23quicksort/QuickDualPivot.java.html
 Doug McIlroy, "A Killer Adversary for QuickSort", Software Practice and Experience, Dec 1999
 Not so practical, no, but a devilish sort of fun – any quicksort can be "broken"
 Timsort https://en.wikipedia.org/wiki/Timsort
 Standard sort for Python (merge/insertion hybrid, high-performance, stable)
 The post from Tim Peters himself with the motivation for timsort https://mail.python.org/pipermail/python-dev/2002-July/026837.html
 Algorithm explained https://hackernoon.com/timsort-the-fastest-sorting-algorithm-youve-never-heard-of-36b28417f399
 Sorting diversions
 Bubble sort won't win any prizes, but does have entertainment value
 Owen Astrachan, "Bubble Sort: An Archaeological Algorithmic Analysis", SIGCSE 2003
 Barack Obama cracks the coding interview! https://www.youtube.com/watch?v=k4RRi_ntQc8
 (and Larry Schwimmer, who asked the question, was one of my undergrad advisees!)
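For the curious, here is a compact sketch in the spirit of Yaroslavskiy's dual-pivot scheme. This is my simplified version for illustration, nothing like the heavily tuned JDK code:

```cpp
#include <utility>
#include <vector>

// Dual-pivot quicksort sketch: pick two pivots p <= q and split the range
// into three regions: < p, between p and q, and > q.
void dualPivotSort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    if (a[lo] > a[hi]) std::swap(a[lo], a[hi]);
    int p = a[lo], q = a[hi];
    int lt = lo + 1, gt = hi - 1, i = lo + 1;
    while (i <= gt) {
        if (a[i] < p)      std::swap(a[i++], a[lt++]);  // grow the "< p" region
        else if (a[i] > q) std::swap(a[i], a[gt--]);    // grow the "> q" region
        else               i++;                         // stays in the middle
    }
    std::swap(a[lo], a[--lt]);   // place pivot p
    std::swap(a[hi], a[++gt]);   // place pivot q
    dualPivotSort(a, lo, lt - 1);
    dualPivotSort(a, lt + 1, gt - 1);
    dualPivotSort(a, gt + 1, hi);
}
```

Same divide-and-conquer shape as classic quicksort, but each level does a three-way split, which is part of why it wins on real hardware.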
5. Supercharging recursion
Thu Oct 15
Recursion is a powerful and elegant problemsolving technique, but the exhaustive nature of certain recursive algorithms can come with a high cost in runtime performance. Let's look at techniques for paring down those costs by trading space for time and accepting approximation in place of exactness.
 Memoization is a simple process: save the computed result of an expensive operation to avoid recomputing it in the future. It belongs to the broader category of caching, a technique used in hardware, operating systems, networks, browsers, filesystems, database lookups, and more.
 Wikipedia https://en.wikipedia.org/wiki/Memoization
 How caches work https://computer.howstuffworks.com/cache.htm
 Dynamic programming
 If the problem being solved is constructed of overlapping subproblems, dynamic programming may offer a phenomenal improvement, e.g. bringing an exponential-time recursive algorithm down to polynomial time.
 https://medium.com/swlh/memoization-in-dynamic-programming-through-examples-25c769d59961
 Optimal solution to the 0/1 knapsack is a classic example for DP https://medium.com/@fabianterh/how-to-solve-the-knapsack-problem-with-dynamic-programming-eb88c706d3cf
 What is pseudo-polynomial time?
 A DP algorithm we will explore runs in pseudo-polynomial time, a new category in our Big-O ranking.
 Good explanation on StackOverflow https://stackoverflow.com/questions/19647658/what-is-pseudopolynomial-time-how-does-it-differ-from-polynomial-time
 I have it on good authority that the author of this post is none other than the fabulous Keith Schwarz!
 Greedy approximation/heuristic
 If we can relax the goal of finding a globally optimal solution, a greedy approach that makes the locally optimal choice may work well as a fast heuristic.
 Wikipedia on greedy algorithms https://en.wikipedia.org/wiki/Greedy_algorithm
 Several greedy approximations for the knapsack problem http://www.brainkart.com/article/Approximation-Algorithms-for-the-Knapsack-Problem_8066/
 P != NP? A proof will earn fortune and fame beyond your wildest dreams.
 Millennium Prize https://www.claymath.org/millennium-problems/p-vs-np-problem
 Understanding Minesweeper can make you a millionaire https://www.claymath.org/sites/default/files/minesweeper.pdf
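To tie memoization, DP, and pseudo-polynomial time together, here is a sketch of memoized 0/1 knapsack. The function names are mine, and this is one of several ways to arrange the recursion; note that the O(n*W) cost depends on the numeric capacity W, which is exactly what makes it pseudo-polynomial:

```cpp
#include <algorithm>
#include <vector>

// Memoized 0/1 knapsack: best value achievable using items [i..n) with
// `cap` capacity remaining. O(n * W) time/space, pseudo-polynomial because
// W is a numeric value, not the size of the input.
int best(int i, int cap,
         const std::vector<int>& weight, const std::vector<int>& value,
         std::vector<std::vector<int>>& memo) {
    if (i == (int)weight.size() || cap == 0) return 0;
    int& saved = memo[i][cap];
    if (saved != -1) return saved;                          // already computed
    int result = best(i + 1, cap, weight, value, memo);     // skip item i
    if (weight[i] <= cap) {                                 // or take it, if it fits
        result = std::max(result,
            value[i] + best(i + 1, cap - weight[i], weight, value, memo));
    }
    return saved = result;
}

int knapsack(const std::vector<int>& weight, const std::vector<int>& value,
             int capacity) {
    std::vector<std::vector<int>> memo(weight.size(),
                                       std::vector<int>(capacity + 1, -1));
    return best(0, capacity, weight, value, memo);
}
```

Without the memo table this recursion is exponential; with it, each (item, capacity) pair is computed at most once.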
4. Fractals
Thu Oct 08
Pre Todo
Students who followed up with me after last week were clamoring for "more code!", so I plan to move in the direction of more hands-on activities where we can. For this week, we'll be playing with code that riffs on the classic Sierpinski fractal and exposes intriguing and surprising connections from fractals to other parts of mathematics, including chaos and dynamical systems, iterated function systems, cellular automata, number theory, binary/ternary number representation, space-filling curves, and more. It is all quite magical!
Here are some wonderful background resources. If you have some time to check into these in advance, please do!
 Grant Sanderson of 3Blue1Brown produced two delightful videos that tease out the connections between counting in binary/ternary, the Towers of Hanoi and the Sierpinski curve. The fabulous Keith Schwarz makes a guest appearance, too.
 Michael Barnsley's Chaos game, related to iterated function systems
 What a surprising result!
 Ben Sparks of Numberphile has a great video explanation: https://www.youtube.com/watch?v=kbKtFN71Lfs
 https://beltoforion.de/en/recreational_mathematics/chaos_game.php
 Great site on IFS http://larryriddle.agnesscott.org/ifs/ifs.htm
 Elementary cellular automata
 Good introduction at Wolfram Mathworld https://mathworld.wolfram.com/ElementaryCellularAutomaton.html
 Longer treatment in Nature of Code book https://natureofcode.com/book/chapter-7-cellular-automata/
 Coloring Pascal's triangle by divisibility
 http://larryriddle.agnesscott.org/ifs/siertri/Pascal.htm, includes proof of Lucas' theorem, binomial coefficients mod p, neat number theoretic result, relationship to binary expansion – wow
 Lindenmayer string-rewriting systems, Sierpinski curve
 Wikipedia on L-systems https://en.wikipedia.org/wiki/L-system
 Why Sierpinski is the fractal of Halloween https://blogs.scientificamerican.com/roots-of-unity/a-few-of-my-favorite-spaces-the-sierpinski-triangle/
 The mother of all Sierpinski resources: 270 pages of nonstop action that explores every nook and cranny relating to Sierpinski
 Just for fun, a fractal poem http://larryriddle.agnesscott.org/ifs/poem.htm
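It's striking how little code the chaos game needs. Here's an illustrative sketch (names and starting point are my own choices); you could plot the returned points one pixel each with GWindow and watch the Sierpinski triangle emerge:

```cpp
#include <random>
#include <vector>

struct Point { double x, y; };

// The chaos game: repeatedly jump halfway toward a randomly chosen vertex
// of a triangle. The visited points converge onto the Sierpinski triangle.
std::vector<Point> chaosGame(int steps, unsigned seed = 42) {
    const Point vertices[3] = {{0, 0}, {1, 0}, {0.5, 1}};
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> pick(0, 2);
    std::vector<Point> points;
    Point p = {0.25, 0.25};                        // any starting point inside works
    for (int i = 0; i < steps; i++) {
        const Point& v = vertices[pick(gen)];
        p = {(p.x + v.x) / 2, (p.y + v.y) / 2};    // move halfway toward the vertex
        points.push_back(p);
    }
    return points;
}
```

The "move halfway toward a vertex" rule is an iterated function system of three contractions, which is the bridge from this little game to the IFS resources above.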
I'll have some sample code prepped that will get us started, and you'll have a chance to work more closely on the code in small groups in breakouts. If you are able to, please update your Zoom to version >= 5.3 ("Check for Updates" in the Zoom menu). The 5.3 version adds a few convenience features, one of which is the ability to self-select a breakout room.
3. Search engines, Google's Page Rank algorithm
Thu Oct 01
Pre Todo
Here are some resources to check out in advance of this Thursday's meeting. I'm not asking you to read all of these, but please do try to get the gist of the basic ideas underpinning Page Rank (the Wikipedia page is a good start). If you're eager to go deeper, the original epic Brin/Page papers are quite accessible. To be an informed citizen and consumer of search, it behooves us to know some of the ways those results can be distorted; I picked a few press articles to give a taste.
 Tim Bray (inventor of XML) has a nice series of blog posts on search engine architecture https://www.tbray.org/ongoing/When/200x/2003/07/30/OnSearchTOC
 Sergey Brin and Larry Page's landmark paper introducing Google at the WWW7 Conference in 1998: "The Anatomy of a Large-Scale Hypertextual Web Search Engine" http://infolab.stanford.edu/pub/papers/google.pdf
 The Wikipedia page on PageRank is a good overview (although it gets a little mathy) https://en.wikipedia.org/wiki/PageRank
 The paper that started it all: Larry Page's "The PageRank Citation Ranking: Bringing Order to the Web" http://ilpubs.stanford.edu:8090/422/1/199966.pdf
 Some short articles from popular press on ethical issues raised by search, result filtering/ranking:
 "Google is finally admitting it has a filter-bubble problem" https://qz.com/1194566/google-is-finally-admitting-it-has-a-filter-bubble-problem/
 "The Trouble With the Echo Chamber Online" https://www.nytimes.com/2011/05/29/technology/29stream.html
 "The Dirty Little Secrets of Search" https://www.nytimes.com/2011/02/13/business/13search.html
 Latest update on antitrust case against Google https://www.nytimes.com/2020/09/22/technology/justicedeptcasegooglesearchdominance.html
 Wikipedia on Google bombing https://en.wikipedia.org/wiki/Google_bombing
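The core of PageRank is a short power iteration. Here is an illustrative sketch on an adjacency list (my own simplified version; it ignores dangling pages with no outlinks, which a real implementation must handle, and uses the damping factor 0.85 from the original paper):

```cpp
#include <vector>

// Power-iteration PageRank. links[i] lists the pages that page i links to.
// Simplified sketch: assumes every page has at least one outlink.
std::vector<double> pageRank(const std::vector<std::vector<int>>& links,
                             int iterations = 100, double d = 0.85) {
    int n = links.size();
    std::vector<double> rank(n, 1.0 / n);          // start with uniform rank
    for (int iter = 0; iter < iterations; iter++) {
        std::vector<double> next(n, (1.0 - d) / n);  // "random surfer" teleport term
        for (int i = 0; i < n; i++) {
            for (int j : links[i]) {
                next[j] += d * rank[i] / links[i].size();  // split rank over outlinks
            }
        }
        rank = next;
    }
    return rank;
}
```

On a tiny graph where A links to B and C, B links to C, and C links to A, the iteration settles with C ranked highest (it collects links from both A and B), which matches the intuition of rank as endorsement.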
If you know of other good resources, please share on Ed!
2. Binary search
Thu Sep 24
Pre Todo
 We will be using Ed Workspaces for collaborative coding; be sure you are signed up for the 106M Ed forum.
 Familiarize yourself with the basics of the Stanford Vector class
 Review the CS106B Guide to Testing
For this week's meeting, I'm asking you to not read up on binary search in advance. Just come with a "clean mind" and we'll explore together. I will post reference links after the meeting for follow up.
Resources
Here are references on binary search. Some of these are excerpts from books available in the Stanford Library digital collection. Click the link, authenticate with your Stanford SUNet ID, and find the link to the full book in the section labeled "Available online".
 Jon Bentley, "Writing Correct Programs", Column 4 of Programming Pearls
 Tim Bray, On the Goodness of Binary Search
 Mike Taylor, Are you one of the 10% of programmers who can write a binary search?
 Alberto Savoia, "Beautiful Tests", Chapter 7 of Beautiful Code
 Josh Bloch, Nearly All Binary Searches are Broken
Followup excursions
We got to just the tip of the iceberg this week. Let's keep the discussion going! Below are some further exercises and questions to consider. If you get a chance to try some out, please share your results on Ed and we can all learn from each other and offer feedback.
 Refine algorithm into its cleanest, clearest form
 should low/high be inclusive or exclusive?
 how should low/high change each iteration? what invariant is being maintained?
 is it necessary to ensure low/high always stay within bounds?
To ponder: if low and high are in bounds at the start of the loop, then mid is in bounds. Updating the bound with low=mid or high=mid thus guarantees low/high stay in bounds. But including mid in the remainder to search is not quite right, and as we saw in class, when low was already mid, this does not advance and gets stuck in an infinite loop. Updating the bound to mid+1 or mid-1 seems correct. But in some cases this can cause low/high to go out of bounds. When does this happen? Would this cause a problem? Why or why not?
 what is correct stopping condition?
 some argue that the cleanest form is for the loop not to test for a match and return early, but to always run the loop down to a range of a single element. If that final element is the target, return its index; otherwise, the target is not present. Try rewriting into that form. Do you agree that it is cleaner? Why or why not?
 how should duplicate elements be handled?
 as-is, the algorithm returns the index of a match, but not necessarily the first/last.
 if the specification is to return the index of the first match, how could you change the code to do so?
 does this version still run in logarithmic time? if not, how could you make it so?
 what if requirement were to return the count of matching elements?
 Confirm the correctness of your algorithm
 could be proof via pre/post conditions, invariants
 and/or rigorous test suite
 what mix of whitebox/blackbox/fuzz constitutes "rigorous" coverage?
 Performance studies using TIME_OPERATION
 confirm that binary search is operating in logarithmic time
 determine cost of a recursive implementation versus iterative
 compute "payback horizon"
 how many searches to hit break-even, where the savings of binary vs. linear search offsets the cost of the one-time sort?
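Now that we've explored together, here is one cleaned-up form for reference: inclusive low/high bounds, the overflow-safe midpoint from Bloch's post, and a first-occurrence variant that stays logarithmic. Treat it as one reasonable answer to the questions above, not the only one:

```cpp
#include <vector>

// Iterative binary search over a sorted vector. Inclusive [low, high] bounds;
// mid is computed as low + (high - low) / 2 to avoid the overflow bug
// described in Bloch's post.
int binarySearch(const std::vector<int>& v, int target) {
    int low = 0, high = (int)v.size() - 1;
    while (low <= high) {                     // invariant: target, if present, is in [low, high]
        int mid = low + (high - low) / 2;
        if (v[mid] == target) return mid;
        if (v[mid] < target) low = mid + 1;   // discard left half including mid
        else high = mid - 1;                  // discard right half including mid
    }
    return -1;                                // not present
}

// Variant: index of the FIRST occurrence of target, still O(log n).
// On a match, remember the index and keep searching to the left.
int firstOccurrence(const std::vector<int>& v, int target) {
    int low = 0, high = (int)v.size() - 1, found = -1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (v[mid] == target) { found = mid; high = mid - 1; }
        else if (v[mid] < target) low = mid + 1;
        else high = mid - 1;
    }
    return found;
}
```

Note how each update to low or high strictly shrinks the range, which is the heart of the termination argument.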
1. Meet & Greet
Thu Sep 17
Pre Todo
Nothing to prep, just bring yourself and your enthusiasm. Zoom details. We will use our first meeting to introduce ourselves and brainstorm our plans for the seminar. Look over the list of possible topics.
Post meeting
Everyone please join the Ed forum and weigh in with your topic interests.
Thank you all for coming today and sharing your stories! We had two dozen attend, almost all first-year or transfer students. An awesome journey awaits you at Stanford, and I'm happy to be a part of welcoming you!