Q: does the capital letter matter?
A1: It is going to be stored as lowercase for this data structure, but you could make that important if you wanted to
Q: So hash table doesn’t work because the point of hash table is to put words into unique buckets but contains prefix requires words with the same prefix to be in the same bucket?
A1: Yes, exactly! Specifcially it would make containsPrefix very inefficient (we would have to search all words in the hash trable)
A2: Well, there is no way to search by a prefix — the hash table only hashed by the full word.
Q: The graph is directed right?
A1: Yes, it’s a tree
A2: Yes, in the same way a tree would be (only follow links from parents to children)
Q: Would the array be built with a “pre-order” traversal if you were to build it from the trie?
A1: It would be a level order traversal
Q: if there are varying number of children, how can we tell where each group of children stop?
A1: You are looking at one word at a time, so you go to A, then its children (and you know where they are from A) and then to the next children, etc. No need to know anything else about the structure.
Q: What about words (like about) that end in t but can’t have those suffixes (aboution) isnt a word
A1: They would not be included in the shared suffix aggregation – they would just be represented “normally” I believe
Q: Why do we do we do “25 bits childIndex”? Why 25?
A1: live answered
A2: Its a little bit of backward calculation – we want the overall struct to be 4 bytes (32 bits). After using 5 bytes for the char, 1 for isWord and 1 for lastChild, we have 25 bits left for the child index count
Q: so would you store this bit collection as a primitive type (like long long) and then typecast to the struct?
A1: Yes, you would just load the raw memory and then lay the struct over it.
Q: how would you only use 1 bit for something? Is that something you learn in 107?
A1: Yes, you’ll get super familiar with that in 107!
Q: Does suffix unification take place after creating the initial trie or does it happen as each word is added?
A1: live answered
Q: If we have a true/false for 'lastChild' and no pointers, how do we deal with cases like 'too' and 'tool'?
A1: live answered
Q: Where can we read more about the implementation of Lexicon (i.e. the specific algorithms of Dawg generation, etc?)
A1: There’s the original paper linked at the end of the lecture slides!
Q: Are dawgs still used as the foundation of most of the modern autocomplete algorithms?
A1: live answered
Q: sorry i think i missed it but will Julie still have office hours today?
A1: Yes
Q: If you wanted to include the word “add” would it jump down from the top d to the bottom left d
A1: live answered