Checkpoint due: Thursday, November 12 at 11:59 PM
Rest of assignment due: Wednesday, November 18 at 11:59 PM
According to veteran C++ programmers, there are two projects which bring together all the knowledge that a proficient C++ programmer should have:
In this assignment, we will be putting #1 on your resume by building a STL-compliant HashMap.
Recall that the Map abstract data type stores key → value pairs, and provides efficient lookup of the value of any key.
There are two versions of this in the STL: std::map, traditionally implemented using a balanced binary search tree, and std::unordered_map (introduced in C++11), which uses a clever technique called hashing and supports the most map operations in constant time (!).
You will not necessarily write a lot of code on this assignment (~90 lines), but some of your time will be spent on understanding the starter code given to you, and reasoning about design choices and common C++ idioms. You may optionally work with a partner. We recommend having a partner to discuss the design questions. If you choose to work with a partner, we ask that you and your partner submit a single version of the assignment among both of you. This means that you will ideally be working with your partner on a shared version of the code and short answer questions. Feel free to start a shared Google Doc, a private GitHub repository, or anything else that you'd need to keep track of a shared version of the code. You may also choose to schedule video chat sessions with each other to work on the assignment.
The starter code passes 7 of the 14 required test cases. To pass the assignment, your job is to get the remaining 7 test cases (4 in milestone 2, and 3 in milestone 3) to pass, and to reasonably answer all 10 short answer questions.
To help you stay on track for this assignment, we will require you to submit a checkpoint approximately one week before the deadline. For the checkpoint, please have Milestones 0 and 2 finished.
Only Milestones 0, 2, 3, and 4 are required, but we are happy to support you in the other milestones:
Read about hashing. Examine starter code/test harness.
Complete the implementation of rehash.
Overload 4 operators and make Hashmap const-correct.
Implement 4 special member functions.
Answer 10 short answer questions about your code.
Implement initializer_list and range constructors.
Implement an iterator class.
See extensions.
Please download the starter code here. The starter code contains the following files:
hashmap.h: This file contains the function prototypes for the minimal HashMap, and also contains very thorough comments about each function. The prototype for rehash() is provided here, but you will need to add your function prototypes for any function that you write for milestone 2 and 3.hashmap.cpp: This file contains the implementation of the HashMap, and also contains very thorough comments about implementation details. The skeleton for rehash() is provided here, and you will need to declare your own functions for milestones 2 and 3.. student_main.cpp: When the test harness is turned off (see below), the student_main() function in this file will be run. You can treat this as if you were writing your own main function to test your program.test_settings.cpp: This file contains macros which lets you on the test harness. If you turn on the test harness (#define RUN_TEST_HARNESS 1), test cases will be run. If it is turned off (#define RUN_TEST_HARNESS 0), the student_main function will be run. This is analogous to using NO_TESTS vs. ALL_TESTS in CS 106B's SimpleTest.tests.cpp: This file contains the code for each test case. You are welcome to look at this, add some map.debug() statements, or even add your own test cases. We will test your code on an unaltered version of this file, so be careful about changing this file. Note: some authors insert a node to the end of a bucket's linked list. We will always insert into the front, since order doesn't matter and inserting to the front is faster and easier.
We could have asked you to implement a HashMap yourself given you knowledge from CS 106B. However, since practicing linked list algorithms is not the main goal of CS106L, we've given you the starter code.
Read through the starter code's interface (.h) and template implementation (.cpp) files to understand what has already been implemented. This handout you are reading is fairly sparse, and the reason is that there is a lot of documentation provided in the starter code. You should be able to identify the standard map interface (constructor, insert(), at(), contains(), etc.) as well as the hash table implementation (_bucket_arrays, node*'s, etc.).
You can call map.debug() inside the test cases to see a visual representation of the map printed to the console. You can call this function in your student_main function without the test harness, or inside the test cases. This will be useful when you begin milestone 1.
There's nothing you need to turn in after completing this milestone, but please do make sure that you're comfortable with the above content before moving on.
This milestone is optional. It's a challenging linked list problem that will be great practice for CS 106B assignments and exams.
Implement the public member function rehash(), whose expected behavior is documented in the .h file. Your implementation of rehash() must not allocate or leak memory; instead, it should reuse existing nodes.
The purpose of this milestone is to get you familiarized with the starter code and with working in a template class. It should also serve as a good review of the linked lists algorithms you learned in CS 106B. Our solution is 11 lines of code long. If you are unsure about the hash functions, we recommend looking at the implementations of insert() and erase().
Once you are done, turn on the two tests for Milestone 1:
rehash() with calls to erase() and insert(). Finally, there is a check to see if you correctly throw an exception when rehash(0) is called (this was done for you in the very first line of rehash()).Test 1B roughly checks the internal correctness of the rehash() function. The idea is that buckets with longer linked lists should take longer to search, so we time various calls to contains() to verify that the sizes of the linked lists are roughly correct. This test is fairly rough as it depends on the speed of your computer, but we give it a large enough leeway that you should be able to pass the test. It is fine if this test fails with a small probability.
In this section, you will first implement four operators, then make your HashMap const-correct. After implementing an operator, turn on that operator's tests. There is one test for each category of operator, and one for const-correctness.
[])The index operator accepts a key, and returns a reference to its mapped value. The index operator should support auto-insertion—if a key is not found, then a new key/mapped pair is inserted with the given key and a default value of the mapped type.
<<)The stream insertion operator writes the contents of a HashMap into an output stream. The format resembles the format used by the Stanford HashMap, as seen below:
HashMap<string, int> map;
cout << map << endl; // prints "{}"
map.insert({"Anna", 2});
cout << map << endl; // prints "{Anna:2}"
map.insert({"Avery", 3});
cout << map << endl; // prints either "{Avery:3, Anna:2}"
// or "{Anna:2, Avery:3}"
The stream insertion operator will need to support chaining.
==) and Inequality (!=)K, M, and H in this template declaration because these template parameters are already in use in the class declaration. Instead, we can use names other than K, M, and H for our function's template declaration.
Any good STL-compliant class must have correct special member functions.. Recall that there are six major special member functions:
The first two are implemented for you; your job is to implement the last four. Specifically, the copy operations should create an identical copy of the provided HashMap, while the move operations should move the contents of the provided HashMap to this. Avoid memory leaks and perform your copy/move as efficiently and safely as possible.
We are requiring you to use the member initializer list (colon after constructor) whenever possible for efficiency.
Test 3A creates copies of your HashMap using the copy constructor and copy assignment operator, and verifies their correctness. In addition, there are tests for edge cases such as self-assignment. Test 3B tries to move HashMaps from one to another using the move constructor and move assignment, and verifies correctness just like test 3A. Note that Test 3B passes even if you only implemented the copy operations (recall that if no move operations are declared, the compiler defaults to the copy operations). Test 3C times your move operations to verify that you are actually moving the contents of the HashMaps rather than copying the contents.
The tests use some compiler directives (#pragma) to silence compiler warnings about self-assignment. You can safely ignore those.
Answer the following questions in the file named "short_answers.txt". These questions do not require long answers or even complete sentences. We will grade for reasonable correctness and effort—see an example below:
Consider the test case B_move_ctor_assignment. You should see a line with std::move(map1). Then, in the next line, we move from map1 to create move_constructed. Why is this valid?
Answer: std::move marks map eligible to be moved, but does not actually move. A function taking an r-value (eg. move constructor) does the moving. map1 is not passed to such a function, so we can still move from map1.F_custom_hash_function() of the file tests.cpp. What is the purpose of decltype, and is there a way to write that line of code without using decltype? Explain.std::pair<const K, M>. What would be the problem in the HashMap class if value_type were instead std::pair<K, M>?check_map_equal() inside tests.cpp. You might notice that in the test cases, we use check_map_equal() to compare a HashMap and a std::map, but never to compare two HashMaps (we instead use operator==). Why can't we use check_map_equal() to compare two HashMaps? at(), but only a non-const version of operator[]? (unlike in Vector where operator[] also had a const version)operator<< is the most interesting. operator<< as a member, a friend, or neither? Why?operator<< as an exercise in operator overloading, but none of the STL collections have operator<< overloaded. Why do you think the STL designers made this decision?MyClass which has three private members - a size_t, a function object, and a std::vector<int>.
In this case the rule of zero applies and you should not write your own special member functions. In your HashMap class the rule of five applies - the starter code had the destructor, and you implemented the other four special member functions.
Why does the rule of zero apply to MyClass?
size_t, a function object, and a std::vector?For the checkpoint, you should submit Milestone 2 in two files: hashmap.cpp and hashmap.h. If you worked with a partner, please have one partner submit the assignment. There will be a place on Paperless for you to enter in the other partner's SUNet ID so that you both get credit.
When you're finished, submit three files: hashmap.cpp, hashmap.h, and your short-answer questions on Paperless! Follow the same instructions for partner work as above.
And finally... Good luck! :)
All of these are optional. Some may require supplemental material or additional readings. We provide an autograder for milestones 5 and 6. These tests are slightly less polished, but if you find any bugs, please let us know!
Add a constructor that allows you to construct your HashMap with an std::initializer_list. Also add a constructor that accepts a range (in the form of two iterators) to initialize your HashMap. This is a fairly easy extension that can be implemented in around 15 lines of code.
Using your response to question 7 in Milestone 4, implement an iterator class for your HashMap. Then, add support for const_iterator, and make sure that you can create const iterators, const_iterator, and const const_iterator. To reduce the repeated code, you may need to do some additional reading about type_traits and template metaprogramming. Talk to us if you want some guidance on this. Once you are done, change HashMap.cpp to use iterators. The code is much simpler with iterators!