Assign3: Text prediction


A long, long time ago in a land far, far away, smartphones with their full-power keyboards didn't exist, and texting had to be done using a numerical keyboard, with only the digits 0-9 available on a standard phone keypad. With only 10 digits for 26 letters, each digit key was necessarily mapped to several letters and a variety of janky interfaces were used to indicate which letter you intended. An example of an old-style keypad is shown below, with the associated number-to-letter mappings.

telephone keypad

To work around this restriction, there were many algorithms used to try and predict what words a user was trying to type based off of the digits they had entered. One of these algorithms was Tegic T9 (which stands for Text on 9 keys), a predictive algorithm that guessed which letter was intended based on which sequences formed actual words.

Let's work through an example to see how this algorithm works. For example, let's say that the user taps the digit keys "6263".

Thus, your task is to write the function

void predict(string digits, Set<string>& suggestions, Lexicon& lex, string sofar = "")

that implements a T9-style prediction algorithm. The function is given the sequence of digit keys tapped, say, "6263". It recursively forms all of the possible letter sequences that could have been intended ("mamd", "mame", and so on). Whichever of those letter sequences that is a valid word according to the Lexicon is added to the set of suggestions. The final result of calling this function is that all valid English words corresponding to the provided digit sequence are stored in the suggestions set.

Notes

Extensions