#include "lexicon.h"

class Lexicon

This class is used to represent a lexicon, or word list. The main difference between a lexicon and a dictionary is that a lexicon does not provide any mechanism for storing definitions; the lexicon contains only words, with no associated information. It is therefore similar to a set of strings, but with a different internal representation. The Lexicon class supports efficient lookup operations for words and prefixes.

This version of Lexicon is implemented internally using a data structure called a prefix tree or trie (pronounced "try"), which allows for efficient prefix and word searching. Some runtimes below are expressed in terms of "W" where W represents the length of the word being added.

Constructor
Lexicon() O(1) Initializes a new empty lexicon.
Lexicon(filename)  O(WN) Initializes a new lexicon that reads words from the given file or input stream.
Methods
contains(word)  O(W) Returns true if word is contained in the lexicon.
containsPrefix(prefix)  O(W) Returns true if any words in the lexicon begin with prefix.
size()  O(1) Returns the number of words contained in the lexicon.
Operators
for (string word : lex)  O(N) Iterates through the words in a lexicon in alphabetical order.

Constructor detail


Lexicon();
Lexicon(string filename);
Initializes a new lexicon. The default constructor creates an empty lexicon. The second form reads in the contents of the lexicon from the specified data file. The data file must be in one of two formats: (1) a space-efficient precompiled binary format or (2) a text file containing one word per line. The Stanford library distribution includes a binary lexicon file named EnglishWords.dat containing a list of words in English. The standard code pattern to initialize that lexicon looks like this:
Lexicon english("EnglishWords.dat");

Usage:

Lexicon lex;
Lexicon lex(filename);

Method detail


bool contains(string word) const;
Returns true if word is contained in the lexicon. In the Lexicon class, the case of letters is ignored, so "Zoo" is the same as "ZOO" or "zoo".

Usage:

if (lex.contains(word)) ...

bool containsPrefix(string prefix) const;
Returns true if any words in the lexicon begin with prefix. Like containsWord, this method ignores the case of letters so that "MO" is a prefix of "monkey" or "Monday".

Usage:

if (lex.containsPrefix(prefix)) ...

int size() const;
Returns the number of words contained in the lexicon.

Usage:

int n = lex.size();

Operator detail


for (string word : lex)
The range-based for loop can be used to iterate through the elements in a collection. The iteration accesses the lexicon words in alphabetical order.

Usage:

for (string word : lex) {
	cout << word << endl;
}