Building a QA system (IID SQuAD track)

In this project, we build a question answering system that is expected to perform well on SQuAD. Our approaches include retraining the baseline model, improving the embedding (BiDAF), modifying the attention (Dynamic Coattention Model), replacing the LSTM with a GRU, and applying a transformer (QANet). In our experiments, both BiDAF and QANet outperform the baseline model, with QANet being our best model. QANet borrows features from the other modifications mentioned above and consists of four layers: (1) an embedding layer, where a combination of character-level and word-level embeddings, built on a pre-trained word embedding model, maps the input into a vector space; (2) a contextual embedding layer, where an encoder block uses contextual cues from surrounding words to refine the embedding of each word; (3) an attention flow layer, where a coattention-like implementation produces a set of query-aware feature vectors for each word in the context; and (4) a modeling and output layer, where a stack of encoder blocks with fully-connected layers is used to scan the context and provide an answer to the query. Submitting our best model to the test leaderboard yielded an F1 of 66.43 and an EM of 62.45.
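For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how EM (exact match) and token-level F1 are commonly computed for a single prediction on SQuAD-style data. It assumes the usual answer normalization (lowercasing, removing punctuation and English articles, collapsing whitespace); the function names are illustrative and are not taken from the project code. Leaderboard numbers such as those above are typically these per-example scores averaged over the dataset and expressed as percentages, often taking the best score over several reference answers, which this sketch leaves out.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a predicted span "in Paris France" against the reference "Paris" scores EM 0 but a partial F1, because one of the three predicted tokens matches the single reference token. This is why F1 is usually somewhat higher than EM on the same model.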