Character Embedding and Self-Attention Mechanism with SQuAD
In this project, we demonstrated the effectiveness of character embedding. According to our experimental results, adding a Context2Context self-attention mechanism does not improve the performance of the BiDAF model. The BiDAF model with character embedding already performs well with its Context2Query and Query2Context attention. Adding self-attention to this model introduces additional interference, since the context words attend not only to the query words but also to the context words themselves, which slightly reduced the model's performance. For future work, we could add additive attention to the BiDAF model to see how it compares to the two attention implementations we used. In addition, there are plenty of modern techniques, including the Transformer and the Reformer, that can be further explored to find the best-performing model on the SQuAD challenge.
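To make the Context2Context mechanism concrete, the sketch below shows one plausible PyTorch implementation of such a layer, in which every context word attends to every context word (including itself) via scaled dot-product attention. The class name, hidden-size parameter, and masking convention are illustrative assumptions, not the exact implementation used in our experiments.

```python
# Minimal sketch of a Context2Context self-attention layer
# (illustrative; not the exact implementation from our experiments).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class Context2ContextAttention(nn.Module):
    """Self-attention over the context: each context word attends
    to every context word, including itself."""

    def __init__(self, hidden_size):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)

    def forward(self, context, mask=None):
        # context: (batch, c_len, hidden_size)
        q, k, v = self.query(context), self.key(context), self.value(context)
        # Scaled dot-product similarity between all pairs of context words.
        scores = torch.bmm(q, k.transpose(1, 2)) / math.sqrt(q.size(-1))
        if mask is not None:
            # mask: (batch, c_len); 1 for real tokens, 0 for padding.
            scores = scores.masked_fill(mask.unsqueeze(1) == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        # Each context position becomes a weighted sum of all context values.
        return torch.bmm(attn, v)  # (batch, c_len, hidden_size)
```

Because the attention weights here are spread over the context itself rather than the query, the layer can dilute the query-aware representations produced by the Context2Query and Query2Context attention, which is consistent with the interference effect we observed.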