Extending a BiDAF model with DCN for Question Answering

img
Our goal in this project is to improve the performance of the Bidirectional Attention Flow (BiDAF) model for the NLP task of question answering on the SQuAD 2.0 dataset. To do this, we 1) integrate character-level embeddings into the baseline BiDAF model and 2) replace the default attention layer with a coattention layer. While adding character-level embeddings has shown to improve the baseline BiDAF model's EM and F1 scores substantially, their addition to the DCN model actually decreased its scores slightly. Moreover, transforming the BiDAF model into a Dynamic Coattention Network (DCN) decreased the model's performance. Thus, the best model architecture we found is BiDAF with character-level embeddings. Future work includes tuning hyperparameters, experimenting with data processing techniques, adding optimizations like the Adam optimizer, and exploring different forms of attention.