QANet for Question Answering on SQuAD2.0

In this project, we study the application of the QANet architecture to question answering on the SQuAD2.0 dataset. Question answering consists of training models to answer questions posed in natural language, using either a provided or a general context. The QANet architecture, originally presented in 2018, was a top performer on the original SQuAD dataset before the advent of pre-training. While the original SQuAD dataset contained only answerable questions, its creators later published the updated SQuAD2.0 dataset, which adds unanswerable questions; they demonstrated that although this change had little effect on human performance, it greatly reduced the effectiveness of existing models. We study how QANet fares on this dataset compared with a BiDAF baseline, another high-performing model. We show that QANet's effectiveness drops, but that simple modifications to the original architecture yield significant improvements in overall performance. We also study the benefits of ensembling different architectures to improve final performance. We achieve EM and F1 scores of 63.415 and 66.734 on the test dataset.
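For reference, the EM and F1 numbers reported above follow the standard SQuAD evaluation: exact match compares normalized answer strings, and F1 measures token-level overlap between prediction and gold answer (with unanswerable questions scored as an empty gold answer). The sketch below is our own illustrative implementation of these metrics, not the code used in this project; the function and variable names are ours.

```python
import re
import string
from collections import Counter

def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and
    articles (a/an/the), and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Token-level F1 between prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # Unanswerable questions: an empty gold answer is only matched
    # by an empty prediction.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

In the official evaluation, each prediction is scored against every human-provided gold answer and the maximum score is taken; averaging over the dataset gives the EM and F1 percentages reported above.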