Extending QANet with Transformer-XL

This project tackles the machine reading comprehension (RC) problem on the SQuAD 2.0 dataset. The task is to feed a context paragraph and a question into a model and output the span of the answer within the context paragraph. The project aims to extend QANet so that it can effectively perform RC on SQuAD 2.0. The segment-level recurrence with state reuse from Transformer-XL is integrated into QANet to improve its ability to handle long context paragraphs (the resulting model is referred to as QANet-XL). In addition, BiDAF is extended with character embeddings and a fusion layer after the context-query attention. Experiments show that QANet-XL underperforms the vanilla QANet and outperforms the extended BiDAF. The results indicate that the segment-level recurrence mechanism from Transformer-XL is not a suitable improvement for QANet on the SQuAD 2.0 dataset, since segmenting the context paragraph is somewhat harmful. On the dev set, the extended BiDAF achieved EM/F1 = 62.16/65.98, the vanilla QANet achieved EM/F1 = 66.81/70.38, and QANet-XL achieved EM/F1 = 63.12/66.67. A majority-voting ensemble of the previously mentioned models achieved EM/F1 = 66.85/69.97 on the test set.
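
To make the idea of segment-level recurrence with state reuse concrete, the following is a minimal, illustrative sketch in plain Python and not the project's implementation. It assumes the context is split into fixed-length segments and that each layer caches its input for the previous segment, which the next segment then uses as extra context. All names (`split_into_segments`, `toy_layer`, `encode_with_recurrence`, `reuse_state`) are hypothetical, and a uniform average stands in for real self-attention; relative positional encodings and gradient handling are omitted.

```python
from typing import List

Vector = List[float]


def split_into_segments(states: List[Vector], segment_len: int) -> List[List[Vector]]:
    """Cut a sequence of vectors into consecutive fixed-length segments."""
    return [states[i:i + segment_len] for i in range(0, len(states), segment_len)]


def toy_layer(segment: List[Vector], memory: List[Vector]) -> List[Vector]:
    """Stand-in for one encoder layer (assumption: uniform 'attention').

    Every position in the current segment attends over the cached memory
    concatenated with the current segment, then adds the result residually.
    """
    context = memory + segment  # extended keys/values: [memory ; current segment]
    dim = len(segment[0])
    mean = [sum(v[d] for v in context) / len(context) for d in range(dim)]
    return [[x[d] + mean[d] for d in range(dim)] for x in segment]


def encode_with_recurrence(
    embeddings: List[Vector],
    segment_len: int,
    num_layers: int = 2,
    reuse_state: bool = True,
) -> List[Vector]:
    """Encode a long sequence segment by segment, reusing cached states."""
    # mems[n] holds the inputs to layer n computed for the previous segment.
    mems: List[List[Vector]] = [[] for _ in range(num_layers)]
    outputs: List[Vector] = []
    for segment in split_into_segments(embeddings, segment_len):
        hidden = segment
        new_mems: List[List[Vector]] = []
        for n in range(num_layers):
            new_mems.append(hidden)  # cache this layer's input for the next segment
            hidden = toy_layer(hidden, mems[n] if reuse_state else [])
        # In an autodiff framework the cached states would be treated as
        # constants (no gradient flows into the previous segment).
        mems = new_mems
        outputs.extend(hidden)
    return outputs
```

Calling `encode_with_recurrence(embeddings, segment_len=2)` with and without `reuse_state` shows the difference: with reuse, the representation of the second segment depends on the first segment through the cached states, while without reuse each segment is encoded in isolation.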