Domain Adaptive Adversarial Feature Disentanglement for Neural Question Answering

Learning-based question answering (QA) systems have achieved significant success with the help of large language models and pre-trained weights. However, existing approaches assume that training and test data are drawn i.i.d. from the same distribution, an assumption that breaks down in the more realistic scenario where test-time passages and questions come from different distributions. Deep networks have been used to learn transferable representations for domain adaptation, with demonstrated success in various vision tasks. In this project, we study domain-adaptive question answering using several techniques: data augmentation, layer re-initialization, and domain adversarial alignment. Specifically, we propose a Wasserstein-stabilized adversarial domain alignment scheme on a DistilBERT backbone with its last layer re-initialized. The model is trained on both the data-rich in-domain QA datasets and the data-augmented out-of-domain (OOD) datasets, followed by a fine-tuning stage on the data-augmented OOD datasets. Our method thus addresses domain-adaptive question answering from three perspectives: data, model architecture, and training scheme. We conduct extensive experiments to demonstrate its effectiveness, along with ablation studies that show the performance gain contributed by each proposed component. Evaluation on the provided OOD validation datasets shows that our method brings an 8.56% performance improvement over the vanilla DistilBERT baseline, which uses none of these domain-adaptive designs.
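
To make the adversarial alignment idea concrete, the following is a minimal sketch of a Wasserstein-style critic objective, not the authors' implementation. It assumes a linear critic and weight clipping (as in the original WGAN formulation) as the stabilizer; the abstract does not say which variant is used. All function names and the toy feature vectors are hypothetical.

```python
from statistics import fmean


def critic_score(weights, feature):
    """Linear critic f(x) = w . x (a stand-in for a small MLP critic)."""
    return sum(w * x for w, x in zip(weights, feature))


def wasserstein_estimate(weights, source_feats, target_feats):
    """Critic's estimate of the domain gap: E[f(source)] - E[f(target)]."""
    src = fmean([critic_score(weights, f) for f in source_feats])
    tgt = fmean([critic_score(weights, f) for f in target_feats])
    return src - tgt


def clip_weights(weights, clip):
    """Weight clipping keeps the critic roughly Lipschitz (assumed stabilizer)."""
    return [max(-clip, min(clip, w)) for w in weights]


def critic_step(weights, source_feats, target_feats, lr=0.1, clip=1.0):
    """One gradient-ascent step on the estimate, followed by clipping.

    For a linear critic, the gradient with respect to w is simply
    mean(source features) - mean(target features).
    """
    grad = [
        fmean([f[i] for f in source_feats]) - fmean([f[i] for f in target_feats])
        for i in range(len(weights))
    ]
    updated = [w + lr * g for w, g in zip(weights, grad)]
    return clip_weights(updated, clip)


# Toy features standing in for encoder outputs on in-domain and OOD batches.
source = [[1.0, 0.0], [3.0, 0.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
w0 = [0.0, 0.0]
w1 = critic_step(w0, source, target)
gap = wasserstein_estimate(w1, source, target)
```

In a full training loop the critic would maximize this estimate while the encoder is updated to minimize it, alongside the usual QA span loss, so that in-domain and OOD features become harder to tell apart.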