Robust Question Answering using Domain Adversarial Training

While recent developments in deep learning and natural language understanding have produced models that perform very well on question answering tasks, they often learn superficial correlations specific to their training data and fail to generalize to unseen domains. We aim to create a more robust, generalized model by forcing it to create domain-invariant representations of the input using an adversarial discriminator system that attempts to classify the outputs of the QA model by domain. Our results show improvements over the baseline on average, although the model exhibited worse performance on certain datasets. We hypothesize that this is caused by differences in the kind of reasoning required for those datasets, differences which actually end up being erased by the discriminator.