In our CS224N project, we examine a QA model trained on SQuAD, NewsQA, and Natural Questions and augment it to improve its ability to generalize to data from other domains. We apply a method known as domain adversarial training (as seen in a research paper we reviewed by Seanie Lee and associates) which involves an adversarial neural network attempting to detect domain-specific model behavior and discouraging this to produce a more general model. We explore the efficacy of this technique as well as the scope of what can be considered a "domain" and how the choice of domains affects the performance of the trained model. We find that, in our setting, using a clustering algorithm to sort training data into categories yields a performance benefit for out-of-domain data. We compare the partitioning method used by Lee et al. and our own unsupervised clustering method of partitioning and demonstrate a substantial improvement.