Analysis of Bias in U.S. History Textbooks Using BERT
U.S. History textbooks have a profound influence on children's social understanding of the United States, which is why activists and social scientists analyze them for bias and representation. Computational NLP methods can provide more holistic analyses than traditional qualitative studies. Our research supplements prior word2vec analyses of gender word relations in 15 U.S. History textbooks used in Texas by taking advantage of BERT's versatility in two studies. First, we compare BERT's embeddings of gender words with those of interest words (related to home, work, and achievement). Second, we mask the gender word in each context and evaluate BERT's ability to predict the correct gender. We repeat both studies on fine-tuned and pretrained BERT. We analyze the textbooks both as a single collection and stratified by the historical time period discussed. Overall, we find that the textbooks contain idiosyncrasies that tend to associate women with "home" and "work" contexts more strongly than with "achievement," and that these trends stay relatively constant across the historical time periods discussed.
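The two studies summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work: the word lists, the function names (`cosine`, `mask_gender_words`, `accuracy`), and the `predict` callable are all hypothetical, and in practice the vectors and predictions would come from a pretrained or fine-tuned BERT model rather than from the toy stand-ins shown here.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

# Hypothetical gender word lists; the lists used in the study are not given here.
FEMALE = {"she", "her", "woman", "women"}
MALE = {"he", "him", "his", "man", "men"}
MASK = "[MASK]"


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Study 1 (sketch): cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_u * norm_v)


def mask_gender_words(sentence: str) -> List[Tuple[str, str]]:
    """Study 2 (sketch): one (masked sentence, gold gender) pair per gender word."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    examples = []
    for i, tok in enumerate(tokens):
        low = tok.lower()
        if low in FEMALE:
            gold = "female"
        elif low in MALE:
            gold = "male"
        else:
            continue
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        examples.append((" ".join(masked), gold))
    return examples


def accuracy(examples: List[Tuple[str, str]],
             predict: Callable[[str], str]) -> float:
    """Fraction of masked sentences whose gender is predicted correctly.

    `predict` is a stand-in for a masked language model such as BERT.
    """
    if not examples:
        return 0.0
    correct = sum(predict(masked) == gold for masked, gold in examples)
    return correct / len(examples)
```

Under these assumptions, `cosine` would be applied to a gender word's embedding and an interest word's embedding, and `accuracy` would be computed separately for "home," "work," and "achievement" contexts so that the per-context scores can be compared.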