BiDAF with Explicit Token Linguistic Features

img
How do you do reading comprehension? When I learned reading comprehension with English as my second language, I was taught a few tricks. One important trick is to find word correspondences between the text and the question. Another trick is to use information such as part of speech and sentiment of known words to infer meaning of other unknown words. In this project, I explore the effectiveness of those tricks when applied to SQuAD, by supplying BiDAF with explicit linguistic features from the tokenizer as part of the input. I found that although effective at improving the scores, using those features is prone to overfitting if not regulated.