Summarizations and Dragons

img
Summarization models are often trained and tested on constrained datasets, which could limit their ability to handle less structured or out-of-domain data, like conversations. Unlike existing datasets, the Critical Role dataset (CRD-3) of Dungeons and Dragons game transcripts is unstructured and conversational, but it still is centered around a shared game-playing goal and is of sufficient size to properly train and test these models \cite{crd3}. In this paper, I examine the dataset and identify the characteristics that differentiate it from previous data. Then, I implement an expanded version of the pointer-generator summarization model, and evaluate its performance on this dataset in order to identify which model choices and architectures are well-suited to the dataset, as well as the limitations this dataset reveals.