Abstractive Summarization of Long Medical Documents with Transformers

Summarizing long documents has proven to be a difficult NLP task for current transformer architectures. Transformers have fixed-size context windows that limit them to processing short- to mid-length sequences of text. For our project, we employed a multi-step method for long-document summarization: first, an extractive summarizer selects key sentences from the original long text; then, an abstractive summarizer condenses the extracted sentences. This lets us work around the context-window limitation and condition the final summary on text drawn from throughout the whole document. We expanded on previous implementations of this method by leveraging Transformers for both the extractive and abstractive steps. In particular, we show that our model quantitatively improves performance in the extractive step and qualitatively adds context and readability in the abstractive step.
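To make the two-step pipeline concrete, here is a minimal sketch of the extract-then-abstract flow. This is an illustration, not the project's actual implementation: the function name `extract_key_sentences` and the word-frequency scorer are stand-ins (the project uses Transformer models for both stages), and the abstractive step is shown only as a comment.

```python
import re
from collections import Counter

def extract_key_sentences(text: str, k: int = 3) -> list[str]:
    """Toy extractive step: score sentences by average word frequency
    and keep the top-k sentences in their original document order.
    (The real pipeline would use a Transformer-based extractor here.)"""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sent: str) -> float:
        toks = re.findall(r"[a-z']+", sent.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]),
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # restore document order

doc = ("Transformers have fixed context windows. "
       "Long medical documents exceed those context windows. "
       "We extract key sentences and then summarize them abstractively. "
       "The weather was pleasant that day.")
extracted = extract_key_sentences(doc, k=2)
condensed_input = " ".join(extracted)
# In the full method, `condensed_input` (now short enough to fit a context
# window) would be passed to an abstractive Transformer summarizer.
```

The key point the sketch captures is that the extractive stage shrinks the document to fit a context window while drawing sentences from anywhere in the original text, so the abstractive stage can still condition on content from the whole document.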