Speech and Language Processing (3rd ed. draft)
Dan Jurafsky and James H. Martin

Here's our Dec 29, 2021 draft! This draft includes a large portion of our new Chapter 11 (covering BERT and fine-tuning), augments the logistic regression chapter to better cover softmax regression, and fixes many other bugs and typos throughout. (These changes are in addition to those in the September draft, which added various missing sections: more on transformers, including for MT; updated algorithms, such as for dependency parsing; and more.)

Minor update: the Dec 29, 2021 draft had some errors in the PDF files for chapters 9, 10, and 11 that made them unviewable in some PDF readers.
On Jan 12, 2022 we updated the PDFs for the entire book and for those 3 chapters, so if you downloaded earlier, you may want to redownload. The content is unchanged from the Dec 29, 2021 draft.

We've put up a list here of the amazing people who have sent so many fantastic suggestions and bug fixes for improving the book. We are really grateful to all of you for your help; the book would not be possible without you!

Individual chapters are below; here is a single PDF of all the chapters in the Jan 12, 2022 draft of the book so far!
(This is exactly the same content as the Dec 29, 2021 draft, but with some PDF errors fixed that prevented Adobe Acrobat Reader from displaying some figures in chapters 9, 10, and 11.)

Feel free to use the draft chapters and slides in your classes; the feedback we get from you makes the book better!

As always, typos and comments very welcome (just email slp3edbugs@gmail.com and let us know the date on the draft)!

(Don't bother reporting missing references caused by cross-chapter cross-reference problems in the individual chapter PDFs; those are fixed in the full book draft.)

When will the whole book be finished? Don't ask.

If you need last year's December 2020 draft chapters, they are here; the September 2021 draft chapters are here.

Chapter | Slides | Relation to 2nd ed.
1: Introduction | | [Ch. 1 in 2nd ed.]
2: Regular Expressions, Text Normalization, Edit Distance | 2: Text Processing [pptx] [pdf]; 2: Edit Distance [pptx] [pdf] | [Ch. 2 and parts of Ch. 3 in 2nd ed.]
3: N-gram Language Models | 3: N-grams [pptx] [pdf] | [Ch. 4 in 2nd ed.]
4: Naive Bayes and Sentiment Classification | 4: Naive Bayes + Sentiment [pptx] [pdf] | [new in this edition]
5: Logistic Regression | 5: LR [pptx] [pdf] | [new in this edition]
6: Vector Semantics and Embeddings | 6: Vector Semantics [pptx] [pdf] | [new in this edition]
7: Neural Networks and Neural Language Models | 7: Neural Networks [pptx] [pdf] | [new in this edition]
8: Sequence Labeling for Parts of Speech and Named Entities | 8: POS/NER Intro only [pptx] [pdf] | [expanded from Ch. 5 in 2nd ed.]
9: Deep Learning Architectures for Sequence Processing | | [new in this edition]
10: Machine Translation | | [newly written for this edition; earlier MT was Ch. 25 in 2nd ed.]
11: Transfer Learning with Contextual Embeddings and Pre-trained Language Models | | [new in this edition]
12: Constituency Grammars | | [Ch. 12 in 2nd ed.]
13: Constituency Parsing | | [expanded from Ch. 13 in 2nd ed.]
14: Dependency Parsing | | [new in this edition]
15: Logical Representations of Sentence Meaning | |
16: Computational Semantics and Semantic Parsing | |
17: Information Extraction | | [Ch. 22 in 2nd ed.]
18: Word Senses and WordNet | |
19: Semantic Role Labeling and Argument Structure | | [expanded from parts of Chs. 19 and 20 in 2nd ed.]
20: Lexicons for Sentiment, Affect, and Connotation | 20: Affect [pptx] [pdf] | [new in this edition]
21: Coreference Resolution | | [mostly newly written; some sections expanded from parts of Ch. 21 in 2nd ed.]
22: Discourse Coherence | | [mostly new for this edition]
23: Question Answering | | [mostly newly written; a few sections on classic algorithms expanded from parts of Ch. 23 in 2nd ed.]
24: Chatbots and Dialogue Systems | 24: Dialog [pptx] [pdf] | [mostly new; parts expanded from Ch. 24 in 2nd ed.]
25: Phonetics | | [Ch. 7 in 2nd ed.]
26: Automatic Speech Recognition and Text-to-Speech | | [mostly newly written; expanded from some parts of Chs. 8 and 9 in 2nd ed.]
Appendix Chapters (will be just on the web)
A: Hidden Markov Models
B: Spelling Correction and the Noisy Channel
C: Statistical Constituency Parsing [Ch. 14 in 2nd ed.]