Stanford CS 293 / EDUC 473 | Empowering Educators via Language Technology

Schedule

Note: This schedule is tentative and subject to change.
🔎 means that the paper will be a core part of the lecture.
🌟 means that the paper will be the focus of reading discussions.

All homework assignments will be hosted on this GitHub repository.

Week	Date	Theme	Course Material
1	Sep 27 Wednesday	Class Introduction [slides]	Optional Readings: U.S. Department of Education, Office of Educational Technology. Artificial Intelligence and Future of Teaching and Learning: Insights and Recommendations, Washington, DC, 2023. Litman, D. (2016, March). Natural language processing for enhancing teaching and learning. In Proceedings of the AAAI conference on artificial intelligence (Vol. 30, No. 1).
2	Oct 2 Monday	Discovery & Exploration in Educational Language Data Parsing, Lexical Analyses [slides]	Required Reading: 🔎 Nguyen, D., Liakata, M., DeDeo, S., Eisenstein, J., Mimno, D., Tromble, R., & Winters, J. (2020). How We Do Things With Words: Analyzing Text as Social and Cultural Data. Frontiers in Artificial Intelligence, 3. 🔎 Lucy, L., Demszky, D., Bromley, P., & Jurafsky, D. (2020). Content analysis of textbooks via natural language processing: Findings on gender, race, and ethnicity in Texas US history textbooks. AERA Open, 6(3), 2332858420940312. Optional Reading: Hovy, D. (2020). Text analysis in Python for social scientists: Discovery and exploration. Cambridge University Press. Dan Jurafsky and James H. Martin (2021). Speech & language processing. Chapters 2, 6, 8, 18, 21, 23, 25, 26.
2	Oct 4 Wednesday	Discovery & Exploration in Educational Language Data Topic Modeling, Clustering [slides]	Required Reading: 🌟 Liu, J., & Cohen, J. (2021). Measuring teaching practices at scale: A novel application of text-as-data methods. Educational Evaluation and Policy Analysis, 43(4), 587-614. Optional Reading: Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading Tea Leaves: How Humans Interpret Topic Models. Advances in Neural Information Processing Systems, 288–296. Markowitz, D. M., Kittelman, A., Girvan, E. J., Santiago-Rosario, M. R., & McIntosh, K. (2023). Taking Note of Our Biases: How Language Patterns Reveal Bias Underlying the Use of Office Discipline Referrals in Exclusionary Discipline. Educational Researcher, 0(0). Alvero, A. J., Giebel, S., Gebre-Medhin, B., Antonio, A. L., Stevens, M. L., & Domingue, B. W. (2021). Essay content and style are strongly related to household income and SAT scores: Evidence from 60,000 undergraduate applications. Science Advances, 7(42), eabi9031.
3	Oct 9 Monday	Guest Lecture by Michael Madaio Bias & Fairness in AI for Education [slides] HW1 due on Tuesday at 11:59pm Download the homework here.	Required Reading: 🌟🔎 Madaio, M., Blodgett, S. L., Mayfield, E., & Dixon-Román, E. (2021). Beyond “Fairness:” Structural (In)justice Lenses on AI for Education (arXiv:2105.08847). arXiv. Optional Reading: Mayfield, E., Madaio, M., Prabhumoye, S., Gerritsen, D., McLaughlin, B., Dixon-Román, E., & Black, A. W. (2019). Equity Beyond Bias in Language Technologies for Education. Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, 444–460. Blodgett, S. L., & Madaio, M. (2021). Risks of AI Foundation Models in Education (arXiv:2110.10024). arXiv. Olteanu, A., Castillo, C., Diaz, F., & Kıcıman, E. (2019). Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in big data, 2, 13.
3	Oct 11 Wednesday	Using NLP for Educational Measurement Grounded Discovery, Data Annotation [slides]	Required Reading: 🌟 Chen, N.-C., Drouhard, M., Kocielnik, R., Suh, J., & Aragon, C. R. (2018). Using Machine Learning to Support Qualitative Coding in Social Science: Shifting the Focus to Ambiguity. ACM Transactions on Interactive Intelligent Systems, 8(2), 1–20. Optional Reading: Hills, O. H. L. (2023). Leveraging Human Feedback to Scale Educational Datasets: Combining Crowdworkers and Comparative Judgement (arXiv:2305.12894). arXiv. Bauer, M. W., & Gaskell, G. (Eds.) (2000). Qualitative researching with text, image and sound. SAGE Publications Ltd, https://doi.org/10.4135/9781849209731. O’Connor, C., & Joffe, H. (2020). Intercoder Reliability in Qualitative Research: Debates and Practical Guidelines. International Journal of Qualitative Methods, 19. https://doi.org/10.1177/1609406919899220 Paullada, A., Raji, I. D., Bender, E. M., Denton, E., & Hanna, A. (2021). Data and its (dis) contents: A survey of dataset development and use in machine learning research. Patterns, 2(11).
4	Oct 16 Monday	Using NLP for Educational Measurement Model Development [slides] Project Rationale due on Tuesday at 11:59pm	Required Reading: 🌟 Suresh, A., Jacobs, J., Perkoff, M., Martin, J. H., & Sumner, T. (2022). Fine-tuning Transformers with Additional Context to Classify Discursive Moves in Mathematics Classrooms. Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), 71–81. Optional Reading: Hovy, D. (2020). Text analysis in Python for social scientists: Prediction and classification. Cambridge University Press.
4	Oct 18 Wednesday	Using NLP for Educational Measurement Measure Validation [slides]	Required Reading: 🌟 Hunkins, N., Kelly, S., & D'Mello, S. (2022, March). “Beautiful work, you're rock stars!”: Teacher Analytics to Uncover Discourse that Supports or Undermines Student Motivation, Identity, and Belonging in Classrooms. In LAK22: 12th International Learning Analytics and Knowledge Conference (pp. 230-238). 🔎 Demszky, D., Liu, J., Mancenido, Z., Cohen, J., Hill, H., Jurafsky, D., & Hashimoto, T. B. (2021, August). Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1638-1653). Optional Reading: Crossley, S. A., Kyle, K., & McNamara, D. S. (2016). The tool for the automatic analysis of text cohesion (TAACO): Automatic assessment of local, global, and text cohesion. Behavior research methods, 48, 1227-1237.
5	Oct 23 Monday	Generative Language Models for Education Case Study: Teacher Feedback [slides] HW2 due on Tuesday at 11:59pm Download the homework here.	Required Reading: 🌟🔎 Wang, R. E., Zhang, Q., Robinson, C., Loeb, S., & Demszky, D. (2023). Step-by-Step Remediation of Students’ Mathematical Mistakes. arXiv. Optional Reading: Wang, R., & Demszky, D. (2023). Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction. Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), 626–667. Handa, K., Clapper, M., Boyle, J., Wang, R. E., Yang, D., Yeager, D. S., & Demszky, D. (2023). “Mistakes Help Us Grow”: Facilitating and Evaluating Growth Mindset Supportive Language in Classrooms. EMNLP. Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., ... & Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and individual differences, 103, 102274.
5	Oct 25 Wednesday	Guest Lecture by Katie Keith Causal Effect Estimation with Text Data [slides]	Required Reading: 🌟🔎 Keith, K., Jensen, D., & O’Connor, B. (2020). Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5332–5344. Optional Reading: Zhang, R., Kennard, N. N., Smith, D., McFarland, D., McCallum, A., & Keith, K. (2023). Causal Matching with Text Embeddings: A Case Study in Estimating the Causal Effects of Peer Review Policies. Findings of the Association for Computational Linguistics: ACL 2023, 1284–1297.
6	Oct 30 Monday	Practice Pitches	No Reading
6	Nov 1 Wednesday	Practice Pitches	No Reading
7	Nov 6 Monday	Project Work Session! Readings: Using LLMs for Student Assessment and Feedback Experimental Protocol due on Tuesday at 11:59pm	Optional Reading: Baffour, P., Saxberg, T., & Crossley, S. (2023, July). Analyzing Bias in Large Language Model Solutions for Assisted Writing Feedback Tools: Lessons from the Feedback Prize Competition Series. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023) (pp. 242-246). Watters, Audrey. Teaching machines: The history of personalized learning. MIT Press, 2023. (TODO: select chapters) Skinner, Burrhus Frederic. "Teaching machines." Scientific American 205.5 (1961): 90-106. Paulson Gjerde, Kathy, Margaret Y. Padgett, and Deborah Skinner. "The Impact of Process vs. Outcome Feedback on Student Performance and Perceptions." Journal of Learning in Higher Education 13.1 (2017): 73-82. Singer, Natasha. "In Classrooms, Teachers Put A.I. Tutoring Bots to the Test." The New York Times, 26 June 2023. Pardos, Zachary A., and Shreya Bhandari. "Learning gain differences between ChatGPT and human tutor generated algebra hints." arXiv preprint arXiv:2302.06871 (2023). Matelsky, J. K., Parodi, F., Liu, T., Lange, R. D., & Kording, K. P. (2023). A large language model-assisted education tool to provide feedback on open-ended responses (arXiv:2308.02439). arXiv. http://arxiv.org/abs/2308.02439 Mizumoto, A., & Eguchi, M. (2023). Exploring the potential of using an AI language model for automated essay scoring. Research Methods in Applied Linguistics, 2(2), 100050. https://doi.org/10.1016/j.rmal.2023.100050 Basic, Zeljana, et al. "Better by you, better than me, chatgpt3 as writing assistance in students essays." arXiv preprint arXiv:2302.04536 (2023).
7	Nov 8 Wednesday	Designing NLP Tools for Empowering Teachers in the Real World Q&A with Rakiya Brown from TeachFX	Required Reading: 🌟 Nicholson, R., Bartindale, T., Kharrufa, A., Kirk, D., & Walker-Gleaves, C. (2022). Participatory Design Goes to School: Co-Teaching as a Form of Co-Design for Educational Technology. CHI Conference on Human Factors in Computing Systems, 1–17. Optional Reading: Meyer, D. (2023, September 6). EdTech Companies Are Racing to Build a GitHub Copilot for Teachers. This Will Not Be Easy. Mathworlds. Jacobs, J., Scornavacco, K., Harty, C., Suresh, A., Lai, V., & Sumner, T. (2022). Promoting rich discussions in mathematics classrooms: Using personalized, automated feedback to support reflection and instructional change. Teaching and Teacher Education, 112, 103631. Lee, V. R., Clarke-Midura, J., Shumway, J., & Recker, M. (2022). “Design for Co-Design” in a Computer Science Curriculum Research-Practice Partnership. Kulkarni, C. (2019). Design Perspectives of Learning at Scale: Scaling Efficiency and Empowerment. Proceedings of the Sixth (2019) ACM Conference on Learning @ Scale, 1–11. Markel, J. M., Opferman, S. G., Landay, J. A., & Piech, C. (2023). GPTeach: Interactive TA Training with GPT Based Students.
8	Nov 13 Monday	Deploying NLP Tools To Empower Teachers Experimental Design & Evaluation [slides] HW3 due on Monday at 11:59pm Download the homework here.	Required Reading: 🌟 Demszky, D., Wang, Rose E., Yu, C., Geraghty, S. (in preparation). Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform. Working Paper 🔎 Demszky, D., Liu, J., Hill, H. C., Jurafsky, D., & Piech, C. (2023). Can automated feedback improve teachers’ uptake of student ideas? Evidence from a randomized controlled trial in a large-scale online course. Educational Evaluation and Policy Analysis, 01623737231169270. Optional Reading: Papay, J. P., Taylor, E. S., Tyler, J. H., & Laski, M. E. (2020). Learning job skills from colleagues at work: Evidence from a field experiment using teacher performance data. American Economic Journal: Economic Policy, 12(1), 359-388.
8	Nov 15 Wednesday	Round 2 Practice Pitches	No Reading
	Nov 20 & Nov 22	Thanksgiving Break
9	Nov 27 Monday	Guest Lecture by Diane Litman eRevise	Required Reading: 🌟 Zhang, H., Magooda, A., Litman, D., Correnti, R., Wang, E., Matsumura, L. C., Howe, E., & Quintana, R. (2019). eRevise: Using Natural Language Processing to Provide Formative Feedback on Text Evidence Usage in Student Writing. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 9619–9625. Optional Reading: Zhexiong Liu, Diane Litman, Elaine Wang, Lindsay Matsumura, and Richard Correnti. 2023. Predicting the Quality of Revisions in Argumentative Writing. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 275–287, Toronto, Canada. Association for Computational Linguistics. Alhindi, T., & Ghosh, D. (2021, April). “Sharks are not the threat humans are”: Argument Component Segmentation in School Student Essays. In Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications (pp. 210-222).
9	Nov 29 Wednesday	Frontiers and Open Questions	Required Reading: 🌟 Demszky, D., Bush, J. B., D’Mello, S. K., Jacobs, J., Hau, I., Hill, H., … Wentworth, L. (2023, October). Empowering Educators via Language Technology.
10	Dec 4 Monday	Final Pitches	No Reading
10	Dec 6 Wednesday	Final Pitches Final paper due on Tuesday, Dec 12 at 11:59pm	No Reading

Stanford / Fall 2023