We will begin most classes with a short, in-class reading quiz (about 3 questions; ~5 minutes). The goal is not to trick you — it’s to help everyone arrive ready for discussion and to give us quick signal about what landed (and what didn’t) in the readings.
Quizzes are based on the required readings, and the questions will be straightforward if you did the reading.
Your lowest three quiz scores are automatically dropped (this includes absences and late arrivals). Because quizzes happen at the very start of class, we can’t offer make-ups. Please arrive on time.
Each student will be a discussant for assigned readings once during the quarter. Students will work in groups of 2 to serve as discussants for a particular class. Your goal as discussant is to prepare and facilitate an engaged and stimulating discussion for about 30-45 minutes total, depending on the available time. Be creative!
Your responsibilities as discussants include:
Grading rubric (10 points total):
There will be four assignments during the course, which will jointly build towards your final project.
At the beginning of the course, you can select a team of 1-3 students to work with on the assignments as well as the final project. You will be working with the same team for all assignments. In our experience, groups of 3 lead to the best outcomes, so we encourage you to form a team of that size. Each project team will be assigned a mentor (a member of the teaching team), who will provide feedback on all of the team's project-related work and be generally available.
Please discuss your project idea with the instructor or TA early in the course.
You have a choice among four datasets for your assignments/final project. These datasets span classroom discourse, student writing, curriculum-grounded math reasoning, and teacher–AI interaction:
We will share evaluation criteria for each assignment in advance:
Goal: Form your team, identify a problem you care about in teaching/learning, and ground your project in a real instructional context. The instructions are linked from the syllabus.
Goal: Choose a dataset, and have a first close read of your data and context. You should leave this assignment with a clear understanding of what’s in the data, and a plan for what to measure.
Deliverables:
Goal: Build/implement and validate a measure of some dimension of instructional or learning quality that matters for your use case (e.g., eliciting reasoning, responsiveness, equity of participation, conceptual clarity, feedback quality, student agency).
You will choose one of two tracks:
You will be asked to triangulate across multiple data sources:
Deliverable: A 4–6 page write-up (PDF) with clear research questions, methods, validation/evaluation, and an error analysis that identifies where your measure fails and why:
We will provide advice on all of the above, as well as scripts for measure development.
The goals for this assignment will differ by the track chosen for Assignment 2.
For Track 1:
Goal: Use your measurement work to design a support (tool, workflow, or intervention) that helps someone do something better in a real instructional context.
Deliverables: (i) a short demo-ready prototype, (ii) a 3–5 page write-up with design goals, user feedback, iteration notes, and an evaluation plan, and (iii) a brief discussion of risks/guardrails and responsible deployment.
For Track 2:
Deliverables: an expanded version of the write-up from Assignment 2 (8-10 pages total) describing the final details of measure development, validation, and results. Include a brief ethics note on what could go wrong if the measure were deployed.
You will be asked to provide feedback to peers on Assignments 1-3.
Grading: Peer feedback is graded on quality, not on whether your suggestions are “correct.” Strong feedback is (i) specific, (ii) evidence-based (points to a place in the write-up/code), (iii) actionable (clear next steps), and (iv) respectful. Each round, you should include at least two concrete strengths and two high-impact suggestions.
Each team will give a final presentation during the last week of the quarter. The presentation should be targeted to a broad audience, including practitioners, policy-makers, and researchers (e.g., imagine giving it at the AI education summit at Stanford).
Grading rubric: clarity and framing (30%), technical soundness and evidence [legible to non-technical audiences] (40%), thoughtful reflection on limitations/ethics (15%), and quality of communication/demo (15%).
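To make the weighting concrete, here is a minimal sketch that combines the four rubric components into one presentation score. It assumes each component is scored on a 0-100 scale; the dictionary keys and the function name `presentation_score` are hypothetical, and only the percentages come from the rubric above.

```python
# Weights taken from the rubric; the key names are illustrative only.
RUBRIC_WEIGHTS = {
    "clarity_and_framing": 0.30,
    "technical_soundness_and_evidence": 0.40,
    "limitations_and_ethics": 0.15,
    "communication_and_demo": 0.15,
}


def presentation_score(component_scores):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(
        weight * component_scores[name]
        for name, weight in RUBRIC_WEIGHTS.items()
    )


example = {
    "clarity_and_framing": 90,
    "technical_soundness_and_evidence": 80,
    "limitations_and_ethics": 100,
    "communication_and_demo": 70,
}
print(round(presentation_score(example), 1))
```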
Your final paper will closely build on your assignments. This paper should be up to 10 pages long including references, and should adhere to the formal requirements and stylistic expectations for research contributions in computational social science / NLP.
Unlike the assignments, which were free-form, the final paper must use one of the following templates:
If you have any questions about organizing your paper, please talk to the instructors. This handout by Chris Potts offers useful guidelines for presenting your research to an NLP audience, and it is helpful even if your work is targeted at a different audience (e.g. learning sciences).
There are two required paper sections that are special to our course:
Ethical Consideration: Please include an explicit discussion of any potential ethical issues, such as the ethical implications of the project, the use of the data, and potential applications of your work. Here are some recommendations from ACL's ethics guideline: "Ethical questions may arise when working with a variety of types of computational work with language, including (but not limited to) the collection and release of data, inference of information or judgments about individuals, real-world impact of the deployment of language technologies, and environmental consequences of large-scale computation."
This is the system we will use at the end of the quarter to map numerical final grades to letter grades. No curve is applied, and there are no other factors shaping the mapping from weighted averages to letter grades.
| Grade range | Letter grade |
|---|---|
| ≥ 100 | A+ |
| ≥ 94 | A |
| ≥ 90 | A− |
| ≥ 87 | B+ |
| ≥ 84 | B |
| ≥ 80 | B− |
| ≥ 77 | C+ |
| ≥ 74 | C |
| ≥ 70 | C− |
| ≥ 67 | D+ |
| ≥ 64 | D |
| ≥ 60 | D− |
| < 60 | No pass |
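The mapping in the table can be sketched as a simple threshold lookup. The cutoffs below are copied from the table; the function name `letter_grade` is hypothetical, and the sketch assumes the weighted average is compared to the cutoffs without rounding, which the syllabus does not state either way.

```python
# Cutoffs from the table, highest first. Minus signs are written as ASCII hyphens.
GRADE_CUTOFFS = [
    (100, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (84, "B"), (80, "B-"),
    (77, "C+"), (74, "C"), (70, "C-"),
    (67, "D+"), (64, "D"), (60, "D-"),
]


def letter_grade(weighted_average):
    """Return the letter for the first cutoff the average meets or exceeds."""
    for cutoff, letter in GRADE_CUTOFFS:
        if weighted_average >= cutoff:
            return letter
    return "No pass"  # anything below 60


for score in (100, 93.9, 84, 59.9):
    print(score, letter_grade(score))
```

Because no rounding is applied in this sketch, an average of 93.9 maps to A- rather than A.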
Please familiarize yourself with Stanford's honor code. We will adhere to it and follow through on its penalty guidelines.
It is expected that you accurately represent your own work and the work of others in this class. Ideas should be your own. Please see the course AI Policy below.
Because this course is about productive uses of language technology in teaching and learning, we encourage you to use AI tools thoughtfully — as you would any other powerful tool — while keeping your work rigorous, transparent, and clearly your own.
Disclosure requirement (mandatory): Every assignment and the final paper must include a short AI Use Statement (2–6 sentences) describing whether you used AI, which tool(s), and for what purpose (e.g., “debugging,” “rewriting,” “prompt-based scoring”), plus what you did to validate/spot-check outputs. When AI meaningfully contributes to an artifact (e.g., generated code blocks, prompts, rubrics, or evaluation outputs), include the relevant prompts and settings in an appendix or link to a reproducible log.
Honor Code note: Failing to disclose substantive AI use is an academic integrity violation. When in doubt, disclose.
On the one hand, we want to encourage you to pursue unified interdisciplinary projects that weave together themes from multiple classes. On the other hand, we need to ensure that final projects for this course are original and involve a substantial new effort.
To try to meet both these demands, we are adopting the following policy on joint submission: if your final project for this course is related to your final project for another course, you are required to submit both projects to us by our final project due date. If we decide that the projects are too similar, your project will receive a failing grade. To avoid this extreme outcome, we strongly encourage you to stay in close communication with us if your project is related to another you are submitting for credit, so that there are no unhappy surprises at the end of the term. Since there is no single objective standard for what counts as "different enough", it is better to play it safe by talking with us.
Fundamentally, we are saying that combining projects is not a shortcut. In a sense, we are in the same position as professional conferences and journals, which also need to watch out for multiple submissions. You might have a look at the ACL/NAACL policy, which strives to ensure that any two papers submitted to those conferences make substantially different contributions – our goal here as well.
It is very important to us that all assignments are properly graded. The teaching staff works extremely hard to grade fairly and to turn around assignments quickly. We know that you work hard, and we respect that. Occasionally, mistakes happen, and it's important to us to correct them. If you believe there is an error in your assignment grading, please submit an explanation in writing to the staff within seven days of receiving the grade. We will regrade the entire assignment to ensure quality.
No regrade requests will be accepted orally, and no regrade requests will be accepted more than seven days after receipt of the assignment. Regrade requests must be respectful; we will not consider any regrade requests containing disrespectful language.