10 weeks of intensive learning, building, and shipping. Four sprints. One Demo Day.
Why LLMs are exciting; the state of the field in 2026; what makes this moment unique; course logistics; introduction to responsible AI development
Student project proposal pitches (2 min each); Modern LLM stack overview; ethical considerations in project selection
Reasoning models (OpenAI o1/o3, DeepSeek R1, Claude Sonnet 4); chain-of-thought prompting; when to use reasoning vs. standard models; LLM architectures; cost/latency tradeoffs
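The chain-of-thought idea above can be sketched in a few lines. This is an illustrative prompt-construction helper, not an API from any particular provider; `build_prompt` is a hypothetical name, and the actual model call is omitted.

```python
# Minimal sketch: the difference between a standard prompt and a
# chain-of-thought (CoT) prompt. Only the prompt text is built here;
# sending it to a model is left out.

def build_prompt(question: str, chain_of_thought: bool = False) -> str:
    """Return a prompt string; with CoT, ask the model to reason step by step."""
    if chain_of_thought:
        return (
            f"Question: {question}\n"
            "Think through the problem step by step, showing your reasoning, "
            "then give the final answer on a line starting with 'Answer:'."
        )
    return f"Question: {question}\nGive only the final answer."

standard = build_prompt("What is 17 * 24?")
cot = build_prompt("What is 17 * 24?", chain_of_thought=True)
```

Reasoning models bake this behavior in, which is part of the cost/latency tradeoff: you pay for the intermediate tokens whether you prompt for them or not.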
Four pillars of building software with LLMs: Iteration, Evaluations, Deployment, Observability; building a data flywheel; continuous systems
Context engineering as evolution of prompt engineering; the full context stack (system prompts, conversation history, tool definitions, parameters); RAG architecture; managing context windows; memory systems; token budget economy; advanced prompting techniques; prompt injection attacks and defenses
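One piece of the token budget economy above can be sketched as a history trimmer. This assumes a crude ~4-characters-per-token heuristic (real systems use a proper tokenizer such as tiktoken); `fit_to_budget` is an illustrative name, not a library function.

```python
# Sketch of a token-budget trimmer for conversation history: the system
# prompt is always kept, and the oldest messages are dropped first until
# the remaining context fits the budget.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude approximation, not a real tokenizer

def fit_to_budget(system_prompt: str, history: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus as many recent messages as fit the budget."""
    used = estimate_tokens(system_prompt)
    kept: list[str] = []
    for msg in reversed(history):  # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system_prompt] + list(reversed(kept))
```

Memory systems and RAG refine the same idea: instead of dropping old turns, summarize or retrieve them back in on demand.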
Architecting the full context stack for reliable production-grade AI systems; the three strata of instructions (model system prompt, product system prompt, personas); injected knowledge (memories, RAG, uploads); data strategy fundamentals; user engagement and feedback loops; using LLMs to enhance feedback quality
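The strata described above can be pictured as layered sections of one assembled context. The section names and `assemble_context` helper below are illustrative, not a standard format.

```python
# Sketch: assembling the full instruction stack from three strata of
# instructions plus injected knowledge (memories and retrieved documents).
# Empty layers are skipped.

def assemble_context(model_rules: str, product_rules: str, persona: str,
                     memories: list[str], retrieved: list[str]) -> str:
    sections = [
        ("Model rules", model_rules),        # e.g., provider-level behavior
        ("Product rules", product_rules),    # product system prompt
        ("Persona", persona),                # user-facing voice
        ("User memories", "\n".join(memories)),
        ("Retrieved context", "\n".join(retrieved)),
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)
```

Keeping the strata separate in code makes it possible to version, test, and A/B each layer independently.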
Agent architectures (ReAct, multi-agent systems); orchestration frameworks (LangGraph, CrewAI); sequential, router, and collaborative patterns; tool use and function calling; state management; real-world agentic patterns; agent safety and containment strategies
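The tool-use loop at the heart of these agent patterns can be shown with a toy ReAct-style runtime. The "model" here is a hard-coded stand-in policy, not a real LLM, and real systems parse structured function-call outputs from an API; only the control flow is the point.

```python
# Toy ReAct-style loop: the model emits an action, the runtime executes
# the matching tool, and the observation is fed back until the model
# returns a final answer (or the step limit acts as containment).

def calculator(expr: str) -> str:
    # Demo only; never eval untrusted input in a real system.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def mock_model(observations: list[str]) -> dict:
    """Stand-in policy: call the calculator once, then answer."""
    if not observations:
        return {"action": "calculator", "input": "17 * 24"}
    return {"action": "final", "input": f"The result is {observations[-1]}."}

def run_agent(max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        step = mock_model(observations)
        if step["action"] == "final":
            return step["input"]
        observations.append(TOOLS[step["action"]](step["input"]))
    return "Step limit reached."
```

Frameworks like LangGraph and CrewAI wrap this loop with state management, routing between agents, and safety limits.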
Groups present their progress on Sprint 1 (3 min per team)
Agent architectures in production; handling complex workflows; system monitoring and automation; reliability and safety monitoring for autonomous systems; lessons from deploying AI agents at scale
The challenge of non-determinism in LLMs; applying the Theory of Constraints to AI systems; modeling your system (system prompt, user prompt, context, tools); identifying and fixing bottlenecks; proper evals vs. vibe evals; the improvement loop; practical examples from production AI products
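The "proper evals vs. vibe evals" distinction above comes down to a graded test set with a metric instead of eyeballing outputs. A minimal sketch, assuming `model` is any callable from prompt to answer string (the stub below is illustrative):

```python
# Sketch of a proper eval: score a model over (prompt, expected) pairs
# with exact-match accuracy. Real harnesses add fuzzier metrics and
# LLM-as-judge scoring, but the loop is the same.

def run_evals(model, cases: list[tuple[str, str]]) -> float:
    """Return accuracy of `model` over (prompt, expected) pairs."""
    correct = sum(1 for prompt, expected in cases
                  if model(prompt).strip() == expected)
    return correct / len(cases)

# Trivial stand-in model: right on one case, wrong on the other.
cases = [("capital of France?", "Paris"), ("2+2?", "4")]
stub = lambda p: "Paris" if "France" in p else "5"
score = run_evals(stub, cases)
```

A tracked score like this is what makes the improvement loop possible: change the system prompt or context, re-run, and see whether the bottleneck actually moved.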
Code assistants and AI-assisted development; code understanding, testing, documentation; practical strategies for AI-assisted development; security implications of AI-generated code; why AI coding tools are the highest-productivity-gain use case for LLMs
Groups present their progress on Sprint 2 (3 min per team)
Advanced orchestration patterns; memory systems (short-term and long-term); RAG pipeline design; when to use RAG vs. fine-tuning; data privacy in RAG systems
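The retrieval step of a RAG pipeline can be sketched with bag-of-words cosine similarity over a tiny in-memory store. Production systems use learned embeddings and a vector database, but the pipeline shape is the same: embed, retrieve top-k, stuff into the prompt.

```python
# Minimal RAG retrieval sketch: rank documents by cosine similarity
# between bag-of-words vectors of the query and each document.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    q = Counter(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return scored[:k]

docs = ["LLMs predict the next token.",
        "RAG retrieves documents to ground generation.",
        "Agents call tools in a loop."]
top = retrieve("how does RAG ground its answers", docs)
```

The fine-tuning comparison falls out of this picture: RAG changes what the model sees at inference time, while fine-tuning changes the weights themselves.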
Building production-grade LLM applications; deployment strategies; prompt management; implementing guardrails and content moderation; compliance and regulatory considerations (SB 53, RAISE Act, EU AI Act)
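An input guardrail can be sketched as a blocklist check that runs before the model call. Real moderation layers combine classifiers, policy models, and output filters; the patterns below are illustrative examples of common prompt-injection phrasing, and this only shows where the check sits in the pipeline.

```python
# Sketch of a simple input guardrail: reject requests matching known
# prompt-injection patterns before they ever reach the model.

import re

BLOCKED_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"\bsystem prompt\b.*\breveal\b",
]

def passes_guardrail(user_input: str) -> bool:
    """Return True if the input clears every blocklist pattern."""
    text = user_input.lower()
    return not any(re.search(p, text) for p in BLOCKED_PATTERNS)
```

The same hook is a natural place to log refusals for compliance review under regimes like the EU AI Act.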
Constitutional AI and RLAIF; potential harms: bias, privacy violations, misinformation; safety techniques: red teaming, adversarial testing; regulatory landscape 2026; open source vs. closed models
Groups present their progress on Sprint 3 (3 min per team)
Multimodal models (vision, audio, video); multimodal RAG strategies; voice AI applications; cross-modal retrieval; LLM + database design patterns; deepfake detection and prevention
Advanced prompt engineering tricks; fine-tuning vs. RAG tradeoffs; build vs. buy considerations; preparing for demo day; pitch techniques for technical projects
Product development in consumer AI; lessons from successful AI products; defensible moats in AI; IP, data advantages, distribution; UX design for LLM apps; building trust with users
Open session for final project reviews, pitch feedback; demo day logistics and expectations; poster preparation guidance
Thursday, March 19th, 3:30-6:30pm at CoDa (Computing and Data Science) E160
3-hour event with investors, entrepreneurs, and guests
5-minute project demonstrations per team with recorded demos
Interactive poster presentations (36" x 48") with networking
VCs, entrepreneurs, Stanford faculty, CS 224G alumni, tech press