10 weeks of intensive learning, building, and shipping. Four sprints. One Demo Day.
Why LLMs are exciting right now; where we are in 2026; what makes now unique; course logistics; introduction to responsible AI development
Student project proposal pitches (2 min each); modern LLM stack overview; ethical considerations in project selection
Reasoning models (OpenAI o1/o3, DeepSeek R1, Claude Sonnet 4); chain-of-thought prompting; when to use reasoning vs. standard models; LLM architectures; cost/latency tradeoffs
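A minimal sketch of the reasoning-vs.-standard-model tradeoff, assuming the OpenAI Python SDK; the model names and the `reasoning_effort` parameter are illustrative, not taken from the course materials:

```python
# Sketch: chain-of-thought prompting on a standard model vs. a dedicated
# reasoning model. Model names and `reasoning_effort` are assumptions; check
# the current OpenAI docs before relying on them.
from openai import OpenAI

client = OpenAI()
question = "A train leaves at 3pm traveling 60 mph. When does it reach a city 150 miles away?"

# Standard model: reasoning is elicited explicitly in the prompt.
standard = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": f"{question}\nThink step by step before answering."}],
)

# Reasoning model: reasoning happens internally, paid for in latency and hidden
# reasoning tokens, so reserve it for genuinely hard problems.
reasoning = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="medium",  # assumed parameter; trades cost/latency for depth
    messages=[{"role": "user", "content": question}],
)

print(standard.choices[0].message.content)
print(reasoning.choices[0].message.content)
```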
Four pillars of building software with LLMs: Iteration, Evaluations, Deployment, Observability; building a data flywheel; continuous systems
Context engineering as evolution of prompt engineering; the full context stack (system prompts, conversation history, tool definitions, parameters); RAG architecture; managing context windows; memory systems; token budget economy; advanced prompting techniques; prompt injection attacks and defenses
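A toy RAG retrieval sketch to make the architecture concrete: embed the query, rank a small in-memory document store by cosine similarity, and inject the top hits into the prompt. The documents and question are invented, and a real system would use chunking and a vector database; this only shows the shape of the loop.

```python
# Minimal RAG sketch (illustrative, not from the lecture).
import numpy as np
from openai import OpenAI

client = OpenAI()
DOCS = ["Refunds are processed within 30 days.",
        "Shipping is free on orders over $50."]

def embed(texts):
    # Embed a batch of strings into vectors.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(DOCS)

def retrieve(query: str, k: int = 1):
    # Rank documents by cosine similarity to the query and return the top k.
    q = embed([query])[0]
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [DOCS[i] for i in np.argsort(-sims)[:k]]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "system", "content": f"Answer using only:\n{context}"},
              {"role": "user", "content": question}],
)
print(answer.choices[0].message.content)
```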
Architecting the full context stack for reliable production-grade AI systems; the compiled request model (what actually goes into an LLM call); call parameters and the context window (temperature, max tokens, logit bias, token budget economy); the three strata of instructions (model system prompt, product system prompt, personas); injected knowledge (memories, RAG, user-submitted data); tools, orchestrators, and agentic loops; the context engineering checklist of five controllable levers
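An illustrative sketch of the "compiled request" idea: everything that actually reaches the model is assembled in one place. The helper name, product details, and prompt text are hypothetical; the point is that each lever (instructions, history, injected knowledge, tools, call parameters) is an explicit input rather than something scattered across the codebase.

```python
# Sketch: compiling the full context stack into a single request payload.
def compile_request(user_message, history, memories, retrieved_docs, tools):
    system_prompt = "\n\n".join([
        "You are the Acme support assistant.",            # product system prompt
        "Persona: concise, friendly, never speculate.",   # persona layer
        "Relevant user memories:\n" + "\n".join(memories),      # injected memory
        "Retrieved knowledge:\n" + "\n".join(retrieved_docs),   # injected RAG context
    ])
    return {
        "model": "gpt-4o",
        "messages": [{"role": "system", "content": system_prompt},
                     *history,                              # conversation history
                     {"role": "user", "content": user_message}],
        "tools": tools,              # tool/function definitions
        "temperature": 0.2,          # call parameters are part of the context stack
        "max_tokens": 800,           # response token budget
    }
```

The returned dict maps directly onto a chat-completions call, e.g. `client.chat.completions.create(**compile_request(...))`.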
Agent architectures (ReAct, multi-agent systems); orchestration frameworks (LangGraph, CrewAI); sequential, router, and collaborative patterns; tool use and function calling; state management; real-world agentic patterns; agent safety and containment strategies
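A minimal ReAct-style tool-calling loop, assuming the OpenAI function-calling API; the weather tool is a stand-in, and max-iteration guards and error handling are omitted. This is one common pattern, not the only agent architecture.

```python
# Sketch: single-agent loop that alternates between model calls and tool execution.
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    return f"Sunny and 20C in {city}"  # placeholder tool implementation

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Palo Alto?"}]
while True:
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    msg = resp.choices[0].message
    if not msg.tool_calls:          # final answer reached: stop the loop
        print(msg.content)
        break
    messages.append(msg)            # record the model's tool request
    for call in msg.tool_calls:     # execute each requested tool and feed results back
        args = json.loads(call.function.arguments)
        result = get_weather(**args)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```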
Groups present their progress on Sprint 1 (3 min per team)
What "AI-native" actually means vs. AI-assisted; automated consulting and spec-driven development; evaluating agents without losing your mind (179 failures analyzed); the OpenClaw phenomenon and where the industry is heading; real-world examples from Vunda AI and heynoah.io
The challenge of non-determinism in LLMs; applying the Theory of Constraints to AI systems; modeling your system (system prompt, user prompt, context, tools); identifying and fixing bottlenecks; proper evals vs. vibe evals; the improvement loop; practical examples from production AI products
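A sketch of what "proper evals" look like as opposed to vibe-checking: a fixed set of cases, a deterministic grader, and a pass rate you can track across changes to the system prompt, context, or tools. The cases and the `run_agent` stub are placeholders for your own system.

```python
# Minimal eval harness sketch (illustrative).
EVAL_CASES = [
    {"input": "Cancel my order #1234", "must_contain": "cancel"},
    {"input": "What's your refund policy?", "must_contain": "30 days"},
]

def run_agent(prompt: str) -> str:
    # Replace with a call into your actual LLM system (prompt + context + tools).
    return "stub response"

def run_evals() -> float:
    passed = 0
    for case in EVAL_CASES:
        output = run_agent(case["input"])
        if case["must_contain"].lower() in output.lower():
            passed += 1
        else:
            print(f"FAIL: {case['input']!r} -> {output[:80]!r}")
    return passed / len(EVAL_CASES)  # track this number through the improvement loop

if __name__ == "__main__":
    print(f"pass rate: {run_evals():.0%}")
```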
History of AI for code generation; benchmarking code gen (HumanEval, SWE-Bench, competitions, real-world impact); reasoning and decision-making techniques (CoT, Tree-of-Thoughts, ReAct, Reflexion, LATS); RL-driven reasoning in token space; using code generation wisely; coding agent setup (agents.md/CLAUDE.md); vibe engineering and shipping faster with coding agents; ephemeral software and environment engineering
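A hedged illustration of the kind of repository-level instruction file (agents.md / CLAUDE.md) referenced above; the sections, commands, and paths are placeholders rather than a prescribed format:

```markdown
# CLAUDE.md — instructions for coding agents working in this repo
## Setup
- Install dependencies with `pip install -e ".[dev]"` (placeholder command)
## Conventions
- Run `pytest -q` before proposing a change; never leave tests failing
- Keep functions small; business logic lives in `core/`, I/O in `adapters/`
## Out of bounds
- Do not edit `migrations/` or anything under `secrets/`
```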
Groups present their progress on Sprint 2 (3 min per team)
How to build an (almost) production-ready agent with PydanticAI; lightweight vs. heavy orchestration frameworks (PydanticAI vs. LangGraph, CrewAI, OpenAI Agents SDK); real-world application: AI for manufacturing operations at Dryft; hands-on agent implementation
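A minimal PydanticAI sketch in the spirit of this session: a typed output schema plus one tool. The model string, tool, and schema are invented, and some names vary across PydanticAI releases (`output_type`/`result.output` in newer versions vs. `result_type`/`result.data` in older ones).

```python
# Sketch: lightweight agent with a structured output and a single tool.
from pydantic import BaseModel
from pydantic_ai import Agent

class MachineStatus(BaseModel):
    machine_id: str
    needs_maintenance: bool
    reason: str

agent = Agent(
    "openai:gpt-4o",
    output_type=MachineStatus,          # result_type in older PydanticAI releases
    system_prompt="You triage manufacturing sensor reports and flag maintenance needs.",
)

@agent.tool_plain
def get_sensor_reading(machine_id: str) -> dict:
    """Return the latest sensor reading for a machine (stubbed data)."""
    return {"machine_id": machine_id, "vibration_mm_s": 9.8, "temp_c": 71}

result = agent.run_sync("Check machine A-17 and tell me if it needs maintenance.")
print(result.output)                    # result.data in older PydanticAI releases
```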
Building sustainable competitive advantage in the age of generative AI; data as a strategic moat; six sources of advantage (Berkeley Research framework); real-world success stories (ElevenLabs, Harvey AI, Synthesia); product engagement and feedback loops; the data flywheel effect; internal tools and data quality; building effective AI agents with memory layers; CS224G project challenges
Groups present their progress on Sprint 3 (3 min per team)
Native realtime voice AI with OpenAI's WebRTC integration; legacy chained pipelines (STT→LLM→TTS) vs. native realtime pipelines; ephemeral session tokens and the architecture of trust; WebRTC data channels and media tracks; function calling for voice-driven agents; backend WebSocket sideband pattern for server-side control; scaling concerns (state, reconnection, cost); hands-on escape room challenge building a voice AI app
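A server-side sketch of the ephemeral-session-token pattern: the browser never sees your real API key; it asks your backend for a short-lived client secret and uses that to open the WebRTC connection. The endpoint path, model name, and response shape here are assumptions to be confirmed against the current OpenAI Realtime docs.

```python
# Sketch: minting a short-lived token for a browser-based realtime voice client.
import os
import requests

def mint_ephemeral_token() -> str:
    resp = requests.post(
        "https://api.openai.com/v1/realtime/sessions",   # assumed endpoint
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"model": "gpt-4o-realtime-preview", "voice": "verse"},  # assumed fields
        timeout=10,
    )
    resp.raise_for_status()
    # Assumed response shape: the browser uses this value instead of your API key.
    return resp.json()["client_secret"]["value"]
```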
Constitutional AI and RLAIF; potential harms: bias, privacy violations, misinformation; safety techniques: red teaming, adversarial testing; regulatory landscape 2026; open source vs. closed models
Pitch techniques for technical projects; storytelling and structure for Demo Day; communicating impact and vision; handling Q&A; presentation polish and delivery
The origin story of OpenAI as YC Research; how the founding team came together; pitching and marketing AI products; growth and distribution strategies for AI startups; defensibility and competitive moats; how YC partners evaluate and qualify AI startups; predictions for where AI is headed
Thursday, March 19th, 3:30-6:30pm at CoDa (Computing and Data Science) E160
3-hour event with investors, entrepreneurs, and guests
5-minute project demonstrations per team with recorded demos
Interactive poster presentations (36" x 48") with networking
VCs, entrepreneurs, Stanford faculty, CS 224G alumni, tech press