Project Awards

Congratulations to the following teams, who produced exceptional, award-winning projects!

Best Default Project

Best Custom Project

Outstanding Default Projects

Outstanding Custom Projects

Custom Projects

Project name | Authors
20 Questions for Code: Improving Code Generation with Information-Theoretic Clarification | Alexandra Suriya Kim, Julia Xi, Ria Garg
A Bigger Catch: Fine-Grained Curriculum Standards Alignment on the MathFish Benchmark | Mayank Sharma, Teah Shi, Xinman Liu
Adapting Language Models for Low-Resource GPU Kernel Programming | Annmaria Antony, Laasya Konidala, Natalia Pahlavan
Adaptive Test-Time Compute for Efficient Reasoning in Language Models | Ryan Tan
Adaptive Test-Time Compute for Pedagogically Grounded Reasoning in LLMs | Isha Jain, Jason Sejin Chon, Medhya Goel
Agents Don’t Always Do What They Think | Mark William Gernitis
Always-On Learning Companion: Proactive Multimodal Tutoring for Everyday Study Scenarios | Chenyue Li, Haowen Wang, Zhen Jia
Analyzing Robustness and Context Use in Clinical Natural Language Inference | Ryan Minh-Tri Le
Attention Modifications for Improved Adaptation | Jerry Yin, Michael Jang
Auditing Model-Generated Privacy Benchmarks: Do Synthetic Evaluations Reflect Real User Privacy Norms? | Selena She
Benchmarking and Improving Generative Diversity in Language Models via Diverse Preference Optimization | Annika Kaul Singh, Shyam Sai Bethina
Betting on Reasoning: Predicting Forecast Reliability in Prediction Markets | Rahul Rejeev
Beyond Bradley-Terry: Random Logit Preference Modeling for RLHF | Junyi Liu
Beyond Knowledge: Syntactic Complexity as a Bottleneck for Reasoning in "Bracket City" Puzzles | Amrita Malhotra
Bootstrapping Reasoning in Compact Language Models: A Multi-Stage Reinforcement Learning Pipeline with Targeted Failure Repair | Joseph Li, Max Luis Rodriguez, Victor Chen
Bootstrapping Safety-Aligned Reasoning in Small Language Models via Self-Instruct | J Yim, Komal Vij, Tim Jing
Building A Contextual Reasoning Aware Social-Intel Agent with Reinforcement Learning | Binbin Li, Da Sun, Ying Lu
Burst: Multi-Agent System for High-Quality Temporal Content Generation | Jeffrey Hao Wang
Can Coding Agents Manage Their Own Memory? | Jerry Wang, Ryan Wang, Sameer Agrawal
Cartoon Caption Humor Quality Assessment and Generation with DPO and LoRA | Isaiah Flores, Katherine Ha Wang
Causal Transfer of Semantic Operators Across Transformer Language Models | Shivatmica Murgai
Childproofing LLMs with Contrastive Activation Addition | Rosemary Mingrui Jiang
Childproofing LLMs: A Comparative Analysis of CAA, ReFT, and DPO for Safety Alignment | Alice Zhu, Anya Han Zhang
CoFi-PG: Counterfactual Policy Gradients under Filtered Feedback for Multi-Agent LLMs | Elai Ben-Gal, Stela Tong
Cognitive Compression: Hierarchical Chain-of-Thought for Efficient LLM Reasoning | Anuj Jamwal
Collaborative Dynamic Cheatsheet: Multi-Agent Test-Time Learning with Small Language Models | Erica Wang, Malvyn Lai
Compiler-in-the-Loop: Decomposing the Value of Static Verification for Low-Resource Code Generation | Hlumelo Notshe, Joshua Martinez
Compositional Tool-Sequence Generation in Small Language Models | Benji Warburton, Maanit Goel
Concept Training for Human-Aligned Language Models | Christine Zhang
Confusion-Set Guided Retrieval for LLM-Constrained Brain-to-Text Decoding | Andrew Su, Hyungjae Kim, Vincent Jinpeng Yip
Context Under Pressure: How Language Model Agents Should Save and Read Information Over Long Interactions | Alexander Owen Worley
Continuous Utility Direct Preference Optimization | Muhammad Ahmed Mohsin
Cost-Aware Escalation from Scalar Reward Models to Generative Models | Cole Yarbrough, Landon Renjiro Maka'ike Choy, Rui Chen
Curriculum-Based Fine-Tuning for Summarization of Endometriosis Data | Ali Hicham Tout
Data-Centric Control of Verbosity for DPO-Based Instruction Alignment | Susan Lee, Will Richard Alex Furlow
DeepRoot: Graph-Coordinated Multi-Agent Reasoning | Sean J Wang, Sijbren Manuel Kramer, Zijian (Carl) Ma
Designing a Conservative Humor Filter: Can a Model Tell If an Image Caption Is Funny? | Michael Roger
Diagnosing the Reversal Curse via Mechanistic Probing and Symmetric Training | Deepti Gupta, Ke Huang, Rafael Cardoso Ferreira
Diversity-Incentivized GRPO for Constrained Arithmetic Reasoning | Gaurav Tyagi
Do Language Models implement compositional solutions for natural language understanding | Ahmad Jabbar
Do Long Contexts Help Legal Knowledge? A Case Study on US–China Securities Regulation | Yufei Peng
Does Fine-Tuning Hurt Cross-Platform Generalization in Depression Detection? | Yanav Lall
Does Pedagogy Hurt Truth? Evaluating Educational Rewriting in Medicine | Cally Lin, Sasa Simic
Don’t Think About It: Activation Steering as Silent Defense Against Prompt Injection | Gaurav Anand
Dynamic Ledger: Retrieval-Augmented Structured Memory for Test-Time Learning | Jerry Gu, Sabrina Yen-Ko, Shurui Liu
Dynamic Token Merging for Efficient Subword Encoder-Decoder Transformers | Chris Gu, Marco Andono Sie, Nathan Zhou
Dynamic Token Merging for Encoder-Only Transformers: Adapting MrT5’s Delete Gate to BERT and XLM-RoBERTa | Aronima Dass, Hiva Zaad, Tianhui Huang
Effect of Text Embedding Scale on GraphRAG Accuracy | Jon Valur Bjornsson
Emotional Arc Preservation in LLM Literary Translation | Chloe Di Murdoch, Esidore Fajardo Eneinyang, Julia E Rhee
End-to-End Driving Trajectory Prediction with Vision-Language-Action Model | Anze Liu
Energy-Accuracy Trade-offs in Transformer-Based NLP Models: A Unified Benchmarking Study | Thibaud Xavier Clement
Entropy-Triggered RAG: Optimizing Retrieval Efficiency via Token-Level Shannon Entropy | Yucheng Yao
Evaluating Eliminative Reasoning in LLM-Based Differential Diagnosis | Seyun Bang, Tatiana Zhang
Evaluating JEPA for Natural Language Tasks | Henry Jingsong Zhou, Oleh Ivankiv, Yousef Hassan Ramadan
Evaluating Robustness of Large Language Models to Algospeak | Hnin Yupar Mon, Thet Htar Thin Zar
Evaluating Robustness of Social Bias Detection to Lexical and LLM-Driven Perturbations | Nomin-Erdene Bayarsaikhan
Evaluating User-Style Adaptation for Professional Text Generation | Andrea Ji Woo Nam Song
FACT: Attention Consistency Training Mitigates Sycophancy and Jailbreaks | Emma Sampietro, Justin Nicolas Hartenstein
FALCON: Factual-Aware Logical Consistency for Large Language Model Outputs via NLI-Guided Mixed Integer Optimization | Rehan Raza Azam
Fast Compression versus Exact Recall: Investigating the Trade-offs Between Models in Specialized Reasoning Tasks | Jerry Xiao, Nick Yan
Fast Vocabulary Transfer for African Languages in Multilingual Machine Translation | Kailash Chandran Elumalai, Biya Brook
From Premise to Punchline: A Fine-tuned Model and “Writers’ Room” Framework for Saturday Night Live Sketch Script Generation | Hannah Yu, Natalie Hampton
From Private Memory to Collective Intelligence: Collaborative Test-Time Learning | Jiaming Shen, Jiaxin Fang, Xinrui Jiang
From Symptoms to Syndromes: Development and Validation of Fine-Tuning Transformer Architectures for Genetic Neurological Disease Diagnoses | Anushka Rawat, Ximing Gao, Yi Li
Generative Dialogue State Tracking with GPT-2 for Task-Oriented Service Conversations | Venu Madhav Samprathi Ram Prasad
Gold-Guided Programmatic Distillation for Financial | Elana N Chen, Erica Zhao, Yun Dong
Grounded Go Commentary Generation via Expert Engines and Structured Terminology | Yudong Chen
HALLU-NLI: Revisiting Natural Language Inference (NLI) Hallucination Detection Methods for LLM-Generated Biographies | Nathania Elizabeth Lim, Sally Lee, Sarah Dong
Hash Routed Delta Patches for Fast Knowledge Updates in Small LLMs | Arash Hamzehlou
Hidden Signals: Hallucination Prediction in Medical QA | Catherine M Zhang, Christina Ba, Ina Kathleen Chun
How does fine-tuning change internal representations in an audio transcription model? | Brandon Liu, Jason Hu, Jenny Jin
Improving Lean4 Autoformalization via Cycle Consistency Fine-tuning | Arsen Shebzukhov
Improving Scientific Reasoning in Small Language Models via Process Preference Re-Ranking | Arya Gupta, Marianne Feng Liu
Investigating the Impact of Persona-Based System Prompts During SFT of Code LLMs | Tushar Aggarwal
Language-Augmented Flow Matching Policies for Robust Out-of-Distribution Robot Manipulation | Jeff Liu, Lucas Sosnick
Language-Conditioned Objectives for Task-Agnostic Preference Learning and Controller Updates | Kyeong-Won Park
Learning a Discriminator for Conceptual Diversity in LM Outputs | Hangoo Kang, James Liu
Learning Efficient Tool Orchestration with Language Models | Orhun Akengin
Learning from Critiques: A Geometric Framework for Response Improvement | Haozhan Gao
Learning When to Speak: Teaching LLMs Silence Through Specialized DPO and Distillation | Allison Sara John, Anthony D Argyropoulos, Yubo Ruan
Location, Location, Generation: Fine-Tuning a VLM for Real Estate Descriptions | Carey Chang, Niko Terebuh Ustin
Measuring the Measure: Mechanistic Prompt Sensitivity for LLM-Based Populism Coding | Jiehan Liu
Mechanistic Deconvolution of Memory and Context in Quantum Language Models | Nathan Roll
MedDistill: Improving Clinical LLM Performance Through Natural Language Tabular Insights | Joshua Logan Shunk, Patrick Ruibin Li
MGA: Mixed Gated Attention for Efficient Long Context Attention | Jen Ha, Bharat Kumar
Mixture-of-Steering Vectors (MoSV): Sparse Gating for Compositional Hallucination Mitigation | Daniel Winston Lee, Olufeolu Oluwapelumi Kolawole, Vedant Malolan Srinivas
MoSA: Mixture-of-Specialized-Agents for Cost-Efficient Long-Document Question Answering | Haseeb Ismail, Mert Karabiyik, Shayaan Memon
Multi-Lane Retrieval-Augmented Generation for Pharmaceutical Regulatory Dossier Writing | Omar Ingi Halldorsson
MuTaP: Multi-Task Mutation Predictor via LoRA-Adapted ESM-2 | Aya Aburous, Jad Bitar
NanoVQA | Ellen Xu
Non-Toxic Trash-Talking Fantasy Football | Andrew Dana Lawlor, Xander William Russell
On-Policy Context Distillation | Darynne Lee, Shizhe He, Simon Pritchard
Perceptual-Aware Spatial Scene Synthesis (PASSS) | Karan Singh Soin, Na Young Son
Pinpointing Latent Planning in Language Models with Lightweight Mechanistic Methods | Harshvardhan Singh, Nick Rui, Nicole Ma
PocketSheet: Enhancing Test-Time Learning using Efficient Memory Augmentation in Small Language Models | Prabhjot Singh Rai, Sakthivel Sivaraman
Practical and Interpretable Unfair ToS Detection: Comparing Legal-Bert, Linear Lexical Models, and Editable Trees | Basel AlKanjo
Practical Design Decisions Can Matter More Than Training Algorithm Choice: A Study of LLM-Based Rust Bug Repair | Ethan Charles Morgan
Precision Under Pressure: Pushing the Boundaries of the Accuracy-Efficiency Frontier in Question Answering with Mixture-of-Depths | Haoyue Yang, Jan Miroslaw Kopanski, Soha Sultan
Preference-Based Alignment of Code Generation for MCP Server Development | Kristjan Dagur Egilsson, Rami Ratl Mrad
Progressive Screenplay Narrative Understanding via Contrastive Learning | Luca Thomas Wheeler
Quantized Pre-training for Small Mixture-of-Experts | Raghavendra Pranith Koppula
RAG-Based LLM Supported by Clinically Structured Re-Ranking, RL-Tuned Retrieval, and Agentic Workflow for ED Triage Prediction | Charlotte Louise Kramer, Isha Arora, Nino Alex Triandafilidis
Rapping in Role: A Study of Persona Robustness in Large Language Models | Eunice Hyeyun Jung, Megan Ja
Recursive Self-Improvement for Continual Adaptation in Code | Aaditya Vikram Nalawade, Chandra Suda, Ethan David Goodhart
Reward Design for Medical Safety: Reducing Sycophancy via Truth-Weighted RLHF | Jillian Chang, Juli Huang, Michael Kuang Min Li
Scaling Test-Time Compute to Improve Formal Reasoning in Lean via Compiler Feedback | Adam Joseph Banks, Alexander Huang
Self-Distillation for Discrete Flow Map Consistency | Suchir Agarwal
Self-Improving Diffusion Large Language Models via Asymmetric Self-Guidance | Tianlang Chen
Small Models Think Big: Toward Effective Memory Distillation for Small Co-Scientists | Jaanak Prashar, Renn Su, Summer Olivia Royal
Structural Line Markers and Multi-Pass Reranking for GPT-2 Sonnet Generation | Aalaap S Hegde, Mudit Baid, Rakshit Kaushik
SUMMEHRY: LLMs for Generating Temporal Patient Vignettes | Arlina Shen, Asmita Sood, Eashan Monga
Support-Aware Retrieval of Evidence Passages for Community Notes | Dorian Scott Gulley, Dyllan Han
SYMBRION: Symbol Context and Dream Ego Relations Across Lifelong Dream Series as a Tool for Psychoanalysis | Bobby Rohrkemper, Chia-Wei Cheng
Test Time Training for Sample-Efficient Practical Molecular Optimization | Aaron Chee-Hung Lee, Ishvi Mathai
Test-Time Training on Binary Sub-Problems | Andrew Sung, Darrow Robert Hartman, Leo Li
The Efficiency Threshold: Few-Shot Prompting vs. LoRA | Abi Lopez, Daniel Joseph Grossman, Shreyas Chikkanayakanahalli Seshadri
The Feasibility of Token-Level Compute Allocation across Depth in Pretrained Transformers | Anjali Sreenivas, Yuchen Li
The Rosetta Probe: Cross-Lingual Syntactic Transfer in Monolingual English BERT | Ananya Niharika Navale
Towards Robust Natural-Language Proof Verification | Slim Barkallah
TRACE: Tool-augmented Reasoning via Atomic Cheatsheet Editing | Arnold Tianyi Yang, Kyleen Liao, Roshen Sanjay Nair
Understanding Mechanisms of Sycophancy in Multi-turn Interactions | Camila Blank
Understanding Value Embeddings in GPT-2 Training Speedruns | Arihan Varanasi, Markus Zhang
Verified Anchor Selection and Adaptive Curriculum for Dynamic Cheatsheet Memory | Mengqian Chen
Verified On-Policy Self-Distillation | Jack Li, Sophia Yinfan Li
Verifier-Guided Reasoning for Cryptic Crossword Clue Solving | Aarav Arora, Caleb Youngjae Whang Choe, Shamit R Surana
Visceral Judgment: LLM Refusal through Affective State | Nicolas Kennedy
Vision-Language Model Router for Robotics | Jadelynn Kim Dao, Milan Ganai, Satvik Sharma
Where Reasoning Branches: How Preference Pair Construction Shapes DPO for Mathematical Reasoning | Duy Nguyen

Default Projects

Project name | Authors
A Study of SFT-DPO Interaction and LoRA vs Full Fine-Tuning in Small Language Models | Christy Yang, Yuming Feng
Accelerated DPO Fine-tuning GPT-2 with Constructed Data | Jessie Ou, Weixin Yu
Accelerating Attention for GPT-2 Using FLASHATTENTION, Longformer, and cosFORMER | Diego Sierra, Thomas Sarda, Tom-Eliot Jullien
AdamW The Last LLM-Bender: The Legend of LoRA | Ari Barbella-Blaha, Kieran Javier Barrett
Adapting GPT-2 for Sentiment Analysis, Paraphrase Detection, and Sonnet Generation | Fiona Han, Samih Shaheen Qureshi, William Charles Rose
Adapting GPT-2 Through Fine-Tuning Across NLP Tasks | Ritu Patil
Adapting Pretrained GPT-2 via LoRA: How Much Fine-Tuning Do We Actually Need? | Zengmingyu He, Zerong Chen
Adaptive Mixture-of-Heads: Routing Attention Heads in GPT-2 with Fixed and Dynamic Sparsity | David Stutz, Ryder Fried
An Investigation of GPT-2 Applications and Training Improvements, and Exploring Multi-Token Entity Predictions | Ben Wengreen, Bhavya Ashish Shah, Jeffrey Meng
Applying Direct Preference Optimization to Improve GPT-2 Sonnet Generation | Aadhav Prabu
Beyond Full Fine-tuning: Finding the Limits of GPT-2 Efficient Adaptation | Alexander Huayi Zhong, Kaitlyn Angel Kwan, Songyu Han
Build GPT-2 | Lucia Losada, Nicole Cortes
Build GPT-2 | Yuchan Guo, Yushi Feng
Build GPT-2 | Pengyu Mo, Shirley Yu, Yixiao Zhang
Building GPT-2 | Suzannah Dalton Wistreich
Building GPT-2 and Perfecting Performance with Low-Rank Adaptation | Chenyu Song, Juntao Cheng, Mingyang Li
Building GPT-2 for Paraphrase Detection and Sonnet Generation | Yifan Guo
Building GPT-2 with Finetuning Optimizations | Ethan R Lee, Ethan Y Lu, Jingyu Zhang
Building GPT-2: Revisiting a Key Milestone of NLP | Andy Tianqi Wang, Darren Chan, Derek Yan
Circuit-Aware Analysis of LoRA Fine-Tuning: What Changes, Where, and Why? | Nathan Maidi
Cloze-Style Paraphrase Detection and Sonnet Generation with GPT-2: Exploring LoRA and Decoding Strategies | Nick Fursa
Co-Adaptation in LoRA: Target Placement Effects and Inter-Module Interactions in GPT-2 | Shekhar Sharma
Comparing ReFT and LoRA on Classification and Generative Tasks with GPT-2 | Ryan Patrick Catullo
Cost–Performance Tradeoffs for GPT-2 Fine-Tuning: A Case Study on Paraphrase and Sonnet Continuation | Ricardo Ruiz
CS 224N Default Project | Kayla Li, Yaojing Huang He
Cutting Out the Middleman: Direct Preference Optimization for Paraphrase Detection and Sonnet Generation | Justin Yuankai Leong, William Li
Data Efficient Fine-Tuning and Alignment of GPT-2 | Aryaman Gupta, Joseph Lee, Zeyuan Feng
Default Final Project: Efficient Adaptation of GPT-2 via LoRA | Pedro Gaspar Pires
Direct Preference Optimization for Constrained Generation and Classification in GPT-2 | Jingxiong Zhao, Weining Li
Direct Preference Optimization for Improving Sonnet Generation | Gio Ty
Direct Preference Optimization: From Paraphrase Detection to Sonnet Generation | Florencio Paucar Sedano
Does the Optimizer Matter? LoRA vs Full Fine-Tuning in NLP | Andy Dimnaku
DoRA the Explorer | Cayden Gu, Imogen Lee
DoRA: Parameter-Efficient Fine-Tuning for GPT-2 on Cloze Paraphrase Detection and Sonnet Generation | Aniket Gupta, Anjani Pangal, Mallika Parulekar
DPO for Structural Sonnet Generation and Paraphrase Detection with GPT-2 | Daniel Marcelo Mottesi, Diego Bustamante, Jason McLeod Amsler
Effects of Quantization on GPT-2 Small | Isabella Lynne Jordan
Efficiency and Inference: A Comparative Study of PEFT and Full Fine-Tuning | Sanyam Gupta
Efficiency in GPT-2: Parameter Adaptation, Quantization, and Synthetic Data Augmentation | Abhinav Chinta, Ethan Hersch, Ryan D'Cunha
Efficiency–Performance Trade-offs in LoRA-family: Fine-Tuning Methods for GPT-2 | Christine Li, Jason Yan, Justin Li
Efficient Adaptation and Structure-Aware Post-Training of GPT-2 for Paraphrase Detection and Sonnet Generation | Brandon Michael Kunitzer, Koa Lanakila Chang
Efficient Alignment Is All You Need | Lingbo Duan, Shatong Zhu, Yufei Liu
Efficient Fine-Tuning and Alignment of GPT-2 for Downstream NLP Tasks | Adam Alhousiki, Kamal Mohammed ElMallah, Tommy Leong
Efficient Fine-tuning of GPT-2 for Paraphrase Detection and Sonnet Generation | Jonathan You
Efficient Fine-Tuning of GPT-2 via Low-Rank Adaptation (LoRA) | Min Zhang, Shang Gao, Shang Gao
Efficient Fine-Tuning of GPT-2: LoRA, Hyperparameter Search, and Scaling for Paraphrase Detection and Sonnet Generation | Brian Sha
Efficient Steering and Preference Alignment: Applying LoReFT and DPO to a Custom GPT-2 Architecture | Haonan Zhu
Encoding Task Structure via Attention Biases and Adaptive Computation | Dario Gaitzi Soatto
Enforcing Rigid Syntax: Using LoRA to Adapt GPT-2 | Monami Dutta Gupta
Enhanced Hybrid Search for LLM Hyperparameter Optimization | Aaron Michael Sequeira, Avery Graham Voss, CJ Indart
Evaluating LoRA for Efficient GPT-2 Fine-Tuning | Raymond Ruimeng Llata, Vania Chow
Evaluating Low-Rank Adaptation and Nested Low-Rank Architectures for Paraphrase Detection and Sonnet Generation | Ian Yue-Ran Chen
Evaluating Low-Rank Representation Finetuning for GPT-2 Downstream Tasks | Alvin Ayuyo
Evaluating Performance, Efficiency, and Memory Trade-offs in GPT-2 Attention Mechanisms | Devon Thomas Johnston Smith, Lily Annabelle Bailey
Exploring decoding and efficiency strategies for GPT-2 | Stephanie Stephanie Vezich Tamayo
Exploring LoRA Variants With GPT-2 | George Danchen Song, Justin Choo
Exploring Low-Rank Adaptation for Efficient GPT-2 Fine-Tuning | Andy Zhang, Yi Lu
Exploring Parameter-Efficient Fine-Tuning for Paraphrase Detection with GPT-2 | Krisha K Chokshi
Extending GPT-2 for Informal and Slang Aware Language Understanding | Dhruv Darshan Naik, Ruby Hernandez
Fairness-Aware Fine-Tuning of GPT-2 for Paraphrase Detection | Deonna Owens
Fine-tuning GPT-2 for Sentiment Analysis, Paraphrase | Liliana Carolina Santos-Deonizio
Fine-Tuning GPT-2 for Sentiment Analysis, Paraphrase Detection, and Sonnet Generation with Parameter-Efficient Adaptation | Carl Liu, Zikun Zhu
Fine-Tuning GPT-2 for Sentiment Analysis, Paraphrase Detection, Sonnet Generation and Political Affiliation Detection | Anna Wu, Iris Zixiao Xu, Samantha Malowane Leventis
Fine-Tuning GPT-2 for Sentiment, Paraphrase, and Sonnet Tasks | Walter Lopez Chavez
Fine-tuning GPT-2 with LoRA | Manish Agarwal, Pierce Cailean Sayer Mullin
Fine-tuning GPT-2 with LoRA and DPO for Accurate Classification and Constrained Generation | Shaoxiong Zhang
Fine-Tuning GPT-2: A Playground for Discriminative and Generative Adaptation Tasks | Aditi Somayajula, Sahithi Ankireddy
Fine-Tuning, Alignment, and Efficient Adaptation of GPT-2 for NLP Downstream Tasks | Ahmed Mohamed Hassan Khidre Elsherbiny, Izhan Hamza, Patrick Wang
FlashAttention-Enhanced GPT-2 for Paraphrase Detection and Sonnet Generation | Katie Liu, Norah Asemota, William Yang
From Detection to Generation: Fine-tuning Large GPT2 Models for Paraphrasing and Poetry | Anna Guo
From-Scratch GPT-2 and Efficient Adaptation | Bryan Alexis Pineda, Michael James Nixon
Full Fine-Tuning v. LoRA: Parameter-Efficient Adaptation of GPT-2 for Paraphrase Detection and Sonnet Generation | Megha Bindiganavale, Rydham Goyal
GaLore: Gradient Low-Rank Projection | Chung-Suen Stephen Chan
GDPO for GPT-2 | Ahmed Sherif Ahmed Elbakry Mohamed
GPT-2 Default Project with Attention-only LoRA for Paraphrase Detection | Yiqing Liu
GPT-2 Implementation and Speedup | Alexia Huang, Qi Wu
GPT-2 with LoRA Optimization | Illia Shkirko, Janhavi Purkar, Zhang Bai-han
GPT-2 with Varying Attention Mechanisms | Aneesh Akella
GPT2 Optimization with PEFT and DPO | Kiran Sun
GRPOET-Rank: Group Relative Policy Optimization with External Text-Ranking | Eric Liang, Jamin Jia-Ming Xie, William Z Liu
Hardware-Aware Self-Attention for GPT-2: A FlashAttention-based Study | Siri Garudanagiri Virupaksha
Implementing a GPT-2 Decoder for Text Generation, Classification, and Paraphrase Detection | Alma Oralia Minerva Cooper, Antra Nakhasi, Louis Weisdorf
Implementing and Extending GPT-2 for Multi-Task NLP Applications: A Parameter-Efficient Fine-Tuning Perspective | Isabel Li, Lianyu Yao, Yunjie Xu
Implementing and Fine-Tuning GPT2 for Sentiment Analysis, Paraphrase Detection, and Sonnet Generation | Zhenghui Chen
Improving GPT-2 Fine-Tuning through Parameter-Efficient Adaptation and Preconditioned Optimization | Akhilesh Varadan Balasingam, Georgios Mikos
Improving GPT-2 Fine-Tuning with Direct Preference Optimization for Sonnet Generation and Paraphrase Detection | Anna Gutowska, Nicolas Bejar Arambula, Petru Cristian Budianu
Improving GPT-2 with Reinforcement Learning from AI Feedback: Automated Judges for Aligned Sonnet Generation | Jason Meng, Shinnosuke Yagi
Improving Performance and Efficiency of GPT-2 on Sonnet Generation and Paraphrase Detection Tasks | Divya Bhojraj
Investigating Structure Aware Decoding and Cloze Style Classification for Robust GPT 2 Fine Tuning | Anya Von Diessl
KV Caching and Speculative Decoding at GPT-2 Scale: Acceptance, Cost Ratio, and the Limits of Speedup | Amulya Parthasarathy
Lather, Rise, Repeat: The Shampoo Optimizer | Ricky Javier Rios
Learning to Rhyme with Token-Weighted DPO | Fisher Marks
Leveraging GPT-2 for Multiple Downstream NLP Tasks: Classification and Generation | Davi Ferreira Veronese
Longformer-Style Sparse Attention for GPT-2 | Hoang D Nguyen, Peter Martin Alisky
LoRA vs LoReFT: Parameter-Efficient Fine-Tuning of GPT-2 | Allen Yuan, Andrew Wooyong Chung, Ryan Joonwon Suh
Lora-Enhanced GPT-2 with DPO for Sonnet Generation | Filip William Henriksson, Krish Maniar, Nicholas Simon Allen
Low-Rank Adaptation and Preference Optimization for Accessible Multi-Task GPT-2 Fine-Tuning | Mahathi Mangipudi, Taylor Elizabeth Hamilton-Hankins, Tyler Kinh Ho
Low-Rank Adaptation for Efficient GPT-2 Fine-Tuning: Evaluation Across Classification and Generation Tasks | Vivek Tiwari
Low-rank fine-tuning of the GPT-2 model | Rongge Yan
Memory-Efficient Transformer Attention via Tiled FlashAttention-style Implementation | Sagar Kapare
Metric-Aligned Sonnet Generation with LoRA and Self-Critical RL Fine-tuning | Timothy Yu, Xiang Wan
MiniGPT: Implementation, Fine-Tuning, and Extensions for Constrained Generation | Shiwei Que
Modernizing GPT-2: Integrating Low-Rank Adaptation, FlashAttention, and Multi-Token Prediction for Efficient Sonnet Generation | Manan Sheth, Sanjay Dixit Bhuvanagiri
Optimizing GPT-2 for Downstream Tasks: An Exploration of PEFT, Preference Optimization, and SMART | Anastasiya Masalava, Eva Casto, Michael Rybalkin
Parameter-Efficient Adaptation of GPT-2 Across Classification and Generation Tasks | Isaias Martinez, Kristine Ma, Varsha Saravanan
Parameter-Efficient Adaptation of GPT-2 across Discriminative and Generative Tasks | Junran Jia, Xianya Fu
Parameter-Efficient Fine-Tuning for GPT-2: Comparing LoRA and ReFT on Paraphrase Detection | Matthias Jiro Walther, Ngoc Nguyen
Parameter-Efficient Fine-Tuning of GPT-2 for Classification and Text Generation | Ryan He
Parameter-Efficient Fine-Tuning of GPT-2 Using DoRA | Linika Goel, Mindy Kay Harkness
Parameter-Efficient Fine-Tuning of GPT-2 using Low-Rank Adaptation (LoRA) | Chris Alexander Perez
Parameter-Efficient Fine-Tuning of GPT-2 with LoRA | Chloe Yuri Jeon, Erick Angelo Ramirez
Parameter-Efficient Fine-Tuning of GPT-2 with LoRA: A Systematic Study of Rank, Scale, and Learning Rate | Cuiyuanxiu Chen
Parameter-Efficient Fine-Tuning of GPT-2: Comparing LoRA, LoReFT, and Prefix Tuning Across Classification and Generation Tasks | Sally Wang, Zijian Luo
Parameter-efficient fine-tunings for Downstream Adaptation of GPT-2 | Peiyu Li, Zefang Zhou
Parameter-efficient Finetuning and Preference Optimization of GPT-2 for Downstream Tasks | Pankaj Rajak
Paraphrase Detection and Sonnet Generation using GPT-2 | Dongyu Jia, Omar Walid Ayoub
Paraphrase Detection and Sonnet Generation with LoRA and DPO | Diya Bhattacharjee, Jaagat Prashar, Kyle Tianshi
Preference Optimization for Parameter-Efficient Multi-Task Learning of GPT-2 | Jiecong Tan, Mark Yang
Preference-Optimized GPT-2 for Cloze-Style Paraphrase Detection and Sonnet Generation via DPO with GPT-Scored Pairs | Andrew Samuel Park, Arun J Moorthy, Welton T Wang
Preference-Tuning GPT-2 with LoRA and DPO for Classification and Poetry | Hiromichi Murakami, Yuliia Murakami
QLoRA Fine-Tuning for Reliable Structured Tool Calls with GPT-2 | Jay Khemchandani
Quadapter: Adapter for GPT-2 Quantization | Ethan Cohen, Wesley Bian
Quantization of GPT-2 for Running on Edge Devices | Gordy D Sun
Rank, Bits, and Data: Efficient GPT-2 Adaptation for Paraphrase Detection and Sonnet Generation | Kerui Lu, Silin Du
Re-implementing GPT-2 for Classification, Paraphrase Detection, and Sonnet Generation | Saanvi Reddy Thummalapally, Shani Su
Robust Fine-Tuning of GPT-2 with SMART Regularization, LoRA, and Enhanced Decoding for Downstream NLP Tasks | Jake Klosowski, Kevin Stephen, Alex E Wurm
Robust GPT-2 Fine-Tuning via LoRA and Smoothness-Inducing Regularization | Pooya Nabavi
SADPOSS G: Structure-Aware Direct Preference Optimization for Shakespearean Sonnet Generation | Mario Felix Sumali
Second-Order Optimization for GPT-2 Fine-Tuning: Exploring K-FAC for NLP Downstream Tasks | Isabella Kai He
Sentiment Classification using GPT-2 Representations | Sahaj Saini
SMART Regularization and DPO for GPT-2 Sonnet Generation | Mac Broido
SMART-GPT2: Adapting SMART-Style Regularization for Decoder-Only Fine-Tuning | Austin Ho
SMART-GPT2: Adversarial Regularization for GPT-2 Fine-Tuning | Bahram Y Mohmand, Noah Sabbavarapu, Zihan Wang
SOAP and Sonnets: Improving the Optimization Efficiency of GPT-2 | Kenna Zeng
Source, Relay, and Suppressor Heads in a Poetry Generation Circuit | Kai Wen, Shaoyi Zhang
Sparse ReFT | Jacob Daniel Householder
SPLoRA: Sonnet Generation and Paraphrase Detection with LoRA | Joseph Rabara Bailey, Maya Vendhan
Streaming Sonnets: Efficient Generation with KV Caching and Quantization | Codey Codey Sun, Michael Yang
Style Steering of GPT-2 Sonnet Generation with DPO | Claudia Perez D'Arpino
Task-Dependent Effects of Parameter-Efficient Fine-Tuning: A GPT-2-Based Study | Weiwei Wu
Task-Driven Fine-Tuning and Efficient Attention for GPT-2 | Sixian Du, Susan Li, Yuzhou Bian
Task-Specific Fine-Tuning Strategies for Improving GPT-2 Across Classification and Generative NLP Tasks | Austin Chen, Cheney Sang, Harris Alan Lee
Task-Specific GPT-2 Adaptation: Structured LoRA for Paraphrase Detection and DPO for Sonnet Generation | Puyang Du, Xijia Liu
TaskRank | Fabio Ibanez, Peter Jason Benitez
Teaching LLMs to Forget Bad Data with Controlled Unlearning | Hanyu Yang
The LoRA(x) | Gerwin Delsocora Mateo, Ryan Da
Title of your project | Arun Brian Morris Chhetri, Ian Luka Lasic-Ellis, Marcus Batt Kushner
Uncertainty-Aware Self-Training for Paraphrase Detection + Learned Reranking for Sonnet Generation | Alex M Michael, Luis Marc Botin-Sanz de Sautuola, Xander Coulter Hnasko
Weight Decomposition Matters: DoRA vs. LoRA for Small GPT-2 Task Adaptation | Svea Drekshagen
Where Does LoRA Actually Help? Probing Layer-Wise Adaptation in GPT-2 for Paraphrase Detection and Sonnet Generation | Mona Anvarihosseinabad