Scaling Laws and Emergent Capabilities Reading List
Curated by Mouhssine Rifaki | Stanford Electrical Engineering | Last updated April 2026
How model performance scales with data, compute, and parameters, and what emerges at scale.
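The power-law form these papers study can be sketched in a few lines. The snippet below uses the parameter-count scaling form from Kaplan et al. (2020), L(N) = (N_c / N)^α; the constants are approximations of that paper's reported fit and are illustrative only, not authoritative values.

```python
def loss_from_params(n_params: float,
                     n_c: float = 8.8e13,
                     alpha: float = 0.076) -> float:
    """Predicted language-model loss as a power law in non-embedding
    parameter count N, following the Kaplan et al. (2020) form
    L(N) = (N_c / N) ** alpha. Constants are approximate."""
    return (n_c / n_params) ** alpha

# Under this form, scaling up parameters monotonically lowers predicted loss,
# with diminishing returns set by the small exponent alpha.
small_model_loss = loss_from_params(1e8)    # ~100M parameters
large_model_loss = loss_from_params(1e10)   # ~10B parameters
assert large_model_loss < small_model_loss
```

The Chinchilla paper in this list (Hoffmann et al.) revisits how to split a fixed compute budget between parameters and training tokens under the same general power-law framing.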
- Scaling Laws for Neural Language Models
Kaplan et al. arXiv 2020.
- Training Compute-Optimal Large Language Models
Hoffmann et al. NeurIPS 2022.
- Emergent Abilities of Large Language Models
Wei et al. TMLR 2022.
- Are Emergent Abilities of Large Language Models a Mirage?
Schaeffer, Miranda, Koyejo. NeurIPS 2023.
- Scaling Data-Constrained Language Models
Muennighoff et al. NeurIPS 2023.
- Scaling Laws for Reward Model Overoptimization
Gao et al. ICML 2023.
- Beyond Neural Scaling Laws: Beating Power Law Scaling via Data Pruning
Sorscher et al. NeurIPS 2022.
- Scaling Laws for Transfer
Hernandez et al. arXiv 2021.
- A Survey of Large Language Models
Zhao et al. arXiv 2023.
- Scaling Laws for Neural Machine Translation
Ghorbani et al. ICLR 2022.