Scaling Laws and Emergent Capabilities Reading List
Curated by Mouhssine Rifaki | Stanford Electrical Engineering | Last updated April 2026
How model performance scales with data, compute, and parameters, and what emerges at scale.
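The power-law form these papers study can be sketched in a few lines. The snippet below uses the parameter-count scaling form from Kaplan et al. (2020), L(N) = (N_c / N)^α; the constants are approximations of that paper's reported fit and are illustrative only, not authoritative values.

```python
def loss_from_params(n_params: float,
                     n_c: float = 8.8e13,
                     alpha: float = 0.076) -> float:
    """Predicted language-model loss as a power law in non-embedding
    parameter count N, following the Kaplan et al. (2020) form
    L(N) = (N_c / N) ** alpha. Constants are approximate."""
    return (n_c / n_params) ** alpha

# Under this form, scaling up parameters monotonically lowers predicted loss,
# with diminishing returns set by the small exponent alpha.
small_model_loss = loss_from_params(1e8)    # ~100M parameters
large_model_loss = loss_from_params(1e10)   # ~10B parameters
assert large_model_loss < small_model_loss
```

The Chinchilla paper in this list (Hoffmann et al.) revisits how to split a fixed compute budget between parameters and training tokens under the same general power-law framing.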
- Scaling Laws for Neural Language Models
Kaplan et al. arXiv 2020.
- Training Compute-Optimal Large Language Models
Hoffmann et al. NeurIPS 2022.
- Emergent Abilities of Large Language Models
Wei et al. TMLR 2022.
- Are Emergent Abilities of Large Language Models a Mirage?
Schaeffer, Miranda, Koyejo. NeurIPS 2023.
- Scaling Data-Constrained Language Models
Muennighoff et al. NeurIPS 2023.
- Scaling Laws for Reward Model Overoptimization
Gao et al. ICML 2023.
- Beyond Neural Scaling Laws: Beating Power Law Scaling via Data Pruning
Sorscher et al. NeurIPS 2022.
- Scaling Laws for Transfer
Hernandez et al. arXiv 2021.
- A Survey of Large Language Models
Zhao et al. arXiv 2023.
- Scaling Laws for Neural Machine Translation
Ghorbani et al. ICLR 2022.