Blog
In-depth notes on LLM architecture research — the ideas, the math, and the intuitions behind the papers.
Rethinking the Primitives: Next-Generation LLM Architecture
A layer-by-layer redesign of the Transformer stack — from positional encoding and attention mechanisms, through linear hybrid architectures, to MoE routing and normalization. Each work uncovers the hidden mathematical structure of one component and derives a better design from first principles.
Positional Encoding
Attention Mechanisms
Hybrid SSM
Mixture of Experts
Normalization
5 Papers