📄 Research Papers

Landmark AI/ML Papers

Interactive breakdowns of the papers that shaped modern AI — with diagrams, equations, and key insights.

1 paper · more coming

Attention Is All You Need

Vaswani et al. · NIPS 2017 · arXiv:1706.03762

The paper that introduced the Transformer architecture, replacing RNNs with pure self-attention. It is the foundation of modern LLMs including GPT, Claude, Gemini, and LLaMA. Covers scaled dot-product attention, multi-head attention, positional encoding, and state-of-the-art translation results.

📖 20 min read · 28.4 BLEU EN→DE · 41.8 BLEU EN→FR
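
For a quick feel of the scaled dot-product attention mentioned in this card, here is a minimal pure-Python sketch of softmax(QKᵀ / √d_k)·V. It is an illustration only, not the paper's implementation: the function names and the list-of-lists layout are assumptions made for readability, and masking, batching, and multiple heads are all left out.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    # Q: list of query vectors, K: list of key vectors, V: list of value vectors.
    # Hypothetical helper for illustration; K and V must have the same length.
    d_k = len(K[0])
    out = []
    for q in Q:
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

result = scaled_dot_product_attention(
    Q=[[0.0, 0.0]],
    K=[[1.0, 0.0], [0.0, 1.0]],
    V=[[1.0, 2.0], [3.0, 4.0]],
)
```

With an all-zero query, every key scores the same, so the output is simply the average of the value vectors.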

📝 Language Models

BERT: Pre-training of Deep Bidirectional Transformers

Devlin et al. · NAACL 2019 · arXiv:1810.04805

How masked language modeling and next-sentence prediction produced a deeply bidirectional pre-trained Transformer encoder that can be fine-tuned for a wide range of NLP tasks.

Google AI, 2018 · Coming Soon

✨ Generative AI

Language Models are Few-Shot Learners (GPT-3)

Brown et al. · NeurIPS 2020 · arXiv:2005.14165

A 175B-parameter model demonstrating that scale unlocks emergent few-shot learning: tasks are specified in the prompt with a handful of examples, without any gradient updates.

OpenAI, 2020 · Coming Soon

🎨 Diffusion · Vision

Denoising Diffusion Probabilistic Models (DDPM)

Ho et al. · NeurIPS 2020 · arXiv:2006.11239

The paper that launched the diffusion model revolution. Its denoising approach underlies Stable Diffusion, DALL·E 2, and Imagen.

UC Berkeley, 2020 · Coming Soon