Technical blog
LLMs, AI security, ML systems, and research notes.
A problem-driven technical blog where I explain machine learning systems from first principles: architecture, implementation, optimization, and research intuition.
Latest posts
Why GPT-2 Cannot Read Text Directly - Part 2
An explanation of why tokenization is important in language modeling.
Understanding GPT-2 · Part 2
GPT-2 · LLMs · Language Modeling · Self-Supervised Learning
The Language Modeling Problem: How GPT-2 Learns Without Manual Labels - Part 1
A first-principles explanation of next-token prediction and why it became the foundation of GPT-style LLMs.
Understanding GPT-2 · Part 1
GPT-2 · LLMs · Language Modeling · Self-Supervised Learning