Learn LLMs
I’ve spent the last 15+ years working in genomics, and I see myself as an infinite learner. Recently, I began studying the technology behind Large Language Models (LLMs), not to ride the latest AI-HaaS (“hype as a service”) wave, but to understand the core components that make these systems work.
These notes are my learning journal: a place where I collect, organize, and clarify what I’m discovering about LLMs and the broader ecosystem around them.
Building Blocks of GPT-2 LLM
- Introduction - Building Blocks of GPT-2 LLM
- Introduction to Large Language Models (LLMs)
- GPT - Generative Pretrained Transformer model
- Introduction to tokenization
- Introduction to embedding
- Transformer blocks
- Self-attention mechanism
- Masked Attention
- Multi-head self-attention
- Feed-forward network (FFN)
- Language Modeling Head (LM Head)
- Pre-trained GPT-2 model end to end
- Data flow across the LLM
- References
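The self-attention and masked-attention topics listed above are the heart of the GPT-2 architecture. As a taste of what the notes cover, here is a minimal, illustrative sketch of single-head causal (masked) self-attention in pure Python; the matrix names (`Wq`, `Wk`, `Wv`) and tiny dimensions are my own choices for the example, not GPT-2’s actual weights.

```python
import math

def masked_self_attention(X, Wq, Wk, Wv):
    """Single-head causal self-attention over a list-of-lists matrix X.

    X:          (seq_len, d_model) input embeddings
    Wq, Wk, Wv: (d_model, d_k) projection matrices (illustrative, not GPT-2's)
    Returns the (seq_len, d_k) attended output.
    """
    def matmul(A, B):
        # Plain matrix multiply; zip(*B) iterates over the columns of B.
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]

    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d_k = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        # Causal mask: token i may only attend to positions 0..i.
        scores = [sum(qj * kj for qj, kj in zip(q, K[t])) / math.sqrt(d_k)
                  for t in range(i + 1)]
        # Numerically stable softmax over the unmasked positions.
        m = max(scores)
        exp = [math.exp(s - m) for s in scores]
        z = sum(exp)
        weights = [e / z for e in exp]
        # Weighted sum of the value vectors for the visible positions.
        out.append([sum(w * V[t][j] for t, w in enumerate(weights))
                    for j in range(d_k)])
    return out
```

Because of the causal mask, the first token can only attend to itself, so its output is exactly its own value vector; GPT-2 stacks many such heads (multi-head attention) and interleaves them with feed-forward blocks.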
Architecting Autonomous Systems