Building an LLM from Scratch

IN PROGRESS

This article will cover everything you need to know to build and train an LLM from scratch, along with some more advanced and modern techniques for achieving better performance. It assumes a certain level of technical knowledge; if you come from a programming or mathematical background, you should be able to follow along just fine. If you want to build up your knowledge before tackling the material below, here are some soft prerequisites:

  • Linear Algebra
    • LAFF (covers basic linear algebra)
    • ALAFF (covers more advanced topics that give a deeper understanding, which I believe to be very beneficial)
    • [ulaff.net](http://ulaff.net/)
  • Calculus
    • OpenStax Calculus 1-3
    • [openstax.org](https://openstax.org/subjects/math)
  • Python
    • PyTorch
    • [pytorch.org/docs](https://pytorch.org/docs/stable/index.html)

Tokenization

  • Byte Pair Encoding
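
To give a feel for the algorithm up front, here is a minimal sketch of the classic BPE merge loop: repeatedly find the most frequent adjacent symbol pair and merge it into a new symbol. The toy word-frequency table and the `</w>` end-of-word marker are assumptions for illustration; real tokenizers train on far larger corpora and usually operate on bytes.

```python
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every whole-symbol occurrence of the pair into one symbol."""
    bigram = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {bigram.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Toy corpus: words pre-split into characters, with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for _ in range(10):  # learn 10 merges
    pairs = get_pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print(best)  # the learned merge rules, in order
```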

Embeddings
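
As a quick sketch of what this looks like in PyTorch: an embedding layer is just a learned lookup table mapping each token id to a dense vector. The sizes below are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

vocab_size, d_model = 50_000, 512            # hypothetical sizes
embedding = nn.Embedding(vocab_size, d_model)

token_ids = torch.tensor([[15, 2948, 391]])  # (batch=1, seq_len=3)
vectors = embedding(token_ids)               # (1, 3, 512), one row per token
print(vectors.shape)
```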

Self-Attention
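
Here is a minimal sketch of causal scaled dot-product self-attention. The projection matrices `w_q`, `w_k`, `w_v` are assumed to be given; in a real model they are learned parameters.

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """x: (batch, seq_len, d_model); w_*: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (batch, seq, seq)
    # Causal mask: each position may attend only to itself and earlier positions.
    seq_len = x.size(1)
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention weights sum to 1 per query
    return weights @ v
```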

Multi-Head Self-Attention
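
A rough module-level sketch, assuming PyTorch 2.x for the fused `scaled_dot_product_attention` kernel; details like dropout and attention biases are omitted.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must divide evenly across heads"
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split into heads: (batch, num_heads, seq_len, head_dim).
        q, k, v = (z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
                   for z in (q, k, v))
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(b, t, d)  # merge heads back
        return self.out(y)
```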

Transformers
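
Putting the pieces together, here is a sketch of a single pre-norm decoder block in the GPT-style layout. It reuses the `MultiHeadSelfAttention` class from the sketch above, and the 4x MLP width is a common convention rather than a requirement.

```python
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Pre-norm decoder block: attention and MLP, each with a residual connection."""
    def __init__(self, d_model, num_heads, mlp_ratio=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = MultiHeadSelfAttention(d_model, num_heads)  # from the sketch above
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, mlp_ratio * d_model),
            nn.GELU(),
            nn.Linear(mlp_ratio * d_model, d_model),
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))  # residual around attention
        x = x + self.mlp(self.ln2(x))   # residual around the MLP
        return x
```

A full model is essentially a stack of these blocks between the embedding layer and a final projection back to vocabulary logits.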

Positional Encoding
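
A minimal sketch of the original sinusoidal scheme from "Attention Is All You Need", assuming an even `d_model`; the resulting table is simply added to the token embeddings before the first block.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    # pe[pos, 2i] = sin(pos / 10000^(2i/d)); pe[pos, 2i+1] = cos(pos / 10000^(2i/d))
    position = torch.arange(seq_len).unsqueeze(1)  # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe  # (seq_len, d_model), added to the token embeddings
```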

Training

Dataset
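
As a sketch of how training data is typically served for next-token prediction: slice one long stream of token ids into fixed-length windows, with targets shifted one position to the left. The class and names here are hypothetical.

```python
import torch
from torch.utils.data import Dataset

class TokenDataset(Dataset):
    """Serves (input, target) windows from one long stream of token ids."""
    def __init__(self, token_ids, context_length):
        self.tokens = torch.tensor(token_ids, dtype=torch.long)
        self.context_length = context_length

    def __len__(self):
        return len(self.tokens) - self.context_length

    def __getitem__(self, idx):
        chunk = self.tokens[idx : idx + self.context_length + 1]
        return chunk[:-1], chunk[1:]  # targets are inputs shifted one token left
```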

Gradient Accumulation
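
A minimal sketch of the idea: run several micro-batches, scale each loss down, and only step the optimizer every `accumulation_steps` batches, simulating a larger effective batch than fits in memory. The model, data, and hyperparameters below are stand-ins, not a real training setup.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins; in the real pipeline these come from the sections above.
model = nn.Linear(16, 100)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()
loader = [(torch.randn(4, 16), torch.randint(0, 100, (4,))) for _ in range(32)]

accumulation_steps = 8  # effective batch = micro-batch size * accumulation_steps

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(loader):
    loss = loss_fn(model(inputs), targets)
    (loss / accumulation_steps).backward()  # scale so accumulated grads average
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```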

Training Time

Gradient Clipping
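
In PyTorch this is one call between `backward()` and `optimizer.step()`. Slotted into the accumulation loop sketched above, it would look like this; the `max_norm` of 1.0 is a common starting point, not a rule.

```python
import torch

(loss / accumulation_steps).backward()
if (step + 1) % accumulation_steps == 0:
    # Rescale gradients so their global L2 norm is at most max_norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    optimizer.zero_grad()
```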

Context Length

Advanced Topics

RLHF

RoPE
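
A sketch of the core operation, applied to the query and key tensors inside each attention head: consecutive pairs of dimensions are rotated by a position-dependent angle. This is the interleaved-pair formulation; real implementations also differ in frequency caching and dtype handling.

```python
import torch

def apply_rope(x, base=10000.0):
    """Rotate (q or k) pairwise by position-dependent angles.
    x: (batch, heads, seq_len, head_dim), head_dim assumed even."""
    b, h, t, d = x.shape
    freqs = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)   # (d/2,)
    angles = torch.arange(t, dtype=torch.float32).unsqueeze(1) * freqs  # (t, d/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]  # split each dimension pair
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin  # standard 2D rotation per pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```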

Chain of Thought

nGPT

BUS

Byte Latent Transformer