Imagine your phone’s autocorrect, but instead of predicting the next word, it predicts the next idea.
That’s what a Large Language Model (LLM) is at its core: a super-charged auto-complete engine trained on mountains of text to guess the next most likely sequence of words in any context.
If regular auto-complete is a toddler repeating phrases it’s heard, an LLM is a seasoned storyteller who’s read every book, paper, meme, and code snippet it could find, and now improvises believable, coherent responses to anything you throw at it.
Every time you ask it a question, it’s not retrieving a pre-written answer. It’s imagining, word by word, what a good answer might look like, much like how your brain finishes someone’s sentence when you already “know where it’s going.”
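That word-by-word guessing can be sketched in miniature. The toy below is nothing like a real LLM — it is just a bigram model that, from a tiny made-up corpus, samples whichever word most often followed the current one. The corpus, the `generate` function, and its parameters are all invented for illustration; the only idea it shares with an LLM is "predict the next token, append it, repeat."

```python
import random
from collections import Counter, defaultdict

# Toy illustration, NOT a real LLM: count which word follows which
# in a tiny corpus, then generate by repeatedly sampling a next word.
corpus = "the cat sat on the mat the cat ate the fish".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1  # e.g. following["the"] = {cat: 2, mat: 1, fish: 1}

def generate(start, length=5, seed=0):
    random.seed(seed)
    words = [start]
    for _ in range(length):
        counts = following[words[-1]]
        if not counts:          # dead end: no word ever followed this one
            break
        choices, weights = zip(*counts.items())
        # Sample the next word in proportion to how often it followed the last
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
```

A real model does the same loop, but instead of a lookup table of word counts it uses a neural network that scores every token in its vocabulary given the entire context so far.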
Tokens are the basic units an LLM reads and writes: small chunks of text, often whole words but sometimes word fragments or punctuation marks, each mapped to a number the model can work with.
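To make the idea concrete, here is a deliberately naive sketch. Real LLM tokenizers use subword schemes such as byte-pair encoding, so one word can become several tokens; this toy version just assigns each whole word an integer ID, which is enough to show "text in, numbers out." The function names and the sample sentence are invented for illustration.

```python
# Toy illustration: real tokenizers (e.g. byte-pair encoding) split text
# into subword pieces; this naive version maps whole words to integer IDs.
def build_vocab(text):
    vocab = {}
    for word in text.split():
        vocab.setdefault(word, len(vocab))  # first occurrence gets the next ID
    return vocab

def tokenize(text, vocab):
    return [vocab[w] for w in text.split()]

vocab = build_vocab("the cat sat on the mat")
print(tokenize("the cat sat", vocab))  # → [0, 1, 2]
```

Everything the model does — predicting, generating, even "reading" your question — happens over these numeric IDs, never over raw letters.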