Token
AI & LLM Glossary
What is a Token?
A token is the basic unit of text that LLMs process. In English, one token corresponds to roughly 4 characters, or about 0.75 words.
Token Explained
Tokens are how Large Language Models break down text for processing. The number of tokens in a prompt directly affects cost and how much of the context window is consumed. Different models use different tokenizers, but most modern LLMs rely on subword methods such as byte-pair encoding, which produce broadly similar counts for English text: expect roughly 1 token per 4 characters. Code and non-English text may tokenize differently, so the 4-characters-per-token rule is less reliable there.
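As a rough illustration, the sketch below counts tokens with OpenAI's open-source tiktoken library. The choice of the `cl100k_base` encoding is an assumption for this example; other models ship their own tokenizers and will report different counts.

```python
# Sketch: counting tokens with the tiktoken library.
# The "cl100k_base" encoding is an assumption; other models use
# different tokenizers and produce different counts.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["Hello world", "Tokens are how LLMs break down text."]:
    tokens = enc.encode(text)
    print(f"{text!r} -> {len(tokens)} tokens")
```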
Examples
- "Hello world" = 2 tokens
- A typical email (500 words) = ~650 tokens
- This paragraph = ~100 tokens
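To see where figures like these come from, here is a back-of-the-envelope estimator based only on the heuristics above (~4 characters or ~0.75 words per token). It is a sketch, not a real tokenizer, and actual counts will vary by model.

```python
# Sketch: rough token estimates from the heuristics above
# (~4 characters per token, ~0.75 words per token). Not a real tokenizer.
def estimate_tokens_from_words(word_count: int) -> int:
    return round(word_count / 0.75)

def estimate_tokens_from_chars(char_count: int) -> int:
    return round(char_count / 4)

print(estimate_tokens_from_words(500))              # ~667, in line with the ~650 figure above
print(estimate_tokens_from_chars(len("Hello world")))  # ~3; a real tokenizer gives 2 for short strings
```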