A slice of text so small it barely means anything alone, but gather them in swarms and patterns emerge.
means A contiguous sequence of n items (usually words or characters) extracted from a text, used to analyze language patterns and predict what comes next.
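from The idea goes back to Claude Shannon's information-theory work around 1950, which approximated English using statistics over short runs of letters and words. It became a staple of computational linguistics and speech recognition by the 1980s, when researchers needed a way to break language into bite-sized pieces for statistical analysis. The 'n' is a variable: a 2-gram (bigram) is two items, a 3-gram (trigram) is three, and so on. The name echoes the n-tuple of mathematics, repurposed for text, with "-gram" from the Greek for something written.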
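
To make the role of 'n' concrete, here is a minimal sketch of n-gram extraction with a sliding window. The function name `ngrams` and the sample sentence are illustrative choices, not part of any particular library.

```python
def ngrams(tokens, n):
    """Return every contiguous run of n items from tokens, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n across the sequence, one step at a time.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


words = "the cat sat on the mat".split()
bigrams = ngrams(words, 2)        # word 2-grams
trigrams = ngrams(words, 3)       # word 3-grams
char_trigrams = ngrams("text", 3) # character 3-grams: a string is a sequence too
```

The same function serves words and characters because it only assumes a sliceable sequence. A sequence shorter than n simply yields an empty list.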
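google search suggestions — Query autocomplete can be driven by bigram and trigram statistics learned from very large logs of past searches, though production systems combine these with many other signals.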
gpt language models — Modern LLMs replaced n-gram count tables with neural networks and far longer context, but they keep the same objective: predict the next token from the ones before it.
spam filters — Email filters can flag spam by scoring word and character n-grams that appear far more often in known spam than in legitimate mail.
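
The "predict what comes next" idea behind the examples above can be sketched with plain bigram counts. This is a toy illustration under stated assumptions, not how any real search engine or language model is implemented. The helper names `train_bigram_counts` and `suggest_next` and the sample queries are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair each word with its successor: these pairs are the bigrams.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest_next(counts, word, k=3):
    """Return up to k of the most frequent followers of word."""
    followers = counts.get(word.lower())
    if not followers:
        return []  # word never seen with a follower, so no suggestion
    return [w for w, _ in followers.most_common(k)]


# Made-up queries standing in for a real query log.
queries = [
    "how to bake bread",
    "how to bake cookies",
    "how to bake bread at home",
    "how to tie a tie",
]
model = train_bigram_counts(queries)
after_bake = suggest_next(model, "bake")
after_to = suggest_next(model, "to")
```

A bigram model like this looks only one word back. Trigrams look two words back, and neural language models condition on much longer context, but the question asked is the same: given what came before, what is likely next?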