LLMs as linear algebra

A handful of recent social posts boiled LLMs down to what they actually do: pattern‑matching across huge text datasets using high‑dimensional vector math, not human‑style understanding. For example, Gerardo Moscatelli described LLM behavior as “linear algebra in hyperspace,” and others emphasized that models simulate understanding through text patterns rather than true domain comprehension. That matters because it helps explain why LLMs fail unpredictably in specialized fields unless they are retrained or augmented with domain tools. (x.com) (x.com)

A large language model does not store a tiny dictionary entry for every sentence you might type. It chops text into tokens, turns those tokens into lists of numbers called vectors, and does math on those vectors to predict the next token. (openai.com 1) (openai.com 2) Those vectors live in a space with hundreds or thousands of dimensions, which is just a way of saying each piece of text is tracked by many numeric coordinates at once. OpenAI’s current embedding guide says those vectors can be 1,536 or 3,072 numbers long, and nearby vectors tend to represent related text. (openai.com)

That is why people keep calling these systems “linear algebra in hyperspace.” “Linear algebra” is the schoolbook math of vectors and matrices, and “hyperspace” is the many-dimensional map where the model places words, phrases, and patterns. (openai.com) (x.com)

The model’s core job is prediction, not checking facts against the outside world. OpenAI’s documentation describes text generation models as systems trained on natural and formal language that produce outputs from prompts, which means they continue patterns in text rather than consult a built-in ground-truth database. (openai.com) That sounds abstract until you picture autocomplete with an enormous memory for style, structure, and association. If millions of examples connect “Paris” with “France,” the model learns a strong numeric path between those tokens and can keep extending that pattern fluently. (openai.com)

Researchers are now measuring those pattern-following limits more directly. A 2026 International Conference on Learning Representations poster says large language models often succeed through pattern matching, and that the same mechanism breaks down on compositional tasks when the structure gets ambiguous. (iclr.cc)

Inside the model, those patterns are not stored as neat English rules like “always do step two before step three.” Anthropic’s March 27, 2025 interpretability paper says researchers can find interpretable “features” in a model’s internal activity, but the overall mechanisms are still complex enough that the authors compare reverse-engineering them to biology. (transformer-circuits.pub)

That is why a model can sound like a tax lawyer at 9:00 and invent a tax rule at 9:01. The fluent wording comes from learned statistical structure in text, while the missing reliability comes from not having the same kind of grounded domain model, legal accountability, or live verification process that a human expert uses. (iclr.cc) (transformer-circuits.pub)

The recent posts resonated because they strip away the magic trick. When people say a large language model “understands,” they often mean it produces text that looks like understanding from the outside, even though the machinery underneath is vectors, matrices, and probability over token sequences. (x.com) (openai.com)

That does not make the systems useless. It tells you when to trust them: drafting, summarizing, coding from familiar patterns, and searching a knowledge base they can retrieve from are safer than asking for unaided judgment in medicine, law, or any niche field where one wrong token can change the answer. (openai.com) (iclr.cc)
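
For readers who want to see the "vectors, matrices, and probability" idea in miniature, here is a toy Python sketch. It is not how any production model works: the three-number vectors are hand-picked for illustration (real embeddings are learned and far longer), and the names EMBEDDINGS, next_token_probabilities, predict_next, and the scale parameter are all hypothetical. It shows only two ingredients from the text: related tokens sit near each other in vector space, and a "prediction" is a probability distribution computed from vector math, with no fact-checking step anywhere.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# Real models learn these vectors and use hundreds or thousands of dimensions.
EMBEDDINGS = {
    "paris":  [0.9, 0.1, 0.0],
    "france": [0.8, 0.2, 0.1],
    "tokyo":  [0.1, 0.9, 0.0],
    "japan":  [0.2, 0.8, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1 means more alike."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probabilities(context_token, scale=5.0):
    """Score every other vocabulary token against the context vector.

    Simplification (an assumption of this sketch): the context is a single
    token and the score is a scaled dot product. A real model runs many
    layers of learned matrix operations before this final step.
    """
    context = EMBEDDINGS[context_token]
    candidates = [t for t in EMBEDDINGS if t != context_token]
    scores = [scale * dot(context, EMBEDDINGS[t]) for t in candidates]
    return dict(zip(candidates, softmax(scores)))


def predict_next(context_token):
    """Pick the most probable candidate: a pattern match, not a fact lookup."""
    probs = next_token_probabilities(context_token)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    print(predict_next("paris"))
    print(next_token_probabilities("paris"))
```

With these made-up vectors, "paris" scores highest against "france" simply because their coordinates line up. Nothing in the code knows what a country is, which is the point the posts were making: change the numbers and the same arithmetic will just as fluently produce a wrong answer.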
