Karpathy's KB wiki thread
What happened
Andrej Karpathy outlined a workflow, which quickly went viral, for building personal knowledge bases as Markdown wikis. It combines Obsidian, LLM Q&A agents, and linting to turn notes into queryable research tools. The thread doubles as a practical playbook for engineers who want a lightweight, local-first research and documentation stack. (x.com)
Why it matters
Andrej Karpathy published a step-by-step writeup (a GitHub gist) and an X thread that lay out the file layout, tools, and routines he uses to run a personal research wiki that the model itself maintains. (gist.github.com) The thread went viral and quickly spawned community tool lists and clones, including a curated "awesome" GitHub repo collecting plugins, CLIs, and examples of the pattern. (github.com)
Karpathy describes the workflow as a four-phase loop: ingest, compile, query, and maintain. "Ingest" means saving raw articles, papers, repos, and images into a raw/ folder; "compile" means the language model reads those sources and writes structured Markdown pages (summaries, concept articles, and backlinks). (academy.dair.ai) By "linting" he means automated health checks run by the model to find inconsistencies, fill in missing facts, and suggest new links or pages. It is the knowledge-base equivalent of a code linter, and it keeps the wiki coherent over time. (ruberli.com)
A key technical claim Karpathy emphasizes is that, for mid-sized personal collections, this approach avoids the usual retrieval-augmented generation (RAG) stack. RAG normally splits documents into chunks, converts them to numeric vectors called embeddings, and uses a vector database (a specialized store for those vectors) to find relevant chunks. Karpathy instead relies on the model's ability to reason over readable, interlinked Markdown that the model itself compiles. (venturebeat.com)
He also documents the practical details: using the Obsidian web clipper to capture pages and store images locally so that vision-capable models (models that can process images as well as text) can reference them, running the LLM as an agent that can call command-line tools to generate slides or plots, and writing those outputs back into the wiki so the knowledge base becomes a single, auditable source of truth. (academy.dair.ai)
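To make the linting idea concrete, here is a minimal Python sketch of two mechanical health checks that could run before handing the wiki to a model. It is an illustration under stated assumptions, not Karpathy's own tooling: the wiki/ folder name, the rule that a compiled page mentions its raw source by file name, and the helper names find_uncompiled and find_broken_links are all hypothetical. Only the raw/ folder and Obsidian-style [[wikilinks]] come from the workflow described above. The checks in the original workflow are run by the model itself; this sketch covers only the parts that plain code can verify.

```python
import re
import tempfile
from pathlib import Path

# Obsidian-style wikilinks: [[Target]], [[Target#heading]], [[Target|alias]].
# The lookbehind skips embeds such as ![[image.png]].
WIKILINK = re.compile(r"(?<!!)\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def find_uncompiled(raw_dir: Path, wiki_dir: Path) -> list[str]:
    """Return raw file names that no wiki page mentions yet.

    Assumption: a compiled page cites its source by file name.
    """
    wiki_text = "\n".join(
        p.read_text(encoding="utf-8") for p in sorted(wiki_dir.rglob("*.md"))
    )
    return sorted(
        p.name for p in raw_dir.iterdir() if p.is_file() and p.name not in wiki_text
    )


def find_broken_links(wiki_dir: Path) -> list[tuple[str, str]]:
    """Return (page, target) pairs where [[target]] has no matching page."""
    pages = {p.stem for p in wiki_dir.rglob("*.md")}
    broken = []
    for page in sorted(wiki_dir.rglob("*.md")):
        for match in WIKILINK.finditer(page.read_text(encoding="utf-8")):
            target = match.group(1).strip()
            if target not in pages:
                broken.append((page.stem, target))
    return broken


if __name__ == "__main__":
    # Build a tiny throwaway vault so the sketch runs without real notes.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        raw, wiki = root / "raw", root / "wiki"
        raw.mkdir()
        wiki.mkdir()
        (raw / "paper-a.pdf").write_bytes(b"")
        (raw / "article-b.md").write_text("raw clip", encoding="utf-8")
        (wiki / "Attention.md").write_text(
            "Source: raw/paper-a.pdf\nSee [[Transformers]] and [[Attention|self]].",
            encoding="utf-8",
        )
        print(find_uncompiled(raw, wiki))  # ['article-b.md']
        print(find_broken_links(wiki))  # [('Attention', 'Transformers')]
```

A report like this could be passed to the model as a to-do list: uncompiled sources become candidates for the compile phase, and broken links become candidates for new pages.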