Large‑Memory Models Debut
Engramme unveiled a 'Large Memory Models' architecture that aims to mimic human memory by providing persistent, accurate recall of emails, conversations and app context without relying on retrieval‑augmented generation or vector search. The founders, backed by Harvard researchers, closed their lab to focus on product development. Early demos show context‑aware recall across apps, positioning the tech for personal knowledge and productivity tools (x.com).
Most chatbots work like a student cramming for one exam: they can hold a lot in short-term context, but once the window closes, they need you to paste the facts back in. Engramme is pitching a different idea: software that keeps a running memory of your work across email, meetings, documents, and apps, then recalls the right detail later without you searching for it. (testingcatalog.com)
That problem is older than chatbots. Human memory research has long split memory into short-term holding and longer-term storage, because the brain does not treat “what I saw 5 seconds ago” the same way it treats “the client changed the deadline last month.” (springer.com)
The word “engram” comes from neuroscience, where it means the physical trace a memory leaves behind in the brain. Engramme borrowed that word because its founders are trying to build software that stores experiences more like episodes in a life log than like chunks in a search index. (jneurosci.org)
Most current artificial intelligence memory tools use retrieval-augmented generation, which is a fancy name for “search your notes, then feed the results back into the model.” Engramme says its Large Memory Models avoid that pattern and also avoid vector search, which is the math trick many systems use to find “similar” text rather than exact past events. (testingcatalog.com)
The company says its system is not built on the transformer design that powers most large language models. In Engramme’s description, the memory layer is meant to retrieve actual prior emails, calls, documents, and interactions tied to a person, not generate a plausible guess from patterns in training data. (testingcatalog.com)
That is the pitch behind the company’s demos: if you are in Gmail, Zoom, Slack, WhatsApp, Google Docs, or even using Meta glasses, the software is supposed to notice your context and surface the one or two past items that matter. The company describes this as proactive recall, meaning the memory appears before you type a query. (testingcatalog.com)
The people behind it are not coming from a normal software-search background. TestingCatalog reports that Engramme was founded by Gabriel Kreiman, a Harvard Medical School professor with more than 160 publications, and Spandan Madan, who has a Harvard computer science doctorate and experience at Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory, Google DeepMind, Meta, and Adobe. (testingcatalog.com)
Kreiman’s own academic pages back up the neuroscience part of that story. Harvard and the Center for Brains, Minds and Machines list him as a professor at Harvard Medical School and Boston Children’s Hospital whose research covers perception, cognition, and memory, and his curriculum vitae lists earlier startups tied to memory technology. (cbmm.mit.edu, klab.tch.harvard.edu)
The news this week is that Engramme has opened a beta Memory Application Programming Interface for developers rather than just showing private demos. TestingCatalog says the company has raised a $3 million pre-seed round and wants other products to plug its memory layer underneath their own apps instead of selling a single consumer app on top. (testingcatalog.com)
You can already see that product direction in its code tools. The Engramme extension in the Visual Studio Marketplace says it watches coding sessions, git history, and chats in Cursor, Claude Code, and Codex to explain why a function exists and what conversation led to it, which is exactly the “remember the backstory, not just the file” use case the company is betting on. (marketplace.visualstudio.com)
The hard part is not storing more data. The hard part is deciding which memory is relevant right now, which is the same problem human memory researchers have studied for decades: recall is useful only if the right trace comes back at the right moment. Engramme’s whole claim is that memory should work like recognition in context, not like opening a search bar. (springer.com, testingcatalog.com)
If that claim holds up outside demos, the winner may not be the chatbot that writes the smoothest paragraph. It may be the system that remembers your Tuesday call with Priya, the draft you abandoned in February, and the promise buried in a Slack thread, then puts all three in front of you before you forget to ask. (testingcatalog.com)