Vector DBs challenged by tree RAG

PageIndex reported ditching vector databases in favor of a tree-index RAG approach and claimed a 98.7% score on FinanceBench versus 45% for Perplexity, suggesting that a different retrieval structure can dramatically change retrieval accuracy. The result points to retrieval architecture, not just embeddings, as a critical lever for RAG performance. (x.com)

Most retrieval systems read a long document like a paper shredder. They chop a 10-K filing into fixed chunks, turn each chunk into numbers called embeddings, and ask a vector database to fetch the chunks whose numbers look closest to the question. (pageindex.ai) That works when the right answer sits inside one neat paragraph. It breaks when the answer depends on a section heading, a footnote, and a table three pages later, because semantic similarity can pull back text that sounds related without being the section that actually answers the question. (pageindex.ai)

PageIndex is betting on a different map. Its system builds a tree from the document first, more like a table of contents with branches, and then has the model walk that tree to decide which section to open next. (github.com) The company says that switch let its finance system, called Mafin 2.5, hit 98.7% accuracy on FinanceBench. FinanceBench is a question-answering benchmark built from 10,231 questions about public companies and their filings. (github.com)

FinanceBench exists because ordinary large language models were bad at this job. The benchmark paper says the open-source sample has 150 reviewed cases, and one tested setup using GPT-4-Turbo with retrieval answered incorrectly or refused to answer 81% of questions. (arxiv.org, github.com) That is why this result got attention. If a retrieval system can move from roughly half-right to almost always right on financial filings, the gain is not coming from prettier chat replies; it is coming from finding the right evidence before the model writes a word. (github.com)

PageIndex says the tree method skips two habits of classic retrieval-augmented generation: vector databases and hard chunking. Instead of slicing a document into 512-token or 1,000-token blocks, it keeps natural sections intact and returns page and section references for the path it followed. (pageindex.ai, github.com) That trace matters in finance because a wrong line from the right company is still the wrong answer. A filing can repeat the same words in “Risk Factors,” “Management Discussion,” and the notes to the financial statements, but only one of those sections may contain the number or rule the question is asking for. (pageindex.ai)

There is one important catch in the headline number. The 98.7% score comes from the evaluation repository and blog of VectifyAI, the company behind PageIndex, so the claim is public and reproducible in code, but it is still a vendor-reported benchmark rather than an independent bake-off run by FinanceBench’s authors. (github.com, pageindex.ai) Even with that caveat, the argument lands. For two years, most retrieval talk has centered on better embeddings, bigger context windows, and faster vector search, and this result says the document index itself may be the bigger lever when the source is a 200-page filing instead of a short web page. (pageindex.ai, arxiv.org)

The likely outcome is not that vector databases disappear next month. The likely outcome is that teams building tools for law, finance, and technical manuals start testing tree-style retrieval next to vector search, because in those domains the difference between “similar text” and “relevant evidence” is where most of the errors live. (github.com, pageindex.ai)
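
For readers who want to see the shape of the idea, here is a minimal Python sketch of walking a document tree and keeping a page-and-section trace. It is an illustration only and not PageIndex's code or API: the Node structure, the keywords and overlap_score helpers, and the sample filing outline are all hypothetical. The biggest assumption is the scorer. In the approach described above, a language model decides which branch to open; here a simple keyword-overlap count stands in for that judgment so the example runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a document tree (hypothetical shape, not PageIndex's schema)."""
    title: str
    summary: str
    pages: tuple[int, int]  # inclusive page range covered by this section
    children: list[Node] = field(default_factory=list)


def keywords(text: str) -> set[str]:
    """Lowercased words longer than three characters, with edge punctuation stripped."""
    cleaned = (w.strip("?.,:;()").lower() for w in text.split())
    return {w for w in cleaned if len(w) > 3}


def overlap_score(question: str, node: Node) -> int:
    """Stand-in for the model's choice: shared keywords between question and section."""
    return len(keywords(question) & keywords(f"{node.title} {node.summary}"))


def walk(root: Node, question: str) -> tuple[Node, list[str]]:
    """Descend from the root, opening the best-scoring child at each level.

    Returns the section reached plus a trace of every section opened on the way.
    """
    node, trace = root, []
    while node.children:
        node = max(node.children, key=lambda child: overlap_score(question, child))
        trace.append(f"{node.title} (pp. {node.pages[0]}-{node.pages[1]})")
    return node, trace


# A made-up outline of a filing; titles and page numbers are illustrative only.
root = Node("10-K", "Annual report", (1, 200), children=[
    Node("Risk Factors", "Risks tied to debt, competition, and regulation", (10, 35)),
    Node("Management Discussion", "Narrative on results and liquidity", (40, 70)),
    Node("Financial Statements", "Audited statements and notes", (80, 160), children=[
        Node("Note 7: Debt", "Long-term debt maturities and interest rates", (112, 115)),
        Node("Note 9: Leases", "Operating lease obligations", (120, 123)),
    ]),
])

question = "What long-term debt maturities are listed in the notes to the financial statements?"
leaf, trace = walk(root, question)
print(leaf.title)
print(trace)
```

In this toy outline the word "debt" also appears under Risk Factors, yet the walk opens Financial Statements first and then Note 7, and the trace records both stops with their page ranges. A real system would replace overlap_score with a model call and would likely consider more than one branch per level; the sketch only shows why a section-level path is easier to audit than a list of loosely similar chunks.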

Shared from Scout - Be the smartest in the room.