New RAG Method Claims to Ditch Embeddings

A new tree-based RAG method called PageIndex reportedly achieves 98.7% accuracy on FinanceBench without using embeddings, a vector database, or traditional chunking. The approach challenges standard vector search-based retrieval pipelines by creating a structured index of information. If validated, this technique could significantly alter the architecture and cost structure of enterprise RAG systems.

- PageIndex was created by Vectify AI and is part of an open-source framework; the system that achieved 98.7% accuracy on FinanceBench is named Mafin 2.5 and is powered by PageIndex.
- The method works by first creating a hierarchical tree index of a document, similar to an intelligent table of contents with summaries at each node, and then using an LLM to reason its way through the tree to find answers, a process inspired by AlphaGo's tree search algorithm.
- The FinanceBench benchmark is considered challenging for RAG systems; the original paper introducing the benchmark found that a GPT-4-Turbo-based RAG system incorrectly answered or refused to answer 81% of the questions.
- This approach avoids several cost centers associated with traditional RAG systems, such as embedding API calls (which can be $0.10 per million tokens for some models) and managed vector database hosting, which can range from $50 to over $200 per month for production workloads.
- The technique falls into a broader category of "embedding-free" RAG, which includes other methods like keyword-based search (BM25), knowledge-graph-based retrieval (GraphRAG), and iterative LLM-driven retrieval.
- Unlike vector search, which measures semantic similarity, the tree-based navigation is designed to identify relevance through reasoning, which can be more effective for long, structured documents like financial reports where context is hierarchical.
- The retrieval process is fully traceable, providing an explainable path through the document's structure that led to the answer, which contrasts with the "black box" nature of vector similarity scores.
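
The tree navigation described above can be illustrated with a small sketch. This is not PageIndex's actual API: `Node`, `navigate`, and `keyword_chooser` are hypothetical names, the document and its figures are made-up toy data, and a simple word-overlap function stands in for the LLM call that would pick the next branch in a real system. The sketch also follows a single greedy path, whereas a tree search may explore several branches.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """One entry in the hierarchical index: a title, a summary, and children."""
    title: str
    summary: str
    text: str = ""  # leaf content; empty for internal nodes
    children: List["Node"] = field(default_factory=list)


# A chooser takes the question and the candidate children and returns the
# index of the child to descend into. In a real system this would be an LLM.
Chooser = Callable[[str, List[Node]], int]


def keyword_chooser(question: str, candidates: List[Node]) -> int:
    """Toy stand-in for the LLM (an assumption of this sketch): pick the child
    whose title and summary share the most words with the question."""
    q_words = set(question.lower().split())

    def overlap(node: Node) -> int:
        words = set((node.title + " " + node.summary).lower().split())
        return len(q_words & words)

    scores = [overlap(c) for c in candidates]
    return scores.index(max(scores))


def navigate(root: Node, question: str, choose: Chooser) -> Tuple[Node, List[str]]:
    """Walk from the root to a leaf, recording the titles visited on the way."""
    node, path = root, [root.title]
    while node.children:
        node = node.children[choose(question, node.children)]
        path.append(node.title)
    return node, path


# Made-up toy document, shaped like a table of contents with summaries.
report = Node("Annual Report", "Full-year filing", children=[
    Node("Business Overview", "Products, markets, and strategy", children=[
        Node("Products", "Description of product lines", text="Placeholder text."),
    ]),
    Node("Financial Statements", "Income statement, balance sheet, and cash flow", children=[
        Node("Income Statement", "Revenue, expenses, and net income",
             text="Revenue was 120; net income was 15."),
        Node("Balance Sheet", "Assets, liabilities, and equity",
             text="Total assets were 300."),
    ]),
])

leaf, path = navigate(report, "what was net income and revenue", keyword_chooser)
print(" > ".join(path))  # the traceable route through the document
print(leaf.text)         # the section that would be passed to the LLM as context
```

The recorded `path` is what makes retrieval traceable in this sketch: each answer comes with the sequence of section titles that led to it, rather than a similarity score.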
