Vector database Endee.io is now open source
Vector database provider Endee.io has open-sourced its high-performance database, adding a new option to a competitive market dominated by Pinecone and Weaviate. The move reflects a broader trend toward commoditization of vector search. Industry commentators warn that choosing a vector database is a critical architectural decision for LLM applications because it affects latency, scaling, and retrieval accuracy.
- Endee claims its database achieves sub-5ms p99 latency, over 10,000 queries per second, and 99% recall, while reducing infrastructure costs by up to 10x compared to alternatives.
- The project is licensed under Apache License 2.0, allowing for modification and commercial use. Endee.io plans to offer a separate enterprise version with SLAs and support for production deployments, a common business model for open-source infrastructure.
- The open-sourcing of Endee follows a broader industry trend of vector search becoming a commodity capability, with established databases like PostgreSQL (via pgvector) and cloud providers like AWS adding integrated vector search functions.
- Key competitors Pinecone and Weaviate have raised significant venture funding; Pinecone raised $100 million in a Series B round, while Weaviate secured $50 million in its Series B.
- Pinecone's main value proposition is as a fully managed, serverless database that minimizes operational overhead, making it fast to deploy but potentially more expensive at scale.
- Weaviate, also open source, offers more control and customization, including self-hosting options and strong hybrid search capabilities that combine keyword and vector search.
- A primary technical challenge in scaling any vector database is managing the memory footprint of indexing structures like HNSW (Hierarchical Navigable Small World); a dataset of one billion 768-dimensional vectors can consume roughly 3 TB of memory.
- The vector database market is projected to grow from approximately $2.58 billion in 2025 to over $17.9 billion by 2034, driven by the adoption of retrieval-augmented generation (RAG) and other AI applications.
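
The roughly 3 TB figure for one billion 768-dimensional vectors can be checked with simple arithmetic. The sketch below is a back-of-envelope estimate and does not measure Endee or any other product. The function name `hnsw_memory_bytes` and its parameters are hypothetical, and the code assumes 4-byte float32 components, an HNSW connectivity parameter of M = 16, about 2 × M neighbor links per vector on the base layer, and 4-byte neighbor IDs. Upper graph layers and metadata are ignored.

```python
def hnsw_memory_bytes(num_vectors, dim, bytes_per_component=4, m=16, bytes_per_link=4):
    """Rough memory estimate for an in-memory HNSW index (illustrative only)."""
    # Raw vector storage: one float32 (4 bytes, assumed) per dimension.
    vector_bytes = num_vectors * dim * bytes_per_component
    # Assumption: each vector keeps about 2 * M neighbor IDs on the base layer;
    # upper layers and per-vector metadata are ignored in this sketch.
    link_bytes = num_vectors * 2 * m * bytes_per_link
    return vector_bytes, link_bytes


vectors, links = hnsw_memory_bytes(1_000_000_000, 768)
print(f"raw vectors: {vectors / 1e12:.2f} TB")  # 3.07 TB
print(f"graph links: {links / 1e12:.2f} TB")    # 0.13 TB
```

Under these assumptions the raw vectors dominate the footprint, which is why techniques that shrink the vectors themselves (lower precision or quantization) tend to matter more than tuning the graph.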