Databricks is a lakehouse, not a CDP

A recent analysis reiterates that Databricks functions as a lakehouse platform rather than a customer data platform, meaning it lacks several classic CDP capabilities and often needs complementary systems for activation. (cdp.com) The same platform is being used by vendors to build domain solutions—Persistent launched a Databricks-based merchant risk management product for real-time fraud detection. (cybersecurity-insiders.com)

Databricks is being sold into more customer-facing jobs, but its own documentation still describes it as a lakehouse: a system for storing, processing, and governing data, not a packaged customer data platform. (databricks.com)

Databricks says a lakehouse combines the functions of a data lake and a data warehouse, and its current architecture guides focus on storage, ingestion, governance, and production deployment. Its documentation points to Delta Lake for transactions on data lakes and to Unity Catalog for governance across data and artificial intelligence assets. (docs.databricks.com) (docs.delta.io) (docs.databricks.com)

By contrast, Databricks defines a customer data platform as software used by marketing and customer-experience teams to collect customer events, unify profiles, build audiences, and activate those audiences across channels. On the same page, Databricks frames the category as “composable,” meaning companies can assemble those functions from multiple products instead of buying one suite. (databricks.com)

That distinction is visible in Databricks’ partner ecosystem. In a 2023 blog post, Databricks said Hightouch supplies the collection, modeling, and activation features needed to build a “complete and composable” customer data platform on top of the lakehouse. (databricks.com)

A recent analysis from CDP.com makes the same argument more bluntly: Databricks can centralize and model customer data, but teams usually need other tools for identity resolution, consent handling, orchestration, and channel activation. The article says that leaves Databricks closer to the data layer of a customer stack than to a traditional packaged customer data platform. (cdp.com)

Vendors are now building those domain-specific layers on top of Databricks instead of waiting for Databricks to become an all-in-one application. Persistent Systems said on April 9 that it launched a Merchant Risk Management and Fraud Detection product powered by the Databricks Data Intelligence Platform. (persistent.com)

Persistent said the product is aimed at banks, acquirers, payment service providers, and digital platforms, and is available now as a Databricks-based accelerator. The company said the service uses real-time intelligence and workflows to monitor merchant onboarding, transactions, disputes, and chargebacks. (persistent.com) (prnewswire.com)

Persistent also tied the launch to its Databricks services business. The company said it has more than 900 Databricks-certified professionals and more than eight accelerators on the platform. (persistent.com) (tmcnet.com)

The result is a clearer split in the market: Databricks provides the governed data foundation, while partners and adjacent software vendors provide the packaged workflows that business teams actually use. For buyers, the question is less whether Databricks “is” a customer data platform than which missing layers they still need to add. (databricks.com) (cdp.com)
