Databricks Launches AI Gateway for LLM Governance
Databricks has launched its AI Gateway in beta on AWS, designed as an enterprise control plane for governing large language model and agent endpoints. The gateway provides fine-grained access controls, monitoring, and auditability for generative AI in production. New "inference tables" will also allow users to log and monitor model performance to support governance in regulated workflows.
- The AI Gateway acts as a centralized and model-agnostic layer, allowing companies to route requests to various LLM providers like OpenAI, Anthropic, and Amazon Bedrock, as well as self-hosted models, through a single API. This eliminates the need for developers to manage multiple SDKs or rewrite code when switching between models.
- A key feature for regulated industries is the ability to enforce safety and compliance at scale through configurable "AI Guardrails". These can filter for personally identifiable information (PII), block harmful or toxic content, and apply other policy-driven controls to both prompts and responses. For US-based PII, the gateway uses Presidio to detect categories like credit card numbers, social security numbers, and bank account information.
- The gateway provides granular control over LLM usage and costs, with features like user- and endpoint-level rate limiting and tag-based cost attribution for chargebacks to different business units. All usage metadata, including token counts, cost, and latency, is automatically captured in a system table for observability.
- To enhance reliability in production environments, the AI Gateway includes an automatic fallback routing feature. This allows platform teams to mitigate disruptions caused by LLM provider capacity limits or regional outages by redirecting traffic to alternative models or providers.
- The introduction of inference tables, which log all requests and responses as a Delta table in Unity Catalog, is a core component of the gateway's governance capabilities. This continuous logging simplifies model monitoring, debugging, and auditing by making the data available for analysis using standard Databricks tools like SQL queries and notebooks.
- LLM Operations, or LLMOps, is an emerging discipline that extends the principles of MLOps to address the specific challenges of managing large language models. This includes practices for prompt engineering, fine-tuning, and specialized monitoring of factors like token usage and response quality.
- Competitors to Databricks in the broader data and AI platform space include Snowflake, Google BigQuery, Amazon Redshift, and Microsoft Azure Synapse for data warehousing and analytics. In the more specific area of AI and machine learning platforms, alternatives include Google's Vertex AI, Amazon Bedrock, and IBM watsonx.
- The broader trend in enterprise AI involves moving beyond general-purpose models to create domain-specific "data intelligence" by building agents that can leverage a company's proprietary data. This often involves Retrieval-Augmented Generation (RAG) techniques to ground LLM responses in internal knowledge bases.
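
The request path described in the bullets above (guardrail filtering, fallback routing, and request/response logging) can be illustrated with a small, self-contained Python sketch. This is not the Databricks API: `ToyGateway`, `ProviderUnavailable`, the `primary` and `backup` provider stubs, and the single regular expression standing in for PII detection are all hypothetical names invented for illustration, and the in-memory list only loosely mimics what an inference table would record.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical, deliberately narrow pattern for US social security numbers.
# A real guardrail (the article mentions Presidio) detects far more categories.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class ProviderUnavailable(Exception):
    """Raised by a provider stub to simulate a capacity limit or outage."""


@dataclass
class ToyGateway:
    # Ordered (name, callable) pairs; earlier entries are tried first.
    providers: list[tuple[str, Callable[[str], str]]]
    # In-memory stand-in for an inference table of requests and responses.
    log: list[dict] = field(default_factory=list)

    def redact(self, text: str) -> str:
        """Mask anything that looks like an SSN."""
        return SSN_PATTERN.sub("[REDACTED_SSN]", text)

    def query(self, user: str, prompt: str) -> str:
        """Apply the guardrail, try providers in order, and log the exchange."""
        safe_prompt = self.redact(prompt)
        for name, call in self.providers:
            try:
                # The guardrail is applied to the response as well as the prompt.
                response = self.redact(call(safe_prompt))
            except ProviderUnavailable:
                continue  # fall back to the next provider in the list
            self.log.append(
                {"user": user, "provider": name,
                 "prompt": safe_prompt, "response": response}
            )
            return response
        raise RuntimeError("all providers unavailable")


def primary(prompt: str) -> str:
    # Stub that always fails, to exercise the fallback path.
    raise ProviderUnavailable("simulated outage")


def backup(prompt: str) -> str:
    # Stub that echoes the prompt it received.
    return f"echo: {prompt}"


gateway = ToyGateway(providers=[("primary", primary), ("backup", backup)])
answer = gateway.query("alice", "My SSN is 123-45-6789")
print(answer)
```

In this sketch the simulated outage on `primary` causes the request to be served by `backup`, and the logged record contains only the redacted prompt. A production gateway would additionally handle rate limits, cost attribution, and durable storage, none of which is modeled here.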