Datadog flags agent drift signals
- Datadog said its LLM Observability tools can now trace AI agents step by step, mapping tool calls, handoffs, retries, latency, errors, cost, and output quality.
- The company launched AI Agent Monitoring as generally available on June 10, 2025, and said engineers can spot infinite loops, wrong tool calls, and latency spikes.
- The push extends Datadog’s broader AI monitoring suite as companies deploy more agentic systems across coding, support, and operations. (datadoghq.com)
AI agents do not fail like normal software. They can finish a task, return an answer, and still be wrong, slow, expensive, or unsafe. (datadoghq.com)

Datadog’s pitch is that teams need to watch those systems the way they already watch apps and infrastructure: with traces, metrics, and evaluations tied to each request. Its LLM Observability product records latency, errors, token usage, inputs, outputs, and the steps inside an agent run. (docs.datadoghq.com) (datadoghq.com)

That matters more with agents than with a single chatbot response. Datadog says agentic systems can plan, loop, call tools, hand work to other agents, and retry failed steps, which turns one user request into a dynamic decision graph. (datadoghq.com)

On June 10, 2025, Datadog announced AI Agent Monitoring, LLM Experiments, and AI Agents Console at its DASH conference. The company said AI Agent Monitoring is generally available and maps each agent’s decision path, including inputs, tool invocations, calls to other agents, and outputs. (datadoghq.com)

Datadog’s own examples of trouble are concrete. Engineers can drill into latency spikes, incorrect tool calls, and unexpected behavior such as infinite agent loops, then correlate those failures with quality, security, and cost metrics. (datadoghq.com)

The company frames this as an observability problem, not just a model problem. Its docs say failures in large language model applications often show up as low-quality or wrong responses rather than clean software errors, which makes traditional monitoring incomplete. (datadoghq.com)

Datadog has been widening that tooling across major agent frameworks. In June 2025 it said its software development kit could automatically track agent operations built with OpenAI Agent SDK, LangGraph, CrewAI, and Bedrock Agent SDK. (datadoghq.com)

Google and Amazon have since published integrations with the same Datadog stack.
Google said in January 2026 that Datadog added automatic instrumentation for Agent Development Kit systems, while Amazon said in July 2025 that Bedrock Agents can be monitored step by step for model calls, tool invocations, and knowledge-base interactions. (cloud.google.com) (aws.amazon.com)

Datadog has also pushed the same idea into coding assistants. In December 2025, it put Claude Code Monitoring into AI Agents Console preview, showing spend, token usage, error rates, latency patterns, commits, and pull requests tied to assistant activity. (datadoghq.com)

The through line is that agent drift is less a single bug than an accumulation of bad choices: the wrong tool, one extra loop, slower responses, higher spend, or weaker outputs. Datadog’s answer is to make every one of those steps visible before the agent becomes expensive or unreliable. (datadoghq.com) (docs.datadoghq.com)
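
For readers who want a concrete picture of those drift signals, here is a minimal Python sketch of how a team might check one agent run for repeated tool calls, latency, spend, and errors. It is an illustration only and does not use Datadog's SDK or data model: the `Step` record, the `drift_signals` function, and the threshold values are all hypothetical names and numbers chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Step:
    """One step in an agent run (hypothetical record, not Datadog's schema)."""
    kind: str          # e.g. "tool", "llm", "handoff"
    name: str          # tool or agent name
    args: str          # serialized arguments
    latency_ms: float
    cost_usd: float
    error: bool = False


def drift_signals(steps, max_repeats=3, latency_budget_ms=5000.0,
                  cost_budget_usd=0.50):
    """Return human-readable warnings for one agent run.

    All thresholds are illustrative assumptions, not recommended values.
    """
    signals = []

    # The same tool called with the same arguments many times suggests a loop.
    calls = Counter((s.name, s.args) for s in steps if s.kind == "tool")
    for (name, args), count in calls.items():
        if count >= max_repeats:
            signals.append(f"possible loop: {name}({args}) called {count} times")

    # Sum latency across every step of the run and compare to a budget.
    total_latency = sum(s.latency_ms for s in steps)
    if total_latency > latency_budget_ms:
        signals.append(f"latency over budget: {total_latency:.0f} ms")

    # Do the same for spend.
    total_cost = sum(s.cost_usd for s in steps)
    if total_cost > cost_budget_usd:
        signals.append(f"cost over budget: ${total_cost:.2f}")

    # Count steps that ended in an error, even if the run returned an answer.
    errors = sum(1 for s in steps if s.error)
    if errors:
        signals.append(f"{errors} step(s) ended in error")

    return signals


# A made-up run: the agent searches three times with the same query.
run = [
    Step("llm", "planner", "", 800.0, 0.02),
    Step("tool", "search", "q=refund policy", 1500.0, 0.00),
    Step("tool", "search", "q=refund policy", 1600.0, 0.00),
    Step("tool", "search", "q=refund policy", 1700.0, 0.00, error=True),
    Step("llm", "answer", "", 900.0, 0.03),
]
for warning in drift_signals(run):
    print(warning)
```

The made-up run returns an answer, yet the check still reports a possible loop, a blown latency budget, and one failed step. That is the kind of quiet degradation the article describes, where no single step looks like a conventional software error.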