Gartner forecasts surge in AI observability

- Gartner said on May 12 that 40% of organizations deploying AI will use dedicated AI observability tools by 2028 to watch outputs and bias.
- Akeyless said the same day that two-thirds of enterprises using AI agents suspect those agents already touched data beyond intended scope.
- AI teams are shifting from prompt logs to full execution traces: retrievals, tool calls, permissions, lineage, and approvals all need watching.

AI observability is becoming its own software category. That is the real news here. Gartner said on May 12 that 40% of organizations deploying AI will be using dedicated AI observability tools by 2028, which is a sharp sign that “just ship the model” is giving way to “prove what the model did.” The reason is simple: once AI systems start making decisions, calling tools, and touching real company data, ordinary app monitoring stops being enough.

### What is “AI observability” actually watching?

Regular observability watches servers, apps, latency, and errors. AI observability watches a different layer: model outputs, bias, drift, prompt behavior, retrieval quality, hallucinations, and whether an agent used the right tool or data source for the right reason. Gartner’s framing is telling because it centers model performance, bias, and outputs, not just uptime. Basically, the question is no longer only “did the system run?” but “did it behave acceptably?” (gartner.com)

### Why is that suddenly urgent?

Because AI agents do not just answer questions. They can search internal documents, call APIs, trigger workflows, and act with machine credentials. That creates a much bigger blast radius when something goes wrong. Akeyless said on May 12 that two-thirds of organizations using AI agents suspect those agents have already accessed data beyond intended scope. In the same release, it said many organizations still cannot detect compromised agents for hours and are already spending more than $1 million dealing with the fallout. (gartner.com)

### Why can’t normal logs handle this?

Normal logs tell you a request hit a service and maybe whether it failed. They usually do not tell you which prompt version ran, which retrieval chunks were pulled in, which model answered, what tool the agent called next, what permissions it used, and whether a human approved the action. That chain matters because AI failures are often not single-point crashes.
They are bad sequences: the model misreads context, chooses the wrong tool, then gets valid access to the wrong data. (morningstar.com) Observability has to capture the whole path, not just the endpoint.

### Why do agents make this harder?

An agent is basically software that can decide and act in steps. That is useful, but it means the system is less like a chatbot and more like a junior employee with a badge, a browser, and partial instructions. If that “employee” is over-permissioned, badly scoped, or hard to audit, the risk is obvious. The Cloud Security Alliance said in April that 82% of enterprises have unknown AI agents in their environments, which fits the same pattern: companies are deploying agents faster than they can inventory and govern them. (gartner.com)

### So what gets logged now?

The baseline is expanding fast. Teams need traces for prompts, retrievals, tool calls, model versions, grounding sources, user approvals, policy checks, and final actions. They also need lineage: which model, dataset, guardrail, and workflow version produced a given output. That is what turns an AI system from a black box into something a security, compliance, or product team can actually inspect after the fact. (cloudsecurityalliance.org) Gartner’s forecast matters because it suggests this is moving from best practice to default enterprise plumbing.

### Is this mostly about security?

Security is a big driver, but not the only one. Observability also matters for reliability and trust. If a model starts drifting, gets more biased, retrieves stale context, or degrades after a vendor update, companies need to catch that before users do. In other words, observability is becoming the control layer between “the AI worked in a demo” and “the AI is safe to run in production.” (gartner.com)

### What’s the bottom line?

The market is maturing. Last year, a lot of AI teams were still proving that agents and copilots could do useful work.
This year, the pressure is shifting to evidence: show the trace, show the permissions, show the approval trail. If Gartner is right, by 2028 a large chunk of enterprise AI will come with that instrumentation built in, because the alternative is letting autonomous systems operate faster than anyone can explain them. (gartner.com)
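
For readers who want something concrete, here is a rough sketch of what one execution trace from the "So what gets logged now?" list might look like as a record, plus a check for the out-of-scope access problem Akeyless describes. Everything in it is an assumption made for illustration: the `AgentTrace` and `ToolCall` names, the field names, the resource strings, and the `out_of_scope_calls` helper do not come from Gartner, Akeyless, or any particular observability product.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    """One step an agent took (hypothetical schema)."""
    tool: str                          # illustrative tool name, e.g. "crm.search"
    resource: str                      # data source or API the call touched
    credential: str                    # machine credential the agent used
    approved_by: Optional[str] = None  # human approver, if there was one


@dataclass
class AgentTrace:
    """One end-to-end run: prompt, retrievals, tool calls, lineage."""
    trace_id: str
    prompt_version: str
    model_version: str
    retrieved_chunks: list[str] = field(default_factory=list)
    tool_calls: list[ToolCall] = field(default_factory=list)
    policy_checks: dict[str, bool] = field(default_factory=dict)
    final_action: str = ""


def out_of_scope_calls(trace: AgentTrace, allowed_resources: set[str]) -> list[ToolCall]:
    """Return the tool calls that touched a resource outside the intended scope."""
    return [call for call in trace.tool_calls if call.resource not in allowed_resources]


# Example run: a support agent that also read a payroll file it was never scoped for.
trace = AgentTrace(
    trace_id="t-001",
    prompt_version="support-v3",
    model_version="model-a",
    retrieved_chunks=["kb/refunds.md#2"],
    tool_calls=[
        ToolCall("crm.search", "crm/customers", "svc-support"),
        ToolCall("files.read", "hr/payroll", "svc-support"),
    ],
)
flagged = out_of_scope_calls(trace, allowed_resources={"crm/customers"})
print([call.resource for call in flagged])  # prints ['hr/payroll']
```

The point of the sketch is that the scope check only works because each tool call is recorded together with the resource and credential it used. An ordinary request log, which records only that a service was hit, would not carry enough detail to flag the payroll read.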
