Bedrock’s endpoints trade availability for residency

AWS Bedrock now offers global endpoints for dynamic routing and regional endpoints for guaranteed geographic routing, turning endpoint type into a resilience-versus-residency decision. Anthropic’s Claude Sonnet 4.5 and later models are available through both endpoint types, formalizing a policy trade-off that teams must build into their model gateways (platform.claude.com). At the same time, Unit 42 disclosed sandbox bypasses in AgentCore that enable DNS tunneling and credential exposure, so managed sandboxes should not be treated as airtight: they need extra egress controls, scoped credentials and high-fidelity logs (unit42.paloaltonetworks.com).

# Bedrock’s endpoints trade availability for residency

Amazon Web Services has turned a quiet infrastructure choice into a policy decision. In Amazon Bedrock, teams can now choose between endpoint types that optimize for different things: global endpoints that dynamically route requests for maximum availability and throughput, or regional endpoints that keep routing inside a defined geography. Anthropic says this split applies starting with Claude Sonnet 4.5 and later models on Bedrock, which means model access policy is no longer just about price and latency. It is now also about where traffic is allowed to go when systems are under load. (platform.claude.com)

That sounds like a small implementation detail, but it changes how enterprise artificial intelligence stacks need to be designed. A model gateway used to answer straightforward questions like which model to call, how many tokens to allow, and what fallback to use if one provider fails. Now it also has to answer a governance question: when demand spikes, should the platform preserve service by routing more broadly, or preserve residency by refusing to leave a geographic boundary? AWS’s own documentation frames the trade-off directly. Geographic cross-Region inference keeps processing within boundaries such as the United States, Europe, or Asia Pacific, while global cross-Region inference can route to any supported commercial AWS Region worldwide. (docs.aws.amazon.com)

Under the hood, Bedrock handles this through inference profiles. These profiles define a model and one or more Regions that can receive the request, letting AWS spread traffic across multiple locations instead of pinning every call to a single Region. AWS says the point is to absorb bursts, improve throughput, and reduce the chance that on-demand inference gets squeezed by quota limits or peak demand. In practice, that makes endpoint selection part of resilience engineering.
If a single Region is crowded, a broader profile gives Bedrock more places to send work. (docs.aws.amazon.com)

AWS also makes the commercial incentive explicit. In its current Bedrock documentation, geographic cross-Region inference is positioned for organizations with data residency requirements, while global cross-Region inference is positioned for organizations prioritizing performance and cost. The same page says global routing offers the highest available throughput and approximately 10 percent savings, while geographic routing offers higher throughput than a single Region but keeps data within a geography. That is a rare case where architecture, compliance, and pricing are all being adjusted with the same switch. (docs.aws.amazon.com)

Anthropic’s documentation effectively formalizes this split for Claude customers. Its model overview says that, starting with Claude Sonnet 4.5 and all subsequent models, AWS Bedrock and Google Vertex AI offer two endpoint types: global endpoints for dynamic routing and regional endpoints for guaranteed routing through specific geographic regions. That wording matters because it tells buyers this is not a one-off launch quirk tied to a single model release. It is becoming part of the operating model for frontier systems sold through cloud platforms. (platform.claude.com)

For platform teams, the practical consequence is that “use Claude on Bedrock” is no longer a complete instruction. They need separate paths for workloads with different legal and operational requirements. A customer support assistant serving public product questions may be allowed to use global routing for better uptime and lower cost. A claims workflow, health intake flow, or regulated internal assistant may need geographic or single-Region routing even if it means lower headroom during spikes.
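
One way a gateway might encode these separate paths is sketched below. It is an illustration under stated assumptions, not AWS's or Anthropic's implementation: the `Workload` fields, the `choose_scope` rule, and the `example-model-v1` identifier are all hypothetical, and the `global.` and geography prefixes are assumed here to mirror the naming pattern of Bedrock inference profile IDs, which should be checked against the profiles actually available in a given account.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RoutingScope(Enum):
    GLOBAL = "global"                # may route to any supported commercial Region
    GEOGRAPHIC = "geographic"        # stays inside one geography (for example US or EU)
    SINGLE_REGION = "single_region"  # pinned to one Region


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload as the gateway sees it."""
    name: str
    regulated: bool = False
    geography: Optional[str] = None      # e.g. "eu"; None if unconstrained
    pinned_region: Optional[str] = None  # e.g. "eu-central-1"


def choose_scope(workload: Workload) -> RoutingScope:
    """Pick the narrowest routing scope the workload's policy requires."""
    if workload.pinned_region is not None:
        return RoutingScope.SINGLE_REGION
    if workload.regulated or workload.geography is not None:
        return RoutingScope.GEOGRAPHIC
    return RoutingScope.GLOBAL


def resolve_target(workload: Workload, model_id: str) -> str:
    """Map a workload to the identifier the gateway would call.

    The prefix scheme is an assumption made for illustration only.
    """
    scope = choose_scope(workload)
    if scope is RoutingScope.GLOBAL:
        return f"global.{model_id}"
    if scope is RoutingScope.GEOGRAPHIC:
        if workload.geography is None:
            # Fail closed: a regulated workload with no declared geography
            # is rejected rather than silently routed more broadly.
            raise ValueError(f"{workload.name}: regulated workload needs a geography")
        return f"{workload.geography}.{model_id}"
    # Single Region: call the base model directly in the pinned Region.
    return model_id


support = Workload(name="support-assistant")
claims = Workload(name="claims-workflow", regulated=True, geography="eu")
print(resolve_target(support, "example-model-v1"))  # global.example-model-v1
print(resolve_target(claims, "example-model-v1"))   # eu.example-model-v1
```

The design choice worth noting is the failure mode: when the policy is ambiguous, the sketch refuses the request instead of widening the route, which matches the residency-first posture described for regulated workloads.
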
AWS says application inference profiles can be created to route requests to one Region or to multiple Regions while also tracking usage and costs, which gives teams a mechanism to encode these differences in production. (docs.aws.amazon.com)

This would already be a meaningful cloud architecture story on its own. But it lands alongside a separate security warning that points in the opposite direction: do not confuse a managed boundary with a perfect one. On April 7, 2026, Unit 42 published research on Amazon Bedrock AgentCore describing two issues that predate AWS remediations: a way to bypass the Code Interpreter sandbox’s network isolation mode with Domain Name System (DNS) tunneling, and a credential exposure issue tied to the micro virtual machine (microVM) metadata service configuration. (unit42.paloaltonetworks.com)

The network side of the finding is blunt. Unit 42 says AgentCore’s Code Interpreter sandbox mode was intended to isolate code from external network access, but the isolation was incomplete, allowing data to be sent and received through DNS tunneling. DNS traffic is usually treated as basic internet plumbing, which is exactly why tunneling through it can be dangerous: the channel often looks ordinary until someone uses it to smuggle data out. (unit42.paloaltonetworks.com)

The identity side is just as important. Unit 42 says it found a security regression in which the AgentCore Runtime used a microVM metadata service without session token enforcement, and that before AWS fixed the issue, an attacker exploiting a standard web flaw such as server-side request forgery could have extracted sensitive credentials. Unit 42 also says customers cannot patch the managed environment directly and instead must rely on platform controls provided by AWS. (unit42.paloaltonetworks.com)

Put together, the two developments sharpen the same lesson from different angles.
On the inference side, the cloud is offering more elasticity by widening the places a request can go. On the agent runtime side, security researchers are reminding customers that a managed sandbox is still software with edges, assumptions, and failure modes. If teams treat “managed by the provider” as equivalent to “airtight,” they will build the wrong controls around both routing and execution. (docs.aws.amazon.com)

That means the right response is not to avoid managed artificial intelligence services. It is to be more explicit about trust boundaries. For Bedrock inference, organizations should decide which applications are allowed to use global routing, which must stay within a geography, and which should remain pinned to a single Region for stricter policy reasons. AWS’s current guidance already distinguishes these paths, and Anthropic’s model documentation shows that newer Claude releases are being packaged around that choice. (docs.aws.amazon.com)

For Bedrock AgentCore and similar managed sandboxes, the safer assumption is that containment reduces risk but does not eliminate it. Unit 42 says AWS applied internal remediations and published mitigation strategies after disclosure, but the customer-side takeaway remains the same: limit outbound paths, keep credentials tightly scoped, and collect logs detailed enough to reconstruct suspicious behavior. That is especially true for systems that execute generated code, touch internal data, or chain multiple tools together. (unit42.paloaltonetworks.com)

The broader shift is easy to miss if you only look at feature checklists. Cloud artificial intelligence platforms are no longer selling just “access to a model.” They are selling routing policy, failure handling, identity boundaries, and managed execution environments as part of the product. Bedrock’s new endpoint split makes availability-versus-residency a first-class architectural choice.
The AgentCore research is a reminder that every one of those choices needs compensating controls around it. (docs.aws.amazon.com)
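
As one illustration of a compensating control on the logging side, the sketch below flags DNS query names that look like they could be carrying encoded data. It is a rough heuristic written for this article, not Unit 42's detection method or an AWS feature: the function names and both thresholds are hypothetical and untuned, and treating the last two labels as the registered domain is a simplification that ignores multi-part public suffixes.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnel(
    qname: str,
    max_label_len: int = 40,     # hypothetical threshold, not tuned
    min_entropy: float = 3.5,    # hypothetical threshold, not tuned
    min_len_for_entropy: int = 16,
) -> bool:
    """Flag query names whose subdomain labels are unusually long or random-looking."""
    labels = [label for label in qname.rstrip(".").split(".") if label]
    # Simplification: assume the last two labels are the registered domain.
    subdomain_labels = labels[:-2]
    for label in subdomain_labels:
        if len(label) >= max_label_len:
            return True
        if (
            len(label) >= min_len_for_entropy
            and shannon_entropy(label.lower()) >= min_entropy
        ):
            return True
    return False


print(looks_like_tunnel("www.example.com"))  # False
print(looks_like_tunnel("abcdefghijklmnopqrstuvwxyz012345.t.example.net"))  # True
```

A check like this only helps if the DNS queries made from the sandbox are logged in the first place, and it would produce false positives on legitimate services that use long generated hostnames, so it is better treated as a triage signal than a blocking rule.
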
