DevOps Roundup: EBS Clones, K8s GPUs

This week's DevOps roundup flags Amazon EBS volume clones as a new feature, notes Kubernetes GPU workload guidance, and highlights Calico networking best practices on EC2—practical backend ops updates for scaling stateful services and ML workloads. The thread also revisits core patterns like outbox, saga, DB sharding and zero‑downtime deploys. (x.com, x.com)

AWS announced EBS Volume Clones on October 14, 2025, in a blog post by Sébastien Stormacq. The post describes a single API or console action that creates an instant, point-in-time copy of an EBS volume within the same Availability Zone. At launch the feature supported only encrypted volumes. The cloned volume becomes available within seconds, and while background initialization proceeds (without impacting the source), the clone's initial performance is the minimum of three values: a 3,000 IOPS/125 MiB/s baseline, the source volume's provisioned performance, and the cloned volume's provisioned performance.

The Kubernetes documentation lists GPU scheduling as stable as of v1.26. It requires vendor drivers plus a device plugin, which exposes resources such as nvidia.com/gpu. GPUs are specified in the limits section of a container spec; a request, if given, must equal the limit. Node labelling, for example via Node Feature Discovery (NFD), is recommended for clusters with more than one accelerator type. Community guides add practical advice: use the NVIDIA GPU Operator to install drivers, device plugins and monitoring, and rely on cluster schedulers such as Kueue or Volcano for multi-node, multi-GPU orchestration of large ML workloads. (devopscube.com)

Tigera's Calico docs for AWS recommend avoiding overlays inside a VPC subnet and using CrossSubnet IPIP/VXLAN mode, so that only traffic crossing subnet (cross-AZ) boundaries is encapsulated. Operations guides advise /24 IPAM blocks for typical 50–500 node clusters, enabling the eBPF dataplane for a reported 40–60% higher throughput on modern kernels, and using ENA-enabled instance families such as c5n, m5n, r5n or c6gn for network-intensive pods.
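
The EBS clone performance rule above is simple arithmetic, and a small sketch may make it easier to reason about what a freshly cloned volume can deliver. This is only an illustration: `VolumePerf` and `initial_clone_perf` are hypothetical names, not part of any AWS SDK, and taking the minimum separately for IOPS and for throughput is an assumed reading of the rule as summarized here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumePerf:
    """Provisioned performance of a volume (hypothetical helper type)."""
    iops: int
    throughput_mibps: int


# Baseline quoted in the announcement summary: 3,000 IOPS / 125 MiB/s.
CLONE_BASELINE = VolumePerf(iops=3000, throughput_mibps=125)


def initial_clone_perf(source: VolumePerf, clone: VolumePerf) -> VolumePerf:
    """Estimate a clone's performance while background initialization runs.

    Assumption: the "minimum of baseline, source, clone" rule is applied
    independently to IOPS and to throughput.
    """
    return VolumePerf(
        iops=min(CLONE_BASELINE.iops, source.iops, clone.iops),
        throughput_mibps=min(
            CLONE_BASELINE.throughput_mibps,
            source.throughput_mibps,
            clone.throughput_mibps,
        ),
    )


if __name__ == "__main__":
    fast = VolumePerf(iops=16000, throughput_mibps=1000)
    slow = VolumePerf(iops=1000, throughput_mibps=100)
    print(initial_clone_perf(fast, fast))  # capped by the baseline
    print(initial_clone_perf(slow, fast))  # capped by the slower source
```

Under this reading, a heavily provisioned clone of a heavily provisioned source still starts at the baseline, so latency-sensitive workloads may want to wait for initialization to finish before cutting over.
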
The thread's revisit of core backend patterns maps to concrete implementations. The Transactional Outbox pattern persists a state change and an outbox record in one atomic transaction, often paired with CDC or an outbox relay, to guarantee eventual event delivery. The Saga pattern coordinates distributed transactions through local commits plus compensating actions, in either orchestrated or choreographed variants. Database sharding is commonly implemented with hash-, range- or directory-based partitioning, depending on query patterns. The zero-downtime deploy playbooks called out in the thread enumerate rolling updates, blue/green and canary strategies, along with readiness/liveness probes and GitOps tooling (Flux), as concrete mechanisms for safe cutovers and automated rollback gates.
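
As a rough illustration of the Transactional Outbox pattern mentioned above, the following minimal sketch uses Python's built-in `sqlite3` module. The table names, the `order.placed` topic, and the `place_order`/`relay_once` helpers are all hypothetical; a production setup would more likely use CDC or a dedicated relay process against a real message broker.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create a business table and an outbox table in the same database."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS orders (
            id INTEGER PRIMARY KEY,
            item TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def place_order(conn: sqlite3.Connection, order_id: int, item: str) -> None:
    """Write the order and its event in one local transaction."""
    # `with conn` commits on success and rolls back if an exception is raised,
    # so the order row and the outbox row are persisted together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item)
        )
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.placed", json.dumps({"order_id": order_id, "item": item})),
        )


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish pending outbox rows in order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, topic, payload in rows:
        publish(topic, json.loads(payload))  # may raise; row then stays pending
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (row_id,)
            )
        sent += 1
    return sent


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    init_db(db)
    place_order(db, 1, "widget")
    relay_once(db, lambda topic, event: print(topic, event))
```

Because a row is marked as published only after `publish` returns, a crash between the two steps leads to a re-send on the next pass. Delivery in this sketch is therefore at-least-once, and consumers are assumed to be idempotent.
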

