Shopify's Use of Kubernetes for Scalable Commerce

Shopify's engineering teams are reportedly utilizing Kubernetes and microservices to build scalable and resilient backends for enterprise-level stores. This architectural approach is presented as essential for managing high transaction volumes and real-time inventory needs. The strategy allows for greater elasticity and developer velocity as platforms grow.

- Shopify's journey to Kubernetes began after their initial large-scale use of Docker in 2014, which led to a fragile, homemade middleware for container control by 2016. This prompted the move to a more robust orchestration solution to manage hundreds of applications, not just their core monolith. - The platform engineering team at Shopify operates on a "platform of platforms" model, with specialized teams for databases, streaming, and observability, all unified under Kubernetes. They run approximately 400 Kubernetes clusters to manage both stateless and stateful workloads, including all of their databases. - A key principle in their strategy was to create a "paved road" for developers, building an internal PaaS that meets the majority of use cases by default but still allows for customization. This involved abstracting away the complexity of Kubernetes for most developers through a web UI, rather than direct `kubectl` access, to improve their workflow and focus them on feature development. - To handle the scale of events like Black Friday Cyber Monday, where they've processed peaks of 58 million requests per minute, Shopify heavily relies on Google Kubernetes Engine (GKE). Their collaboration with Google Cloud is crucial for capacity planning to handle multiples of their normal traffic. - The engineering team initially attempted a highly abstracted layer on top of Kubernetes, which proved unsuccessful. They learned that they couldn't completely hide Kubernetes from developers and now provide meaningful defaults while allowing power users to manipulate manifests directly. - Shopify has developed and open-sourced several tools to enhance their Kubernetes workflow, including `krane` (formerly `kubernetes-deploy`), a command-line tool for shipping changes to a namespace and understanding the results. They also built custom controllers, referred to as "cloudbuddies," to automate tasks like DNS record creation and managing user quotas. - For development environments, Shopify evolved from a homegrown Ruby tool running on local VMs to a cloud-native solution called "Spin". This new approach provides each developer with a dedicated Kubernetes namespace, defining a workspace as a collection of pods, which aligns the development lifecycle more closely with their production environment. - The transition to a cloud-native infrastructure on Kubernetes has also driven cost-optimization efforts. By migrating key VM-based workloads to GKE and implementing FinOps practices with internal dashboards, Shopify has achieved significant savings on specific workloads.

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.