Prime Video Reverts Architecture

- Amazon Prime Video moved its video quality analysis from microservices/serverless back to a monolith on ECS/EC2.
- The change reportedly cut infrastructure costs by more than 90% while improving scale and resilience for that workload.
- The example warns that distributed microservices can be inefficient for tightly coupled, high-volume video pipelines, informing pipeline architecture choices (x.com).

Prime Video’s video-quality team moved one monitoring workload off serverless microservices and back into a single application, cutting infrastructure costs by more than 90%. (infoq.com) The workload was not the consumer streaming app itself. It was an internal Video Quality Analysis system that checked live streams for defects such as block corruption and audio-video sync problems, then triggered repairs. (wudsn.com)

Prime Video engineer Marcin Kolny wrote on March 22, 2023 that the original design used distributed components built on AWS Step Functions, AWS Lambda, Amazon Simple Storage Service (S3), and Amazon Simple Notification Service. The team said that setup was fast to build, but expensive to run at larger scale. (infoq.com)

The bottleneck came from the shape of the work. The system had to split streams into frames and audio buffers, pass those intermediate files between services, and pay for repeated Step Functions state transitions and S3 reads and writes. (infoq.com) Kolny said the first version topped out at about 5% of the team’s expected load. The team wanted to monitor thousands of concurrent streams and found the orchestration layer was hitting account limits on total state transitions. (infoq.com)

The replacement packed the media conversion, defect detection, and result aggregation logic into one process running on Amazon Elastic Compute Cloud (EC2) and Amazon Elastic Container Service (ECS). That kept data in memory instead of shipping frames across networked services. (thenewstack.io) Prime Video said the change improved scale as well as cost. After the rewrite, the team said it could handle thousands of streams and still had room to grow further. (infoq.com)

The episode drew outsized attention because Amazon Web Services helped popularize service-oriented and serverless architectures. The case landed in May 2023 as a counterexample to the idea that breaking software into many services is always the modern choice. (thenewstack.io)

Not everyone read the post as a broad indictment of microservices. John Bennett of SPR argued in August 2023 that Prime Video’s example looked less like “microservices versus monolith” in the abstract and more like a distributed utility pipeline that was a poor fit for that particular workload. (spr.com)

That is the durable lesson from the Prime Video post: architecture choices depend on the job. For a pipeline that shuffles huge numbers of video frames between tightly coupled steps, one process on ECS and EC2 turned out to be cheaper and easier to scale than a chain of serverless parts. (spr.com)
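
The hand-off overhead described above can be illustrated with a toy model. The sketch below is not Prime Video's code. The stage functions (`convert`, `detect`, `aggregate`), the `Counters` class, and the in-memory `store` dictionary standing in for S3 are all hypothetical, and no real AWS calls or prices are involved. It only counts how many store operations and orchestrator transitions each design needs to produce the same result.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Tallies the per-frame hand-offs the distributed design incurs."""
    store_writes: int = 0
    store_reads: int = 0
    state_transitions: int = 0


# Hypothetical stand-ins for the three stages named in the article.
def convert(frame: int) -> int:
    return frame * 2            # placeholder for media conversion


def detect(converted: int) -> bool:
    return converted % 7 == 0   # placeholder for a defect detector


def aggregate(results: list) -> int:
    return sum(results)         # number of flagged frames


def run_distributed(frames: list, c: Counters) -> int:
    """Each stage is a separate step; intermediates travel through a store."""
    store: dict = {}            # stands in for an object store such as S3
    for i, frame in enumerate(frames):
        c.state_transitions += 1              # orchestrator -> convert
        store[f"conv/{i}"] = convert(frame)
        c.store_writes += 1

        c.state_transitions += 1              # orchestrator -> detect
        converted = store[f"conv/{i}"]
        c.store_reads += 1
        store[f"det/{i}"] = detect(converted)
        c.store_writes += 1

    c.state_transitions += 1                  # orchestrator -> aggregate
    results = []
    for i in range(len(frames)):
        results.append(store[f"det/{i}"])
        c.store_reads += 1
    return aggregate(results)


def run_monolith(frames: list) -> int:
    """All stages share one process; intermediates stay in memory."""
    return aggregate([detect(convert(f)) for f in frames])


if __name__ == "__main__":
    frames = list(range(100))
    counters = Counters()
    assert run_distributed(frames, counters) == run_monolith(frames)
    print(counters)
```

In this toy model both designs return the same answer, but the distributed version's transitions, reads, and writes all grow with the number of frames, while the single-process version incurs none of them. That is the shape of the cost the team described, under the simplifying assumptions stated above.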
