Meta Shares Playbook for Scaling to 1B Daily Videos
Meta's engineering team detailed how it scales video processing to handle more than a billion video uploads a day. The core strategies involve separating the decoder and encoder pipelines and heavily optimizing FFmpeg. The post offers a rare look inside the infrastructure decisions required to handle massive video workloads efficiently.
Meta's video processing operates at a scale where `ffmpeg` and `ffprobe` are executed tens of billions of times daily to handle more than a billion video uploads. That workload pushed Meta to move away from a longstanding internal fork of FFmpeg and contribute directly to the open-source project, where it influenced the development of more efficient, threaded multi-lane encoding, which landed in FFmpeg 6.0.
Meta also developed custom silicon for this workload: the Meta Scalable Video Processor (MSVP), an ASIC built specifically for video transcoding that has been in production since at least 2021. For H.264, the accelerator delivers a 9x performance increase over Meta's previous software-only encoding stack while consuming half the energy. MSVP is programmable to handle both high-throughput video-on-demand (VOD) and low-latency live streaming. Its development was driven by the need for an energy-efficient, low-latency solution that could maintain quality, especially since most uploaded videos are already compressed. The chip supports H.264 and VP9, and a second-generation version enables the pervasive use of the more efficient AV1 codec in Reels.
The custom hardware is a key component of the broader infrastructure behind products like Reels, which now sees 3.5 billion daily shares across Facebook and Instagram. The move to custom hardware and deep open-source collaboration is part of a larger trend at Meta of re-architecting backend systems from application-specific stacks into shared horizontal layers. This defragmentation improves resource efficiency and engineering velocity across all of Meta's applications, including Facebook, Instagram, and Messenger. The investment in custom silicon also extends beyond video processing into AI: the Meta Training and Inference Accelerator (MTIA) is being developed to handle recommendation workloads more efficiently than general-purpose GPUs.
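
The multi-lane encoding mentioned above rests on a simple idea: decode the source once and feed several encoders from that single decode. The sketch below is a minimal illustration of that pattern using standard `ffmpeg` command-line options. It is not Meta's pipeline. The function name, output file names, heights, and bitrates are all hypothetical, and whether the encoders actually run in parallel depends on the FFmpeg version in use.

```python
import shlex
from typing import List, Tuple


def build_multi_output_command(
    source: str, renditions: List[Tuple[int, str]]
) -> List[str]:
    """Build one ffmpeg invocation that reads `source` once and writes
    one H.264 output per (height, video_bitrate) rendition.

    Output names (out_<height>p.mp4) are illustrative assumptions.
    """
    # A single -i means the input is opened and decoded once.
    cmd = ["ffmpeg", "-i", source]
    for height, bitrate in renditions:
        # In ffmpeg, output options apply to the next output file listed,
        # so each rendition gets its own scale filter and bitrate.
        cmd += [
            "-vf", f"scale=-2:{height}",  # -2 keeps aspect ratio, even width
            "-c:v", "libx264",
            "-b:v", bitrate,
            "-c:a", "aac",
            f"out_{height}p.mp4",
        ]
    return cmd


if __name__ == "__main__":
    # Hypothetical ladder: three renditions from one upload.
    ladder = [(1080, "5M"), (720, "3M"), (360, "800k")]
    print(shlex.join(build_multi_output_command("upload.mp4", ladder)))
```

The function only builds the argument list, so it can be inspected or logged before being handed to `subprocess.run`. Running one process per rendition instead would repeat the decode for every output, which is the duplicated work a single multi-output invocation avoids.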
This focus on the full infrastructure stack, from custom chips to massive AI-optimized data centers, indicates a long-term strategy to control and optimize every layer of its service delivery. Future developments will likely focus on optimizing for short-form video and the efficient delivery of generative AI, AR, and VR content. This aligns with the company's plan to continue investing heavily in FFmpeg and other open-source projects, so that the entire industry can benefit from the solutions developed to handle Meta's massive scale.