Meta Inks $50M/Year Deal for News Corp Training Data

Meta has agreed to pay News Corp up to $50 million annually for multi-year access to its content for training AI models. The deal signals a major shift from scraping data to licensing it, as media companies begin to monetize their archives rather than suing AI firms.

This agreement is part of News Corp CEO Robert Thomson's "woo and a sue" strategy for AI, which involves partnering with some firms while suing others for scraping content. The media giant previously signed a five-year deal with OpenAI in 2024, valued at over $250 million, signaling a clear strategy to monetize its vast content archives for AI development.

The deal is a key indicator of an industry-wide shift, as AI companies increasingly seek to license content to avoid legal challenges. Meta has also secured multi-year agreements with publishers like USA Today, CNN, and Fox News. Similarly, OpenAI has partnerships with the Associated Press and Axel Springer, establishing a market precedent for paying for high-quality training data.

This pivot from scraping to licensing underscores the growing importance of data provenance for building trustworthy AI. For consumer health apps, which rely on sensitive user inputs from sources like Apple HealthKit and wearable APIs, establishing a clear, consensual data pipeline is critical for model accuracy and user trust. The growth of apps like Noom and Flo is heavily dependent on users trusting them with closely-held health data.

The alternative to these licensing deals is litigation over copyright infringement, exemplified by The New York Times' lawsuit against OpenAI and Microsoft. This legal pressure highlights the significant risks for tech founders who use data without explicit permission. For a health startup, this reinforces the need for strict adherence to data privacy regulations like HIPAA and state-level laws to avoid costly legal battles and erosion of consumer confidence.

News Corp's CEO views his company's content as a fundamental "input" for AI, much like semiconductors are for computing. This framing is a crucial lesson for founders transitioning from developer to CEO. In consumer health, user-generated data from chronic illness communities and wellness seekers is the core asset; building a business model around its ethical and consensual use is fundamental to long-term success and attracting early-stage digital health investment.

Shared from Scout - Be the smartest in the room.