Encord Raises $60M to Challenge Scale AI
Encord, a data infrastructure startup for "physical AI," has landed $60M in a Series C round led by Wellington Management. The company is positioning itself as a direct competitor to Scale AI, focusing on creating data pipelines for robotics and drones where synthetic and real-world data converge.
Encord's platform is engineered for the complexities of "physical AI," focusing on multimodal data such as images, video, audio, and DICOM files. This specialization in sensor-rich data from environments like robotics and autonomous vehicles is a key differentiator from more generalist data labeling platforms. The company's AI-native infrastructure is designed to manage the entire data lifecycle, from curation and annotation to model evaluation and alignment with human feedback.
A core component of Encord's strategy is "active learning," a machine learning technique in which the model itself identifies the most informative data points for human annotation. This approach aims to achieve higher model accuracy with less labeled data, directly addressing the time and cost bottlenecks of manual annotation. By routing annotators to the samples the model is least certain about, active learning pipelines can reach a given level of performance with fewer labels.
For AI labs, Reinforcement Learning from Human Feedback (RLHF) is a critical process for aligning models with human values. It involves training a reward model on human-ranked outputs; that reward model then supplies the signal that guides the AI's learning process. Encord's platform facilitates these complex RLHF workflows, which are essential for refining everything from conversational agents to generative video models.
To further reduce reliance on human labelers and scale alignment, some labs are turning to Constitutional AI. Developed by Anthropic, this method uses a predefined set of principles (a "constitution") to have an AI model critique and revise its own outputs. AI-generated preference judgments then stand in for human rankings during reinforcement learning, a stage known as Reinforcement Learning from AI Feedback (RLAIF), which minimizes the need for human intervention in identifying harmful or undesirable responses.
The debate between using synthetic data and human-labeled data is central to the AI development landscape. While synthetic data offers speed, scalability, and privacy advantages, it can lack the nuance and accuracy that human annotators provide on context-sensitive tasks. A hybrid approach, using synthetic data for broad coverage and human labeling for fine-tuning and edge cases, is often the most effective strategy.
Evaluating agentic AI, which can perform multi-step tasks and use various tools, presents new data labeling challenges. The focus shifts from labeling final outputs to evaluating the entire "trace" of the agent's actions, including its intermediate decisions and tool usage. This requires structured labeling schemas that assess not just the outcome but also the quality of the agent's reasoning process.
The venture capital landscape for AI infrastructure is robust, with AI-related startups capturing nearly half of all global venture funding in 2025. Total venture capital for AI companies surged to $211 billion in 2025, an 85% increase from the previous year, with a significant portion directed towards AI infrastructure and foundation models. This influx of capital signals strong investor confidence in companies building the foundational tools for AI development.
This AI boom is reshaping the workforce, creating high demand for data labelers and new roles focused on human-AI interaction. While AI automates many repetitive tasks, human expertise remains crucial for complex, nuanced data annotation and for overseeing the quality of AI-generated labels. The future of work in this sector will likely involve a collaborative model in which humans upskill to manage and validate the outputs of increasingly sophisticated AI systems.
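
For readers who want to see what "focusing on the most uncertain samples" can look like in practice, here is a minimal sketch of uncertainty sampling, one common form of active learning. It is an illustration only and not Encord's implementation: the function names, the frame ids, and the probabilities are all hypothetical, and predictive entropy is just one of several possible uncertainty measures.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(predictions, budget):
    """Return the ids of the `budget` samples with the highest entropy.

    `predictions` maps a sample id to the model's class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical model outputs for four unlabeled frames (assumed values).
predictions = {
    "frame_001": [0.98, 0.01, 0.01],  # confident prediction
    "frame_002": [0.34, 0.33, 0.33],  # nearly uniform, so most uncertain
    "frame_003": [0.70, 0.20, 0.10],
    "frame_004": [0.50, 0.49, 0.01],
}

to_label = select_for_annotation(predictions, budget=2)
print(to_label)  # ['frame_002', 'frame_003']
```

In this toy run the annotation budget goes to the two frames where the model's probability mass is most spread out, while the confidently classified frame is skipped.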
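
The reward-model step in RLHF described earlier can also be sketched briefly. The code below assumes a Bradley-Terry style pairwise loss, a common choice for reward models that the article does not itself specify. A human ranking is expanded into (chosen, rejected) pairs, and the loss is small when the reward model scores the preferred output higher. The helper names and example scores are hypothetical.

```python
import math
from itertools import combinations

def pairs_from_ranking(ranked_outputs):
    """Expand a best-to-worst human ranking into (chosen, rejected) pairs."""
    return list(combinations(ranked_outputs, 2))

def pairwise_loss(reward_chosen, reward_rejected):
    """Compute -log(sigmoid(r_chosen - r_rejected)) in a numerically stable way."""
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# A hypothetical ranking of three model outputs, best first.
pairs = pairs_from_ranking(["answer_a", "answer_b", "answer_c"])
print(pairs)

# Assumed reward-model scores: the loss falls as the preferred output
# is scored further above the rejected one.
print(pairwise_loss(2.0, 0.0))  # reward model agrees with the human ranking
print(pairwise_loss(0.0, 2.0))  # reward model disagrees, so the loss is larger
```

Minimizing this loss over many ranked pairs is what pushes a reward model toward the annotators' preferences. The policy-optimization stage that follows is outside the scope of this sketch.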
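
Finally, the trace-level evaluation of agentic AI discussed above implies a labeling schema that records a judgment for each step as well as for the final outcome. The sketch below shows one possible shape for such a schema. Every class, field, and tool name in it is an assumption made for illustration, not a format used by Encord or any specific lab.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StepLabel:
    """A reviewer's judgment of one step in an agent trace."""
    step_index: int
    kind: str                  # "reasoning" or "tool_call"
    correct: bool              # was the step appropriate given the state so far?
    tool_name: Optional[str] = None
    note: str = ""

@dataclass
class TraceLabel:
    """Per-step judgments plus the outcome label for a whole trace."""
    trace_id: str
    outcome_success: bool
    steps: List[StepLabel] = field(default_factory=list)

    def process_score(self) -> float:
        """Fraction of steps judged correct, independent of the outcome."""
        if not self.steps:
            return 0.0
        return sum(step.correct for step in self.steps) / len(self.steps)

    def first_error(self) -> Optional[int]:
        """Index of the first step judged incorrect, or None if there is none."""
        for step in self.steps:
            if not step.correct:
                return step.step_index
        return None

# A hypothetical trace that reached the right answer despite one bad tool call.
label = TraceLabel(
    trace_id="trace-0001",
    outcome_success=True,
    steps=[
        StepLabel(0, "reasoning", True),
        StepLabel(1, "tool_call", True, tool_name="search"),
        StepLabel(2, "tool_call", False, tool_name="calculator", note="wrong units"),
        StepLabel(3, "reasoning", True),
    ],
)

print(label.outcome_success, label.process_score(), label.first_error())
```

Keeping the outcome label separate from the per-step labels lets reviewers flag traces that succeeded by luck or failed despite sound reasoning. An outcome-only label would miss both cases.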