Data Quality Is Top AI Hurdle, Survey Says

A survey by Qlik and IDC found that data quality is the primary obstacle in AI implementation. According to the results, 35% of respondents cited data quality issues as their core hurdle, underscoring the foundational role of data in successful AI initiatives.

- Reinforcement Learning from Human Feedback (RLHF) is a critical technique for aligning large language models with human values, but its effectiveness is entirely dependent on the quality of human annotations. This process involves costly and time-consuming data collection from trained human labelers who provide preference data to train a reward model. The demand for high-quality, nuanced feedback has shifted the data labeling market away from a gig-economy model towards the use of domain experts like coders, lawyers, and doctors.
- Constitutional AI, a technique developed by Anthropic, aims to make AI systems helpful and harmless by using a predefined set of principles, or a "constitution," to guide the model's behavior. This method reduces the reliance on extensive human feedback for safety by training the model to critique and revise its own outputs based on these principles, a process known as Reinforcement Learning from AI Feedback (RLAIF). However, this approach requires a robust data governance infrastructure to manage and audit the constitutional rules and AI decisions.
- As high-quality text data from traditional sources like books and the internet becomes exhausted, AI labs are increasingly turning to synthetic data to train models. While Gartner predicts that 60% of all data used in AI will be synthetic by 2030, ensuring its quality and relevance requires rigorous validation against real-world data to avoid transferring biases.
- The evaluation of agentic AI systems, which can reason, plan, and use tools, requires a shift from static benchmarks to assessing multi-step task completion and behavioral reliability. New benchmarks like AgentBench and WebArena are emerging to test these capabilities in realistic scenarios, focusing on metrics such as task success rate, tool selection accuracy, and error handling.
- The fundraising environment for AI startups has become more selective, with investors prioritizing ventures with clear products and scalable technology. While AI startups raised a third of all venture capital in 2024, the bar for securing funding, particularly at later stages, has risen. Seed-stage AI startups, however, saw valuations 42% higher than their non-AI counterparts.
- A successful go-to-market strategy for B2B AI startups involves moving beyond traditional sales funnels to an "intelligence orchestrator" model that uses AI for continuous market analysis and personalized messaging. This requires a deep alignment between marketing and sales on revenue processes before implementing AI tools to avoid amplifying existing inefficiencies.
- The future of work in the AI era will see a continued demand for human data labelers, particularly those with specialized domain expertise, to handle complex and nuanced annotation tasks. While automation will assist with repetitive tasks, human intelligence will remain crucial for ensuring the quality and ethical grounding of AI systems, with the data labeling industry projected to become a multi-billion dollar sector.
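
The agent-evaluation metrics named in the list above (task success rate, tool selection accuracy, and error handling) can be made concrete with a small sketch. The `Step` and `Episode` records, their field names, and the `summarize` function are hypothetical, chosen for illustration only; they are not the format used by AgentBench or WebArena, and "error recovery rate" is just one possible way to quantify error handling.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    expected_tool: str       # tool a reference solution would call (assumed label)
    chosen_tool: str         # tool the agent actually called
    errored: bool = False    # the tool call failed or returned an error
    recovered: bool = False  # the agent went on to recover from that error


@dataclass
class Episode:
    steps: List[Step]
    succeeded: bool          # the multi-step task reached its goal


def _ratio(numerator: int, denominator: int) -> float:
    # Report 0.0 instead of dividing by zero when there is nothing to measure.
    return numerator / denominator if denominator else 0.0


def summarize(episodes: List[Episode]) -> Dict[str, float]:
    steps = [s for e in episodes for s in e.steps]
    errors = [s for s in steps if s.errored]
    return {
        # Share of tasks completed end to end.
        "task_success_rate": _ratio(sum(e.succeeded for e in episodes), len(episodes)),
        # Share of steps where the agent picked the expected tool.
        "tool_selection_accuracy": _ratio(
            sum(s.chosen_tool == s.expected_tool for s in steps), len(steps)
        ),
        # Share of failed tool calls the agent recovered from.
        "error_recovery_rate": _ratio(sum(s.recovered for s in errors), len(errors)),
    }


# Illustrative data only, not results from any real benchmark run.
episodes = [
    Episode(
        steps=[
            Step("search", "search"),
            Step("calculator", "search", errored=True, recovered=True),
        ],
        succeeded=True,
    ),
    Episode(steps=[Step("browser", "browser", errored=True)], succeeded=False),
]
report = summarize(episodes)
print(report)
```

Scoring at the step level as well as the task level is the point of the shift away from static benchmarks: an agent can finish a task while still choosing the wrong tool along the way, and a single pass/fail score would hide that.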
