Study: Org Barriers Hamper Data Modeling

A new analysis from Practical Data Modeling finds that 89% of engineers struggle with organizational, rather than technical, barriers to effective data modeling. The main issues identified are poorly defined roles, a lack of clear ownership, and too little investment in data literacy outside engineering. The 11% of engineers who are thriving work in organizations where data modeling is a shared, collaborative responsibility.

- A "Data Mesh" is a decentralized data architecture that addresses organizational bottlenecks by giving domain-specific teams ownership of their data. This approach, in contrast to traditional centralized models like data lakes, treats data as a product and provides teams with self-service platforms to manage and share their data.
- The role of a Data Product Manager (DPM) is emerging to bridge the gap between business needs and technical data teams. DPMs are responsible for the entire lifecycle of data products, from development to ensuring they align with business goals and are built on reliable infrastructure.
- Insufficient data literacy across an organization can lead to a significant gap between business teams and data experts, with data professionals often getting stuck handling ad-hoc reporting requests. Establishing a common data vocabulary and providing training with real-world examples can help bridge this divide and improve decision-making.
- Organizational structure directly impacts the effectiveness of data teams, with common models including centralized, embedded, and federated approaches. A centralized team can standardize practices and ensure governance, but may create bottlenecks, while decentralized models increase agility.
- In modern data stacks, tools like dbt (data build tool) promote a layered approach to data modeling, typically consisting of staging, intermediate, and marts layers. Best practices for dbt include using modular models, clear naming conventions, and robust testing to prevent bad data from moving downstream.
- MLOps, or Machine Learning Operations, applies DevOps principles to the machine learning lifecycle to streamline the deployment, monitoring, and maintenance of models in production. This practice fosters collaboration between data scientists and operations teams and includes key components like data management, model training, and continuous monitoring.
- A significant challenge in enterprise data modeling is the lack of clear ownership, which can lead to models degrading over time as updates are missed and trust in the data erodes. Assigning a dedicated owner to a data model ensures its continued maintenance and alignment with business objectives.
- Gartner predicts that through 2026, up to 60% of AI projects will be abandoned due to a lack of AI-ready data and foundational data-modeling infrastructure. This highlights the critical need for well-designed and scalable data models to support successful AI and machine learning initiatives.
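
The layered approach described in the dbt item above can be illustrated with a rough sketch. This is plain Python standing in for what dbt projects normally express as SQL models, not dbt code itself. Every name here (`raw_orders`, `stg_orders`, `int_customer_totals`, `mart_top_customers`, `check_not_null_and_unique`) is hypothetical, and the check function only loosely mirrors the idea behind dbt's `not_null` and `unique` tests: stop bad rows before they reach downstream layers.

```python
# Illustrative only: plain Python mimicking staging -> intermediate -> marts.
# All table, column, and function names are hypothetical.

raw_orders = [
    {"ID": "1", "Customer": " alice ", "amount_cents": "1250"},
    {"ID": "2", "Customer": "bob", "amount_cents": "800"},
    {"ID": "3", "Customer": "alice", "amount_cents": "450"},
]


def stg_orders(rows):
    """Staging layer: rename, cast, and trim only; no business logic."""
    return [
        {
            "order_id": int(r["ID"]),
            "customer": r["Customer"].strip().lower(),
            "amount": int(r["amount_cents"]) / 100,
        }
        for r in rows
    ]


def check_not_null_and_unique(rows, key):
    """Test gate: raise before data moves downstream if the key is bad."""
    values = [r.get(key) for r in rows]
    if any(v is None for v in values):
        raise ValueError(f"{key} contains nulls")
    if len(values) != len(set(values)):
        raise ValueError(f"{key} is not unique")
    return rows


def int_customer_totals(orders):
    """Intermediate layer: a reusable aggregation shared by several marts."""
    totals = {}
    for o in orders:
        totals[o["customer"]] = totals.get(o["customer"], 0.0) + o["amount"]
    return totals


def mart_top_customers(totals, min_total):
    """Marts layer: a business-facing view, here customers above a threshold."""
    rows = [{"customer": c, "total": t} for c, t in totals.items() if t >= min_total]
    return sorted(rows, key=lambda r: r["total"], reverse=True)


# Each layer reads only from the layer before it, and the check sits in between.
staged = check_not_null_and_unique(stg_orders(raw_orders), "order_id")
mart = mart_top_customers(int_customer_totals(staged), min_total=10.0)
print(mart)
```

Keeping each layer as a small function with one job is the modularity the item refers to. The prefixes `stg_`, `int_`, and `mart_` are one possible naming convention, used here as an assumption rather than a requirement.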

Shared from Scout - Be the smartest in the room.