Every year, poor data quality costs organizations around $12.9 million according to Gartner.
In 1983, an Edmonton-bound Air Canada flight had to make an emergency landing in Manitoba, Canada. The reason? A metric conversion mix-up had led the ground crew to load only half the fuel required to reach Edmonton. In another incident, in 1999, NASA lost its $125 million Mars Climate Orbiter because one engineering team used English units while another used metric when exchanging vital navigation data.
Both these incidents point to the need for data standardization and high-quality data. Unfortunately, the outlook today is not much better: many organizations still lose millions of dollars every year due to poor data quality.
From duplicate records to inaccurate and inconsistent values, data quality issues are manifold.
While there are solutions available for addressing data quality, they are far from ideal as they mostly take a reactive approach. Moreover, these solutions predominantly rely on available metadata and rules. It’s no surprise then that companies are still reeling from the impact of poor data quality, such as:
Revenue loss: According to the Gartner report cited above, poor data quality costs organizations an average of $12.9 million every year.
Missed opportunities: According to a Forrester report, 21 cents of every dollar spent on media is wasted due to poor data quality.
Inaccurate decisions: Per SnapLogic research, 76% of IT decision-makers believe revenue opportunities have been missed due to a lack of accurate data insights, and 77% do not trust the data on which they base their decisions.
Improve the quality of enterprise data with AI/ML by augmenting the people, processes, and technology around data quality management.
Start by identifying data quality issues and taking corrective measures. It's best to address these issues as close to the data's origin as possible. Although this seems like a simple, straightforward ask, it often isn't, as explained below.
An organization can have several sources of metadata and data quality rules. The main challenge lies in gathering only the most relevant and accurate of them, and business subject matter experts are the most reliable people to gather this information from. It is a complex undertaking that requires defining and implementing a framework supported by people, processes, and technology. It also calls for niche skills and know-how in data quality and metadata management, work that is often confined to teams that are strictly IT functions.
This slows down the implementation of data quality strategies. It is therefore important to democratize data quality management: build solutions that allow all relevant users to contribute to tackling existing data challenges and creating high-quality data. Companies can also leverage ML techniques for data quality management, as sketched below.
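As one illustration of such an ML technique, the minimal sketch below uses scikit-learn's IsolationForest to flag records that look anomalous so that data stewards can review them rather than letting them silently enter downstream systems. The sample table, column names, and contamination threshold are illustrative assumptions, not part of any specific product or solution.

```python
# Minimal sketch: flag suspect records with an unsupervised model.
# Assumes a pandas DataFrame of numeric attributes (illustrative data).
import pandas as pd
from sklearn.ensemble import IsolationForest

records = pd.DataFrame({
    "order_value": [120.0, 95.5, 101.2, 88.0, 9_999_999.0],  # last value looks suspect
    "items":       [3,     2,    3,     1,    4],
})

model = IsolationForest(contamination=0.2, random_state=42)
# fit_predict returns -1 for likely outliers and 1 for inliers
records["suspect"] = model.fit_predict(records[["order_value", "items"]]) == -1

# Route flagged rows to data stewards for review instead of failing the whole load
print(records[records["suspect"]])
```

The point is not the specific model but the workflow: anomalies are surfaced automatically, while people close to the business decide what is genuinely wrong.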
Pivot from a top-down to a bottom-up approach to define data quality rules.
It’s time to rethink the way data quality rules are defined. In the traditional top-down approach, subject matter experts define the rules based on their domain and business know-how, making it a time-consuming process. Instead, companies can adopt a bottom-up approach in which data quality rules are generated using AI/ML to automatically discover metadata and relationships from the existing data. The result? Faster turnaround in creating data quality rules, which can then be compiled into a rules repository that’s accessible to the wider organization.
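A bottom-up pass of this kind can start with simple profiling of existing data to propose candidate rules. The sketch below, written with pandas, derives not-null, uniqueness, and range rules from a sample table; the column names and thresholds are assumptions made for illustration, and in practice the proposed rules would be reviewed by subject matter experts before entering the repository.

```python
# Hypothetical sketch: derive candidate data quality rules bottom-up
# by profiling an existing DataFrame. Thresholds are illustrative.
import pandas as pd

def discover_rules(df: pd.DataFrame) -> list[dict]:
    rules = []
    for col in df.columns:
        series = df[col]
        # Completeness: propose a not-null rule if the column is mostly populated
        if series.notna().mean() > 0.95:
            rules.append({"column": col, "rule": "not_null"})
        # Uniqueness: propose a key constraint if values never repeat
        if series.is_unique:
            rules.append({"column": col, "rule": "unique"})
        # Range: propose min/max bounds for numeric columns
        if pd.api.types.is_numeric_dtype(series):
            rules.append({"column": col, "rule": "range",
                          "min": float(series.min()), "max": float(series.max())})
    return rules

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "age": [34, 29, 41, 55],
    "country": ["IN", "US", "US", "DE"],
})
for rule in discover_rules(customers):
    print(rule)  # candidate rules go to SMEs for review before entering the repository
```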
With open data quality rules, users can embed them in application source code, data pipelines, and data quality management solutions. What’s more, this will enable data scientists to focus on building more accurate ML models without worrying about data quality issues. Above all, this approach will help enterprises make accurate, data-driven decisions that yield the expected results.
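For example, a pipeline step could load rules from such a shared repository and validate each incoming batch before it reaches downstream consumers. The rule format and the validate helper below are hypothetical, mirroring the rules generated in the earlier sketch, and are only meant to show the shape of such a check.

```python
# Illustrative sketch: apply rules from a shared repository inside a pipeline step.
import pandas as pd

RULES = [
    {"column": "customer_id", "rule": "not_null"},
    {"column": "customer_id", "rule": "unique"},
    {"column": "age", "rule": "range", "min": 0, "max": 120},
]

def validate(df: pd.DataFrame, rules: list[dict]) -> list[str]:
    failures = []
    for r in rules:
        col = df[r["column"]]
        if r["rule"] == "not_null" and col.isna().any():
            failures.append(f"{r['column']}: null values found")
        elif r["rule"] == "unique" and not col.is_unique:
            failures.append(f"{r['column']}: duplicate values found")
        elif r["rule"] == "range" and not col.between(r["min"], r["max"]).all():
            failures.append(f"{r['column']}: values outside [{r['min']}, {r['max']}]")
    return failures

batch = pd.DataFrame({"customer_id": [1, 2, 2], "age": [34, 29, 240]})
problems = validate(batch, RULES)
if problems:
    # Fail fast (or quarantine the batch) before bad data reaches downstream models
    raise ValueError("Data quality checks failed: " + "; ".join(problems))
```

Because the same rules live in one open repository, the checks stay consistent whether they run in an ingestion pipeline, an application, or a dedicated data quality tool.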