Data quality gives a competitive edge. Everybody agrees how important good data quality is. And everybody has agonized over erroneous data. We've all lost a lot of time working with crappy data, and "Garbage In, Garbage Out" is probably the most commonly cited proverb in IT. Then how come it is always so hard to find volunteers to do something about it?

Because the consequences of non-quality data are propagated throughout the organization, one seemingly innocent problem upstream can easily cause a dozen problems downstream, and sometimes even more! The accumulated costs of dealing with the resulting errors can become staggering. Tackling and resolving the issues that cause data quality problems is one of the most high-leverage investments a company can make in a world that relies increasingly on digital information.

Why do these problems exist, and why do they live on? It often appears to be business misalignment of the worst kind: many 'bystanders' realize there are indeed data problems, but nobody "owns" these problems. This commonly recurring phenomenon lies at the heart of the omnipresent challenge to find resources (both money and time) to overcome such data quality problems.

1. What is data quality?

Data quality is determined not only by the accuracy of data, but also by relevance, timeliness, completeness, trust and accessibility (Olson, 2003). All these "qualities" need to be attended to if a business wants to improve its competitive advantage and make the best possible use of its data. Data quality implies fitness for use, including unanticipated future use. Accuracy takes up a special place because none of the others matter at all if the data is inaccurate to begin with! All other qualities can be compromised, albeit at your peril.

2. Data non-Quality is expensive

"Reports from the Data Warehousing Institute on data quality estimate that poor-quality customer data costs US business a staggering $611 billion a year in postage, printing and staff overhead" (Olson, 2003). There are many ways in which non-quality data can cost money, and typically these costs remain largely hidden. Senior management either doesn't notice these costs or, even more likely, is grappling with problems that are never clearly traced back to poor-quality data.

3. Quantifying the cost of non-quality is very important

Since poor data quality has such a strong tendency to go unnoticed, it is even more important to translate the consequences of poor-quality data to the one dimension each and every manager understands so well: dollars. This also gives a perspective on the kinds of investments that are appropriate to make in order to resolve such issues. A mechanism for prioritizing improvement programs is desirable as well. You want to begin by picking the low-hanging fruit, but you certainly also want to know where the whoppers are! According to Gartner, Fortune 1000 enterprises may lose more money in operational inefficiency due to data quality issues than they spend on Data Warehouse and CRM initiatives.

4. Data quality issues typically arise when existing data are used in new ways

In my experience as a data miner, where I am very often looking for new ways of using existing data, this is where many problems originate. The data itself hasn't changed, but the new uses for existing data expose problems that were already there. So what constitutes "data quality" needs to be considered in relation to the intended use of the data. A change of usage brings up new ways to evaluate quality, and hence may bring up new concerns. The reason these problems didn't surface before is usually that the business adapted to the data as they are. People and processes avoided the consequences of inaccurate entries. This, incidentally, is also why legacy system migrations can be so painful.

5. Many CRM projects collapse under data quality issues

Gartner and Forrester have estimated that 60-70% of CRM implementations fail to deliver on expectations. That is not to say that these projects are all abandoned halfway; it's mostly that expectations aren't met. One of the biggest reasons for the 'technical' challenges in bringing CRM projects to completion is that disparate data sources are merged to create a 360° customer view. Often, this is the first time that customer records from disparate systems are merged. There is typically tremendous "fallout", and the records that do get merged contain many inconsistencies. This then often leads to disappointed end-users and unmet expectations.

6. Data quality is a management issue, not a technology issue

The typical situation in the overwhelming majority of organizations I have visited is like this:
- there is low awareness of the embedded cost of their data quality issues
- management has no idea of the potential value in fixing data quality issues "upstream"
- those who have insight into data quality issues have little or no incentive to bring these issues out

Hence, the problems have a nasty habit of perpetuating themselves. For sure, subordinates need to carry their weight and take responsibility. But notice how, for all three of these issues, the final responsibility for bringing these "unwelcome surprises" out in the open essentially lies with management. What is the culture like in your company? My experience has been that managers may or may not be motivated to bring such issues out in the open, sometimes depending on the time horizon they consider for their own tenure.

7. Manage data for what it is: a strategic resource

Data is not merely a byproduct of business processes, but something that has value beyond its immediate processes. Finding new uses for existing data makes it more valuable, at no capital investment! Future changes to the way the data are to be used cannot be predicted, yet are guaranteed to happen! This proliferation of data usage needs to be anticipated, and calls for flexible data models. Good database design is resilient in the face of unanticipated changes. On the tangible side, this means flexibility in hardware and infrastructure (avoid vendor or platform lock-in). On the intangible side, you want to avoid aggregation or any other data commitment that cannot be reversed within the data scheme. It is fundamentally impossible to find a generic "right" way to aggregate inconsistencies in data. That is why flexibility calls for late commitments in the data model.

8. Higher quality data lead to far more flexibility for your corporate strategy

Fast access to accurate data not only gives a competitive advantage. What is even more important is the flexibility such companies enjoy in adjusting to changes in market conditions. So over time, as market changes occur, the gap with the competition can grow even further. Also, changes in legislation or market regulation can be much more easily exploited and turned into an opportunity rather than 'suffered'.

9. Data quality improvement is a process, not an event

In many ways, one can draw parallels between Total Quality Management efforts and the issues surrounding data quality. The Japanese use the word "Kaizen", which denotes both an incremental improvement method and a philosophy. What is crucial is that it's an on-going, never-ending effort to keep raising the bar. Data quality is never "perfect", as every new application of existing data is likely to bring up new issues. And the proliferation of data usage is not ending any time soon. So data quality issues are guaranteed to stay with us for a while.

10. Collecting data is only a few decades old

No wonder we're dealing with "growing pains". Few corporations actually planned their data strategy, and their IT infrastructure grew in a time when data were being handled in silos. As data are increasingly shared and warehoused, we need to think through the goals and objectives of the enterprise with regard to the data. This is all fairly new, and few if any 'established' standards exist. A sort of 'global plan' or 'road map' as to where and how to expand on existing capabilities is a sound investment to manage project risks. Also, this 'road map' needs to conform to the existing IT strategy. Time and money will only be invested if project goals are in line with the overall corporate strategies. The road is littered with unsuccessful BI projects, many of which started without a clear business case. A well-conceived data strategy greatly leverages the considerable investments that are needed to get the best mileage from your data.

We appreciate comments and feedback.
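
As an illustration of the prioritization idea in section 3 (translate each issue into dollars, pick the low-hanging fruit first, but also know where the whoppers are), here is a minimal Python sketch. Everything in it is an assumption made for the example: the issue names, the dollar figures, and the `payback_ratio` measure (yearly cost divided by one-off fix cost) are hypothetical and do not come from any of the cited reports.

```python
from dataclasses import dataclass


@dataclass
class DataQualityIssue:
    name: str
    annual_cost: float  # estimated yearly cost of living with the issue (hypothetical)
    fix_cost: float     # estimated one-off cost of fixing it upstream (hypothetical)

    @property
    def payback_ratio(self) -> float:
        # Dollars saved per year for every dollar spent on the fix.
        return self.annual_cost / self.fix_cost


def whoppers(issues):
    """Rank issues by how much they cost per year, largest first."""
    return sorted(issues, key=lambda issue: issue.annual_cost, reverse=True)


def low_hanging_fruit(issues):
    """Rank issues by payback ratio, best return on the fix first."""
    return sorted(issues, key=lambda issue: issue.payback_ratio, reverse=True)


# Made-up figures, for illustration only.
issues = [
    DataQualityIssue("duplicate customer records", 400_000, 250_000),
    DataQualityIssue("invalid postal codes", 60_000, 5_000),
    DataQualityIssue("missing product categories", 150_000, 50_000),
]

for issue in low_hanging_fruit(issues):
    print(f"{issue.name}: ${issue.annual_cost:,.0f} per year, "
          f"payback ratio {issue.payback_ratio:.1f}")
```

The two rankings generally differ: with these made-up numbers the cheapest fix tops the payback ranking while the most expensive problem tops the cost ranking, which is why it helps to look at both lists before committing budget.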