Of course no one would do that on purpose, but I as a consultant over many years I’ve often seen it. A vendor fulfills a contract to the letter, which unfortunately allows them to deliver required reports in various, sometimes changing, formats with suspect data quality. The customer company absorbs these costs, leaning on the data analyst to update PowerPoint decks on schedule before the next monthly management meeting in spite of the extra programming work.
These contracts have been for various goods and services, but almost every business contract today is also a contract for data. If a regional gas company hires a vendor to inspect residential lines, then I suspect it wants reports showing inspections conducted and results; a healthcare firm that sends nurses on house calls needs data detailing call schedules and results; and so on.
Companies that supply goods or provide services often don’t feature data management as a core competency, and the quality of their reporting often doesn’t match the quality of their goods or services. Someone in the customer organization has to code around every addition or omission of an expected Excel column, every “N/A” in a numeric field, and every unexpected change from imperial to metric units. Continue reading →
Data quality improvements follow specific, clear leadership from the top. Project leaders count data quality among project goals when senior management encourages them to do so with unequivocal incentives, a common business vocabulary, shared understanding of data quality principles, and general agreement on the objects of interest to the business and their key characteristics.
Poor data quality costs businesses about “$15 million per year in losses, according to Gartner.” AsTendü Yoğurtçuputs it, “artificial intelligence (AI) and machine learning algorithms are only as effective as the data they use.” Data scientistsunderstandthe difficulties well, as they spend over 70% of their time in data prep.
Recent studies report that data entry typos are the largest source of poor data quality (here and here). My experience says otherwise. From what I’ve seen, operational data is generally good, and data errors only appear when data changes context. In this post I’ll detail why data quality is management’s responsibility, and why data quality will remain poor until leadership makes it a priority.Continue reading →
It’s been a truism that data is a resource, but to prove it you just have to follow the money. As the illustration shows, the vast majority of corporate market value draws from intangible assets. Just as money is an abstraction that represents wealth, data is an abstraction that represents these intangible assets.
It’s year three after initial rollout of the Leader’s Data Manifesto (LDM). Since then, many widely publicized events have highlighted the value of data and metadata, and the importance of sound data management (here, here, and here). Recently at Enterprise Data World, John Ladley, Danette McGilvray, James Price, and Tom Redman presented this year’s LDM update. They reintroduced the Manifesto, recounted events of the past year, discussed strategy for the coming year, and issued a call to action for data professionals. Continue reading →
In data management and analytics, we often focus on correcting apparent inability and unwillingness on the part of business leaders to effectively gather and capitalize on data resources. With that perspective, we often see ethics as a side issue difficult to prioritize given the scale and persistence of our other challenges.
At least that was my perspective, and my initial response when confronted recently by a family member on this topic. Her view from outside the field was that ethics should be a primary concern. As I’ve reflected on this conversation, I’ve come around to her point.
In recent years we’ve seen many examples of data misuse due to ethical lapses. Here’s a post that gives five examples, including police officers looking up data on individuals not related to any police business, an employee passing personal data including SSNs to a text sharing site, and Uber’s “god view”, available at the corporate level, which an employee used in 2014 to track a journalist’s location. Continue reading →
Modern data architectures, by enabling data analytics insights, promise to drive order of magnitude value gains across many business sectors (here, here, and here). Not so long ago, big data presented a daunting challenge. Although tools were plentiful, we struggled to conceptualize the architecture and organization within which to capitalize on those tools. Now solid frameworks have emerged. This post reviews two promising models for modern data architecture, and discusses two key cultural values critical to their successful adoption: drive to solve business challenges and drive for universal data correctness. Continue reading →
A year ago I recounted proceedings from the 2017 EDW World conference, which included release of the Leader’s Data Manifesto (LDM). Last week’s EDW World 2018 served as a one-year status report on the Manifesto. The verdict: there’s still a long way to go, but speakers and attendees report dramatic progress and emergence of shared values supporting data management’s role in enabling success and reducing risk.
To me the most compelling example of progress was the story of the Lopez women, told by Tommie Lawrence, who leads patient data quality efforts at Sharp Healthcare, a major San Diego, Ca, healthcare network. Ms. Lawrence’s team is responsible for data quality related to about six million patient records in the 40 highest priority of Sharp’s ~400 systems containing Patient Health Information (PHI).
A few years ago, Sharp Healthcare had two patients named Maria Lopez*, with birthdays one day apart. One suffered from kidney disease, the other had cancer. After a long wait a kidney was found, and the hospital called the Maria with kidney disease and asked her to come to the hospital for a transplant immediately. During operation prep, an assistant noticed that Maria had cancer, and put a halt to proceedings – it didn’t make sense to give the kidney to someone with cancer. Continue reading →
In response to findings that many “safety-related events were caused by or related to incorrect patient identification”, the Office of the National Coordinator for Health Information Technology (ONC) worked with CMMI to develop the PDDQ Framework in order help organizations implement effective, sustainable data management practices around patient data management.
Groundbreaking? Yes. As a lean framework appropriate for small business the PDDQ shows how you can rightsize the Data Management Maturity Model to match your situation. That it is freely available demonstrates CMMI’s commitment to improving data quality in healthcare. Continue reading →
Even now the business case for a metadata tool seems unclear and difficult to quantify, but it isn’t impossible.
We in the data management business tend to devalue solutions that don’t clearly derive from a coherent top-level view. We seek applications defined from an enterprise architecture, database designs from an enterprise data model, and data elements consistent with the enterprise business glossary.
However, sometimes tactical gains make sense even when the big picture is missing, and tactical successes of metadata for analytics teams can raise consciousness that helps set the stage for evolving data management improvements. Continue reading →
Data quality doesn’t have to be a train wreck. Increased regulatory scrutiny, NoSQL performance gains, and the needs of data scientists are quietly changing views and approaches toward data quality. The result: a pathway to optimism and data quality improvement.
Here’s how you can get on the new and improved data quality train in each of those three areas: Continue reading →
Obviously, data management is important. Unfortunately, it is not prioritized in most organizations. Those that effectively manage data perform far better than organizations that don’t. Everyone who needs data to do his/her job must drive change to improve data management.
That was the theme of the recent Enterprise Data World (EDWorld) conference this week. This year’s EDWorld event might be the start of a new vitality and influence for the field, marked by introduction of a Leader’s Data Manifesto.
Over the years, data practitioners struggled for recognition and resources within their organizations. In reaction, they often focused on data “train wrecks” that this neglect causes. This year’s conference was no exception. For example: Continue reading →