Why pay good money for bad data?
Of course no one would do that on purpose, but in my many years as a consultant I’ve often seen it. A vendor fulfills a contract to the letter, which unfortunately allows them to deliver required reports in various, sometimes changing, formats with suspect data quality. The customer company absorbs these costs, leaning on the data analyst to update PowerPoint decks on schedule before the next monthly management meeting in spite of the extra programming work.
These contracts have been for various goods and services, but almost every business contract today is also a contract for data. If a regional gas company hires a vendor to inspect residential lines, then I suspect it wants reports showing inspections conducted and results; a healthcare firm that sends nurses on house calls needs data detailing call schedules and results; and so on.
Companies that supply goods or provide services often don’t feature data management as a core competency, and the quality of their reporting often doesn’t match the quality of their goods or services. Someone in the customer organization has to code around every addition or omission of an expected Excel column, every “N/A” in a numeric field, and every unexpected change from imperial to metric units.
The solution is for data standards to be “baked in” to contracts for goods and services, so I searched for references about the intersection between procurement and data governance. A few interesting links came up. There’s this two-part series (part 2) from Michael Bulman that provides a comprehensive view, this one from Malcolm Chisholm that focuses on data purchases, and this one from Lance Mercereau that focuses on data protection and governance.
As good as they are, the strategic tilt of these articles doesn’t help, in the near term, the data analyst who receives the third new Excel format this year from the gas inspection vendor, or the .csv file from the house call vendor that’s missing two required columns.
Here are my thoughts on steps companies can take to ensure timely, consistent data delivery from vendors:
- Understand your data needs: Identify what you need to know in order to manage the vendor’s goods or services, and to integrate with other business processes in your company. It’s best to think in terms of raw data about each individual widget or service item. Your department’s reporting might only use summary gas line inspection results, but the maintenance department might need full content of the inspection form in order to set up a service call. You might make images or data transcriptions of inspection forms part of your RFP’s requirements for inspection services.
- Define your data standards: In addition to defining what data you need, define how you want it delivered, how frequently, and how good it has to be. What’s your desired format: Excel, CSV, pipe delimited, fixed format, or something else? How should transmissions be named and delivered? Which columns or fields do you require, and what is the data type of each? What are the expected values in each field, and what’s your tolerance for data outside the expected values?
- Negotiate data incentives into the contract: Bad data costs you good money. After defining what you need and how you want it, work to structure the contract so that the vendor pays for bad data. Large organizations may have the luxury of imposing costs for non-compliance with data requirements. Those with less influence may have to offer bonuses for good data. Consider the labor costs of errors, the opportunity costs of forgone analytics, and the costs of straightening out a persistent data quality error. Add those factors to your negotiation, with the goal of achieving a deal that ensures data compliance.
- Partner with your vendor to maintain data quality: Some vendors will have trouble meeting quality standards. Define a channel for communications between your data analyst and vendor data prep staff. As the relationship proceeds, keep the lines open and resolve problems as they arise. For example, if the vendor sends over a spreadsheet missing a column, get in touch with them quickly so they have a chance to resubmit.
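Once a data standard is written down, checking each delivery against it can be automated. Here is a minimal Python sketch of that idea, not a definitive implementation. The column names, the pressure range, the allowed result codes, and the 5% bad-row tolerance are all hypothetical placeholders for whatever your contract actually specifies, and the functions `check_row` and `validate` are names I made up for illustration.

```python
import csv
import io

# Hypothetical data standard for a gas line inspection file. Every column
# name, range, and allowed value here is an illustrative assumption.
STANDARD = {
    "inspection_id": {"type": str},
    "inspection_date": {"type": str},
    "pressure_psi": {"type": float, "min": 0.0, "max": 100.0},
    "result": {"type": str, "allowed": {"PASS", "FAIL"}},
}
MAX_BAD_ROW_RATE = 0.05  # assumed tolerance: at most 5% of rows may fail


def check_row(row, rules):
    """Return a list of problems found in one parsed CSV row."""
    problems = []
    for col, rule in rules.items():
        raw = (row.get(col) or "").strip()
        if not raw:
            problems.append(f"{col}: empty")
        elif rule["type"] is float:
            try:
                value = float(raw)
            except ValueError:
                # Catches text such as "N/A" in a numeric field.
                problems.append(f"{col}: not numeric ({raw!r})")
                continue
            low = rule.get("min", float("-inf"))
            high = rule.get("max", float("inf"))
            if not low <= value <= high:
                problems.append(f"{col}: {value} outside {low}..{high}")
        elif "allowed" in rule and raw not in rule["allowed"]:
            problems.append(f"{col}: unexpected value {raw!r}")
    return problems


def validate(text, standard=STANDARD, max_bad_row_rate=MAX_BAD_ROW_RATE):
    """Compare one CSV delivery (as a string) against the data standard."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in standard if c not in header]
    unexpected = [c for c in header if c not in standard]
    # Only check values for columns that actually arrived.
    present = {c: r for c, r in standard.items() if c in header}

    row_problems = []
    rows = 0
    # Line 1 is the header; numbering assumes no blank lines in the file.
    for line_no, row in enumerate(reader, start=2):
        rows += 1
        problems = check_row(row, present)
        if problems:
            row_problems.append((line_no, problems))

    bad_rate = len(row_problems) / rows if rows else 0.0
    return {
        "missing_columns": missing,
        "unexpected_columns": unexpected,
        "rows": rows,
        "row_problems": row_problems,
        "accepted": (not missing and not unexpected and rows > 0
                     and bad_rate <= max_bad_row_rate),
    }
```

Calling `validate` on the text of a delivered file returns a small report. The lists of missing columns and row problems can go straight back to the vendor’s data prep staff as a resubmission request, which is the kind of quick feedback the partnering step above depends on.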
While no solution to overall data governance challenges, these tactical steps can help an organization improve the quality and timeliness of vendor reporting, and free up data analysts for analytics instead of the high-pressure drudgery of changing ETL streams.