In response to findings that many “safety-related events were caused by or related to incorrect patient identification”, the Office of the National Coordinator for Health Information Technology (ONC) worked with CMMI to develop the PDDQ Framework in order help organizations implement effective, sustainable data management practices around patient data management.
Groundbreaking? Yes. As a lean framework appropriate for small business the PDDQ shows how you can rightsize the Data Management Maturity Model to match your situation. That it is freely available demonstrates CMMI’s commitment to improving data quality in healthcare. Continue reading →
Even now the business case for a metadata tool seems unclear and difficult to quantify, but it isn’t impossible.
We in the data management business tend to devalue solutions that don’t clearly derive from a coherent top-level view. We seek applications defined from an enterprise architecture, database designs from an enterprise data model, and data elements consistent with the enterprise business glossary.
However, sometimes tactical gains make sense even when the big picture is missing, and tactical successes of metadata for analytics teams can raise consciousness that helps set the stage for evolving data management improvements. Continue reading →
Even though it happens annually, teams building new visualizations often forget to think about the effects of turning over from one year to another.
In today’s fast paced, Agile world, requirements for even the most critical dashboards and visualizations tend to evolve, and development often proceeds iteratively from a scratchpad sketch through successively more detailed versions to release of a “1.0” production version. Organized analytics teams evolve dashboards within a process framework that include checkpoints ensuring standards are met for security, reliability, usability, and so on.
A reporting team can build a revolutionary analytics capability enabling unprecedented visibility into operations, and then, if year turnover isn’t included in requirements, experience embarrassing errors and usability challenges in the January after initial deployment. In effect, the system experiences its own Y2.xK crisis, not too different from the expected Y2K crisis 16 years ago. Continue reading →
There’s an unfortunate and rather rude saying about assumptions that I’ve found popular among IT folks I’ve worked with. I say unfortunate because, to me, assumptions that are recognized early and handled the right way are a key to successful projects. Technical players who use assumptions well can help set projects on the right path long before they go astray.
Sometimes on waterfall and hybrid projects technical players are asked to estimate work early, before requirements are complete. My instinctive reaction is not to provide an ungrounded estimate, but that’s not helpful. The way to handle this uncomfortable uncertainty is to fill out the unknowns with assumptions: detailed, realistic statements that provide grounding for your estimate. Continue reading →
There’s consensus among data quality experts that, generally speaking data quality is pretty much bad (here, here, and here). Data quality approaches generally focus on profiling, managing, and correcting data after it is already in the system. This makes sense in a data science or warehousing context, which is often where quality problems surface. To quote William McKnight at the first of those sources:
“Data quality is no longer the domain of just the data warehouse. It is accepted as an enterprise responsibility. If we have the tools, experiences, and best practices, why, then, do we continue to struggle with the problem of data quality?”
So if the data quality problem is Garbage In Garbage Out (GIGO), then I would think that it would be easy to find data quality guidelines for app dev, and that those guidelines would be lightweight and helpful to those projects. Based on my research there are few to none such sources (please add them to the comments if you find otherwise).
So, all that said here’s my cut at app dev data quality guidelines by project activity: Continue reading →
Logical data modeling is one of my tools of choice in business analysis and requirements definition. That’s not particularly unusual – the BABOK (Business Analysis Body of Knowledge) recognizes the Entity-Relationship Diagram (ERD) as a business analysis tool, and for many organizations it’s a non-optional part of requirements document templates.
In practice, however, data models in requirements packages often include many-to-many relationships. I’ve heard experienced data modelers advocate this practice, and it unfortunately seems consistent with the “just enough, just in time” approach associated with agile culture.
In my experience unresolved M:M relationships indicate equally unresolved business questions. The result: schedule delays and budget overruns as missed requirements are built back in to the design, or the familiar “that’s not what we wanted” reaction during User Acceptance Testing (UAT). Continue reading →
I recently stumbled upon one of The Martin Agency’s hilarious Geico caveman ads and wondered, rather geekily, why they didn’t do one about data analysis. I think if a caveman suddenly arrived in the 2010s he or she would see parallels between his life and the activities of today’s knowledge worker. When I thought it through, it seemed obvious that knowledge workers need to be more like farmers and less like hunter/gatherers if they want to achieve the full potential of business intelligence.
Recently I was in a conversation about data modeling standards. I confess that I’m not really the standards type. I understand the value of standards and especially how important it is to follow them so others can interpret and use work products. It is just that I prefer to focus on understanding of the principles behind the standards. In general, it seems to me that following standards is trivial for someone who understand the principles, but impossible for someone who doesn’t. But there doesn’t seem to be general understanding of data modeling principles. Continue reading →
In the past I’ve never understood what people really mean they say “think outside the box” but Jim Harris, in a recent OCDQ blog post, helped me figure it out.
Mr. Harris ends with this provocative line: “the bottom line is Google and Facebook have socialized data in order to capitalize data as a true corporate asset.” The post starts with a cold war analogy and proceeds to describe how Facebook and Google have made big money as “internet advertising agencies:” offering free services with which users (like us) serve up personal data in return for use of the service, then selling advertising space based on our data (hopefully anonymized).
I’m a data modeler, so I enjoyed Jonathon Geiger’s recent article entitled “Why Does Data Modeling Take So Long”. But why does he say it like it’s a bad thing?
Mr. Geiger’s bottom line is exactly right: “Most of the time spent developing data models is consumed developing or clarifying the requirements and business rules and ensuring that the data structure can be populated by the existing data sources.” On the projects he describes, no one took time before modeling to determine available data sources and identify business entities of interest, relationships among them, and attributes that describe them before database design started, so the data modeler had to do it.