As a relational database professional I couldn’t help but feel like something would be lost with the emergence of the new Big Data/NoSQL database management systems (DBMS). After about two years of buzz around the topic, I’m really excited about the emerging possibilities. However, I’m pretty sure we’ll miss the relational model’s strengths in requirements definition and conceptual design. Continue reading »
Data quality in most large organizations is commonly known to be rather lacking. Most would argue that things haven’t gotten much better since this 2007 Accenture study found that “Managers Say the Majority of Information Obtained for Their Work Is Useless”. To some, quotes like that are shocking, but if you think about how information is processed in most Fortune 1000 sized organizations it is surprising that data available to managers is as good as it is. These slides have been useful in my efforts to explain the persistence of data quality problems in large organizations. Continue reading »
QlikTech’s QlikView reporting and analysis tool is among a new class of Business Intelligence (BI) software tools. As Ben Harden reported in a recent blog post, BI vendors like SAP, Microsoft, and IBM have traditionally sold “to the IT enterprise, but companies like QlikTech and Tableau are targeting the business and bypassing IT. Their tools are quicker to stand up, more intuitive and don’t need the configuration, support, and hardware that the bigger players require.”
A Quick Overview
At first look QlikView is fairly accessible to those experienced with BI tools. A “.qvw” QlikView file contains three classes of user-facing components: a script-based data integration language that runs when the user requests a “reload”, a data modeling component that looks deceptively like a relational data modeling tool, and a familiar array of data visualizations: graphics, charts, lists, etc.
I’ve posted a couple of articles at my company’s blog site that reflect my view on data quality efforts:
- Yes, there is a business case for improving data quality, and I’ve got real business value examples. If you look for real money where you anecdotally know there are data quality problems, you’ll likely find it in high costs of data correction and rework, and savings related to business process improvements that reliable data enables.
- There are distinct things an organization can do to reap the benefits of improved data management and data quality: (1) get started in the first place, (2) find the tangible benefits, (3) cross the departmental silos that exist in every large organization, and (4) promote sound data management practices.
I’m a data modeler, so I enjoyed Jonathon Geiger’s recent article entitled “Why Does Data Modeling Take So Long”. But why does he say it like it’s a bad thing?
Mr. Geiger’s bottom line is exactly right: “Most of the time spent developing data models is consumed developing or clarifying the requirements and business rules and ensuring that the data structure can be populated by the existing data sources.” On the projects he describes, no one took time before database design started to determine the available data sources or to identify the business entities of interest, the relationships among them, and the attributes that describe them, so the data modeler had to do it.
I recently completed ScrumMaster training, ably presented by Lyssa Adkins. Throughout the two-day class we appreciated Lyssa’s Zen-like, enabling style. If her name is familiar, it’s because Ms. Adkins is the author of the book Coaching Agile Teams, one of the leading texts on the subject.
I’ve participated on agile projects, but so far only in a piggish/chickenish role, once in a three-week stint as a consulting architect and twice as the project manager serving as interface to the non-agile organization.
To me, Ms. Adkins rocks at making students introspective about, and critical of, their past project experiences. These lessons stand out:
It is really bad, according to a recent survey by the Ponemon Institute (available here with registration). The white paper, entitled Health Data at Risk in Development: A Call for Data Masking, presents the results of a survey of 492 health care IT professionals on their companies’ practices regarding use of live personal health care data in application testing.
It makes a scary read. Here are the lowlights: Continue reading »
Who would want to be a national health care administrator? Who would want the responsibility for managing health care and formulating health policy for tens or hundreds of millions of people? It seems obvious that such decisions would rely on quality data. A recent interview impressed upon me how much data managers can learn from a field where data recording millions of separate life-and-death decisions is aggregated to support decisions on the future allocation of health care resources.
There’s a data explosion going on and perhaps the strangest result is that business intelligence analysts need to become more artistic.
Recently my friend Ben Harden directed my attention to a post from Steve Bennett of Oz Analytics on the future of BI. One challenge to analysts that Mr. Bennett cited was the unprecedented explosion in data quantity to “an almost inconceivable 35 trillion gigabytes” by 2020. Part of the solution, according to the post, is “actionable insight”, as illustrated by Harry Beck when he created the now-iconic map of the London underground network from the previous rather spaghetti-ish version. What Mr. Beck did was to distinguish significant from insignificant detail for the intended audience and present that detail in a clear and appealing way.
Recently there has been a long and very interesting discussion of do-it-yourself versus third-party metadata tools on LinkedIn’s TDWI BI and DW discussion forum (membership required to follow the link). I have followed the discussion without commenting, though I suppose I contributed when Information Management kindly published my article on DIY metadata.
The discussion is extremely informative, presenting the views of a variety of knowledgeable professionals in different situations, and describing successful and sometimes not-so-successful efforts to solve the essential metadata challenge: how to document what information is locked up in databases. Continue reading »