Tag Archives: Alignment

Data Architecture for Improved Dashboard Performance

Sometimes success seems like a data analytics team’s worst enemy. A few successful visualizations packaged up into a dashboard by a small skunkworks team can generate interest such that a year later the team has published scores of mission-critical dashboards. As their use spreads throughout the organization, and as features expand to meet the needs of an expanding user base, the dashboards can slow down and data refreshes can fail as they exceed database and analytics tool time and resource limits.

There are steps teams can take to deal with such slowdowns. Analytics tool vendors typically offer efficiency guides, like this one, that help resolve dashboard response time issues. A frequent recommendation is for the dashboard to use summary tables rather than full detail, reducing the amount of data that the dashboard has to parse as the user waits for a viz to render.*
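To make the summary-table idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`sales_detail`, `sales_daily_summary`, `region`, and so on) are hypothetical, chosen only for illustration, and a real warehouse would more likely refresh the summary incrementally than rebuild it from scratch.

```python
import sqlite3


def build_summary(conn: sqlite3.Connection) -> None:
    """Rebuild a daily summary table from the (hypothetical) detail table."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS sales_daily_summary;
        CREATE TABLE sales_daily_summary AS
        SELECT sale_date,
               region,
               COUNT(*)    AS order_count,
               SUM(amount) AS total_amount
        FROM sales_detail
        GROUP BY sale_date, region;
        """
    )


# In-memory database standing in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_detail ("
    "order_id INTEGER PRIMARY KEY, sale_date TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-01", "East", 100.0),
        (2, "2024-01-01", "East", 50.0),
        (3, "2024-01-01", "West", 75.0),
        (4, "2024-01-02", "East", 20.0),
    ],
)
build_summary(conn)

# The dashboard reads the small summary table instead of scanning every order.
summary = conn.execute(
    "SELECT sale_date, region, order_count, total_amount "
    "FROM sales_daily_summary ORDER BY sale_date, region"
).fetchall()
for row in summary:
    print(row)
```

The dashboard query now touches one row per day and region rather than one row per order, which is where the response time savings come from.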

Summary tables also help resolve data refresh timeouts, but their long-term success for the team depends on the foundation on which they are built and how they are organized. The most obvious approach is to build custom summaries serving each dashboard. While report-specific tables stand out as a quick win, analysis shows they are a suboptimal solution because they tend to (1) reduce the team’s ability to respond to evolving requirements, and (2) make metrics in different dashboards less consistent.

How to be a good client


I recently listened to Brian O’Neill’s excellent interview with Tom Davenport, headlined “Why on a scale of 1-10, the field of analytics has only gone from a one to about a two in ten years time.”

The conversation covered a lot of ground as Mr O’Neill and Mr Davenport explored the reasons why. Highlights included a general lack of technical literacy and the lack of a data-driven organizational culture. But to their credit, they took responsibility on behalf of analytics professionals, emphasizing how we in the field could change in order to make more analytics efforts successful. Rather than focusing on providing technology-centered solutions, they recommended that data and AI professionals seek first to understand and empathize with their clients or internal customers, enabling data and AI pros to develop more effective analytics capabilities in light of that understanding.

I agree that analytics professionals can improve their game. However, as a former consultant who’s switched over to the client side, I think there’s room for improvement all around. To me, clients who work proactively to prepare for an analytics project position themselves for better outcomes.

Prioritize data initiatives with the new Data Management Maturity Index

In my experience, data management is both a mission-critical and an undervalued capability. Perhaps recent customer data losses and regulatory initiatives like GDPR tend to raise the stock of data maturity efforts, but data management remains undervalued. For example, any Fortune 1000 firm building end-to-end processes finds that much of the cost goes to translating data from different systems that integrate into the process.

Today we have stage models like CMMI’s Data Management Maturity (DMM) model which, as I’ve written, help organizations assess their maturity level. However, the DMM model aims to assess data maturity at a single agency. It lacks mechanisms to compare multiple agencies or business functions, and therefore can be difficult to translate into prioritized plans for improvement.

Recently I participated, with Manoj Thomas, Joseph Cipolla, and Lemuria Carter, in a study introducing techniques for assessing relative data management maturity of different organizations, and different data management capabilities, within a larger enterprise.

Leadership Must Prioritize Data Quality

Data quality improvements follow specific, clear leadership from the top. Project leaders count data quality among project goals when senior management encourages them to do so with unequivocal incentives, a common business vocabulary, shared understanding of data quality principles, and general agreement on the objects of interest to the business and their key characteristics.

Poor data quality costs businesses about “$15 million per year in losses, according to Gartner.” As Tendü Yoğurtçu puts it, “artificial intelligence (AI) and machine learning algorithms are only as effective as the data they use.” Data scientists understand the difficulties well, as they spend over 70% of their time in data prep.

Recent studies report that data entry typos are the largest source of poor data quality (here and here). My experience says otherwise. From what I’ve seen, operational data is generally good, and data errors only appear when data changes context. In this post I’ll detail why data quality is management’s responsibility, and why data quality will remain poor until leadership makes it a priority.

Leader’s Data Manifesto at #EDW19: Building a Foundation for Data Science

It’s been a truism that data is a resource, but to prove it you just have to follow the money. As the illustration shows, the vast majority of corporate market value draws from intangible assets. Just as money is an abstraction that represents wealth, data is an abstraction that represents these intangible assets.

It’s year three after initial rollout of the Leader’s Data Manifesto (LDM). Since then, many widely publicized events have highlighted the value of data and metadata, and the importance of sound data management (here, here, and here). Recently at Enterprise Data World, John Ladley, Danette McGilvray, James Price, and Tom Redman presented this year’s LDM update. They reintroduced the Manifesto, recounted events of the past year, discussed strategy for the coming year, and issued a call to action for data professionals.

Enterprise Data Prep for Analytics: Two Principles

Data scientists spend most of their time doing data integration rather than gathering insights. In my interview with data scientist Yan Li, she said that data collection and prep takes at least 70% of her time. Obviously, there’s a lot of integration work to do on data that’s new to analytics efforts, but not every analysis uses brand new data. Organizations can improve analytics efficiency by staging commonly used data pre-integrated for data science.

For years, large organizations have supported data warehouses, but prevailing data warehousing practices often fail when faced with “big data” volume and velocity. Still, warehousing teams in large organizations can pre-prep frequently used internal data. Examples include reference and master data, production and sales records, and so on.

Anonymize Data for Better Executive Analytics

Reading articles about data anonymization makes it clear that it is not an entirely effective security measure (here and here), but it is still part of a robust security capability, and it is required if your organization is affected by GDPR. (I use “anonymization” as a general term encompassing techniques that de-identify personal data within a given data set.)

But there’s a positive side of anonymized data that hasn’t received much press. Providing anonymous data to senior managers who don’t need access to personal data can encourage them to take a broader perspective, and thereby bring new energy to fact-based senior planning and analysis.

Data Integration Benefits? They’re Obvious.

“At least 84 percent of consumers across all industries say their experiences using digital tools and services fall short of expectations.”* That quote headed a recent article by David Roe on the role of data integration in digital workplace apps. However, the opening quote reflects the pervasive dearth of integrated data among the companies most of us frequent.

We’ve all experienced the effects. Last week I was in a fender bender. Due to a mixup I didn’t have my insurance card with me, so I called the insurance company to get the info. They had no record of me associated with my car. It turned out that my car is insured under my wife’s name, hers under mine. Although I’ve been their customer for 25 years, and was driving my own car, they couldn’t give me insurance info. Sure, they were following good security practices. But I’m not letting them off the hook.

Meaningful Requirements Start Successful Data Projects

To me, development projects fail or succeed in the first few weeks. Once a project starts off in the wrong direction, momentum and expectations tend to prevent a return to the proper path. With today’s wealth of database options each addressing exciting new possibilities, the right choice for the application’s data foundation plays a large part in steering a project to success.

At this year’s Enterprise Data World conference, William Brooks showed the relations among different data modeling approaches, in effect detailing how to derive nine different model types from a detailed conceptual entity relationship model. Mr Brooks’ presentation hinted at a way to correctly frame up your data direction early on in a project, setting the stage for success.

According to his presentation, called “Symmetry in Modeling Approaches”, the different model types (relational, graph, dimensional, JSON, XML, and so on) all represent different perspectives on the same data relationships. Each suits a different application, like dimensional for reporting applications, data vault for data warehouses, graph databases for multi-layered search, and so on. However, if properly constructed they all map back in predictable and specific ways to a normalized entity-relationship model.

I and others write that ER modeling should be integral to requirements definition, but Mr Brooks’ presentation implies that ER modeling can also serve as the basis for application architecture.

Start Data Quality Improvements with a New Definition

What is Data Quality anyway? If you are a data professional, I’m sure someone from outside our field has asked you that question, and if you’re like me you’ve fallen into the trap of answering in data-speak.

To my listener, I’d guess that the experience was similar to having a customer service rep who has just turned down his simple request justify it by describing byzantine company policies.

There’s a ton of great writing available on data quality, and I in no way mean to disparage it or its value in the field. But in that writing I’ve yet to find a concise and compelling definition that’s useful to non-data professionals. I’ll review one or two prevailing definitions and then offer one that could help us unlock real data quality improvements.