Frequently in my career I’ve selected or helped select ETL and reporting professionals who need SQL skills. For some of those opportunities, placement firms returned resumes with interminable, and nearly identical, lists of technical achievements with excruciating unnecessary detail (paraphrasing: “Wrote SELECT statements using GROUP BY”, “Applied both inner and outer joins”). Before interviewing we typically ranked candidates in order of preference based on resumes. Candidates’ interview success bore little relation to resume-based rankings.
With some candidates I’ve encountered consistent pauses after fact questions, sometimes accompanied by keyboard clicks. They obviously used “think time” to look up answers on the internet. As a result, “fact” questions didn’t distinguish one candidate from another.
On the other hand, open ended questions worked well. I’ve written before that interviews should ask opinion rather than fact questions. Open ended questions or thought exercise, as opposed to fact questions, assess SQL skill level, are hard to quickly look up on the web, and have the added benefit of demonstrating a candidate’s reasoning and communication skills. Here are two of my favorite examples: Continue reading →
“Business process reengineering is the act of recreating a core business process with the goal of improving product output, quality, or reducing costs.”* Recently I’ve perused articles on business process reengineering and have been surprised to find that they share a lack of emphasis on data definition.
By establishing a shared business vocabulary, identifying and describing business-critical entities and events, and applying the defined entities and events in process and system design, BPR teams can ensure an efficient redesigned process that works smoothly from end to end.
In spite of data concerns making up two of the seven key BPR principles (“Merging data collection and processing units” and “Shared databases to interconnect dispersed departments”), articles on the topic tend to lump these concerns into general Information Technology topics, without acknowledging the need for business driven data definition and management. For example, this post stresses the need for “more sources of data and enhanced connectedness to information”. This one recounts a famous Ford BPR example where a new database was central to the solution. Many, like this one, cite “shared databases” as a core principle. However, none details the business leadership and participation necessary to define a common data foundation across a reengineered business process. Continue reading →
Sometimes success seems like a data analytics team’s worst enemy. A few successful visualizations packaged up into a dashboard by a small skunkworks team can generate interest such that a year later the team has published scores of mission critical dashboards. As their use spreads throughout the organization, and as features expand to meet the needs of an expanding user base, the dashboards can slow down and data refreshes fail as they exceed database and analytics tool time and resource limits.
There are steps teams can take to deal with such slowdowns. Analytics tool vendors typically offer efficiency guides, like this one, that help resolve dashboard response time issues. A frequent recommendation is for the dashboard to use summary tables rather than full detail, reducing the amount of data that the dashboard has to parse as the user waits for a viz to render.*
Summary tables also help resolve data refresh timeouts, but their long term success for the team depends on the foundation on which they are built and how they are organized. The most obvious approach is to build custom summaries serving each dashboard. While report-specific tables stand out as a quick win, analysis shows they are a suboptimal solution because they tend to (1) reduce ability to respond to requirements evolution, and (2) make metrics in different dashboards less consistent. Continue reading →
In our current “social distancing” situation, many are working remotely in a serious way for the first time. As one who’s worked full time from home for the past four years, and frequently before that, I thought I should share some tips based on experience. Below are my top three tips and then some of the gear that I’ve used to set up a comfortable workspace.
But first to sum it all up, WFH works well for me. On a team of folks mostly working from home, we are engaged and productive, and have developed what I hope are lasting relationships with each other, although we rarely see each other in person.
It’s not unusual for talented teams of business analysts to find themselves maintaining significant inventories of Tableau dashboards. In addition to sound development practices, following two key principles in data source design help these teams spend less time in maintenance and focus more on building new visualizations: publishing Tableau data sources separately from workbooks and waiting until the last opportunity to join dimension and fact data.
Imagine a business team — let’s call it Marketing Analytics — with read-only access to a Hadoop store or an enterprise data warehouse. They gain approval for Tableau licenses and Tableau Server publication rights for five tech-savvy data analysts. After a few initial successes with some impactful visualizations, the team gathers steam. After a while the team finds itself supporting scores of published workbooks serving a few hundred managers and executives. In spite of generally sound practices, Marketing Analytics struggles to maintain consistency from one Tableau workbook to another.
Not too long ago I posted on how to avoid the dreaded “No more spool space” error in Teradata SQL. That post recounted approaches to restructuring SQL queries so that they would avoid being cancelled for using inordinate amounts of Teradata resources. Teradata is an immensely powerful, even if aging, database engine but it does little to help one not steeped in knowledge of its structure to use its resources efficiently.
But what if, as sometimes happens, your DB admin team further tightens the screws by reducing spool space, or imposing new execution time or CPU usage limits? Then, you’ll have to go further to make queries efficient, as happened on one team that I was a part of. Beyond the steps previously recommended, here’s what we did: Continue reading →
I recently listened to Brian O’Neill’s excellent interview with Tom Davenport, headlined “Why on a scale of 1-10, the field of analytics has only gone from a one to about a two in ten years time.”
The conversation covered a lot of ground as Mr O’Neill and Mr Davenport explored the reasons why. Highlights included general lack of technical literacy and lack of an organizational data driven culture. But to their credit, they took responsibility on behalf of analytics professionals, emphasizing how we in the field could change in order to make more analytics efforts successful. Rather than focusing on providing technology-centered solutions, they recommended that data and AI professionals seek first to understand and empathize with their clients or internal customers, enabling data and AI pros to develop more effective analytics capabilities in light of that understanding.
I agree that analytics professionals can improve their game. However, as a former consultant who’s switched over to the client side, I think there’s room for improvement all around. To me, clients who work proactively to prepare for an analytics project position themselves for better outcomes. Continue reading →
In my experience, data management is both a mission critical and an undervalued capability. Perhaps recent customer data losses and regulatory initiatives like GDPR tend to raise the stock of data maturity efforts, but it remains undervalued. For example, any Fortune 1000 firm building end-to-end processes finds that much of the cost goes to translating data from different systems that integrate into the process.
Today we have available stage models like CMMI’s Data Management Maturity Model (DMMM) which, as I’ve written, help organizations assess an organization’s maturity level. However, the DMM model aims to assess data maturity at a single agency. It lacks mechanisms to compare multiple agencies or business functions, and therefore can be difficult to translate to prioritized plans for improvement.
Of course no one would do that on purpose, but I as a consultant over many years I’ve often seen it. A vendor fulfills a contract to the letter, which unfortunately allows them to deliver required reports in various, sometimes changing, formats with suspect data quality. The customer company absorbs these costs, leaning on the data analyst to update PowerPoint decks on schedule before the next monthly management meeting in spite of the extra programming work.
These contracts have been for various goods and services, but almost every business contract today is also a contract for data. If a regional gas company hires a vendor to inspect residential lines, then I suspect it wants reports showing inspections conducted and results; a healthcare firm that sends nurses on house calls needs data detailing call schedules and results; and so on.
Companies that supply goods or provide services often don’t feature data management as a core competency, and the quality of their reporting often doesn’t match the quality of their goods or services. Someone in the customer organization has to code around every addition or omission of an expected Excel column, every “N/A” in a numeric field, and every unexpected change from imperial to metric units. Continue reading →
By now Agile has taken over waterfall as the dominant app dev project pattern*. In many large organizations, the traditional waterfall PMO also owns Agile projects. One aspect of PMO oversight that can work against Agile culture is the project audit. If the goal of an audit is to ensure the project reflects Agile values, it can help ensure working software and a satisfied customer. If not, an Agile project audit can reinforce process, documentation, and other values that don’t directly promote project success.
In this post I’ll briefly review the Agile Manifesto, recount some prominent advice for auditors of Agile projects, and offer suggestions for auditors who want to reinforce rather than suppress Agile values. Continue reading →