Tag: Data Science
-
One More Species of Overloaded Data
A while back I wrote the post A Field Guide to Overloaded Data, which publicized the work of Duane Hufford, who examined different types of overloaded data during the 1990s. Over the years, his classifications effectively categorized the data anomalies I encountered in the wild. That is, until recently, when a colleague encountered…
-
Two Design Principles for Tableau Data Sources
It’s not unusual for talented teams of business analysts to find themselves maintaining significant inventories of Tableau dashboards. In addition to sound development practices, following two key principles in data source design helps these teams spend less time in maintenance and focus more on building new visualizations: publishing Tableau data sources separately from workbooks and…
-
More on “Select Failed. [2646] No more spool space”
Also see the previous related post Escaping Teradata Purgatory (Select Failed. [2646] No more spool space). Not too long ago I posted on how to avoid the dreaded “No more spool space” error in Teradata SQL. That post recounted approaches to restructuring SQL queries so that they would avoid being cancelled for using inordinate amounts…
-
Leadership Must Prioritize Data Quality
Data quality improvements follow specific, clear leadership from the top. Project leaders count data quality among project goals when senior management encourages them to do so with unequivocal incentives, a common business vocabulary, shared understanding of data quality principles, and general agreement on the objects of interest to the business and their key characteristics. Poor…
-
Leader’s Data Manifesto at #EDW19: Building a Foundation for Data Science
It’s been a truism that data is a resource, but to prove it you just have to follow the money. As the illustration shows, the vast majority of corporate market value draws from intangible assets. Just as money is an abstraction that represents wealth, data is an abstraction that represents these intangible assets. It’s year…
-
Enterprise Data Prep for Analytics: Two Principles
Data scientists spend most of their time doing data integration rather than gathering insights. In my interview with data scientist Yan Li, she said that data collection and prep takes at least 70% of her time. Obviously, there’s a lot of integration work to do on data that’s new to analytics efforts, but not every…
-
Toward an Analytics Code of Ethics
In data management and analytics, we often focus on correcting apparent inability and unwillingness on the part of business leaders to effectively gather and capitalize on data resources. With that perspective, we often see ethics as a side issue difficult to prioritize given the scale and persistence of our other challenges. At least that was…
-
Fixing Tableau Desktop Blue Screen or Unresponsive
Tableau Desktop (10.2.2 on Windows 7 at work) was consistently locking up my computer or causing a BSOD when I tried to start it. After struggling for a while trying to solve the problem, I found out it was because it used all resources when opening the log file, which had over time grown to…
-
Leader’s Data Manifesto Annual Review: “It’s About the Lopez Women”
A year ago I recounted proceedings from the 2017 EDW World conference, which included release of the Leader’s Data Manifesto (LDM). Last week’s EDW World 2018 served as a one-year status report on the Manifesto. The verdict: there’s still a long way to go, but speakers and attendees report dramatic progress and emergence of shared…
-
Escaping Teradata Purgatory (Select Failed. [2646] No more spool space)
Also see the related post More on “Select Failed. [2646] No more spool space”. If you are a SQL developer or data analyst working with Teradata, it is likely you’ve gotten this error message: “Select Failed. [2646] No more spool space”. Roughly speaking, Teradata “spool” is the space DBAs assign to each user account as…