At the very first TDWI Conference, Duane Hufford described a phenomenon he called “embedded data”, now more commonly called “overloaded data”, where two or more concepts are stuffed into a single data field (“Metadata Repositories,” TDWI Conference 1995). He described and diagrammed three types of overloaded data. Almost 20 years later, overloaded data remains rampant, but Mr Hufford’s ideas, presented below with updated examples, are unfortunately not widely discussed.
Overloaded data breeds in areas not exposed to sound data management techniques for one reason or another. Big data acquisition typically loads data uncleansed, shifting the burden of unpacking overloaded fields to the receiver (pity the poor data scientist spending 70% of her time acquiring and cleaning data!).
One might refer to non-overloaded data as “atomic”. Beyond making data harder to use, overloaded data requires more code to manage than atomic data (see why in the sections below), so by extension it increases IT costs.
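To make the extra-code point concrete, here is a minimal sketch. The three-part code format (warehouse, size, item number) and every name in it are hypothetical, invented purely for illustration. The `unpack` function is code that exists only because three concepts were stuffed into one field; with atomic columns the filter at the bottom would be the whole job.

```python
from dataclasses import dataclass

# Hypothetical overloaded field: warehouse, size, and item number packed into one string.
OVERLOADED_ROWS = ["CHI-L-0042", "CHI-S-0007", "NYC-L-0013"]


@dataclass(frozen=True)
class AtomicItem:
    """One concept per field: the atomic form of the same data."""
    warehouse: str
    size: str
    item_number: int


def unpack(code: str) -> AtomicItem:
    """Split one overloaded code into its three atomic concepts."""
    parts = code.split("-")
    if len(parts) != 3:
        # Overloaded fields tend to drift in format, so fail loudly.
        raise ValueError(f"unexpected code format: {code!r}")
    warehouse, size, number = parts
    return AtomicItem(warehouse, size, int(number))


# The receiver has to run the unpacking step before any ordinary query works.
atomic_rows = [unpack(code) for code in OVERLOADED_ROWS]
large_items = [row for row in atomic_rows if row.size == "L"]
```

Every consumer of the overloaded field needs its own copy of something like `unpack`, and each copy has to be kept in step whenever the packing rules change.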
Here’s a field guide to three different types of overloaded data, associated risks, and how to avoid them: Continue reading →
I had pondered writing a post called “Requirements Decay” about how requirements don’t last forever. In my research I found that such a post, complete with “my” words “requirements decay” and “requirements half-life”, had already been done comprehensively here. In a compact argument underpinned by half-life mathematics, the anonymous author proposes that a requirement isn’t likely to stand unchanged forever and explores the implications.
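As a rough illustration of the half-life idea, here is a small sketch using the standard exponential half-life formula. It is not taken from the linked post, and the function name and the 12-month half-life used in the comment are assumptions chosen only to show the arithmetic.

```python
def fraction_unchanged(elapsed_months: float, half_life_months: float) -> float:
    """Expected share of requirements still unchanged after elapsed_months.

    Uses the standard half-life relationship: the share halves every
    half_life_months. The half-life itself is an assumed input.
    """
    if half_life_months <= 0:
        raise ValueError("half_life_months must be positive")
    if elapsed_months < 0:
        raise ValueError("elapsed_months must not be negative")
    return 0.5 ** (elapsed_months / half_life_months)


# With an assumed 12-month half-life, a 24-month project would expect
# a quarter of its original requirements to survive unchanged.
surviving_share = fraction_unchanged(24, 12)
```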
For me, requirements decay is an idea that helps us think realistically about project planning and improves our chances of meeting business needs. Continue reading →
Application developers and business people accessing relational databases need data dictionaries in order to properly load or query a database. The data dictionary provides a source of information about the model for those without model access, including entity/table and attribute/column definitions, datatypes, primary keys, relationships among tables, and so on. The data dictionary also provides data modelers with a useful cross reference that improves modeling productivity.
It is particularly useful for the dictionary to be a filterable/sortable Excel document, but out of the box ERwin, one of the leading data modeling tools, includes a notably inflexible reporting capability. Luckily, it is possible to directly query the ERwin “metamodel”. However, I found the ERwin documentation a bit hard to decipher and not quite accurate. Hopefully this post will save modelers some steps in figuring out how to query the metamodel.
I believe that early, effective big picture diagrams are key to application development project success. According to the old saw, no project succeeds without a catchy acronym. Maybe so, but I’d say no project succeeds without a good big picture diagram. The question: what constitutes a good one? To me, good high-level diagrams have four key characteristics: they are simple, precise, expressive, and correct.
A technique for reporting requirements has emerged as the de facto standard in the business intelligence community. The technique, which emerged in the mid-2000s, is new enough to be as yet unacknowledged by the requirements analysis powers that be. David Loshin describes how it works in this 2007 post:
Start with a business question about how to monitor a business process using a metric, like “How many widgets have been shipped by size each week by warehouse?” Continue reading →
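The widget question breaks down into one metric (quantity shipped) and three "by" dimensions (size, week, warehouse). Here is a minimal sketch of that decomposition. The sample rows, field names, and the choice of ISO weeks are all assumptions made up for illustration, not part of the technique as Loshin describes it.

```python
from collections import Counter
from datetime import date

# Hypothetical shipment facts; every value here is invented sample data.
shipments = [
    {"shipped_on": date(2024, 1, 1), "warehouse": "East", "size": "L", "quantity": 10},
    {"shipped_on": date(2024, 1, 3), "warehouse": "East", "size": "L", "quantity": 5},
    {"shipped_on": date(2024, 1, 8), "warehouse": "East", "size": "L", "quantity": 7},
    {"shipped_on": date(2024, 1, 2), "warehouse": "West", "size": "S", "quantity": 4},
]


def widgets_shipped(rows):
    """Total the metric (quantity) for each combination of the 'by' dimensions."""
    totals = Counter()
    for row in rows:
        iso = row["shipped_on"].isocalendar()
        # Dimensions: ISO year and week, warehouse, size.
        key = (iso[0], iso[1], row["warehouse"], row["size"])
        totals[key] += row["quantity"]
    return totals


totals = widgets_shipped(shipments)
```

Each phrase after "by" in the business question becomes one element of the grouping key, and the thing being counted becomes the summed value.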
How does this sound as advice for an app dev manager leading his or her team from waterfall to Agile?
Clearly articulate a compelling end-state vision
Work from a position of authority
Weather the storms
Reward creativity while fostering improvement
A post at scrumsource.com lists leadership, organizational culture, and people as three of the five key factors in making the transition. Another at the Scrum Alliance site describes the transition as a migration from externally-organized to self-organizing teams. In my experience the transition requires leadership by a strong advocate who shows the way to willing, empowered team members.
The US men’s national soccer team (USMNT) is playing out a strikingly similar transition. Continue reading →
Out of curiosity I recently reviewed articles critical of Agile Methodologies. I had expected agile-versus-waterfall arguments and attacks from vendors selling new alternatives, but even given the reputation that advocates have for flaming well-intentioned critics, I wasn’t prepared for the level of emotion I found.
My opening position was that Agile techniques are great, but like any other tool there are limits and prerequisites. The critical articles I read strengthened that view. Let’s review three examples that stood out, in reverse order: Continue reading →
As important as it is, data modeling has always had a geeky, faintly impractical tinge to some. I’ve seen application development projects proceed with a suboptimal, “good enough”, model. The resulting systems might otherwise be well-architected, but sometimes strange vulnerabilities emerge that track directly to data design flaws.
Recently I saw an example where a “good enough” data design, similar to the one pictured, enabled a significant application bug.
Recently the BBC posted this video. On first view it is just funny, but watching those dogs learn to drive really reminded me of personal experiences with IT teams making big learning transitions. To represent those real situations let’s consider a fictional team of SQL developers facing the daunting task of deploying a functional Hadoop-based analytics prototype in two months. The video parallels their critical learning success factors: (1) set audacious goals, (2) learn bit by bit, and (3) know your limits.
In some presentations, I assert that top-down data modeling should result in not only a business-consistent model but also a pretty well normalized model.
One of the basic concepts behind normalization is functional dependency. In layperson’s terms, an attribute is functionally dependent on an entity when knowing the entity tells you the attribute’s value, so normalizing means separating entities from each other and putting each attribute into the obviously correct entity. For example, a business person knows that item color doesn’t belong in the order table because it describes the item, not the order. Everyone knows that the order isn’t green! Continue reading →
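The item color example can be checked mechanically. The sketch below tests whether one column determines another in a set of rows; the `holds` helper, the column names, and the sample rows are all hypothetical, made up for illustration.

```python
def holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        # setdefault records the first value seen for this key;
        # any later mismatch breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical order lines with item color stored alongside the order.
order_lines = [
    {"order_id": 1, "item_id": "A", "item_color": "green"},
    {"order_id": 1, "item_id": "B", "item_color": "red"},
    {"order_id": 2, "item_id": "A", "item_color": "green"},
]

color_depends_on_item = holds(order_lines, "item_id", "item_color")
color_depends_on_order = holds(order_lines, "order_id", "item_color")
```

In this sample the color follows the item, not the order, which is the same conclusion the business person reaches without any code. A check like this can only refute a dependency in the data at hand; it cannot prove the business rule.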