
Followership II – Individualists, Enablers, & Subversives

In a previous post I posed this question: “more people are followers than leaders, so isn’t it more important to cultivate effective followership than effective leadership?”  In reality the distinction between leading and following isn’t very interesting.  The goal of each member of a group should be to contribute to individual and shared goals in a balanced manner and promote the dignity of group members.  In every group effort, whether business, charity, sports, or anything else, everyone leads and everyone follows.

I recently read Testimony, Solomon Volkov’s controversial publication of the memoirs of Dmitri Shostakovich, the great 20th century Russian composer.  Three individuals in the book demonstrate three different ways of “leading”, or behaving with character within a group.  (See the note at the end of this post on the question of authenticity.)

The Cheerful Individualist

Volkov presents Modest Mussorgsky, known to most of us today as the composer of Pictures at an Exhibition, as an eccentric but cheerful individualist who would listen attentively to criticism as “everyone who felt like it harangued and criticized him…When he was criticized, he kept quiet, nodded, almost agreed.  But the agreement lasted only as far as the door; once he was outside, he took up his work again, like one of those dolls you can’t knock down.”  Mussorgsky was a unique individual with a unique musical voice, and in Volkov’s account had the confidence to overcome negative reaction to harsh feedback.

The Enabler

Alexander Glazunov, known as the “Russian Brahms,” was head of the Leningrad Conservatory from 1906 to 1928.  Glazunov was Shostakovich’s mentor during his time at the conservatory.  Glazunov enjoyed the company of young musicians: “performers came to his house every day.”  From Shostakovich’s account of Glazunov I’m reminded of a teacher I had who didn’t seem to teach at all, but by being welcoming, cheerful, and obviously talented and passionate about the subject matter, created an atmosphere in which we couldn’t help but learn.  More importantly, Glazunov saved lives during years of starvation and repression in Russia by “giving away his salary to needy students” and by sparing Jewish musicians from repression in their home towns: he signed petitions for them to live in Petersburg without having them play for him.  As Volkov attributes to Shostakovich after relating this story: “All things in life can be separated into the important and the unimportant.  You must be principled when it comes to the important things and not when it comes to the unimportant.”

The Supportive Subversive

Volkov’s Shostakovich lived bemused in a world of individual and institutional stupidity.  For example, one official order demanded “a quartet of 10 musicians.”  In this world Shostakovich trod a fine line but never crossed far enough over to draw retribution.  Finally Shostakovich (among others) was denounced as a “decadent formalist” at the first Composers’ Congress in 1948, perhaps in part due to his 8th Symphony.  It was a solemn response to the end of World War II rather than the expected victory celebration.

Life during the Stalin years and World War II was characterized by deprivation and repression.  Many of Shostakovich’s friends and associates were denounced, exiled to Siberia, or killed in mysterious circumstances.  As a leading Soviet composer Shostakovich provided the soundtrack for the regime.  However, to the careful listener his music seems not to celebrate the Stalinists but rather to channel the anguish of the people.  A telling example is his Fifth Symphony, ostensibly a joyous tribute to the Party but arguably a veiled protest (see this analysis by Michael Tilson Thomas).

Volkov’s Shostakovich seems to have blundered on in spite of the worst possible conditions, sublimating his genius into musical irony and thereby doing what little he could in small ways to help his fellow Russians.

Many question Testimony’s authenticity, but this from the Wikipedia entry on the book might sum up a reasonable assessment: “the book gives a true picture of the political situation in the USSR and correctly represents his father’s political views, but [Maxim Shostakovich] continues to speak of the book as being ‘about my father, not by him.’”  I’m neutral on whether or not Testimony is authentic, and whether Shostakovich was a toady of Stalin’s regime or an undercover dissident.  This post reflects subjective impressions from the book, nothing more.

Stuck inside of problems with the business blues again?

Elements of IT Architecture

Many see IT as application of technology to solve business problems. 

Of course, this is true but it leaves out the third element, which is to apply the right architectural pattern to solve the problem.  For example, when the business problem is that reporting is slow and reports from different departments don’t match, the astute IT professional immediately thinks in terms of a data warehousing pattern employing technologies like databases, extract-transform-load (ETL) tools, and multi-dimensional reporting suites.

A strategy based on the tools alone may solve the immediate problem, but understanding the solution-pattern enables the IT professional to bring to the business the additional benefits that come with the pattern, organizational and IT support impacts, and any risks that might emerge by applying the pattern.  In the data warehousing case the informed architect might cite improved executive dashboards and ability to drill down to root causes from summary reports, the need for data stewardship, and potential long term increase in data storage capacity needs.
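As a rough illustration of the data warehousing pattern, the sketch below runs two hypothetical department extracts through a tiny extract-transform-load step into one shared store, so that any report is computed from the same conformed rows. The extract layouts, field names, and the `revenue_by_region` report are assumptions made up for this example, not a description of any particular tool.

```python
# Hypothetical department extracts: overlapping orders, different conventions.
sales_extract = [
    {"order": "A1", "region": "east", "amount_usd": 120.0},
    {"order": "A2", "region": "WEST", "amount_usd": 80.0},
]
finance_extract = [
    {"order_id": "A2", "region_name": "West", "cents": 8000},
    {"order_id": "A3", "region_name": "East", "cents": 5000},
]


def transform_sales(row):
    """Map a sales row onto the shared (conformed) layout."""
    return {
        "order_id": row["order"],
        "region": row["region"].title(),
        "amount": row["amount_usd"],
    }


def transform_finance(row):
    """Map a finance row onto the same layout, converting cents to dollars."""
    return {
        "order_id": row["order_id"],
        "region": row["region_name"].title(),
        "amount": row["cents"] / 100,
    }


def load(warehouse, rows):
    """Load rows keyed by order_id so the same order is stored only once."""
    for row in rows:
        warehouse[row["order_id"]] = row


warehouse = {}
load(warehouse, map(transform_sales, sales_extract))
load(warehouse, map(transform_finance, finance_extract))


def revenue_by_region(wh):
    """One report definition, computed from the single shared store."""
    totals = {}
    for row in wh.values():
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals
```

The point of the sketch is the shape, not the code: once both departments report from `warehouse`, their totals can no longer disagree because of local naming or unit conventions.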

Beyond that, when the architect lacks the pattern approach he or she seems to the business person like Bob Dylan’s debutante: “Your debutante just knows what you need, but I know what you want.”  On the projects I’ve been on where designers lacked a pattern-based perspective the technical team did exactly what the business folks said they wanted, but for the most part didn’t contribute to the business value of the solution.  On projects like this developers slavishly ensure the solution matches each and every requirement, rarely bring to the table new business requirements that are logical consequences of the design, and tend to avoid questioning defined requirements even if they are contradictory or counterproductive.

Sure, patterns aren’t strictly necessary.  An outstanding architect can design from whole cloth an original solution that precisely matches business need.  However, that’s not how outstanding architects do business, at least the ones I’ve known.  In my experience the best IT architects know patterns, like data warehousing, SOA, and others, well enough to match the business problem to the right pattern and then evolve the architecture from the generalized pattern into a problem-specific architecture based on the particulars of the business problem at hand.

Some examples of common solution patterns in the world of business IT are Data Warehousing, Master Data Management, and Service Oriented Architecture (the Wikipedia article on this one is preliminary at this writing but still a good intro for the uninitiated).  Those interested in a more technical introduction to patterns might start with Abel Avram’s quick intro at InfoQ (membership at InfoQ is required but free and worthwhile).

And of course, apologies to Mr. Dylan for the title…

Cloud databases and business/IT alignment

Today, the foundation of most of our custom-built systems is a relational dbms.  While development frameworks vary, they overwhelmingly access and maintain data in relational tables and columns.  As I write I routinely save this post in a MySQL database, and at work I tend SQL Server applications.  Millions of others develop, use, and extract analytical data from thousands of SQL Server, DB2, and Oracle applications, on servers and networks maintained in-house by in-house administrators.

Some claim that the relational dbms may be out of style very soon.  Cool new “cloud computing” and “SaaS” apps and services delivered over the internet seem to be popping up everywhere – just look at Salesforce.com, the well-established customer relationship management (CRM) vendor, and the many cloud-based PC backup sites.  As part of that trend, Amazon, Google, Microsoft and others offer database services over the internet that don’t look much like relational DBMSs.  Some supporters of the cloud-db options seek alternatives to the standard relational DBMS (note this widely read article).  Of these, many are OO developers.  There’s a fundamental dissonance between OO and relational approaches, requiring an intermediate object/relational mapping (ORM) layer for OO systems to operate effectively with relational DBMSs.  Many of the new cloud-db options are open source, lightly structured data services provided via the internet, capable of storing and delivering large data stores for high availability, fast response applications.

The convenient thing about relational databases is that they pretty much match the business view of data, and therefore give business people and developers common ground.  A well thought out relational data model is one way to express the inherent structure of business rules (see this previous post).  A relational model at the back end of a custom-built system means that both developers and business people can talk about the real guts of a system in ways that make sense to both, like this:

  • Developer to business person: “Should we allow a part_order to include items from only one division?”
  • Business person to developer: “After a call from our shipping department, I ran a query on the part_order table and found that there is a part_order with a null shipper_phone_number. I thought it was a required column, what’s up?”
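The second exchange can be sketched with Python's built-in `sqlite3` module. The `part_order` table and `shipper_phone_number` column come from the conversation above; the other columns, the sample rows, and the `part_order_strict` table are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: shipper_phone_number was left nullable, which is the gap
# the business person stumbled on.
conn.execute(
    """
    CREATE TABLE part_order (
        part_order_id        INTEGER PRIMARY KEY,
        division             TEXT NOT NULL,
        shipper_phone_number TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO part_order VALUES (?, ?, ?)",
    [(1, "Engines", "555-0100"), (2, "Brakes", None)],
)

# The kind of query a business person could run and understand.
missing = conn.execute(
    "SELECT part_order_id FROM part_order WHERE shipper_phone_number IS NULL"
).fetchall()

# If the column really is required, the rule can live in the schema itself.
conn.execute(
    """
    CREATE TABLE part_order_strict (
        part_order_id        INTEGER PRIMARY KEY,
        shipper_phone_number TEXT NOT NULL
    )
    """
)
try:
    conn.execute("INSERT INTO part_order_strict VALUES (?, ?)", (3, None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the database refuses the order with no phone number
```

Both the question ("which orders have no phone number?") and the fix ("make the column required") are stated in terms of the same table and column names the business person already uses.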

How’s it going to be when those comments don’t reflect the underlying structure of the database?  Today’s cloud db offerings vary in structure, but tend to favor highly efficient and flexible models like name-value pairs, and avoid the overhead required by semantic layers like the relational model.  According to the MongoDB site, “by reducing transactional semantics the db provides, one can solve an interesting set of problems where performance is very important.”
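For contrast, here is a minimal sketch of the same orders in a name-value style store. A plain Python dict stands in for a hypothetical cloud key-value service, and the key format and field names are assumptions for this example only. Nothing in the store knows that a phone number is required, so that rule has to live in application code.

```python
import json

# A dict standing in for a hypothetical key-value service.
store = {}


def put_order(order_id, fields):
    """Store an order as an opaque JSON value under a string key."""
    store[f"part_order:{order_id}"] = json.dumps(fields)


put_order(1, {"division": "Engines", "shipper_phone_number": "555-0100"})
put_order(2, {"division": "Brakes"})  # no phone number; the store accepts it
put_order(3, {"div": "Brakes", "shipperPhone": "555-0199"})  # other names; also accepted


def orders_missing_phone():
    """Application code, not the store, has to know what 'required' means."""
    missing = []
    for key, blob in store.items():
        fields = json.loads(blob)
        if not fields.get("shipper_phone_number"):
            missing.append(key)
    return sorted(missing)
```

Order 3 shows up as missing a phone number even though it has one under a different name, which is the kind of question a business person can no longer answer by looking at the data alone.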

In such databases the structure of the data will be hidden from business people; there will be no shared business/IT view.  Rather than talking with business people about the actual database structure we’ll talk about its custom abstraction, and when things go poorly with performance and functionality the developer will in effect say “trust me on this one” to the business person rather than explaining what’s up.

For a long time cloud databases will be another option alongside the relational model, but the more prominent cloud databases become the more difficult it will be for developers and business people to communicate about business data in IT applications, and it could be a serious challenge for developers to learn to cross that communications gap without the bridge provided by the relational model.

BI Business Case Basics: Three Things to Remember

Here are three things to remember when putting together a BI business case:


Excerpt from "Show Me the Money: A DM/BI Business Value Primer", Bob Lambert and Tri Truong, Information Management Special Reports, March 24, 2009

  1. Intangible benefits don’t count.
  2. BI has no inherent value.
  3. Senior managers often make decisions about future outcomes with insufficient data.

Intangible Benefits Don’t Count: An effective business case communicates tangible future value in a convincing way.  An argument has a chance of convincing a skeptical reader if the reader agrees that the argument’s assumptions are reasonable and that the conclusion follows logically from the assumptions.  Quantifying financial metrics like Return on Investment (ROI) or Net Present Value (NPV) helps build the case, but such measures are credible only if readers agree with the underlying assumptions and the logic built upon them.
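To make the dependence on assumptions concrete, here is a small sketch of the standard NPV and ROI formulas. The cash flows and the 10% discount rate are invented for illustration; they are not figures from the article.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today, cash_flows[t] after t years."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def roi(total_benefit, total_cost):
    """Simple return on investment as a fraction of cost."""
    return (total_benefit - total_cost) / total_cost


# Hypothetical BI project: an up-front cost followed by three years of benefits.
expected = [-500_000, 200_000, 250_000, 250_000]
# Same project if a skeptical reader believes only half of the claimed benefits.
skeptical = [-500_000, 100_000, 125_000, 125_000]

base_case = npv(0.10, expected)
skeptical_case = npv(0.10, skeptical)
simple_roi = roi(total_benefit=700_000, total_cost=500_000)
```

With these made-up numbers the base case has a positive NPV and the skeptical case a negative one. The arithmetic is identical in both; only the assumptions differ, which is why the reader has to agree with them first.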

BI Has No Inherent Value: We in the BI field believe that any organization’s fortunes would improve if it rationalized its data stewardship, integrated its data, and applied analytics creatively in management and operations.  However true, that view must ring hollow to senior business managers.  Without a compelling and motivating story about how a new system contributes to revenue or reduces costs, that system’s business case stops dead in its tracks.  Of course sometimes “someone at a high level” just wants BI, but organizations don’t often embark on BI efforts without first evaluating tangible costs and benefits.

Senior Managers Make Decisions about Future Outcomes with Insufficient Data: Although BI practitioners must make a convincing case for future business value, there’s room for uncertainty.  Executives and senior managers aren’t highly compensated for playing it safe, but rather for understanding current conditions and setting direction based on educated but sometimes courageous predictions of future conditions.  A successful BI business case matches or extends the executive’s knowledge of current conditions and expands his or her view of potential future outcomes of near term actions.

Don’t forget to get it done

In a recent article at Information Management, Maria Villar and Theresa Kushner offer 4 Steps to Create an Effective IT and Business Partnership, a very useful list of ways to ensure “strong partnership between IT and business”.  To the authors this partnership “is the most important, and often overlooked, component to successfully managing critical business data. Undertaking business intelligence, data quality or an enterprise data management [program] without full cooperation and collaboration between IT and the business is a formula for frustration.”  The authors suggest these four steps: “know your partner, develop a relationship, define roles and responsibilities, and establish open, regular communication channels.”  I recommend reading this article because IT folks (like me) seem tempted to neglect the habits that enable building a solid relationship with business people.

That said, it seems to me that there’s something missing.  Consider one BI manager I know who has fractious relations with his business customers.  I won’t go into detail, but trust me, relations have been rocky, and reviews from key business players poor.  What this person does extremely well is to build rock-solid, reliable systems, deliver on time, meet business needs, and ensure that the solution meets regulatory and audit concerns.  This BI group is essentially unchanged after many years, enduring even the recent recession in a devastated industry segment, and outlasting many of its critics.

To me, building good relationships is important, but execution is the sine qua non of IT/business alignment.  Think about it.  Say you hire a really nice contractor to fix a leaky roof.  However personable he is, you won’t hire him to replace your windows if the roof still leaks after you’ve paid the bill.

My view: if you want to do it right, adopt Villar’s and Kushner’s excellent suggestions, but the fundamentals remain the same:

  1. Either (a) present a robust business case that all accept, or (b) pay attention to what you’ve been asked to do
  2. Deliver what was requested/promised in 1
  3. When things change go to 1

Got chaos? Manage to milestones with risks and issues

“When you are in the middle of a story it isn’t a story at all, but only a confusion; a dark roaring, a blindness, a wreckage of shattered glass and splintered wood; like a house in a whirlwind, or else a boat crushed by the icebergs or swept over the rapids, and all aboard powerless to stop it.  It is only afterwards that it becomes anything like a story at all.  When you are telling it, to yourself or to someone else.” – from Alias Grace by Margaret Atwood

Whatever project management approach a team uses, sometimes everything falls apart, commonly due to work piling up at the end, but sometimes due to a key individual leaving, or a pivotal assumption no longer holding true, or many other reasons.  When that happens, the project can become like a whack-a-mole game, with leads working from issue to issue as they pop up faster and faster.

I served as one of many workstream PMs on one very large project where this didn’t happen.  Out of the seeming chaos the multi-million dollar IT project came in on time.  Here’s my view of what we did to succeed:

  • Had very clear interim milestones that were generally known and served as reference points for discussion.  I’d characterize them as a milestone per workstream per month.
  • Held seemingly interminable weekly risk/issue discussions.  These were open, no holds barred reviews of anything at all that could endanger achieving milestones.  Often risk/issue discussions are polite exercises in avoiding the fact that the emperor has no clothes, with team members carefully avoiding forbidden topics.  On this project everything was open for discussion.
  • The program manager excelled at visibly not sweating the small stuff, directing workstream leadership to handle their localized risks and issues themselves, and focusing program energy only on those he judged to have overall impact.
  • Each of the many workstreams had its own project schedule; the program insisted on detail where it mattered but not where the detail was irrelevant.  For example, there was a minutely detailed cutover plan for production migration.

Many of us were surprised when the chaos all came together on time with surprisingly few glitches. I attribute this program’s success to the program manager’s unflappable focus on milestones, encouragement of unfettered group risk/issue analysis, and ability to parse program from project concerns.

Coming soon: data like money

It is a commonplace to say we should manage data like a resource. But when you think about it, data is an asset but not a resource.  Data isn’t a thing like real estate, employees, or customers, but rather it represents all of those things.  In data-geek-speak, data is a meta-resource that holds information about resources.  That makes data a lot like money.

In his book Money Mischief Milton Friedman made the point that money has no intrinsic value: “The value of money is the value people attribute to what they want to exchange, no more, no less.” Likewise, data has no value in itself.  Its value is derived from people’s desire to know about the things the data describes, and how reliably and accurately it describes those things.  So an organization’s data, like its money, is not a resource in itself.  It is an asset that represents the resources that an organization manages and controls.  It follows then that data management should look a lot like money management.

A cornerstone of our economic stability is consensus that organizations must manage money well and make their internal money management visible to investors, regulators, and independent standards groups.  We’ve evolved a standard for money management where a department represented by a C-level executive administers formal accounting, budgeting, planning, and financial reporting.  The organization evaluates every manager’s compliance to money management policies, and independent auditors evaluate the organization’s soundness in terms of its money management.  Accounting professionals meet rigorous, generally respected certification standards.

Overall, our volume of online purchases and use of FDA-approved drugs, for example, attest to our general confidence in current data management practices.  But still, data  professionals know that it could be a lot better.  Scarcely a week goes by without another scandal involving lost customer data, and consider these snafus:

  • This article cites multiple non-compliant databases as a significant contributor to the chaos in reuniting families in the wake of the Katrina disaster
  • “The Mars Climate Orbiter, a key part of NASA’s program to explore the planet Mars, vanished in September 1999 after rockets were fired to bring it into orbit of the planet. An investigative board later discovered that NASA engineers failed to convert English measures of rocket thrusts to newtons, a metric system measuring rocket force, and that was the root cause of the loss of the spacecraft. The orbiter smashed into the planet instead of reaching a safe orbit.” (cited here)
  • One Fortune 1000 services company carried separate customer records in each of its operating units resulting in a number of anomalies visible to the customers.  For example, the same customer would receive separate invoices with different terms for each of the services purchased from the company.
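The Mars Climate Orbiter item is a reminder that a number without its unit is incomplete data. The sketch below shows one way to carry the unit along with the value so a conversion cannot be silently skipped. The `Thrust` class and its unit labels are hypothetical names for this example; the conversion factor follows from the definition of pound-force.

```python
from dataclasses import dataclass

# 1 lbf = 0.45359237 kg * 9.80665 m/s^2, by definition of pound-force.
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Thrust:
    """A thrust reading that always carries its unit (hypothetical type)."""

    value: float
    unit: str  # "N" or "lbf"

    def to_newtons(self):
        if self.unit == "N":
            return self.value
        if self.unit == "lbf":
            return self.value * LBF_TO_NEWTON
        # Refuse to guess: an unlabeled or unknown unit is a data quality error.
        raise ValueError(f"unknown thrust unit: {self.unit!r}")


def total_thrust_newtons(readings):
    """Sum readings from mixed sources after converting each to newtons."""
    return sum(r.to_newtons() for r in readings)


readings = [Thrust(10.0, "lbf"), Thrust(5.0, "N")]
total = total_thrust_newtons(readings)
```

A bare float of 10.0 would have been accepted by either team's software; a `Thrust(10.0, "lbf")` forces the question of units to be answered at the interface.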

In parallel with emergence of these types of issues, regulators and industry associations have set data management standards for many industries and practice areas.   Food and consumer product safety rests on a regulatory foundation of correctly recording and managing results of inspections.  The International Air Transport Association sets standards for safety data collection and management.  Likewise, the US Food and Drug Administration and other governing bodies set clinical safety data management and reporting standards.

It is just a matter of time before the many separate externally imposed data management guidelines congeal into a set of general best practices that apply across the organization.  Then investors, regulators, and standards groups will hold organizations responsible for effective data management in the same way they are held to account for effectively managing money.  An internal department represented by a C-level executive will administer formal data management standards and procedures.  The organization will evaluate every manager’s compliance with data management policies, independent auditors will evaluate the organization’s soundness in terms of the quality of its data management, and data management professionals will be held to rigorous, generally respected certification standards.

Farfetched? Maybe.  But it isn’t farfetched to think that as a society we’ll begin to recognize what data professionals have known for a long time: that the quality of an organization’s products, its care of and protection of its customers, workforce, resources, stewardship of the environment, and even its financial health depend to a significant degree on sound data management practices.

Here are some resources on data management:

DAMA, the organization for data management.

The Wikipedia page quotes this definition: “Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets.”

Data Stewardship Strategy: 6 Keys to Success by Jill Dyché: “As executives increasingly agree that data is a corporate asset, they are also funding data governance and data quality efforts more willingly. But … entrenched organizational behaviors are much more difficult to shift. Many companies have introduced the role of data steward before fully defining the role. In these cases, the beleaguered data stewards are doomed before they even begin. ”

Leverage Data Quality to Build an Effective Enterprise Architecture by Mark Amspoker.  “It might be time to rethink the notion that effective information architecture development will solve the data quality problem.”

Guidelines for Responsible Data Management in Scientific Research from the Office of Research Integrity, US Department of Health and Human Services.   “Data management is one of the essential areas of responsible conduct of research, as outlined by the Office of Research Integrity. This educational course will educate new investigators about conducting responsible data management in scientific research.”

Study data early to improve application alignment

A recurring theme in the literature on IT over the years has been frequent failure of IT projects.  Most studies lay the bulk of the blame on requirements (examples here and here).  One way to improve accuracy and fit-to-purpose of requirements, and thereby promote project success, is to include data analysis as well as process analysis in the requirements plan.

I’ve cited here the need to start data interface analysis early to avoid budget and schedule blow-ups when, as a result of not thinking early about interface complexity, data integration work turns out to be bigger and nastier than anticipated.

Early data study also helps business analysts elicit more detailed and accurate business requirements.  Say a mid-level football (soccer) team in the UK is looking to recruit a couple of strikers who can reliably punch home goals for the club.  The obvious data they seek is (1) the number of goals scored per game by each prospect, and (2) how much time each has spent on the bench due to injury over his career.  At the same time, this club is building a strategic recruiting system to support growth into the higher echelons of English football.  A process-oriented requirements strategy (like the one described here) asks the team’s recruiters what they need in order to get good people into the club, and often emerges with a list of statements about what the system will do (“The system shall provide an interface enabling entry of the following player statistics” or “The system shall provide a report ranking players by the following criteria:…”).

It isn’t necessarily wrong to start with process analysis, especially when backed up with formal techniques like use cases, data flow diagramming, or others, but addition of data analysis early provides ability to be far more perceptive into the real business needs.  Without interviewing anyone a data analyst can know that there are many goals in a game of soccer (OK, to some not nearly enough, but that’s another story), that the attributes of a game include location, weather conditions, date and time, whether it’s regular season or playoff, and more.  Attributes of a goal: time during the game; left foot, right foot, or head; did it come from a set play or in the run of play; from the left or right side of the field, and much more.

The analyst who knows the data and understands its structure can probe with questions like whether a player tends to score at the end of games, or would it be useful to find one striker who tends to score from the left side of the field and another who scores from the right?  By understanding the data an analyst can understand the business problem more deeply, build better rapport with business people  by asking more informed questions, and cross the business/IT communications gap to define the right requirements so that the right system gets built.
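A quick sketch shows how little it takes to move from knowing the attributes of a goal to asking those sharper questions. The `Goal` fields follow the attributes listed above; the player names, sample goals, and the 75-minute cutoff for a "late" goal are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    player: str
    minute: int      # time during the game
    side: str        # "left" or "right" side of the field
    body_part: str   # "left foot", "right foot", or "head"
    set_play: bool   # True for a set play, False for the run of play


# Made-up sample data for two hypothetical prospects.
goals = [
    Goal("Striker A", 12, "left", "left foot", False),
    Goal("Striker A", 85, "left", "head", True),
    Goal("Striker A", 88, "left", "left foot", False),
    Goal("Striker B", 30, "right", "right foot", False),
    Goal("Striker B", 51, "right", "right foot", True),
]


def late_goal_share(goals, player, after_minute=75):
    """Fraction of a player's goals scored after the given minute."""
    mine = [g for g in goals if g.player == player]
    if not mine:
        return 0.0
    return sum(g.minute > after_minute for g in mine) / len(mine)


def preferred_side(goals, player):
    """Side of the field the player scores from most often, or None."""
    sides = Counter(g.side for g in goals if g.player == player)
    return sides.most_common(1)[0][0] if sides else None
```

Neither function appears in a "the system shall" list from the recruiters, yet both fall out of the data structure almost for free, which is the kind of requirement a data-aware analyst can bring to the table.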

It may be just the organizations I’ve been exposed to, but in my experience data analysis isn’t typically part of the requirements effort.  Supporting this point, the author of the Wikipedia page on business analysis entirely omits data analysis, apparently favoring a process-only approach.  On the other hand, object-based techniques offer a balanced approach, studying both data and process by representing things like goals, games, and players as objects with their own attributes and behaviors.  In addition, the International Institute of Business Analysis (IIBA) includes data-oriented along with process-oriented techniques in its Business Analysis Body of Knowledge (BABOK).

As process/data balance early on in the application lifecycle becomes more widespread analysts should generate more insightful requirements and, other things being equal, the success rate of IT application projects should improve.

DQ, he isn’t so dumb he just needs glasses

In a recent very thoughtful post on data quality, Paul Erb plays out an analogy comparing data users with Don Quixote and data quality professionals with Sancho Panza, then reverses the analogy to cleverly coin the “Sancho Panza” test of data quality professionals.  He encourages data quality professionals promoting the critical role of data quality to apply a “what would Sancho say?” test to ensure that they are aligned with the needs and interests of data consumers.

Here’s Paul’s description of the Sancho Panza test:

Think of Don Quixote [DQ] as the data-quality specialist or even the data management specialist or software vendor, bringing to the world his specialist’s perspective and vocabulary and enthusiasm, influenced by the books he’s read, visioning everyday business practices, with his value added, as goldmines for the organization.  Meanwhile Sancho Panza represents the person who does a practical job every day, who knows what works around here and what doesn’t.

I advocate to Data Quality (let’s call it DQ) consultants that they listen to this Sancho Panza, and consider themselves as Don Quixote.  Sancho doesn’t know much about data, but he knows what he likes… He’s open to listening, but slow to change, and he’ll tell you what he thinks.

Paul’s article reminded me that as a child I thought the problem with Don Quixote was that he tilted at windmills and attempted to ambush acting troupes because of his bad eyesight.  Of course this is not the case, but to me it provides a relevant perspective on data quality in many organizations.

Here’s the problem I’ve seen play out on a number of IT application projects:

  1. A high level business study recommends replacement or improvement of a current application.
  2. The organization approves the project described in a business case citing benefits named in the business study and costs detailed for infrastructure, package software, and application development, but data-related costs are glossed over or left out entirely.
  3. The project begins with a requirements phase that collects hundreds of imperative statements (“The system shall…”)  from business people who will use the system.
  4. Late in the requirements phase, the team finds that data integration work in system interfaces will be more complex than expected.  A common example: the project requires changes to a feeder application with no documentation and no in-house support expertise.
  5. Project leadership goes back to the sponsor seeking more money.

In these situations the business case was incorrect because it did not account for all of the costs of data integration.  I’ve seen projects weather steps four and five well, but often discovery of previously unseen data complexity starts a disruptive chain of events.  (Sadly for the project manager, such situations are often seen as a failure of project management and corrected accordingly, but that’s a topic for another post.)

In my view the root cause of unforeseen data complexity on projects is the lack of a data constituency in current IT.  It is only recently that the success of companies like Google and Amazon has motivated the emergence of data as a key business resource in the collective consciousness.  Famous success stories notwithstanding (see this link), there are relatively few senior IT managers with data quality backgrounds.  Instead, many rose through the ranks of the infrastructure, application development, or business (process) analysis groups.

It will be a while before, for example, a Mobil CIO’s predecessor jobs include definition of a metadata repository or elimination of multipurpose data, but in the meantime here’s what we can do:  add a business case to the application lifecycle as the last step in requirements.  Stop the project when the real costs are known, recalculate the cost/benefit, and ask the sponsors if the project should continue.  Give Sancho (in this case the project team) a chance to speak to the reality of the situation, and hand to Don Quixote (project sponsors) the eyeglasses of in-depth visibility into real costs. If the decision is to move ahead with the project, then all share the same vision and the sponsors have endorsed the actual project, not the fuzzy image from earlier on that might have been a windmill.