<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Bob Lambert &#187; Data Modeling</title>
	<atom:link href="http://robertlambert.net/tag/data-modeling/feed/" rel="self" type="application/rss+xml" />
	<link>http://robertlambert.net</link>
	<description>on business-aligned information technology</description>
	<lastBuildDate>Sat, 24 Jul 2010 20:26:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Use conceptual data modeling in requirements definition</title>
		<link>http://robertlambert.net/2010/07/use-conceptual-data-modeling-in-requirements-definition/</link>
		<comments>http://robertlambert.net/2010/07/use-conceptual-data-modeling-in-requirements-definition/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 16:24:50 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[App Dev]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[Project Management]]></category>
		<category><![CDATA[Business Analysis]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Database Design]]></category>
		<category><![CDATA[Requirements]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=979</guid>
		<description><![CDATA[I’ve often thought that conceptual data modeling was an underused tool in the arsenal available to requirements analysts, and in a recent conversation I found that many were surprised that it would be used in the requirements phase at all.  Checking the Business Analysis Body of Knowledge (BABOK) I found data modeling listed among the [...]]]></description>
			<content:encoded><![CDATA[<p>I’ve often thought that conceptual data modeling was an underused tool in the arsenal available to requirements analysts, and in a recent conversation I found that many were surprised that it would be used in the requirements phase at all.  Checking the <a title="The DMBOK" href="http://www.theiiba.org/AM/Template.cfm?Section=Body_of_Knowledge" target="_blank">Business Analysis Body of Knowledge</a> (BABOK), I found data modeling listed among the tools available to requirements analysts “to describe the concepts relevant to a domain, the relationships between those concepts, and information associated with them.”  There’s also Steve Hoberman’s excellent book on the topic, <em><a title="Data Modeling for the Business" href="http://www.amazon.com/Data-Modeling-Business-Handbook-High-Level/dp/0977140075" target="_blank">Data Modeling for the Business</a></em>, an introduction to data modeling aimed at a business audience.</p>
<p>Data modeling has long been one of my requirements analysis tools of choice for custom operational applications.  To me, using data modeling techniques in requirements analysis reduces errors by improving requirements completeness, consistency, and communication, and provides unique continuity between analysis and design.   As David Elliott told me in conversation, “development of a data model uncovers many opportunities for clarification of existing requirements, or uncovering of additional detail.  At the very least, it confirms to one’s business customer that the BSA understands and can graphically demonstrate many business rules and relationships.”</p>
<p>I’ll hasten to add these caveats.  (1) Perhaps strangely, conceptual data modeling is not useful in the same way in requirements for informational systems like data warehouses and marts (I’ll save that discussion for another post).  (2) Requirements definition for commercial off-the-shelf (COTS) applications follows a different methodology in which data modeling might be less applicable.  (3) This post is <em>not</em> about database design, but rather about use of conceptual data modeling as a tool for organizing and validating requirements.</p>
<p><strong>Conceptual vs. logical data modeling</strong></p>
<p>It is easy to see why in practice there are varying definitions of the different types of data models.  The Wikipedia entry on <a title="Data Modeling at Wikipedia" href="http://en.wikipedia.org/wiki/Data_modeling" target="_blank">data modeling</a> reflects the standard terminology based on the ANSI four-level database architecture, but features a confusing diagram that to me blurs the distinction between conceptual and logical models. The entries on <a title="Logical data modeling at Wikipedia" href="http://en.wikipedia.org/wiki/Logical_data_model" target="_blank">Logical Data Model</a> and <a title="Conceptual data modeling at Wikipedia" href="http://en.wikipedia.org/wiki/Conceptual_schema" target="_blank">Conceptual Data Model</a> make them sound like the same thing: implementation-independent representations of business data.  Then, the entry on <a title="Database design at Wikipedia" href="http://en.wikipedia.org/wiki/Database_design" target="_blank">Database Design</a> contradicts them by stating that the logical model “contains all the needed logical and physical design choices and physical storage parameters needed to … create a database.”</p>
<p>For this post I’ll follow the definitions offered in Simsion and Witt’s <a title="Data Modeling Essentials" href="http://www.google.com/search?q=data+modeling+essentials&amp;rls=com.microsoft:en-us:IE-Address&amp;ie=UTF-8&amp;oe=UTF-8&amp;sourceid=ie7&amp;rlz=1I7HPNN_en" target="_blank"><em>Data Modeling Essentials</em></a>:</p>
<ul>
<li>Conceptual data modeling identifies a set of data structures that will meet requirements, focusing on business and not on technical or DBMS-specific concerns</li>
<li>Subsequent logical data modeling maps the conceptual model to structures supported by the particular DBMS, finalizing the design in DBMS-appropriate constructs but not yet optimizing for performance, which comes next in physical modeling</li>
</ul>
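<p>As a rough illustration of the two definitions above, here is a minimal Python sketch.  The entities, attributes, and relationship are hypothetical, and SQLite is assumed as the target DBMS purely for convenience.  The dictionaries play the part of a small conceptual model, and <code>to_ddl</code> maps them to DBMS constructs (tables, keys, and a foreign key) in the way logical modeling would, with no performance tuning.</p>
<pre>
```python
import sqlite3

# Conceptual model: business entities, their attributes, and one relationship.
# All names are hypothetical and chosen only for illustration.
ENTITIES = {
    "customer": ["name"],
    "customer_order": ["order_date"],
}
# Each pair is (child, parent): every child row belongs to one parent row.
RELATIONSHIPS = [("customer_order", "customer")]


def to_ddl(entities, relationships):
    """Map the conceptual model to CREATE TABLE statements (a logical model)."""
    statements = []
    for entity, attributes in entities.items():
        # Every entity gets a surrogate primary key.
        columns = [entity + "_id INTEGER PRIMARY KEY"]
        columns += [attribute + " TEXT" for attribute in attributes]
        # A relationship becomes a foreign key column on the child table.
        for child, parent in relationships:
            if child == entity:
                columns.append(
                    parent + "_id INTEGER NOT NULL REFERENCES "
                    + parent + "(" + parent + "_id)"
                )
        statements.append(
            "CREATE TABLE " + entity + " (" + ", ".join(columns) + ")"
        )
    return statements


ddl = to_ddl(ENTITIES, RELATIONSHIPS)

# Confirm that the generated statements are valid by running them in SQLite.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
    print(statement)

table_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```
</pre>
<p>The conceptual dictionaries say nothing about keys or column types.  Those choices appear only in the mapping step, which is the distinction the definitions draw.</p>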
<p><strong>Use of data modeling in requirements definition</strong></p>
<p>Conceptual data modeling is hardly an outlier technique in requirements definition:</p>
<ul>
<li>Perhaps in reaction to problems experienced by adopters of <a title="Structured techniques at Wikipedia" href="http://en.wikipedia.org/wiki/Structured_analysis" target="_blank">Structured techniques</a> in the 80s, data modeling was the cornerstone analytical technique in Clive Finkelstein’s and James Martin’s widely-adopted <a title="Information engineering at Wikipedia" href="http://en.wikipedia.org/wiki/Information_engineering" target="_blank">Information Engineering</a> methodology.</li>
<li>The BABOK includes class modeling, data modeling’s object-oriented cousin, in its chapter on data modeling.  Class modeling is a core technique of object-oriented analysis.</li>
<li>Scott Ambler’s <a title="Agile Modeling" href="http://www.agilemodeling.com/essays/initialRequirementsModeling.htm" target="_blank">Agile Modeling site</a> offers conceptual (or “slim”) data modeling as an option in the initial envisioning stage.</li>
<li>Informally searching requirements definition templates available on the web, I found that about a third recommend including conceptual data models.</li>
</ul>
<p><strong>Benefits of data modeling in requirements analysis</strong></p>
<p>The BABOK separates requirements gathering from requirements analysis, defining requirements analysis as an essential step to organize, prioritize, and validate elicited requirements. Elicited requirements are the business objectives of the system. The analysis step organizes those objectives in a way that both makes sense to the business and guides subsequent application design.  Conceptual data modeling in this stage helps ensure requirements completeness, consistency, and communications:</p>
<p><strong>Completeness:</strong> In my experience most requirements analysis is process-based, and the most common tool is the “swim lane” activity diagram.  While such techniques are essential for understanding complex processes, they can miss requirements that aren’t directly involved in the process itself.  For example, a complex process might reference federal, state, and local tax rates by zip code.  Analysts who are heads-down in defining the process might neglect the need for at least annual refresh of the tax rate tables.  Data modelers thinking in terms of business objects and events and their life cycles would be less likely to miss that one.  This kind of review is formalized in the “CRUD Matrix,” a table identifying which business activities create, read, update, or delete which business entities.</p>
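<p>A CRUD Matrix review can be sketched in a few lines of Python.  The activities and entities below are invented for illustration, not taken from a real project.  The check reports, for each entity, the operations that no activity performs, which is how a gap like the missing tax rate refresh would surface.</p>
<pre>
```python
# Hypothetical CRUD matrix: activity, then entity, then the letters for the
# operations (Create, Read, Update, Delete) that the activity performs.
CRUD_MATRIX = {
    "Enter order": {"Customer": "R", "Order": "C", "Tax Rate": "R"},
    "Amend order": {"Customer": "R", "Order": "RU", "Tax Rate": "R"},
    "Cancel order": {"Order": "RD"},
    "Register customer": {"Customer": "C"},
}


def missing_operations(matrix):
    """Return, per entity, the CRUD letters that no activity performs."""
    seen = {}
    for operations_by_entity in matrix.values():
        for entity, letters in operations_by_entity.items():
            seen.setdefault(entity, set()).update(letters)
    return {
        entity: sorted(set("CRUD") - letters)
        for entity, letters in seen.items()
        if set("CRUD") - letters
    }


gaps = missing_operations(CRUD_MATRIX)
for entity, letters in sorted(gaps.items()):
    print(entity, "is never:", ", ".join(letters))
```
</pre>
<p>In this made-up matrix Tax Rate is read but never created, updated, or deleted, so the review would prompt the question of who maintains the tax rate tables.</p>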
<p><strong>Consistency:</strong> Another challenge with process-oriented techniques is, for large systems, the risk of inconsistency in definition of business objects and events.  For example, I worked on requirements definition of a specialized order processing system.  Separate sub-teams defined field and headquarters processes, and as a result there were incompatible definitions of critical concepts like “customer” and “order”.  Time pressures made it difficult for the two sub-teams to work together to make their work consistent.  A separate data modeling sub-team can provide a reference point for object and event definitions and promote consistency between separate process analysis teams (on COTS installations the product database itself serves as the reference data model for separate process definition teams).</p>
<p><strong>Communications: </strong>Data management professional Peter Carr recounted to me his experience as a consultant on a large project: “the Conceptual Diagram helped us think about all the current state situations, and broadly about the relationships between entities in the organization.  It helped us to ask questions of the business when they were looking to enhance or build new systems to solve business requirements.  Paraphrasing executive colleague Rich Hartt, ‘the enterprise data model is like a piece of art, it provides a picture into the business that offers new insight through its drawing and interpretation’.  He went on to tell the key business leaders that the enterprise data model will continually be changing, but that it helped them gain understanding of their business in a different way than written business rules.”</p>
<p>For relational database applications, data modeling applies the same conceptual tool throughout the development cycle.  A conceptual data model used to define the problem domain uses the same structure and symbols as a physical database design, although of course it uses fewer.  On the other hand, activity diagrams and data flow diagrams are fundamentally different in nature from the software that they describe.  In effect, process designers need to translate analysis artifacts into a different language.  Logical and physical data modelers use exactly the same language as the business analysts who complete the conceptual data model.</p>
<p>My colleague Grayson Gorman cites “<a title="Key reasons for project failure" href="http://blogs.captechconsulting.com/blog/grayson-gorman/key-reasons-project-failure" target="_blank">Poorly Defined / Missed Requirements</a>” as a key contributor to IT project failure.  In my view, making data modeling a more prevalent part of requirements definition could help by improving requirements completeness, consistency, and communication with business participants, and by promoting a seamless transition from requirements to design.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2010/07/use-conceptual-data-modeling-in-requirements-definition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SQL Saturday #30, Richmond Virginia, April 10, 2010</title>
		<link>http://robertlambert.net/2010/04/sql-saturday-30-richmond-virginia-april-10-2010/</link>
		<comments>http://robertlambert.net/2010/04/sql-saturday-30-richmond-virginia-april-10-2010/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 16:44:06 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[Business Analysis]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Database Design]]></category>
		<category><![CDATA[Requirements]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=900</guid>
		<description><![CDATA[Thanks to all who attended my presentations at SQL Saturday on April 10.  Here are the materials from my two presentations: - The Business End of Data Modeling (2.5m powerpoint presentation) - Normalize Metadata For Data Integration Analysis (5.5m full version, zip including presentation and code samples) - Normalize Metadata For Data Integration Analysis (small) [...]]]></description>
			<content:encoded><![CDATA[<p>Thanks to all who attended my presentations at SQL Saturday on April 10.  Here are the materials from my two presentations:</p>
<p>- <a href="http://robertlambert.net/wp-content/uploads/2010/04/BusinessEndOfDataModeling20100410.pps">The Business End of Data Modeling</a> (2.5m powerpoint presentation)</p>
<p>- <a href="http://robertlambert.net/wp-content/uploads/2010/04/NormalizeMetadataForDataIntegrationAnalysis.zip">Normalize Metadata For Data Integration Analysis</a> (5.5m full version, zip including presentation and code samples)</p>
<p>- <a href="http://robertlambert.net/wp-content/uploads/2010/04/NormalizeMetadataForDataIntegrationAnalysissmall.zip">Normalize Metadata For Data Integration Analysis (small)</a> (2m reduced size version, graphics removed from ppt file)</p>
<p>Here are some quick notes for those looking to run the Metadata prototype:</p>
<p>The prototype metadata database includes SQL Server 2008 data definition language and data manipulation language (DDL and DML) needed to create the database and populate it with tables and columns from Microsoft’s AdventureWorksDW sample database. It also includes a sample requirements spreadsheet and source-to-target map, and SSIS jobs to load the spreadsheets to corresponding metadata tables. These define fictional requirements and mappings to populate the AdventureWorksDW FACTInternetSales table from tables in the AdventureWorks sample database.</p>
<p>AdventureWorks and AdventureWorksDW are available here: <a title="AdventureWorks DB and DW downloads" href="http://msftdbprodsamples.codeplex.com/Wikipage" target="_blank">http://msftdbprodsamples.codeplex.com/Wikipage</a> (accessed 4/14/2010)</p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2010/04/sql-saturday-30-richmond-virginia-april-10-2010/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data and Wine?</title>
		<link>http://robertlambert.net/2009/11/data-and-wine/</link>
		<comments>http://robertlambert.net/2009/11/data-and-wine/#comments</comments>
		<pubDate>Sat, 14 Nov 2009 11:44:10 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[Other]]></category>
		<category><![CDATA[CapTech]]></category>
		<category><![CDATA[Data Modeling]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=677</guid>
		<description><![CDATA[Great together, check this out:]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;">Great together, check this out:</p>
<p style="text-align: center;"><img class="size-full wp-image-679 aligncenter" title="DataAndWine" src="http://robertlambert.net/wp-content/uploads/2009/11/DataAndWine3.jpg" alt="DataAndWine" width="380" height="458" /></p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2009/11/data-and-wine/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>DQ, he isn&#8217;t so dumb he just needs glasses</title>
		<link>http://robertlambert.net/2009/05/dq-he-isnt-so-dumb-he-just-needs-glasses/</link>
		<comments>http://robertlambert.net/2009/05/dq-he-isnt-so-dumb-he-just-needs-glasses/#comments</comments>
		<pubDate>Sun, 03 May 2009 20:58:48 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[Data Management]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[Project Management]]></category>
		<category><![CDATA[Alignment]]></category>
		<category><![CDATA[Business Case]]></category>
		<category><![CDATA[CapTech]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Requirements]]></category>
		<category><![CDATA[Strategy]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=418</guid>
		<description><![CDATA[In a recent very thoughtful post on data quality, Paul Erb plays out an analogy comparing data users with Don Quixote and data quality professionals with Sancho Panza, then reverses the analogy to cleverly coin the &#8220;Sancho Panza&#8221; test of data quality professionals.  He encourages data quality professionals promoting the critical role of data quality [...]]]></description>
			<content:encoded><![CDATA[<p>In a recent very thoughtful <a title="I Don't Know Much About Data, but I Know What I Like" href="http://www.typepad.com/services/trackback/6a00d83454da7a69e201156e43e2a4970c" target="_blank">post on data quality</a>, Paul Erb plays out an analogy comparing data users with Don Quixote and data quality professionals with Sancho Panza, then reverses the analogy to cleverly coin the &#8220;Sancho Panza&#8221; test of data quality professionals.  He encourages data quality professionals promoting the critical role of data quality to apply a <em>what would Sancho say </em>test to ensure that they are aligned with the needs and interests of data consumers.</p>
<p>Here&#8217;s Paul&#8217;s description of the Sancho Panza test:</p>
<p style="padding-left: 30px;"><em>Think of Don Quixote [DQ] as the data-quality specialist or even the data management specialist or software vendor, bringing to the world his specialist&#8217;s perspective and vocabulary and enthusiasm, influenced by the books he&#8217;s read, visioning everyday business practices, with his value added, as goldmines for the organization.  Meanwhile Sancho Panza represents the person who does a practical job every day, who knows what works around here and what doesn&#8217;t.</em></p>
<p style="padding-left: 30px;"><em>I advocate to Data Quality (let&#8217;s call it DQ) consultants that they listen to this Sancho Panza, and consider themselves as Don Quixote.  Sancho doesn&#8217;t know much about data, but he knows what he likes&#8230; He&#8217;s open to listening, but slow to change, and he&#8217;ll tell you what he thinks.</em></p>
<p>Paul&#8217;s article reminded me that as a child I thought the problem with Don Quixote was that he tilted at windmills and attempted to ambush acting troupes because of his bad eyesight.  Of course this is not the case, but to me it provides a relevant perspective on data quality in many organizations.</p>
<p>Here&#8217;s the problem I&#8217;ve seen play out on a number of IT application projects:</p>
<ol>
<li>A high level business study recommends replacement or improvement of a current application.</li>
<li>The organization approves the project described in a business case citing benefits named in the business study and costs detailed for infrastructure, package software, and application development, but data-related costs are glossed over or left out entirely.</li>
<li>The project begins with a requirements phase that collects hundreds of imperative statements (&#8220;The system shall&#8230;&#8221;)  from business people who will use the system.</li>
<li>Late in the requirements phase, the team finds that data integration work in system interfaces will be more complex than expected.  A common example: the project requires changes to a feeder application with no documentation and no in-house support expertise.</li>
<li>Project leadership goes back to the sponsor seeking more money.</li>
</ol>
<p>In these situations the business case was incorrect because it did not account for all of the costs of data integration.  I&#8217;ve seen projects weather steps four and five well, but often discovery of previously unseen data complexity starts a disruptive chain of events.  (Sadly for the project manager, such situations are often seen as a failure of project management and corrected accordingly, but that&#8217;s a topic for another post.)</p>
<p>In my view the root cause of unforeseen data complexity on projects is the lack of a data constituency in current IT. It is only recently that the success of companies like Google and Amazon has motivated emergence of data as a key business resource in the collective consciousness. Famous success stories notwithstanding (<a title="Show Me the Money: A DM/BI Business Value Primer" href="http://www.google.com/url?sa=t&amp;source=web&amp;ct=res&amp;cd=4&amp;url=http%3A%2F%2Fwww.information-management.com%2Fspecialreports%2F2009_133%2Fbi_data_management_business_value-10015103-1.html&amp;ei=d_j9SaV_kfgwpJTlxwQ&amp;usg=AFQjCNE695M1rfsa2Ex7jvl4eA-_W9S75A" target="_blank">see this link</a>), there are relatively few senior IT managers with data quality backgrounds.  Conversely, many rose through the ranks of the infrastructure, application development, or business (process) analysis groups.</p>
<p>It will be a while before, for example, a Mobil CIO&#8217;s predecessor jobs include definition of a metadata repository or elimination of multipurpose data, but in the meantime here&#8217;s what we can do:  <a title="Big project coming up? Learn to two-step." href="http://robertlambert.net/2009/03/big-project-two-step/" target="_blank">add a business case to the application lifecycle as the last step in requirements</a>.  Stop the project when the real costs are known, recalculate the cost/benefit, and ask the sponsors if the project should continue.  Give Sancho (in this case the project team) a chance to speak to the reality of the situation, and hand to Don Quixote (project sponsors) the eyeglasses of in-depth visibility into real costs. If the decision is to move ahead with the project, then all share the same vision and the sponsors have endorsed the actual project, not the fuzzy image from earlier on that might have been a windmill.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2009/05/dq-he-isnt-so-dumb-he-just-needs-glasses/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>No business value in nulls</title>
		<link>http://robertlambert.net/2009/04/no-business-value-in-nulls/</link>
		<comments>http://robertlambert.net/2009/04/no-business-value-in-nulls/#comments</comments>
		<pubDate>Sun, 05 Apr 2009 22:10:43 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[Business Analysis]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Database Design]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=345</guid>
		<description><![CDATA[It seems I&#8217;m frequently in conversations about using null to represent a business value.  To paraphrase, say there are credit and cash customers, and there&#8217;s a suggestion to set &#8220;Customer_Type&#8221; to &#8220;C&#8221; for credit and null for cash.  To data and database professionals this is obviously a bad idea, but it&#8217;s not obvious from a [...]]]></description>
			<content:encoded><![CDATA[<p>It seems I&#8217;m frequently in conversations about using null to represent a business value.  To paraphrase, say there are credit and cash customers, and there&#8217;s a suggestion to set &#8220;Customer_Type&#8221; to &#8220;C&#8221; for credit and null for cash.  To data and database professionals this is obviously a bad idea, but it&#8217;s not obvious from a business point of view.</p>
<p>In a database null means that there is literally no value, or the value is indeterminate.  Null is not the same as zero or blank.  When a database operation involves nulls the result can be difficult to predict for someone not practiced in SQL.  In many cases the answer is null.  For example, 1+0=1 but 1+null=null.  In plain English, what you&#8217;re asking the DBMS to do in the latter case is to add 1 to [I don't know what], and of course 1+[I don't know what] equals [I don't know what].</p>
<p>So, if you use null to represent a business value then you might not get the results you&#8217;re looking for when you try to get business answers out of your database.  For example, say &#8220;C&#8221; represents credit customers and null represents cash customers, and you have 2 cash customers and 1 credit customer.  In SQL Server, if you tally your cash customers by applying the Count function to Customer_Type, or by filtering on Customer_Type = null, the answer isn&#8217;t 2, it is 0: Count skips null values, and under standard null handling a comparison with null is never true.</p>
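<p>Here is a minimal sketch of that behavior in Python.  It assumes an in-memory SQLite database as a stand-in for SQL Server, so details may differ from product to product, and the table and customer names are hypothetical.  It runs the arithmetic example and three ways of counting the cash customers.</p>
<pre>
```python
import sqlite3

# In-memory SQLite database standing in for the DBMS (an assumption made
# for this sketch; the post itself discusses SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, customer_type TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    # Two cash customers (null type) and one credit customer ('C').
    [("Ann", None), ("Raj", None), ("Lee", "C")],
)


def scalar(sql):
    """Run a query and return the single value it produces."""
    return conn.execute(sql).fetchone()[0]


# Arithmetic with null yields null, which Python receives as None.
one_plus_null = scalar("SELECT 1 + NULL")

# Comparing with '= NULL' is never true, so no rows match.
count_by_equals = scalar(
    "SELECT COUNT(*) FROM customer WHERE customer_type = NULL"
)

# COUNT(column) skips nulls, even when the rows themselves are selected.
count_of_column = scalar(
    "SELECT COUNT(customer_type) FROM customer WHERE customer_type IS NULL"
)

# Only IS NULL combined with COUNT(*) finds the two cash customers.
count_is_null = scalar(
    "SELECT COUNT(*) FROM customer WHERE customer_type IS NULL"
)

print("1 + NULL:", one_plus_null)
print("cash customers via = NULL:", count_by_equals)
print("cash customers via COUNT(customer_type):", count_of_column)
print("cash customers via IS NULL:", count_is_null)
```
</pre>
<p>Only the query written with null handling specifically in mind returns 2, which is a lot to ask of anyone querying the data for a business answer.</p>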
<p>That&#8217;s one example of why it&#8217;s not a good idea to try to represent a business fact with a null value.  It doesn&#8217;t make business sense and in this case the DBMS, correctly, won&#8217;t make sense of it for you.</p>
<p>To be clear, whether or not a given database column permits null values is an entirely different question, best left to database designers.  For example, a database table might record which patient occupies which hospital bed.  It may be reasonable and correct to assign a null patient ID if the bed is currently available.  However, there are alternative methods of representing this situation, and the database designer should be free to choose the right alternative taking into account the specifics of the application under development.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2009/04/no-business-value-in-nulls/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A proposal for Enterprise Information Architecture</title>
		<link>http://robertlambert.net/2009/04/proposal-for-eia/</link>
		<comments>http://robertlambert.net/2009/04/proposal-for-eia/#comments</comments>
		<pubDate>Wed, 01 Apr 2009 21:08:20 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[Alignment]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Strategy]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=309</guid>
		<description><![CDATA[While many organizations understand the value of managing the information resource, for many others information management remains abstract and difficult to define.  In an effort to make it concrete here’s a hypothetical proposal to provide an Enterprise Information Architect for a hypothetical organization that really needs one. Today: inconsistent data of uncertain quality blurs enterprise [...]]]></description>
			<content:encoded><![CDATA[<p>While many organizations understand the value of managing the information resource, for many others information management remains abstract and difficult to define.  In an effort to make it concrete here’s a hypothetical proposal to provide an Enterprise Information Architect for a hypothetical organization that really needs one.</p>
<p><strong>Today: inconsistent data of uncertain quality blurs enterprise view and restricts planning<br />
</strong></p>
<p>Today managers, planners, and analysts lack the information required to run the organization as a single enterprise rather than a collection of diverse units.</p>
<ul>
<li>Data quality in IT applications varies to the point that, outside financials, it is impossible to gather consistent data supporting an enterprise view of operations.
<ul style="padding-left: 30px;">
<li>Application development efforts have focused narrowly on departmental interests without accounting for enterprise concerns, making application data incomplete in describing business processes and inconsistent with data in other applications.</li>
<li>Focus on departmental concerns and tight development timelines has resulted in incomplete validation of data critical to the enterprise but not critical to the application’s focus.  For example, customer demographics are not critical to the sales process and therefore zip codes and telephone numbers are not consistently collected at point of sale, substantially reducing value of market analysis based on sales data.</li>
</ul>
</li>
<li>Enterprise planners work with only the highest level summaries of operational data, those summaries suffer large margins of error, and planners cannot definitively answer questions required to make critical business decisions.</li>
<li>Regulators have questioned the validity and repeatability of reporting because of the organization&#8217;s heavy reliance on spreadsheets and manual processes in gathering and compiling data for reports.</li>
</ul>
<p><strong>Solution: enable sound planning and management by identifying data assets and setting processes to manage them<br />
</strong></p>
<p>Empower an Enterprise Information Architect to lead an effort that (1) identifies data that describes the organization, (2) defines how to integrate and improve quality of that data, and (3) improves the ability of information technology to maintain data quality.</p>
<p>(1) Lead definition of an Enterprise Information Architecture identifying information required to manage the organization as a single integrated enterprise, and data quality standards that ensure that data supports enterprise goals.</p>
<ul>
<li>Identify and define events and objects critical to the enterprise</li>
</ul>
<ul>
<li>Identify and define relationships among those events and objects and attributes that describe them</li>
</ul>
<ul>
<li>Classify data managed by the organization by type (operational, statistical, financial, decision support, etc.) and define standards for managing and integrating each type.</li>
<li>Compile the above into a plan that explicitly supports the enterprise strategic plan</li>
</ul>
<p>(2) Working with senior business managers, put in place a program of data quality improvement that plans and executes specific measures and sustained commitment to improving data quality in business processes and IT applications</p>
<ul>
<li>Identify the business group responsible for maintaining quality and integrity of each business object, event, relationship, and attribute</li>
</ul>
<ul>
<li>Identify for each data item of interest to the enterprise its “system of origination” and “system of record”.</li>
</ul>
<blockquote>
<ul>
<li>System of origination is the application that provides the entry point of a given data object to the organization.</li>
<li>System of record is the application that is the source of record for the data object.</li>
</ul>
</blockquote>
<ul>
<li>Define and deploy standards and practices for business process and IT application definition that support data quality and integrity standards</li>
</ul>
<p>(3) Working with senior IT managers define and put in place standards for application requirements definition, data management, and metadata management to</p>
<ul>
<li>Define and deploy application development and interface standards that support data quality objectives.</li>
<li>Ensure that application development efforts support enterprise data quality</li>
<li>Continually monitor new developments in data management best practices and make that information available to the enterprise.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2009/04/proposal-for-eia/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Beware the devils in the details of data integration</title>
		<link>http://robertlambert.net/2009/03/data-integration-devil-in-details/</link>
		<comments>http://robertlambert.net/2009/03/data-integration-devil-in-details/#comments</comments>
		<pubDate>Sun, 01 Mar 2009 14:26:50 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[Data Management]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[Project Management]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Database Design]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=189</guid>
		<description><![CDATA[Much of today’s IT application development – custom or off-the-shelf – involves integrating data from legacy systems, third-party software products and external data sources such as demographics or mail lists.  More often than not, data integration is unexpectedly complex, either due to data quality issues or the nature of the data integration itself. Here [...]]]></description>
			<content:encoded><![CDATA[<div class="wp-caption alignright" style="width: 259px"><a href="http://www.information-management.com/infodirect/20021004/5854-1.html"><img title="Information Management" src="http://www.information-management.com/media/ui/informationmgmt_logo.gif" alt="Excerpt from Illusions, Allusions – Let’s Get Real about Database Design, InfoManagement Direct, October 4, 2002" width="249" height="73" /></a><p class="wp-caption-text">Excerpt from &quot;Illusions, Allusions – Let’s Get Real about Database Design&quot;, October 4, 2002</p></div>
<p>Much of today’s IT application development – custom or off-the-shelf – involves integrating data from legacy systems, third-party software products and external data sources such as demographics or mail lists.  More often than not, data integration is unexpectedly complex, due either to data quality issues or to the nature of the data integration itself.</p>
<p>Here are some typical examples:</p>
<ul>
<li>One ERP package uses the same table for both Sales Quotes and Sales Orders. Columns that mean one thing for Quotes mean something quite different for Orders. One team extracting data from this ERP package continually mixed up, for example, Date Received on the Quote with Date Prepared for the Order. The designer who blindly copies data from input systems can propagate these issues. In this case, the correct solution is to extract the two documents into separate tables in the destination system, making each column describe either a quote or an order, not both.</li>
<li>Marketing databases often store data purchased from several third parties on the same set of customers. These sources usually include overlapping columns with different values. For the same customer, different sources might store different values for the person’s address, credit scores or even name. It is sometimes important to preserve all of the columns from all of the sources and to maintain the information on where the data came from as well as what its value was. This can result in a messy database design, where columns again carry dual meaning: their value and their source.</li>
<li>Codes from legacy databases tend to evolve into complex forms, embedding more and more information into a single field. This is perhaps a natural reaction to the slow evolution of the system relative to changes in business, as users shoehorn information into the system that it was not designed to store. For instance, in a legacy system a one-character code might classify customers by &#8220;customer category,&#8221; with values 1 for small business, 2 for mid-size, and 3 for Fortune 5000. Users might add codes 4, 5 and 6 for corresponding values for aerospace customers, then 7 for federal government, and so on. The database designer must know the data well to extract each embedded concept into a different destination column.</li>
</ul>
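<p>To make the last example concrete, here is a minimal Python sketch of splitting the overloaded &#8220;customer category&#8221; code into two destination columns, size and sector. The code values 1 through 7 come from the example above; the names <code>CustomerClass</code>, <code>LEGACY_CATEGORY</code>, and <code>decode_category</code>, and the sector label &#8220;commercial&#8221; for codes 1 through 3, are my assumptions for illustration.</p>

```python
from typing import NamedTuple, Optional


class CustomerClass(NamedTuple):
    size: Optional[str]  # None when the legacy code carries no size information
    sector: str


# Decoding table for the one-character "customer category" code.
# Codes 1-7 follow the example in the text; the label "commercial" for
# codes 1-3 is an assumption, since the legacy code never named that concept.
LEGACY_CATEGORY = {
    "1": CustomerClass("small business", "commercial"),
    "2": CustomerClass("mid-size", "commercial"),
    "3": CustomerClass("Fortune 5000", "commercial"),
    "4": CustomerClass("small business", "aerospace"),
    "5": CustomerClass("mid-size", "aerospace"),
    "6": CustomerClass("Fortune 5000", "aerospace"),
    "7": CustomerClass(None, "federal government"),
}


def decode_category(code):
    """Split one legacy code into separate size and sector values.

    Raises ValueError for unmapped codes so that new, undocumented values
    surface during source analysis instead of loading silently.
    """
    try:
        return LEGACY_CATEGORY[code.strip()]
    except KeyError:
        raise ValueError(f"unmapped customer category code: {code!r}") from None


print(decode_category("5"))
print(decode_category("7"))
```

<p>Failing loudly on an unknown code is a deliberate choice in this sketch: users tend to keep inventing values, and the integration team is better off hearing about code 8 during testing than discovering it in a report.</p>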
<p>When data integration is part of a project, expect complexity and leave room in interface development estimates for devils in the details of source system analysis and integration design.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2009/03/data-integration-devil-in-details/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data modeling: essential business skill</title>
		<link>http://robertlambert.net/2009/01/data-modeling-essential-business-skill/</link>
		<comments>http://robertlambert.net/2009/01/data-modeling-essential-business-skill/#comments</comments>
		<pubDate>Fri, 30 Jan 2009 15:42:47 +0000</pubDate>
		<dc:creator>Bob</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[Business Analysis]]></category>
		<category><![CDATA[Data Modeling]]></category>

		<guid isPermaLink="false">http://robertlambert.net/?p=8</guid>
		<description><![CDATA[Everyone involved in managing or improving a business process should understand data modeling.  For real. And almost everyone is in a position to improve a business process by understanding the current one and making suggestions to improve it.  Understanding a business process means understanding business objects, events, the relations among them, and the business rules [...]]]></description>
			<content:encoded><![CDATA[<p>Everyone involved in managing or improving a business process should understand data modeling.  For real. And almost everyone is in a position to improve a business process by understanding the current one and making suggestions to improve it.  Understanding a business process means understanding business objects, events, the relations among them, and the business rules that govern the process, and that&#8217;s what data modeling is all about.</p>
<p>Sure, data modeling <em>is</em> the first step to designing a database, but that&#8217;s just a coincidence.  A well designed database is well designed both because it&#8217;s efficient <em>and </em>because it matches business needs.</p>
<p>The first step in data modeling is understanding <em>entities</em>. An entity is like a business object: examples may include customer, order, product, patient, blogger, post, or whatever. The next step is to understand <em>relationships</em> among the entities, like <em>a customer may place many orders</em> or <em>a post is written by a blogger</em>. Then the data modeler thinks about the <em>attributes</em> of the entities, like the <em>name</em> of the blogger, the <em>price</em> of the product, and so on. The attributes and relationships are where the business rules come from. Examples of rules may be &#8220;<em>a post is written by exactly one blogger</em>&#8221; or &#8220;<em>every order must have a shipping address</em>.&#8221; It&#8217;s not really a sequential, step-by-step thing, more like a series of really interesting brainstorming sessions, but you get the idea.</p>
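<p>If it helps to see those ideas written down, here is a small Python sketch of the entities, relationships, attributes, and rules mentioned above. It is only one of many ways to express a model, and the class layout and sample values are my own assumptions, not a recommended design.</p>

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Blogger:                 # entity
    name: str                  # attribute


@dataclass
class Post:                    # entity
    title: str                 # attribute
    # Relationship and rule: a post is written by exactly one blogger.
    # The field is required, so a Post cannot be created without an author argument.
    author: Blogger


@dataclass
class Order:                   # entity
    shipping_address: str      # attribute

    def __post_init__(self):
        # Business rule: every order must have a shipping address.
        if not self.shipping_address.strip():
            raise ValueError("every order must have a shipping address")


@dataclass
class Customer:                # entity
    name: str                  # attribute
    # Relationship: a customer may place many orders.
    orders: List[Order] = field(default_factory=list)


# Sample values below are invented for illustration.
customer = Customer(name="Acme Corp")
customer.orders.append(Order(shipping_address="1 Main St"))
post = Post(title="Hello", author=Blogger(name="Bob"))
print(len(customer.orders), post.author.name)
```

<p>Notice that nothing here is about tables or indexes. The classes just state what the business things are, how they relate, and which rules always hold.</p>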
<p>Data modeling can get really complex, especially when it includes enough detail to generate an actual database, but that&#8217;s beside the point.  We&#8217;re talking about clear thinking about business things, events, relationships, and rules.  The point is that this kind of thinking can enable a business person to understand better the things, events, and rules of a business, and then to define more rational processes based on that understanding.</p>
<p>Today a lot of the problems that data modeling helps reveal relate to overcomplicated org charts and artificial complexities of legacy information systems.  Often the business evolves around the system, and it takes clear thinking to untangle the process spaghetti that results.</p>
<p>I worked with one company that organized itself by four different product line channels, served by matrixed support functions like accounts receivable and claims.  Worked pretty well, except when you wanted to know all of your contacts with a given customer, for example.  The same customer could have had a different record for each channel, then more information was socked away in the matrixed support functions.   Furthermore, the customer records were all laid out differently, and had different addresses, contact information, billing instructions, and so on.</p>
<p>Maybe I&#8217;m oversimplifying, but shouldn&#8217;t one customer have one file with the same information that everyone uses?  It seems to me that if applied in the real world this data-oriented perspective could really make things simpler and more cost effective.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertlambert.net/2009/01/data-modeling-essential-business-skill/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
