I’ve often thought that conceptual data modeling was an underused tool in the arsenal available to requirements analysts, and in a recent conversation I found that many were surprised that it would be used in the requirements phase at all. Checking the Business Analysis Body of Knowledge (BABOK), I found data modeling listed among the tools available to requirements analysts “to describe the concepts relevant to a domain, the relationships between those concepts, and information associated with them.” There’s also Steve Hoberman’s excellent book on the topic, Data Modeling for the Business, an introduction to data modeling aimed at a business audience.
Data modeling has long been one of my requirements analysis tools of choice for custom operational applications. To me, using data modeling techniques in requirements analysis reduces errors by improving requirements completeness, consistency, and communication, and provides unique continuity between analysis and design. As David Elliott told me in conversation, “development of a data model uncovers many opportunities for clarification of existing requirements, or uncovering of additional detail. At the very least, it confirms to one’s business customer that the BSA understands and can graphically demonstrate many business rules and relationships.”
I’ll hasten to add these caveats. (1) Perhaps strangely, conceptual data modeling is not useful in the same way in requirements for informational systems like data warehouses and marts (I’ll save that discussion for another post). (2) Requirements definition for commercial off-the-shelf (COTS) applications follows a different methodology in which data modeling might be less applicable. (3) This post is not about database design, but rather about use of conceptual data modeling as a tool for organizing and validating requirements.
Conceptual vs. logical data modeling
It is easy to see why in practice there are varying definitions of the different types of data models. The Wikipedia entry on data modeling reflects the standard terminology based on the ANSI four-level database architecture, but features a confusing diagram that to me blurs the distinction between conceptual and logical models. The entries on Logical Data Model and Conceptual Data Model make them sound like the same thing: implementation-independent representations of business data. Then, the entry on Database Design contradicts them by stating that the logical model “contains all the needed logical and physical design choices and physical storage parameters needed to … create a database.”
For this post I’ll follow the definitions offered in Simsion and Witt’s Data Modeling Essentials:
- Conceptual data modeling identifies a set of data structures that will meet requirements, focusing on business and not on technical or DBMS-specific concerns
- Subsequent logical data modeling maps the conceptual model to structures supported by the particular DBMS, finalizing the design in DBMS-appropriate constructs but not yet optimizing for performance, which comes next in physical modeling
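To make the distinction concrete, here is a minimal sketch of one common conceptual-to-logical mapping step for a relational DBMS: a one-to-many relationship becomes a foreign key on the "many" side, and a many-to-many relationship becomes an association table. The entities, the `Relationship` class, and the `to_logical_tables` function are all hypothetical names invented for illustration, and the naming rules are assumptions rather than anything prescribed by the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    left: str         # entity on the "one" side (or either side for M:N)
    right: str        # entity on the "many" side (or the other side for M:N)
    cardinality: str  # "1:N" or "M:N"


# Hypothetical conceptual model: business entities and how they relate.
ENTITIES = ["Customer", "Order", "Product"]
RELATIONSHIPS = [
    Relationship("Customer", "Order", "1:N"),
    Relationship("Order", "Product", "M:N"),
]


def to_logical_tables(entities, relationships):
    """Map a conceptual model to table names and key columns."""
    # Every entity becomes a table with a surrogate key column (an assumption).
    tables = {name: [f"{name.lower()}_id"] for name in entities}
    for rel in relationships:
        left_key = f"{rel.left.lower()}_id"
        right_key = f"{rel.right.lower()}_id"
        if rel.cardinality == "1:N":
            # Foreign key goes on the "many" side.
            tables[rel.right].append(left_key)
        elif rel.cardinality == "M:N":
            # Many-to-many is resolved into an association table.
            tables[f"{rel.left}{rel.right}"] = [left_key, right_key]
        else:
            raise ValueError(f"Unsupported cardinality: {rel.cardinality}")
    return tables


logical = to_logical_tables(ENTITIES, RELATIONSHIPS)
for table, columns in logical.items():
    print(table, columns)
```

The conceptual side stays in business terms (a customer places orders, an order contains products), while the association table appears only once the model is mapped to relational structures.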
Use of data modeling in requirements definition
Conceptual data modeling is hardly an outlier technique in requirements definition:
- Perhaps in reaction to problems experienced by adopters of Structured techniques in the 80s, data modeling was the cornerstone analytical technique in Clive Finkelstein’s and James Martin’s widely-adopted Information Engineering methodology.
- The BABOK includes class modeling, data modeling’s object-oriented cousin, in its chapter on data modeling. Class modeling is a core technique of object-oriented analysis.
- Scott Ambler’s Agile Modeling site offers conceptual (or “slim”) data modeling as an option in the initial envisioning stage.
- Informally searching requirements definition templates available on the web, I found that about a third recommend including conceptual data models.
Benefits of data modeling in requirements analysis
The BABOK separates requirements gathering from requirements analysis, defining requirements analysis as an essential step to organize, prioritize, and validate elicited requirements. Elicited requirements are the business objectives of the system. The analysis step organizes those objectives in a way that both makes sense to the business and guides subsequent application design. Conceptual data modeling in this stage helps ensure requirements completeness, consistency, and communications:
Completeness: In my experience most requirements analysis is process-based, and the most common tool is the “swim lane” activity diagram. While such techniques are essential for understanding complex processes, they can miss requirements that aren’t directly involved in the process itself. For example, a complex process might reference federal, state, and local tax rates by zip code. Analysts who are heads-down in defining the process might neglect the need for at least annual refresh of the tax rate tables. Data modelers thinking in terms of business objects and events and their life cycles would be less likely to miss that one. This kind of review is formalized in the “CRUD Matrix,” a table identifying which business activities create, read, update, or delete which business entities.
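As a rough illustration of how a CRUD matrix exposes a gap like the tax rate refresh, here is a minimal sketch. The activities, entities, and the `missing_operations` helper are hypothetical, made up for this example: the matrix maps each business activity to the entities it touches, and the helper reports which of create, read, update, and delete no activity performs on each entity.

```python
# Hypothetical CRUD matrix: activity -> {entity: operations performed}.
CRUD_MATRIX = {
    "Register customer": {"Customer": "C"},
    "Update customer": {"Customer": "RU"},
    "Enter order": {"Customer": "R", "Order": "C", "TaxRate": "R"},
    "Amend order": {"Order": "RU", "TaxRate": "R"},
    "Cancel order": {"Order": "RD"},
}


def missing_operations(matrix, required="CRUD"):
    """Return {entity: operations that no activity performs on it}."""
    seen = {}
    for ops_by_entity in matrix.values():
        for entity, ops in ops_by_entity.items():
            seen.setdefault(entity, set()).update(ops)
    gaps = {}
    for entity, ops in seen.items():
        missing = "".join(op for op in required if op not in ops)
        if missing:
            gaps[entity] = missing
    return gaps


gaps = missing_operations(CRUD_MATRIX)
for entity, missing in gaps.items():
    print(f"{entity}: no activity covers {missing}")
```

In this made-up matrix, TaxRate is only ever read, so the check flags that nothing creates or updates it, which is the missing refresh requirement. Not every gap is an error (a business may never delete customers), so the output is a list of questions to take back to the business rather than a list of defects.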
Consistency: Another challenge with process-oriented techniques is, for large systems, the risk of inconsistency in definition of business objects and events. For example, I worked on requirements definition of a specialized order processing system. Separate sub-teams defined field and headquarters processes, and as a result there were incompatible definitions of critical concepts like “customer” and “order”. Time pressures made it difficult for the two sub-teams to work together to make their work consistent. A separate data modeling sub-team can provide a reference point for object and event definitions and promote consistency between separate process analysis teams (on COTS installations the product database itself serves as the reference data model for separate process definition teams).
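A reference model makes this kind of divergence easy to detect mechanically. The sketch below assumes each sub-team records the attributes it expects on each entity. The team dictionaries, attribute names, and the `definition_conflicts` function are hypothetical and are not drawn from the project described above.

```python
# Hypothetical entity definitions captured by two process analysis sub-teams.
FIELD_TEAM = {
    "Customer": {"customer_id", "name", "region"},
    "Order": {"order_id", "customer_id", "ship_date"},
}
HQ_TEAM = {
    "Customer": {"customer_id", "name", "credit_limit"},
    "Order": {"order_id", "customer_id", "ship_date"},
}


def definition_conflicts(team_a, team_b):
    """Return {entity: (attributes only in team_a, attributes only in team_b)}."""
    conflicts = {}
    for entity in sorted(team_a.keys() & team_b.keys()):
        only_a = team_a[entity] - team_b[entity]
        only_b = team_b[entity] - team_a[entity]
        if only_a or only_b:
            conflicts[entity] = (sorted(only_a), sorted(only_b))
    return conflicts


conflicts = definition_conflicts(FIELD_TEAM, HQ_TEAM)
for entity, (only_field, only_hq) in conflicts.items():
    print(f"{entity}: field-only {only_field}, HQ-only {only_hq}")
```

A comparison like this only catches structural differences. Two teams can agree on the attributes and still mean different things by “customer,” which is why the shared definitions maintained by the data modeling sub-team matter.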
Communications: Data management professional Peter Carr recounted to me his experience as a consultant on a large project: “the Conceptual Diagram helped us think about all the current state situations, and broadly about the relationships between entities in the organization. It helped us to ask questions of the business when they were looking to enhance or build new systems to solve business requirements. Paraphrasing executive colleague Rich Hartt, ‘the enterprise data model is like a piece of art, it provides a picture into the business that offers new insight through its drawing and interpretation.’ He went on to tell the key business leaders that the enterprise data model will continually be changing, but that it helped them gain understanding of their business in a different way than written business rules.”
For relational database applications, data modeling applies the same conceptual tool throughout the development cycle. A conceptual data model used to define the problem domain uses the same structure and symbols as a physical database design, although of course it uses fewer. On the other hand, activity diagrams and data flow diagrams are fundamentally different in nature from the software that they describe. In effect, process designers need to translate analysis artifacts into a different language. Logical and physical data modelers use exactly the same language as the business analysts who complete the conceptual data model.
My colleague Grayson Gorman cites “Poorly Defined / Missed Requirements” as a key contributor to IT project failure. In my view, making data modeling a more prevalent part of requirements definition could help by improving requirements completeness, consistency, and communication with business participants, and by promoting a seamless transition from requirements to design.