It seems I’m frequently in conversations about using null to represent a business value. To paraphrase, say there are credit and cash customers, and there’s a suggestion to set “Customer_Type” to “C” for credit and null for cash. To data and database professionals this is obviously a bad idea, but it’s not obvious from a business point of view.
In a database null means that there is literally no value, or the value is indeterminate. Null is not the same as zero or blank. When a database operation involves nulls the result can be difficult to predict for someone not practiced in SQL. In many cases the answer is null. For example, 1+0=0 but 1+null=null. In plain English, what you’re asking the DBMS to do in the latter case is to add 1 to [I don’t know what], and of course 1+[I don’t know what] equals [I don’t know what].
So, if you use null to represent a business value then you might not get the results you’re looking for when you try to get business answers out of your database. For example, say “C” represents credit customers and null represents cash customers, and you have 2 cash and 1 credit customers. In SQL Server if you use a Count function to tally all of your cash customers the answer isn’t 2, it is null.
That’s one example of why it’s not a good idea to try to represent a business fact with a null value. It doesn’t make business sense and in this case the DBMS, correctly, won’t make sense of it for you.
To be clear, whether or not a given database column permits null values is an entirely different question, best left to database designers. For example, a database table might record which patient occupies which hospital bed. It may be reasonable and correct to assign a null patient ID if the bed is currently available. However, there are alternative methods of representing this situation, and the database designer should be free to choose the right alternative taking into account the specifics of the application under development.
It seems to me that a null might imply a business value, depending on the scenario. In the examples you provided, I’d agree nulls don’t make much sense.
What if we have a database table tracking a baseball team’s players’ batting averages?
– You might have a value 0.500 for a player who gets on base every other time.
– Or 0.000 if a player has never reached first base (no pun intended)
– Or null for a player who hasn’t been to bat yet?
Now, to your point you could probably achieve the same result with an alternate design: perhaps a separate player table from player_average or whatever, such that only players who have been at bat are in the player_average table. But this might not always be the best design, or perhaps not possible due to other design constraints.
Similar examples exist in the Java world – do I return a null Collection or a Collection of size 0? I think, like many other problems in the database/software architecture world, what’s important is to approach the problem consistently and be sure there’s a common understanding across the team.
Andy, great examples, and I definitely agree with your approach. Still, it seems to me they are about how to represent the business requirement in a design. Nulls are a valid tool for the DB/code-literate in design, but in terms of business requirements I still think probably never. I think of all business requirements working on normalized data, thereby giving the designer/developer full control of design decisions and not cutting off any data design options. Given that, to me the player_average table is a (probably appropriate) denormalization of the player_at_bat_event table, of which there would be no row for the player who has not yet faced a pitcher. So I think we’re coming from the same place, just in different stages of the development life cycle. I suppose a clarifying comment is in order: “no business value in nulls until design.”