There’s something clearly wrong about data modelling.

First of all, there are exceptional books like Silverston’s « The Data Model Resource Book », and then there are « models » in the real world. Just look at your CRM or even (maybe) your ERP: why do they work with the very same data yet have such different models, when « reference models » already exist?
Think about it: why do we need all those « dictionaries » when we could have a unified model?
A thought experiment
First of all, let’s say we work in a B2B environment (to make it easy!) and all we want is a reference model for a « Party ». Basically, the main « object » of this model is a « company » or more generally an « organization ». It is a clear and identifiable object (using some kind of VAT/TVA/SIREN/INN or other local code). We will not go any deeper with this model (addresses, etc).
Now, CRM. We’ve decided to use the very same « reference » model. We’ve developed our own CRM to do that!
Ok, now I have Janette Doe « from IBM » who contacts me to buy something (or just to get a quote). This is a new contact, so I have to create it in my CRM, but I want to attach it to the company (right?).
Which one? Should I ask for the VAT number? What if it is not « that same IBM » (yes, there are many companies with that same name)? What should I do with my model?
I’m in this strange state where I have to give up and say « I cannot really identify her company, because the only thing I know is the name of the company, which is not unique ».
This is nearly always the case at this step of the process.
So, I give up:
- either I create my model around a « contact » (the name of the company is just a field and there is no identifiable entity like « company »)
- …or I say « I have no way to model the companies, so I have CRM accounts » (whatever it is, because IRL there are no « accounts » – it’s a pure abstraction)
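The dead end above fits in a few lines of Python. This is only a sketch: the `Organization` class, the `find_by_name` helper and the registration codes are made up for the illustration, and the « reference data » is just a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organization:
    registration_id: str  # hypothetical stand-in for a VAT/SIREN/INN code
    name: str


# Hypothetical reference data: two distinct legal entities share one name.
REFERENCE = [
    Organization("FR-0001", "IBM"),
    Organization("US-0002", "IBM"),
    Organization("FR-0003", "Acme"),
]


def find_by_name(name: str) -> list[Organization]:
    """Return every organization whose name matches, ignoring case."""
    return [o for o in REFERENCE if o.name.lower() == name.lower()]


candidates = find_by_name("ibm")
# More than one candidate: the name alone does not identify the company.
print(len(candidates))  # 2
```

With only « from IBM » in hand, the lookup returns a list, not an entity, and nothing in the model tells me which element to pick.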
What can we get from this experiment?
First of all, our rigid ER modelling has no simple way to cope with uncertainty. Either « we have everything » (but then I have no clients, because everyone has to give me lots of technical details just so that I can use my mentally rigid data model) or « we have nothing » (i.e. we forget about companies in our B2B environment).
Yes, there are lots of different solutions, like:
- create a company with a flag « incomplete » so that we can identify it… and complete or remove it later
- keep the company name in a plain column and add a nullable FK to the company object (to be filled in when we have more information)
- etc
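For the record, here is roughly what the first two workarounds look like when combined, as a minimal Python sketch. The class and field names (`Company`, `Contact`, `registration_id`, `incomplete`) are my own assumptions for the example, not a reference model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Company:
    name: str
    registration_id: Optional[str] = None  # unknown at first contact

    @property
    def incomplete(self) -> bool:
        # The "incomplete" flag is derived: no identifier means not identified.
        return self.registration_id is None


@dataclass
class Contact:
    full_name: str
    company_name: str                  # free text, exactly as the contact gave it
    company: Optional[Company] = None  # nullable reference, filled in later (if ever)


janette = Contact("Janette Doe", "IBM")  # no company attached yet
placeholder = Company("IBM")             # exists, but flagged incomplete
```

Both variants compile and run, which is exactly the problem: nothing in them makes anybody come back and fill in the missing identifier.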
These only look like solutions. Seen from the user’s point of view, they are all the same: if the user has no need to enter a VAT/DUNS/whatever, he/she will not.
That’s the second observation: our model works well for a given population that has more or less homogeneous expectations and constraints. Otherwise « it works not ».
Too many words
Hey, I’m nearly there.
My overall conclusions:
- if you acquire uncertain data and you want « certain » data further down the line (CRM -> ERP -> FI), you will have more than one model. Just add some BPM and it’ll work;
- you will (probably) never ever have only one Master-Data-whatever;
- models generally reflect not reality, but what we know about reality at a given step of the process;
- don’t model an object if you cannot identify this object, just live with it.
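The first conclusion can be sketched as two models joined by one process step. Again, this is a hedged illustration: `CrmAccount`, `ErpCustomer` and `promote` are hypothetical names, and a real BPM step would involve a person or a workflow engine, not a single function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrmAccount:
    """What we know early on: a name, maybe an identifier."""
    name: str
    registration_id: Optional[str] = None


@dataclass(frozen=True)
class ErpCustomer:
    """What the ERP requires: an identified legal entity."""
    registration_id: str
    name: str


def promote(account: CrmAccount) -> ErpCustomer:
    """Hypothetical process step: CRM account -> ERP customer."""
    if not account.registration_id:
        # The uncertain model is allowed to exist; it just cannot move on yet.
        raise ValueError(f"cannot promote {account.name!r}: no registration id")
    return ErpCustomer(account.registration_id, account.name)


lead = CrmAccount("IBM")           # fine for the CRM
lead.registration_id = "FR-0001"   # learned later, e.g. when the order is signed
customer = promote(lead)           # now fine for the ERP too
```

Each model stays honest about what is known at its own step, and the certainty is added by the process, not by the schema.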
Hopefully that helps explain why you will (probably) not have a full common data dictionary, a full common MDM, or a common anything, except in very specific cases.
Good health to you and to your models.