The evolution of one MDM solution

This story is purely the product of my imagination. All facts, characters, and incidents portrayed are fictitious. No identification with actual persons, places, buildings, or products is intended or should be inferred.
…and moreover I have my client’s permission to tell it here… 🙂

Older days

Imagine a group of companies running different ERPs and accounting systems, but having some kind of central governance… That’s what I call NO MDM. 🙂

A centralized view is more or less maintained by the Sales, Analytics and Risk departments, but no one has a real picture of what’s going on in detail.

The most important object to manage is the “Third Party” (or just « Party »). We are not talking about « Customer Data Management » because a « Party » can play multiple roles within every contract (it may be a “client”, “address”, “warehouse”, “partner”, etc.).

During this period the definition of a « Party » was “a company at a given address with a given role created within a given ERP system”. In other words, any given “company” was represented in the systems by 8 records on average, because each record differed in some detail: sometimes just another role or a different contact.

To give Sales some visibility, there is a “hack”: every “client” record contains the ID of a linked CRM Account so that the results of the Account Managers can be traced. There is no other way to measure the performance of the sales department, and I’m not even talking about credit limits and so on…

This is where our journey begins.

At this stage, the users were 100% responsible for the state of the data.

Consolidation and Data Quality Management

The first step was to understand what really happens in the group. Given its specifics, there are 3 levels of « Party », each useful to a different audience:

  1. CRM works with people (Contacts), which creates quite a mess because there can be multiple “Accounts” for the very same « Party » of the next level.
  2. ERPs are mostly interested in the level of Facilities / Establishments (by definition “organisations’ addresses”, i.e. “a company at a given geographical point”).
  3. Financial analytics / Risks / Accounting use a higher level of Party, the “organisation”, which is mainly identified by its VAT code (or a similar identifier, depending on the country).
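The three levels can be pictured as a small data model. This is only a sketch: the class names, field names and sample values below are hypothetical and were chosen for illustration, not taken from the actual systems.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Organisation:
    """Highest level: the legal entity, identified by its VAT code."""
    vat_code: str
    name: str


@dataclass(frozen=True)
class Establishment:
    """ERP level: an organisation at a given geographical point."""
    organisation: Organisation
    address: str


@dataclass
class CrmAccount:
    """CRM level: an account built around people (Contacts)."""
    account_id: str
    establishment: Establishment
    contacts: list[str] = field(default_factory=list)


# Made-up sample data: two CRM accounts pointing at the same establishment.
acme = Organisation(vat_code="FR00000000000", name="Acme")
warehouse = Establishment(acme, "1 Example Street, Lyon")
accounts = [
    CrmAccount("A-1", warehouse, ["Alice"]),
    CrmAccount("A-2", warehouse, ["Bob"]),
]

# Count the CRM accounts attached to the very same establishment.
accounts_per_establishment = sum(1 for a in accounts if a.establishment == warehouse)
```

Here `accounts_per_establishment` is 2, which is exactly the kind of mess mentioned for the CRM level: several accounts for one Party of the next level.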

Top priority: get visibility at the ERP level.

What can be done?

  1. Consolidation. Let’s identify the duplicate data in the ERP systems using our brand-new definition (Facilities / Establishments). All we need, actually, is an MDM of the Register type. Duplicate identification starts here.
  2. Collaborative editing. For the biggest clients and partners, we can invert the data flow: they are no longer created and managed in each and every ERP of each and every country, but in the central repository (the model is still based on Facilities / Establishments).
  3. Reactive DQ. To accomplish the task, multiple Reference Data repositories are unified. This is where the Data Quality journey begins, because proposing a unified repository does not mean enforcing one…
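To give an idea of what duplicate identification at the Facility / Establishment level could look like, here is a deliberately naive sketch. It groups ERP records by a normalised (VAT code, address) key; a real register would rely on fuzzy matching rather than exact keys. The record layout, the function names and the sample data are all assumptions made for the example.

```python
from collections import defaultdict


def normalise(text: str) -> str:
    """Lowercase and keep only letters and digits so trivial differences vanish."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def find_duplicate_groups(records):
    """Group records that share the same normalised VAT code and address."""
    groups = defaultdict(list)
    for rec in records:
        # The role is ignored on purpose: under the new definition a role
        # is not part of what makes an establishment unique.
        key = (normalise(rec["vat"]), normalise(rec["address"]))
        groups[key].append(rec["erp"] + ":" + rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up records coming from two hypothetical ERPs.
records = [
    {"erp": "ERP_A", "id": "17", "vat": "FR 123",
     "address": "1 Example Street, Lyon", "role": "client"},
    {"erp": "ERP_B", "id": "903", "vat": "fr123",
     "address": "1, example street lyon", "role": "warehouse"},
    {"erp": "ERP_B", "id": "904", "vat": "FR123",
     "address": "25 Other Road, Paris", "role": "client"},
]

duplicate_groups = find_duplicate_groups(records)
```

The first two records end up in the same group even though they come from different ERPs and carry different roles; the third one has another address, so it stays on its own.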

Time passes and more and more Parties are integrated, so when ERP systems are replaced by newer versions, they are fed from the central repository by default. The duplicate detection mechanism still reuses the MDM Register, but now it runs on data that is still being created (leading to Proactive DQM in the future).

Next step

We see that the situation is changing: what started as an MDM Register, data consolidation and Data Quality platform becomes a Single Source of Truth. This is where we can verify the data before it becomes available (i.e. we are going from a multisource consolidation effort to single-source Proactive validation).

Note. Why is it possible for this group and not for everyone else? Because it is a B2B service, and in this world a workflow of a few hours or even days is quite acceptable.

The people in charge of the project managed to establish standardized workflows which are used by thousands of users/data stewards all over the planet.

This is the moment to understand that:

  1. ERPs have seen a shift in the definition: a “Party” is now mostly a “company” which may have multiple addresses, so the definition is quite aligned with the accounting systems. This is thanks to the shift from a specific model to a more generalized one.
  2. The existing workflows are becoming quite complicated to manage in a UI made for data entry (which shows the real difference between CRUD and BPM).
  3. The MDM Register is not really suited for Proactive DQ.

Hopefully the first two statements are quite evident, so let’s see what the third one means.

When we consolidate data, there are no perfect algorithms, so we always balance the number of true duplicates found against the number of false positives (errors of the worst kind: we believe that something is a duplicate when it’s not, which produces quite a lot of work for data stewards). The best demonstration is a precision-recall curve, which shows that you have to balance completeness (recall: the share of real duplicates that are found) against the amount of work needed to identify and manage the false positives (precision: the share of flagged records that are real duplicates):

With automatic data flows we always look for this balance, because if we try to find all duplicates, our algorithm will also pour lots of false positives on us (on our data stewards). But if we have Proactive DQ (errors are checked at creation), we don’t need to be afraid of showing too much data to the user creating a new record: she will spend just a few more seconds verifying a few additional lines.
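The trade-off is easy to reproduce on a toy example. The sketch below computes precision and recall for two similarity thresholds; the scores, the labels and the threshold values are invented for illustration and do not come from the project.

```python
def precision_recall(scored_pairs, threshold):
    """scored_pairs: list of (similarity score, is_true_duplicate) tuples."""
    # Everything at or above the threshold is flagged as a duplicate.
    flagged = [is_dup for score, is_dup in scored_pairs if score >= threshold]
    true_positives = sum(flagged)
    total_duplicates = sum(is_dup for _, is_dup in scored_pairs)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / total_duplicates if total_duplicates else 1.0
    return precision, recall


# Made-up candidate pairs: 4 real duplicates among 8 candidates.
scored_pairs = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.50, True), (0.40, False), (0.30, False),
]

strict = precision_recall(scored_pairs, 0.85)      # few flags, all correct
permissive = precision_recall(scored_pairs, 0.45)  # every duplicate, plus noise
```

With the strict threshold, precision is 1.0 but recall is only 0.5: half of the duplicates stay in the system. With the permissive threshold, recall is 1.0 but precision drops to about 0.67: two of the six flagged pairs are false positives. In a reactive flow those two land on a data steward’s desk; in a proactive flow they are just two extra lines for the user to glance at.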

Here is the beginning of the workflow of editing a new Party:

In this workflow we try to prevent the creation of duplicates: the user has no reason to go any further if she finds the object she needs. It means that we can be much more permissive with our algorithm: we don’t need to “match” the data, we need to “search” it.

It is more of a technical shift, but it helps to get rid of the MDM Register and replace it with a Search Microservice.
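To illustrate the difference between “matching” and “searching”, here is a minimal sketch of a permissive search: instead of deciding whether two records are the same, it returns a ranked list of candidates and leaves the decision to the user. It uses the standard `difflib.SequenceMatcher` as a stand-in for whatever a real search service would use; the function name, the low `min_score` and the sample names are assumptions.

```python
from difflib import SequenceMatcher


def search_parties(query, parties, limit=5, min_score=0.3):
    """Return up to `limit` existing party names, best candidates first."""
    q = query.lower()
    scored = [
        (SequenceMatcher(None, q, name.lower()).ratio(), name)
        for name in parties
    ]
    # A low cut-off on purpose: showing a few extra lines is cheap here.
    scored = [item for item in scored if item[0] >= min_score]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:limit]]


# Made-up content of the central repository.
existing = ["Acme Logistics Lyon", "Acme Logistic SA Lyon", "Globex Paris"]

# What the user sees before being allowed to create a new Party.
candidates = search_parties("acme logistics lyon", existing)
```

The two “Acme” entries come back at the top of the list, so the user can pick the existing Party instead of creating a ninth copy of it.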

The journey

Just to show you the overall shift, here goes the schema:

Hopefully you can see that:

  1. It was quite a journey.
  2. Certain changes were predictable (the shift from the reactive DQ to the proactive DQ).
  3. Certain changes have been driven by the changes in the underlying systems (better model in ERP driving the shift of definitions and adjustment of the business processes).
  4. It may be a little misleading, but CRM is not directly creating any Parties in MDM: it just starts a workflow as an external user.

What do you think?

PS What style of MDM do you see in the final solution? Is it still a Collaborative MDM? Is it a new one?

I wish good health to you and your MDM system.