Everybody has bad data: ensuring good data quality when moving to Microsoft Dynamics CRM

I am often asked as we are beginning a project, “What are some of the most common pitfalls in implementing CRM?”  Or, “What do people typically underestimate in a large CRM deployment?”  A number of possible answers may come to mind, but in my experience, the effort to cleanse and connect to enterprise data wins every time.  Everyone has bad data.  It’s the dirty little secret of enterprise systems.  The one that IT guys are embarrassed to explain to upper management. Systems store data in different ways, laid out in deferent architectures.  Some are legacy systems that have been around for years.  Some for 20 or more years.  Others are more current architectures, maybe with web services interfaces.  But the underlying problem is the same in each.  People key data with the constraints placed on them by the user interfaces and systems they interact with.  And when people interact with systems, data is not clean.


Surfacing Data from the Silos

The natural evolution of data in an enterprise lends itself to forming silos.  It doesn’t happen on purpose.  But as a business grows new systems are implemented, with newer, more diverse architectures.  New business units make application decisions that support their processes.  Acquisitions occur with their own systems deployed.  And before we know it, we have silos of data with little to no integration between them.  We all recognize the need for those systems to talk in order to be more efficient, and to provide better metrics for the organization.  But the task of how to do it is too daunting.  In comes CRM, promising a 360 degree view of the customer.  Live metrics and dashboards to visualize customer data.  Executive management finally sees the solution to the enterprise data silo problem.  If only it were so simple.

The Problem

We have two systems that house basically the same data.  We assume that it’s just a matter of mapping the fields together, pushing it into CRM, and then we’re done, right?  If only it were so simple.  Yes, there is the aspect of making sure that customer attribute “First Name” from the “Contact” record in one data source matches to “FName” from the “CUSTOMER” record in the second data source.  That’s a critical first step, but we are a long way from being done.  Some fields are based off of lookups from other tables. Option sets may store a value in the data source is some obscure number or a unique identifier.  And in each of these scenarios, we must identify how to map the contents of one data source to the other.  Add to that the complexity of layout and formatting.  The first data source may have a field called “Name”, which should contain names in the format of “Quattlebaum, Michael”.  So when we connect these pieces of data, we much transform the data into two fields:  First Name = Michael and Last Name = Quattlebaum.  In that process, we inevitably come across a record that has “Michael Quattlebaum” instead.  So data scrubbing and exception handling must be part of our consideration.  Once we have data that we feel is mapped properly, and cleansed, we must also consider how to de-duplicate records.  The hope, of course, is that among the systems we have in place, the same customer exists in multiple systems.  And by combining elements of data from across those systems, we receive a more complete view of our interaction with that customer.  So defining the rules for how we match records is an important process.  Here are some examples for matching contact data, and the possible issues with each:

Potential Rule Problems
Match against all available email addresses
  • Generic Email Addresses used
  • People enter invalid email addresses
Match against name and address
  • People at the same address have the same name
  • There’s a “Michael Quattlebaum”

This is just a simple example to show the complexities of defining de-duplication rules.  In the end, there are data integration and data migration considerations that must be addressed, and are inherently difficult to resolve.

The Solution

I know. Some of you skipped straight to this section to find out what the magic pill is.  I wish there was a simple answer that says, “just do this and all your data will be clean and lined up.”  There is no such solution, but there are a number of components, tools, and approaches to help with the problem:

  • Plan on plenty of grunt work.  Whether you are creating cross reference tables in SQL, manually cleaning up data in Access or Excel, or using a sophisticated tool to help with the process, there will need to be people to review and correct data problems or create the rules around the automation of the process.
  • Use a service provider to do some of the grunt work.  Inevitably, there will need to be a set of eyes on some of your data.  There are service providers that help with that, and often at a lower cost than you can provide internal resources.
  • Use a tool that is rules-based, and can be used in an on-going effort to incorporate data.  Trillium Software is a tool that I’ve used, and incorporates strategies for all of the complexities you may face with your data.
  • Develop a Master Data Management strategy and platform.  There are a number of MDM platforms available.  And whether you use a tool for your strategy, or you just define a common schema and services communication approach across the enterprise, taking this approach can simplify all of your integration projects going forward.

What Now?

Once you’ve gone through the effort and cost of migrating and integrating data, you want it to stay that way.  The problem is that people are still involved in the process.  So data cleanliness will continue to be problematic.  Customer Effective provides tools to review and approve data changes inside of Microsoft Dynamics CRM.  The Customer Effective Data Quality Management solution provides data stewards a user interface to review and accept or revert changes.


If your organization is interested in finding more out about the data cleansing process, or about our tools specifically, let us know by contacting us here.