How a golden data set can help overcome common data challenges

Thursday, October 31, 2019 | By Lyndsay Noble, Director, Data Science

How do you infuse data into your day-to-day business? In the webinar “The Data Age: How data can drive decisions at all levels of an organization,” I sat down with Alastair Hewitt, Director, CORE-Sightline, to explore what goes into creating a “golden data set” and how organizations can use that information strategically.

As analysts try to organize company data from multiple departments into a meaningful data set that can answer business questions, they often run into the same universal challenges:

  • Lack of resources to scrub and validate data – 80% of an analyst’s analysis time is spent addressing this challenge. When analysts are overburdened in this area, projects may not be taken on at all for lack of resources.
  • Systems aren’t integrated to provide access to all required data – The data needed for driving decisions is often located in multiple unconnected systems. Data can stretch across a company’s CRM, marketing automation, operational systems, weblogs, etc. Every single part of the company may store data in a different way in a different location.
  • Staff lacks the correct skills – Lots of people may be able to make dashboards and reports, but cleaning, validating and integrating data requires a different set of skills.
  • Inconsistent data format – This challenge often comes down to the level of precision or granularity. From a precision perspective, one system might use names while another uses IDs, and trying to match between names and IDs or emails and IDs can be very difficult. Granularity problems come into play when trying to match data such as US Census data at a regional level to individual human records.
  • Insights are not generated in a timely manner – The first four challenges drive this one. Business leaders want answers immediately; they don’t want to wait for analysts to overcome all these challenges.
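The inconsistent-format challenge above can be made concrete with a small sketch. It assumes two hypothetical systems, a CRM keyed by customer ID and a marketing tool keyed by email, plus a made-up table of regional figures. Every field name and value here is invented for illustration; real systems will differ, and real matching usually needs more than a normalized email.

```python
# Hypothetical records: all names, fields, and numbers are invented.
crm_records = [
    {"customer_id": 101, "email": "Ana.Diaz@Example.com ", "region": "Northeast"},
    {"customer_id": 102, "email": "bo.chen@example.com", "region": "West"},
]
marketing_records = [
    {"email": "ana.diaz@example.com", "campaign": "spring"},
    {"email": "BO.CHEN@example.com", "campaign": "fall"},
]
# Regional-level figures (coarser granularity than one row per person).
regional_stats = {
    "Northeast": {"median_income": 70000},
    "West": {"median_income": 75000},
}


def normalize_email(email):
    """Trim whitespace and lowercase so the same address compares equal."""
    return email.strip().lower()


def attach_ids(marketing, crm):
    """Precision fix: look up each marketing row's customer ID by email."""
    id_by_email = {normalize_email(r["email"]): r["customer_id"] for r in crm}
    return [
        {**m, "customer_id": id_by_email.get(normalize_email(m["email"]))}
        for m in marketing
    ]


def attach_regional_stats(crm, stats):
    """Granularity fix: copy region-level figures onto each individual row."""
    return [{**r, **stats.get(r["region"], {})} for r in crm]


matched = attach_ids(marketing_records, crm_records)
enriched = attach_regional_stats(crm_records, regional_stats)
```

Rows with no match keep a customer ID of `None` rather than being dropped, so unmatched records stay visible for review.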

A “golden data set” is a clean, validated, integrated data set. Establishing one takes planning. First, identify all the data sources that hold important information. Next, identify which fields are common across those sources and find a way to join or reconcile them. It is important to define the rules for joining up front, for example which field acts as the key. Finally, put the plan into action. The plan might cover a large enterprise data management project or a smaller effort within a single department, but the planning process stays the same.
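As a rough illustration of those planning steps, the sketch below lists two hypothetical sources, finds the fields they share, and applies one simple joining rule: records are joined on a key field, and when sources disagree, the first source in a priority list wins. The source names, field names, and the priority rule are all assumptions made for this example, not a prescribed method.

```python
# Step 1: identify the sources (hypothetical names and contents).
sources = {
    "crm": [
        {"customer_id": 1, "name": "Ana Diaz", "email": "ana@example.com"},
        {"customer_id": 2, "name": "Bo Chen", "email": ""},
    ],
    "billing": [
        {"customer_id": 1, "name": "A. Diaz", "plan": "pro"},
        {"customer_id": 2, "name": "Bo Chen", "email": "bo@example.com", "plan": "basic"},
    ],
}


def common_fields(sources):
    """Step 2: return the field names that appear in every source."""
    field_sets = [
        set().union(*(row.keys() for row in rows)) for rows in sources.values()
    ]
    return set.intersection(*field_sets)


def build_golden_set(sources, key, priority):
    """Steps 3 and 4: join on `key`; earlier sources in `priority` win.

    Empty values ("" or None) never overwrite or block a real value.
    """
    golden = {}
    for name in priority:
        for row in sources[name]:
            record = golden.setdefault(row[key], {})
            for field, value in row.items():
                if value not in (None, "") and field not in record:
                    record[field] = value
    return golden


shared = common_fields(sources)
golden = build_golden_set(sources, key="customer_id", priority=["crm", "billing"])
```

Here the CRM name is kept for customer 1 because the CRM ranks first, while customer 2's email is filled in from billing because the CRM value is empty. Writing the rule down as code forces the team to decide such cases before the data is merged.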

