FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Upload: wymanb2

Post on 24-May-2015

Category:

Marketing


DESCRIPTION

This presentation shares some of the data integration and harmonization tips that FusionPoint has learned while building custom analytic and reporting tools for our Sales and Marketing clients. Whether your project requires big data, predictive analytics or custom front-end development, it all starts with getting the data right!

TRANSCRIPT

Page 1: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Copyright © 2013 FusionPoint, LLC – Page 1 www.thefusionpoint.com

Marketing Analytics: Top 7 Data Cleansing, QC & Harmonization Tips

NOTICE Proprietary and Confidential

This material is proprietary to FusionPoint, LLC. It contains confidential information, which is solely the property of FusionPoint, LLC. This material shall not be used, reproduced, copied, disclosed, or transmitted, in whole or in part, without the express consent of FusionPoint. Copyright © 2012 to FusionPoint, LLC All rights reserved.

Page 2: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Agenda

• Background & Context

• Typical Project Flow

• Data Integration and Harmonization Tips

• Conclusions

Page 3: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Background and Context

• FusionPoint is a company that helps clients build customized marketing analytic solutions.

• We frequently see clients build impressive analytic models, reports or what-if tools, only to realize that continuously “feeding” them data can be a nightmare in terms of cost, time and effort. Worse, we see clients produce inaccurate results and business recommendations driven by data anomalies and errors.

• This presentation shares tips from our data integration and harmonization experience, collected while building hundreds of custom reporting and analytic tools.

• We hope that this information helps you design, build and maintain analytic capabilities that drive a sustainable competitive advantage for your organization.

For more information about FusionPoint or this topic,

please visit us at www.thefusionpoint.com

Or call us at (203) 702-2100

Page 4: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

High-level Project Overview

• Most sales and marketing solutions start with disparate data feeds (Box 1 on the left). While this example shows three sample data sources, your project could include ERP, CRM, agency, third-party, social media, financial, economic, demographic, competitive, environmental (e.g. weather) and other sources.

• These feeds typically reside in different locations and systems, and arrive in different formats.

• An efficient process to QC, clean and harmonize those disparate data sources is critical to developing a sustainable solution that can feed quality data to downstream analytic and reporting tools.

Page 5: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #1: Start at the End

• Every project has its own unique roadmap. The vast majority of projects will evolve in terms of expected outputs and business requirements.

• For this reason, it is critical that you start with a complete understanding of how your marketing data mart will be used in the short term. With that base, a general understanding of how that use could evolve is important for selecting the most flexible solution architecture to “future-proof” your efforts.

• You should start every engagement by understanding the business questions that need to be answered today and those that will potentially need to be addressed down the road. From this foundation, you can back into the analytic, data and data transformation needs.

• If these business questions have not been gathered, we recommend conducting a series of end-user interviews with multiple people from each unique user role or community.

Page 6: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #2: Complete a Data Audit & Inventory

• Depending on the complexity of your project, you may have anywhere from one to hundreds of data feeds.

• It is critical that you develop a thorough inventory of the data assets that will be utilized.

• At a minimum, that data inventory should include…

– Owner/System of Record

– Size Estimates

– Format(s)

– Measures/Metrics

    • Business Rules for QC purposes

    • Aggregation & Disaggregation Rules

– Dimensionality

– Update Calendars
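One way to keep such an inventory is as a structured record per feed, so that gaps are easy to spot. The sketch below is only an illustration: the `DataFeed` class, its field names, the `missing_fields` helper and the sample feed are all hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class DataFeed:
    """One row of the data inventory; all field names are illustrative."""
    name: str
    owner: str                    # owner / system of record
    size_estimate_mb: float       # rough size per delivery
    formats: list[str]            # e.g. ["csv"]
    measures: list[str]           # measures/metrics carried by the feed
    dimensions: list[str]         # dimensionality, e.g. product, store, week
    update_calendar: str          # e.g. "weekly, Mondays"
    qc_rules: list[str] = field(default_factory=list)           # business rules for QC
    aggregation_rules: list[str] = field(default_factory=list)  # roll-up / break-down rules


def missing_fields(feed: DataFeed) -> list[str]:
    """Return the names of inventory fields that are still empty."""
    return [name for name, value in vars(feed).items() if value in ("", [], None, 0)]


# Hypothetical feed with an incomplete inventory entry.
pos_sales = DataFeed(
    name="pos_sales", owner="Retail data vendor", size_estimate_mb=250.0,
    formats=["csv"], measures=["units", "dollars"],
    dimensions=["upc", "store", "week"], update_calendar="",
)
print(missing_fields(pos_sales))  # ['update_calendar', 'qc_rules', 'aggregation_rules']
```

Running a completeness check like this over every feed gives a quick list of the inventory questions that still need an answer from the data owners.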

Page 7: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #3: Obtain Data Samples Early

• Attempting to get your hands on data samples can alert you to issues that you, and frequently the data owners, may not have been aware of.

– Do you need to obtain permissions or sign agreements before you can access, store or process the data?

– Do you need to buy data?

– Does the data feed only come via certain systems or interfaces?

– Does the structure of the data respect the formats in the specification?

– If it takes weeks to obtain the samples, do you need to manage delivery timelines carefully?

• Obtaining data samples early in the process can help you identify and rectify issues earlier in the project timeline.

Page 8: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #4: Log and Track Everything

• As you develop your process to on-board the data, it is critical that you log information on every step of the process. This will allow you to monitor performance and identify bottlenecks and optimization opportunities.

• These logs are also critical in developing standard operating procedures around data and process anomalies.

• Start with a process flow (e.g. a Visio diagram) that captures each data hand-off and transformation step.

• For each step, ensure that you know the parties involved and their responsibilities (e.g. a RACI diagram), the expected and actual delivery dates/times, the size and content of the data delivered, and the results of all QC scans conducted at that step.

• Timing each step in the process can also help identify bottlenecks as you experience data growth.
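As a minimal sketch of step-level logging, the Python below wraps each step in a context manager that records the owner, the status and the elapsed time, and lets the step add its own details such as row counts. The names `logged_step` and `step_log`, the step name and the sample rows are hypothetical; a real project would more likely write these entries to a log table or file than to an in-memory list.

```python
import time
from contextlib import contextmanager

step_log = []  # assumption: kept in memory here only to keep the sketch short


@contextmanager
def logged_step(name, owner):
    """Record who ran a step, how long it took and what it reported."""
    entry = {"step": name, "owner": owner, "status": "ok"}
    start = time.perf_counter()
    try:
        yield entry  # the step fills in row counts, QC results, etc.
    except Exception as exc:
        entry["status"] = f"failed: {exc}"
        raise
    finally:
        # Runs on success and on failure, so every step leaves a log entry.
        entry["seconds"] = round(time.perf_counter() - start, 3)
        step_log.append(entry)


# Hypothetical load step that reports how many rows it received.
with logged_step("load_pos_sales", owner="data engineering") as entry:
    rows = [{"upc": "0001", "units": 12}, {"upc": "0002", "units": 7}]
    entry["rows_received"] = len(rows)

print(step_log)
```

Because the timing is captured for every run, comparing the `seconds` values across deliveries is one simple way to watch for bottlenecks as data volumes grow.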

Page 9: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #5: Check Data Integrity Before Mapping

The data QC steps below outline the general stepwise approach we build into each data integration project.

• Match file headers and content with the expected formats

• Check file sizes and record counts

• Check descriptive statistics for all measures in the file

– Record Counts, Count of Null Records

– Min, Average and Max

– Min, Average and Max of Non-Zero Values

– Variance and Standard Deviation

– Exception reports highlighting cells that exceed acceptable thresholds

• Check for Outliers

– One methodology would look for cells that lie more than a certain number of standard deviations from the mean for that measure

• Add Domain-Specific Business Checks

– UPCs summed < Total Brand

– State Zip Codes summed = State Aggregate
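The descriptive-statistics and outlier checks above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not FusionPoint's implementation: the function names, the use of `None` to mark a null record, the sample values and the sigma threshold are all hypothetical.

```python
import statistics


def describe(values):
    """Descriptive statistics for one measure; None marks a null record."""
    present = [v for v in values if v is not None]
    non_zero = [v for v in present if v != 0]
    return {
        "records": len(values),
        "nulls": len(values) - len(present),
        "min": min(present),
        "mean": statistics.mean(present),
        "max": max(present),
        "non_zero_min": min(non_zero),
        "non_zero_mean": statistics.mean(non_zero),
        "stdev": statistics.stdev(present),  # sample standard deviation
    }


def outliers(values, max_sigma=3.0):
    """Indexes of values more than max_sigma standard deviations from the mean."""
    present = [v for v in values if v is not None]
    mean, sd = statistics.mean(present), statistics.stdev(present)
    if sd == 0:
        return []
    return [i for i, v in enumerate(values)
            if v is not None and abs(v - mean) > max_sigma * sd]


# Hypothetical weekly unit sales with one null, one zero and one suspicious spike.
weekly_units = [100, 104, 98, None, 0, 101, 99, 103, 97, 102, 100, 5000]
stats = describe(weekly_units)
print(stats["records"], stats["nulls"], stats["min"], stats["max"])  # 12 1 0 5000
print(outliers(weekly_units, max_sigma=2.0))  # [11]
```

The example uses a threshold of 2.0 because, in a short series, a single extreme value inflates the standard deviation it is measured against; the right threshold is a judgment call per measure.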

Page 10: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #6: Request Overlapping Periods

Unfortunately, the checks described on the previous pages do not capture all data issues. One additional check we try to build into our process is to request overlapping time periods with each data feed.

• This allows us to compare the values from the prior feed with the values from the current feed for a subset of the periods.

• Within certain industries, these values are designed to be identical. If the overlap differs in any meaningful way, it indicates a data issue.

• In other industries, data restatements can occur and differences in values are expected. In these cases, percentage thresholds can be used to identify unusually large restatements. For example, if the sales data for the prior period differs by more than 10%, flag those cells. While these differences do not always mean the data is incorrect, highlighting large period-to-period changes is frequently important to business users.

• This check has helped us proactively identify issues where columns have been labeled improperly or step changes exist within the data sets.
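The overlap comparison can be sketched as a small function that looks only at the periods present in both feeds and flags those whose value moved by more than a percentage threshold. The function name, the period labels and the sample values are hypothetical; the 10% default follows the example in the text.

```python
def flag_restatements(prior, current, threshold=0.10):
    """Compare overlapping periods and return those that moved more than threshold.

    prior and current map a period label to a value, e.g. {"2013-W01": 1200.0}.
    """
    flagged = {}
    for period in sorted(prior.keys() & current.keys()):  # overlapping periods only
        old, new = prior[period], current[period]
        if old == 0:
            # A relative change is undefined against zero; treat any move as infinite.
            change = 0.0 if new == 0 else float("inf")
        else:
            change = abs(new - old) / abs(old)
        if change > threshold:
            flagged[period] = (old, new, change)
    return flagged


# Hypothetical feeds that overlap on weeks 2 and 3.
prior_feed = {"2013-W01": 1000.0, "2013-W02": 1100.0, "2013-W03": 950.0}
current_feed = {"2013-W02": 1105.0, "2013-W03": 1200.0, "2013-W04": 990.0}
print(flag_restatements(prior_feed, current_feed))
```

Here only week 3 is flagged, since it moved from 950 to 1200 (about 26%), while week 2 changed by less than 1%. For feeds that are designed to be identical across deliveries, setting `threshold=0` flags any difference at all.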

Page 11: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Tip #7: Do not assume dimensional mapping can be 100% automated

We are frequently asked, “How do you automate the dimensional mapping?” The short answer is that while many data sets support automation, you cannot assume that all data sets and feeds can have automated mapping rules.

• When it comes to dimensional mapping, do not assume that this can be 100% automated.

• For certain dimensions, automation can be built into the process (e.g. lookups that map NY to New York).

• For other dimensions, mapping may require domain expertise on the data. For example, does advertising for Coke’s Copy #1 apply to All Coke, Diet Coke or Diet Cherry Coke? At times, questions like these are best answered by the agency or Brand Managers.

• Early identification of what can be automated and what requires human intervention is critical to managing project timelines and expectations.
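One way to combine the two cases is to apply lookup rules where they exist and route everything else to a person instead of guessing. The sketch below assumes a simple dictionary lookup; the `STATE_LOOKUP` table, the `map_dimension` function and the sample values are hypothetical.

```python
STATE_LOOKUP = {"NY": "New York", "CT": "Connecticut", "NJ": "New Jersey"}


def map_dimension(raw_values, lookup):
    """Split values into automatically mapped ones and ones needing a person."""
    mapped, needs_review = {}, []
    for raw in raw_values:
        key = raw.strip().upper()  # normalize whitespace and case before the lookup
        if key in lookup:
            mapped[raw] = lookup[key]
        else:
            needs_review.append(raw)  # no rule: hand off for manual mapping
    return mapped, needs_review


mapped, needs_review = map_dimension(["NY", " ct ", "N.Y."], STATE_LOOKUP)
print(mapped)        # {'NY': 'New York', ' ct ': 'Connecticut'}
print(needs_review)  # ['N.Y.']
```

Tracking the size of the review list on sample data is a quick way to estimate early how much of a dimension will need human intervention.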

Page 12: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Conclusions

• Sales and Marketing analytic tools have the ability to drive huge gains in revenue and profits. They can only do this if the data being fed into them has integrity and can be efficiently harmonized.

• The ability to build and maintain a continuous process starts with a solid data foundation. The importance of an efficient data QC, cleansing and harmonization process cannot be overstated.

– The quality of everything downstream is dependent on this process (avoid GIGO: garbage in, garbage out)

– The efficiency and turn-around of everything downstream is dependent on this process

– Your ability to course correct is linked to the flexibility and design of your data management process

Page 13: FusionPoint: Marketing Analytics - Data Integration Tips from FusionPoint

Thank You for Your Time!
