A-ONE Consultants
Jessica Morris & Partap Singh
Current Assignment
DATA ANALYSIS
Data Mining Goals
Analyze QVC airtime and sales history to determine the best times to sell certain products on air
Determine which states make the most purchases in order to better target QVC's sales geographically
Determine which brands and products sell the best
Data Provided
Clean the Data
The data needed to be cleaned before it was in a format that could be loaded into HDFS
We used a combination of Excel and PowerShell to do this
Quotes needed to be removed, and dates needed to be reformatted from MM/DD/YYYY to YYYY-MM-DD
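The cleaning step above can be sketched in a few lines of Python. This is only an illustration of the two fixes named on this slide (removing quotes and reformatting dates), not the Excel and PowerShell process we actually used. The function names and the sample column names are hypothetical.

```python
import csv
import io
from datetime import datetime


def reformat_date(value: str) -> str:
    """Convert MM/DD/YYYY to YYYY-MM-DD; return any other value unchanged."""
    try:
        return datetime.strptime(value.strip(), "%m/%d/%Y").strftime("%Y-%m-%d")
    except ValueError:
        # Not a date in MM/DD/YYYY form, so leave the field alone.
        return value


def clean_rows(raw_text: str) -> str:
    """Strip quotes from every field and reformat any MM/DD/YYYY dates."""
    reader = csv.reader(io.StringIO(raw_text))  # parses the quoted fields
    out = io.StringIO()
    # QUOTE_NONE writes fields without surrounding quotes; a comma inside a
    # field would be backslash-escaped rather than quoted.
    writer = csv.writer(out, quoting=csv.QUOTE_NONE, escapechar="\\",
                        lineterminator="\n")
    for row in reader:
        writer.writerow([reformat_date(field.replace('"', "")) for field in row])
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample row; the real QVC column names are not shown here.
    sample = '"ORDER_ID","ORDER_DATE","STATE"\n"1001","03/15/2015","PA"\n'
    print(clean_rows(sample))
```

Fields that do not parse as MM/DD/YYYY pass through untouched, so header rows and non-date columns are safe to run through the same function.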
Process the Data
We used a mixture of the Hadoop tools Hive and Impala
We ran a combination of queries on the tables, including joins and distinct queries, to get an idea of the data we were working with
These queries generated the Excel files that we further analyzed in Tableau
In a real-world situation, one would not limit oneself to a single tool
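To give a feel for the kind of join and distinct queries mentioned above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Hive/Impala. The table names, column names, and rows are all hypothetical and are not the actual QVC schema or the queries we ran.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two tiny, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (product_id TEXT, brand TEXT);
        CREATE TABLE orders (order_id TEXT, product_id TEXT, state TEXT);
        INSERT INTO products VALUES ('P1', 'BrandA'), ('P2', 'BrandB');
        INSERT INTO orders VALUES
            ('O1', 'P1', 'PA'), ('O2', 'P1', 'NY'), ('O3', 'P2', 'PA');
    """)
    return conn


def distinct_states(conn: sqlite3.Connection) -> list:
    """Distinct query: which states appear in the orders table at all?"""
    rows = conn.execute(
        "SELECT DISTINCT state FROM orders ORDER BY state"
    ).fetchall()
    return [row[0] for row in rows]


def orders_per_brand(conn: sqlite3.Connection) -> list:
    """Join query: count orders per brand by joining orders to products."""
    return conn.execute("""
        SELECT p.brand, COUNT(*) AS order_count
        FROM orders o
        JOIN products p ON o.product_id = p.product_id
        GROUP BY p.brand
        ORDER BY order_count DESC, p.brand
    """).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(distinct_states(conn))
    print(orders_per_brand(conn))
```

The SELECT statements use only standard SQL (DISTINCT, JOIN, GROUP BY), so queries of the same shape should carry over to Hive or Impala with the real table names.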
Example of Hive/Impala Queries
Example of Generated Data
Airtime for each product generated by Impala
Example of Generated Data (cont.)
Hive was used to generate the Excel file here. The chart was created in Excel and shows the top 25 sales dates...
Example of Generated Data (cont.)
...as well as how many orders contained which products on those dates
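The ranking behind these two slides can be sketched as follows. This is an illustration only: it assumes that "top sales dates" means the dates with the most orders, and the function names and sample records are hypothetical rather than taken from the QVC data.

```python
from collections import Counter


def top_sales_dates(order_dates, n=25):
    """Return the n dates with the most orders as (date, order_count) pairs."""
    return Counter(order_dates).most_common(n)


def product_counts_on(orders, date):
    """Count how many orders on the given date contained each product.

    `orders` is an iterable of (order_date, product_id) pairs.
    """
    return Counter(product for order_date, product in orders
                   if order_date == date)


if __name__ == "__main__":
    # Made-up (date, product) records for illustration.
    orders = [
        ("2015-11-27", "P1"),
        ("2015-11-27", "P2"),
        ("2015-11-30", "P1"),
        ("2015-11-27", "P1"),
    ]
    print(top_sales_dates([d for d, _ in orders], n=2))
    print(product_counts_on(orders, "2015-11-27"))
```

If sales were ranked by revenue instead of order count, the same structure would apply with a per-date sum in place of the Counter.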
Visualization of Data in Tableau
Visualization - Tableau
Visualization - Tableau (map)
Thank You!