Analyzing Mixpanel Data into Amazon Redshift
TRANSCRIPT
![Page 1: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/1.jpg)
Analyzing Mixpanel Data into Amazon Redshift
![Page 2: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/2.jpg)
How?
Define the Data Pipeline
Access & Extract data from Mixpanel
Prepare data
Load data to Amazon Redshift
![Page 3: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/3.jpg)
Who am I?
Kostas Pardalis
Co-founder & CEO
Blendo.co
@KostasPardalis
![Page 4: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/4.jpg)
Why we built Blendo?
The Simplest Platform to get and remix your data from any source.
Make your data available anywhere.
![Page 5: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/5.jpg)
Mixpanel?
Mixpanel helps you to easily measure what people are doing in your app on iOS, Android, and web.
![Page 6: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/6.jpg)
Amazon Redshift
![Page 7: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/7.jpg)
How to Analyze Mixpanel Data?
![Page 8: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/8.jpg)
Use the Mixpanel Internal Reports
Write JQL
Load data to a data warehouse for SQL Access
![Page 9: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/9.jpg)
How to Extract data from Mixpanel?
![Page 10: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/10.jpg)
Use Mixpanel’s Export API
https://mixpanel.com/docs/api-documentation/data-export-api
Access it with:
cURL
Postman
Apache HttpClient for Java
Spray-client for Scala
Hyper for Rust
Ruby rest-client
Python http-client
![Page 11: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/11.jpg)
Use Mixpanel’s Export API
https://mixpanel.com/docs/api-documentation/data-export-api
Or use Mixpanel’s Libraries / SDKs:
Python
PHP
Ruby
JavaScript
![Page 12: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/12.jpg)
Mixpanel API Resources
Annotations
annotations - list the annotations for a specified date range
create - create an annotation
update - update an annotation
delete - delete an annotation
Export
export - get a "raw dump" of tracked events over a time period
![Page 13: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/13.jpg)
Mixpanel API Resources
Events
events - get total, unique, or average data for a set of events over a time period
top - get the top events from the last day
names - get the top event names for a time period
Event Properties
properties - get total, unique, or average data from a single event property
top - get the top properties for an event
values - get the top values for a single event property
![Page 14: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/14.jpg)
Mixpanel API Resources
Funnels
funnels - get data for a set of funnels over a time period
list - get a list of the names of all the funnels
Segmentation
segmentation - get data for an event, segmented and filtered by properties over a time period
numeric - get numeric data, divided up into buckets, for an event segmented and filtered by properties over a time period
sum - get the sum of a segment's values per time unit
average - get the average of a segment's values per time unit
Segmentation Expressions - a detailed overview of what a segmentation expression consists of
![Page 15: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/15.jpg)
Mixpanel API Resources
Retention
retention - get data about how often people are coming back (cohort analysis)
addiction - get data about how frequently people are performing events
People Analytics
engage - get data from People Analytics
Let’s assume that we want to export our raw data from Mixpanel. To do so we’ll need to execute requests to the export endpoint.
![Page 16: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/16.jpg)
Mixpanel API Resources
Let’s assume that we want to export our raw data from Mixpanel. We’ll need to execute requests to the export endpoint. For example, "a request that would get us back raw events from Mixpanel".
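A request of that kind could be sketched in Python as below. This is a minimal sketch, not the method shown on the slide: the endpoint URL, the `from_date`/`to_date` parameters and Basic auth with the API secret as username are assumptions based on Mixpanel's raw export API and should be checked against the current documentation. The helper names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed raw export endpoint; verify against Mixpanel's documentation.
EXPORT_URL = "https://data.mixpanel.com/api/2.0/export/"


def build_export_request(api_secret, from_date, to_date):
    """Build a GET request for raw events between two YYYY-MM-DD dates."""
    query = urllib.parse.urlencode({"from_date": from_date, "to_date": to_date})
    request = urllib.request.Request(f"{EXPORT_URL}?{query}")
    # Assumption: HTTP Basic auth, API secret as username, empty password.
    token = base64.b64encode(f"{api_secret}:".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    return request


def parse_events(lines):
    """Parse a newline-delimited JSON response into a list of event dicts."""
    return [json.loads(line) for line in lines if line.strip()]


def export_events(api_secret, from_date, to_date):
    """Fetch and parse raw events. This performs a network call."""
    request = build_export_request(api_secret, from_date, to_date)
    with urllib.request.urlopen(request) as response:
        return parse_events(response.read().decode("utf-8").splitlines())
```

The response is expected to contain one JSON object per line, which is why parsing is done line by line rather than on the whole body.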
![Page 17: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/17.jpg)
Prepare Mixpanel Data for Amazon Redshift
![Page 18: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/18.jpg)
Prepare Mixpanel Data for Amazon Redshift
• Follow the Amazon Redshift data model
• Map into tables and columns
• Adhere to the datatypes that are supported by Redshift*
• Keep in mind the best practices that Amazon has published regarding the design of a Redshift database.
Amazon Redshift is built around industry-standard SQL, with added functionality to manage very large datasets and high-performance analysis.
* As your data are probably coming in a representation like JSON, which supports a much smaller range of data types, you have to be really careful about what data you feed into Redshift and make sure that you have mapped your types.
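The mapping step can be illustrated with a small Python sketch that flattens one raw event into a fixed set of columns. The column names, the choice of properties, and the assumption that `time` is a Unix timestamp in seconds (rendered here as UTC) are all hypothetical; adjust them to your own events and project timezone.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical target columns for a Redshift table.
COLUMNS = ["event", "distinct_id", "event_time", "city"]


def event_to_row(event):
    """Flatten a nested event dict into one row with explicit types."""
    props = event.get("properties", {})
    # Assumption: "time" holds Unix seconds; format it as a TIMESTAMP literal.
    ts = datetime.fromtimestamp(props["time"], tz=timezone.utc)
    return {
        "event": event["event"],                          # VARCHAR
        "distinct_id": str(props.get("distinct_id", "")),  # VARCHAR
        "event_time": ts.strftime("%Y-%m-%d %H:%M:%S"),    # TIMESTAMP
        "city": props.get("$city"),                        # VARCHAR, may be NULL
    }


def rows_to_csv(rows):
    """Serialize rows to CSV text in the column order defined above."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()
```

Forcing `distinct_id` to a string is a deliberate choice: JSON does not distinguish identifiers from numbers, so a numeric-looking ID could otherwise end up in a column of the wrong type.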
![Page 19: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/19.jpg)
Load data from Mixpanel to Amazon Redshift
![Page 20: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/20.jpg)
Put data in a source that Redshift can pull it from
Amazon S3
Amazon DynamoDB
Amazon Kinesis Firehose
![Page 22: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/22.jpg)
Amazon S3
2. Create a bucket
Execute an HTTP PUT on the Amazon AWS REST API endpoints for S3. (Use cURL, Postman, or the libraries provided by Amazon.)*
* You can find more information by reading the API reference for the Bucket operations in the Amazon AWS documentation.
![Page 23: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/23.jpg)
Amazon S3
3. Start sending your data to Amazon S3
Use the same AWS REST API
Use the endpoints for Object operations
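The bucket and object steps can also be done with Amazon's libraries instead of raw REST calls. The sketch below uses boto3 (the AWS SDK for Python), which is a swapped-in choice rather than what the slides show. Bucket name, key layout and helper names are hypothetical, and AWS credentials are assumed to be configured in the environment.

```python
import gzip
import json


def events_to_gzip_jsonl(events):
    """Serialize events as gzip-compressed, newline-delimited JSON."""
    body = "\n".join(json.dumps(e, sort_keys=True) for e in events) + "\n"
    return gzip.compress(body.encode("utf-8"))


def object_key(prefix, day):
    """Hypothetical key layout: one object per export day."""
    return f"{prefix}/{day}/events.json.gz"


def create_bucket(bucket, region):
    """Create the bucket (the PUT Bucket operation)."""
    import boto3  # third-party AWS SDK, imported lazily

    s3 = boto3.client("s3", region_name=region)
    if region == "us-east-1":
        s3.create_bucket(Bucket=bucket)
    else:
        # Outside us-east-1 the region must be given as a location constraint.
        s3.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )


def upload_events(bucket, key, events):
    """Upload one batch of events (the PUT Object operation)."""
    import boto3

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=events_to_gzip_jsonl(events))
```

A call such as `upload_events("my-mixpanel-bucket", object_key("mixpanel", "2016-01-01"), events)` would then place one compressed file per day under the chosen prefix.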
![Page 24: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/24.jpg)
Amazon DynamoDB
• DynamoDB imports data from S3
• Adds another step between S3 and Amazon Redshift
![Page 25: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/25.jpg)
Amazon Kinesis Firehose
1. Create a delivery stream
2. Add data to the stream
* Whenever you add new data to the stream, Kinesis takes care of adding these data to S3 or Redshift. Going through S3 in this case is redundant if your goal is to move your data to Redshift.
Amazon Kinesis Firehose offers a real-time streaming approach to data importing.
Use the same AWS REST API
Push by using a Kinesis Agent.
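Adding data to a delivery stream could look like the following sketch, again using boto3 as an assumed client rather than the raw REST API or the Kinesis Agent. The stream name is hypothetical and the delivery stream is assumed to exist already. Records are sent in chunks because `PutRecordBatch` accepts at most 500 records per call.

```python
import json

MAX_BATCH = 500  # PutRecordBatch limit on records per call


def to_records(events):
    """Wrap each event as a Firehose record holding one JSON line."""
    return [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_events(stream_name, events):
    """Push events to an existing delivery stream in batches."""
    import boto3  # third-party AWS SDK, imported lazily

    firehose = boto3.client("firehose")
    for batch in chunked(to_records(events), MAX_BATCH):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        # The call can partially fail; a real pipeline would retry those records.
        if response["FailedPutCount"]:
            raise RuntimeError(f"{response['FailedPutCount']} records failed")
```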
![Page 26: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/26.jpg)
Load data into Redshift #1
INSERT
1. Connect to your Amazon Redshift instance with your client (JDBC or ODBC).
2. Perform an INSERT command for your data.
For more information you can check the INSERT examples page in the Amazon Redshift documentation.
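As a rough illustration of the two steps, the sketch below builds a parameterized multi-row INSERT and runs it through a Python DB-API connection. Using a Python driver instead of JDBC or ODBC is an assumption made for this example, as are the table and column names. The `%s` placeholder matches drivers such as psycopg2; other drivers may use a different style.

```python
def build_insert(table, columns, row_count, placeholder="%s"):
    """Build a multi-row INSERT statement with one placeholder per value."""
    cols = ", ".join(columns)
    one_row = "(" + ", ".join([placeholder] * len(columns)) + ")"
    values = ", ".join([one_row] * row_count)
    return f"INSERT INTO {table} ({cols}) VALUES {values}"


def insert_rows(connection, table, columns, rows, placeholder="%s"):
    """Insert rows (a list of tuples) using an open DB-API connection."""
    sql = build_insert(table, columns, len(rows), placeholder)
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in rows for value in row]
    cursor = connection.cursor()
    cursor.execute(sql, params)
    connection.commit()
```

Values are passed as parameters rather than formatted into the SQL string, so quoting is left to the driver. Sending several rows in one statement keeps the number of round trips low.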
![Page 27: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/27.jpg)
Load data into Redshift #2
COPY
1. Connect to your Amazon Redshift instance with your client (JDBC or ODBC).
2. Perform a COPY command for your data.
For more examples on how to invoke a COPY command you can check the COPY examples page in the Amazon Redshift documentation.
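A COPY statement for gzip-compressed JSON files in S3 might be assembled as in this sketch. The table name, S3 path and IAM role ARN are placeholders. The `JSON 'auto'` option assumes each line is a flat JSON object whose keys match the table's column names, which raw Mixpanel events do not satisfy without a preparation step or a JSONPaths file.

```python
def build_copy(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement for gzipped JSON lines in S3."""
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS JSON 'auto' GZIP"
    )


# Hypothetical values, for illustration only.
statement = build_copy(
    "mixpanel_events",
    "s3://my-mixpanel-bucket/mixpanel/2016-01-01/",
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
)
```

The resulting statement is executed over the same client connection as any other SQL command. Pointing `FROM` at a key prefix rather than a single object lets Redshift load all matching files in parallel.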
![Page 28: Analyzing Mixpanel Data into Amazon Redshift](https://reader036.vdocuments.mx/reader036/viewer/2022062223/586e73131a28ab99598b533b/html5/thumbnails/28.jpg)
An even easier way?