Hannah Aizenman - Get To Know Your Data

Get To Know Your Data Hannah Aizenman @story645

Upload: pydata

Post on 08-Jul-2015

Category: Data & Analytics

DESCRIPTION

A recent article in the New York Times estimates that data scientists spend somewhere between 50% and 80% of their time "collecting and preparing unruly digital data" before they ever get to the analysis. Data is often badly labeled, inconsistently sampled, incorrect in strange places, missing, or otherwise riddled with errors, leading to the "garbage in, garbage out" problem. While detecting the myriad ways in which the data is broken can be difficult, traditional visualization techniques, exploratory data analysis, and cluster analysis can help. This talk will discuss some of the typical methods for sanity checking small data sets: visualization, simple statistics, and some basic combinations of the two. It will then veer into some machine learning techniques for exploring the underlying structure of larger data sets, both to verify the occurrence of known patterns and to detect outliers that could be due to errors rather than the occurrence of something interesting.

TRANSCRIPT

Page 1: Hannah Aizenman - Get To Know Your Data

Get To Know Your Data

Hannah Aizenman @story645

Page 2: Hannah Aizenman - Get To Know Your Data

image via @Ted Underwood

https://twitter.com/albertocairo/status/498847521561927680

Page 3: Hannah Aizenman - Get To Know Your Data

Unprocessed Data

Page 4: Hannah Aizenman - Get To Know Your Data

Missing Observations
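
The slide gives only its title, so the following is a minimal sketch rather than the speaker's code. It assumes observations arrive as a list in which gaps are marked with None or NaN; the names `is_missing` and `count_missing` and the sample readings are hypothetical.

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing; everything else counts as observed."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)

def count_missing(values):
    """Return (number missing, fraction missing) for a sequence of observations."""
    values = list(values)
    n_missing = sum(1 for v in values if is_missing(v))
    fraction = n_missing / len(values) if values else 0.0
    return n_missing, fraction

# Hypothetical sample: six readings, two of them missing.
readings = [12.1, None, 11.8, float("nan"), 12.4, 12.0]
n_missing, fraction = count_missing(readings)
print(f"{n_missing} of {len(readings)} readings missing ({fraction:.0%})")
```

Counting the gaps before any analysis makes it clear how much of the data set is actually observed.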

Page 5: Hannah Aizenman - Get To Know Your Data

Misused Technique

Page 6: Hannah Aizenman - Get To Know Your Data

Start?

Page 7: Hannah Aizenman - Get To Know Your Data

Research

Page 8: Hannah Aizenman - Get To Know Your Data

Explore Attributes

Page 9: Hannah Aizenman - Get To Know Your Data

Take Snapshots

Page 10: Hannah Aizenman - Get To Know Your Data

Plot

Page 11: Hannah Aizenman - Get To Know Your Data

Label

Page 12: Hannah Aizenman - Get To Know Your Data

Rearrange

Page 13: Hannah Aizenman - Get To Know Your Data

Higher D Data: Plot 1 Dim

Page 14: Hannah Aizenman - Get To Know Your Data

Plot Another Dim (or 2)

Page 15: Hannah Aizenman - Get To Know Your Data

Fix that Plot

Page 16: Hannah Aizenman - Get To Know Your Data

Histogram
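
As an illustration of the idea on this slide (a sketch, not the plot from the talk), a histogram can be tallied with the standard library alone. The function names, the bin width, and the temperature values are assumptions made for the example.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    counts = Counter()
    for v in values:
        lower_edge = (v // bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))

def print_histogram(counts):
    """Print one row per non-empty bin, with one '#' per observation."""
    for lower_edge, count in counts.items():
        print(f"{lower_edge:>6} | {'#' * count}")

# Hypothetical temperatures; 250 looks like a data entry error.
temps = [21, 22, 22, 23, 25, 24, 22, 250, 23, 21]
counts = histogram(temps, bin_width=5)
print_histogram(counts)
```

A lone bin far from the rest of the distribution is a cue to go back and check that observation.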

Page 17: Hannah Aizenman - Get To Know Your Data

Min, Max, Mean, Median
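
The four statistics named on this slide can be computed with Python's built-in `min` and `max` and the standard `statistics` module. This is a minimal sketch; the helper name `summarize` and the sample values are hypothetical.

```python
import statistics

def summarize(values):
    """Return the four summary statistics for a non-empty sequence of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

# Hypothetical temperatures with one suspicious entry (250).
temps = [21, 22, 22, 23, 25, 24, 22, 250, 23, 21]
summary = summarize(temps)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

When the mean sits far from the median, as it does here, a few extreme values are probably pulling it, which is worth investigating before trusting either number.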

Page 18: Hannah Aizenman - Get To Know Your Data

Too Much Data

Page 19: Hannah Aizenman - Get To Know Your Data

Multivariate Relationships

Page 20: Hannah Aizenman - Get To Know Your Data

Multivariate Relationships With Classes

Page 21: Hannah Aizenman - Get To Know Your Data

Known Patterns

Page 22: Hannah Aizenman - Get To Know Your Data

Expected Values
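
One way to act on expected values, sketched here under assumptions since the slide shows only a title, is to compare each observation against a range taken from domain knowledge. The function name `out_of_range`, the bounds, and the sample data are all hypothetical.

```python
def out_of_range(values, low, high):
    """Return (index, value) pairs for observations outside [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]

# Hypothetical air temperatures in degrees Celsius; the bounds are an
# assumed plausible range, not figures from the talk.
temps = [21, 22, 22, 23, 25, 24, 22, 250, 23, 21]
suspects = out_of_range(temps, low=-90, high=60)
print(suspects)
```

Returning the index along with the value makes it easy to trace each suspect back to its source record.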

Page 23: Hannah Aizenman - Get To Know Your Data

Look For Structure

Page 24: Hannah Aizenman - Get To Know Your Data

Incorporate Outside Knowledge

Page 25: Hannah Aizenman - Get To Know Your Data

Weave it All Together
