the coming data deluge challenges and opportunities of ...lekheng/courses/191f09/donoho.pdf · 6...
TRANSCRIPT
![Page 1: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/1.jpg)
Data, Data, Data! 1�
�
�
�
USNA Michelson Lecture 2001
Data, Data, Data!
Challenges and Opportunities of
the coming Data Deluge
David Donoho, Statistics Department, Stanford University
Data, Data, Data! 2�
�
�
�
Thanks to ...
• Office of Naval Research:
– Reza Malek Madani
– Wen Masters
– Carey Schwartz
• Will Glaser, Savage Beast
• Jeffrey Klein, Xamplify
• John Lavery, Army Research Office
• Tracy McSheery, PhaseSpace
• Peter Weinberger, Renaissance Technologies
• Shankar Sastry, UC Berkeley
Data, Data, Data! 3�
�
�
�
Data, Data, Data! 4�
�
�
�
Agenda
I. The Era of Data
II. Its Characteristics
III. Its Critics
IV. Mathematics: The Long View
V. Mathematical Challenge: Navigating High Dimensional Space
![Page 2: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/2.jpg)
Data, Data, Data! 5�
�
�
�
I. The other World Hunger Problem
(a) Sherlock
"Data, Data, Data!I can't make bricks without clay!"-- The Adventure of the Copper Beeches
(b) Quotation
Data, Data, Data! 6�
�
�
�
The coming Data Era
• Since beginning of history: unquenched desire for data
• Coming years, problem will be solved
• Prodigious efforts of best and brightest
• Massive constructions: ambitious and visionary
• Will define an Era
• Will permeate everything
Data, Data, Data! 7�
�
�
�
The Era of Faith
(a) Sistine Chapel (b) Notre Dame
Data, Data, Data! 8�
�
�
�
The coming Data Era
In place of cathedrals: massive investment for
• Ubiquitous sensors
• Pervasive networking
• Gargantuan databases
• Data-rich discourse and decision
![Page 3: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/3.jpg)
Data, Data, Data! 9�
�
�
�
The Data Era Touches Everything...
Touches even the most spiritual places
• Our Fate
• Our Creation
Data, Data, Data! 10�
�
�
�
I.1 Data and Notions of Fate
(a) AB 3700 (b) Celera Genomics
Data, Data, Data! 11�
�
�
�
Decoding the Genome
(a) Nature (b) Science
Data, Data, Data! 12�
�
�
�
Genomic Data Availability
• 109 Base pairs
• 30,000 Genes
• On-line 24*7
• Ubiquitous Software
• Internet Accessibility
![Page 4: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/4.jpg)
Data, Data, Data! 13�
�
�
�
Genomic Data will be touching your fate
• Will you (your favorite phobia here ...)
– Will you die young?
– Will your offspring share your traits?
– Will you react poorly to stress?
• Will we ... conquer disease, defects?
Data, Data, Data! 14�
�
�
�
I.2 Data and Notions of Creation
Data, Data, Data! 15�
�
�
�
Sloan Digital Sky Survey
(a) Sloan Instrument (b) Las Campanas Survey
Data, Data, Data! 16�
�
�
�
Sloan Facts (from www.sdss.org)
• Catalog larger by orders of magnitude tha perviou
• Data available on-line 24*7
• SDSS logo – ”Cosmic Genome Project”
![Page 5: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/5.jpg)
Data, Data, Data! 17�
�
�
�
Data Touch our Idea of Creation
• Sensors seeing back into the earliest moments
• Processing detects the first wrinkles in space/time
Data, Data, Data! 18�
�
�
�
Data Have Revolutionary Impact
• Today: one-time transition in human affairs
• You will be participants and shapers
Comparable Transition: Printing Press
• 50 million books in century 1450-1550
• Mass Communication, Art, Expression
• Exploration, Conquest, Empire
• Religious conflict, Total War
Data, Data, Data! 19�
�
�
�
II. Symbols of the Modern Era?
(a) DataCenter (b) Network
Data, Data, Data! 20�
�
�
�
True characteristics of the Modern Era
Ubiquitous, Pervasive Combination of
• Gathering Data
• Transporting
• Archiving
• Publishing
• Exploiting
![Page 6: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/6.jpg)
Data, Data, Data! 21�
�
�
�
II.1 Gathering Data
Everything Can and Will Become Data
• Smells
• Chemical Vision
• Motions, Gestures
• Preferences
• Essences
• Human Identity
• Behavior
Data, Data, Data! 22�
�
�
�
Electronic Nose (N. Lewis, Caltech)
Data, Data, Data! 23�
�
�
�
Electronic Nose Results
(a) (b)
Data, Data, Data! 24�
�
�
�
Hyperspectral Photography
(a) Eye (b) Spectrum
![Page 7: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/7.jpg)
Data, Data, Data! 25�
�
�
�
Hyperspectral Photography – Derived Image
Figure 11: Derived Image
Data, Data, Data! 26�
�
�
�
Human Motion Tracking – HiBall Tracker
(a) (b)
Data, Data, Data! 27�
�
�
�
SAR/Lidar Sensor Fusion
Data, Data, Data! 28�
�
�
�
New types of data: endless
• Cortical response to stimuli (fMRI)
• Internet Browsing Behavior
• Gait, Posture, Gesture
• NBA: mining videos for player moves
![Page 8: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/8.jpg)
Data, Data, Data! 29�
�
�
�
II.2 Ubiquitous and Pervasive Data Transport
Netcentric Communications
• Internet
• Wireless
• Bluetooth
Data Swamps Everything – Voice, Media etc.
Data, Data, Data! 30�
�
�
�
II.3 Ubiquitous and Pervasive Archiving Data
• Earthquakes
• Galaxies
• Genome
• ...
• Massive
• Comprehensive
• Complete
Data, Data, Data! 31�
�
�
�
Earthquake Catalogs
Data, Data, Data! 32�
�
�
�
II.4 Ubiquitous and Pervasive Publication of Data
(a) Forbidden City (b) Detained EP3 on CNN
![Page 9: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/9.jpg)
Data, Data, Data! 33�
�
�
�
Publication of High-Resolution Data
(a) 1.0 Meter Res (b) 0.5 Meter Res
Data, Data, Data! 34�
�
�
�
Publication of Financial Data
Data, Data, Data! 35�
�
�
�
II.5 Pervasive and Ubiquitous Exploitation
• Scientific
• Military
• Industry
• Investment Community
Data, Data, Data! 36�
�
�
�
Military need for Data
(a) (b)
![Page 10: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/10.jpg)
Data, Data, Data! 37�
�
�
�
Military Contributions to Data Era
• Computers
• Communications
• Networking
• Storage
• Display
Data, Data, Data! 38�
�
�
�
Modernizing War
(a) Publ Docs (b) Army Docs
Data, Data, Data! 39�
�
�
�
Navy Vision
Data, Data, Data! 40�
�
�
�
Unmanned Air Vehicles
![Page 11: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/11.jpg)
Data, Data, Data! 41�
�
�
�
Unmanned Ground Sensors (Concept)
Data, Data, Data! 42�
�
�
�
MicroMotes (Kris Pister, David Culler et al.)
(a) Mote (b) Network
Data, Data, Data! 43�
�
�
�
US Corporations and Data Era
• Size of IT marketplace $750B/year
• Size of IT budgets nearing 50% of total.
Data, Data, Data! 44�
�
�
�
Corporate Organization = Information Organization
• Merchandising
• Airlines
• Communications
• Shipping
• Manufacturing
![Page 12: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/12.jpg)
Data, Data, Data! 45�
�
�
�
WalMart: 7 Terabyte Transactions Database
(a) WalMart (b) Database
Data, Data, Data! 46�
�
�
�
Investment Community and Data Era
Massive Investment in startups for ...
• Network Infrastructure
• Genomic Databases
• Consumer Behavior Tracking
• Mass Personalization
Data, Data, Data! 47�
�
�
�
Example: Savage Beast (Oakland, CA)
(a) Query (b) Approach
Data, Data, Data! 48�
�
�
�
Compiling Musical Genome
![Page 13: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/13.jpg)
Data, Data, Data! 49�
�
�
�
Data, Data, Data! 50�
�
�
�
III Critics of the Data Era, 1
(a) Ralph Nader (b) Pope John
Data, Data, Data! 51�
�
�
�
III Critics of the Data Era, 2
Data, Data, Data! 52�
�
�
�
First Law of Data Smog
Information, once rare and cherished like caviar, is nowplentiful and take for granted like potatoes.
![Page 14: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/14.jpg)
Data, Data, Data! 53�
�
�
�
Typical Quotation from Data Smog
Tens of thousands of words daily pulse through ourbeleaguered brains. Accompanied by a massive amount ofother auditory and visual stimuli. In every moment of theaudio-visual orgy of our highly-informed days, the brainhandles a massive amount of electrical traffic. No wonderwe feel burnt
– Philosopher Philip Novak
Data, Data, Data! 54�
�
�
�
My Reaction
• Beginning of telephone era: people reported shock lasting daysafter a phone call
• Much opposition is simply romanticism
On the Other hand ...
• Serious looming problems in Data era;
• Typical critics ill-equipped to see, understand, or articulatethem.
Data, Data, Data! 55�
�
�
�
IV. The Long View – Mathematics
• Society is making massive bets ...
• Are these wise?
• Will the investment pay off?
• Where do you come in?
Data, Data, Data! 56�
�
�
�
A Common View of Technology
(a) 2001
"Any sufficiently advanced technology is indistinguishable from Magic"
-- Arthur C. Clarke
(b) Quote
![Page 15: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/15.jpg)
Data, Data, Data! 57�
�
�
�
USNA Responsibility
• You are Leaders of Tomorrow
• You must be Tough Customers
• Demand Good Investments
• Make Society’s Investments Pay Off
Mathematics Tells You
• Information technology is not magic
• Extracting Information from data is not a sure thing
• Specific hard work on a case-by-case basis
• You can learn what must be done
Data, Data, Data! 58�
�
�
�
Undergraduate/Graduate Courses Relevant
• Linear Algebra/Vector Analysis
• Statistics/Data Analysis
• Machine Learning/Data Mining
• Information Theory
• Signal Processing
Data, Data, Data! 59�
�
�
�
Mathematical Viewpoint about Database
• Datasets == arrays of numbers
• Array of numbers == points clouds in space
• Exploitation of Data == Recognizing patterns in point clouds
Data, Data, Data! 60�
�
�
�
Statistical Dataset
• N Individuals, p variables
• X(i, j) data 1 ≤ i ≤ N , 1 ≤ j ≤ p
Example Mathematics Student Exams
• N = 25 Students, p = 5 measured variables
• X(i, 1): Diff. Geometry score of student i
• X(i, 2): Complex Analysis score of student i
• X(i, 3): Algebra score of student i
• X(i, 4): Real Analysis score of student i
• X(i, 5): Statistics score of student i
![Page 16: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/16.jpg)
Data, Data, Data! 61�
�
�
�
ID # Diff. Complex Algebra Real Statistics
Geom. Analysis Analysis
1 36 58 43 36 37
2 62 54 50 46 52
3 31 42 41 40 29
4 76 78 69 66 81
5 46 56 52 56 40
6 12 42 38 38 28
7 39 46 51 54 41
. . . . . . . . . . . . . . . . . .
Data, Data, Data! 62�
�
�
�
Dataset as a point cloud in 2-d
0 10 20 30 40 50 60 70 8010
20
30
40
50
60
70
80
90
Differential Geometry Scores
Sta
tistic
s S
core
s
Graduate Student Exam Scores
Data, Data, Data! 63�
�
�
�
Dataset as a point cloud in 3-d
0
20
40
60
80 0
20
40
60
80
10030
35
40
45
50
55
60
65
70
Statistics
Graduate Student Exam Scores
Differential Geometry
Alg
ebra
Data, Data, Data! 64�
�
�
�
Many other Statistical Datasets
• N Individuals, p variables
• X(i, j) data 1 ≤ i ≤ N , 1 ≤ j ≤ p
Examples
• Medical: Individuals = Patients, Variables = Age, BloodPressure, ...
• Financial: Individuals = Stocks, Variables = P/E Ratio,Earnings Growth (APR), ...
• Sports: Individuals = Athletes, Variables = AB, H, W, BA,RBI, HR, ...
![Page 17: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/17.jpg)
Data, Data, Data! 65�
�
�
�
Some Common Patterns in Point Clouds
• Planes
• Clusters
• Outliers
• Filaments
Data Analysis: Finding and Interpreting such Patterns
Data, Data, Data! 66�
�
�
�
Interlude: MacSpin
Data, Data, Data! 67�
�
�
�
Society’s Bets
• Socially useful datasets == point clouds in space
• Exploitation of Data == Recognizing patterns in point clouds
• We will recognize patterns in point clouds which allowexploitation
Data, Data, Data! 68�
�
�
�
V. The True Challenge: High Dimensionality
Modern Data have high volumes
• A Sound: > 3KHz
• An Image: >1M Pixels
Each ’Data Point’ in a database is a highly multidimensional object
• An Utterance: >50K dimensions
• A Face: > 200K dimensions
![Page 18: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/18.jpg)
Data, Data, Data! 69�
�
�
�
Example Modern Databases
Human Identity
• For each person, a 256 ∗ 256 image
• N = 106 individuals (points)
• p = 256 ∗ 256 = 65536 variables (dimensions)
Hyperspectral Image
• For each chemical, a 1024-long spectrum
• N = 5000 compounds (points)
• p = 1024 variables (dimensions)
Data, Data, Data! 70�
�
�
�
Needs with High Dimensional data
• Visualizing high dimensional data
• Navigating high dimensional space
• Discovering high dimensional patterns
Data, Data, Data! 71�
�
�
�
Can We Visualize High Dimensions?
Data, Data, Data! 72�
�
�
�
![Page 19: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/19.jpg)
Data, Data, Data! 73�
�
�
�
Data, Data, Data! 74�
�
�
�Data, Data, Data! 75�
�
�
�
Data, Data, Data! 76�
�
�
�
![Page 20: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/20.jpg)
Data, Data, Data! 77�
�
�
�
Data, Data, Data! 78�
�
�
�Data, Data, Data! 79�
�
�
�
Data, Data, Data! 80�
�
�
�
![Page 21: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/21.jpg)
Data, Data, Data! 81�
�
�
�
Data, Data, Data! 82�
�
�
�
We must Reduce Dimensionality!!
Idea: Project onto Subspace
Data, Data, Data! 83�
�
�
�
Approaches to Reduce Dimensionality
• Ideas from Linear Algebra and Statistics
• Choose new coordinate system (basis)
• Rotate into that basis
• Use only a few of the new coordinates
Data, Data, Data! 84�
�
�
�
Illustration of Rotate/Drop Coordinates
-20
-10
0
10
20
-20
-10
0
10
20-20
-10
0
10
20
![Page 22: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/22.jpg)
Data, Data, Data! 85�
�
�
�
Approaches to Reduce Dimensionality
• V.I Principal Component Analysis
• V.II Multiresolution Analysis
Data, Data, Data! 86�
�
�
�
Dataset as a point cloud in 2-d
0 10 20 30 40 50 60 70 8010
20
30
40
50
60
70
80
90
Differential Geometry Scores
Sta
tistic
s S
core
s
Graduate Student Exam Scores
Data, Data, Data! 87�
�
�
�
Concentration Ellipsoid and Principal Axes
-20 0 20 40 60 80 1000
10
20
30
40
50
60
70
80
90
Differential Geometry Scores
Sta
tistic
s S
core
s
Graduate Student Exam Scores
Data, Data, Data! 88�
�
�
�
Principal Components
• Concentration Ellipsoid
• Principal Axes
• Longer Axes – more variation
Sometimes
• Few Axes – most variation
• Summarize many variables by a few.
![Page 23: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/23.jpg)
Data, Data, Data! 89�
�
�
�
PCA Relies on ...
• Linear Algebra (eigenvalues of positive definite matrices)
• Multivariate Statistics (covariance matrices)
• Numerical Computing (eigenvalue solvers)
Used throughout Science and Technology
• Chemometrics
• Hyperspectral Imagery
• Human Identification
Data, Data, Data! 90�
�
�
�
PCA Output on eNose
Data, Data, Data! 91�
�
�
�
PCA Output on Hyperspectral Imagery
Data, Data, Data! 92�
�
�
�
Eigenfaces
(a) UBC Students (b) Principal faces
![Page 24: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/24.jpg)
Data, Data, Data! 93�
�
�
�
Can we discover ...
• ... from data
• ... the true cooordinate system
• Recent approach: ICA
• Uses different statistical principles
Data, Data, Data! 94�
�
�
�
Example: Facial Gestures
Data, Data, Data! 95�
�
�
�
(a) PCA (b) ICA
Sejnowski Lab, Salk Inst.
Data, Data, Data! 96�
�
�
�
V.II Dimensionality Reduction by Multiresolution Anal
• New coordinate system: not pixels but (scale,location)
• Break Data into coarse/medium/fine scales
• Select information by scale/location
• Alter egos: wavelets, multiscale methods
![Page 25: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/25.jpg)
Data, Data, Data! 97�
�
�
�
BigMac
Bandpass BigMac, s=1
Bandpass, s=2
Bandpass, s=3
Partitioned, s=2
Partitioned, s=3 Ridgelet Coeff, s=2
Ridgelet Coeff, s=2
Figure 25: BigMac Image, and stages of Curvelet Analysis.
Data, Data, Data! 98�
�
�
�Figure 26: Fingerprint Movie. 64W + 0/256/768/3072 Curvelets
Data, Data, Data! 99�
�
�
�
Figure 27: Lichtenstein, ‘In The Car’; 64 Wavelets+ 256 Curvelets
Data, Data, Data! 100�
�
�
�
V.III Nonlinear Structures in High-D Space
(a) Scatter (b) Manifold
![Page 26: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/26.jpg)
Data, Data, Data! 101�
�
�
�
Example of Nonlinear Structure: Images
Data, Data, Data! 102�
�
�
�
Can We Discover ...
• ... from data
• ... an underlying nonlinear coordinate system
Relevant Courses
• Linear Algebra
• Multivariate Statistics
• Differential Geometry of Curves and Surfaces
Data, Data, Data! 103�
�
�
�
ISOMAP (Josh Tenenbaum, Stanford)
Data, Data, Data! 104�
�
�
�
Many Reasons to Navigate High Dimensions
• Locate Best Matches in Multimedia Databases
• Steganography
• Understand tactical tendencies of opponent
• Maneuver planning
Many Methods to Navigate High Dimensions
• Sonification
• Diverse Mathematical Ideas
![Page 27: the coming Data Deluge Challenges and Opportunities of ...lekheng/courses/191f09/donoho.pdf · 6 individuals (points) • p = 256 ∗ 256 = 65536 variables (dimensions) Hyperspectral](https://reader034.vdocuments.mx/reader034/viewer/2022051920/600cd8afbc42a655ef02a10e/html5/thumbnails/27.jpg)
Data, Data, Data! 105�
�
�
�
Summary
• Entering Era of Data
• Serious Challenges: Data Smog, Data Glut
• Mathematical Challenges:navigate, explore high dimensional space
• Only scratched surface in this lecture
• Progress important for Navy Goals and Society in General
• You can contribute