CS228: Deep Learning & Unsupervised Feature Learning


Post on 12-Jan-2016

Andrew Ng. Pieter Abbeel, Adam Coates, Zico Kolter, Ian Goodfellow, Quoc Le, Honglak Lee, Rajat Raina, Andrew Saxe.

Page 1

CS228: Deep Learning & Unsupervised Feature Learning

Andrew Ng

Page 2

How is computer perception done?

Object detection: Image -> Low-level vision features -> Recognition

Computer vision is hard!

Page 3

How is computer perception done?

Object detection: Image -> Vision features -> Recognition

Audio classification: Audio -> Audio features -> Speaker ID

NLP: Text -> Text features -> Text classification, MT, IR, etc.

Page 4

Sensor representations

Input -> Low-level features -> Learning/AI algorithm
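
The pipeline on this slide (input, then low-level features, then a learning algorithm) can be sketched in a few lines. This is a minimal illustration and not code from the lecture: the function names `extract_features`, `train_centroids` and `predict`, the toy 1-D "signals", the two hand-designed features, and the nearest-centroid classifier are all assumptions chosen for the example.

```python
# Hypothetical sketch of: input -> low-level features -> learning algorithm.
# All names and data here are illustrative assumptions, not from the slides.

def extract_features(signal):
    """Hand-designed low-level features: mean level and mean absolute step."""
    mean = sum(signal) / len(signal)
    steps = [abs(b - a) for a, b in zip(signal, signal[1:])]
    roughness = sum(steps) / len(steps)
    return (mean, roughness)


def train_centroids(examples):
    """Learning step: average the feature vectors of each class."""
    sums, counts = {}, {}
    for signal, label in examples:
        features = extract_features(signal)
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: tuple(v / counts[lab] for v in total)
            for lab, total in sums.items()}


def predict(centroids, signal):
    """Classify by the nearest class centroid in feature space."""
    features = extract_features(signal)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


# Toy labeled inputs: flat signals versus oscillating ones.
train = [
    ([1, 1, 1, 1], "smooth"),
    ([2, 2, 2, 2], "smooth"),
    ([0, 4, 0, 4], "rough"),
    ([1, 5, 1, 5], "rough"),
]
centroids = train_centroids(train)
label = predict(centroids, [3, 3, 3, 3])
```

In this sketch the classifier only ever sees the output of `extract_features`, so its quality is bounded by how well those hand-designed features describe the input. That dependence is what motivates learning the features instead.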

Page 5

A plethora of sensors

Camera array

3D range scan (laser scanner)

3D range scans (flash lidar)

Audio

Visible light image

Thermal infrared

A general-purpose algorithm for good sensor representations?

Page 6

Sensor representation in the brain

Seeing with your tongue

Human echolocation (sonar)

Auditory cortex learns to see.

[BrainPort; Martinez et al.; Roe et al.]

Page 7

Learning abstract representations

pixels -> edges -> object parts (combination of edges) -> object models

[Related work: Deep learning, Hinton, Bengio, LeCun, and others.]
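
As a rough illustration of learning a first layer of features from unlabeled data, the sketch below runs k-means on tiny two-pixel "patches" and then encodes a new patch by its nearest learned centroid. The slides do not say which algorithm is used, so k-means is only a simple stand-in here, and the names `kmeans`, `sq_dist` and `encode`, the toy patches, and the one-hot encoding are assumptions made for the example.

```python
# Illustrative sketch only: k-means as a stand-in for unsupervised feature
# learning. The slides do not specify the algorithm; names and data are
# hypothetical.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids


def encode(patch, centroids):
    """Feature vector: one-hot code for the nearest learned centroid."""
    nearest = min(range(len(centroids)),
                  key=lambda k: sq_dist(patch, centroids[k]))
    return [1 if k == nearest else 0 for k in range(len(centroids))]


# Unlabeled toy "patches" (2 pixels each): flat ones and edge-like ones.
patches = [(0.0, 0.0), (0.1, 0.1), (0.0, 0.1),
           (0.0, 1.0), (0.1, 0.9), (0.0, 0.9)]
learned = kmeans(patches, centroids=[patches[0], patches[3]])
code = encode((0.05, 0.95), learned)
```

No labels are used at any point: the two centroids come out of the data alone, one near the flat patches and one near the edge-like patches. A deeper model would repeat this kind of step on the outputs of the previous layer to move from edges toward object parts.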

Page 8

Feature learning for audio

Learned features correspond to phonemes and other “basic units” of sound.

[Figure: algorithm and learned features]

Page 9

Audio

TIMIT phone classification           Accuracy
Prior art (Clarkson et al., 1999)    79.6%
Stanford feature learning            80.3%

TIMIT speaker identification         Accuracy
Prior art (Reynolds, 1995)           99.7%
Stanford feature learning            100.0%

Images

CIFAR object classification          Accuracy
Prior art (Yu and Zhang, 2010)       74.5%
Stanford feature learning            79.6%

NORB object classification           Accuracy
Prior art (Ranzato et al., 2009)     94.4%
Stanford feature learning            97.0%

Multimodal (audio/video)

AVLetters lip reading                Accuracy
Prior art (Zhao et al., 2009)        58.9%
Stanford feature learning            65.8%

Galaxy

Video

Hollywood2 classification            Accuracy
Prior art (Laptev et al., 2004)      48%
Stanford feature learning            53%

KTH                                  Accuracy
Prior art (Wang et al., 2010)        92.1%
Stanford feature learning            93.9%

UCF                                  Accuracy
Prior art (Wang et al., 2010)        85.6%
Stanford feature learning            86.5%

YouTube                              Accuracy
Prior art (Liu et al., 2009)         71.2%
Stanford feature learning            75.8%

Other feature learning records: different phone recognition task (Hinton), PASCAL VOC object classification (Yu).
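
To compare the results above on a common scale, it can help to look at both the absolute accuracy gain and the relative reduction in error rate. The sketch below does that arithmetic using only the accuracy figures listed on this page. The helper name `gain_and_error_reduction` and the choice of relative error reduction as a summary are assumptions for illustration, not something the slides report.

```python
# Accuracy figures (percent) copied from the results on this page:
# task -> (prior art, Stanford feature learning).
results = {
    "TIMIT phone classification": (79.6, 80.3),
    "TIMIT speaker identification": (99.7, 100.0),
    "CIFAR object classification": (74.5, 79.6),
    "NORB object classification": (94.4, 97.0),
    "AVLetters lip reading": (58.9, 65.8),
    "Hollywood2 classification": (48.0, 53.0),
    "KTH": (92.1, 93.9),
    "UCF": (85.6, 86.5),
    "YouTube": (71.2, 75.8),
}


def gain_and_error_reduction(prior, new):
    """Return (absolute gain in points, relative error reduction in percent)."""
    gain = new - prior
    reduction = 100.0 * gain / (100.0 - prior)
    return round(gain, 1), round(reduction, 1)


summary = {task: gain_and_error_reduction(prior, new)
           for task, (prior, new) in results.items()}

for task, (gain, reduction) in summary.items():
    print(f"{task}: +{gain} points, {reduction}% relative error reduction")
```

For example, the CIFAR gain of 5.1 points corresponds to removing 20% of the prior error (5.1 out of 25.5 points), while the 0.7-point TIMIT phone classification gain removes a much smaller share of the remaining error.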