Post on 12-Jul-2020


Page 1: From pixel to Visual Intelligencevalser.org/2017/ppt/VOOC/valse_2017_lcw.pdf · Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation,

From pixel to Visual Intelligence

Speaker: Cewu Lu (卢策吾)

Shanghai Jiao Tong University (上海交通大学)

Page 2

• About me.
• My understanding of the computer vision big picture.
• My research within that big picture.

Outline

Page 3

About Me

• Professor
• Ph.D. supervisor
• National 1000 Young Talents Program for overseas scholars (国家青年千人计划)

Machine Vision and Intelligence Group

Page 4

Before I joined SJTU

Postdoc and research fellow at

Prof. Fei-Fei Li, Director of the Stanford AI Lab

Prof. Leonidas J. Guibas, member of the U.S. National Academy of Engineering (NAE)

Page 5

• Core member of the Stanford-Toyota self-driving car project
• 21 papers published or accepted at CVPR/ICCV/PAMI/IJCV (77% as first author); 31 CCF-A papers
• Most-cited SIGGRAPH paper of the past 5 years, among 1000+ papers
• Two papers are included in OpenCV

About Me

Page 6

Computer Vision

Machine can See

NSF white paper: let machines see like humans

Page 7

Computer Vision

Machine can See

Pixel level Patch level Human Understanding

Object level Super object

[SIFT Feature, 2004]

[Deep Learning, 2012]

Page 8

[Diagram: map of computer vision research topics]
Modalities: Image, Video, RGBD
Levels: Pixel level, Patch level, Object level (recognition), Human Understanding
Topics: Gradient Processing, Image Abstraction, Stereo, Deblur, Denoise, Patch Representation, Tracking, Face, 3D Reconstruction, Saliency, Scene Parsing, Point Cloud Segmentation, Object Detection, Fine-grained, Scene Understanding, Action Recognition, Event Understanding, Visual QA, Image2caption, Video Storytelling

Page 9

[Diagram: the same topic map as on Page 8, with the speaker's publication venues marked on the topics: SIGGRAPH Asia, IJCV, CVPR, ICCV, ECCV, TIP, TVCG, PAMI, ICCP]

My Research Work

Page 10

Representative Work on Patch Level

Page 11

L0-norm smoothing

Cewu Lu*, Li Xu*, Yi Xu, Jiaya Jia, "Image Smoothing via L0 Gradient Minimization", ACM Transactions on Graphics, Vol. 30, No. 5 (SIGGRAPH Asia 2011). * indicates co-first author.
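
For reference, the objective behind L0 gradient minimization is usually written as below. This is a sketch of the standard formulation, not text from the slide: the symbols $I$ (input image), $S$ (smoothed result), $p$ (pixel index) and $\lambda$ (smoothing weight) are introduced here for illustration.

```latex
\min_{S} \; \sum_{p} (S_p - I_p)^2 + \lambda \, C(S),
\qquad
C(S) = \#\bigl\{\, p \;:\; |\partial_x S_p| + |\partial_y S_p| \neq 0 \,\bigr\}
```

Here $C(S)$ counts the pixels whose gradient is non-zero, so a larger $\lambda$ keeps fewer edges while leaving the surviving ones sharp.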

Page 12

Main Structure Extraction

Smoothing result

Page 13

Ours

Extracted Edge

Page 14

Canny

Extracted Edge

Page 15
Page 16

Stationary Estimation

L0 Regularized Stationary Time Estimation for Crowd Group Analysis, [CVPR 2014] [PAMI 2016]

Page 17

Abnormal Event Detection at 1000 FPS

[Cewu Lu et al. ICCV]

Cewu Lu, Jianping Shi, Jiaya Jia. Abnormal Event Detection at 150 FPS in MATLAB, IEEE International Conference on Computer Vision [ICCV 2013] [IJCV 2017]

Page 18

Results (UCSD Ped1 Dataset)

MPPCA: [Mahadevan et al. 2009] MPPCA+SF: [Mahadevan et al. 2009] SF: [Mahadevan et al. 2009] MDT: [Mahadevan et al. 2009] Sparse: [Cong et al. 2011] Adam: [Adam et al. 2008]

Pixel-level comparison. FPR: False Positive Rate. TPR: True Positive Rate. Subspace: replacing our combination learning with [Ehsan et al. 2009].

Page 19

Results

Method                   Sec per Frame   Platform      CPU      Memory
[Mahadevan et al. 2009]  25              -             3.0GHz   2.0GB
[Cong et al. 2011]       3.8             -             2.6GHz   2.0GB
[Antic et al. 2011]      10              MATLAB        -        -
Ours                     0.00098         MATLAB 2012   3.4GHz   8.0GB

Testing time comparison on the UCSD Ped1 dataset.
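
The per-frame times in the table can be converted to frame rates to compare them with the FPS figures quoted in the slide titles. The following is a minimal sketch; the dictionary and function names are illustrative and not from the talk.

```python
# Seconds per frame as listed in the timing table on the slide.
SECONDS_PER_FRAME = {
    "Mahadevan et al. 2009": 25.0,
    "Cong et al. 2011": 3.8,
    "Antic et al. 2011": 10.0,
    "Ours": 0.00098,
}


def frames_per_second(seconds_per_frame: float) -> float:
    """Convert a per-frame processing time into a frame rate."""
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    return 1.0 / seconds_per_frame


if __name__ == "__main__":
    for method, spf in SECONDS_PER_FRAME.items():
        print(f"{method}: {frames_per_second(spf):.2f} FPS")
```

With 0.00098 s per frame this gives a little over 1020 frames per second, which is in line with the "1000 FPS" heading of this part of the talk.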

Page 20

Results


Others

Ours

Page 21

Results

Page 22

Representative Work on Object Level

Page 23

Personal Object Discovery [Cewu Lu et al. TIP]

Page 24

Object Scene Distribution

Page 25

Highlight Projects (Personal Object Discovery)

Cewu Lu, Renjie Liao, Jiaya Jia, "Personal Object Discovery", IEEE Transactions on Image Processing.

Page 26

Weather Understanding [Cewu Lu et al. CVPR 2014] [Cewu Lu et al. TPAMI 2014]

Page 27

Highlight Projects (Weather classification)

Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang, "Two-class Weather Classification", IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014, (TPAMI) 2017.

Page 28
Page 29

Real-Time Video Stylization Using Object Flows. Cewu Lu, Yao Xiao, and Chi-Keung Tang. IEEE Transactions on Visualization and Computer Graphics (TVCG), 2017.

Combining Sketch and Tone for Pencil Drawing Production. Cewu Lu, Li Xu, Jiaya Jia. Non-Photorealistic Animation and Rendering (NPAR), 2012 (Best Paper Award).

Page 30

Cewu Lu et al. Real-Time Video Stylization Using Object Flows

Page 31

Papers (Object Detection)

Cewu Lu, Hao Chen, Qifeng Chen, Hei Law, Yao Xiao, Chi-Keung Tang, ECCV 2014 workshop - ImageNet Large Scale Visual Recognition Challenge

Di Lin, Xiaoyong Shen, Cewu Lu, Jiaya Jia, Deep LAC: Deep Localization, Alignment and Classification for Fine-grained Recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015.

Yao Xiao, Cewu Lu, Chi-Keung Tang, Complexity-Adaptive Distance Metric for Object Proposals Generation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015.

Cewu Lu, Yongyi Lu, CK Tang, Efficient Square Localization for Efficient and Accurate Object Detection, submitted to IEEE International Conference on Computer Vision (ICCV), 2015.

Cewu Lu, Yongyi Lu, CK Tang, Explicit Closed-Curve Optimization for Objectness Estimation, submitted to IEEE International Conference on Computer Vision (ICCV), 2015.

Cewu Lu, Yongyi Lu, CK Tang, Unobjectness for Object Proposals Generation, submitted to IEEE International Conference on Computer Vision (ICCV), 2015.

Page 32

Comparison to NLP

Computer Vision: Pixel level, Patch level, Object level, Human Understanding
Milestone marked: [Deep Learning 2012]
Natural Language Understanding: Noun, Verb, Phrase, Sentence, Story

Page 33

Comparison to NLP

Computer Vision: Pixel level, Patch level, Object level, Human Understanding
Milestone marked: [Deep Learning 2012]
Natural Language Understanding: Noun, Verb, Phrase, Sentence, Story

What can I do here?

Page 34

Representative Work on Beyond Object Level

Page 35

Visual Relationship Detection with Language Priors
Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
ECCV 2016 (oral) (reported by ECCV Daily)

Page 36

Detecting <Subject, Predicate, Object>

Page 37

Difficulties

(1) Detection errors for the individual components are huge (under 5%)

(2) Training data is sparse

Subject Predicate Object

Page 38

Difficulties

Subject: 100 classes
Predicate: 70 classes
Object: 100 classes
<Subject, Predicate, Object>: 700,000 classes

(1) Detection errors for the individual components are huge (under 5%)

(2) Training data is sparse
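
The 700,000 figure on the slide is the product of the three class counts. The sketch below reproduces that arithmetic; the comparison with predicting each slot separately is an illustration added here to show why the joint label space makes training data sparse, and the function names are hypothetical.

```python
# Class counts as given on the slide.
NUM_SUBJECT_CLASSES = 100
NUM_PREDICATE_CLASSES = 70
NUM_OBJECT_CLASSES = 100


def joint_triple_classes(subjects: int, predicates: int, objects: int) -> int:
    """Number of distinct <subject, predicate, object> triples."""
    return subjects * predicates * objects


def per_slot_classes(subjects: int, predicates: int, objects: int) -> int:
    """Number of output classes if each slot is predicted on its own."""
    return subjects + predicates + objects


if __name__ == "__main__":
    joint = joint_triple_classes(
        NUM_SUBJECT_CLASSES, NUM_PREDICATE_CLASSES, NUM_OBJECT_CLASSES
    )
    separate = per_slot_classes(
        NUM_SUBJECT_CLASSES, NUM_PREDICATE_CLASSES, NUM_OBJECT_CLASSES
    )
    print(f"joint triple classes: {joint}")
    print(f"per-slot classes:     {separate}")
```

Treating every triple as its own class leaves most of the 700,000 combinations with few or no training examples, which is the sparsity problem listed above.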

Page 40
Page 41

• Discover and predict relationships in images.

Mining relationship tuples:
<man, wear, glass>
<man, carry, bag>
<car, on, ground>
<trash bin, next to, streetlight>
…

Some Results
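
One simple way to hold mined tuples like the ones listed on this slide is a small typed record. This is only an illustrative data structure, not the representation used in the paper; `Relationship` and `group_by_subject` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Relationship(NamedTuple):
    """One <subject, predicate, object> tuple (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


# Example tuples copied from the slide.
RELATIONSHIPS = [
    Relationship("man", "wear", "glass"),
    Relationship("man", "carry", "bag"),
    Relationship("car", "on", "ground"),
    Relationship("trash bin", "next to", "streetlight"),
]


def group_by_subject(
    rels: List[Relationship],
) -> Dict[str, List[Tuple[str, str]]]:
    """Collect the (predicate, object) pairs attached to each subject."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for rel in rels:
        grouped[rel.subject].append((rel.predicate, rel.obj))
    return dict(grouped)


if __name__ == "__main__":
    for subject, pairs in group_by_subject(RELATIONSHIPS).items():
        print(subject, pairs)
```

Grouping by subject makes it easy to read off everything detected about one entity in the image.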

Page 42

Using relationship: Human-ride-horse

Page 44

Person Person

A problem: sub-object-level information is missing!

Page 45

Leg stamps on something
Scale pan is stamped by something

Beneath Holistic Object Recognition

Richer semantics on parts helps to infer the story.

Page 46

sth sits on saddle; wheel in the air; wheel on sth; sth holds handlebar

sth touches head; leg in the air; leg on sth; torso wears sth

head with bridle rein; sth rides torso; torso wears sth; leg in the air; leg on sth

hand embraces sth; torso sits on sth; leg is bent

sth sits on saddle; sth holds handlebar; wheel on sth; wheel on sth

(a) (b) (c)

(d) (e)

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
Cewu Lu et al. (with Stanford University), arXiv:1612.07310

Page 47

Beyond Holistic Object Recognition

Page 48

Regional Multi-person Pose Estimation
Haoshu Fang, Shuqin Xie, Cewu Lu (corresponding author)

arXiv:1612.00137v2

Page 49

SST network

Page 50

STN: spatial transformer network
SDTN: spatial de-transformer network
SPPE: single-person pose estimator
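
The relationship between the STN and the SDTN can be illustrated with plain coordinate math. The sketch below assumes the STN applies a 2-D affine map x' = A·x + t to bring a person into the pose estimator's frame and that the SDTN applies the inverse map to send predicted joints back to the original image; the function names and the example numbers are hypothetical, and no learning is involved.

```python
from typing import Tuple

Point = Tuple[float, float]
Matrix2 = Tuple[Tuple[float, float], Tuple[float, float]]


def apply_affine(a: Matrix2, t: Point, p: Point) -> Point:
    """Map point p through x' = A x + t."""
    x, y = p
    return (
        a[0][0] * x + a[0][1] * y + t[0],
        a[1][0] * x + a[1][1] * y + t[1],
    )


def invert_affine(a: Matrix2, t: Point) -> Tuple[Matrix2, Point]:
    """Return (A_inv, t_inv) such that x = A_inv x' + t_inv."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("affine matrix is singular and cannot be inverted")
    a_inv: Matrix2 = (
        (a[1][1] / det, -a[0][1] / det),
        (-a[1][0] / det, a[0][0] / det),
    )
    # The inverse translation is -A_inv t.
    t_inv: Point = (
        -(a_inv[0][0] * t[0] + a_inv[0][1] * t[1]),
        -(a_inv[1][0] * t[0] + a_inv[1][1] * t[1]),
    )
    return a_inv, t_inv


if __name__ == "__main__":
    # Hypothetical "STN" transform: scale by 2, then shift.
    A: Matrix2 = ((2.0, 0.0), (0.0, 2.0))
    T: Point = (10.0, -4.0)
    joint_in_image: Point = (3.0, 5.0)

    joint_in_crop = apply_affine(A, T, joint_in_image)
    A_inv, T_inv = invert_affine(A, T)
    joint_back = apply_affine(A_inv, T_inv, joint_in_crop)
    print(joint_in_crop, joint_back)
```

Under these assumptions the de-transform needs no parameters of its own, since it is fully determined by the forward transform.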

Page 51


Page 52

Comparison

"CMU" indicates Real-time Multi-Person 2D Pose Estimation using Part Affinity Fields, Cao et al., CVPR 2017

Method   MPII   COCO
Ours     77.4   64.7
CMU      75.6   61.8

Page 53
Page 54

Computer Vision

Machine can See

Pixel level Patch level Human Understanding

Object level Super object

Part level

Page 55

Computer Vision Big Picture

Machine can See

Machine can Act

Page 56
Page 57

Without Action…

Page 58

Without Action…

To acquire perception, we do need everyday action!

Page 59

Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.

Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., et al. (2016, February 5). Asynchronous Methods for Deep Reinforcement Learning. arXiv.org.

Zhu, Y., Mottaghi, R., Kolve, E., Lim, J. J., Gupta, A., Fei-Fei, L., & Farhadi, A. (2016, September 17). Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning. arXiv.org.

Reinforcement Learning

Page 60

Suiqin Xie, Cewu Lu (corresponding author), Reinforcement Learning for Pose Estimation

Page 61

Yurong You, Cewu Lu (corresponding author), Reinforcement Learning for Car Self-Driving

Page 62

Learning Step
1. Low speed straight
2. Low speed curve
3. Stuck
4. High speed straight
5. Low speed curve
6. Collision

Page 63

Yurong You, Cewu Lu (corresponding author), Reinforcement Learning for Car Self-Driving

Page 64

Virtual to Real Reinforcement Learning for Autonomous Driving (with Berkeley)

Yurong You, Xinlei Pan, Ziyan Wang, and Cewu Lu, arXiv:1704.03952v1

B-RL: training the vehicle in the virtual car racing simulator TORCS [31] with virtual images as input

Method   Result
Ours     43.40%
B-RL     28.33%

Page 66

Visual Intelligence Big Picture

Machine can See

Machine can Act

Machine has Knowledge

Page 67

ShapeNet (Stanford, Princeton, Adobe)

A Scalable Active Framework for Region Annotation in 3D Shape Collections
ACM Transactions on Graphics (ACM SIGGRAPH Asia 2016) (with Stanford, Adobe, UCB)

Page 68

editable Real-world

Promising for one-shot learning

Page 69
Page 70

Unsupervised Image Group Distance Inference
ZhengTian Xu, Cewu Lu (corresponding author); to be submitted to arXiv soon

Page 71

Unsupervised Image Group Distance Inference
ZhengTian Xu, Cewu Lu (corresponding author)

Page 72

From pixel to Visual Intelligence

See

Act Knowledge

See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level such as pose (2016)

My goal: (1) explore information beyond the object level to mine high-level semantics and improve object-level recognition (partly solving the long-tail effect).

Page 73

From pixel to Visual Intelligence

See

Act Knowledge

See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level such as pose (2016)

My direction: explore information beyond the object level to mine high-level semantics and improve object-level recognition (partly reducing the long-tail effect).

Not just increasing the amount of data, but the depth of the data (its information content)!

Page 74

From pixel to Visual Intelligence

See

Act Knowledge

See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level such as pose (2016)

My direction: explore information beyond the object level to mine high-level semantics and improve object-level recognition (partly solving the long-tail effect).

Page 75

From pixel to Visual Intelligence

See

Act Knowledge

See: finer and finer
• Object recognition (2013)
• Detection (2014)
• Segmentation (2015)
• Part level such as pose (2016)

Challenges:

(1) How do we benchmark whether a system visually understands the world?

Subjective part (task-driven) + objective part (doing the task)

My thinking: leave it to Act

(2) How do we achieve low-shot (even one-shot) learning?

My thinking: leave it to Knowledge

Page 76

Our lab is recruiting students... please help spread the word...

Page 77

My Research Directions

Machine can See
• Better performance (deep learning)!
• Sub- and super-object level
• In video and image

Machine can Act
• Real-world interaction
• Learning speed
• Reward function (inverse RL)
• Huge action space

Machine has Knowledge
• One-shot learning with O(1) effort
• Visual knowledge base (self-learning and scale-up)

Page 78

Applications

Machine can See (11 students)
• Pose estimation
• Video action understanding
• Visual relationship
• Object detection
• Deep learning on mobile phones

Machine can Act (9 students)
• Auto-car
• Robot arm
• Auto-navigation

Machine has Knowledge (5 students)

Page 79

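Send email to: [email protected]

Enrolling in 2018: master's students, Ph.D. students, and postdocs (salary is negotiable; funding is not a problem).

Benefits: recommendations for summer internships at top North American universities. Successful recommendations this year: Stanford (vision group), MIT, CMU.

Current group members come from the SJTU ACM Class, the Fudan ACM team, the USTC School of the Gifted Young, and Zhejiang University's Chu Kochen Honors College. On average, each member has won 1.6 National Scholarships.

For 2018 enrollment, two offers have been made so far: a top-three student from the SJTU computer science department (first author of an ICCV 2017 paper) and a top-three student from the SJTU electronic engineering department. Places are still available...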

Page 80

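Interns welcome!
• Students who have interned with us so far come from UC Berkeley, HKUST, and Zhejiang University. We provide accommodation.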

Page 81

Thanks!

Page 82

Follow our lab's official WeChat account: MVIG@SJTU