
Object Detection Training: An Online Learning Pipeline for Humanoid Robots

Elisa Maiettini and Giulia Pasquale

MUNICH 9-11 OCT 2018

Joint work with:

Lorenzo Natale, Lorenzo Rosasco

Talk Overview

o Computer Vision & Robotic Scenario

o Object Detection’s Challenges

o Our Online Detection Approach

o Experimental Results

o Implementation Details

Stunning performance

Huge amount of data

Long train time

Computationally demanding

Computer Vision at The Edge

…VIDEO HERE…

Full video: https://www.youtube.com/watch?v=s8Ui_kV9dhw
DeepLab: https://arxiv.org/abs/1802.02611
Mask R-CNN: https://arxiv.org/abs/1703.06870
YOLOv2 and YOLOv3: https://pjreddie.com/darknet/yolo/

Multiple sources of information (sensors, context…)

Specific data required

Need to adapt quickly

Limited computation

Real World Robotic Scenario

Our Previous Steps

Are we done with Object Recognition? The R1 perspective.

Giulia Pasquale, GTC 2017, San Jose, CA

[http://on-demand.gputechconf.com/gtc/2017/video/s7295-giulia-pasquale-are-we-done-with-object-recognition-the-r1-robot-perspective.PNG.mp4]

Natural, Interactive training of service robots to detect novel objects, Elisa Maiettini and Giulia Pasquale, GTC 2018, Munich

[https://drive.google.com/file/d/0B596cb8D9K9kcGE4czA0NnNUTmM/preview]

Talk Overview






Object Detection: a Computational Challenge

1) Localization

• Sliding window / grid
• Selective Search, Uijlings, IJCV, 2013
• Region Proposal Network, S. Ren et al., NIPS, 2015
• …

2) Classification

• SVMs, RLS, FC Layers, …

o Binary classification problem
o Enormous dataset
o Strongly unbalanced
o Samples from the majority class are:
  • Redundant
  • Tons of easy samples

Object Detection: a Computational Challenge

Detector

Where?

What?

Region Based Approaches

1) R-CNN, R. Girshick et al., 2014

2) Faster R-CNN, S. Ren et al., 2015

Standard Offline Pipeline

[Figure: Feature Extractor and Region Proposal stages; detected objects labelled "Apple"]

Data acquisition and annotation (RGB Image, Annotation)

Train detection model: hours/days of train…

Detection Model

• Human annotation

• Long train time

• Slow adaptation

Talk Overview






Our Online Detection Approach at a Glance

Data acquisition and automatic annotation [3]

Few seconds of train [4]…

Detection Model

[3] E. Maiettini et al., Humanoids, 2017
[4] E. Maiettini et al., IROS, 2018

Annotation

RGB Image

Automatic Data Acquisition

RGB Image, Segmentation

Output: Bounding boxes, Labels

iCubWorld [5]: https://robotology.github.io/iCubWorld

[5] Pasquale et al., IROS, 2016

Faster R-CNN architecture

Feature extractor

Proposed Learning Pipeline

Feature extraction module: Convolutional Layers, Region Proposal Network

Detection module: Fast classifier (FALKON + Minibootstrap), Bbox refinement (Regularized Least Squares)

R-CNN approach
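The bbox refinement step can be illustrated with a minimal sketch. It assumes the refinement follows the R-CNN box parametrization (center offsets scaled by proposal size, log-scale width and height) and fits one linear Regularized Least Squares regressor per target. The function names, the `(cx, cy, w, h)` box layout and the `lam` value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def box_targets(P, G):
    """R-CNN style regression targets from proposals P to ground truth G.
    Both are (n, 4) arrays of (cx, cy, w, h); this layout is an assumption."""
    tx = (G[:, 0] - P[:, 0]) / P[:, 2]
    ty = (G[:, 1] - P[:, 1]) / P[:, 3]
    tw = np.log(G[:, 2] / P[:, 2])
    th = np.log(G[:, 3] / P[:, 3])
    return np.stack([tx, ty, tw, th], axis=1)

def apply_deltas(P, T):
    """Invert box_targets: move and rescale proposals P by predicted targets T."""
    cx = P[:, 0] + T[:, 0] * P[:, 2]
    cy = P[:, 1] + T[:, 1] * P[:, 3]
    w = P[:, 2] * np.exp(T[:, 2])
    h = P[:, 3] * np.exp(T[:, 3])
    return np.stack([cx, cy, w, h], axis=1)

def fit_rls(F, T, lam=1.0):
    """Closed-form RLS: W = (F^T F + lam I)^-1 F^T T.
    F holds one feature row per proposal, T the four targets per proposal."""
    d = F.shape[1]
    return np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ T)

def refine(F, P, W):
    """Predict targets from features and apply them to the proposals."""
    return apply_deltas(P, F @ W)
```

At test time, `refine(F, P, W)` would map each proposal to a refined box using the features of that region; in practice one such regressor would be trained per object class.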

o Subsampling + splitting negatives

o First model train

FOR EACH batch i

• Select hard negatives

• Train ith model

• Prune easy negatives

Minibootstrap


o First model train: Train(P ∪ Nchosen_1) → M1

FOR EACH batch i

• Select hard negatives: test Mi-1 on Bi → Bi_hard

• Train ith model: Train(P ∪ Nchosen_i-1 ∪ Bi_hard) → Mi

• Prune easy negatives: test Mi on (Bi ∪ Nchosen_i-1) → Nchosen_i

And now repeat!
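The Minibootstrap loop above can be sketched in a few lines. This is a simplified illustration under several assumptions: a linear Regularized Least Squares classifier stands in for FALKON, the thresholds `hard_thr` and `easy_thr` are hypothetical values, and pruning is applied to the negatives kept so far together with the new hard ones. The defaults `n_batches=10` and `batch_size=2000` reflect one reading of the "10x2000" setting reported in the experiments.

```python
import numpy as np

def add_bias(X):
    """Append a constant column so the linear model has an intercept."""
    return np.hstack([X, np.ones((len(X), 1))])

def train_rls(X, y, lam=1e-3):
    """Linear RLS classifier (a stand-in for FALKON in this sketch)."""
    Xb = add_bias(X)
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)

def score(w, X):
    return add_bias(X) @ w

def fit(P, neg):
    """Train on positives P (label +1) and the given negatives (label -1)."""
    X = np.vstack([P, neg])
    y = np.concatenate([np.ones(len(P)), -np.ones(len(neg))])
    return train_rls(X, y)

def minibootstrap(P, N, n_batches=10, batch_size=2000,
                  hard_thr=-0.9, easy_thr=-1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Subsample the negatives and split them into batches.
    n_sub = min(len(N), n_batches * batch_size)
    idx = rng.choice(len(N), size=n_sub, replace=False)
    batches = np.array_split(N[idx], n_batches)

    chosen = batches[0]
    model = fit(P, chosen)                       # first model M1
    for B in batches[1:]:
        hard = B[score(model, B) > hard_thr]     # select hard negatives
        chosen = np.vstack([chosen, hard])
        model = fit(P, chosen)                   # train i-th model
        chosen = chosen[score(model, chosen) > easy_thr]  # prune easy negatives
    return model, chosen
```

Because each iteration only sees one batch plus the negatives that survived pruning, the training set stays small regardless of how many negatives were collected, which is where the reduction in training time comes from.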


FALKON: An Optimal Large Scale Kernel Method

• Kernel method efficient for large scale datasets;

• Accurate classifier (statistical bounds mathematically proved in [6]);

• Combines stochastic data subsampling (Nyström method) with iterative solvers and preconditioning.

[6] Rudi A. et al, NIPS, 2017
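To make the Nyström idea concrete, here is a minimal sketch of kernel ridge regression restricted to `m` randomly chosen centers. It solves the reduced system (K_nm^T K_nm + λ n K_mm) α = K_nm^T y directly with a least-squares solver; FALKON solves the same kind of system with a preconditioned iterative solver instead, which this sketch does not reproduce. The Gaussian kernel, the function names and all parameter values are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma ** 2))

def nystrom_krr_fit(X, y, m=100, lam=1e-4, sigma=1.0, seed=0):
    """Fit kernel ridge regression on m random Nystrom centers."""
    rng = np.random.default_rng(seed)
    n = len(X)
    centers = X[rng.choice(n, size=min(m, n), replace=False)]
    Knm = gaussian_kernel(X, centers, sigma)        # n x m
    Kmm = gaussian_kernel(centers, centers, sigma)  # m x m
    A = Knm.T @ Knm + lam * n * Kmm
    # lstsq tolerates the ill-conditioning typical of Gaussian kernels.
    alpha, *_ = np.linalg.lstsq(A, Knm.T @ y, rcond=None)
    return centers, alpha

def nystrom_krr_predict(X, centers, alpha, sigma=1.0):
    return gaussian_kernel(X, centers, sigma) @ alpha
```

The full n x n kernel matrix is never formed: only an n x m block and an m x m block are needed, so memory and time scale with the number of centers rather than with the square of the dataset size.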

Talk Overview






Experimental setup

[Figure: pre-train task and target task; detection module with fast classifier (FALKON + Minibootstrap) and bbox refinement (Regularized Least Squares)]

BACKBONE NETWORK:
• ZF [8]
• Resnet50 [9]
• Resnet101 [9]

DATASET:
• Pascal VOC dataset [6]
• iCubWorld-Transformations [7]

[6] http://host.robots.ox.ac.uk/pascal/VOC/
[7] https://robotology.github.io/iCubWorld/
[8] Visualizing and Understanding Convolutional Networks, M. D. Zeiler et al., CoRR
[9] Deep Residual Learning for Image Recognition, K. He et al.

Experiments on Pascal Voc Dataset

Method                              mAP     Train Time
Faster R-CNN                        74.3    3h 15m
FALKON + Fullbootstrap              75.1    55m
FALKON + Minibootstrap (10x2000)    70.4    1m 40s

Pre-train task: VOC 2007 + VOC 2012
Target task: VOC 2007 + VOC 2012

[Figure: Resnet101 backbone followed by the fast classifier (FALKON + Minibootstrap) and bbox refinement (Regularized Least Squares)]

Experiments on iCubWorld: Setup

Pre-train task: 100 objects from iCubWorld
Target task: 30 objects from iCubWorld

[Figure: feature extraction module (Resnet50) followed by the detection module: fast classifier (FALKON + Minibootstrap) and bbox refinement (Regularized Least Squares)]

Method                              mAP     Train Time
Faster R-CNN last layers            51.7    4h
FALKON + Minibootstrap (10x2000)    51.7    33s
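The training times reported for Pascal VOC and iCubWorld can be turned into speed-up factors with a few lines of arithmetic. The snippet below only restates the numbers from the two tables; the helper names are illustrative.

```python
def seconds(h=0, m=0, s=0):
    """Convert hours, minutes and seconds to seconds."""
    return 3600 * h + 60 * m + s

# Training times as reported in the two result tables.
pascal = {
    "Faster R-CNN": seconds(h=3, m=15),
    "FALKON + Fullbootstrap": seconds(m=55),
    "FALKON + Minibootstrap (10x2000)": seconds(m=1, s=40),
}
icub = {
    "Faster R-CNN last layers": seconds(h=4),
    "FALKON + Minibootstrap (10x2000)": seconds(s=33),
}

def speedup(times, baseline):
    """Ratio between the baseline training time and each method's time."""
    return {name: times[baseline] / t for name, t in times.items()}

for name, factor in speedup(pascal, "Faster R-CNN").items():
    print(f"Pascal VOC  {name}: {factor:.1f}x")
for name, factor in speedup(icub, "Faster R-CNN last layers").items():
    print(f"iCubWorld   {name}: {factor:.1f}x")
```

On Pascal VOC this gives roughly 3.5x for Fullbootstrap and 117x for Minibootstrap relative to Faster R-CNN; on iCubWorld, Minibootstrap is roughly 436x faster than fine-tuning the last layers of Faster R-CNN.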

Experiments on iCubWorld: Some Results

Experiments on iCubWorld: More Results

Talk Overview






R1, your Personal Humanoid Robot

Sensors
• IMU
• Intel RealSense
• 2x RGB cameras
• Sensorized skin
• 2x LIDAR

Motion
• 2 wheels
• Torso elongation: 20 cm
• Arms elongation: 13 cm

Li-ion battery: 3 hours

Software
• CAFFE
• C++
• Python
• MATLAB

Computation
• 2x NVIDIA Jetson TX2
• Intel i7

Fully Autonomous Platform!

For Extra Power Computation!
• NVIDIA GeForce 1080 Ti

Application pipeline

Modules: Feature/Region Extractor, Automatic GT Extractor, Detector, Visualizer, State Machine, Gaze Ctrl, Speech Recognizer

Data exchanged: RGB image, depth image, GT boxes, feature for each region, predictions, commands, verbal commands, actions, gaze commands, speech commands

Example label: Mug

…VIDEO HERE…

Deploying Our Pipeline on R1…

Conclusions

Design and implementation of an online Object Detection learning pipeline that can be trained in a few seconds

Deployment on the R1 humanoid robot, thanks to NVIDIA acceleration

Future work:
• Further exploit context information
• Design a fully autonomous learning pipeline

Thank you!

Elisa [email protected]

Giulia [email protected]

