real-time attention system gpu-vocus for exploration · 2008-04-01 · macs_y3_visual_attention.ppt...

Real-time attention system GPU-VOCUS for exploration Stefan May, Adaptive Reflective Teams Department Sankt Augustin, February 15th, 2008



Real-time attention system GPU-VOCUS for exploration

Stefan May, Adaptive Reflective Teams Department

Sankt Augustin, February 15th, 2008


MACS_Y3_Visual_Attention.ppt · FP6-004381-MACS

I. Outline

The Role of Visual Attention

Visual Attention for Robot Control

Visual Attention Simulator VOCUS

Visual Attention GPU-accelerated VOCUS

Implementation

The task of Exploration

Conclusion


1. The Role of Visual Attention

Selection of relevant stimuli and processes in interaction with the environment by means of simple features

No previous knowledge about scene or objects!

Focusing on parts of the optical array reduces the computational effort, but visual attention is still a time-consuming task if parallelism is not used

Pop-Out effect: Attraction or warning


2. Visual Attention for Robot Control

Tasks of an Attention System in MACS

Extraction of “interesting” regions, especially in the exploration phase

Monitoring cues during interaction needs a high frame rate (processing time < 33 ms!)

Further: Distance estimation


3. Visual Attention Simulator VOCUS

Overview

6 image pyramids
Center-surround / Gabor filter
48 scale maps (12 intensity, 12 orientation, 24 color)
Rescaling / summing up
10 feature maps (2 intensity, 4 orientation, 4 color)
Weighting
3 conspicuity maps (1 intensity, 1 orientation, 1 color)
Fusion
1 saliency map

VOCUS processes lots of independent maps

VOCUS uses lots of local operators

Application of parallel processing?

Ref.: Frintrop [8] (Btw, parallelism is biologically plausible!)
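The chain from pyramid to saliency map can be illustrated with a heavily reduced sketch. It covers the intensity channel only and is not the VOCUS implementation: the 2x2 averaging pyramid, the box-shaped surround, the nearest-neighbour rescaling, and all function names and parameters (`levels`, `radii`) are assumptions made for illustration. It does show why the approach suits parallel hardware: every output pixel depends only on a small local neighbourhood.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for a pyramid step)."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def box_mean(img, radius):
    """Mean over a (2*radius+1)^2 window, computed with a summed-area table."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    table = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    total = table[k:, k:] - table[:-k, k:] - table[k:, :-k] + table[:-k, :-k]
    return total / (k * k)

def upsample_to(img, shape):
    """Nearest-neighbour rescaling of a scale map to a common size."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]

def intensity_saliency(gray, levels=3, radii=(1, 3)):
    """Return (saliency, on_off, off_on) for a 2-D grayscale image."""
    gray = np.asarray(gray, dtype=np.float64)
    on_off = np.zeros_like(gray)   # bright centre on dark surround
    off_on = np.zeros_like(gray)   # dark centre on bright surround
    level = gray
    for _ in range(levels):
        level = downsample(level)
        for radius in radii:
            # Center-surround difference: one scale map per level and radius.
            diff = level - box_mean(level, radius)
            # Rescale each scale map and sum it into its feature map.
            on_off += upsample_to(np.maximum(diff, 0.0), gray.shape)
            off_on += upsample_to(np.maximum(-diff, 0.0), gray.shape)
    # Unweighted fusion of the two feature maps (weighting omitted here).
    return on_off + off_on, on_off, off_on
```

With a single bright patch on a dark background, the maximum of the returned saliency map falls inside the patch, which is the pop-out behaviour described earlier.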


4. Visual Attention: GPU-accelerated VOCUS

Ref.: Shih-hsuan Hsu, Graphics group, CMLab, CSIE, NTU

Arithmetic performance in GFlops (max.):
NV GF 7800 GT: 278.6
NV GF 8800 GTX: 518
Intel P4 630, 3 GHz: 9.1

CPU: Intel Pentium 4

GPU: NV GF 8800 GTX

Arithmetic performance of the GPU is the result of a highly specialized architecture (SIMD)


5. Requirements for attending to the real world

Tasks of an Attention System in MACS

Extraction of “interesting” regions without any knowledge about the environment, especially in the exploration phase

Monitoring cues during interaction needs a high frame rate (processing time < 33 ms!)

Further: Distance estimation via triangulation


5. Implementations

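Runtime Comparison (VGA resolution), mean runtime in ms

Implementation                         With orientation   Without orientation
GPU-VOCUS (NV GF 8800 GTX / 32-bit)    21.8               9.6
GPU-VOCUS (NV GF 7800 GT / 16-bit)     57.7               25.0
GPU-VOCUS (NV GF 7800 GT / 32-bit)     77.5               34.3
VOCUS (integral)                       129.2              89.1
VOCUS                                  1407.6             969.7

Speedup of GPU-VOCUS (NV GF 8800 GTX) over VOCUS (integral), with / without orientation: ~6 / ~9

Speedup of GPU-VOCUS (NV GF 8800 GTX) over VOCUS, with / without orientation: ~65 / ~101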
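The speedup figures can be rechecked from the mean runtimes, and the same numbers show which configurations stay below the 33 ms budget named in the requirements. The sketch below only redoes that arithmetic. The assignment of each speedup pair to a baseline is inferred from the recomputed ratios, and the names `RUNTIMES_MS`, `speedup`, and `meets_budget` are hypothetical.

```python
# Mean runtimes in ms at VGA resolution, taken from the runtime comparison:
# (without orientation feature, with orientation feature)
RUNTIMES_MS = {
    "VOCUS": (969.7, 1407.6),
    "VOCUS (integral)": (89.1, 129.2),
    "GPU-VOCUS (NV GF 7800 GT / 32-bit)": (34.3, 77.5),
    "GPU-VOCUS (NV GF 7800 GT / 16-bit)": (25.0, 57.7),
    "GPU-VOCUS (NV GF 8800 GTX / 32-bit)": (9.6, 21.8),
}

FRAME_BUDGET_MS = 33.0  # processing-time limit stated on the slides

def speedup(baseline, accelerated):
    """Runtime ratio baseline/accelerated as (without, with orientation)."""
    base = RUNTIMES_MS[baseline]
    fast = RUNTIMES_MS[accelerated]
    return tuple(b / f for b, f in zip(base, fast))

def meets_budget(name):
    """True if both variants of a configuration finish within the frame budget."""
    return all(t < FRAME_BUDGET_MS for t in RUNTIMES_MS[name])

if __name__ == "__main__":
    gpu = "GPU-VOCUS (NV GF 8800 GTX / 32-bit)"
    for baseline in ("VOCUS (integral)", "VOCUS"):
        without, with_ = speedup(baseline, gpu)
        print(f"{baseline}: x{with_:.1f} with, x{without:.1f} without orientation")
    for name in RUNTIMES_MS:
        print(f"{name}: within {FRAME_BUDGET_MS} ms budget = {meets_budget(name)}")
```

Only the NV GF 8800 GTX configuration stays under 33 ms both with and without the orientation feature.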


5. Implementations

VGA-Resolution

Online extraction with full camera frame rate (15 Hz)

CPU resources are free for further processing tasks

Feature Extraction with GPU-VOCUS (no IOR)


5. Implementations

Find similar features

Combine them to regions

Works like a “low-level tracking” if the cue is unambiguous

Using Top-Down Mode
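The slide does not spell out how similar features are found. One plausible scheme, sketched here purely as an illustration, learns a weight per feature map from a marked target region and then uses those weights to excite matching features and inhibit the others in a new image. The function names, the dictionary of named feature maps, and the `EPS` constant are assumptions and are not taken from the presentation.

```python
import numpy as np

EPS = 1e-6  # hypothetical constant that avoids division by zero

def learn_weights(feature_maps, target_mask):
    """Weight = mean response inside the target region / mean response elsewhere."""
    weights = {}
    for name, fmap in feature_maps.items():
        inside = fmap[target_mask].mean()
        outside = fmap[~target_mask].mean()
        weights[name] = (inside + EPS) / (outside + EPS)
    return weights

def top_down_map(feature_maps, weights):
    """Excite maps that respond to the target (w > 1), inhibit the rest (w < 1)."""
    shape = next(iter(feature_maps.values())).shape
    excitation = np.zeros(shape)
    inhibition = np.zeros(shape)
    for name, fmap in feature_maps.items():
        w = weights[name]
        if w > 1.0:
            excitation += w * fmap
        elif w < 1.0:
            inhibition += fmap / w
    return excitation - inhibition
```

If a cue is learned from a blue target in one image, the maximum of `top_down_map` in a second image lands on the blue item and ignores a yellow distractor, as long as the cue is unambiguous.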


5. Visual Attention for Exploration

Exploration using visual attention (“curiosity”)

Basic skill

Attention system VOCUS

Bottom-up in left and Top-down in right camera images

S. May et al.: IROS 2007

CPU version shown at the Y2 review (~3 Hz)


6. The task of Exploration

VOCUS vs. GPU-VOCUS (2 x VGA-Resolution)

VOCUS (integral): ~3 Hz

GPU-VOCUS (monitoring a physical process): ~15 Hz (30 Hz possible)


6. The task of Exploration – Triangulation

Exploration using visual attention (“curiosity”)

Compute approximate position using triangulation (+ Mean shift)

Approach blue entity

Approach yellow entity


6. Results

GPU-VOCUS provides cues of “interesting regions”

Classification of simple features

Learning of relation between “Cue – Behavior – Outcome”

Saliency-based Exploration


7. Conclusion

Online calculation of relevant stimuli with visual attention

No previous knowledge, no model database

Usage of standard hardware on a mobile robot

Speedup improvements enable monitoring of physical processes

Future work “Curiosity drive”: Saliency is not sufficient for “curiosity”!

Curiosity involves novelty detection, experience, and learning


References

[1] J. J. Gibson, The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, 1979.

[2] G. Fritz, L. Paletta, R. Breithaupt, E. Rome, and G. Dorffner, Learning Predictive Features in Affordance-based Robotic Perception Systems, in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2006.

[3] A. P. Duchon, W. H. Warren, and L. P. Kaelbling, Ecological robotics, Adaptive Behavior, Special Issue on Biologically Inspired Models of Spatial Navigation, vol. 6, no. 3-4, pp. 473–507, 1998.

[4] W. Warren and S. Whang, Visual guidance of walking through apertures: Body-scaled information for affordances, 1987, vol. 13, pp. 371–383.

[5] L. Mark, Eyeheight-scaled information about affordances: Learning and projecting a sensorimotor mapping, 1987, vol. 13, pp. 361–370.

[6] K. MacDorman, Grounding symbols through sensorimotor integration, 1999.

[7] Fraunhofer IAIS. (2007) EU Project MACS. [Online]. Available: http://www.macs-eu.org

[8] S. Frintrop, VOCUS: A Visual Attention System for Object Detection and Goal-directed Search, ser. Lecture Notes in Artificial Intelligence (LNAI). Springer Berlin/Heidelberg, 2006, vol. 3899/2006.


Real-time attention system GPU-VOCUS for exploration

Thank you for your attention


4. Visual Attention: GPU-accelerated VOCUS

GPGPU – some Properties

Overhead through data transmission (CPU <-> GPU)

Per-pixel execution model (parallel execution!)

Programs running on the GPU are called shaders

Multipass rendering / micropass rendering (many small shaders vs. large shaders; bottleneck, reusability?)

Shaders are applied if “something” is drawn (render-to-texture, multiple copy operations)

More difficult to debug than processing on the CPU (debugger as well as I/O is not comfortable)

Different high-level shading languages available (GLSL, HLSL, Cg, CUDA)

Global operations need multiple passes (e.g. calculation of maxima)
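Why a global maximum needs several passes under a per-pixel execution model can be shown with a CPU-side simulation. Each simulated pass writes one output pixel per 2x2 input block, so the texture shrinks by half in each dimension until a single pixel remains. This is a sketch in NumPy, not shader code from GPU-VOCUS, and the function names are hypothetical.

```python
import numpy as np

def reduce_max_pass(tex):
    """One simulated render pass: each output pixel is the max of a 2x2 block."""
    h, w = tex.shape
    # Pad odd dimensions with -inf so every output pixel sees a full 2x2 block.
    padded = np.pad(tex, ((0, h % 2), (0, w % 2)),
                    mode="constant", constant_values=-np.inf)
    ph, pw = padded.shape
    return padded.reshape(ph // 2, 2, pw // 2, 2).max(axis=(1, 3))

def global_max_multipass(image):
    """Return (maximum, number of passes) using repeated 2x2 reductions."""
    tex = np.asarray(image, dtype=np.float64)
    passes = 0
    while tex.size > 1:
        tex = reduce_max_pass(tex)
        passes += 1
    return float(tex[0, 0]), passes
```

For a VGA-sized map (640 x 480) this scheme takes 10 passes, which illustrates why global operations are comparatively expensive next to purely local filters.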