real-time attention system gpu-vocus for exploration · 2008-04-01 · macs_y3_visual_attention.ppt...

Real-time attention system GPU-VOCUS for exploration

Stefan May, Adaptive Reflective Teams Department

Sankt Augustin, February 15th, 2008

MACS_Y3_Visual_Attention.ppt FP6-004381-MACS

I. Outline

The Role of Visual Attention

Visual Attention for Robot Control

Visual Attention Simulator VOCUS

Visual Attention GPU-accelerated VOCUS

Implementation

The task of Exploration

Conclusion


1. The Role of Visual Attention

Selection of relevant stimuli and processes in interaction with the environment by means of simple features

No previous knowledge about scene or objects!

Focusing on parts of the optical array reduces the computational effort, but visual attention is still a time-consuming task if parallelism is not used

Pop-Out effect: Attraction or warning


2. Visual Attention for Robot Control

Tasks of an Attention System in MACS

Extraction of “interesting” regions, especially in the exploration phase

Monitoring cues during interaction needs a high frame rate (processing time < 33 ms!)

Further: Distance estimation
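
The 33 ms limit corresponds to one frame period at roughly 30 Hz. As a small worked example, the relation between processing time and frame rate can be checked as below. The function names and the sample timings are illustrative assumptions, not part of the MACS software.

```python
def frame_budget_ms(rate_hz: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / rate_hz


def meets_budget(processing_ms: float, rate_hz: float) -> bool:
    """True if one frame is fully processed before the next one arrives."""
    return processing_ms < frame_budget_ms(rate_hz)


if __name__ == "__main__":
    print(round(frame_budget_ms(30.0), 1))  # per-frame budget at 30 Hz
    print(meets_budget(20.0, 30.0))         # hypothetical 20 ms stage
    print(meets_budget(100.0, 30.0))        # hypothetical 100 ms stage
```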


3. Visual Attention Simulator VOCUS

6 image pyramids

48 scale maps (12 Intensity, 12 Orientation, 24 Color)

10 Feature maps (2 Intensity, 4 Orientation, 4 Color)

Center-surround / Gabor filter

Rescaling/Summing up

Weighting

3 Conspicuity maps (1 Intensity, 1 Orientation, 1 Color)

Fusion

1 Saliency map

VOCUS processes lots of independent maps

Overview

VOCUS uses lots of local operators

Application of parallel processing?

Ref.: Frintrop [8] (Btw, parallelism is biologically plausible!)
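
To illustrate why local operators on independent maps lend themselves to parallel processing, here is a minimal sketch of a center-surround operator followed by a simple fusion step. It is not the VOCUS implementation: the 3x3 surround, the plain summation, and all function names are assumptions chosen for brevity. Each output pixel depends only on a small neighborhood, so every pixel could be computed independently.

```python
def center_surround(img, radius=1):
    """Return (on_off, off_on) maps: rectified center-minus-surround difference."""
    h, w = len(img), len(img[0])
    on_off = [[0.0] * w for _ in range(h)]
    off_on = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if (dy or dx) and 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
                        count += 1
            surround = total / count if count else img[y][x]
            diff = img[y][x] - surround
            on_off[y][x] = max(0.0, diff)    # bright center on dark surround
            off_on[y][x] = max(0.0, -diff)   # dark center on bright surround
    return on_off, off_on


def fuse(maps):
    """Pixel-wise sum of equally sized maps (an unweighted stand-in for fusion)."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) for x in range(w)] for y in range(h)]


def most_salient(smap):
    """Coordinates (y, x) of the first maximum of a map."""
    coords = ((y, x) for y in range(len(smap)) for x in range(len(smap[0])))
    return max(coords, key=lambda p: smap[p[0]][p[1]])


if __name__ == "__main__":
    image = [[0.0] * 5 for _ in range(5)]
    image[2][2] = 8.0  # a single bright pixel as a toy pop-out stimulus
    on, off = center_surround(image)
    print(most_salient(fuse([on, off])))
```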


4. Visual Attention: GPU-accelerated VOCUS

Ref.: Shih-hsuan Hsu, Graphics group, CMLab, CSIE, NTU

GFlops (max.)

Intel P4 630, 3 GHz: 9.1

NV GF 7800 GT: 278.6

NV GF 8800 GTX: 518

CPU: Intel Pentium 4

GPU: NV GF 8800 GTX

Arithmetic performance of the GPU is the result of a highly specialized architecture (SIMD)


5. Requirements for attending to the real world

Tasks of an Attention System in MACS

Extraction of “interesting” regions without any knowledge about the environment, especially in the exploration phase

Monitoring cues during interaction needs a high frame rate (processing time < 33 ms!)

Further: Distance estimation via triangulation


5. Implementations

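Runtime Comparison (VGA-Resolution)

Mean runtime / ms

Implementation                         Without orientation   With orientation
GPU-VOCUS (NV GF 8800 GTX / 32-bit)    9.6                   21.8
GPU-VOCUS (NV GF 7800 GT / 16-bit)     25.0                  57.7
GPU-VOCUS (NV GF 7800 GT / 32-bit)     34.3                  77.5
VOCUS (integral)                       89.1                  129.2
VOCUS                                  969.7                 1407.6

Speedup of GPU-VOCUS (NV GF 8800 GTX) over VOCUS (integral): ~9 without, ~6 with the orientation feature

Speedup of GPU-VOCUS (NV GF 8800 GTX) over VOCUS: ~101 without, ~65 with the orientation feature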
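
The speedup figures can be reproduced from the mean runtimes. The sketch below reads each pair of runtimes as (without, with) the orientation feature and takes the NV GF 8800 GTX as the GPU reference; that pairing is an interpretation of the slide, and the variable and function names are illustrative.

```python
# Mean runtimes in ms at VGA resolution: (without orientation, with orientation)
RUNTIMES_MS = {
    "GPU-VOCUS (NV GF 8800 GTX / 32-bit)": (9.6, 21.8),
    "GPU-VOCUS (NV GF 7800 GT / 16-bit)": (25.0, 57.7),
    "GPU-VOCUS (NV GF 7800 GT / 32-bit)": (34.3, 77.5),
    "VOCUS (integral)": (89.1, 129.2),
    "VOCUS": (969.7, 1407.6),
}

GPU_REF = "GPU-VOCUS (NV GF 8800 GTX / 32-bit)"


def speedup(baseline, accelerated):
    """Element-wise runtime ratio baseline / accelerated."""
    return tuple(b / a for b, a in zip(baseline, accelerated))


def within_budget(budget_ms, column):
    """Implementations whose runtime in the given column is below the budget."""
    return [name for name, t in RUNTIMES_MS.items() if t[column] < budget_ms]


if __name__ == "__main__":
    for base in ("VOCUS (integral)", "VOCUS"):
        ratios = speedup(RUNTIMES_MS[base], RUNTIMES_MS[GPU_REF])
        print(base, [round(r) for r in ratios])
    # Which variants stay below one 30 Hz frame period with orientation enabled?
    print(within_budget(1000.0 / 30.0, column=1))
```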


5. Implementations

VGA-Resolution

Online extraction with full camera frame rate (15 Hz)

CPU resources are free for further processing tasks

Feature Extraction with GPU-VOCUS (no inhibition of return, IOR)


5. Implementations

Find similar features

Combine them to regions

Works like a “low-level tracking” if the cue is unambiguous

Using Top-Down Mode
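
One way to picture the "find similar features" step is to weight each feature map by how strongly it responds inside a known target region compared with the rest of the image, then search for the peak of the weighted sum. The sketch below is loosely modeled on the weighting idea described by Frintrop [8]; the exact formula, the function names, and the toy maps are assumptions for illustration, and the grouping of peaks into regions is not covered.

```python
EPS = 1e-9  # avoids division by zero for all-zero regions


def region_mean(fmap, pixels):
    """Mean value of a feature map over a list of (y, x) pixels."""
    return sum(fmap[y][x] for y, x in pixels) / len(pixels) if pixels else 0.0


def learn_weights(feature_maps, target):
    """Per-feature weight: mean response inside the target / mean outside."""
    weights = {}
    for name, fmap in feature_maps.items():
        all_px = [(y, x) for y in range(len(fmap)) for x in range(len(fmap[0]))]
        inside = region_mean(fmap, [p for p in all_px if p in target])
        outside = region_mean(fmap, [p for p in all_px if p not in target])
        weights[name] = (inside + EPS) / (outside + EPS)
    return weights


def top_down_map(feature_maps, weights):
    """Excite features with weight > 1 and inhibit those with weight < 1."""
    first = next(iter(feature_maps.values()))
    out = [[0.0] * len(first[0]) for _ in first]
    for name, fmap in feature_maps.items():
        w = weights[name]
        if w == 1.0:
            continue  # feature does not discriminate the target
        gain = w if w > 1.0 else -1.0 / w
        for y, row in enumerate(fmap):
            for x, value in enumerate(row):
                out[y][x] += gain * value
    return out


def peak(smap):
    """Coordinates (y, x) of the first maximum of a map."""
    coords = ((y, x) for y in range(len(smap)) for x in range(len(smap[0])))
    return max(coords, key=lambda p: smap[p[0]][p[1]])


if __name__ == "__main__":
    # Toy feature maps; the target pixel (1, 1) is strong in "red", weak in "blue".
    maps = {"red": [[1.0, 1.0], [1.0, 4.0]], "blue": [[2.0, 2.0], [2.0, 1.0]]}
    w = learn_weights(maps, target={(1, 1)})
    print(peak(top_down_map(maps, w)))
```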


5. Visual Attention for Exploration

Exploration using visual attention (“curiosity”)

Basic skill

Attention system VOCUS

Bottom-up in left and Top-down in right camera images

S. May et al.: IROS 2007

CPU version shown at the Y2 review (~3 Hz)


6. The task of Exploration

VOCUS vs. GPU-VOCUS (2 x VGA-Resolution)

VOCUS (integral): ~3 Hz

GPU-VOCUS (monitoring a physical process): ~15 Hz (30 Hz possible)


6. The task of Exploration – Triangulation

Exploration using visual attention (“curiosity”)

Compute approximate position using triangulation (+ Mean shift)

Approach blue entity

Approach yellow entity
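
The slide does not give the camera model, so the following is only a sketch of how a salient point seen in both camera images could be turned into an approximate position. It assumes a rectified pinhole stereo pair with identical cameras; the function name, parameters, and sample values are hypothetical, and the mean shift step is not shown.

```python
def triangulate(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Approximate 3D position (in meters, left camera frame) of a point
    observed at column x_left in the left image and x_right in the right
    image, both on image row y. Assumes rectified images."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth along the optical axis
    x = (x_left - cx) * z / focal_px        # lateral offset
    y3d = (y - cy) * z / focal_px           # vertical offset
    return x, y3d, z


if __name__ == "__main__":
    # Hypothetical calibration: 500 px focal length, 10 cm baseline, VGA image center.
    print(triangulate(340.0, 315.0, 240.0,
                      focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0))
```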


6. Results

GPU-VOCUS provides cues of “interesting regions”

Classification of simple features

Learning of relation between “Cue – Behavior – Outcome”

Saliency-based Exploration


7. Conclusion

Online calculation of relevant stimuli with visual attention

No previous knowledge, no model database

Usage of standard hardware on a mobile robot

Speedup improvements enable monitoring of physical processes

Future work “Curiosity drive”: Saliency is not sufficient for “curiosity”!

Curiosity involves novelty detection, experience, and learning


References

[1] J. J. Gibson, The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, 1979.

[2] G. Fritz, L. Paletta, R. Breithaupt, E. Rome, and G. Dorffner, Learning Predictive Features in Affordance-based Robotic Perception Systems, in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2006.

[3] A. P. Duchon, W. H. Warren, and L. P. Kaelbling, Ecological robotics, Adaptive Behavior, Special Issue on Biologically Inspired Models of Spatial Navigation, vol. 6, no. 3-4, pp. 473–507, 1998.

[4] W. Warren and S. Whang, Visual guidance of walking through apertures: Body-scaled information for affordances, 1987, vol. 13, pp. 371–383.

[5] L. Mark, Eyeheight-scaled information about affordances: Learning and projecting a sensori-motor mapping, 1987, vol. 13, pp. 361–370.

[6] K. MacDorman, Grounding symbols through sensorimotor integration, 1999.

[7] Fraunhofer IAIS. (2007) EU Project MACS. [Online]. Available: http://www.macs-eu.org

[8] S. Frintrop, VOCUS: A Visual Attention System for Object Detection and Goal-directed Search, ser. Lecture Notes in Artificial Intelligence (LNAI). Springer Berlin/Heidelberg, 2006, vol. 3899.


Real-time attention system GPU-VOCUS for exploration

Thank you for your attention


4. Visual Attention: GPU-accelerated VOCUS

GPGPU – some Properties

Overhead through data transmission (CPU <-> GPU)

Per-pixel execution model (parallel execution!)

Programs running on the GPU are called shaders

Multipass rendering / micropass rendering (many small shaders vs. large shaders; bottleneck, reusability?)

Shaders are applied if “something” is drawn (render-to-texture, multiple copy operations)

More difficult to debug than processing on the CPU (debugger as well as I/O are not comfortable)

Different high-level shading languages available (GLSL, HLSL, Cg, CUDA)

Global operations need multiple passes (e.g. calculation of maxima)
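
Why a global maximum needs several passes under a per-pixel execution model can be sketched on the CPU: each pass writes a smaller "texture" in which every output pixel is the maximum of a 2x2 block of the input, and passes repeat until one pixel remains. This is a plain Python emulation of that idea, not shader code from GPU-VOCUS, and the function names are illustrative.

```python
def reduce_max_pass(tex):
    """One reduction pass: each output pixel is the max of a 2x2 input block.
    On a GPU, every output pixel would be computed independently."""
    h, w = len(tex), len(tex[0])
    out = []
    for y in range((h + 1) // 2):
        row = []
        for x in range((w + 1) // 2):
            block = [tex[yy][xx]
                     for yy in (2 * y, 2 * y + 1)
                     for xx in (2 * x, 2 * x + 1)
                     if yy < h and xx < w]  # clip blocks at odd borders
            row.append(max(block))
        out.append(row)
    return out


def global_max(tex):
    """Repeat reduction passes until a 1x1 texture remains.
    Returns (maximum, number of passes)."""
    passes = 0
    while len(tex) > 1 or len(tex[0]) > 1:
        tex = reduce_max_pass(tex)
        passes += 1
    return tex[0][0], passes


if __name__ == "__main__":
    texture = [[4 * r + c for c in range(4)] for r in range(4)]  # values 0..15
    print(global_max(texture))
```

The number of passes grows with the logarithm of the larger image dimension, which is why such global operations are comparatively costly when each pass involves a render-to-texture step.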
