real-time computer vision with ruby and libjit · [wad07] j. wedekind, b. amavasai, and k. dutton....

1
Real-Time Computer Vision with Ruby and LibJIT J. Wedekind, M. Howarth, J. Travis, K. Dutton, B. Amavasai Microsystems & Machine Vision Laboratory, Sheffield Hallam University, Sheffield, United Kingdom EPSRC GR/S85696/01, Basic Technology: Nanorobotics Technologies for Simultaneous Multidimensional Imaging & Manipulation of Nanoobjects Abstract Many image processing algorithms can be broken down into atomic operations on arrays. There are unary operations such as additive inverse, absolute value, or square-root and binary operations such as addition or multiplication. While the arrays typically have elements of a single type, this element-type may be a signed or unsigned integer with 8, 16, 32, or more bits, a floating point number, or a complex number. A computer vision library thus needs to implement binary operations for various combinations of element- types. Furthermore binary operations can occur as array-array-, array-scalar-, or as scalar-array-operation. This kind of requirements ake it very hard to write a computer vision library in a statically typed language such as C++. On the other hand a naive implementation in a dynamically typed programming language such as Ruby will not satisfy real-time constraints. However recently the DotGNU project has released libJIT which is a just-in-time compiler library for i386, x86-64, and other processors. The combination of Ruby, libJIT, and other free software allows interactive development of real-time algorithms in an unprecedented way. Previous work: [WAD07], [WADB08], [HOR] References [HOR] Hornetseye. Web page. http://hornetseye.rubyforge.org/. [WAD07] J. Wedekind, B. Amavasai, and K. Dutton. Steerable filters generated with the hypercomplex dual-tree wavelet transform. In IEEE International Conference on Signal Processing and Commu- nications, pages 1291–4, 2007. [WADB08]J. Wedekind, B. Amavasai, K. Dutton, and M. Boissenin. A machine vision extension for the Ruby programming language. pages 991–6, 2008. HornetsEye, libJIT, Ruby type=DFLOAT othertype=UBYTE target=DFLOAT MultiArray.dfloat Fixnum no type.jit support? and othertype.jit support? and target.jit support? yes p = JITTerm.param( 0 ) q = target.jit load( JITTerm.param( 2 ) ) r = JITTerm.param( 1 ) n = JITTerm.const( f, JITType::LINT, size * target.size ) rend = r + n f.until( proc { r == rend } ) x = typecode.jit load( f, q ) target.jit store( action.call( x, q ) ) p.inc!( typecode.size ) r.inc!( target.size ) end g, h ∈{0, 1,..., w 1}×{0, 1,..., h 1} h( x 1 x 2 )= g( x 1 x 2 )/2 h = g.collect {|x| x/2 } <Hornetseye::JITFunction:0x7fd97797e2c8> coercion h=g/2 match Shi-Tomasi Corner Detector C W δg( x) δx 1 2 δg( x) δx 1 · δg( x) δx 2 δg( x) δx 1 · δg( x) δx 2 δg( x) δx 2 2 d x C = A λ 1 0 0 λ 2 A , Shi-Tomasi: min(λ 1 , λ 2 ) > λ require ’hornetseye’ include Hornetseye grad sigma, cov sigma = 1.0, 1.0 img = MultiArray.load grey8 ’test.png’ x = img.gauss gradient x grad sigma y = img.gauss gradient y grad sigma a=(x ** 2 ).gauss blur cov sigma b=(y ** 2 ).gauss blur cov sigma c=(x * y ).gauss blur cov sigma tr = a + b det = a * b-c * c dissqrt = Math.sqrt( ( tr * tr - det * 4 ). major( 0.0 ) ) result = 0.5 * ( tr - dissqrt ) result.normalise( 255..0 ).show Colourspaces & I/O colourspaces RGB24 RGBF UYVY YUY2 I420 YV12 Grey8 Grey16 GreyF V4L2 libdc1394 X−Video video files image files machine vision Hornetseye::Image Hornetseye::MultiArray Y C b C r = 0.299 0.587 0.114 0.168736 0.331264 0.500 0.500 0.418688 0.081312 R G B + 0 128 128 also see: http://fourcc.org/ GUI integration Computing Time n.to_scomplex + 1 n + Complex( 2, 3 ) n * 2 n + 1 n * n n.fill!( 1.0 ) n = NArray.sfloat( 4, 500_000 ).fill!( 0.0 ) 41 ms 66 ms 69 ms 34 ms 38 ms 42 ms 65 ms 82 ms 85 ms 65 ms 80 ms 126 ms 65 ms 78 ms 121 ms 161 ms 264 ms 119 ms 162 ms 264 ms 375 ms Comparison Blue bars are indicating the computing time used by a naive C++ implementa- tion. Green bars are indicating the comput- ing time used by M. Tanaka’s Ruby- extension NArray. Red bars are indicating the computing time used by the JIT-based implementa- tion of HornetsEye. Environment 1.2 GHz AMD Duron TM processor g++ 4.1.3 compiler ruby 1.8.6 interpreter libjit-0.1.1 just-in-time compiler Optimisation Potential cache for reusing compiled methods loop unrolling

Upload: others

Post on 13-Feb-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

  • Real-Time Computer Vision with Ruby and LibJIT

    J. Wedekind, M. Howarth, J. Travis, K. Dutton, B. Amavasai

    Microsystems & Machine Vision Laboratory, Sheffield Hallam University, Sheffield, United Kingdom

    EPSRC GR/S85696/01, Basic Technology: NanoroboticsTechnologies for Simultaneous Multidimensional Imaging & Manipulation of Nanoobjects

    Abstract

    Many image processing algorithms can be broken down into atomic operations on arrays. There are unaryoperations such as additive inverse, absolute value, or square-root and binary operations such as addition ormultiplication. While the arrays typically have elements of a single type, this element-type may be a signedor unsigned integer with 8, 16, 32, or more bits, a floating point number, or a complex number.

    A computer vision library thus needs to implement binary operations for various combinations of element-types. Furthermore binary operations can occur as array-array-, array-scalar-, or as scalar-array-operation.This kind of requirements ake it very hard to write a computer vision library in a statically typed languagesuch as C++. On the other hand a naive implementation in a dynamically typed programming language suchas Ruby will not satisfy real-time constraints.

    However recently the DotGNU project has released libJIT which is a just-in-time compiler library for i386,x86-64, and other processors. The combination of Ruby, libJIT, and other free software allows interactivedevelopment of real-time algorithms in an unprecedented way.

    Previous work: [WAD07], [WADB08], [HOR]

    References

    [HOR] Hornetseye. Web page. http://hornetseye.rubyforge.org/.

    [WAD07] J. Wedekind, B. Amavasai, and K. Dutton. Steerable filters generated with the hypercomplexdual-tree wavelet transform. In IEEE International Conference on Signal Processing and Commu-nications, pages 1291–4, 2007.

    [WADB08] J. Wedekind, B. Amavasai, K. Dutton, and M. Boissenin. A machine vision extension for the Rubyprogramming language. pages 991–6, 2008.

    HornetsEye, libJIT, Ruby

    type=DFLOAT othertype=UBYTE

    target=DFLOAT

    MultiArray.dfloat Fixnum

    no

    type.jit support? andothertype.jit support? andtarget.jit support?

    yes

    p = JITTerm.param( 0 )q = target.jit load( JITTerm.param( 2 ) )r = JITTerm.param( 1 )n = JITTerm.const( f, JITType::LINT, size * target.size )rend = r + nf.until( proc { r == rend } )

    x = typecode.jit load( f, q )target.jit store( action.call( x, q ) )p.inc!( typecode.size )r.inc!( target.size )

    end

    g,h ∈ {0,1, . . . ,w−1}×{0,1, . . . ,h−1}

    h(

    (

    x1x2

    )

    ) = g(

    (

    x1x2

    )

    )/2

    h = g.collect { |x| x / 2 }

    coercion

    h=g/2

    match

    Shi-Tomasi Corner Detector

    C ≔

    "W

    (

    δg(~x)

    δx1

    )2δg(~x)

    δx1·δg(~x)

    δx2δg(~x)

    δx1·δg(~x)

    δx2

    (

    δg(~x)

    δx2

    )2

    d~x

    C = A⊤(

    λ1 0

    0 λ2

    )

    A , Shi-Tomasi: min(λ1,λ2) > λ

    require ’hornetseye’

    include Hornetseye

    grad sigma, cov sigma = 1.0, 1.0

    img = MultiArray.load grey8 ’test.png’

    x = img.gauss gradient x grad sigma

    y = img.gauss gradient y grad sigma

    a = ( x ** 2 ).gauss blur cov sigma

    b = ( y ** 2 ).gauss blur cov sigma

    c = ( x * y ).gauss blur cov sigma

    tr = a + b

    det = a * b - c * c

    dissqrt = Math.sqrt( ( tr * tr - det * 4 ).

    major( 0.0 ) )

    result = 0.5 * ( tr - dissqrt )

    result.normalise( 255..0 ).show

    Colourspaces & I/O

    colourspaces

    RGB24 RGBF

    UYVY YUY2

    I420YV12Grey8

    Grey16GreyF

    V4L2

    libdc1394

    X−Video

    video filesimage files

    machine vision

    Hornetseye::Image

    Hornetseye::MultiArray

    Y

    CbCr

    =

    0.299 0.587 0.114−0.168736 −0.331264 0.5000.500 −0.418688 −0.081312

    R

    G

    B

    +

    0

    128

    128

    also see: http://fourcc.org/

    GUI integration Computing Time

    n.to_scomplex + 1

    n + Complex( 2, 3 )

    n * 2

    n + 1

    n * n

    n.fill!( 1.0 )

    n = NArray.sfloat( 4, 500_000 ).fill!( 0.0 )41 ms

    66 ms

    69 ms

    34 ms

    38 ms

    42 ms

    65 ms

    82 ms

    85 ms

    65 ms

    80 ms

    126 ms

    65 ms

    78 ms

    121 ms

    161 ms

    264 ms

    119 ms

    162 ms

    264 ms

    375 ms

    Comparison

    Blue bars are indicating the computingtime used by a naive C++ implementa-tion.Green bars are indicating the comput-ing time used by M. Tanaka’s Ruby-extension NArray.Red bars are indicating the computingtime used by the JIT-based implementa-tion of HornetsEye.

    Environment

    • 1.2 GHz AMD DuronTM processor

    • g++ 4.1.3 compiler

    • ruby 1.8.6 interpreter

    • libjit-0.1.1 just-in-time compiler

    Optimisation Potential

    • cache for reusing compiled methods

    • loop unrolling