Download - Keynote from ISUVR'10

Image-based modelling for augmented reality

Anton van den Hengel

Director, Australian Centre for Visual Technologies

Professor, Adelaide University, South Australia

Director, PunchCard Visual Technologies

3D Modelling for AR

AR needs models

AR is about the interaction between the real and the synthetic

3D modelling isn’t much fun

Even with the best interfaces invented

3D Studio Max? Blender?

User-created content

2D UCC has changed the face of the web

Blogs, wikis, social networking sites, advertising, fanfiction, news sites, trip planners, mobile photos and videos, customer review sites, forums, experience and photo sharing sites, audio, video games, maps and location systems, and more

Associated Content, Atom.com, BatchBuzz.com, Brickfish, CreateDebate, Dailymotion, Deviant Art, Demotix, Digg, eBay, Eventful, Fark, Epinions, Facebook, Filemobile, Flickr, Forelinksters, Friends Reunited, GiantBomb, Helium.com, HubPages, InfoBarrel, iStockphoto, Justin.tv, JayCut, Mahalo, Metacafe, Mouthshut.com, MySpace, Newgrounds, Orkut, OpenStreetMap, Picasa, Photobucket, PhoneZoo, Revver, Scribd, Second Life, Shutterstock, Shvoong, Skyrock, Squidoo, TripAdvisor, The Politicus, TypePad, Twitter, Urban Dictionary, Veoh, Vimeo, Widgetbox, Wigix, Wikia, WikiMapia, Wikinvest, Wikipedia, Wix.com, WordPress, Yelp, YouTube, YoYoGames, Zooppa

User-created content for AR

Google-created content for AR

UCC for AR

Just using images is a good start

But limits interactions to 2D

Flexible AR requires 3D models

Ubiquitous AR requires UCC

Flexible ubiquitous AR requires 3D UCC

3D UCC

3D has been limited by the lack of good UCC tools

This is true for AR

But also VR, 3D TV, Second Life, Google Earth, Little Big Planet, 3D PDF, Adobe Premiere, Unreal Tournament, PlayStation, SGML, ...

3D UCC

AR particularly needs to model the real world

Images are a good source of 3D information

Easily accessible

They’re typically captured anyway

Almost everything has a camera attached

Humans are very good at interpreting them

Can AR be ubiquitous without UCC?

Image-based 3D UCC

The image is the interface

People can’t help but see images in 3D

Most image sets embody 3D

Powerful way to model real objects

Varying levels of interaction

Varying types of models

Helps even in modelling imaginary objects

Image-based modelling for AR

AR is largely about interactive images

Any other mode of interaction adds complexity

The majority of the content is real

3D modelling from images seems a natural fit with AR

Image-based 3D modelling

Automatic

Very detailed models of everything

But it’s getting better

Interactive

Means you can specify: what you want to model; what kind of model you want

VideoTrace

Interactive image-based modelling

A familiar interface

Image-based interactions

The image is the interface

Generates low polygon count models with textures

Modelling

Results

Another example

Interactive 3D modelling

3D modelling is critical to all sorts of applications

Special effects, but also mining, architecture, defence, urban planning, …

People are getting more visually sophisticated

More 3D data is being generated

More cameras, but also scanners etc.

The interfaces of modelling programs are usually very hard to fathom

Low polygon-count models

Insert your own objects into a game

Model an environment for AR

Put your house into Google Earth

Video editing: cut and paste between sequences; remove someone from your home videos

Put your truck into a game

Modelling for special effects

Video editing requires models

Modelling architecture

Modelling for virtual environments

The process

Capture and import the video

Run video through the camera tracker

Performs structure and motion analysis

Interact with the system to generate and edit the model

Export to your application

The approach

Pre-compute where possible

Structure from motion (camera tracking)

Superpixels

Then interact

Interactions allow the user to exploit precomputed results

Structure from motion

Camera tracking

Calculates:

Reconstructed point cloud

Camera parameters: location, orientation, intrinsics (e.g. focal length)

Informs interaction interpretation process
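The camera parameters listed above are what relate a 2D interaction in the image to the reconstructed point cloud. As a minimal sketch, assuming a plain pinhole camera with no lens distortion (the slides do not say which camera model VideoTrace uses), a 3D point can be projected into the image from the intrinsics `K`, the orientation `R` and the camera location `C`. The function name `project_point` and all numbers are illustrative only.

```python
import numpy as np

def project_point(X, K, R, C):
    """Project world point X to pixel coordinates: x ~ K R (X - C)."""
    X_cam = R @ (np.asarray(X, dtype=float) - np.asarray(C, dtype=float))
    if X_cam[2] <= 0:
        raise ValueError("point is behind the camera")
    x = K @ X_cam
    return x[:2] / x[2]  # divide out depth to get pixels

# Assumed example camera: focal length 800 px, principal point (320, 240),
# sitting at the world origin and looking along +z.
focal = 800.0
K = np.array([[focal, 0.0, 320.0],
              [0.0, focal, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)      # orientation
C = np.zeros(3)    # location

pixel = project_point([0.5, -0.25, 4.0], K, R, C)
print(pixel)
```

Running the same projection in reverse, from a pixel to a line of sight, is what allows a click or a drawn line to be interpreted against the point cloud.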

Structure from motion

Interactions

Straight lines: closed sets of lines define planar polygons

Curves: for planar shapes with curved edges; for NURBS surfaces

Mirroring: duplicates existing geometry

Extrusion

Dense meshing
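Two of the interactions above have simple geometric cores. The following is a rough sketch, not VideoTrace's implementation: mirroring reflects existing vertices across a plane, and extrusion sweeps a planar polygon along a direction to create side faces. The function names and the example square are assumptions made for illustration.

```python
import numpy as np

def mirror_points(points, plane_point, plane_normal):
    """Reflect points across the plane through plane_point with the given normal."""
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)
    pts = np.asarray(points, dtype=float)
    signed_dist = (pts - np.asarray(plane_point, dtype=float)) @ n
    return pts - 2.0 * signed_dist[:, None] * n

def extrude_polygon(polygon, direction, distance):
    """Sweep a polygon along a direction; return vertices and side quads."""
    base = np.asarray(polygon, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    top = base + distance * d
    k = len(base)
    # Each side face joins one base edge to the matching top edge.
    sides = [(i, (i + 1) % k, k + (i + 1) % k, k + i) for i in range(k)]
    return np.vstack([base, top]), sides

square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

# Duplicate the square on the other side of the plane x = 2.
mirrored = mirror_points(square, plane_point=(2, 0, 0), plane_normal=(1, 0, 0))

# Turn the square into a box 0.5 units tall.
vertices, sides = extrude_polygon(square, direction=(0, 0, 1), distance=0.5)
print(mirrored[0], vertices.shape, sides[0])
```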

Fitting planar faces

User specifies boundary

Boundary specifies infinitely many planes

Fitting similar to pre-emptive RANSAC

Generate bounded plane hypotheses from point cloud

Eliminate hypotheses that fail a series of tests

Run simplest / most robust tests first

Generally 3D tests before 2D tests
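To see why a boundary drawn in one image specifies infinitely many planes, consider the sketch below. It assumes a pinhole camera at the origin with intrinsics `K` (an assumption, as are the function names): each boundary pixel defines a line of sight, and every plane that those lines cross yields a different 3D polygon with the same image outline. This is why the point cloud is needed to pick one.

```python
import numpy as np

def pixel_ray(pixel, K):
    """Unit direction of the line of sight through a pixel (camera at origin)."""
    d = np.linalg.solve(K, np.array([pixel[0], pixel[1], 1.0]))
    return d / np.linalg.norm(d)

def lift_boundary(boundary_px, K, normal, offset):
    """Intersect each boundary ray with the plane normal . X = offset."""
    n = np.asarray(normal, dtype=float)
    corners = []
    for px in boundary_px:
        d = pixel_ray(px, K)
        denom = n @ d
        if abs(denom) < 1e-9:
            raise ValueError("line of sight is parallel to the plane")
        corners.append((offset / denom) * d)
    return np.array(corners)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
boundary = [(220, 140), (420, 140), (420, 340), (220, 340)]

# Two different plane hypotheses give two different 3D faces...
near = lift_boundary(boundary, K, normal=(0, 0, 1), offset=4.0)
far = lift_boundary(boundary, K, normal=(0, 0, 1), offset=8.0)
print(near[0], far[0])  # ...that both project to the same drawn boundary.
```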

[Figure: image plane, line of sight, object points]

Fitting planar faces

Hierarchical RANSAC

Generate bounded plane hypotheses

Tests:

Support from point cloud

Reprojects within new image boundaries

Constraints on relative edge length and face size

Colour histogram matching on faces

Colour matching on edge projections

Reprojection is not self-occluding

2D Curves

3D Curves

Mirroring

Extrusion

Dense surface reconstruction

Live modelling

Live modelling

Most geometry cannot be modelled beforehand

You can’t tell where it will be

Modelling the whole world won’t work

Need to generate models in situ

While you’re there

Live modelling in AR

Using VideoTrace to model geometry from live video

To insert elsewhere in the world

So real objects can occlude synthetic geometry

Live modelling for AR

The camera tracking is performed live using SLAM (Simultaneous Localisation and Mapping)

Markerless video tracking

No prior model of the space

Using PTAM (Parallel Tracking and Mapping; Klein and Murray)

VideoTrace - Live

Occlusion

Low polygon count models?

Needed for efficiency

Not accurate enough for occlusion calculations

SLAM errors also prevent direct occlusion modelling

Occlusion boundary refinement

The model of the foreground object is projected into the image

Using the PTAM-estimated camera parameters

But there is always some misalignment

Solve using a live segmentation of the real object from the video


Lay out nodes of a graph around the projected boundary

Set foreground and background probabilities per node from colour model

Set link weights from edge strength

Segment using max-flow algorithm

At frame rate
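The steps above can be sketched on a toy problem: a single row of nodes crossing the projected boundary. This is a minimal illustration under stated assumptions, not the system's implementation. It uses raw probabilities as terminal capacities (real systems commonly use negative log-likelihoods), a hand-written Edmonds-Karp max-flow in place of an optimised solver, and made-up probabilities and edge strengths. It makes no attempt to run at frame rate.

```python
from collections import deque

def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the nodes left on the source side."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0.0)  # reverse edges

    def reachable():
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable()
        if sink not in parent:
            return set(parent)  # no augmenting path left: this is the cut
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(residual[u][w] for u, w in path)
        for u, w in path:
            residual[u][w] -= flow
            residual[w][u] += flow

def segment(p_fg, edge_strength, smoothness=2.0):
    """p_fg[i]: foreground probability of node i from a colour model.
    edge_strength[i]: image edge strength between nodes i and i + 1, in [0, 1]."""
    cap = {"S": {}, "T": {}}
    for i, p in enumerate(p_fg):
        cap["S"][i] = p                        # paid if i ends up background
        cap.setdefault(i, {})["T"] = 1.0 - p   # paid if i ends up foreground
    for i, e in enumerate(edge_strength):
        w = smoothness * (1.0 - e)             # cheap to cut across strong edges
        cap[i][i + 1] = w
        cap[i + 1][i] = w
    foreground = max_flow_min_cut(cap, "S", "T")
    return [i in foreground for i in range(len(p_fg))]

# A strong image edge lies between nodes 2 and 3.
labels = segment(p_fg=[0.9, 0.8, 0.6, 0.4, 0.2, 0.1],
                 edge_strength=[0.1, 0.1, 0.9, 0.1, 0.1])
print(labels)
```

Restricting the graph to a band around the projected boundary keeps it small, which is presumably part of what makes a per-frame solve feasible.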


Graph cut means that model doesn’t need to be accurate

Very low polygon counts

Very simple modelling process

More complex objects possible


Graph cut gives a hard segmentation

Fix with an alpha matte

Blends between foreground and synthetic object

Fixes some holes in the cut
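The blending step can be illustrated with standard alpha compositing. In this sketch the matte is produced by simply box-blurring the hard mask, which is only a stand-in: the slides do not say how the alpha matte is computed, and the function names and image values are assumptions.

```python
import numpy as np

def soften_mask(hard_mask, radius=1):
    """Box-blur a binary mask into an alpha matte with values in [0, 1]."""
    alpha = hard_mask.astype(float)
    padded = np.pad(alpha, radius, mode="edge")
    size = 2 * radius + 1
    h, w = alpha.shape
    out = np.zeros_like(alpha)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / size**2

def composite(real, synthetic, alpha):
    """Per-pixel blend: alpha * real foreground + (1 - alpha) * synthetic."""
    a = alpha[..., None]  # broadcast over colour channels
    return a * real + (1.0 - a) * synthetic

# Hard segmentation: the left three columns are the real foreground object.
mask = np.zeros((4, 6), dtype=bool)
mask[:, :3] = True

real = np.full((4, 6, 3), 200.0)       # video frame (real object)
synthetic = np.full((4, 6, 3), 50.0)   # rendered synthetic object

alpha = soften_mask(mask)
blended = composite(real, synthetic, alpha)
print(alpha[0], blended[0, 2])
```

Pixels near the cut receive intermediate alpha values, so the transition between the real object and the synthetic one is gradual rather than a hard step.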

Live modelling for AR

AR modelling for other purposes

Minimal interaction AR modelling

Use the camera as the modelling tool

The user only specifies the object, the rest is done with the camera

Projective texturing

Some compensation for Visual Hull

Silhouette modelling
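Silhouette modelling is usually explained through the visual hull: the largest shape consistent with every silhouette of the object. The sketch below carves a voxel grid using three orthographic views along the grid axes. That setup is a strong simplification chosen for brevity (a live system would use perspective cameras with tracked poses), and the function name and the test sphere are assumptions.

```python
import numpy as np

def carve_visual_hull(silhouettes):
    """Intersect silhouette cones on a cubic voxel grid.

    silhouettes maps a grid axis (0, 1 or 2) to a 2D boolean mask seen by an
    orthographic camera looking along that axis."""
    n = next(iter(silhouettes.values())).shape[0]
    hull = np.ones((n, n, n), dtype=bool)
    for axis, mask in silhouettes.items():
        # A voxel survives only if it projects inside this silhouette.
        hull &= np.expand_dims(mask, axis=axis)
    return hull

# Ground-truth object for the demo: a voxelised sphere.
n = 16
grid = np.indices((n, n, n)) - (n - 1) / 2.0
obj = (grid ** 2).sum(axis=0) <= 6.0 ** 2

# "Capture" one silhouette per axis, then rebuild the shape from them alone.
silhouettes = {axis: obj.any(axis=axis) for axis in range(3)}
hull = carve_visual_hull(silhouettes)
print(obj.sum(), hull.sum())
```

The hull always contains the true object but is generally larger than it, which is one reason a silhouette-based model benefits from some compensation such as texturing.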

Minimal interaction modelling

How to get VideoTrace

It’s available as a free beta

Just register at www.punchcard.com.au

They will email you a link

It’s a real beta

Hopefully the final version will be free too

What’s next?

New interactions, applications and data sources

Interactive SFM, better SLAM

Videoshop
