fast texture synthesis using tree-structuredvector...

Fast Texture Synthesis using Tree-structured Vector QuantizationCategory: ResearchPapernumber:0015

Figure1: Ourtexturegenerationprocesstakesanexampletexturepatch(left) andarandomnoise(middle)asinput,andmodifiesthisrandomnoiseto make it look like thegivenexampletexture.Thesynthesizedtexture(right) canbeof arbitrarysize,andis perceivedasvery similarto thegivenexample.Usingouralgorithm,texturescanbegeneratedwithin seconds,andthesynthesizedresultsarealwaystileable.

Abstract

Texture synthesisis importantfor many applicationsin computergraphics,vision, andimageprocessing.However, it remainsdiffi-cult to designanalgorithmthatis bothefficientandcapableof gen-eratinghigh quality results. In this paper, we presentan efficientalgorithmfor realistic texture synthesis.The algorithmis easytouseandrequiresonly a sampletexture as input. It generatestex-tureswith perceivedquality equalto or betterthanthoseproducedby previous techniques,but runs two ordersof magnitudefaster.Thispermitsusto applytexturesynthesisto problemswhereit hastraditionally beenconsideredimpractical. In particular, we haveappliedit to constrainedsynthesisfor imageeditingandtemporaltexturegeneration.Ouralgorithmis derivedfrom Markov RandomField texturemodels,andgeneratestexturesthrougha determinis-tic searchingprocess.We acceleratethis synthesisprocessusingtree-structuredvectorquantization.

Keywords: Texture Synthesis,CompressionAlgorithms, ImageProcessing

1 Introduction

Textureis aubiquitousvisualexperience.It candescribeawideva-riety of surfacecharacteristicssuchasterrain,plants,minerals,furandskin. Sincereproducingthevisualrealismof therealworld is amajorgoalfor computergraphics,texturesarecommonlyemployedwhenrenderingsyntheticimages.Thesetexturescanbe obtainedfrom a variety of sourcessuchashand-drawn picturesor scannedphotographs.Hand-drawn picturescanbe aestheticallypleasing,but it is hardto make themphoto-realistic.Most scannedimages,however, are of inadequatesizeandcan lead to visible seamsorrepetitionif they aredirectlyusedfor texturemapping.

Texture synthesisis an alternative way to createtextures. Be-causesynthetictexturescanbemadeof any size,visual repetitionis avoided. Texturesynthesiscanalsoproducetileableimagesbyproperlyhandlingthe boundaryconditions.Potentialapplicationsof texture synthesisarealsobroad;someexamplesareimagede-noising,occlusionfill-in, andcompression.

Thegoalof texturesynthesiscanbestatedasfollows: Given atexture sample,synthesizea new texture that, whenperceived bya humanobserver, appearsto begeneratedby thesameunderlyingstochasticprocess. The major challengesare 1) Modeling- howto estimatethestochasticprocessfrom agivenfinite texturesampleand2) Sampling-how to developanefficientsamplingproceduretoproducenew texturesfrom a givenmodel. Both themodelingandsamplingpartsare essentialfor the successof texture synthesis;the visual fidelity of generatedtextureswill dependprimarily ontheaccuracy of themodeling,while theefficiency of thesamplingprocedurewill directlydeterminethecomputationalcostof texturegeneration.

In this paper, we presenta very simplealgorithmthat can ef-ficiently synthesizea wide variety of textures. The inputsconsistof an exampletexture patchanda randomnoiseimagewith sizespecifiedby the user(Figure1). The algorithmmodifiesthis ran-domnoiseto makeit look likethegivenexample.This techniqueisflexible andeasyto use,sinceonly anexampletexturepatch(usu-ally a photograph)is required.New texturescanbegeneratedwithlittle computationtime, andtheir tileability is guaranteed.Theal-gorithmis alsoeasyto implement;thetwo majorcomponentsareamultiresolutionpyramidandasimplesearchingalgorithm.

Thekey advantagesof this algorithmarequality andspeed;thequality of thesynthesizedtexturesareequalto or betterthanthosegeneratedby previous techniques,while the computationspeedis2 ordersof magnitudefasterthan thoseapproachesthat generatecomparableresultsto our algorithm. This permitsus to applyour

paper# 0015 2

algorithm in areaswheretexture synthesishastraditionally beenconsideredtooexpensive. In particular, wehaveextendedthealgo-rithm to constrainedsynthesisfor imageeditingandmotiontexturesynthesis.

1.1 Previous Work

Numerousapproacheshave beenproposedfor textureanalysisandsynthesis,andanexhaustive survey is beyondthescopeof this pa-per. We briefly review somerecentandrepresentative works andreferthereaderto [9] and[13] for morecompletesurveys.

Physical Simulation: It is possibleto synthesizecertainsur-facetexturesby directly simulatingtheir physicalgenerationpro-cess.Biological patternssuchasfur, scales,andskin canbemod-eled using reactiondiffusion ([28]) and cellular texturing ([29]).Someweatheringandmineralphenomenoncanbefaithfully repro-ducedby detailedsimulations([6]). Thesetechniquescanproducetexturesdirectly on 3D meshesso the texture mappingdistortionproblemis avoided. However, different texturesareusuallygen-eratedby very differentphysicalprocessso theseapproachesareapplicableto only limited classesof textures.

Markov Random Field and Gibbs Sampling: Many algo-rithms model texturesby Markov RandomFields (or in a differ-entmathematicalform, GibbsSampling),andgeneratetexturesbyprobabilitysampling([7], [30], [21], [19]). SinceMarkov RandomFieldshave beenproven to be a goodapproximationfor a broadrangeof textures,thesealgorithmsaregeneralandsomeof themproducegoodresults.A drawbackof Markov RandomField sam-pling, though,is that it is computationallyexpensive; even smalltexturepatchescantake hoursor daysto generate.

Feature Matching: Somealgorithmsmodel texturesasa setof features,andgeneratenew imagesby matchingthe featuresinan exampletexture ([10], [5], [23]). Thesealgorithmsareusuallymoreefficient thanMarkov RandomField algorithms.HeegerandBergen([10]) modeltexturesby matchingmarginal histogramsofimagepyramids.Theirtechniquesucceedsonhighlystochastictex-turesbut fails on morestructuredones.De Bonet([5]) synthesizesnew imagesby randomizinganinput texturesamplewhile preserv-ing the cross-scaledependencies.This methodworks betterthan[10] on structuredtextures,but it canproduceboundaryartifactsiftheinput textureis not tileable.SimoncelliandPortilla ([23]) gen-eratetexturesby matchingthejoint statisticsof theimagepyramids.Theirmethodcansuccessfullycaptureglobaltexturalstructuresbutfails to preserve localpatterns.

1.2 Overview

Ourgoalwasto developanalgorithmthatcombinestheadvantagesof previous approaches.We want it to be efficient, general,andable to producehigh quality, tileable textures. It shouldalso beuserfriendly; i.e. the numberof tunableinput parametersshouldbeminimal. This canbeachievedby a carefulselectionof thetex-turemodelingandsynthesisprocedure.For the texturemodel,weuseMarkov RandomFields (MRF) sincethey have beenprovento cover the widest variety of useful texture types. To avoid theusualcomputationalexpenseof MRFs,we have developeda syn-thesisprocedurewhichavoidsexplicit probabilityconstructionandsampling.

Markov RandomField methodsmodela textureasa realizationof a local andstationary randomprocess.That is, eachpixel of atextureimageis characterizedby a smallsetof spatiallyneighbor-ing pixels,andthis characterizationis thesamefor all pixels. Theintuition behindthis modelcanbe demonstratedby the followingexperiment(Figure2). Imaginethataviewer is givenanimage,butonly allowed to observe it througha small movablewindow. Asthewindow is movedtheviewer canobserve differentpartsof the

(a)

(a1) (a2)

(b)

(b1) (b2)

Figure2: How texturesdiffer from images. (a) is a generalim-agewhile (b) is a texture. A movablewindow with two differentpositionsaredrawn asblacksquaresin (a) and(b), with thecorre-spondingcontentsshown below. Differentregionsof a texturearealwaysperceived to besimilar (b1,b2),which is not thecasefor ageneralimage(a1,a2).In addition,eachpixel in (b) is only relatedto a smallsetof neighboringpixels. Thesetwo characteristicsarecalledstationarityandlocality, respectively.

image.Theimageis stationaryif, undera properwindow size,theobservable portion alwaysappearssimilar. The imageis local ifeachpixel is predictablefrom asmallsetof neighboringpixelsandis independentof therestof theimage.

Basedon theselocality andstationarityassumptions,our algo-rithm synthesizesa new texture so that it is locally similar to anexampletexturepatch.Thenew textureis generatedpixel by pixel,andeachpixel is determinedso that local similarity is preservedbetweenthe exampletextureandthe result image. This synthesisprocedure,unlikemostMRF basedalgorithms,is completelydeter-ministic andno explicit probabilitydistribution is constructed.Asaresult,it is efficientandamenableto furtheracceleration.

Theremainderof thepaperis organizedasfollows. In section2,we presentthe algorithm. In section3, we demonstratesynthe-sisresultsandcomparethemwith thosegeneratedby previousap-proaches.In section4, weproposeaccelerationtechniques.In sec-tions 5 and 6, wediscussapplications,limitations,andextensions.

2 Algorithm

UsingMarkov RandomFieldsasthetexturemodel,thegoalof thesynthesisalgorithm is to generatea new texture so that eachlo-cal region of it is similar to anotherregion from the input texture.We first describehow the algorithmworks in a singleresolution,andthenweextendit usingamultiresolutionpyramidto obtainim-provementsin efficiency. For easyreference,we list the symbolsusedin Table1 andsummarizethealgorithmin Table2.

2.1 Single Resolution Synthesis

The algorithmstartswith an input texture sample��

anda whiterandomnoise

��. We forcetherandomnoise

��to look like

� �by

transforming� �

pixel by pixel in arasterscanordering,i.e. fromtopto bottomandleft to right. Figure3 shows a graphicalillustrationof thesynthesisprocess.

To determinethe pixel value � at� �

, its spatialneighborhood�� (the L-shapedregions in Figure3) is comparedagainstallpossibleneighborhoods

�� from��

. The input pixel � � withthemostsimilar

�� is assignedto � . We usea simple �� norm

paper# 0015 3

(b) (c) (d) (e)

: Neighborhood N

p

p

(a)

p p

Figure3: Singleresolutiontexturesynthesis.(a) is theinput textureand(b)-(d)show differentsynthesisstagesof theoutputimage.Pixelsintheoutputimageareassignedin arasterscanordering.Thevalueof eachoutputpixel � is determinedby comparingits spatialneighborhood�� with all neighborhoodsin the input texture. The input pixel with mostsimilar neighborhoodwill be assignedto the correspondingoutputpixel. Neighborhoodscrossingtheoutputimageboundaries(shown in (b) and(c)) arehandledtoroidally, asdiscussedin Section2.4.

Symbol Meaning��Input texturesample��Outputtextureimage� �

Gaussianpyramidbuilt from��

Gaussianpyramidbuilt from��

� � An inputpixel in��

or� �

� An outputpixel in� �

or� �� Neighborhoodaroundthepixel �� th level of pyramid�� pixel at level � andposition� �� of

�

Table1: Tableof symbols

(a) (b) (c)

Figure4: Synthesisresultswith differentneighborhoodsizes.Theneighborhoodsizesare(a) 5x5, (b) 7x7, (c) 9x9, respectively. Allimagesshown areof size128x128.Notethatastheneighborhoodsizeincreasestheresultingtexturequalitygetsbetter. However, thecomputationcostalsoincreases.

(sumof squareddifference)to measurethesimilarity betweentheneighborhoods.Thegoalof this synthesisprocessis to ensurethatthe newly assignedpixel � will maintainasmuchlocal similaritybetween

� �and��

aspossible.The sameprocessis repeatedforeachoutputpixel until all thepixelsaredetermined.This is akin toputtingtogetherajigsaw puzzle:thepiecesaretheindividualpixelsandthefitnessbetweenthesepiecesis determinedby thecolorsofthesurroundingneighborhoodpixels.

2.2 Neighborhood

Becausethe setof local neighborhoods�� is usedasthe pri-

marymodelfor textures,thequality of thesynthesizedresultswill

(a)

(b) (c)

Figure5: Causalityof the neighborhood.(a) sampletexture (b)synthesisresultusingacausalneighborhood(c) synthesisresultus-ing anoncausalneighborhood.Both (b) and(c) aregeneratedfromthe samerandomnoiseusinga 9x9 neighborhood.As shown, anoncausalneighborhoodis unableto generatevalid results.

dependon its sizeandshape.Intuitively, thesizeof theneighbor-hoodsshouldbeonthescaleof thelargestregulartexturestructure;otherwisethis structuremaybelost andtheresultimagewill looktoo random.Figure4 demonstratestheeffect of theneighborhoodsizeon thesynthesisresults.

Theshapeof theneighborhoodwill directly determinethequal-ity of

��. It mustbecausal,i.e. theneighborhoodcanonly contain

thosepixels precedingthe currentoutput pixel in the rasterscanordering. The reasonis to ensurethat eachoutputneighborhood�� will containonly alreadyassignedpixels. For the first fewrowsandcolumnsof

� �,�� maycontainunassigned(noise)pix-

elsbut asthealgorithmprogressesall theother�� will becom-

pletely“valid” (containsonlyalreadyassignedpixels).A noncausal�� , whichalwayscontainsunassignedpixels,is unableto trans-form

� �to look like

��(Figure5). Thus,the noiseimageis only

usedwhengeneratingthefirst few rows andcolumnsof theoutputimage.After this, it is ignored.

2.3 Multiresolution Synthesis

The singleresolutionalgorithmcapturesthe texture structuresbyusingadequatelysizedneighborhoods.However, for texturescon-taining largescalestructureswe have to uselargeneighborhoods,and large neighborhoodsdemandmore computation. This prob-lem canbesolvedby usinga mutlresolutionimagepyramid ([4]);computationis saved becausewe canrepresentlarge scalestruc-

paper# 0015 4

O O

O

O

O

O OO

O O

O

Q

Q

Q Q Q

L+1

Y Q

QQ

L

O

X

Figure6: A causalneighborhoodcontainingtwo levelsof pyramidpixels.Thecurrentlevel of thepyramidis shown atleft andthenextlower resolutionlevel is shown at right. Thecurrentoutputpixel � ,marked asX, is locatedat

� �� , where � is the currentlevelnumberand

� �� is its coordinate.At this level of thepyramid �theimageis only partiallycomplete.Thus,wemustusethepreced-ing pixelsin therasterscanordering(markedasO).Thepositionoftheparentof thecurrentpixel, locatedat

� �� !��" � �$# � , is markedasY.

turesmorecompactlyby a few pixels in a certainlower resolutionpyramidlevel.

The multiresolutionsynthesisalgorithm proceedsas follows.Two Gaussianpyramids([4]),

� �and� �

, arefirst built from��

and��

, respectively. Thealgorithmthentransforms� �

from lowerto higherresolutions,suchthateachhigherresolutionlevel is con-structedfrom thealreadysynthesizedlower resolutionlevels. Thisis similar to the sequencein which a pictureis painted: long andthick strokes areplacedfirst, anddetailsare thenadded. Withineachoutputpyramid level

� �%� �& , the pixels aresynthesizedin away similar to the singleresolutioncasewherethe pixels areas-signedin a rasterscanordering. Theonly modificationis that forthemultiresoltioncase,eachneighborhood

�� containspixelsinthecurrentresolutionaswell asthosein thelower resolutions.Anexampleof multiresolutionneighborhoodwith two levelsis shownin Figure6. Thesimilarity betweentwo multiresolutionneighbor-hoodsis measuredby computingthe sumof squareddistanceofall pixelswithin them. Theselower resolutionpixelsconstrainthesynthesisprocessso that the addedhigh frequency detailswill beconsistentwith thealreadysynthesizedlow frequency structures.

2.4 Edge Handling

Proper edge handling for�� near the image boundariesis

very important. For the synthesispyramid the edge is treatedtoroidally. In other words, if

� � � �� denotesthe pixel atlevel � and position

� �� of pyramid� �

, then� �%� ��('� � � ��)�+*-,/.�01�)��*-,/. � , where 0 and�

are the num-berof rows andcolumns,respectively, of

� � � �� . Handlingedgestoroidally is essentialto guaranteethat the resultingsynthetictex-turewill tile seamlessly.

For theinputpyramid� �

, toroidalneighborhoodswill typicallycontaindiscontinuitiesunless

� �is tileable.A reasonableedgehan-

dler for� �

is to padit with areflectedcopy of itself. Anothersolu-tion is to useonly those

�� completelyinside� �

, anddiscardthosecrossingtheboundaries.Weusea reflective edgehandlerforall examplesshown in thispaper.

2.5 Initialization

Naturaltexturesoftencontainrecognizablestructuresaswell asacertainamountof randomness.Sinceour goalis to reproducereal-istic textures,it is essentialthat thealgorithmcapturestherandomaspectof the textures. This notion of randomnesscansometimesbeachievedby entropy maximization([30]), but thecomputationalcostis prohibitive. Instead,we initialize the outputimage

� �asa

white randomnoise,andgraduallymodify this noiseto look likethe input texture

� �. This initialization stepseedsthe algorithm

with sufficiententropy, andletstherestof thesynthesisprocessfo-cuson the transformationof

��towards

� �. To make this random

noiseabetterinitial guess,wealsoequalizethepyramidhistogramof� �

with respectto� �

([10]).The initial noiseeffects the synthesisprocessin the following

way. For thesingleresolutioncase,neighborhoodsin thefirst fewrows andcolumnsof

��containnoisepixels. Thesenoisepixels

introduceuncertaintyin theneighborhoodmatchingprocess,caus-ing the boundarypixels to be assignedsemi-stochastically(How-ever, thesearchingprocessis still deterministic.Therandomnessiscausedby the initial noise). The restof the noisepixels areover-writtendirectlyduringsynthesis.For themultiresolutioncase,how-ever, moreof thenoisepixelscontributeto thesynthesisprocess,atleastindirectly, sincethey determinethe initial valueof the lowestresolutionlevel of

� �.

2.6 Summary of Algorithm

Wesummarizethealgorithmin thefollowing pseudocode.

function�3254

TextureSynthesis(��

,� �

)1 Initialize(

��);

2� �64

BuildPyramid(��

);3

� �74BuildPyramid(

��);

4 foreach level � from lower to higherresolutionsof� �

5 loop throughall pixels� � � �� of

� � � �&6 8 4 FindBestMatch(

� �,� �

, �� );7

� � � �� 4 8 ;8

� 2 4ReconPyramid(

� �);

9 return�32

;

function 8 4 FindBestMatch(� �

,� �

, �� )1

� � 4BuildNeighborhood(

� � �� );2

��9�: ��;� 4null; 8 4 null;

3 loop throughall pixels� � � �� of

� �<� �&4

�-�64BuildNeighborhood(

� �, �� );

5 if Match(�=�

,� �

) > Match(� 9�: ��;�

,� �

)6

� 9�: ��;� 4?�=�; 8 4 � �/� �� ;

7 return 8 ;

Table2: Pseudocodeof theAlgorithm

The architectureof this algorithm is flexible; it is composedfrom severalorthogonalcomponents.We list thesecomponentsasfollowsanddiscussthecorrespondingdesignchoices.

Pyramid: The pyramids are built from and reconstructedtoimagesusing the standardroutines BuildPyramid and Recon-Pyramid. Various pyramids can be usedfor texture synthesis;examplesareGaussianpyramid ([21]), Laplacianpyramid ([10]),steerablepyramid ([10], [23]), and feature-basedpyramids([5]).Different pyramids will give different trade-offs betweenspatialand frequency resolutions. In this paper, we chooseto use theGaussianpyramidfor its simplicity andgreaterspatiallocalization(adetaileddiscussionof this issuecanbefoundin [20]). However,otherkindsof pyramidscanbeusedinstead.

paper# 0015 5

Neighborhood: The neighborhoodcan have arbitrary size andshape;theonly requirementis that it containsonly valid pixels. Anoncausal/symmetricneighborhood,for example,canbe usedbyextendingtheoriginalalgorithmwith two passes(Section5.1).

Synthesis Ordering: A rasterscanorderingis usedin line 5 offunction TextureSynthesis. This, however, canalsobe extended.For example,a spiralorderingcanbeusedfor constrainedtexturesynthesis(Section5.1). The synthesisorderingshouldcooperatewith the BuildNeighborhood so that the neighborhoodcontainsonly valid pixels.

Searching: An exhaustive searchingprocedureFindBestMatchis employed to determinethe outputpixel values. Becausethis isa standardprocess,variouspoint searchingalgorithmscanbeusedfor acceleration.Thiswill bediscussedin detailin Section4.

3 Synthesis Results

To test the effectivenessof our approach,we have run the algo-rithm on many different imagesfrom standardtexture sets. Fig-ure7 showsexamplesusingtheBrodatztexturealbum([3]) andtheMIT VisTex set([17]). TheBrodatzalbum is themostcommonlyusedtexture testingsuiteandcontainsa broadrangeof grayscaleimages. Sincemost graphicsapplicationsrequirecolor textures,wealsousetheMIT VisTex set,whichcontainsrealworld texturesphotographedundernaturallighting conditions.

A visual comparisonof our approachwith several other algo-rithmsis shown in Figure8. Result(a) is generatedby HeegerandBergen’salgorithm([10]) usingasteerablepyramidwith 6 orienta-tions.Thealgorithmcapturescertainrandomaspectsof thetexturebut fails on thedominatinggrid-likestructures.Result(b) is gener-atedby DeBonet’sapproach([5]) wherewechoosehisrandomnessparameterto make theresultlook best.Thoughcapableof captur-ing morestructuralpatternsthan(a), certainboundaryartifactsareveryvisible. This is becausehisapproachcharacterizestexturesbylower frequency pyramidlevelsonly; thereforethelateralrelation-shipbetweenpixelsat thesamelevel is lost. Result(c) is generatedby Efros andLeung’s algorithm([7]). This techniqueis basedontheMarkov RandomFieldmodelandis capableof generatinghighquality textures.However, a directapplicationof his approachcanproducenon-tileableresults.1

Result(d) is synthesizedusingour approach.It is tileableandthe imagequality is comparablewith thosesynthesizeddirectlyfrom MRFs. It took about8 minutesto generateusinga 195MHzR10000processor. However, this is not the maximumpossiblespeedachievablewith this algorithm. In the next section,we de-scribemodificationsthatacceleratethealgorithmgreatly.

4 Acceleration

Our deterministicsynthesisprocedureavoids the usualcomputa-tional requirementfor samplingfrom a MRF. However, the algo-rithm asdescribedemploys exhaustive searching,which makes itslow. Fortunately, accelerationis possible.Thisis achievedby con-sideringneighborhoods

�� aspoints in a multiple dimensionalspace,andcastingtheneighborhoodmatchingprocessasa nearestpointsearchingproblem([18]).

The nearestpoint searchingproblemin multiple dimensionsisstatedasfollows: givena set @ of A pointsandanovel querypoint

1We have found that it is possibleto extendtheir approachusingmul-tiresolutionpyramidsandatoroidalneighborhoodto maketileabletextures.However this is not statedin theoriginalpaper([7]).

Bin a . -dimensionalspace,find apoint in thesetsuchthatits dis-

tancefromB

is lesserthan,or equalto, thedistanceofB

from anyotherpoint in theset.Becausea largenumberof suchqueriesmayneedto be conductedover thesamedataset @ , thecomputationalcostcanbe reducedif we preprocess@ to createa datastructurethat allows fastnearestpoint queries. Many suchdatastructureshavebeenproposedandwereferthereaderto [18] for amorecom-pletereference.However, mostof thesealgorithmsassumegenericinputsanddo not attemptto take advantageof any specialstruc-turesthey mayhave. Popat([21]) observedthattheset @ of spatialneighborhoodsfrom a texture canoften be characterizedwell bya clusteringprobability model. Taking advantageof this cluster-ing property, we proposeto usetree-structuredvectorquantization(TSVQ,[8]) asthesearchingalgorithm([27]).

4.1 TSVQ Acceleration

Tree-structuredvectorquantization(TSVQ)is acommontechniquefor datacompression.It takes a set of training vectorsas input,andgeneratesa binary-tree-structuredcodebook.The first stepisto computethecentroidof thesetof trainingvectorsanduseit astheroot level codeword. To find thechildrenof this root, thecen-troid andaperturbedcentroidarechosenasinitial child codewords.A generalizedLloyd algorithm([8]), consistingof alternationsbe-tweencentroidcomputationandnearestcentroidpartition, is thenusedto find the locally optimal codewords for the two children.The training vectorsare divided into two groupsbasedon thesecodewordsandthealgorithmrecursesoneachof thesubtrees.Thisprocessterminateswhenthe numberof codewordsexceedsa pre-selectedsizeor theaveragecodingerror is below a certainthresh-old. The final codebookis the collectionof the leaf level code-words.

Thetreegeneratedby TSVQ canbeusedasa datastructureforefficient nearestpoint query. To find the nearestpoint of a givenqueryvector, thetreeis traversedfrom theroot in abestfirst order-ingby comparingthequeryvectorwith thetwochildrencodewords,andthenfollowstheonethathasaclosercodeword. Thisprocessisrepeatedfor eachvisitednodeuntil a leafnodeis reached.Thebestcodeword is thenreturnedasthecodewordof thatleafnode.Unlikefull searching,theresultcodewordmaynotbetheoptimalonesinceonly partof the treeis traversed.However, the resultcodeword isusuallycloseto theoptimalsolution,andthecomputationis moreefficient thanfull searching.If thetreeis reasonablybalanced(thiscanbe enforcedin the algorithm),a singlesearchwith codebooksize CD@�C canbe achieved in time E �GF ,/H�CI@�CJ , which is muchfasterthanexhaustive searchingwith lineartimecomplexity E � CI@�CJ .

To useTSVQ in our synthesisalgorithm,we simply collect theset of neighborhoodpixels

�� for eachinput pixel and treatthemasavectorof sizeequalto thenumberof pixelsin

�� . Weusethesevectors K �� ML from each

� �<� �& asthe training data,and generatethe correspondingtree structurecodebooksN � �& .During the synthesisprocess,the (approximate)closestpoint foreach�� at

� �O� �� is foundbydoingabestfirst traversalof N � �& .Becausethis treetraversalhastime complexity E �GF ,<H �=P (where�=P

is thenumberof pixelsof� �<� �& ), thesynthesisprocedurecan

beexecutedveryefficiently. Typical texturestakesecondsto gener-ate;theexacttiming dependson theinputandoutputimagesizes.

4.2 Acceleration Results

An example comparingthe resultsof exhaustive searchingandTSVQis shown in Figure9. Theoriginal imagesizesare128x128andthe resultingimagesizesare200x200. The averagerunningtime for exhaustive searchingis 360 seconds.The averagetrain-ing time for TSVQ is 22 secondsandtheaveragesynthesistime is7.5 seconds.Thecodeis implementedin C++ andthetimingsare

paper# 0015 6

(a) (f)

(b) (g)

(c) (h)

(d) (i)

(e) (j)

Figure7: Texturesynthesisresults.Thesmallerpatchesaretheinput textures,andto their right aresynthesizedresults.A 9x9neighborhoodis usedfor all cases.Brodatztextures: (a) D52 (b) D103 (c) D84 (d) D11 (e) D20. VisTex textures: (f) Flowers0000(g) Misc 0000(h)Clouds0000(i) Fabric0015(j) Leaves0009.

paper# 0015 7

Sample

(a) (b) (c) (d)

Figure8: A comparisonof texturesynthesisresultsusingdifferentalgorithms:(a)HeegerandBergen’smethod([10]) (b) DeBonet’smethod([5]) (c) EfrosandLeung’smethod([7]) (d) Ourmethod.Only EfrosandLeung’salgorithmproducesresultscomparablewith ours.However,ouralgorithmis 2 ordersof magnitudefasterthantheirs(Section4). Thesampletexturepatchhassize64x64,andall theresultimagesareofsize192x192.A 9x9neighborhoodis usedfor (c) and(d).

(a)Exhaustive (a)TSVQ (b) Exhaustive (b) TSVQ

Figure9: AcceleratedsynthesisusingTSVQ.In eachpairof figures,theresultgeneratedby exhaustivesearchingis ontheleft andtheTSVQacceleratedresultis on theright. Theoriginal imagesareshown in Figure7. All generatedimagesareof size200x200.Theaveragerunningtimefor exhaustivesearchingis 360seconds.Theaveragetrainingtimefor TSVQis 22secondsandtheaveragesynthesistimeis 7.5seconds.

(a) (b) (c)

Figure10: TSVQ accelerationwith differentcodebooksizes.Theoriginal imagesizeis 64x64andall thesesynthesizedresultsareofsize128x128.Thenumberof codewordsin eachcaseare(a)64(b)512(c) 4096(all).

measuredon a 195MHzR10000processor. As shown in Figure9,resultsgeneratedwith TSVQ accelerationareroughlycomparablein quality to thosegeneratedfrom theunacceleratedapproach.Ina few cases,TSVQ will generatemoreblurry images(suchasFig-ure9 (b)). We fix this by allowing limited backtrackingin thetreetraversalso thatmorethanoneleaf nodecanbevisited. Whenthenumberof visitedleaf nodesis thesameasthecodebooksize,theresultwill beexactly thesameastheexhaustive searchingcase.

Onedisadvantageof TSVQ accelerationis thememoryrequire-ment.Becauseaninputpixelcanappearin multipleneighborhoods,a full-sizedTSVQtreecanconsumeE � .RQ � memorywhere. istheneighborhoodsizeand

�is thenumberof input imagepixels.

Fortunately, texturesusuallycontainrepeatingstructures;therefore

Algorithm TrainingTime SynthesisTimeEfrosandLeung none 1941seconds

ExhaustiveSearching none 503secondsTSVQacceleration 12seconds 12seconds

Table3: A breakdown of runningtime for the texturesshown inFigure8. Thefirst row shows the timing of EfrosandLeung’s al-gorithm. The secondandthird rows show the timing of our algo-rithm, usingexhaustive searchingandTSVQ acceleration,respec-tively. All the timings weremeasuredusinga 195 MHz R10000processor.

wecanusecodebookswith fewernumberof codewordsthanthein-put trainingset.Figure10shows texturesgeneratedby TSVQwithdifferentcodebooksizes.As expectedthe imagequality improveswhenthecodebooksizeincreases.However, resultsgeneratedwithfewer numberof codewords suchas (b) look plausiblecomparedwith the full codebookresult (c). In our experiencewe can usecodebookswith lessthan10 percentsize of the original trainingdatawithout noticeabledegradationof quality of the synthesisre-sults. To further reducethe expenseof training,we canalsotrainon a subsetratherthantheentirecollectionof input neighborhoodvectors.

Table3 shows a timing breakdown for generatingthe texturesshown in Figure8. Our unacceleratedalgorithmtook503seconds.TheTSVQacceleratedalgorithmtook12 secondsfor training,andanother12secondsfor synthesis.In comparison,EfrosandLeung’s

paper# 0015 8

algorithm([7]) took half an hour to generatethe sametexture 2.Becausetheir algorithm usesa variablesizedneighborhoodit isdifficult to accelerate.Our algorithm, on the other hand,usesafixed neighborhoodand can be directly acceleratedby any pointsearchingalgorithm.

5 Applications

Oneof the chief advantagesof our texturesynthesismethodis itslow computationalcost. This permitsus to explore a variety ofapplicationsin additionto the usualtexture mappingfor graphicsthat were previously impractical. Presentedhereare constrainedsynthesisfor imageeditingandtemporaltexturegeneration.

5.1 Constrained Texture Synthesis

Photographs,films and imagesoften contain regions that are insomesenseflawed.A flaw canbeascrambledregionon ascannedphotograph,scratchesonanold film, wiresor propsin amovie filmframe,or simply anundesirableobjectin animage.Sincethepro-cessescausingtheseflaws areoften irreversible,analgorithmthatcanfix theseflaws is desirable.For example,Hirani andTotsuka([11]) developedan interactive algorithmthatfinds translationallysimilar regionsfor noiseremoval. Often,theflawedportionis con-tainedwithin aregionof texture,andcanbereplacedbyconstrainedtexturesynthesis([7],[12]).

Texture replacementby constrainedsynthesismustsatisfy tworequirements:thesynthesizedregion mustlook like thesurround-ing texture,andtheboundarybetweenthenew andold regionsmustbe invisible. Multiresolution blending([4]) with anothersimilartexture,shown in Figure11(b), will producevisibleboundariesforstructuredtextures.Betterresultscanbeobtainedby applyingouralgorithmin Section2 over theflawedregions,but discontinuitiesstill appearat the right and bottom boundariesas shown in Fig-ure 11 (c). Theseartifactsarecausedby the causalneighborhoodaswell astherasterscansynthesisordering.

To remove theseboundaryartifacts a noncausal(symmetric)neighborhoodmustbeused.However, wehave to modify theorig-inal algorithm so that only valid (alreadysynthesized)pixels arecontainedwithin the symmetricneighborhoods;otherwisethe al-gorithmwill notgeneratevalid results(Figure5). Thiscanbedonewith atwo-passextensionof theoriginalalgorithm.Eachpassis thesameastheoriginalmultiresolutionprocess,exceptthatadifferentneighborhoodis used.During thefirst pass,theneighborhoodcon-tainsonly pixelsfrom thelowerresolutionpyramidlevels.Becausethe synthesisprogressesin a lower to higherresolutionfashion,asymmetricneighborhoodcanbe usedwithout introducinginvalidpixels. This passusesthe lower resolutioninformationto “extrap-olate” thehigherresolutionregionsthatneedto bereplaced.In thesecondpass,a symmetricneighborhoodthat containspixels fromboth the currentandlower resolutionsis used. Thesetwo passesalternatefor eachlevel of the outputpyramid. In the acceleratedalgorithm, the analysisphaseis alsomodifiedso that two TSVQtreescorrespondingto thesetwo kinds of neighborhoodsarebuiltfor eachlevel of the input pyramid. Finally, we alsomodify thesynthesisorderingin thefollowing way: insteadof theusualraster-scanordering,pixels in the filled regionsareassignedin a spiralfashion. For example,the hole in Figure11 (a) is replacedfromoutsideto inside from the surroundingregion until every pixel isassigned(Figure11 (d)). This spiral synthesisorderingremoves

2In this timing comparisonwe choosea very small input patch(withsize64x64). Becausethe time complexity of our approachover EfrosandLeung’s is SUTDVIWYX[Z]\D^_SUTDZ]\ whereZ is thenumberof input imagepixels,ourapproachperformsevenbetterfor largerinput textures.

the directionalbiaswhich causesthe boundarydiscontinuities(asin Figure11 (c)).

Examplesof constrainedsynthesisfor holefilling andimageex-trapolationareshown in Figure12. Within eachpairof images,theblack region is filled in usinginformationsavailablein the restofthe image. The synthesizedregionscanblendsmoothlywith theoriginalpartsevenfor structuredtextures.Becauseof its efficiency,thisapproachmaybeusefulasaninteractive tool for imageeditingor denoising([16]).

5.2 Temporal Texture Synthesis

The low costof our acceleratedalgorithmenablesus to considersynthesizingtexturesof dimensiongreaterthantwo. An exampleof3D textureis temporaltexture.Temporaltexturesaremotionswithindeterminateextent both in spaceandtime. They candescribeawide variety of naturalphenomenasuchasfire, smoke, andfluidmotions.Sincerealisticmotionsynthesisis oneof themajorgoalsof computergraphics,atechniquethatcansynthesizetemporaltex-tureswould be useful. Most existing algorithmsmodel temporaltexturesby directsimulation;examplesincludefluid, gas,andfire([24]). Direct simulations,however, areoften expensive andonlysuitablefor specifickinds of textures;thereforean algorithmthatcanmodelgeneralmotiontextureswouldbeadvantageous([26]).

Temporaltexturesconsistof 3D spatial-temporalvolumeof mo-tion data. If the motion datais local andstationaryboth in spaceandtime, the texturecanbe synthesizedby a 3D extensionof ouroriginal algorithm. This extensioncanbe simply doneby replac-ing various2D entitiesin the original algorithm,suchas images,pyramids,andneighborhoodswith their 3D counterparts.For ex-ample,thetwo Gaussianpyramidsareconstructedby filtering anddownsamplingfrom 3D volumetricdata; the neighborhoodscon-tain local pixels in both the spatialandtemporaldimension.Thesynthesisprogressesfrom lower to higherresolutions,andwithineachresolutionthe output is synthesizedslice by slice along thetimedomain.

Figure 13 shows synthesisresultsof several typical temporaltextures: fire, smoke, and oceanwaves (shown in the accompa-nying video tape). The resultingsequencescapturethe flavor ofthe original motions,andtile both spatiallyandtemporally. Thistechniqueis alsoefficient. Acceleratedby TSVQ,eachresultframetook about20 secondsto synthesize.Currentlyall the texturesaregeneratedautomatically;we plan to extendthealgorithmto allowmoreexplicit usercontrols(suchasthedistributionandintensityofthefire andsmoke).

6 Conclusions and Future Work

Texturesare importantfor a wide variety of applicationsin com-putergraphicsandimageprocessing.On theotherhand,they arehard to synthesize.The goal of this paperis to provide a practi-cal tool for efficiently synthesizinga broadrangeof textures. In-spiredby Markov RandomFieldmethods,ouralgorithmis general;a wide variety of texturescanbe synthesizedwithout any knowl-edgeof their physicalformationprocesses.The algorithmis alsoefficient;by aproperaccelerationusingTSVQ,typical texturescanbegeneratedwithin secondson currentPCsandworkstations.Thealgorithmis alsoeasyto use:only anexampletexturepatchis re-quired.

One drawback of the Markov RandomField approachis thatonly local and stationaryphenomenacan be modeled. Other vi-sualcuessuchas3D shape,depth,lighting, or reflectioncannotbe capturedby this approach.Onepossiblesolutionwould be toincorporatethis informationin a preprocessingstep,or to imposecertainconstraintsduring the synthesisprocess.For example,tosynthesizeaperspectively viewedbrick wall, wecoulduseashape

paper# 0015 9

(a) (b) (c) (d)

Figure11: Constrainedtexture synthesis.(a) a texture containinga black region that needsto be filled in. (b) multiresolutionblending([4]) with anothertextureregion will produceboundaryartifacts.(c) A directapplicationof thealgorithmin Section2 will producevisiblediscontinuitiesat theright andbottomboundaries.(d) A muchbetterresultcanbegeneratedby usingamodificationof thealgorithmwith 2passes.

(a) (a)Result (b) (b) Result

Figure12: Constrainedsynthesisexamples.In eachpairof figures,theoriginal imageis on theleft andthesynthesizedresultis on theright.Thegoalis to fill in theblackregionswithoutchangingtherestof theimage.Examplesshown areBrodatztextureswith imageextrapolation(a)D36andholefilling (b) D40.

from texturetechnique([15]) to determinethesizesandorientationof individualbricksin theoriginal image.Then,duringthesynthe-sisprocess,aglobalconstraintis enforcedover theoutputimagesothat thepatternis generatedaccordingto the relative positionandorientationbetweenthewall andtheeyepoint.

Aside from constrainedsynthesisand temporal textures, nu-merousapplicationsof our approacharepossible.Otherpotentialapplications/extensionsare:

Multidimensional texture: The notion of texture extendsnatu-rally to multi-dimensionaldata.Oneexamplewaspresentedin thispaper- motionsequences.Thesametechniquecanalsobedirectlyappliedto solid texturesor animatedsolid texture synthesis.Weare also trying to extend our algorithm for generatingstructuredsolid texturesfrom 2D views ([10]).

Texture compression/decompression: Texturesusuallycontainrepeatingpatternsandhigh frequency information; thereforetheyare not well compressedby transform-basedtechniquessuchas JPEG. However, codebook-basedcompressiontechniqueswork well on textures([2]). This suggeststhat texturesmight becompressableby our synthesistechnique. Compressionwouldconsistof building a codebook,but unlike [2], no code indiceswould be generated;only the codebookwould be transmittedandthe compressionratio is controlledby the numberof codewords.Decompressionwould consistof texture synthesis. This decom-pressionstep,if acceleratedonemoreorderof magnitudeover ourcurrent software implementation,could be usablefor real timetexturemapping.Theadvantageof this approachover [2] is much

greatercompression,sinceonly thecodebookis transmitted.

Motion synthesis/editing: Some motions can be efficientlymodeledas spatial-temporaltextures. Others,suchas animal orhumanmotion,aretoohighly structuredfor suchadirectapproach.However, it might be possibleto encodetheir motion as jointangles,then apply texture analysis-synthesisto the resulting1Dtemporalmotionsignals.

Modeling geometric details: Models scannedfrom real worldobjectsoften contain texture-like geometricdetails, making themodelsexpensive to store,transmitor manipulate.Thesegeometricdetailscanbe representedasdisplacementmapsover a smoothersurface representation([14]). The resulting displacementmapsshouldbe compressable/decompressableas2D texturesusingourtechnique. Taking this idea further, missinggeometricdetails,acommonproblemin many scanningsituations([1]), couldbefilledin usingourconstrainedtexturesynthesistechnique.

Direct synthesis over meshes: Mappingtexturesonto irregular3D meshesby projectionoftencausedistortions([22]). Thesedis-tortionscansometimesbefixedby establishingsuitableparameter-izationof themesh,but amoredirectapproachwouldbeto synthe-sizetexturedirectly over the mesh. In principle, this canbe doneusingour technique.However, this will requireextendingordinarysignalprocessingoperationssuchasfiltering anddownsamplingtoirregular3D meshes.

paper# 0015 10

(a) (b) (c)

Figure13: Temporaltexturesynthesisresults.(a) fire (b) smoke (c) oceanwaves.In eachpairof images,thespatial-temporalvolumeof theoriginalmotionsequenceis shown on theleft, andthecorrespondingsynthesisresultis shown on theright. A 5x5x5causalneighborhoodisusedfor synthesis.Theoriginalmotionsequencescontain32 frames,andthesynthesisresultscontain64 frames.Theindividual framesizesare(a) 128x128(b) 150x112(c) 150x112.Acceleratedby TSVQ,thetrainingtimeare(a)1875(b) 2155(c) 2131secondsandthesynthesistimeperframeare(a)19.78(b) 18.78(c) 20.08seconds.To savememory, weuseonly arandom10percentof theinputneighborhoodvectorsto build the(full) codebooks.Theoriginalfire sequenceis acquiredfrom [25].

Acknowledgement

[Removedfor papersubmission]

References[1] Anonymous. The Digital MichelangeloProject: 3D scanningof large statues.

submittedfor publication.

[2] A. C. Beers,M. Agrawala,andN. Chaddha.Renderingfrom compressedtex-tures.Proceedings of SIGGRAPH 96, pages373–378,August1996.

[3] P. Brodatz. Textures: A Photographic Album for Artists and Designers. Dover,New York, 1966.

[4] P. J.Burt andE. H. Adelson.A multiresolutionsplinewith applicationto imagemosaics.ACM Transactions on Graphics, 2(4):217–236,Oct.1983.

[5] J.S.DeBonet.Multiresolutionsamplingprocedurefor analysisandsynthesisoftexture images.In T. Whitted,editor, SIGGRAPH 97 Conference Proceedings,AnnualConferenceSeries,pages361–368.ACM SIGGRAPH,AddisonWesley,Aug. 1997.

[6] J.Dorsey, A. Edelman,J.Legakis,H. W. Jensen,andH. K. Pedersen.Modelingandrenderingof weatheredstone. Proceedings of SIGGRAPH 99, pages225–234,August1999.

[7] A. EfrosandT. Leung.Texturesynthesisby non-parametricsampling.In Inter-national Conference on Computer Vision, volume2, pages1033–8,Sep1999.

[8] A. GershoandR.M. Gray. Vector Quantization and Signal Compression. KluwerAcademicPublishers,1992.

[9] R. Haralick.Statisticalimagetextureanalysis.In Handbook of Pattern Recogni-tion and Image Processing, volume86,pages247–279.AcademicPress,1986.

[10] D. J. Heeger andJ. R. Bergen. Pyramid-Basedtexture analysis/synthesis.InR. Cook, editor, SIGGRAPH 95 Conference Proceedings, Annual ConferenceSeries,pages229–238.ACM SIGGRAPH,AddisonWesley, Aug. 1995.

[11] A. N. Hirani andT. Totsuka.Combiningfrequency andspatialdomaininforma-tion for fast interactive imagenoiseremoval. Computer Graphics, 30(AnnualConferenceSeries):269–276,1996.

[12] H. IgehyandL. Pereira.Imagereplacementthroughtexturesynthesis.In Inter-national Conference on Image Processing, volume3, pages186–189,Oct1997.

[13] H. IversenandT. Lonnestad.An evaluationof stochasticmodelsfor analysisandsynthesisof grayscaletexture. Pattern Recognition Letters, 15:575–585,1994.

[14] V. Krishnamurthyand M. Levoy. Fitting smoothsurfacesto densepolygonmeshes.Proceedings of SIGGRAPH 96, pages313–324,August1996. ISBN0-201-94800-1.Held in New Orleans,Louisiana.

[15] J. Malik and R. Rosenholtz. Computinglocal surfaceorientationand shapefrom texture for curved surfaces. International Journal of Computer Vision,23(2):149–168,1997.

[16] T. MalzbenderandS.Spach.A context sensitive texturenib. In Proceedings ofComputer Graphics International, pages151–163,June1993.

[17] MIT Media Lab. Vision texture. http://www-white.media.mit.edu/vismod/-imagery/VisionTexture/vistex.html.

[18] S. NeneandS. Nayar. A simplealgorithmfor nearestneighborsearchin highdimensions.IEEETPAMI: IEEE Transactions on Pattern Analysis and MachineIntelligence, 19:989–1003,1997.

[19] R. PagetandI. Longstaff. Texturesynthesisvia anoncausalnonparametricmul-tiscaleMarkov randomfield. IEEE Transactions on Image Processing, 7(6):925–931,June1998.

[20] A. C. Popat. Conjoint Probabilistic Subband Modeling. PhD thesis,Mas-sachusettsInstituteof Technology, 1997.

[21] K. PopatandR. Picard. Novel cluster-basedprobabilitymodelfor texturesyn-thesis,classification,andcompression.In Visual Communications and ImageProcessing, pages756–68,1993.

[22] M. Segal, C. Korobkin, R. van Widenfelt, J. Foran,and P. E. Haeberli. Fastshadows andlighting effectsusingtexturemapping. Computer Graphics (Pro-ceedings of SIGGRAPH 92), 26(2):249–252,July1992.

[23] E. Simoncelli and J. Portilla. Texture characterizationvia joint statisticsofwaveletcoefficientmagnitudes.In Fifth International Conference on Image Pro-cessing, volume1, pages62–66,Oct.1998.

[24] J.StamandE. Fiume.Depictingfire andothergaseousphenomenausingdiffu-sionprocesses.Proceedings of SIGGRAPH 95, pages129–136,August1995.

[25] M. Szummer. Temporaltexturedatabase.ftp://whitechapel.media.mit.edu/pub/-szummer/temporal-texture/.

[26] M. SzummerandR. W. Picard. Temporaltexture modeling. In InternationalConference on Image Processing, volume3, pages823–6,Sep1996.

[27] L. Wei. Deterministictextureanalysisandsynthesisusingtreestructurevectorquantization. In XII Brazilian Symposium on Computer Graphics and ImageProcessing, pages207–213,October1999.

[28] A. Witkin andM. Kass.Reaction-diffusiontextures.In T. W. Sederberg, editor,Computer Graphics (SIGGRAPH ’91 Proceedings), volume25,pages299–308,July1991.

[29] S. P. Worley. A cellular texture basisfunction. In H. Rushmeier, editor, SIG-GRAPH 96 Conference Proceedings, AnnualConferenceSeries,pages291–294.ACM SIGGRAPH,AddisonWesley, Aug. 1996.

[30] S. Zhu, Y. Wu, andD. Mumford. Filters, randomfieldsandmaximunentropy(FRAME) - towardsaunifiedtheoryfor texturemodeling.International Journalof Computer Vision, 27(2):107–126,1998.

Fast Texture Synthesis using Tree-structured Vector Quantization

Li-Yi Wei MarcLevoy

StanfordUniversity

Category: Research

Format:Print

Contact: Li-Y i WeiRoom386,GatesComputerScienceBuildingStanford,CA 94309,U.S.A.

phone: (650)725-3733fax: (650)723-0011email: [email protected]

Papernumber:0015

Estimated# of pages:10

Keywords:TextureSynthesis,CompressionAlgorithms,ImageProcessing

Texturesynthesisis importantfor many applicationsin computergraphics,vision, andimagepro-cessing.However, it remainsdifficult to designan algorithmthat is both efficient andcapableofgeneratinghigh quality results. In this paper, we presentan efficient algorithmfor realistictexturesynthesis.Thealgorithmis easyto useandrequiresonly a sampletextureasinput. It generatestex-tureswith perceivedquality equalto or betterthanthoseproducedby previous techniques,but runstwo ordersof magnitudefaster. This permitsus to apply texturesynthesisto problemswhereit hastraditionallybeenconsideredimpractical. In particular, we have appliedit to constrainedsynthesisfor imageeditingandtemporaltexturegeneration.Our algorithmis derived from Markov RandomFieldtexturemodels,andgeneratestexturesthroughadeterministicsearchingprocess.Weacceleratethissynthesisprocessusingtree-structuredvectorquantization.

fast texture synthesis using tree-structuredvector...

Documents