
Depth From Focus

Anne Berres, Bernd Lietzow, Peter Salz, Yannik Schelske

February 20, 2010

1 Introduction

Many areas in science and engineering highly depend on 3D representations of the application's objects. Often, these objects are scanned with laser scanners or modelled manually. In practice, laser scanners often cannot be used due to the reflective properties of the objects or because of the object's fragility. Manual modelling is often too imprecise. Therefore, touchless reconstruction of objects is a very useful addition to the current reconstruction methods.

In addition, character recognition can also be improved through image fusion using the focus information gained through our method.

The aim of our project is to create a depth map from a series of images. First, we take a series of images with a calibrated camera. Afterwards, the images are cropped to the same image section. Then, we compute a focus measure to determine sharp areas of each image of a series and get a coarse depth map as a byproduct. This depth map is then refined using an approximation with a Gaussian function. Finally, our depth map can be visualised in 3D.

Our algorithm is written entirely in Python. It is included in the Appendix. For the calibration, we used the MatLab Toolbox for Camera Calibration [1]. The visualisation is done in Python and MeshLab [2].

Finally, we also tried out an alternative approach which uses defocus information rather than focus information to create a depth map.

In this term paper, we chronologically give an overview of the different steps in our method and evaluate our results.

2 Image Creation

For the Depth From Focus approach, we need a series of images which have identical camera settings (thus, the focus plane has the same distance from the camera) and which are taken from different distances to the scene.

The camera that is used should be an SLR, as small consumer cameras tend to have a very large depth of field and make it harder to determine which areas are actually in focus. To achieve the best results, the depth of field has to be very narrow.


Scene Setup

We used an OLYMPUS E-500 with a telephoto lens, as photos taken with a wide angle lens have a larger depth of field. The aperture was opened as widely as possible to achieve a narrower depth of field.

In order to move the camera along the optical axis with as little change to the other parameters as possible, it was mounted onto a rail, as displayed in Figure 1. This rail is equipped with a linear-motion ball bearing to ensure smooth and exact translation for every picture of the series.

Since the calibration is dependent on the focus setting of the camera and the focal plane has to stay at the same distance from the lens throughout an image series, the focus setting of the lens must not change. To achieve this, the lens was set to manual focus mode, and before calibrating, the focus ring was fixed with tape in order to prevent accidental readjustment. This tape was never removed during experiments to avoid having to recalibrate the camera.

The scene was about 2 m in front of the camera and about 20 cm deep. For each object, 10 to 20 images were taken. The camera was moved in equidistant steps between the images. Depending on the size and shape of each object, the step width varied between 1.0 cm and 2.0 cm.

(a) Rail with camera (b) Taking a series of pictures

Figure 1: Experiment setup

Calibration

No real camera is absolutely perfect, even less so if it is affordable for amateur photographers. The lens can have both a radial and a tangential distortion. Both result in images which are not always exact enough for scientific applications. Therefore, it is often necessary to calibrate the camera beforehand for each setting to compute the error.

The Camera Calibration Toolbox for MatLab [1] is very useful for this task. It loads a couple of images of calibration patterns, like the one in Figure 2a, from diverse angles, as seen in Figure 2c. Four corners of the pattern are manually selected, and the toolbox computes the intrinsic parameters such as the focal length, the principal point and the distortion for you.

Figure 2b displays the overall distortion (that is, both radial and tangential distortion) of the camera we used. The principal point is not exactly in the centre of the image but shifted a bit to the right.

(a) Calibration pattern (b) Distortion model

(c) Placement of patterns in space

Figure 2: The MatLab calibration toolbox provides a lot of useful functions.

Figure 3 visualises the effect of calibration. This small image section is taken from the border region of an image of an object on a cloth bag, as seen in Figure 3a. The outline of the bag is given in red before calibration and in green after calibration. Figure 3b shows that the bag was shifted towards the bottom right. This makes sense since the section was taken from the bottom right quarter of the image.

The calibration toolbox discards any colour information in the images. Colour is not necessary for our approach, so if we use uncalibrated images, we first convert them to greyscale.

Error Correction

Since the camera could not be mounted to the rail with the movement axis of the rail in perfect alignment to its optical axis, the scene would appear slightly shifted in one direction between the individual pictures. To correct this vertical movement, we introduced a shift parameter.


(a) Section from the bottom right region.

(b) Before (red) and after (green) calibration.

Figure 3: The distortion is most visible near the border regions of an image.

This parameter is the number of pixels each image has to be translated in order to align it with the previous one. Since within one image series the camera was always moved in equidistant steps, the displacement of the scene is also equidistant and the shift parameter remains constant within one image series. After calibrating the images, it is determined manually, and the images are translated prior to any other calculations.
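As a hedged sketch of how this is applied (mirroring the crop-window translation in matchImages in the Appendix; cutoff and shift are the parameters described there, and the image is a NumPy array):

def shifted_crop(image, i, shift, cutoff):
    # cutoff = [top, bottom, left, right] is the region of interest;
    # for the i-th image of the series, the window is translated by
    # i * shift pixels to align it with the first image.
    top, bottom, left, right = cutoff
    return image[top:bottom, left - i * shift:right - i * shift]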

Another error introduced while taking the pictures arises from mechanical deficiencies of the rail. Due to the heavy weight of the camera, it bends down slightly when it is extended. The fact that we used a telephoto lens made the errors quite visible in the images. Since this error is non-linear, correction was done manually by aligning the images with an image editing software, only when very accurate results were desired.

However, we are convinced that in a professional environment, it is possible to produce a rail that is stable enough for this task. In the applications where a microscope is used, this problem does not exist.

Suitable Objects

Nayar [3] strongly limits the applicability of the approach in his paper. In order to get nice results, the objects to be used need an extremely good texture because the approach is strongly based on edge detection. When observing the comparison in Figure 4, one can see that the surface of the printer looks the same, no matter whether it is in focus or not.

For his experiments, Nayar artificially created a very small, sand-papered steel ball. The roughness achieved gave the ball a very strong texture to be observed with a microscope. On most objects, such a good texture is difficult to achieve. In our examples, we tried to find some objects with an adequate texture.

In further experiments, a lack of texture could be compensated for by projecting structured light onto the object.


(a) Object in focus (b) Object out of focus

Figure 4: Untextured surfaces look the same in focus and out of focus.

3 Focus Measure

To determine which areas of an image are in focus, we need a focus measure. We measure the change of the image intensity between the different pixels. If an area with a strong texture is in focus, the second derivative of the image intensity is very high.

In Figure 5, we have an object P in 3D space. The light rays it emits are refracted by the lens and converge at the point Q on the image plane.¹

¹ Usually, the sensor plane and the image plane should coincide, so we ignore δ in our approach.


Figure 5: Image formation geometry. [3]

The Gaussian lens law relates the focal length f to the object distance o and the image distance i:

\frac{1}{o} + \frac{1}{i} = \frac{1}{f}
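As a quick illustration (a made-up numeric example, not part of our pipeline), the lens law can be solved for the image distance:

# Solve 1/o + 1/i = 1/f for i, given object distance o and focal length f.
def image_distance(o, f):
    return 1.0 / (1.0 / f - 1.0 / o)

# Example: a 55 mm lens and an object 2 m (2000 mm) in front of it.
print(image_distance(2000.0, 55.0))  # roughly 56.6 mm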

One way of determining the areas in focus is to use a high-pass filter, but this needs a rather expensive computation. A Laplacian filter is a cheaper alternative to a high-pass filter:

\Delta^2_L I = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}

This approximates the second derivative of the image intensity I in both x and y direction. However, it has a defect: the derivatives can have opposing signs, so if they are equal in magnitude, they cancel each other out. To overcome this problem, you can use a modified version of the Laplacian, in which the absolute values of the derivatives are used:

\Delta^2_{ML} I = \left|\frac{\partial^2 I}{\partial x^2}\right| + \left|\frac{\partial^2 I}{\partial y^2}\right|

Naturally, the Modified Laplacian is always greater than or equal to the absolute value of the ordinary Laplacian.

We implemented the Modified Laplacian as the sum of two convolutions with a filter kernel. A step parameter is used to modify the size of our kernel.
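The following is a minimal sketch of this operation (kernel layout as in modifiedLaplace in the Appendix, but with the border handling replaced by scipy's 'same' mode for brevity; step is assumed to be odd):

import numpy as np
from scipy.signal import convolve2d

def modified_laplacian(image, step):
    # Second-difference kernels of size step x step; the sample
    # spacing grows with the kernel size.
    c = step // 2
    L_x = np.zeros((step, step))
    L_x[0, c] = L_x[-1, c] = 1.
    L_x[c, c] = -2.
    L_y = np.zeros((step, step))
    L_y[c, 0] = L_y[c, -1] = 1.
    L_y[c, c] = -2.
    # Absolute values are summed so that opposing signs cannot cancel.
    return (np.abs(convolve2d(image, L_x, mode='same'))
            + np.abs(convolve2d(image, L_y, mode='same')))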

Nayar [3] introduces a third measure, the Sum-Modified-Laplacian (SML). It takes the sum of all Modified Laplacian measures in a neighbourhood of a point that are greater than a threshold value:

SML(i, j) = \sum_{x=i-N}^{i+N} \; \sum_{y=j-N}^{j+N} ML(x, y) \cdot t

where t = 1 if ML(x, y) \geq T and t = 0 otherwise.
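As a sketch, the SML can again be written as a convolution, this time of the thresholded Modified Laplacian with a box kernel (N and T as above):

import numpy as np
from scipy.signal import convolve2d

def sum_modified_laplacian(ml, N, T):
    # Zero out ML values below the threshold T, then sum over the
    # (2N+1) x (2N+1) window centred on each pixel.
    m = np.where(ml >= T, ml, 0.)
    box = np.ones((2 * N + 1, 2 * N + 1))
    return convolve2d(m, box, mode='same')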

As we already determined the crop of each image, as described in the following section, it suffices to compute the focus measure for the cropped parts. After computing the focus measure, the focus values are interpolated bilinearly to the scaled-up image size.

4 Image Processing

Naturally, when you take a series of pictures with different distances from the scene, the image crop will vary, so we cannot simply enter the images into our program, because otherwise we would get a result like Figure 6.

In order to find out how to crop our images, we make use of the Intercept Theorem. As mentioned before, we measured the distance from the camera to the scene. This distance can now be used as follows.

In Figure 7, we have the distance d from the lens to the scene, the distance s from the lens to the sensor, the height H of our object, and the height h_1 of the object on the sensor. This is the setting for the image that is taken from the closest point to the scene. The version on the right is used for each point that is further away from the scene. For the i-th point, we get \Delta d = i \cdot t, where t is the step width between the points.

To get the cropping factor, we use the Intercept Theorem:

\frac{H}{h_1} = \frac{d}{s}, \qquad \frac{H}{h_i} = \frac{d + \Delta d}{s}

By dividing the first equation by the second, we get the cropping factor

\frac{h_i}{h_1} = \frac{d}{d + \Delta d}.


(a) Fuse map of an uncropped image series. (b) Corresponding fused image.

Figure 6: The fusion results of uncropped images are not very helpful.

Figure 7: Left: Closest image of the scene; right: other images

The initial image size of image i is multiplied by this cropping factor to get the size of the cropped image. If the shift parameter mentioned in the previous section is zero, the centre of the cropped image is the same as that of the uncropped one. Otherwise, it is shifted down by the shift parameter.
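A hedged sketch of this step (d and t in the same unit; the shift parameter is assumed to be zero here, and, unlike our actual pipeline, the crop is scaled back up immediately):

from PIL import Image

def crop_to_common_section(img, i, d, t):
    # Cropping factor h_i/h_1 = d/(d + i*t) for the i-th image,
    # cropped symmetrically around the image centre.
    factor = d / (d + i * t)
    w, h = img.size
    cw, ch = int(round(w * factor)), int(round(h * factor))
    left, top = (w - cw) // 2, (h - ch) // 2
    return img.crop((left, top, left + cw, top + ch)).resize((w, h), Image.BILINEAR)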

The computations could be done at this point; however, scaling destroys details in our images. This causes problems for the focus measure as it relies on the presence of details. Thus, we do not scale the images yet.

Figure 8 contains the changes for three pictures taken from different distances. The top row is the shot closest to the scene, and it is perfectly focused. It will not be cropped or scaled at all, since it already contains the smallest image section. The middle and bottom rows' shots are taken from further away, and the red square is blurred.

Both images are cropped to the size determined with the cropping factor. They now contain the same image section as the top row. Since we need our images to be the same size for the algorithm, we scale them up to the original image size.

5 Image Fusion

After calculating the focus measure for every image in the sequence, we can fuse the images in order to get an image in which the entire scene appears to be in focus. This is not necessary for the creation of a depth map, but it is a reasonable intermediate step.


Figure 8: Original images (left), cropped images (middle) and scaled images (right).

It allows quick evaluation of a focus measure while parameters are being tuned, and the fused image could be used in further experiments to texture the depth map.

For every pixel of the fused image, we look up its focus measure in the input images. We take the colour value from the image which has the highest focus measure.

As a result, we obtain a picture which has only maximally focused areas. Similarly, we can create a very coarse index depth map with the index of each picture as a depth value. As the step width remained constant for our experiments, only the depth scaling may be slightly different.
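A minimal sketch of the fusion (greyscale image stack and focus-measure stack as NumPy arrays; the appendix code computes the index map pairwise instead, to save memory):

import numpy as np

def fuse(images, measures):
    # images, measures: arrays of shape (n, height, width).
    index_map = np.argmax(measures, axis=0)   # coarse index depth map
    rows, cols = np.indices(index_map.shape)
    fused = images[index_map, rows, cols]     # all-in-focus image
    return fused, index_map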

As before, we first do all computations on the uncropped and unscaled images and apply the deformations to the result, since this preserves the details we need for our computations.

Figure 9 displays the results we can produce solely with the focus measure.

The index map in Figure 9a contains a lot of noise, but it already gives the viewer a general idea of what the object looks like. If a 3D representation of this depth map is created, it has so many spikes with irrelevant data that it is almost impossible to recognise the teddy bear.

To get rid of the noise, we applied a windowed median. Figures 9b and 9c display the map after application of the median for window sizes 9 px and 21 px.
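The windowed median itself can be realised, for instance, with scipy (a sketch; index_map as above):

from scipy.ndimage import median_filter

smoothed = median_filter(index_map, size=9)  # window size 9 px; 21 px analogously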

Obviously, the 3D map created with the evened-out data in Figure 9d still does not look like a teddy bear. From the right perspective it is possible to find its feet, but its head remains concealed.

Finally, Figure 9e shows a fused version of the teddy bear image series. As opposed to the individual images of the series, the furry texture is sharp on the whole² body of the teddy bear.

² The heavily shaded area at the bottom of the image did not deliver enough texture to be recognised as sharp. This is reflected in the depth maps by the red areas. As a result, this region was not taken from the optimal image but from a different one.


(a) Index depth map (b) Windowed median with window size 9 px

(c) Windowed median with window size 21 px

(d) 3D depth map of median (e) Fused image from the index depth map

Figure 9: Visualisation of the coarse results.

6 Depth Map

The index depth map created for image fusion was rather coarse. Since we cannot do infinitely many tiny steps, we cannot guarantee that we will find optimal depth values for every pixel. For a smoother result, we try to estimate the true depth value for each pixel by taking its focus measure in multiple photos into account.

In order to create a more refined depth map, we assume that the focus measure for each pixel has a Gaussian distribution over the individual images, as seen in Figure 10. From the fuse map, we already know which image has the maximal focus value for a pixel. We assume that the focus values for this image and the two neighbouring images form the tip of the Gaussian function. We now interpolate the focus values of these three images and find the peak. The depth value at which the peak occurs is the exact depth value \bar{d} for the pixel.

F = F_{peak} \exp\left(-\frac{1}{2}\left(\frac{d - \bar{d}}{\sigma_F}\right)^2\right) \quad\Leftrightarrow\quad \ln F = \ln F_{peak} - \frac{1}{2}\left(\frac{d - \bar{d}}{\sigma_F}\right)^2


Figure 10: Gaussian function fitted to focus measures [3].

Let F_m = F_{max}, where F_{max} is the maximal focus measure for a pixel, and let F_{m-1}, F_{m+1} be the neighbouring values. Thus we get the depth value \bar{d}, the standard deviation \sigma_F and the focus measure F_{peak} from

\bar{d} = \frac{(\ln F_m - \ln F_{m+1})(d_m^2 - d_{m-1}^2) - (\ln F_m - \ln F_{m-1})(d_m^2 - d_{m+1}^2)}{2\,\Delta d\,\big((\ln F_m - \ln F_{m-1}) + (\ln F_m - \ln F_{m+1})\big)}

\sigma_F^2 = -\frac{(d_m^2 - d_{m-1}^2) + (d_m^2 - d_{m+1}^2)}{2\big((\ln F_m - \ln F_{m-1}) + (\ln F_m - \ln F_{m+1})\big)}

F_{peak} = F_m \exp\left(\frac{1}{2}\left(\frac{d_m - \bar{d}}{\sigma_F}\right)^2\right)

This is calculated for every pixel; therefore, we can produce a depth map with the estimated depth for each pixel.
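Per pixel, the computation looks roughly as follows (a sketch without the special-case handling discussed below; the complete version is gaussianDepth in the Appendix):

import numpy as np

def refine_pixel(F, d, m):
    # F: focus measures of one pixel over all images; d: depth values
    # per image (equidistant); m: index of the maximal focus measure.
    lnF = np.log(F)
    a = lnF[m] - lnF[m - 1]
    b = lnF[m] - lnF[m + 1]
    delta_d = d[1] - d[0]
    d_bar = (b * (d[m]**2 - d[m - 1]**2) - a * (d[m]**2 - d[m + 1]**2)) \
            / (2. * delta_d * (a + b))
    sigma_F = np.sqrt(abs(-((d[m]**2 - d[m - 1]**2) + (d[m]**2 - d[m + 1]**2))
                          / (2. * (a + b))))
    return d_bar, sigma_F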

Unfortunately, this formula is unstable for realistic scenes.

First of all, real data does not usually fit a Gaussian function. This is illustrated by Figure 11: besides the image with the maximal focus measure, there is a local maximum which is larger than both of its neighbours. In addition, all but two measures are greater than their right neighbour. If we tried fitting a Gaussian function through all the points, the maximum would probably be between the global maximum and the local one on the left. With these three points, the bell is extremely narrow, so our standard deviation becomes very small.

Moreover, the focus measure is computed by determining the change of image intensity. But if the image intensity remains the same, the focus measure evaluates to zero. This causes problems with the calculations (division by zero, logarithm of zero). To avoid them, we defined special cases for each of these problems. Furthermore, we restrict calculations to reasonable standard deviations. If the focus measure fails to provide good results, this is indicated by very large or very small standard deviations and can be treated.


Figure 11: Real focus measure distribution with global maximum (green) and neighbours (red).

(a) Refined depth map (b) Map detail (right eye and nose) (c) Windowed median with window size 9 px (d) Windowed median with window size 21 px

Figure 12: 2D depth maps of the refined results.

Figure 12a shows a depth map calculated by this method. In Figures 12c and 12d, the noise is reduced by a median filter as described above. Figure 13 shows 3D plots for the median-filtered depth map. These plots show that the ears of the teddy bear are situated behind its forehead and that its snout protrudes from its face. The rear legs and feet are the frontmost objects in the image.

Figure 13: 3D depth map from different points of view.


7 Alternative Approach: Shape From Defocus

A different approach to the idea of shape from focus is described by Surya and Subbarao [4]. In this approach, multiple images are taken from the same camera position with identical focus settings, but different aperture settings. Areas which are in focus appear sharp independent of the aperture setting, but an area which is out of focus gets increasingly blurry when the aperture is widened.

To determine the focus value of a region, the correlation between pixels which are close to each other is calculated. If the correlation is small in both images, an area is textured and sharp. If the correlation is small in the image taken with the narrow aperture and larger in the picture with the wider aperture, an area is textured, but unsharp. If the correlation is high in both pictures, the area is untextured. This approach does not confuse untextured areas with out-of-focus ones, but we still cannot determine whether an untextured area is in focus or not.
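A strongly simplified sketch of this classification (not the quantitative estimator of [4], which the Appendix implements in sigma2 and u): here, low local variance stands in for high correlation between neighbouring pixels, and the threshold t is purely illustrative:

import numpy as np
from scipy.ndimage import uniform_filter

def classify(narrow, wide, w=5, t=1e-3):
    # Local variance within a w x w window; low variance means the
    # neighbouring pixels are highly correlated.
    var = lambda img: uniform_filter(img * img, w) - uniform_filter(img, w) ** 2
    vn, vw = var(narrow), var(wide)
    textured_sharp = (vn > t) & (vw > t)     # sharp under both apertures
    textured_blurred = (vn > t) & (vw <= t)  # blurred by the wide aperture
    untextured = vn <= t                     # focus cannot be decided
    return textured_sharp, textured_blurred, untextured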

From the focus value of each pixel, we can directly derive the distance of the object visible in this pixel from the focal plane. However, it is impossible to determine whether the object is in front of the focal plane or behind it. Therefore, the camera has to be focused on either the frontmost object in the scene or on the object which is furthest away from the lens.

We set up one small experiment using two images of the same scene taken from the same position. Figure 14 displays the complete scene as well as a section to compare wide and narrow camera apertures.

(a) Complete image (b) Wide aperture (c) Narrow aperture

Figure 14: Comparison of different apertures.

A wide aperture blurs the cocoa box and the stapler in the background a lot more than the narrow aperture does. This means that the “sehr gut” text on the box is rather sharp with the narrow aperture (i.e. the black text has little correlation with the white background), but it is blurred with the wide aperture (i.e. black and white both have higher correlations with different shades of grey than with each other). The effect for the stapler is similar but not as prominent, as it is closer to the camera.

However, the tin in the foreground appears sharp in both images. Thus, the correlation between pastel green and gold or red is small in both cases.

Therefore, we know that the tin in the foreground is in focus, whereas the stapler is further away and the cocoa box is at the back of the scene. With this knowledge, we can create a depth map of the scene. Objects in front have the shortest distance to the camera and are coloured blue. Those at the back have the longest distance and are coloured red.

Figure 15: Depth map for this example.

Although this simple test of the shape from defocus approach gave quite promising results, we did not investigate it any further because shape from defocus was not the actual goal of our project. Furthermore, most of the values in the depth map have to be determined from a blurry picture, and we hoped to get more exact results from a method which relies on multiple clearly focused images.

8 Conclusion

The most critical point in our procedure was the focus measure. Without an accurate way to determine whether an area lies in the focal plane of a given image, we cannot give any information about its depth coordinate. The existing focus measures we evaluated were adequate for image fusion, but did not perform well enough for calculating a depth map. The main problem is still finding a distinction between areas which are untextured and areas which are out of focus.

We assume that the focus measures for one pixel over all the images of a series are distributed along a Gaussian curve. But this only holds for ideal data sets, so the resulting depth maps were very noisy. The misalignment that arises from imprecise camera movement, and scaling only by full pixels, also introduces errors. The noise in the depth maps was mitigated by applying a median filter, and the results looked quite promising. However, they were not as good as depth maps created by multi-view techniques.

In future work, a combination of the shape from defocus and the depth from focus approaches could be evaluated in order to compensate for the weakness of the focus measure.


References

[1] Jean-Yves Bouguet. Camera Calibration Toolbox for Matlab. http://www.vision.caltech.edu/bouguetj/calib_doc/, June 2008.

[2] Paolo Cignoni and Guido Ranzuglia. MeshLab. http://meshlab.sourceforge.net/, September 2009.

[3] Shree K. Nayar. Shape from focus. Technical report, Carnegie Mellon University, 1989.

[4] Gopal Surya and Murali Subbarao. Depth from defocus by changing camera aperture: A spatial domain approach, 1993.


Appendix: Source Code

import numpy as np
import Image
import pylab as pl
from scipy.signal import convolve
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter, laplace


def loadRawImages(filenames):
    rawImages = [Image.open(f) for f in filenames]
    return rawImages


def transformToArray(rawImages):
    return np.array([np.array(i, dtype='float') for i in rawImages])


def transformToGrayArray(rawImages):
    return np.array([np.array(i.convert('L'), dtype='float') for i in rawImages])


def modifiedLaplace(image, step):
    """Computes the sum of the absolute values of the second derivatives.
    The step parameter determines the size of the kernels used for convolution.
    Due to special cases at the borders, the resulting array needs to be
    cropped."""
    L_x = np.zeros((step, step))
    L_x[0, step / 2] = 1.
    L_x[-1, step / 2] = 1.
    L_x[step / 2, step / 2] = -2.
    L_y = np.zeros((step, step))
    L_y[step / 2, 0] = 1.
    L_y[step / 2, -1] = 1.
    L_y[step / 2, step / 2] = -2.
    convolved = np.abs(convolve(L_x, image))
    c2 = np.abs(convolve(L_y, image))
    convolved = convolved + c2
    shape = convolved.shape
    return convolved[step / 2:shape[0] - step / 2, step / 2:shape[1] - step / 2]


def SML(measure, n, T):
    """From the modified-Laplace-filtered image, compute the
    Sum-Modified-Laplacian. A normalized box kernel is used to sum up the
    values of the window. If a value of the ML operation is smaller than a
    threshold T, it is omitted.
    n: window size
    T: threshold"""
    window = np.ones((n, n)) / float(n)
    measure[measure < T] = 0
    sml = convolve(window, measure)
    shape = sml.shape
    return sml[n / 2:shape[0] - n / 2, n / 2:shape[1] - n / 2]


def gaussianFuseMap(measures):
    """Given a list of focus measures, compute a fuse map.
    For every pixel, the image with the largest focus measure is determined
    and its index stored in the fuse map. Due to memory problems, the focus
    measures are compared pairwise (this yields the index of the maximum as
    long as the measures are unimodal per pixel)."""
    shape = (measures[0].shape[0], measures[0].shape[1])
    maxIndices = np.zeros(shape)
    for i in range(1, len(measures), 1):
        intermediate = np.argmax(np.array([measures[i - 1], measures[i]]), axis=0)
        maxIndices = maxIndices + intermediate
    return maxIndices


def gaussianDepth(measures, depths):
    """From given focus measures and an according list of depth values, this
    computes the depth value d_bar for every pixel.
    For each pixel, the index with the largest focus measure is determined.
    Together with its neighbours it forms a Gaussian peak, so the model
    assumes. Therefore, the sigma and d_bar parameters of this Gaussian are
    computed. There are some special cases, like all focus measures being
    equal, or one focus measure being zero. Then the pixel is assumed to be
    background. If the sigma parameter is outside of a certain scope, the
    pixel can also be assumed to be background. And finally, if F_peak, that
    is the interpolated focus measure at d_bar, is too small, we again have a
    background pixel."""
    delta = np.abs(depths[1] - depths[0])
    shape = (measures[0].shape[0], measures[0].shape[1])
    maxIndices = gaussianFuseMap(measures)
    l = len(measures) - 1
    result = np.zeros(shape)
    for x in range(shape[0]):
        for y in range(shape[1]):
            m = int(maxIndices[x, y])
            if m == 0 or m == l:
                result[x, y] = depths[m]
            elif measures[m][x, y] < 0.1 or measures[m - 1][x, y] == 0 \
                    or measures[m + 1][x, y] == 0:
                result[x, y] = 0
            else:
                diff1 = np.log(measures[m][x, y]) - np.log(measures[m + 1][x, y])
                diff2 = np.log(measures[m][x, y]) - np.log(measures[m - 1][x, y])
                nenner = 2 * delta * (diff2 + diff1)  # nenner = denominator
                if nenner == 0:
                    result[x, y] = 0
                else:
                    d1 = depths[m] ** 2 - depths[m - 1] ** 2
                    d2 = depths[m] ** 2 - depths[m + 1] ** 2
                    d_bar = np.abs((diff1 * d1 - diff2 * d2) / nenner)
                    sigma = np.sqrt(np.abs(-(d1 + d2) / (2 * (diff2 + diff1))))
                    if sigma > 3. or sigma < 0.1:
                        result[x, y] = 0
                    else:
                        F_peak = measures[m][x, y] * np.exp(
                            0.5 * ((depths[m] - d_bar) / sigma) ** 2)
                        if F_peak < 0.1:
                            result[x, y] = 0
                        else:
                            result[x, y] = d_bar
    return result


def matchImages(raw, ds, step, w, T, cutoff, shift, focus=True):
    """Computes the image region of interest, computes the focus measure for
    this ROI and then scales the resulting image to the same size as the
    first image."""
    n = len(raw)
    print "Image 0..."
    # Convert to greyscale
    g = np.array(raw[0].convert('L'), dtype='float')
    # Crop the image
    s = g[cutoff[0]:cutoff[1], cutoff[2]:cutoff[3]]
    # Compute focus measure
    if focus:
        s = SML(modifiedLaplace(s / 255., step), w, T / 255.)
    shape = np.array(s.shape)
    focusMeasure = []
    # Add the first focus measure to the list
    focusMeasure.append(s)
    for i in range(1, n, 1):
        print "Image " + str(i) + "..."
        g = np.array(raw[i].convert('L'), dtype='float')
        g = g[cutoff[0]:cutoff[1], cutoff[2] - i * shift:cutoff[3] - i * shift]
        if focus:
            g = SML(modifiedLaplace(g / 255., step), w, T / 255.)
        # Determine the change of size between images
        factor = 1. / (ds[0] / float(ds[i]))
        # small is the region of the large image that needs to be cropped and
        # then resized
        small = (1 - factor) * shape
        small = np.round(small / 2.).astype(int)
        g = g[small[0]:shape[0] - small[0], small[1]:shape[1] - small[1]]
        # To convert an array to an image, the values need to be between 0 and
        # 255. So the old interval is stored, then converted to [0, 255], then
        # converted to an Image, then resized with bilinear interpolation,
        # then converted back to an array and, finally, transformed back to
        # the old interval.
        mi = g.min()
        ma = g.max()
        im = Image.fromarray(np.uint8(255 * (g - mi) / (ma - mi)))
        g = np.array(im.resize((shape[1], shape[0]), Image.BILINEAR), dtype='float')
        g = g / 255.
        g = (ma - mi) * g + mi
        focusMeasure.append(g)
    return focusMeasure


def depthFromFocus(filenames, step, w, T, d0, delta, cutoff, shift, gaussian=True):
    """filenames contains the image filenames, starting with the one that has
    the most remote focus plane.
    step: the spacing parameter for the ML operation.
    w: the spacing parameter for the SML operation.
    T: threshold parameter for the SML operation.
    d0: distance from camera to focus plane; needs to be determined
    experimentally.
    delta: distance between different focus planes.
    cutoff: the region of interest as a list of 4 values.
    shift: the vertical shift to cope with camera movement.
    gaussian: if False, the coarse depth map is computed instead."""
    n = len(filenames)
    raw = loadRawImages(filenames)
    ds = [d0 + i * delta for i in range(n - 1, -1, -1)]
    focusMeasure = matchImages(raw, ds, step, w, T, cutoff, shift)
    del raw
    print "Computing depth map..."
    if gaussian:
        dm = gaussianDepth(focusMeasure, ds)
        dm[dm == 0] = 1.05 * dm.max()
        return dm
    else:
        indexMap = gaussianFuseMap(focusMeasure)
        shape = indexMap.shape
        depthMap = np.zeros((shape[0], shape[1]))
        for x in range(shape[0]):
            for y in range(shape[1]):
                depthMap[x, y] = ds[int(indexMap[x, y])]
        return depthMap


def plot3D(data, width, height, step):
    """Plot a 3D surface for "data", given the "width" and "height" of the
    array. Note: "width" means image width and "height" image height, not
    array dimensions."""
    fig = plt.figure()
    ax = Axes3D(fig)
    X, Y = np.meshgrid(range(0, width, step), range(0, height, step))
    ax.plot_wireframe(X, Y, data[::step, ::step], cmap=cm.jet)
    plt.show()
    del ax
    del fig


def sigma2(g1, g2, D1, D2, w):
    # Spread parameter of the blur for shape from defocus, cf. [4].
    diff = (g1 - g2) ** 2
    laplace_g1 = laplace(g1) ** 2
    kernel = np.ones((w, w))
    factor = 4 * D2 * D2 / (D1 * D1 - D2 * D2)
    r = convolve(kernel, laplace_g1)
    r[r == 0] = 1.
    result = factor * np.sqrt(convolve(kernel, diff) / r)
    return np.sqrt(np.abs(result))


def u(g1, g2, D1, D2, f, s, w):
    # Object distance derived from the blur spread.
    factor = sigma2(g1, g2, D1, D2, w)
    factor[factor == 0] = 1.
    return D2 * s / (2 * np.sqrt(2) * factor) * (1 + 1. / f - 1. / s)


def depthFromDefocus(fn1, fn2, K1, K2, f, s, w=5):
    # Load images
    raw = loadRawImages([fn1, fn2])
    gray = transformToGrayArray([r for r in raw])
    height, width = gray.shape[1], gray.shape[2]
    del raw
    # Normalise images
    g1 = gray[0] / gray[0].max()
    g2 = gray[1] / gray[1].max()
    # Compute aperture size from focal length and f-number ("Blendenzahl")
    D1 = f / float(K1)
    D2 = f / float(K2)
    # Compute distances
    us = u(g1, g2, D1, D2, f, s, w)
    # Crop arrays to correct sizes (due to the convolution operation);
    # at the moment this works only for w=5
    us = us[2:height - 2, 2:width - 2]
    # Invert the distances
    us = us.max() - us
    return us


if __name__ == "__main__":
    raw = loadRawImages(["PC185183.JPG", "PC185184.JPG"])
    # raw = loadRawImages(["PC105138.JPG", "PC105140.JPG"])
    width, height = 800, 600
    gray = transformToGrayArray([r.resize((width, height)) for r in raw])
    g1 = gray[0] / gray[0].max()
    g2 = gray[1] / gray[1].max()
    f = 55.
    D1 = f / 8.
    D2 = f / 16.
    # D1 = f / 4.
    # D2 = f / 16.
    s = 60.
    w = 5
    us = u(g1, g2, D1, D2, f, s, w)
    us = us[2:height - 2, 2:width - 2]
    us = us.max() - us

    fig = plt.figure()
    ax = Axes3D(fig)
    # The meshgrid has to match the cropped array.
    X, Y = np.meshgrid(range(us.shape[1]), range(us.shape[0]))
    ax.plot_surface(X, Y, us, cmap=cm.jet)
    plt.show()