
EyesWeb User's Tutorial 1

The patches analysed in this tutorial deal with the processing of image sequences using basic background subtraction techniques. These video processing methods can be applied to a wide range of fields, for example video surveillance, interactive multimedia systems, museum and entertainment applications: using a video-camera system we can immediately discern details in our image, picking out moving objects and following their trajectories to eventually interpret their activities.

    In this tutorial we explain the following EyesWeb 4.5 patches:

Background Subtraction (a simple algorithm that subtracts a background frame from our input feed)

Simple Threshold (an algorithm that works well in high-contrast environments and binarises our feed)

Background Subtraction with Multiple Thresholds (an advanced version of the previous patch, which allows multiple thresholds to be applied to the feed and doesn't require the user's intervention)

Simple Frame Differencing (a patch that creates a silhouette by subtracting each frame from the following one)

Adaptive Background Subtraction (a highly user-interactive patch that combines Background Subtraction and Simple Frame Differencing)

Persistent Frame Differencing (an algorithm that allows us not only to extract any moving object from the feed, but also gives us some information on how much and where it is moving)

    Tests:

To test these patches the following sample videos were used: TestBS.avi, TestBS1.avi and TestBS2.avi.

    DIST - University of Genova

Laboratorio di Informatica Musicale (InfoMus) - http://www.infomus.dist.unige.it


    The Video File Reader block

As we are working with video feeds, all our patches require a block that opens a video file (be it AVI, MPEG or another format) and feeds it into our patch. This is done by the Video File Reader block (Image->Input->Video File Reader).

In EyesWeb each block has a set of parameters that determine how it works, so the first thing we must do upon adding the block to our workspace is modify said parameters. In our case we are mainly interested in changing three parameters from their default values: the color model, the player status and the play input.

Color Model must be changed from RGB (the default setting) to BW (black and white), so that our output image is already converted into grey-scale. Why? Simply because it is easier for the computer to work with monochrome images than with full colour.

Player Status indicates the status the block starts in: this must be set to Stopped, and the check-box next to the Play input must be marked so that the block can receive an external input telling it when to start broadcasting the video. In the more basic patches this might seem superfluous, as the external input comes from a Bang Generator (Bang->Input->Bang Generator) that fires at the beginning of the patch, re-creating the very conditions we have just modified. But as we proceed to more advanced patches, we will need different blocks to start at the same time; they will all be coordinated by the same Bang Generator, so they must begin their execution in their off-mode.
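If you want to follow along outside EyesWeb, here is a minimal Python/OpenCV sketch of the role the Video File Reader block plays once its Color Model is set to BW. The file name comes from the test videos listed above; everything else (window handling, the Esc key) is an assumption of this sketch, not part of the block:

```python
import cv2

# Sketch only, not EyesWeb code: read a video file and convert each
# frame to grey-scale, mirroring Color Model = BW.
cap = cv2.VideoCapture("TestBS.avi")

while True:
    ok, frame = cap.read()
    if not ok:  # end of file: stop, like the Stopped player status
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow("grey-scale feed", grey)
    if cv2.waitKey(30) & 0xFF == 27:  # press Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```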


    Background Subtraction

This patch enables the user to extract moving pixels from a video source, eliminating any static elements (e.g. the background), which are captured as a fixed image at the beginning of the feed (thus the video must start with a shot of the empty background for this patch to work).

    The final video is the result of the following algorithm:

M(t) = Abs[ I(t) - Background Image ]

What we need to do is extract a frame containing an empty background (i.e. without any intrusions) and subtract it from the grey-scale image we have just input: this is done using the Snapshot block (Topology->Snapshot), which will memorise and store the first frame of the video (in this case; remember when we said that the feed should start with the empty background for this patch to work?).

Next we can subtract the background from the grey-scale feed with an Arithmetic block (Operations->Arithmetic), setting the operation type parameter to absolute difference.

All that is left to do now is visualise the result through a Display block (Image->Output->Display).
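As a rough Python/OpenCV sketch of this chain of blocks (Snapshot, absolute difference, Display), assuming the first frame really is the empty background:

```python
import cv2

cap = cv2.VideoCapture("TestBS.avi")

# Snapshot block: memorise the first frame as the empty background.
ok, first = cap.read()
background = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Arithmetic block, operation type "absolute difference":
    # M(t) = Abs[ I(t) - Background Image ]
    m = cv2.absdiff(grey, background)
    cv2.imshow("Background Subtraction", m)  # Display block
    if cv2.waitKey(30) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
```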

The main advantage of this patch is the lack of manual intervention on the user's behalf: no parameters have to be set or changed during runtime.

On the other hand, the patch is badly hindered by any lack of contrast between the subject and the background, which confuses the pixels of one with the other and creates artefacts.


    Tests:

To make the tests consistent in pointing out this patch's merits and flaws, some slight modifications were made to the original model.

The changes were necessary to ensure that the background frame we use for the process is free from any kind of disturbance, such as moving objects or light changes: the added Video File Reader block receives a completely different video input, specifically a static recording of the background, thus (hopefully) avoiding any kind of artefact.

The main advantage that this patch offers the user is detail: of all the patches presented in this brief tutorial, only Background Subtraction produces a final output that is rich in particulars. This means that not only will the user be able to examine things such as clothing, physiognomy, skin tone, etc., but they will also perceive the final feed as three-dimensional, clearly understanding where the extracted subject is.

On the other hand, as already mentioned, Background Subtraction suffers badly from changes in lighting and from scarce contrast between background and subject.

Any light change shows up as a bright spot on the background, while shadows created by moving objects remain in the output and ruin the final result.

Furthermore, when trying to extract a dark subject from a dark background, the output will be transparent and have an indefinite contour.


In figure 1 we can see the effects of changing lights and shadows on the output; comparing this image with the one presented in figure 2, the difference in definition between low-contrast feeds and high-contrast ones is quite clear.

    Figure 1

    Figure 2


In both images we can notice a white area on the left-hand side: this is due to an error in choosing the background feed (figure 3), in which the floor had been covered with sheets of white paper (which wasn't done for the other feeds).

    Figure 3


    Simple Threshold

This patch uses a variable threshold to extract a silhouette of the moving subject from the input video. The algorithm is very simple to understand, though it proves faulty when the threshold is either too high or too low and when subject and background have similar tonalities.

The first three components of the patch are the ones already described in the previous paragraph, so we will not repeat ourselves. This time, instead of extracting a background image and using the Arithmetic block, the patch feeds the grey-scale image into a Threshold block (Operations->Threshold Operation (int)) with a variable threshold parameter, which can be set through an Int Generator block (Numeric->Input->Int Generator) and is visualised through a Display block (Math->Numeric->Scalar->Generic->Output->Display). (Actually, there are two threshold parameters, one for the lower boundary and one for the upper, though both are set to the same value to create a binarised output.)

What happens is that all pixels with a grey-scale value higher than the set threshold are converted to white (i.e. their value becomes 255), while all those that are lower are converted to black.
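In Python/OpenCV terms, the binarisation step amounts to the following hedged sketch (the threshold of 130 is just one of the values tried in the tests below):

```python
import cv2

threshold = 130  # runtime-adjustable in the patch via the Int Generator

cap = cv2.VideoCapture("TestBS.avi")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Pixels above the threshold become white (255), the rest black (0),
    # like the Threshold Operation (int) block with equal lower and
    # upper boundaries.
    _, silhouette = cv2.threshold(grey, threshold, 255, cv2.THRESH_BINARY)
    cv2.imshow("Simple Threshold", silhouette)
    if cv2.waitKey(30) & 0xFF == 27:
        break
```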

    Tests:

The main disadvantage encountered while using this patch is the complete absence of depth: the output feed comes out completely flat!

Furthermore there is a great loss of detail (e.g. crossed arms will not be distinguishable and will come out uniformly black), particularly when examining a dark subject.


Lighting changes also create artefacts and require the user to re-define the threshold settings.

The patch also requires the user to take an active part in the process, setting the threshold to the right level to extract a correct silhouette: too low a threshold and not enough pixels will be considered; too high and the subject will tend to be confused with the background.

As we can see from the following images, different threshold values yield different percentages of spurious pixels (those bothersome black dots that shouldn't be there).

Figure 4 (threshold 190) shows a high quantity of spurious pixels and white areas near light sources or where the background is illuminated by the frontal lights.

    Figure 4


In figure 5 (threshold 190) we notice the subject disappearing due to the threshold level being too high: both the subject and the background appear as a semi-uniform black area.

    Figure 5

Figure 6 (threshold 130) shows what happens when the threshold value is reduced: the background slowly disappears, the silhouette is more precise and changes in light become less bothersome.

    Figure 6


    Background Subtraction with Multiple Thresholds

This method has an edge over both of its predecessors: the use of multiple thresholds eliminates the need for user intervention during the process to obtain background removal (one of the flaws of Simple Threshold) and retains a good amount of detail while enhancing the contrast of the image (in Background Subtraction the image was grey-scale and thus less tidy).

All of this is obtained through the Background Subtraction with Multiple Thresholds block (Imaging->Operations->BgndSubMultThresh), which allows the process to apply different threshold levels to different areas of the feed according to their lighting. The block is defined by three parameters: the number of Threshold Levels (set to 35), the minimum level (87) and the maximum (194).

The result was then processed through a median filter (Imaging->Filters->NonlinearFilter) to eliminate any artefacts or spurious pixels.
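The internals of the BgndSubMultThresh block are not documented in this tutorial, so the following Python/numpy sketch is only a plausible reading of the idea: the difference against the background is compared with a per-pixel threshold quantised into 35 levels between 87 and 194 according to the local background brightness, and the result is median-filtered. Treat every detail of this function as an assumption, not the real block.

```python
import cv2
import numpy as np

def multi_threshold_subtract(grey, background,
                             levels=35, t_min=87, t_max=194):
    # Hedged sketch: pick one of `levels` thresholds per pixel from the
    # background brightness, so better-lit areas use a higher threshold.
    diff = cv2.absdiff(grey, background).astype(np.float32)
    step = np.floor(background.astype(np.float32) / 255.0 * (levels - 1))
    thresh = t_min + step * (t_max - t_min) / (levels - 1)
    mask = np.where(diff > thresh, 255, 0).astype(np.uint8)
    # Nonlinear Filter block: a median filter removes spurious pixels.
    return cv2.medianBlur(mask, 3)
```

A frame loop like the one in the Background Subtraction sketch above can feed grey and background into this function.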


    Tests:

As with the Background Subtraction patch, we slightly modified the process to allow us to use a separate feed for background extraction.

A few flaws turned up during these tests.

First off, the persistent white-area problem that appeared in the Background Subtraction patch also appears here, but as was pointed out before it is just a matter of incongruity between the feed we want to process and the background feed (figure 7).

Secondly, the patch suffers from light changes, particularly if the brightness grows over time, resulting in white areas on the screen that might hide the subject (similarly to what happened in the Simple Threshold patch when the threshold level was too high) (figures 7 and 8).

    Figure 7


    Figure 8

All in all the patch proved to be a failure: since it cannot cope with brightness changes, when the subject is illuminated directly with bright light their silhouette disappears, rendering the method useless in areas like video surveillance.

(NOTE: the Multiple Threshold block used here is the legacy version belonging to the EyesWeb 3.2 library; at the time of writing (January 2007) it is not yet present in the 4.5 distribution.)

Simple Frame Differencing

This patch creates a silhouette of the moving subject by subtracting each frame from the one that follows it. The algorithm used is:

M(t) = Threshold{ abs[ I(t) - I(t-1) ], θ }

Where M(t) is the resulting image, I(t) is the current image, I(t-1) is the previous image and θ is our threshold value.
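A hedged Python/OpenCV sketch of this algorithm (the file name and the starting value of θ are assumptions; θ = 20 is one of the values tested below):

```python
import cv2

theta = 20  # threshold value, adjustable at runtime in the patch

cap = cv2.VideoCapture("TestBS.avi")
ok, prev = cap.read()
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)  # I(t-1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # I(t)
    # M(t) = Threshold{ abs[ I(t) - I(t-1) ], theta }
    _, m = cv2.threshold(cv2.absdiff(grey, prev), theta, 255,
                         cv2.THRESH_BINARY)
    cv2.imshow("Simple Frame Differencing", m)
    prev = grey
    if cv2.waitKey(30) & 0xFF == 27:
        break
```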

Frame differencing proves advantageous as it responds very quickly to changes in lighting and camera motion, extracting only objects that are actually moving (objects that stop disappear from the screen until they start moving again).

However, only the edge of the silhouette is extracted from the feed, and there aren't enough references to perceive whether the object is moving towards or away from the camera. Furthermore, rapid changes in light cause major variations in the overall lighting, creating shadows and reflections that the patch cannot distinguish from real moving objects, resulting in a very confusing image.

    Tests:

In Figure 9 we can see how the application of a low threshold (5) produces great quantities of spurious pixels, and how the subject's shadow is visible in the background, resulting in an inadequate output.

    Figure 9

In Figure 10 a higher threshold value (20) results in the disappearance of the spurious pixels, although the silhouette of the shadow remains in the background.


    Figure 10

In Figure 11 the threshold value is raised even further (50) and, as we can see, the shadow's silhouette completely disappears (although this is not always the case).

    Figure 11


    Adaptive Background Subtraction

The Adaptive Background Subtraction patch responds to changes in lighting better than the others. Being a hybrid of the Simple Threshold, Simple Frame Differencing and Background Subtraction patches, it distinguishes between moving objects (well segmented, but with a slight pixel trail) and fixed objects (which slowly fade into the background).

    The algorithm used is:

M(t) = Threshold{ abs[ I(t) - B(t-1) ], θ }

B(t) = α*I(t) + (1 - α)*B(t-1)

Where I(t) is the current image, B(t) is the result of the second expression and θ is the threshold value. α is a parameter that can be changed at runtime and can only take two values: 0 and 1.

If we examine the patch we can clearly see that with α = 1 it behaves as a Simple Frame Differencing: B(t) is equal to I(t), so our output M(t) becomes Threshold{ abs[ I(t) - I(t-1) ], θ }, exactly as in the Simple Frame Differencing patch. On the other hand, with α = 0 it becomes a simple Background Subtraction, with the only differences that the background is the last frame processed while α = 1 and that the feed is binarised rather than grey-scale, thus losing most of that method's advantages. If the patch starts with α = 0, it behaves like a Simple Threshold (the background subtracted is a blank screen).

Values of α > 1 weren't considered in this study, as they would only result in a brightening of the image.
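A hedged Python/OpenCV sketch of the two equations above; the key-press toggle stands in for changing α at runtime, and the threshold of 60 anticipates the value the tests below found best for α = 1:

```python
import cv2

theta = 60   # threshold value
alpha = 1    # 1 -> frame differencing, 0 -> background subtraction

cap = cv2.VideoCapture("TestBS.avi")
ok, first = cap.read()
b = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)  # B(t-1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    i = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # I(t)
    # M(t) = Threshold{ abs[ I(t) - B(t-1) ], theta }
    _, m = cv2.threshold(cv2.absdiff(i, b), theta, 255, cv2.THRESH_BINARY)
    # B(t) = alpha*I(t) + (1 - alpha)*B(t-1); with alpha in {0, 1} this
    # either tracks the current frame or freezes the stored background.
    b = i if alpha == 1 else b
    cv2.imshow("Adaptive Background Subtraction", m)
    key = cv2.waitKey(30) & 0xFF
    if key == 27:
        break
    if key == ord('a'):  # toggle alpha at runtime
        alpha = 1 - alpha
```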

All this might seem difficult to accomplish, though after a careful study of the patch you will realise that it's not that hard to understand.

First off we need to generate B(t) = α*I(t) + (1 - α)*B(t-1) and its components.

(1 - α) is generated by combining a phoney Random Generator block (Math->Numeric->Scalar->Generic->Input->Random Generator), used to produce the constant 1 (that is why we call it phoney: to obtain a constant output we set both the maximum and minimum range of the random output to 1), with our usual Int Generator for α. These are combined through a Scalar Arithmetic Operation (double) block (Operations->Scalar Arithmetic Operation (double)).

The result is then multiplied by B(t-1) using another similar block.

B(t-1) is retained from the previous cycle through a Queue block (Topology->Queue) and sent to another Scalar Arithmetic Operation block to complete the algorithm (which shouldn't need any explaining, as it is exactly like all the others we have seen so far).

This is valid while α = 1. As soon as it changes to 0, the whole configuration must change to that of a Background Subtraction.

The background is stored through a Snapshot block, which receives α as its Load parameter (i.e. while the Load parameter is != 0 the block does its job, stopping if it ever becomes 0) and so constantly memorises B(t-1). As soon as α = 0 the Snapshot stops memorising and keeps the last frame, to be used in lieu of the background input feed that we had in the Background Subtraction patch.

To finish off we add an Input Selector block (Topology->Input Selector), piloted by α, which switches between the Snapshot and the Queue blocks.

    Tests:

    We have carried out tests and have obtained the following results.

In Figure 12 (α=1, threshold=5) we have a high amount of spurious pixels (10.6%), too high to allow us to precisely discern our subject's silhouette. We also have the residual image of the subject's shadow.


    Figure 12

In Figure 13 (α=1, threshold=20) the amount of spurious pixels is drastically reduced (0.59%), although the shadow's contour remains.

    Figure 13

Figure 14 (α=1, threshold=60) shows an optimised result: there is no sign of the shadow and the amount of spurious pixels is so small it can be completely ignored. Sadly, we still do not have a perfect reconstruction of our subject's silhouette.


    Figure 14

As we can see in Figure 15, some parts of the contour are missing. This is due to the Frame Differencing technique, which deletes pixels that aren't moving (in this case the foot that's standing on the floor).

    Figure 15

In Figure 16 (α=1, threshold=35) there are too many spurious pixels (0.79%) and too many of the subject's details are lost in the background.


    Figure 16

In Figure 17 (α=0, threshold=160) we try to amend the previous image's problems by raising the threshold level, obtaining a high-contrast silhouette, while unfortunately also creating many dark areas in the background which might cause our subject to disappear.

    Figure 17

The quantity of spurious pixels is terribly high (51.4%), although if we consider only the area surrounding the subject the percentage drops to 12.32% (still too many).

    Figure 18


In Figure 19 (α=0, threshold=110) we find that at this particular threshold level the spurious pixels amount to only 8.83%, and most of them are far away from our subject's silhouette, permitting us to distinguish it easily.

    Figure 19

In Figure 20 we started the patch at α=1 and subsequently changed it to 0, obtaining a Background Subtraction. The last silhouette extracted from the image is stored in the background and persists in the output.

    Figure 20

In Figure 21 we try to eliminate this extra silhouette by raising the threshold level, but notice that by doing so we also lose the subject's.


    Figure 21

Considering the results obtained during this test, we can conclude that Adaptive Background Subtraction can be used in video surveillance with a sufficient degree of success, as long as:

a) we set an appropriate threshold level according to the chosen value of α (60 for α=1 and 110 for α=0);

b) we avoid changing α from 1 to 0 during run-time.

    Persistent Frame Differencing


Similarly to the previous Adaptive Background Subtraction, this patch responds well to any change in lighting, and objects that aren't perceived in motion fade away with time. Moving objects leave behind a somewhat persistent trail of pixels, whose gradient enables us to perceive three-dimensionality.

To make a long story short: this patch visualises the output's motion history.

The algorithm used is:

H(t) = M(t) OR B(t), where B(t) = H(t-1) - λ

M(t) is the same result obtained with Simple Frame Differencing. To this we add B(t), which is very simply a residue of all the previous frames, their brightness diminished by a factor λ.


B(t) is obtained by processing the output through a Time Delay block (thus transforming our H(t) into H(t-1)) and then uniformly subtracting λ from each of its pixels to diminish the image's brightness, obtaining the aforementioned fading trail of pixels.

λ is the rate at which this trail fades: for values higher than 190 we do not get any sort of trail (transforming the patch into a Simple Frame Differencing), while for λ=0 the trail simply does not fade and persists for the whole duration of the process (creating quite an amount of confusion in the output!).

What we need to do now is combine M(t) and B(t), which can easily be done through a Logical block (Operations->Logical) with its operation type parameter set to OR.
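Putting the whole patch together as a hedged Python/numpy sketch (θ = 100 and λ = 5 anticipate the values the tests below judge best):

```python
import cv2
import numpy as np

theta = 100  # threshold for the frame difference
lam = 5      # fade rate, subtracted from the trail at every frame

cap = cv2.VideoCapture("TestBS.avi")
ok, prev = cap.read()
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
h = np.zeros_like(prev)  # H(t-1), the motion-history trail

while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # M(t): Simple Frame Differencing
    _, m = cv2.threshold(cv2.absdiff(grey, prev), theta, 255,
                         cv2.THRESH_BINARY)
    # B(t) = H(t-1) - lambda, saturating at 0: the fading pixel trail
    b = np.clip(h.astype(np.int16) - lam, 0, 255).astype(np.uint8)
    # Logical block set to OR: H(t) = M(t) OR B(t)
    h = cv2.bitwise_or(m, b)
    cv2.imshow("Persistent Frame Differencing", h)
    prev = grey
    if cv2.waitKey(30) & 0xFF == 27:
        break
```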

    Tests:

The first thing to consider while using this patch is the presence of two parameters customisable at runtime: the threshold value θ and the fade rate λ.

In Figure 22 (θ=5, λ=10) we can see that setting both parameters to such low values makes the result quite incomprehensible: the silhouette can barely be distinguished and the whole image comes out as confusing.

    Figure 22

In Figures 23-24 (θ=5, λ=60 and 220) we can see that as the pixel trail diminishes the image becomes more and more understandable, although the amount of spurious pixels (due mostly to the low threshold value) is still very high.


    Figure 23

    Figure 24

In Figure 25 (θ=20, λ=3) we can see that raising the threshold value greatly reduces the amount of spurious pixels, while the low value of λ results in a persistent trail that can help the user identify the direction of the subject's motion (although in this case it will most probably confuse the user, as the fading rate is too low). Lighting changes create bothersome white patches in the background that might hide the subject's silhouette if the subject happens to cross those areas.

    Figure 25

In Figure 26 (θ=20, λ=10) the trail fades a little faster, presenting the user with a more understandable image, although lighting changes still produce unwanted white areas.

    Figure 26

In Figure 27 (θ=30, λ=3) we raised the threshold level: the white areas due to the lighting changes are decidedly reduced.


    Figure 27

Figure 28 (θ=40, λ=3) shows what happens when the threshold value is raised even more: the white areas created by the changes in lighting are reduced to a bare minimum, while the detail in the subject's silhouette is maintained.

    Figure 28

Raising the threshold level even further, as in Figure 29 (θ=100, λ=3), just results in a loss of detail in the silhouette, as some particulars fall under the threshold level and are thus considered part of the background and deleted. On the positive side, all white areas deriving from light changes completely disappear.


    Figure 29

We continue refining our test in Figure 30 (θ=100, λ=5) by raising the λ value: the pixel trail is now less persistent and allows us to distinguish quite clearly which path was taken by our subject.

    Figure 30

In Figures 31-32-33 (θ=100, λ=10, 60 and 200) we see the results of raising the λ parameter further: the main point of the Persistent Frame Differencing patch slowly disappears, and it becomes nothing more than a Simple Frame Differencing.


    Figure 31

    Figure 32

    Figure 33

In conclusion, we can assert that the Persistent Frame Differencing patch gives the best results for a threshold level of θ=100 (no artefacts due to lighting problems) and for λ=5-10 (a trail long enough to distinguish the direction and quantity of movement).


    References

For Simple Frame Differencing, Adaptive Background Subtraction and Persistent Frame Differencing, the algorithm schemes are based on Robert Collins' short course at the University of Genoa on image sequence processing for video surveillance.