
Post on 21-Dec-2015

MMSP Irek Defée

MULTIMEDIA SIGNAL PROCESSING

BASIC PROBLEMS IN PROCESSING

MEDIA INFORMATION

Kinect – new media interface

• Before we proceed, we mention an important development in media interfaces

• This is a device and system called Kinect, made by Microsoft. Kinect has been available as a product since the beginning of November 2010

Kinect is part of the Microsoft Xbox game platform but it can be bought separately!

What is Kinect?

• Kinect is a new type of hardware for interacting with people - with proper software support of course

• Kinect looks like this

What is inside Kinect?

The hardware is worth about 40 euro and works according to the following schematic, plus software which extracts signals and sends them to the Xbox for processing. Processing takes about 5% of the Xbox processing power (Xenon processor)

How does Kinect work?

Kinect has FOUR microphones to retrieve spatial sound, attenuate noise and interference, and compensate for room acoustics

Kinect has a small color camera with 640x480 resolution

Most advanced aspect

Kinect "eyes"

The eyes of Kinect are an INFRARED MEASUREMENT SYSTEM: a laser beam is sent from the objective and received by a sensor, as can be seen above. These sensors can move to adjust for distance and height. The device produces a MAP OF DEPTH to objects in a room.

The device can thus "see" in bad light or in darkness. Before use it is TRAINED with movements of persons in the room. You can see on the right that, in infrared, the beam makes lots of measurement dots

What does Kinect do?

Kinect recognizes voice IN ROOMS and can be used for voice control of applications
Kinect recognizes persons and body movements, which is used in applications
But before this, Kinect is TRAINED interactively, as shown in the pictures

After the training, persons and body movements will be recognized. More than one person can be identified in a scene

Why is Kinect revolutionary?

• It is the first practical natural interface for machines communicating with people

• It works in normal rooms
• It combines the acoustical and visual senses
• It recognizes full body movements, even complicated ones
• It recognizes persons
• It works well; it is not perfect, but one can predict there will be much more in the future

Kinect applications

Games and interactive playing (sports, dancing)

More applications: exercising, rehabilitation, child development

Control of devices by voice, gestures

Automation, robotics

More… we do not know yet… but the public drivers are partially available

Back to the lectures

• We continue with the overview of biological systems and principles of sensory information processing, and finish it with some conclusions

FROM PREVIOUS LECTURES WE KNOW THAT MULTIMEDIA INFORMATION PROCESSING IS EXCELLENTLY DONE BY THE HUMAN INFORMATION PROCESSING SYSTEM

• OUR PROBLEM IS:

Biological systems process audiovisual information using special "hardware" (which could be called "wetware") and "software", that is, algorithms.

The question is: can we process audiovisual information using different hardware and software? Maybe the algorithms could be similar?

IN THE HUMAN VISUAL SYSTEM, PROCESSING STARTS IMMEDIATELY IN THE RETINA, AND THERE ARE SEPARATE SYSTEMS FOR COLOR AND FOR BLACK-AND-WHITE LIGHT ACQUISITION AND PROCESSING

Let us take visual processing as an example

FROM THE COLOR AND BLACK-AND-WHITE RECEPTORS, SIGNALS GO TO INITIAL PROCESSING ELEMENTS

IT IS IMPORTANT TO NOTICE THAT THE NUMBER OF COLOR PROCESSING ELEMENTS IS MUCH LOWER THAN THE NUMBER OF BLACK-AND-WHITE ONES

OUTPUT LINKS

• WHAT DO THESE PROCESSING ELEMENTS DO?

THE MOST RECENT MEASUREMENTS OF RETINAL NEURAL CELLS SHOW THAT THEIR RECEPTIVE FIELDS ARE QUITE IRREGULAR

THE FOLLOWING PAGES GIVE SOME INFORMATION ABOUT WHAT THESE CELLS ARE DOING

• A BAR OF LIGHT IS MOVED OVER THE PHOTORECEPTORS IN DIFFERENT DIRECTIONS

THE OUTPUT OF THE PHOTORECEPTORS IS SUMMED WITH POSITIVE SIGN (EXCITATION) OR NEGATIVE SIGN (INHIBITION)

DEPENDING ON THE DIRECTION OF MOTION, THE SIGNALS SUM UP STRONGLY OR NOT
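The signed summation described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the circuit measured in the retina: the row of eight receptors, the one-step delay on the inhibitory input, and the function names `moving_bar` and `direction_selective_response` are all hypothetical choices made for illustration.

```python
def moving_bar(n_receptors, direction):
    """Frames of a one-receptor-wide bar sweeping over a row of photoreceptors.

    frames[t][i] is 1 when the bar covers receptor i at time step t.
    """
    if direction == "right":
        positions = range(n_receptors)
    else:
        positions = range(n_receptors - 1, -1, -1)
    return [[1 if i == p else 0 for i in range(n_receptors)] for p in positions]


def direction_selective_response(frames):
    """Sum excitation minus delayed inhibition from the right-hand neighbour."""
    total = 0
    for t in range(1, len(frames)):
        for i in range(len(frames[t]) - 1):
            excitation = frames[t][i]          # summed with positive sign
            inhibition = frames[t - 1][i + 1]  # summed with negative sign, one step late
            total += max(0, excitation - inhibition)
    return total


rightward = direction_selective_response(moving_bar(8, "right"))
leftward = direction_selective_response(moving_bar(8, "left"))
print(rightward, leftward)  # prints "6 0"
```

With this wiring, a bar moving to the right is never cancelled, while for a bar moving to the left the delayed inhibition arrives exactly when the excitation does, so the summed signal vanishes.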

• HERE THE MEASURED SIGNALS ARE SHOWN FOR CELLS WHICH REACT STRONGLY TO A WHITE BAR ON A BLACK BACKGROUND AND TO THE OPPOSITE (off)

• HERE WE SEE THE RESPONSE MEASURED IN TIME

• WE CAN SEE THAT INITIAL PROCESSING IN THE EYE INCLUDES DETECTION OF DIRECTIONAL CHANGES IN LIGHT INTENSITY

THIS MIGHT BE DONE FOR DIFFERENT COLORS TOO

WE CAN NOW ASK THE FOLLOWING QUESTIONS:

WHY IS THE PROCESSING ORGANISED IN THIS WAY? AS AN ANSWER, WE CAN SUPPOSE THAT THE PROCESSING IS OPTIMISED IN SOME WAY.

WHAT MIGHT THE OPTIMISATION CRITERIA BE?

WHAT ARE THE GENERAL PRINCIPLES OF HUMAN/BIOLOGICAL INFORMATION PROCESSING?

OVERLAPPING SQUARES OR NOT???

• WHY DO WE SEE THREE SQUARES HERE AND NOT CUT-OUT SQUARES?

NOTE THAT ONLY ONE SQUARE IS FULLY VISIBLE; THE OTHERS ARE COVERED, AND IN FACT THEY MAY NOT BE SQUARES

THIS IS BECAUSE THE VISUAL SYSTEM PRODUCES THE INTERPRETATION WHICH IS MOST PLAUSIBLE (GENERIC)

BUT IT MAY BE WRONG TOO, ALTHOUGH WE WOULD BE SURPRISED IF IT REALLY WERE!!!

• THE INTERPRETATION PRODUCED IS FOR DETECTING THE MOST PROBABLE OBJECTS

THE UPPER FIGURE IS DETECTED AS AN ARCH OVERLAID ON A SAWTOOTH

THIS IS THE MOST PROBABLE INTERPRETATION

THE INTERPRETATION OF THE BOTTOM FIGURE IS SURPRISING, BUT IT COULD ALSO BE PRODUCED IF THERE WERE MORE EVIDENCE

• THE VISUAL SYSTEM ASSUMES THAT LIGHT IS COMING FROM THE TOP

LIGHT DIRECTION

SAME PICTURE UPSIDE DOWN

• The statistics-based system normally works almost perfectly. As we could see, it sometimes fails when the input signals are highly improbable and/or when the most probable interpretation is not correct.

This can be seen in visual illusions.

We will look at them more closely, since a recent statistical approach explains them. This gives us a hint about what kind of processing is done.


Principles we can identify now:

• Statistical processing matched to real-world signal statistics: it provides responses to the most probable signals. This is a very natural principle

• Minimization of the information processed: as much information as possible is eliminated, and only the minimum information needed to provide a response is used. This principle makes it possible to minimize energy and processing effort.

• A book which appeared in 2005, based on earlier research:

• The authors are visual psychologists; they consider vision as a system interpreting the world from images projected onto the eye:

Light from an external source bounces off objects and is projected onto the eye. This projection is not unique (e.g. objects of different sizes will have the same projection, depending on their distance)
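The ambiguity of projection can be made concrete with the visual angle subtended by an object. The sketch below uses the standard geometric formula for an object viewed head-on; the function name `visual_angle_deg` and the example sizes and distances are assumptions chosen for illustration.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Angle (in degrees) subtended at the eye by an object seen head-on."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


# Hypothetical objects: a 1 m object at 10 m and a 2 m object at 20 m.
small_near = visual_angle_deg(1.0, 10.0)
large_far = visual_angle_deg(2.0, 20.0)
print(small_near, large_far)
```

Both calls return the same angle, so from the projection alone the eye cannot tell which of the two objects it is looking at.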

• In visual illusions, the projection gives rise to an improper interpretation

Natural scene, illusion persists

Stimuli change, illusion persists

This picture gives a strong impression of depth because of a combination of many mutually consistent cues:
- perspective
- texture gradient
- shading and shadow

• Geometry of natural scenes

Geometrical illusions represent a wrong interpretation of the real world. To find out why, researchers took pictures together with depth maps

Laser range scanner for measuring distance

Real pictures with corresponding distances marked by colors

• If a large number of such pictures is taken, a database can be created in which real-world objects are matched with distances, and statistics are calculated.

Example: subjective metrics

Let's think about lines of different lengths which are seen in the real world. If all lengths had the same probability, there would be a linear relation between length and stimulation. But if this is not the case, some lengths will be stimulated more often. This can lead to distortions in perception.

• Example: Line length illusion
Variation of apparent length as a function of orientation

In experiments, people report that the length changes depending on the angle

• Why is it so? Let's sample lines in pictures from the database

Grid of templates to overlay on a picture with straight lines

White: accepted lines, Black: rejected lines

The points in the picture were compared with the laser range measurements to see if they correspond to lines in the real world. A total of 1.2x10^7 line segments were collected

Probability distribution of lines vs. length for different orientations

Cumulative distribution (lines shorter than x)
This shows how many lines at a certain orientation corresponded to real lines of length shorter than or equal to x

• Prediction of apparent length based on probability

Take e.g. lines of length 7 at an orientation of 20 deg: their cumulative probability is 0.15, which means that 15% of the lines are shorter than 7 pixels and 85% are longer. For all orientations we get this plot

This is very similar to the one measured in experiments with people!!!
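The cumulative probability used here is simply the fraction of observed lines that are no longer than the line in question. A minimal sketch of that computation follows; the function name `cumulative_probability` and both sample lists are hypothetical and are not data from the study.

```python
from bisect import bisect_right


def cumulative_probability(observed_lengths, length):
    """Fraction of observed lines whose length is shorter than or equal to `length`."""
    ordered = sorted(observed_lengths)
    return bisect_right(ordered, length) / len(ordered)


# Hypothetical line lengths (in pixels) for two orientations; not data from the study.
lengths_at_0_deg = [2, 3, 4, 5, 6, 7, 9, 12, 16, 25]
lengths_at_20_deg = [6, 7, 9, 11, 14, 18, 22, 27, 33, 40]

rank_0 = cumulative_probability(lengths_at_0_deg, 7)
rank_20 = cumulative_probability(lengths_at_20_deg, 7)
print(rank_0, rank_20)
```

In this made-up sample the same 7 pixel line gets a different cumulative probability at each orientation, which is the kind of orientation-dependent difference that the lecture uses to predict apparent length.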

• Why do such biases exist?

In nature lines do not appear often; horizontal lines are typically generated from horizontal flat surfaces

Vertical lines are limited by gravity and are therefore rare, and lines at 20-30 deg are even rarer; they are mostly projected from perspective

• Visual illusions: Angles

All angles in this picture are 90 deg, but when they are projected onto the eye, the projections may differ by up to 60 deg

A) Bias in angle estimation between two lines
B, C, D) Angle illusions

• To explain this, a database of angles is made, as before

Extraction of angles

Probability distributions for different types of angles (bottom line) in natural scenes and in scenes with human-created objects
We can see a bias: angles close to 90 deg are less likely to occur

• Bias and illusions

Angles close to 90 deg are more likely to come from a planar surface, which is typically larger than the surface for lines intersecting at smaller angles. Thus 90 deg angles are less likely

The probability distribution of angles is not linear, so the cumulative probability is biased

Thus the predicted perceived angle is different from the actual one; for 90 deg it is the same

The magnitude of angle misperception (lines) vs. experimentally measured values

• Explanation of angle illusions

Why does the vertical line appear tilted? We take a reference line at 60 deg (black) and check the probability of occurrence of physical sources of a second line oriented at different angles. Since the angle between the lines is 30 deg, we look at the probability for 30 deg and then at the cumulative probability (previous page), which gives the value 0.184. Multiplied by 180, this gives an angle of about 33 deg, in agreement with measurements
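The last step of this explanation is plain arithmetic: the cumulative probability is rescaled to the 0 to 180 deg range. The sketch below repeats it with the value 0.184 quoted above; the function name `predicted_angle_deg` is a hypothetical helper introduced only for illustration.

```python
def predicted_angle_deg(cumulative_probability, full_range_deg=180.0):
    """Rescale a cumulative probability (0..1) to an angle in degrees."""
    return cumulative_probability * full_range_deg


actual_deg = 30.0
predicted_deg = predicted_angle_deg(0.184)
overestimate_deg = predicted_deg - actual_deg
print(round(predicted_deg, 2), round(overestimate_deg, 2))  # prints "33.12 3.12"
```

So a physical angle of 30 deg is predicted to be perceived as roughly 33 deg, an overestimate of about 3 deg.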

• Size illusion
According to the previous explanations, the reason for this illusion is:

The probability distributions of the possible sources of the targets, given their different contexts, are different
To check this hypothesis, the database was searched for circular objects, and the probabilities of the sources of targets in the context were calculated:

Various size illusions of center and surrounding

Experimental conditions
a) The inner circle is surrounded by 4 circles with changing diameters
b) Probability of occurrence of a center circle with a specific size, for outer circles with different diameters. The dashed line shows the probability for a circle with a 14 pixel diameter. (Bigger surrounding circles are much less likely to appear)
c) Cumulative probability for the 14 pixel circle
d) Examples of scenes with large circles and small circles

Why are there statistical differences? Circles originate from planar projections; larger circles are less likely.
Why does the presence of surrounding circles change the occurrence of the central target circles differently? Larger circles arise from larger planes in the world, which are flat areas, so it is more probable that the central circle will be larger.
In other words, the presence of larger surrounding circles increases the probability of occurrence of physical sources of larger central circles. As a result, the probability distribution of central circles changes according to the size of the surrounding circles.

• Changing the interval between the center and surrounding circles

Probabilities when the distance is changing
The dashed line is for a circle of size 14

Cumulative probability for the 14 pixel circle

• Comparison of the inner circle with a single circle
a) Probability distribution of a single circle vs. diameter
b) Probability for a single circle superimposed with the probability of a central circle surrounded by an outer circle; the dashed line is for a 24 pixel circle, the probability curve is for an outer circle of 32 pixel diameter; the cumulative probability is much higher, so there is a bias
c) When the outer circle is much bigger, the cumulative probability is smaller

The changing cumulative probability ratios and their dependence on the central and outer circle sizes are clearly seen, and the illusion depends on these parameters in exactly the same way

• Distance illusions

a) When objects are close, the perceived distance is overestimated relative to the physical one

b) Objects which are close to each other are perceived as being at the same distance

c) The distance to close objects is overestimated, the distance to far objects is underestimated

d) Objects on the ground at a distance of about 7 m appear closer, and with increasing distance they appear more elevated

According to the methodology, the probability distribution of distances is measured, but there are several variables here:

Probability of all distances from the scanner

Probability of the differences in distance between objects for three different horizontal angles

Probability of horizontal distances at different heights with respect to eye level

• Interpretation of these probabilities

a) The curve for all distances has a strong peak at a distance of 3 m. This is in agreement with experiments in which people seeing single objects hanging in a completely dark scene report them as being at a distance of 2-4 m

b) When the angular separation between the objects is small, they tend to be seen at equal distance, but this tendency decreases as the angle increases

c) The dependency of the probability of distance vs. eye level has a peak at a distance of 4 m. Thus the distance to objects closer than 4 m will be overestimated, and the distance to those farther than 4 m will be underestimated. This agrees with experiments
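One way to see why a probability peak at 4 m pulls estimates towards it is a simple Bayesian sketch. This is an illustration under loud assumptions and not the method of the study discussed in the lecture: the prior is taken to be Gaussian and centered at 4 m, the measurement noise is Gaussian, and both spreads (`prior_sd_m`, `noise_sd_m`) are hypothetical values.

```python
def posterior_mean(measured_m, prior_mean_m=4.0, prior_sd_m=2.0, noise_sd_m=1.0):
    """Posterior mean distance for a Gaussian prior and Gaussian measurement noise.

    The estimate is a weighted average of the measurement and the prior peak.
    All default parameter values are assumptions made for illustration.
    """
    weight = prior_sd_m**2 / (prior_sd_m**2 + noise_sd_m**2)
    return weight * measured_m + (1 - weight) * prior_mean_m


for true_distance in (2.0, 4.0, 8.0):
    print(true_distance, posterior_mean(true_distance))
```

With these assumed values, an object at 2 m is estimated as farther than it is, an object at 8 m as closer than it is, and an object at 4 m is estimated correctly, which matches the pattern described in point c).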

• The size illusion

The size illusion does not depend on the particular type of endings

It can be induced even without a line

and even (but less strongly) with dots

Why does this happen?

Again, for the explanation, the database is searched for such patterns and probabilities are calculated. Here we consider the case when both figures are in-line, on the left/right

Templates used

Templates overlaid on pictures

• Results of probability calculations
a) Probability of lines with a specific length and arrows pointing inwards and outwards
b) Cumulative probabilities
c) Superimposed cumulative probabilities showing the differences
d) Example of two lines of length 50 pixels. One can see that the cumulative probability for outward arrows is higher, which corresponds to the bigger perceived length.

Figures are in-line, extending to the left or to the right

• Angle illusion

The line is interrupted by a vertical occluder

It is then perceived as two shifted segments

Why does this happen?

Again, statistics of such patterns are calculated from the database of pictures

• Templates for calculation

a) Shows the templates; for each red line there is one template corresponding to the shift
b) The templates are matched in the pictures and statistics can be calculated
c) Other templates can be used for different configurations of this illusion
d) Definition of the difference in location of the line segments

• Probability distributions measured

We can see peaks which are at nonzero shift

So the most probable interpretation from these statistics is that there is a nonzero shift

• One can also study the effect of the angle of the line and the width of the distractor

Change of line orientation

Change of width of the distractor

As can be seen, when they are larger, the peak moves towards greater shifts, which implies that the illusion will be stronger, and it really is so

• The processing of information in biological systems is statistical: it aims at producing the MOST PROBABLE response to the signals coming from the real world. This type of processing must be based on statistics of signals and models from the real world. The result of processing is the most plausible answer for "normal conditions" and assumptions. This we have seen in the examples before and they are repeated next.

CONCLUSION

• Statistics-based processing seems to be very strong in explaining visual illusions (many of them in the same way)

The principle of statistical processing is powerful: the system collects information about the most likely distribution of signals and provides the most probable interpretations for them. This will work in most cases. Only when signals are very atypical will it fail, but this is rare.

BUT…

• We have to remember that biological systems are able to deal with extreme variations of signals and still extract the right information from them. This will now be illustrated by the example of face recognition

Faces can be distorted in many ways and still be recognized. We can guess something about the PRINCIPLES OF FACE PROCESSING

We can recognize FAMILIAR faces from extremely low resolution pictures.

How is this done? We do not have a clear idea, but it points to the minimization of processed information

Contour information is not enough

The face is processed somehow as a "whole" and not as composed of parts. From the combined picture on the left we see a new face; when we split it, we recognize other faces

Eyebrows are very important for the identification of faces

Faces can be recognized despite extreme distortions

Faces seem to be encoded in memory in an exaggerated, caricature-like way:

A) Average face (averaged from a number of persons)
B) Some typical face
C) Face created by taking a big deviation from the average
Such faces are recognized even better than typical ones
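The idea of a face built as a big deviation from the average can be written down as a small sketch. It assumes, purely for illustration, that a face is described by a short list of numeric measurements; the function name `caricature`, the `exaggeration` factor and the feature values are hypothetical and do not come from the lecture.

```python
def caricature(face, average_face, exaggeration=2.0):
    """Push each feature of `face` further away from the average face.

    exaggeration = 1.0 returns the original face; larger values exaggerate it.
    """
    return [a + exaggeration * (f - a) for f, a in zip(face, average_face)]


# Hypothetical feature vectors (e.g. three facial measurements in arbitrary units).
average_face = [10, 20, 30]
typical_face = [12, 18, 30]

print(caricature(typical_face, average_face))  # prints "[14.0, 16.0, 30.0]"
```

Features that already differ from the average are pushed further away from it, while features equal to the average stay unchanged.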

Newborn babies pay more attention to more face-like objects (upper row) than to non-face-like ones

Faces and antifaces: if the face within the green circle is observed for some time, the center one will not be recognized correctly, but as the one in the red circle (more distance from the center means more differences)
This means that there is some kind of prototype encoding and tuning to it

Impact of skin pigmentation
Row 1: Faces differ only in shape
Row 2: Faces differ only in skin pigmentation but not shape
Row 3: Faces differ in shape and pigmentation
We see that pigmentation has a significant impact (row 2)

Color helps:
Left: original
Middle: black and white
Right: color only; eyes can be located more precisely

From a negative picture it is impossible to identify faces

Face recognition strongly compensates for the direction of illumination; the pictures above are easily recognized as the same person

Response of a neural cell of a monkey in the face processing area of the brain. The response to something like a face is much stronger than to a hand. (But remember that millions and millions of cells are processing at the same time)

Measurement from the human brain: the signal from face-like pictures is much stronger than from other objects

The examples shown for faces indicate how sophisticated information processing in biological systems is.

What is most amazing is getting correct results despite extreme distortions. For the most part, we do not know how this is done, and we have difficulty thinking about how to develop algorithms which would have similar capabilities. This is a topic for future studies