MMSP (Multimedia Signal Processing), Irek Defée: Basic Problems in Processing Media Information
Post on 21-Dec-2015
Kinect – new media interface
• Before we proceed, we mention an important development in media interfaces.
• This is a device and system called Kinect, made by Microsoft. Kinect has been available as a product since the beginning of November 2010.
• Kinect is part of the Microsoft Xbox game platform, but it can be bought separately!
MMSP Irek Defée
What is Kinect?
• Kinect is a new type of hardware for interacting with people - with proper software support of course
• Kinect looks like this
What is inside Kinect?
Inside there is hardware worth about 40 euro, working according to the following schematic, plus software which extracts signals and sends them to the Xbox for processing. Processing takes about 5% of the Xbox's power (Xenon processor).
How does Kinect work?
Kinect has FOUR microphones to capture spatial sound, attenuate noise and interference, and compensate for room acoustics.
Kinect has a small color camera with 640x480 resolution.
Most advanced aspect
Kinect "eyes"
The eyes of Kinect are formed by an INFRARED MEASUREMENT SYSTEM: a laser beam is sent from the objective and received by a sensor, as can be seen above. These sensors can move to adjust for distance and height. This device produces a MAP OF DEPTH to objects in a room.
The device can thus 'see' in bad light or in darkness. Before use it is TRAINED with movements of persons in the room. You can see on the right that in infrared the beam makes lots of measurement dots.
What does Kinect do?
Kinect recognizes voice IN ROOMS and can be used for voice control of applications.
Kinect recognizes persons and body movements, which is used in applications.
But before this, Kinect is TRAINED interactively, as shown in the pictures.
After the training, persons and body movements will be recognized. More than one person can be identified in a scene.
Why is Kinect revolutionary?
• It is the first practical natural interface for machines communicating with people
• It works in normal rooms
• It combines the acoustical and visual senses
• It recognizes full body movements, even complicated ones
• It recognizes persons
• It works well; it is not perfect, but one can predict there will be much more in the future
Kinect applications
Games and interactive playing (sports, dancing)
More applications: exercising, rehabilitation, child development
Control of devices by voice, gestures
Automation, robotics
More… we do not know yet… but the public drivers are partially available
Back to the lectures
• We continue with the overview of biological systems and principles of sensory information processing, and finish it with some conclusions
FROM PREVIOUS LECTURES WE KNOW
THAT MULTIMEDIA INFORMATION
PROCESSING IS EXCELLENTLY DONE BY
THE HUMAN INFORMATION PROCESSING
SYSTEM
• OUR PROBLEM IS:
Biological systems process audiovisual information using special 'hardware' (which could be called 'wetware') and 'software', that is, algorithms.
The question is: can we process audiovisual information using different hardware and software? Maybe the algorithms could be similar?
Let us take visual processing as an example
IN THE HUMAN VISUAL SYSTEM, PROCESSING STARTS IMMEDIATELY IN THE RETINA, AND THERE ARE COLOR PROCESSING AND BLACK AND WHITE LIGHT ACQUISITION AND PROCESSING SYSTEMS
FROM THE COLOR AND BLACK & WHITE RECEPTORS, SIGNALS GO TO INITIAL PROCESSING ELEMENTS
IT IS IMPORTANT TO NOTICE THAT THE NUMBER OF COLOR PROCESSING ELEMENTS IS MUCH LOWER THAN THE NUMBER OF BLACK AND WHITE ONES
OUTPUT LINKS
• WHAT DO THESE PROCESSING ELEMENTS DO?
THE MOST RECENT MEASUREMENTS OF RETINAL NEURAL CELLS SHOW THAT THEIR RECEPTIVE FIELDS ARE QUITE IRREGULAR
THE FOLLOWING PAGES GIVE SOME INFORMATION ABOUT WHAT THESE CELLS ARE DOING
• A BAR OF LIGHT IS MOVED OVER THE PHOTORECEPTORS IN DIFFERENT DIRECTIONS
THE OUTPUT OF THE PHOTORECEPTORS IS SUMMED WITH POSITIVE SIGN (EXCITATION) OR NEGATIVE SIGN (INHIBITION)
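The signed summation of photoreceptor outputs can be sketched in a few lines. This is only an illustration under stated assumptions: the one-dimensional row of six receptors, the weight values and the helper names `cell_response` and `bar_stimulus` are hypothetical and do not come from the measurements on the slides.

```python
def cell_response(receptor_outputs, weights):
    """Signed weighted sum: positive weights excite, negative weights inhibit."""
    if len(receptor_outputs) != len(weights):
        raise ValueError("one weight per photoreceptor is required")
    return sum(w * x for w, x in zip(weights, receptor_outputs))


# Assumed receptive field: excitatory centre flanked by an inhibitory surround.
WEIGHTS = [-1.0, -1.0, 2.0, 2.0, -1.0, -1.0]


def bar_stimulus(position, width=2, n=6):
    """Light intensities (0 or 1) for a bright bar starting at `position`."""
    return [1.0 if position <= i < position + width else 0.0 for i in range(n)]


# Sweep the bar across the receptors and record the response at each step.
responses = [cell_response(bar_stimulus(p), WEIGHTS) for p in range(5)]
print(responses)
```

With these assumed weights the response is largest when the bar covers the excitatory centre and becomes negative when it covers only the inhibitory flanks.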
• HERE THE MEASURED SIGNALS ARE SHOWN FOR CELLS WHICH REACT STRONGLY TO A WHITE BAR ON A BLACK BACKGROUND (on) AND TO THE OPPOSITE (off)
• WE CAN SEE THAT INITIAL PROCESSING IN THE EYE INCLUDES
DETECTION OF DIRECTIONAL CHANGES IN LIGHT INTENSITY
THIS MIGHT BE DONE FOR DIFFERENT
COLORS TOO
WE CAN NOW ASK THE FOLLOWING QUESTIONS:
WHY IS THE PROCESSING ORGANISED IN THIS WAY? AS AN ANSWER, WE CAN SUPPOSE THAT THE PROCESSING IS OPTIMISED IN SOME WAY.
WHAT MIGHT THE OPTIMISATION CRITERIA BE?
WHAT ARE THE GENERAL PRINCIPLES OF HUMAN/BIOLOGICAL INFORMATION PROCESSING?
• WHY DO WE SEE THREE SQUARES HERE AND NOT CUT-OUT SQUARES?
NOTE THAT ONLY ONE SQUARE IS FULLY VISIBLE; THE OTHERS ARE COVERED, AND IN FACT THEY MAY NOT BE SQUARES
THIS IS BECAUSE THE VISUAL SYSTEM PRODUCES THE INTERPRETATION WHICH IS MOST PLAUSIBLE (GENERIC)
BUT IT MAY BE WRONG TOO, ALTHOUGH WE WOULD BE SURPRISED IF IT REALLY WERE!
• THE INTERPRETATION PRODUCED AIMS AT DETECTING THE MOST PROBABLE OBJECTS
THE UPPER FIGURE IS DETECTED AS AN ARCH OVERLAID ON THE SAWTOOTH
THIS IS THE MOST PROBABLE INTERPRETATION
THE INTERPRETATION OF THE BOTTOM FIGURE IS SURPRISING, BUT IT COULD ALSO BE PRODUCED IF THERE WERE MORE EVIDENCE
• THE VISUAL SYSTEM ASSUMES THAT LIGHT IS COMING FROM THE TOP
LIGHT DIRECTION
SAME PICTURE UPSIDE DOWN
• The statistics-based system normally works almost perfectly. As we have seen, it sometimes fails when the input signals are highly improbable and/or when the most probable interpretation is not correct.
This can be seen in visual illusions.
We will look at them more closely, since a recent statistical approach explains them. This gives us a hint about what kind of processing is done.
Principles we can identify now:
• Statistical processing matched to real-world signal statistics: it provides responses to the most probable signals. This is a very natural principle.
• Minimization of the information processed: as much information as possible is eliminated, and only the minimum information needed to produce a response is used. This principle minimizes energy and processing effort.
• The authors are visual psychologists; they consider vision as a system interpreting the world from images projected onto the eye:
Light from an external source bounces off objects and is projected onto the eye. This projection is not unique (e.g. objects of different sizes can have the same projection, depending on their distance).
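The ambiguity of the projection can be illustrated with the standard visual-angle formula. This is a minimal sketch: the function name `angular_size_deg` and the two example objects are hypothetical, chosen only so that a small near object and a large far object subtend the same angle.

```python
import math


def angular_size_deg(object_size_m, distance_m):
    """Visual angle (degrees) subtended by an object seen face-on."""
    return math.degrees(2.0 * math.atan(object_size_m / (2.0 * distance_m)))


# Hypothetical objects: a 0.5 m object at 2 m and a 2 m object at 8 m.
small_near = angular_size_deg(0.5, 2.0)
large_far = angular_size_deg(2.0, 8.0)

print(f"small, near: {small_near:.2f} deg")
print(f"large, far:  {large_far:.2f} deg")
```

Both objects have the same size-to-distance ratio, so the projection alone cannot tell them apart; additional knowledge (such as statistics of real scenes) is needed to choose an interpretation.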
• In visual illusions, the projection gives rise to an improper interpretation
Natural scene, illusion persists
Stimuli change, illusion persists
This picture gives a strong impression of depth because of the combination of many mutually consistent cues:
- perspective
- texture gradient
- shading and shadow
• Geometry of natural scenes
Geometrical illusions represent a wrong interpretation of the real world. To find out why, researchers took pictures together with depth maps
Laser range scanner for measuring distance
Real pictures with corresponding distances marked by colors
• If a large number of such pictures is taken, a database can be created in which real-world objects are matched with distances, and statistics can be calculated.
Example: subjective metrics
Let's think about lines of different lengths which are seen in the real world. If all lengths had the same probability, there would be a linear relation between length and stimulation. But if this is not the case, some lengths will be stimulated more often. This can lead to distortions in perception.
• Example: Line length illusion
Variation of apparent length as a function of orientation
In experiments, people report that the length changes depending on the angle
• Why is it so? Let's sample lines in pictures from the database
Grid of templates to overlay on a picture with straight lines
White: accepted lines, Black: rejected lines
The points in the picture were compared with the laser range measurements to see if they correspond to lines in the real world. A total of 1.2x10^7 line segments were collected
Probability distribution of lines vs. length for different orientations
Cumulative distribution (lines shorter than x): this shows how many lines at a certain orientation corresponded to real lines of length shorter than or equal to x
• Prediction of apparent length based on probability
Take e.g. lines of length 7 at orientation 20 deg: their cumulative probability is 0.15, which means that 15% of lines are shorter than 7 pixels and 85% are longer. For all orientations we get this plot
This is very similar to the one measured in experiments with people!
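Reading a cumulative probability off a sample of line lengths can be sketched as follows. The sample is made up for illustration (it is NOT the database of 1.2x10^7 segments), and the helper name `cumulative_probability` is hypothetical; the values were chosen so that the result reproduces the 0.15 quoted for length 7 at 20 deg.

```python
from bisect import bisect_right


def cumulative_probability(lengths, x):
    """Fraction of sampled line lengths that are shorter than or equal to x."""
    ordered = sorted(lengths)
    return bisect_right(ordered, x) / len(ordered)


# Made-up line lengths (pixels) at one orientation; 3 of the 20 values are <= 7.
lengths_at_20_deg = [3, 5, 7, 8, 9, 10, 12, 13, 15, 16,
                     18, 20, 22, 25, 28, 30, 35, 40, 50, 60]

rank = cumulative_probability(lengths_at_20_deg, 7)
print(f"cumulative probability of length 7: {rank:.2f}")
```

Repeating the same lookup with a separate sample for every orientation would give one cumulative value per orientation, which is the kind of curve compared with the human measurements.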
• Why do such biases exist?
In nature, lines do not appear often. Horizontal lines are typically generated from horizontal flat surfaces
Vertical lines are limited by gravity and are therefore rare, and lines at 20-30 deg are even rarer; they are mostly projected from perspective
• Visual illusions: Angles
All angles in this picture are 90 deg, but when they are projected onto the eye, the projections may differ by up to 60 deg
A) Bias in angle estimation between two lines
B, C, D) Angle illusions
• To explain this, a database of angles is made, as before
Extraction of angles
Probability distributions for different types of angles (bottom line) in natural scenes and in scenes with human-created objects
We can see a bias: angles close to 90 deg are less likely to occur
• Bias and illusions
Angles close to 90 deg are more likely to come from a planar surface, which is typically larger than the surface producing lines intersecting at smaller angles. Thus 90 deg angles are less likely
The probability distribution of angles is not linear; the cumulative probability is biased
Thus the predicted perceived angle is different from the actual one; for 90 deg it is the same
The magnitude of angle misperception (lines) vs. experimentally measured values
• Explanation of angle illusions
Why does the vertical line appear tilted? We take a reference line at 60 deg (black) and check the probability of occurrence of physical sources of a second line oriented at different angles. Since the angle between the lines is 30 deg, we look at the probability for 30 deg and then at the cumulative probability (previous page), which gives the value 0.184. Multiplied by 180, this gives an angle of 33.2 deg, in agreement with measurements
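The arithmetic of this prediction can be checked with a short sketch. The function name `predicted_angle_deg` is hypothetical; the value 0.184 and the factor 180 are taken from the slide. With the rounded value 0.184 the product is 33.12 deg; the 33.2 deg quoted on the slide presumably comes from an unrounded cumulative probability.

```python
def predicted_angle_deg(cumulative_probability, full_range_deg=180.0):
    """Map a cumulative probability (0..1) onto the 0..180 deg range of angles."""
    if not 0.0 <= cumulative_probability <= 1.0:
        raise ValueError("cumulative probability must lie in [0, 1]")
    return cumulative_probability * full_range_deg


actual_angle = 30.0                     # angle between the two lines (from the slide)
perceived = predicted_angle_deg(0.184)  # 0.184 is the value read from the slide
overestimate = perceived - actual_angle

print(f"predicted perceived angle: {perceived:.2f} deg")
print(f"overestimate: {overestimate:.2f} deg")
```

The predicted angle is larger than the physical 30 deg, which is the direction of the misperception described in the illusion.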
• Size illusion
According to the previous explanations, the reason for this illusion is:
The probability distributions of the possible sources of the targets differ, given their different contexts
To check this hypothesis, the database was searched for circular objects, and the probabilities of the sources of targets in each context were calculated:
Various size illusions of center and surrounding
Experimental conditions
a) The inner circle is surrounded by 4 circles with changing diameters
b) Probability of occurrence of a center circle of a specific size for outer circles with different diameters. The dashed line shows the probability for a circle with 14 pixels diameter. (Bigger surrounding circles are much less likely to appear)
c) Cumulative probability for the 14 pixel circle
d) Examples of scenes with large circles and small circles
Why are there statistical differences? Circles originate from planar projections; larger circles are less likely.
Why does the presence of surrounding circles change the occurrence of target central circles differently? Larger circles arise from larger planes in the world, which are flat areas, so it is more probable that the central circle will be larger. In other words, the presence of larger surrounding circles increases the probability of occurrence of physical sources of larger central circles. As a result, the probability distribution of central circles changes according to the size of the surrounding circles.
• Changing the interval between center and surrounding circles
Probabilities when the distance is changing. The dashed line is for a circle of size 14
Cumulative probability for the 14 pixel circle
• Comparison of inner circle with single circle
a) Probability distribution of a single circle vs. diameter
b) Probability for a single circle superimposed with the probability of a central circle surrounded by an outer circle. The dashed line is for a 24 pixel circle, the probability curve is for an outer circle of 32 pixel diameter; the cumulative probability is much higher, so there is a bias
c) When the outer circle is much bigger, the cumulative probability is smaller
The changing cumulative probability ratios and their dependence on the central and outer circle sizes are clearly seen, and the illusion depends on these parameters in exactly the same way
• Distance illusions
a) When objects are close, the perceived distance is overestimated relative to the physical one
b) Objects which are close to each other are perceived as being at the same distance
c) The distance to close objects is overestimated, the distance to far objects is underestimated
d) Objects on the ground at a distance of about 7 m appear closer, and with increasing distance they appear more elevated
According to the methodology, the probability distribution of distances is measured, but there are several variables here:
Probability of all distances from the scanner
Probability of the differences in distance between objects for three different horizontal angles
Probability of horizontal distances at different heights with respect to eye level
• Interpretation of these probabilities
a) The curve for all distances has a strong peak at a distance of 3 m. This agrees with experiments in which people seeing single objects hanging in a completely dark scene report them as being at a distance of 2-4 m
b) When the angular separation between objects is small, they tend to be seen at equal distance, but this tendency decreases as the angle increases
c) The dependency of the probability of distance on eye level has a peak at a distance of 4 m. Thus the distance to objects closer than 4 m will be overestimated and the distance to those farther than 4 m will be underestimated. This agrees with experiments
• The size illusion
The size illusion does not depend on the particular type of endings
It can be induced even without a line
and even (but less strongly) with dots
Why does this happen?
Again, for an explanation the database is searched for such patterns and probabilities are calculated. Here we consider the case when both figures are in line, on the left/right
Templates used
Templates overlaid on pictures
• Results of probability calculations
a) Probability of lines with a specific length and arrows pointing inwards and outwards
b) Cumulative probabilities
c) Superimposed cumulative probabilities showing the differences
d) Example of two lines of length 50 pixels. One can see that the cumulative probability for outward arrows is higher, which corresponds to the greater apparent length.
Figures are in line, extending to the left or to the right
• Angle illusion
The line is interrupted by a vertical occluder
It is then perceived as two shifted segments
Why does this happen?
Again, statistics of such patterns are calculated from the database of pictures
• Templates for calculation
a) The templates: for each red line there is one template corresponding to the shift
b) The templates are matched in the pictures and statistics can be calculated
c) Other templates can be used for different configurations of this illusion
d) Definition of the difference in location of the line segments
• Probability distributions measured
We can see peaks at nonzero shift
So the most probable interpretation from these statistics is that there is a nonzero shift
• One can also study the effect of the angle of the line and the width of the distractor
Change of line orientation
Change of the width of the distractor
As can be seen, when they are larger, the peak moves towards greater shifts, which implies that the illusion will be stronger, and it really is so
• The processing of information in biological systems is statistical: it aims at producing the MOST PROBABLE response to the signals coming from the real world. This type of processing must be based on statistics of signals and models from the real world. The result of processing is the most plausible answer for "normal conditions" and assumptions. We have seen this in the earlier examples, and they are repeated next.
CONCLUSION
• Statistics-based processing seems to be very strong in explaining visual illusions (many of them in the same way)
The principle of statistical processing is powerful: the system collects information about the most likely distribution of signals and provides the most probable interpretations for them. This works in most cases. It fails only when signals are very atypical, but this is rare.
BUT….
• We have to remember that biological systems are able to deal with extreme variations of signals and still extract the right information from them. This will now be illustrated by the example of face recognition
Faces can be distorted in many ways and still be recognized. We can guess something about the PRINCIPLES OF FACE PROCESSING
We can recognize FAMILIAR faces from extremely low resolution pictures.
How is this done? We do not have a clear idea, but it points to the minimization of processed information
A face is processed somehow as a "whole" and not as composed of parts. In the combined picture on the left we see a new face; when we split it we recognize other faces
Faces seem to be encoded in memory in an exaggerated, caricature-like way:
A) Average face (averaged from a number of persons)
B) Some typical face
C) Face created by taking a big deviation from the average
Such faces are recognized even better than typical ones
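One simple way to picture "a big deviation from the average" is to scale a face's difference from the average face. This is a sketch under assumptions, not the method used to produce the pictures: the linear scaling, the function name `caricature` and the three-number feature vectors are all hypothetical.

```python
def caricature(face, average_face, exaggeration=1.5):
    """Push a face further away from the average face along its own deviation."""
    if len(face) != len(average_face):
        raise ValueError("face and average must have the same number of features")
    return [a + exaggeration * (f - a) for f, a in zip(face, average_face)]


# Hypothetical feature vectors (e.g. normalised landmark coordinates).
faces = [[0.40, 0.60, 0.50], [0.60, 0.40, 0.70], [0.50, 0.50, 0.60]]

# Average face: feature-wise mean over all faces.
average = [sum(column) / len(faces) for column in zip(*faces)]

# Exaggeration 1.0 reproduces the face; values above 1.0 give a caricature.
exaggerated = caricature(faces[0], average, exaggeration=2.0)
print([round(v, 2) for v in exaggerated])
```

In this sketch every feature that differs from the average is moved twice as far from it, so the distinctive features of the face become more pronounced.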
Newborn babies pay more attention to more face-like objects (upper row) than to non-face-like ones
Faces and antifaces: if the face within the green circle is observed for some time, the center one will not be recognized correctly but as the one in the red circle (greater distance from the center means more differences). This means that there is some kind of prototype encoding and tuning to it
Impact of skin pigmentation
Row 1: Faces differ only in shape
Row 2: Faces differ only in skin pigmentation but not shape
Row 3: Faces differ in shape and pigmentation
We see that pigmentation has a significant impact (row 2)
Color helps:
Left: original
Middle: black and white
Right: color only; eyes can be located more precisely
Face recognition strongly compensates for the direction of illumination; the pictures above are easily recognized as the same person
Response of a neural cell of a monkey in the face processing area of the brain. The response to something like a face is much stronger than to a hand. (But remember that millions and millions of cells are processing at the same time)
Measurement from the human brain: the signal from face-like pictures is much stronger than from other objects
The examples shown for faces indicate how sophisticated information processing in biological systems is.
What is most amazing is getting correct results despite extreme distortions. For the most part, we do not know how this is done, and we have difficulty thinking of how to develop algorithms with similar capabilities. This is a topic for future studies