Post on 19-Dec-2015
CS256 Intelligent Systems-Vision Systems
Module Overview
Timetable
Week (mode)  Topic
1 (2L)       Introduction to the module and vision systems
2 (2L)       Case studies and basic concepts
3 (2L)       Java and Image Fundamentals
4 (LP)       Feature Extraction and Image Transforms
5 (LP)       Edge Detection and Segmentation
6 (LP)       Colour and Texture
7 (LP)       Recover 3D information
8 (LP)       System Architecture
9 (LP)       Knowledge and Reasoning
10 (2L)      Image Classification and Retrieval (including revision)
Coursework
• Develop a system that is able to identify key features in selected images.
• Write a report describing the design, implementation and evaluation of the system. See the separate coursework assignment document for details.
• Questions will be asked during lab sessions
• Deadline: Monday 18th April, 2005
Assessment
• Examination
– 60%
– three questions from four
• Coursework
– 40%
– Report based on experiments
Recommended Texts
• Nick Efford (2000), Digital Image Processing: A Practical Introduction Using Java, Addison Wesley, ISBN 0201596237
• Tim Morris (2004), Computer Vision and Image Processing, Palgrave MacMillan, ISBN 0333994515
• Patrick H Winston (1992), Artificial Intelligence (Third Edition), Addison Wesley Publishers Co., ISBN 0201533774
• Rob Callan (2003), Artificial Intelligence, Palgrave MacMillan, ISBN 0333801369
• Paul F Whelan and Derek Molloy (2001), Machine Vision Algorithms in Java: Techniques and Implementation, Springer, ISBN 1852332182
Objectives of the module
• Understand the fundamentals of machine intelligence
– Focus on vision systems, but will relate to other domains
• Understand the components of vision systems
– Be familiar with common operations for processing images
– Be able to implement simple image processing operations
• Evaluate a vision system
• Additionally: encourage students to practise more basic and advanced Java programming
Intelligence and Perception
• First understand how we perceive the world, then teach the machine to interpret the world from the primitive data it receives
• Human Perceptual Modalities
– Tactile (touch)
– Gustatory (taste)
– Visual (sight)
– Auditory (hearing)
– Olfactory (smell)
Intelligent Systems
• Intelligent robots and intelligent machines
– With artificial intelligence principles
– reason about the world and take appropriate actions by manipulating knowledge
– sense the world directly
• Vision: computational perception
– a diverse and interdisciplinary body of knowledge and techniques
– to understand the principles behind the processes that interpret perceptual signals provided by various sensors
Intelligent Systems
• In vision, the software's job is to process the input from the hardware or sensors
• Humans have natural abilities to speak, see, think, smell, sense and so on. Machines have no such inborn abilities; they have only simple engines that follow logical algorithms.
• Giving a computer comparable abilities, such as speech and vision, is closely related to building a knowledge system, but it also requires combining a simulation of the perceptual process with knowledge
Intelligent Systems
• Integrate different levels of processing for bridging different gaps – sensors, raw data, low level processing, high level processing and knowledge, for building a complete intelligent system
• Reflected in this module structure
Figure 5-10: image B95-00016-01.3.S1.X5.4.jpg (above) and its annotation window generated in the I-Browse system
Applications
• Classical
– robot
– medical imaging
– remote sensing
– astronomy
• Today
– DTV
– image interpretation
– biometry
– GIS (Earth/planetary observation, monitoring, exploration)
– human genome project
– creative media and art, entertainment
Sample applications - Biometry
• Using personal characteristics to identify a person
– fingerprints
– face
– iris
– DNA
– gait
– etc
Iris Scan
• Striations on iris are individually unique
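• Obvious applications
– security
– PIN
[Figure: processing steps for an iris scan; a brace in the figure is labelled "fixed number of samples"]
• Processing steps shown in the figure
– Locate the eye in the head image
– Radial resampling of iris
– Numerical description
– Analysis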
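The radial resampling step can be sketched in Java as follows. The slide names the step only, so everything else here is an assumption made for illustration: the class and method names, the circular pupil and iris boundaries sharing one centre, the evenly spaced radii and angles, and the nearest-neighbour lookup. The point it shows is that the output always has the same fixed number of samples, whatever the size of the iris in the image.

```java
/** Hypothetical helper: unwraps a ring-shaped iris region into a fixed-size grid. */
final class IrisResampler {

    /**
     * Samples the ring between innerRadius and outerRadius around (cx, cy).
     * grey is indexed grey[y][x]; the result is indexed [radiusIndex][angleIndex].
     * Assumption: both boundaries are circles with the same centre.
     */
    static int[][] resample(int[][] grey, double cx, double cy,
                            double innerRadius, double outerRadius,
                            int radialSamples, int angularSamples) {
        int height = grey.length;
        int width = grey[0].length;
        int[][] out = new int[radialSamples][angularSamples];

        for (int i = 0; i < radialSamples; i++) {
            // Radii are spread evenly from the inner to the outer boundary.
            double t = radialSamples == 1 ? 0.0 : (double) i / (radialSamples - 1);
            double r = innerRadius + t * (outerRadius - innerRadius);

            for (int j = 0; j < angularSamples; j++) {
                // Angles are spread evenly over the full circle.
                double theta = 2.0 * Math.PI * j / angularSamples;
                int x = (int) Math.round(cx + r * Math.cos(theta));
                int y = (int) Math.round(cy + r * Math.sin(theta));

                // Clamp to the image so samples near the border stay valid.
                x = Math.max(0, Math.min(width - 1, x));
                y = Math.max(0, Math.min(height - 1, y));

                out[i][j] = grey[y][x]; // nearest-neighbour lookup
            }
        }
        return out;
    }
}
```

Two eyes photographed at different distances would both yield a grid of radialSamples x angularSamples values, which is what makes a later numerical comparison straightforward.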
Image Representation
[Figure: pixel grid with axes x and y, index labels 1, m and n, and pixel value f(x,y)]
An array F: a digital image consists of an array of m x n pixels, where the pixel in the xth column and the yth row has an intensity equal to f(x,y).
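A minimal Java sketch of the array F described above. The class name GreyImage and its methods are illustrative, not taken from the module materials. Two assumptions are made: m is taken as the number of columns and n as the number of rows, and indices are 0-based as is usual in Java, with intensities in the range 0 to 255.

```java
/** Illustrative grey-level image: an array F whose entry f(x,y) is an intensity. */
final class GreyImage {
    private final int width;   // assumed to be m, the number of columns
    private final int height;  // assumed to be n, the number of rows
    private final int[][] f;   // stored row by row, so f[y][x]

    GreyImage(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        this.width = width;
        this.height = height;
        this.f = new int[height][width];
    }

    int getWidth() {
        return width;
    }

    int getHeight() {
        return height;
    }

    /** Intensity f(x,y) at column x and row y (both 0-based). */
    int get(int x, int y) {
        return f[y][x];
    }

    /** Sets f(x,y); values outside 0..255 are rejected. */
    void set(int x, int y, int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("intensity must be in 0..255");
        }
        f[y][x] = value;
    }
}
```

Storing the data as f[y][x] keeps each image row contiguous, while the get and set methods preserve the (x, y) argument order used in the definition.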
Colour image and video sequence
• Colour can be conveyed by combining different colours of light, using three components (red, green and blue): R = r(x,y); G = g(x,y); B = b(x,y), where R, G, B are defined in a similar way to F.
• The vector (r(x,y), g(x,y), b(x,y)) defines the intensity and colour at the point (x,y) in the colour image.
• A video sequence is, in effect, a time-sampled representation of the original moving scene.
• Each frame in the sequence is a standard colour or monochrome image and can be coded as such.
• A monochrome video sequence may be represented digitally as a sequence of 2-D arrays [F1, F2, F3, ..., FN].
Java example for image representation:
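A minimal sketch of how the colour vector (r(x,y), g(x,y), b(x,y)) and a monochrome sequence [F1, ..., FN] might be handled with the standard java.awt.image.BufferedImage class. The class name ImageRepresentationDemo and its methods are illustrative. The grey level is computed as the unweighted mean of the three components, which is an assumption; other weightings are common.

```java
import java.awt.image.BufferedImage;

/** Illustrative helpers for colour images and monochrome video sequences. */
final class ImageRepresentationDemo {

    /** Returns the vector (r, g, b) at point (x, y); each component is 0..255. */
    static int[] rgbAt(BufferedImage image, int x, int y) {
        int packed = image.getRGB(x, y);   // packed as 0xAARRGGBB
        int r = (packed >> 16) & 0xFF;
        int g = (packed >> 8) & 0xFF;
        int b = packed & 0xFF;
        return new int[] { r, g, b };
    }

    /** Converts a colour image to a grey-level array F, indexed f[y][x]. */
    static int[][] toGrey(BufferedImage image) {
        int[][] f = new int[image.getHeight()][image.getWidth()];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int[] c = rgbAt(image, x, y);
                // Assumption: unweighted mean of red, green and blue.
                f[y][x] = (c[0] + c[1] + c[2]) / 3;
            }
        }
        return f;
    }

    /** Represents a monochrome video sequence [F1, ..., FN] as frames[t][y][x]. */
    static int[][][] toGreySequence(BufferedImage[] frames) {
        int[][][] sequence = new int[frames.length][][];
        for (int t = 0; t < frames.length; t++) {
            sequence[t] = toGrey(frames[t]);
        }
        return sequence;
    }
}
```

Here each frame is coded as an ordinary image, and the sequence is simply an array of those frames in time order.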
The Difficulty in Vision Computing – Taking the Human Visual System for Granted
• The processing capability of human visual systems is often taken for granted
• The subtlety and difficulty of describing the exact operation of the subconscious functions presents significant difficulty in developing algorithms to emulate human visual behaviour
• If we are computer…
Difficulties in vision computing - The sensory gap
• The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene.
• disambiguation processing
Difficulties in vision computing - The semantic gap
• The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation. (Arnold, 2000)
• The higher the level of interpretation, the more domain knowledge, and management of that knowledge, is required.