cs256 intelligent systems -vision systems module overview

CS256 Intelligent Systems-Vision Systems

Module Overview

Timetable

Week (mode)   Topic
1 (2L)        Introduction to the module and vision systems
2 (2L)        Case studies and basic concepts
3 (2L)        Java and Image Fundamentals
4 (LP)        Feature Extraction and Image Transforms
5 (LP)        Edge Detection and Segmentation
6 (LP)        Colour and Texture
7 (LP)        Recover 3D Information
8 (LP)        System Architecture
9 (LP)        Knowledge and Reasoning
10 (2L)       Image Classification and Retrieval (including revision)

Coursework

• Develop a system that is able to identify key features in selected images.

• Write a report describing the design, implementation and evaluation of the system. See the separate coursework assignment document for details.

• Questions will be asked during lab sessions

• Deadline: Monday 18th April, 2005

Assessment

• Examination
  – 60%
  – three questions from four

• Coursework
  – 40%
  – Report based on experiments

Recommended Texts

• Nick Efford (2000), Digital Image Processing: A Practical Introduction Using Java, Addison Wesley, ISBN 0201596237

• Tim Morris (2004), Computer Vision and Image Processing, Palgrave MacMillan, ISBN 0333994515

• Patrick H Winston, (1992), Artificial Intelligence (Third Edition), Addison Wesley Publishers Co. ISBN 0201533774

• Rob Callan (2003), Artificial Intelligence, Palgrave MacMillan, ISBN 0333801369

• Paul F Whelan and Derek Molloy (2001), Machine Vision Algorithms in Java: Techniques and Implementation, Springer, ISBN 1852332182

Objectives of the module

• Understand the fundamentals of machine intelligence
  – Focus on vision systems, but will relate to other domains

• Understand the components of vision systems
  – Be familiar with common operations for processing images
  – Be able to implement simple image processing operations

• Evaluate a vision system

• Additionally: encourage students to practise more basic and advanced Java programming

Intelligence and Perception

• First understand how we perceive the world, then teach the machine to interpret the world from the primitive data it receives

• Human Perceptual Modalities
  – Tactile – touch
  – Gustatory – taste
  – Visual – sight
  – Auditory – hearing
  – Olfactory – smell

Intelligent Systems

• Intelligent robots and intelligent machines
  – with artificial intelligence principles
  – reason about the world and take appropriate actions by manipulating knowledge
  – sense the world directly

• Vision - computational perception
  – a diverse and interdisciplinary body of knowledge and techniques
  – to understand the principles behind the processes that interpret perceptual signals provided by various sensors

Intelligent Systems

• In vision, the software’s job is to process the input from the hardware or sensors

• Humans have natural abilities to speak, see, think, smell, sense and so on. Machines have no such inborn abilities; they only have simple engines that follow logical algorithms.

• Giving a computer similar abilities, such as speech and vision, is closely related to building a knowledge system, but it also involves combining a simulation of the perception process with knowledge.

Intelligent Systems

• Integrate different levels of processing for bridging different gaps – sensors, raw data, low level processing, high level processing and knowledge, for building a complete intelligent system

• Reflected in this module structure

Figure 5-10: image B95-00016-01.3.S1.X5.4.jpg (above) and its annotation window generated in the I-Browse system

Applications

• Classical
  – robots
  – medical imaging
  – remote sensing
  – astronomy

• Today
  – DTV
  – image interpretation
  – biometry
  – GIS (Earth/planetary observation, monitoring, exploration)
  – human genome project
  – creative media and art, entertainment

Sample applications - Biometry

• Using personal characteristics to identify a person
  – fingerprints
  – face
  – iris
  – DNA
  – gait
  – etc.

Iris Scan

• Striations on iris are individually unique

• Obvious applications
  – security
  – PIN

• Processing steps
  – Locate the eye in the head image
  – Radial resampling of iris (fixed number of samples)
  – Numerical description
  – Analysis
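The radial resampling step can be sketched in Java as follows. The slides do not say how the resampling is done, so this is only an illustration under stated assumptions: the iris centre and the inner and outer radii are taken as already known from the eye-location step, the image is a plain 2-D array of grey levels, and sampling is nearest-neighbour. The class name IrisResampler and the method resample are hypothetical.

```java
class IrisResampler {
    /**
     * Samples the ring between innerRadius and outerRadius around (cx, cy)
     * on a fixed grid of radialSamples x angularSamples points.
     * Assumption: image[row][column] holds grey levels; lookup is nearest-neighbour.
     */
    static int[][] resample(int[][] image, double cx, double cy,
                            double innerRadius, double outerRadius,
                            int radialSamples, int angularSamples) {
        int[][] out = new int[radialSamples][angularSamples];
        for (int i = 0; i < radialSamples; i++) {
            // Radius at the middle of the i-th ring.
            double r = innerRadius
                    + (outerRadius - innerRadius) * (i + 0.5) / radialSamples;
            for (int j = 0; j < angularSamples; j++) {
                double theta = 2.0 * Math.PI * j / angularSamples;
                int x = (int) Math.round(cx + r * Math.cos(theta));
                int y = (int) Math.round(cy + r * Math.sin(theta));
                // Clamp so that samples near the border stay inside the image.
                x = Math.max(0, Math.min(image[0].length - 1, x));
                y = Math.max(0, Math.min(image.length - 1, y));
                out[i][j] = image[y][x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Synthetic 41 x 41 test image: intensity = rounded distance from the centre.
        int size = 41;
        int[][] image = new int[size][size];
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                image[y][x] = (int) Math.round(Math.hypot(x - 20, y - 20));
            }
        }
        int[][] strip = resample(image, 20, 20, 5, 15, 4, 8);
        System.out.println(strip.length + " x " + strip[0].length); // prints "4 x 8"
    }
}
```

Whatever the size of the eye in the input image, the output always has the same dimensions, which is what makes a fixed-length numerical description possible in the next step.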

Image Representation

[Figure: pixel grid with axes x (1 to n) and y (1 to m); the marked pixel has value f(x,y)]

An array F: a digital image consists of an array of m x n pixels, and the pixel in the xth column and the yth row has an intensity equal to f(x,y).
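A minimal Java sketch of the array F, assuming m rows and n columns, 1-based coordinates as in the definition above, and integer grey levels. The class name GreyImage and its methods are hypothetical and are not taken from the recommended texts.

```java
/** Grey-level image stored as an m-row by n-column array F. */
class GreyImage {
    private final int[][] pixels; // pixels[row][column]; assumes rows, columns >= 1

    GreyImage(int rows, int columns) {
        pixels = new int[rows][columns];
    }

    int rows() { return pixels.length; }

    int columns() { return pixels[0].length; }

    /** f(x, y): intensity of the pixel in the xth column and yth row (1-based). */
    int f(int x, int y) {
        return pixels[y - 1][x - 1];
    }

    /** Sets the intensity of the pixel in the xth column and yth row (1-based). */
    void set(int x, int y, int intensity) {
        pixels[y - 1][x - 1] = intensity;
    }

    public static void main(String[] args) {
        GreyImage image = new GreyImage(3, 4); // m = 3 rows, n = 4 columns
        image.set(4, 2, 200);                  // column 4, row 2
        System.out.println(image.f(4, 2));     // prints 200
    }
}
```

Note that the array is indexed row first, so f(x, y) reads pixels[y - 1][x - 1]; swapping the two indices is a common source of bugs.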


Colour image and video sequence

• Colour can be conveyed by combining light of different colours, using three components (red, green and blue): R = r(x,y); G = g(x,y); B = b(x,y), where R, G and B are arrays defined in the same way as F.

• The vector (r(x,y), g(x,y), b(x,y)) defines the intensity and colour at the point (x,y) in the colour image.

• A video sequence is, in effect, a time-sampled representation of the original moving scene.

• Each frame in the sequence is a standard colour, or monochrome image and can be coded as such.

• A monochrome video sequence may be represented digitally as a sequence of 2-D arrays [F1, F2, F3, ..., FN].

Java example for image representation:
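A minimal sketch of how the colour vector and a video sequence might be represented, assuming the standard java.awt.image.BufferedImage class with 0-based pixel coordinates. The class name ColourPixels and the helpers rgbAt and greySequence are hypothetical names chosen for illustration.

```java
import java.awt.image.BufferedImage;

class ColourPixels {
    /** Returns the vector {r(x,y), g(x,y), b(x,y)} for a 0-based pixel position. */
    static int[] rgbAt(BufferedImage image, int x, int y) {
        int packed = image.getRGB(x, y); // packed as 0xAARRGGBB
        int r = (packed >> 16) & 0xFF;
        int g = (packed >> 8) & 0xFF;
        int b = packed & 0xFF;
        return new int[] { r, g, b };
    }

    /** A monochrome video sequence [F1, ..., FN] as an array of grey frames. */
    static BufferedImage[] greySequence(int frames, int width, int height) {
        BufferedImage[] sequence = new BufferedImage[frames];
        for (int i = 0; i < frames; i++) {
            sequence[i] = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        }
        return sequence;
    }

    public static void main(String[] args) {
        BufferedImage frame = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        frame.setRGB(2, 1, 0xFF8040); // red = 255, green = 128, blue = 64
        int[] rgb = rgbAt(frame, 2, 1);
        System.out.println(rgb[0] + " " + rgb[1] + " " + rgb[2]); // prints "255 128 64"

        BufferedImage[] video = greySequence(5, 4, 3);
        System.out.println(video.length); // prints 5
    }
}
```

BufferedImage packs the three components of a pixel into a single int, so the shifts and masks in rgbAt recover r, g and b individually.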

The Difficulty in Vision Computing – Taking the Human Visual System for Granted

• The processing capability of human visual systems is often taken for granted

• The subconscious functions of human vision are subtle and hard to describe exactly, which makes it very difficult to develop algorithms that emulate human visual behaviour

• If we were computers…

Difficulties in vision computing - The sensory gap

• The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene.

• disambiguation processing

Difficulties in vision computing - The semantic gap

• The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation. (Arnold, 2000)

• The higher the level of interpretation, the more domain knowledge, and management of that knowledge, is required.
