HAND GESTURE RECOGNITION TECHNIQUES FOR ROBOT
CONTROL
A PROJECT REPORT
Submitted by
AMUTHAN.A (080107122003)
JIBIN GEORGE (080107122037)
MASANI DINESH.K (080107122053)
MATHAN PRASAD.S (080107122054)
In partial fulfillment for the award of the degree
of
BACHELOR OF ENGINEERING
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
SNS COLLEGE OF TECHNOLOGY, COIMBATORE
ANNA UNIVERSITY OF TECHNOLOGY
COIMBATORE - 641047
SEPTEMBER 2011
ANNA UNIVERSITY OF TECHNOLOGY COIMBATORE-641047
BONAFIDE CERTIFICATE
Certified that this project report “HAND GESTURE RECOGNITION TECHNIQUES FOR ROBOT CONTROL” is the bonafide work of “AMUTHAN.A, JIBIN GEORGE, MASANI DINESH.K, MATHAN PRASAD.S”, who carried out the project work under my supervision.
SIGNATURE                                  SIGNATURE
Mr.V.J. ARUL KARTHICK                      Prof.S.ARUMUGAM
SUPERVISOR                                 HEAD OF THE DEPARTMENT
Assistant Professor,                       Professor and Head,
Department of Electronics and              Department of Electronics and
Communication Engineering,                 Communication Engineering,
SNS College of Technology,                 SNS College of Technology,
Coimbatore - 35.                           Coimbatore - 35.
Submitted for the viva-voce examination held on……………….
------------------------- ------------------------- Internal Examiner External Examiner
ACKNOWLEDGEMENT
First of all, we extend our heartfelt gratitude to the management of SNS College of Technology: Dr.V.S.Velusamy, Founder Trustee; Dr.S.Rajalakshmi, Correspondent; and Dr.S.N.Subbramanian, Director cum Secretary, for providing us every support in the completion of this project phase.
We record our indebtedness to our Principal Dr.V.P.Arunachalam, for his guidance and sustained encouragement for the successful completion of this project phase.
We are highly grateful to Prof.S.Arumugam, Professor and Head, Department of Electronics and Communication Engineering, for his valuable suggestions and guidance throughout the course of this project phase. His positive approach had offered incessant help in all possible ways from the beginning.
We also extend our sincere thanks to Mr.P.Rajeswaran, Professor, Department of Electronics and Communication Engineering, for the support and timely assistance that helped this project phase come out with flying colours.
We take immense pleasure in expressing our humble note of gratitude to our project guide MR.V.J.ARUL KARTHICK, Associate Professor, Department of Electronics and Communication Engineering, for his remarkable guidance in the course of completing this project phase. We also extend our thanks to the other faculty members and our friends for their moral support in helping us successfully complete this project phase.
Table of Contents
1. ABSTRACT ........................................ 5
2. INTRODUCTION .................................... 6
   A. Brief Description ............................ 7
   B. Literature Survey ............................ 9
   C. Software Engineering Approach ............... 11
3. PROBLEM DEFINITION ............................. 12
4. DESIGN ......................................... 13
   A. Hand Gesture Recognition .................... 14
   B. Image Database .............................. 15
   C. Image Processing ............................ 16
5. PROPOSED OUTPUT ................................ 18
6. CONCLUSION ..................................... 23
7. FUTURE SCOPE ................................... 24
8. BIBLIOGRAPHY ................................... 25
ABSTRACT
Hand gesture recognition techniques have been studied for more than two decades. Several
solutions have been developed; however, little attention has been paid to human factors, e.g.
the intuitiveness of the applied hand gestures. This study was inspired by the movie Minority
Report, in which a gesture-based interface was presented to a large audience. In the movie, a
video-browsing application was controlled by hand gestures. Nowadays the tracking of hand
movements and the computer recognition of gestures is realizable; however, for a usable system
it is essential to have an intuitive set of gestures. The system functions used in Minority Report
were reverse engineered and a user study was conducted, in which participants were asked to
express these functions by means of hand gestures. We were interested in how people formulate
gestures and whether we could find any pattern in these gestures. In particular, we focused on the
types of gestures in order to study intuitiveness, and on the kinetic features to discover how they
influence computer recognition. We found that there are typical gestures for each function, and
these are not necessarily related to the technology people are used to. This result suggests that an
intuitive set of gestures can be designed, which is not only usable in this specific application, but
can be generalized for other purposes as well. Furthermore, directions are given for computer
recognition of gestures regarding the number of hands used and the dimensions of the space
where the gestures are formulated.
INTRODUCTION
BRIEF DESCRIPTION
Several successful approaches to spatio-temporal signal processing such as speech recognition
and hand gesture recognition have been proposed. Most of them involve time alignment which
requires substantial computation and considerable memory storage. In this report, we present a
hand gesture-based approach to control a robot. This approach employs a method based on
template selection, which gives the patient easy control of the robot's movement; therefore,
considerable memory is saved.
Due to congenital malfunctions, diseases, head injuries, or virus infections, deaf or
non-vocal individuals are unable to communicate with hearing persons through speech. They
use sign language or hand gestures to express themselves; however, most hearing persons do
not have the special sign language expertise. Hand gestures can be classified into two classes:
(1) static hand gestures, which rely only on information about the angles of the fingers, and (2)
dynamic hand gestures, which rely not only on the fingers' flex angles but also on the hand
trajectories and orientations. The dynamic hand gestures can be further divided into two
subclasses. The first subclass consists of hand gestures involving hand movements, and the
second subclass consists of hand gestures involving finger movements without changing
the position of the hands. That is, the latter requires at least two different hand shapes connected
sequentially to form a particular hand gesture. Therefore samples of these hand gestures are
spatio-temporal patterns. The accumulated similarity associated with all samples of the input is
computed for each hand gesture in the vocabulary, and the unknown gesture is classified as the
gesture yielding the highest accumulative similarity. Developing sign language applications for
deaf people can be very important, as many of them, being not able to speak a language, are
also not able to read or write a spoken language. Ideally, a translation system would make it
possible to communicate with deaf people. Compared to speech commands, hand gestures are
advantageous in noisy environments, in situations where speech commands would be
disturbing, as well as for communicating quantitative information and spatial relationships.
A gesture is a form of non-verbal communication made with a part of the body and used instead
of verbal communication (or in combination with it).
Most people use gestures and body language in addition to words when they speak. A sign
language is a language which uses gestures instead of sound to convey meaning, combining
hand-shapes, orientation and movement of the hands, arms or body, facial expressions and lip-patterns.
Similar to automatic speech recognition (ASR), we focus on gesture recognition, which can
later be translated into a certain machine movement. The goal of this project is to develop a
program implementing real-time gesture recognition. At any time, a user can show his hand
making a specific gesture in front of a video camera linked to a computer. However, the user is
not required to be in exactly the same place each time he shows his hand. The program has to
capture pictures of this gesture through the video camera, analyze them and identify the sign. It
has to do this as fast as possible, given that real-time processing is required. To simplify the
project, it was decided that the identification would consist of counting the number of fingers
that are shown by the user in the input picture. We propose a fast algorithm for automatically
recognizing a limited set of gestures from hand images for a robot control application. Hand
gesture recognition is a challenging problem in its general form. We consider a fixed set of
manual commands and a reasonably structured environment, and develop a simple, yet effective,
procedure for gesture recognition. Our approach contains steps for segmenting the hand region,
locating the fingers and finally classifying the gesture. The algorithm is invariant to translation,
rotation, and scale of the hand. We demonstrate the effectiveness of the technique on
real imagery.
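The finger-counting idea described above can be illustrated with a short sketch. This is not the method implemented later in this report (which counts white pixels after background subtraction); it is a hypothetical illustration, written in Python rather than our Matlab, of how fingers could be counted along a single scan line of a binarized hand image:

```python
import numpy as np

def count_fingers(bw_row):
    """Count fingers crossed by one scan line of a binary hand image.
    Each contiguous run of white (1) pixels is taken as one finger.
    A deliberately simplified illustration of the counting idea."""
    row = np.asarray(bw_row).astype(int)
    # a run starts wherever a 0 -> 1 transition occurs (or at a leading 1)
    transitions = np.diff(np.concatenate(([0], row)))
    return int((transitions == 1).sum())
```

A real system would choose the scan line from the segmented hand region and would need several lines (or contours) to be robust; this sketch only shows the core idea of counting white runs.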
LITERATURE SURVEY
OBJECTIVE:
Our objective is to identify requirements (i.e., quality attributes and functional
requirements) for gesture-based recognition. We especially focus on requirements for
research tools that target the domains of visualization for software maintenance, reengineering,
and reverse engineering, mainly in the field of medical engineering.
METHOD:
The requirements are identified with a comprehensive literature survey based on
relevant publications in journals, conference proceedings, and theses. We referred to
documents and journals available on the net for the same. Most of the data has been taken
from the IEEE website. As our library has an online subscription to the IEEE journals, it
provided immense help in locating the resources.
The various journals referred are:
1) Implementation of adaptive feed-forward algorithm, by
Jaroslaw Szewinski, Wojciech Jalmuzna, University of Technology, Institute of
Electronic Systems, Warsaw, Poland.
This deals with the description of the various algorithms used in neural networks, viz.:
• feed-forward (FF)
• feedback (FB)
• adaptive feed-forward (AFF)
2) A Fast Algorithm For Vision-Based Hand Gesture Recognition
For Robot Control, by Asanterabi Malima, Erol Özgür, and Müjdat Çetin, Faculty
of Engineering and Natural Sciences, Sabancı University, Tuzla, İstanbul, Turkey.
The above approach contains steps for segmenting the hand region, locating the fingers, and
finally classifying the gesture. The algorithm is invariant to translation, rotation, and scale of
the hand.
3) A Gesture Controlled Robot for Object Perception and
Manipulation, by Mark Batcher, Institute of Neuroinformatics, Germany.
Gripsee is the name of the robot whose design is discussed in the paper; it is used
to identify an object, grasp it, and move it to a new position. It serves as a
multipurpose robot that can perform a number of tasks and is used as a service robot.
4) Programming-By-Example Gesture Recognition, by Kevin Gabayan,
Steven Lansel.
Machine learning and hardware improvements to a programming-by-example rapid
prototyping system are proposed. This paper deals with the dynamic time warping gesture
recognition approach involving single signal channels.
SOFTWARE ENGINEERING APPROACH
For developing the code and the whole algorithm, it was preferable to use Matlab. Indeed, in
this environment, image displaying, graphical analysis and image processing become simple
enough issues as far as coding is concerned, because Matlab has a huge library of built-in
functions, and the fact that Matlab is optimized for matrix-based computation makes any image
treatment easier, given that any image can be considered as a matrix.
That is why the whole code was first developed in the Matlab environment. Only the
code of the Image Scanning method and of the Weighted Averaging Analysis method is
provided. Indeed, given that the latter is a kind of combination of the Pixel Counting method
and of the Edge Counting method, their respective codes may be extracted from the code of the
Weighted Averaging method.
For the movement of the DC motor of the robot, the program has been written in assembly
language, since it is most suitable and we are well aware of the subject. The IC used is the 8051
microcontroller; hence the code was written and tested in the Keil software.
PROBLEM DEFINITION
The experimental setup consists of a digital camera used to take the images. The camera
is interfaced to a computer. The computer is used to create the database and to analyse the
images, and runs a program prepared in Matlab for the various operations on the images.
Using the Vision and Motion toolbox, analysis of the images is done.
The initial step is to create the database of the images which are used for training and
testing. The image database can have different formats. Images can be hand drawn,
digitized photographs or a 3D model of the hand. Photographs were used, as they are the
most realistic approach. Two operations were carried out on all of the images. They were
converted to grayscale and the background was made uniform.
The images from internet databases already had uniform backgrounds, but the ones taken
with the digital camera had to be processed in Photoshop. The pattern recognition system
that will be used consists of some transformation T, which converts an image into a
feature vector, which will then be compared with the feature vectors of a training set of
gestures.
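The scheme just described, a transformation T producing a feature vector that is compared against a training set, can be sketched as follows. The particular feature (white-pixel ratio per horizontal band) and the nearest-neighbour comparison are illustrative assumptions of ours, written in Python rather than the Matlab actually used:

```python
import numpy as np

def extract_features(image):
    """A hypothetical transformation T: binarize the grayscale image and
    use the white-pixel ratio of each horizontal band as the feature vector."""
    bw = (image > 0.9 * image.max()).astype(float)
    bands = np.array_split(bw, 4, axis=0)       # four horizontal bands
    return np.array([b.mean() for b in bands])  # fraction of white pixels per band

def classify(image, training_features, labels):
    """Assign the label of the nearest training feature vector."""
    v = extract_features(image)
    dists = [np.linalg.norm(v - t) for t in training_features]
    return labels[int(np.argmin(dists))]
```

Any feature that is reasonably invariant to translation and scale could be substituted for the band ratios; the comparison step stays the same.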
DESIGN
HAND GESTURE RECOGNITION
Consider a robot navigation problem in which a DC motor controlling the wheels responds to
the hand pose signs given by a human, visually observed by the camera. We are interested in an
algorithm that identifies a hand pose sign in the camera's input image as
one of five possible commands (or counts). The identified command will then be used as a
control input for the robot to perform a certain action or execute a certain task. For examples of
the signs to be used in our algorithm, see the figure below. The signs could be associated with
various meanings depending on the function of the robot. For example, a “one” count could
mean “move forward” and a “five” count could mean “stop”. Furthermore, “two”, “three”, and
“four” counts could be interpreted as “reverse”, “turn right,” and “turn left.”
Figure: set of hand gestures, or “counts”, considered in our work.
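The count-to-command interpretation described above amounts to a small lookup table. A minimal Python sketch (the function name and the “no action” fallback are our own illustrative choices, not part of the implemented system):

```python
# Mapping from the recognized finger count to a robot command,
# following the interpretation given in the text.
COMMANDS = {
    1: "move forward",
    2: "reverse",
    3: "turn right",
    4: "turn left",
    5: "stop",
}

def command_for(count):
    """Return the robot command for a recognized count; unknown counts do nothing."""
    return COMMANDS.get(count, "no action")
```

Keeping the mapping in one table makes it trivial to re-purpose the same five gestures for a robot with a different function.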
IMAGE DATABASE
The starting point of the project was the creation of a database with all the images that
would be used for training and testing.
The image database can have different formats. Images can be hand drawn,
digitized photographs or a 3D model of the hand. Photographs were used, as they are the
most realistic approach.
Images came from two main sources: various ASL databases on the Internet, and
photographs we took with a digital camera. This meant that they had different sizes,
different resolutions and sometimes almost completely different shooting angles.
Images belonging to the last case were very few, and they were discarded, as there was no
chance of classifying them correctly. Two operations were carried out on all of the
images. They were converted to grayscale and the background was made uniform. The
internet databases already had uniform backgrounds, but the images we took with the digital
camera had to be processed in Adobe Photoshop.
Drawn images can still simulate translational variances with the help of an editing
program (e.g. Adobe Photoshop).
The database itself was constantly changing throughout the project, as it
was this database that would decide the robustness of the algorithm. Therefore, it had to be
built in such a way that different situations could be tested, and the thresholds above which the
algorithm did not classify correctly could be decided.
The construction of such a database is clearly dependent on the application. If the
application is, for example, a crane controller operated by the same person for long periods,
the algorithm does not have to be robust to different persons' images. In this case, noise
and motion blur should still be tolerated.
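The two preprocessing operations applied to every database image (grayscale conversion and making the background uniform) can be sketched as below. The luminance weights and the near-white threshold are illustrative assumptions, and the sketch is in Python rather than the Photoshop/Matlab workflow we actually used:

```python
import numpy as np

def preprocess(rgb, background_threshold=200):
    """Convert an RGB image (H x W x 3 uint8 array) to grayscale, then
    force near-white background pixels to a single uniform value.
    Both the luminance weights and the threshold are illustrative choices."""
    gray = (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2])
    gray = gray.astype(np.uint8)
    gray[gray >= background_threshold] = 255  # uniform background
    return gray
```

For camera images with cluttered backgrounds a fixed threshold is not enough, which is exactly why those images needed manual cleanup in Photoshop.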
IMAGE PROCESSING
Image processing is any form of signal processing for which the input is an image, such as
photographs or frames of video; the output of image processing can be either an image or a set of
characteristics or parameters related to the image. Most image-processing techniques involve
treating the image as a two-dimensional signal and applying standard signal-
processing techniques to it.
Typical operations
Among many other image processing operations are:
• Geometric transformations such as enlargement, reduction, and rotation
• Color corrections such as brightness and contrast adjustments, quantization, or
conversion to a different color space
• Digital compositing or optical compositing (combination of two or more images), used
in film making to make a “matte”
• Image editing (e.g., to increase the quality of a digital image)
• Image registration (alignment of two or more images), differencing and morphing
• Image segmentation
• Extending dynamic range by combining differently exposed images
• 2-D object recognition with affine invariance
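As one concrete instance of the geometric transformations listed above, enlargement, reduction and rotation can each be written in a few lines. This is an illustrative Python/NumPy sketch, not code from the project:

```python
import numpy as np

def enlarge(img, factor):
    """Nearest-neighbour enlargement by an integer factor:
    every pixel is replicated factor times in both directions."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def reduce(img, factor):
    """Reduction by keeping every factor-th pixel in both directions."""
    return img[::factor, ::factor]

def rotate90(img):
    """Rotation by 90 degrees counter-clockwise."""
    return np.rot90(img)
```

Note that reducing an image enlarged by the same factor recovers the original, since both operations here use nearest-neighbour sampling.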
APPLICATIONS:
• Face detection
• Feature detection
• Lane departure warning systems
• Non-photorealistic rendering
• Medical image processing
• Microscope image processing
• Morphological image processing
• Remote sensing
Robot control:
The robot has two-wheel control, with a castor wheel provided for support. A ULN2004A IC
is used for driving the motors, and DC motors have been used. As the name suggests, these
motors spin freely under the command of a controller.
This makes them easier to control, as the controller knows exactly how far they have
rotated without having to use a sensor; therefore they are used on many rolling wheels.
The stepper motor used is a unipolar motor, hence it has six wires coming out of it. Four
of them are used for receiving data from the microcontroller for its movement, while two
are short-circuited and connected to a 12 V DC supply.
PROPOSED OUTPUT
Matlab Program for count one:
clc; close all; clear all;
x1 = imread('D:\h.jpg');        % read the first image
figure, imshow(x1);
x2 = imread('D:\b.jpg');        % read the second image
figure, imshow(x2);
x = x2 - x1;                    % background subtraction
figure, imshow(x);
display(x);
bw = im2bw(x, 0.9);             % binarize with threshold 0.9
figure, imshow(bw);
display(bw);
z = size(bw);
c1 = 0; c2 = 0;
for i = 1:z(1)
    for j = 1:z(2)
        c1 = c1 + 1;            % incremented for every pixel visited
        if bw(i,j) == 1
            c2 = c2 + 1;        % incremented for white pixels
        end
    end
end
display(c1);
display(c2);
Figures: the input image plus background, and the background only;
the image after background subtraction, and the resulting binary image.
OUTPUT: number of ones and zeros
No. of zeros: c1 = 1920000
No. of ones:  c2 = 0
Similarly, the input images for counts 2, 3, 4 and 5 are:
The respective outputs for counts 2, 3, 4 and 5 are:
For count 2: c1 = 85801, c2 = 63387
For count 3: c1 = 81920, c2 = 13319
For count 4: c1 = 68480, c2 = 48
For count 5: c1 = 32292, c2 = 19815
c1 denotes the number of black pixels in the image
c2 denotes the number of white pixels in the image
It should be noted that the black- and white-pixel values vary with the count shown in the input image. Hence the action is performed according to the different output values.
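One simple way to turn a measured white-pixel count c2 into a recognized gesture count is a nearest-reference lookup. This is a hypothetical Python sketch, not the decision rule actually implemented; the reference values are the c2 outputs reported above and are specific to our test images:

```python
# Reference white-pixel counts (c2) for each gesture count,
# taken from the outputs reported in the text.
REFERENCE_C2 = {1: 0, 2: 63387, 3: 13319, 4: 48, 5: 19815}

def nearest_count(c2):
    """Return the gesture count whose reference c2 is closest to the measurement."""
    return min(REFERENCE_C2, key=lambda k: abs(REFERENCE_C2[k] - c2))
```

Such raw pixel counts are sensitive to hand size and camera distance, which is why invariant features are preferred for a general-purpose recognizer.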
CONCLUSION
We proposed a fast and simple algorithm for a hand gesture recognition problem for
robot control. Given observed images of the hand, the algorithm segments the hand region, and
then makes an inference on the activity of the fingers involved in the gesture. We have
demonstrated the effectiveness of this computationally efficient algorithm on real images we
have acquired. Based on our motivating robot control application, we have only considered a
limited number of gestures.
Our algorithm can be extended in a number of ways to recognize a broader set of gestures.
The segmentation portion of our algorithm is quite simple, and would need to be improved if
this technique were to be used in challenging operating conditions.
However, we should note that the segmentation problem in a general setting is an open
research problem itself. Reliable performance of hand gesture recognition techniques in a
general setting requires dealing with occlusions, temporal tracking for recognizing dynamic
gestures, as well as 3D modeling of the hand, all of which are still mostly beyond the current
state of the art.
FUTURE SCOPE
Even with limited processing power, it will be possible to design very efficient algorithms so that:
• an advanced DSP processor can reduce the size of the module
• the system can understand (static) gestures
• the control can be extended to other biometric uses
Our software has been designed to be reusable, and many more complex behaviors
may be added to our work. Because we limited ourselves to low processing power, our work
could easily be made more capable by adding a state-of-the-art processor. The use of a real
embedded OS could improve our system in terms of speed and stability.
In addition, implementing more sensor modalities would improve robustness even in very
complex scenes. Our system has shown that interaction with machines through gestures is a
feasible task, and the set of detected gestures could be extended to more commands by
implementing a more complex model of an advanced vehicle, usable not only in a limited
space but also in broader areas, such as on the roads.
In the future, a service robot could execute many different tasks, from personal mobility to a
fully-fledged advanced vehicle that can make the disabled able in every sense.
BIBLIOGRAPHY
Books and references:
Image Processing by Anil K. Jain
Digital Image Processing by Jayaraman
www.wikipedia.com
www.google.com
ieeexplore.ieee.org