
ASSISTIVE TEXT AND PRODUCT LABEL READING

FROM HAND - HELD OBJECTS FOR BLIND PERSONS

USING MATLAB

Submitted in partial fulfillment of the Requirement for the award of the degree of

BACHELOR OF TECHNOLOGY IN

ELECTRONICS AND COMMUNICATION ENGINEERING

Submitted by

Student Name(s) Regd. No.

SAI PRATHAP REDDY.K 12095A0421

NEELIMA.G 11091A0485

RAMANJANEYULU.Y 11091A04A0

MALLESWARI.N 11091A0464

SAI PRASANTH.A 11091A04B5

Under the Esteemed Guidance of

Mr. N. RAMANJANEYULU, M.Tech., (Ph.D.)

Associate Professor in ECE

(ESTD-1995)

SCHOOL OF ELECTRONICS AND COMMUNICATION ENGINEERING

RAJEEV GANDHI MEMORIAL

COLLEGE OF ENGINEERING AND TECHNOLOGY (AUTONOMOUS)

Affiliated to J.N.T. University-Anantapur, Approved by A.I.C.T.E., New Delhi, Accredited by N.B.A-New Delhi,

Accredited by NAAC with A- grade, Participated in World Bank TEQIP-1

NANDYAL –518501, Kurnool Dist. A.P. YEAR: 2011-2015

RAJEEV GANDHI MEMORIAL COLLEGE OF

ENGINEERING & TECHNOLOGY AUTONOMOUS

(Approved by A.I.C.T.E-New Delhi, Affiliated to JNT University-Anantapur,

Accredited by NBA-New Delhi, Accredited by NAAC with A-Grade)

NANDYAL – 518 501, A.P, India

SCHOOL OF ELECTRONICS AND COMMUNICATION ENGINEERING

CERTIFICATE

This is to certify that the dissertation entitled “ASSISTIVE TEXT AND

PRODUCT LABEL READING FROM HAND - HELD OBJECTS FOR BLIND

PERSONS USING MATLAB” is being submitted by SAI PRATHAP REDDY.K

(12095A0421), NEELIMA.G (11091A0485), RAMANJANEYULU.Y (11091A04A0),

MALLESWARI.N (11091A0464), SAI PRASANTH.A (11091A04B5) under the

guidance of Mr. N. RAMANJANEYULU for the award of the B.Tech Degree in

ELECTRONICS AND COMMUNICATION ENGINEERING in the RAJEEV

GANDHI MEMORIAL COLLEGE OF ENGINEERING & TECHNOLOGY, Nandyal

(Affiliated to J.N.T. University Anantapur) is a record of bonafide work carried out by

them under our guidance and supervision.

Head of the Department:

Dr. D. SATYANARAYANA, M.Tech., Ph.D., MISTE, FIETE, MIEEE

Professor and H.O.D.

Project Guide:

Mr. N. RAMANJANEYULU, M.Tech., (Ph.D.)

Associate Professor in ECE

Signature of the External Examiner

Date of Examination:


ACKNOWLEDGEMENT

The successful completion of this project report is made possible

with the help and guidance received from various quarters. We would like

to avail this opportunity to express our sincere thanks and gratitude to all

of them.

We are deeply indebted to our guide, Mr.N.RAMANJANEYULU,

M.Tech., (Ph.D.) Associate Professor, Department of Electronics and

Communication Engineering. We are truly fortunate to have a guide

who advised and helped us in every possible way at all stages for the

successful completion of this project work.

We extend our deep sense of gratitude to Dr. D.SATYANARAYANA,

B.E., M.Tech., Ph.D., MISTE, FIETE, Professor and HOD of ECE,

RGMCET, for his moral support and valuable advice during this project

work and the course.

We also express our deep gratitude to our principal, Dr.

T.JAYACHANDRA PRASAD GARU and to our chairman, Dr. M.SANTHI

RAMUDU GARU for providing the required facilities.

We express our thanks to all other teaching and non-teaching staff

for their cooperation in many aspects towards the successful completion

of the project.

We also like to thank all our family members and friends who gave

us constructive suggestions and encouragement throughout the project.

PROJECT MEMBERS

K. SAI PRATHAP REDDY 12095A0421

G. NEELIMA 11091A0485

Y. RAMANJANEYULU 11091A04A0

N. MALLESWARI 11091A0464

A. SAI PRASANTH 11091A04B5


ABSTRACT

We propose an assistive text reading framework to help blind

people read text labels and product labels from hand-held objects

in their daily lives. Printed text is everywhere in the form of reports,

receipts, bank statements, restaurant menus, classroom handouts,

product packages, instructions on medicine bottles, etc. We first

propose an efficient and effective motion-based method to define a

region of interest (ROI) and isolate the object from the surrounding

objects in the camera view. The camera acts as the main vision

sensor: it detects the label image of the product or board, the image

is processed internally in MATLAB to separate the label from the

image, and finally the product is identified and its name is

pronounced through voice. When the capture button is clicked, the

system captures the image of the product placed in front of the web

camera. The captured image, or an image selected from the system

through the graphical user interface, is converted into text using

edge-based text region extraction. Finally, the extracted text is

converted into speech using a text-to-speech synthesizer. We also

explore user interface issues in extracting and reading text from

different objects with complex backgrounds.


TABLE OF CONTENTS

CHAPTER NO TITLE PAGE NO.

ACKNOWLEDGEMENT i

ABSTRACT ii

LIST OF FIGURES v

CHAPTER 1 INTRODUCTION 1

1.1 Importance of Printed Text 1

CHAPTER 2 FUNDAMENTALS OF IMAGE PROCESSING 5

2.1 Introduction 5

2.1.1 Image 5

2.1.2 Image File Sizes 6

2.1.3 Image File Formats 6

2.1.4 Raster Formats 6

2.1.5 Vector Formats 9

2.2 Digital Image Processing 10

2.3 Applications of Digital Image Processing 10

2.4 Fundamental Steps in Digital Image

Processing 11

2.5 Components of Image Processing System 16

CHAPTER 3 EXISTING METHODS 20

3.1 Portable Bar Code Readers 20

3.2 KReader Mobile 21

3.3 Pen Scanners 22

CHAPTER 4 PROPOSED METHOD 23

4.1 Algorithm for Edge Based Text Region

Extraction 25

4.1.1 Detection 26

4.1.2 Localization 29

4.1.3 Character Extraction 29

4.2 IMPLEMENTATION 30


4.2.1 Software Requirement -MATLAB 30

4.2.2 Typical Uses of MATLAB 31

4.2.3 Features of MATLAB 31

4.2.4 Basic Building Blocks of MATLAB 32

4.3 MATLAB Window 32

4.3.1 Command Window 32

4.3.2 Workspace Window 33

4.3.3 Current Directory Window 33

4.3.4 Command History Window 33

4.3.5 Editor Window 33

4.3.6 Graphics or Figure Window 34

4.3.7 Online Help Window 34

4.4 MATLAB Files 34

4.4.1 M-Files 35

4.4.2 Script Files 35

4.4.3 Function Files 35

4.4.4 Mat-Files 35

4.5 MATLAB Function 35

4.5.1 Development Environment 35

4.5.2 MATLAB Mathematical Function 36

4.5.3 MATLAB Language 36

4.5.4 GUI Construction 36

4.5.5 MATLAB Application Interface 36

4.6 MATLAB Working Environment 36

4.6.1 MATLAB Desktop 36

4.6.2 Using MATLAB Editor To Create M-

Files 38

4.6.3 Getting Help 39

CHAPTER 5 RESULTS 40

5.1 ADVANTAGES 44

5.2 CONCLUSION AND FUTURE SCOPE 44

REFERENCES 46


LIST OF FIGURES

FIGURE NO. NAME OF THE FIGURE PAGE NO.

Figure 1.1 Printed text with multiple colours, complex backgrounds

or non-flat surfaces 3

Figure 1.2 Examples of text localization and recognition from

camera captured images. (Top) milk box. (Bottom) men's

bathroom signage 3

Figure 2.1 Fundamental Steps in Image processing 12

Figure 2.2 Components of Image Processing System 16

Figure 3.1 Barcode on a product 20

Figure 3.2 Barcode machine scanning unique code on product 21

Figure 3.3 KReader mobile in mobiles 21

Figure 3.4 Pen scanners scanning a document 22

Figure 4.1 Extraction of text from product 24

Figure 4.1.1 Image with multiple background and multiple fonts 24

Figure 4.1.2 Basic Block diagram for edge based text extraction 25

Figure 4.1.3 Default filter returned by the fspecial Gaussian

function 26

Figure 4.1.4 Sample Gaussian pyramid with 4 levels 27

Figure 4.1.5 Each resolution image resized to original image size 27

Figure 4.1.6 The directional kernels 27

Figure 4.1.7 Sample image from Figure 3 after convolution with

each directional kernel. Note how the edge

information in each direction is highlighted 28

Figure 4.1.8 Sample resized image of the pyramid after

convolution with 0º kernel 28

Figure 4.1.9 (a) Before dilation (b) After dilation 29

Figure 4.1.10 (a) Original image (b) Result 30

Figure 4.2 Representation of MATLAB Window 37

Figure 5.1 GUI Window 40

Figure 5.2 Popup window for selecting picture 40

Figure 5.3 Browsing for a picture 41


Figure 5.4 Loading of a Picture 41

Figure 5.5 Directory of the picture 42

Figure 5.6 Finished finding of picture from directory 42

Figure 5.7 Converted .txt file and result in notepad 43

Figure 5.8 Appeared picture and output text in notepad 43

Figure 5.9 Extracted text to speech 44

Assistive text and product label reading from hand-held objects for blind persons using

MATLAB

Department of ECE Page 1

CHAPTER 1

INTRODUCTION

Of the 314 million visually impaired people worldwide, 45 million

are blind. Even in a developed country like the U.S., the 2008

National Health Interview Survey reported that an estimated 25.2

million adult Americans (over 8%) are blind or visually impaired. This

number is increasing rapidly as the baby boomer generation ages.

Recent developments in computer vision, digital cameras, and

portable computers make it feasible to assist these individuals by

developing camera-based products that combine computer vision

technology with other existing commercial products such as optical

character recognition (OCR) systems.

1.1 Importance Of Printed Text

Reading is obviously essential in today's society. Printed text is

everywhere in the form of reports, receipts, bank statements,

restaurant menus, classroom handouts, product packages,

instructions on medicine bottles, etc. And while optical aids, video

magnifiers, and screen readers can help blind users and those with

low vision to access documents, there are few devices that can provide

good access to common hand-held objects such as product packages,

and objects printed with text such as prescription medication bottles.

The ability of people who are blind or have significant visual

impairments to read printed labels and product packages will enhance

independent living and foster economic and social self-sufficiency.

Today, there are already a few systems that have some promise for

portable use, but they cannot handle product labelling. For example,

portable bar code readers designed to help blind people identify

different products in an extensive product database can enable users

who are blind to access information about these products through

speech and Braille. But a big limitation is that it is very hard for blind

users to find the position of the bar code and to correctly point the bar


code reader at the bar code. Some reading-assistive systems such as

pen scanners might be employed in these and similar situations. Such

systems integrate OCR software to offer the function of scanning and

recognition of text and some have integrated voice output.

However, these systems are generally designed for and perform

best with document images with simple backgrounds, standard fonts,

a small range of font sizes, and well-organized characters rather than

commercial product boxes with multiple decorative patterns. Most

state of the art OCR software cannot directly handle scene images

with complex backgrounds. A number of portable reading assistants

have been designed specifically for the visually impaired. KReader

Mobile runs on a cell phone and allows the user to read mail, receipts,

fliers, and many other documents. However, the document to be read

must be nearly flat, placed on a clear, dark surface (i.e., a non-

cluttered background), and contain mostly text. Furthermore,

KReader Mobile accurately reads black print on a white background,

but has problems recognizing coloured text or text on a coloured

background. It cannot read text with complex backgrounds, or text

printed on cylinders with warped or incomplete images (such as soup

cans or medicine bottles). Furthermore, these systems require a blind

user to manually localize areas of interest and text regions on the

objects in most cases. Although a number of reading assistants have

been designed specifically for the visually impaired, to our knowledge,

no existing reading assistant can read text from the kinds of

challenging patterns and backgrounds found on many everyday

commercial products.

As shown in Figure 1.1, such text information can appear in

multiple scales, fonts, colors, and orientations. To assist blind persons

to read text from these kinds of hand-held objects, we have conceived

of a camera-based assistive text reading framework to track the object

of interest within the camera view and extract print text information

from the object. Our proposed algorithm can effectively handle


complex background and multiple patterns, and extract text

information from both hand-held objects and nearby signage.

Figure 1.1 Printed text with multiple colours, complex backgrounds or

non-flat surfaces

As shown in Figure 1.2, in assistive reading systems for blind

persons, it is very challenging for users to position the object of

interest within the center of the camera's view. As of now, there are

still no acceptable solutions. We approach the problem in stages. To

make sure the hand-held object appears in the camera view, we use a

camera with sufficiently wide angle to accommodate users with only

approximate aim. This may often result in other text objects appearing

in the camera's view (for example, while shopping at a supermarket).

To extract the hand-held object from the camera image, we develop a

motion-based method to obtain a region of interest (ROI) of the object.

Then, we perform text recognition only in this ROI.
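The motion-based ROI step above can be sketched as simple frame differencing: pixels whose intensity changes between consecutive frames mark the moving hand-held object, and their bounding box becomes the ROI. The report's implementation is in MATLAB and is not reproduced here; the following Python/NumPy sketch (the function name and threshold value are illustrative assumptions) shows the idea on grayscale frames.

```python
import numpy as np

def motion_roi(prev_frame, curr_frame, thresh=25):
    """Estimate a region of interest (ROI) from frame-to-frame motion.

    Pixels whose grayscale intensity changes by more than `thresh`
    between consecutive frames are treated as 'moving' (the hand-held
    object); the ROI is the bounding box of those pixels.
    Returns (top, bottom, left, right), or None if nothing moved.
    """
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    moving = diff > thresh
    if not moving.any():
        return None
    rows = np.where(moving.any(axis=1))[0]   # rows containing motion
    cols = np.where(moving.any(axis=0))[0]   # columns containing motion
    return rows[0], rows[-1], cols[0], cols[-1]
```

Text recognition would then run only on the sub-image cut out by this bounding box, rather than on the full cluttered frame.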

Figure 1.2 Examples of text localization and recognition from camera

captured images. (Top) milk box. (Bottom) men's bathroom signage.


In the above figure: (a) camera-captured images; (b) localized text

regions (marked in blue); (c) text regions cropped from the image;

(d) text codes recognized by OCR. Text at the top right corner of the

bottom image is shown in a magnified callout.


CHAPTER 2

FUNDAMENTALS OF IMAGE PROCESSING

2.1 Introduction

Image processing usually refers to digital image processing, but

optical and analog image processing also are possible. The acquisition

of images (producing the input image in the first place) is referred to

as imaging. Digital image processing is the use of computer

algorithms to perform image processing on digital images. As a

subcategory or field of digital signal processing, digital image

processing has many advantages over analog image processing. It

allows a much wider range of algorithms to be applied to the input

data and can avoid problems such as the build-up of noise and signal

distortion during processing. Since images are defined over two

dimensions (perhaps more), digital image processing may be modelled

in the form of multidimensional systems.

2.1.1 Image

An image is a two-dimensional picture, which has a similar

appearance to some subject, usually a physical object or a person.

An image may be two-dimensional, such as a photograph or a screen

display, or three-dimensional, such as a statue. Images may be

captured by optical devices such as cameras, mirrors, lenses,

telescopes, microscopes, etc., and by natural objects and phenomena,

such as the human eye or water surfaces.

The word image is also used in the broader sense of any two-

dimensional figure such as a map, a graph, a pie chart, or an abstract

painting. In this wider sense, images can also be rendered manually,

such as by drawing, painting, carving, rendered automatically by

printing or computer graphics technology, or developed by a

combination of methods, especially in a pseudo-photograph.


2.1.2 Image File Sizes

Image file size is expressed as the number of bytes that

increases with the number of pixels composing an image, and the

colour depth of the pixels. The greater the number of rows and

columns, the greater will be the image resolution, and larger the file.

Also, each pixel of an image increases in size when its colour depth

increases: an 8-bit pixel (1 byte) stores 256 colours, while a 24-bit pixel

(3 bytes) stores 16 million colours, the latter known as true colour.

Image compression uses algorithms to decrease the size of a file.

High resolution cameras produce large image files, ranging from

hundreds of kilobytes to megabytes, per the camera's resolution and

the image-storage format capacity. High resolution digital cameras

record 12 megapixel (1 MP = 1,000,000 pixels) images, or

more, in true colour. For example, consider an image recorded by a

12 MP camera: since each pixel uses 3 bytes to record true colour, the

uncompressed image would occupy 36,000,000 bytes of memory, a

great amount of digital storage for one image, given that cameras

must record and store many images to be practical. Faced with large

file sizes, both within the camera and a storage disc, image file

formats were developed to store such large images.
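The arithmetic above is easy to verify: at 24-bit true colour each pixel needs 3 bytes (one per red/green/blue channel), so a 12 MP frame occupies 12,000,000 × 3 = 36,000,000 bytes before compression. A short illustrative Python check (the helper name is our own):

```python
def uncompressed_size_bytes(megapixels, bytes_per_pixel=3):
    """Uncompressed image size: pixel count times bytes per pixel.

    Defaults to 3 bytes per pixel, i.e. 24-bit true colour.
    """
    return int(megapixels * 1_000_000 * bytes_per_pixel)

print(uncompressed_size_bytes(12))   # the 12 MP example from the text
print(uncompressed_size_bytes(12) / 1e6, "MB")
```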

2.1.3 Image File Formats

Image file formats are standardized means of organizing and

storing images. This entry is about digital image formats used to store

photographic and other images. Image files are composed of either

pixel or vector (geometric) data that are characterized to pixels when

displayed (with few exceptions) in a vector graphic display. Including

proprietary types, there are hundreds of image file types. The PNG,

JPEG, and GIF formats are most often used to display images on the

Internet.

2.1.4 Raster Formats

These formats store images as bitmaps (also known as pixmaps).


JPEG/JFIF

JPEG (Joint Photographic Experts Group) is a compression method.

JPEG compressed images are usually stored in the JFIF (JPEG File

Interchange Format) file format. JPEG compression is lossy

compression. Nearly every digital camera can save images in the

JPEG/JFIF format, which supports 8 bits per colour (red, green, blue)

for a 24-bit total, producing relatively small files. Photographic images

may be better stored in a lossless non-JPEG format if they will be re-

edited, or if small "artefacts" are unacceptable. The JPEG/JFIF format

also is used as the image compression algorithm in many Adobe PDF

files.

EXIF

The EXIF (Exchangeable image file format) format is a file standard

similar to the JFIF format with TIFF extensions. It is incorporated in

the JPEG writing software used in most cameras. Its purpose is to

record and to standardize the exchange of images with image

metadata between digital cameras and editing and viewing software.

The metadata are recorded for individual images and include such

things as camera settings, time and date, shutter speed, exposure,

image size, compression, name of camera, colour information, etc.

When images are viewed or edited by image editing software, all of this

image information can be displayed.

TIFF

The TIFF (Tagged Image File Format) format is a flexible format that

normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit

and 48-bit totals, respectively, usually using either the TIFF or TIF

filename extension. TIFF supports both lossy and lossless storage; some variants offer relatively

good lossless compression for bi-level (black & white) images. Some

digital cameras can save in TIFF format, using the LZW compression

algorithm for lossless storage. TIFF image format is not widely

supported by web browsers. TIFF remains widely accepted as a


photograph file standard in the printing business. TIFF can handle

device-specific colour spaces, such as the CMYK defined by a

particular set of printing press inks.

PNG

The PNG (Portable Network Graphics) file format was created as the

free, open-source successor to the GIF. The PNG file format supports

true colour (16 million colours) while the GIF supports only 256

colours. The PNG file excels when the image has large, uniformly

coloured areas. The lossless PNG format is best suited for editing

pictures, and the lossy formats, like JPG, are best for the final

distribution of photographic images, because JPG files are smaller

than PNG files. PNG is an extensible file format for the lossless,

portable, well-compressed storage of raster images. PNG provides a

patent-free replacement for GIF and can also replace many common

uses of TIFF. Indexed-colour, gray scale, and true colour images are

supported, plus an optional alpha channel. PNG is designed to work

well in online viewing applications, such as the World Wide Web. PNG

is robust, providing both full file integrity checking and simple

detection of common transmission errors.

GIF

GIF (Graphics Interchange Format) is limited to an 8-bit palette, or

256 colors. This makes the GIF format suitable for storing graphics

with relatively few colors such as simple diagrams, shapes, logos and

cartoon style images. The GIF format supports animation and is still

widely used to provide image animation effects. It also uses a lossless

compression that is more effective when large areas have a single

color, and ineffective for detailed images or dithered images.

BMP

The BMP file format (Windows bitmap) handles graphics files within

the Microsoft Windows OS. Typically, BMP files are uncompressed,

hence they are large. The advantage is their simplicity and wide


acceptance in Windows programs.

2.1.5 Vector Formats

As opposed to the raster image formats above (where the data

describes the characteristics of each individual pixel), vector image

formats contain a geometric description which can be rendered

smoothly at any desired display size.

At some point, all vector graphics must be rasterized in order to

be displayed on digital monitors. However, vector images can be

displayed with analog CRT technology such as that used in some

electronic test equipment, medical monitors, radar displays, laser

shows and early video games. Plotters are printers that use vector

data rather than pixel data to draw graphics.

CGM

CGM (Computer Graphics Metafile) is a file format for 2D vector

graphics, raster graphics, and text. All graphical elements can be

specified in a textual source file that can be compiled into a binary file

or one of two text representations. CGM provides a means of graphics

data interchange for computer representation of 2D graphical

information independent from any particular application, system,

platform, or device.

SVG

SVG (Scalable Vector Graphics) is an open standard created and

developed by the World Wide Web Consortium to address the need for

a versatile, scriptable and all-purpose vector format for the web and

otherwise. The SVG format does not have a compression scheme of its

own, but due to the textual nature of XML, an SVG graphic can be

compressed using a program such as zip.


2.2 Digital Image Processing

Digital image processing allows the use of much more complex

algorithms for image processing and hence can offer both more

sophisticated performance at simple tasks, and the implementation of

methods which would be impossible by analog means. Some

techniques which are used in digital image processing include:

Pixelisation

Linear filtering

Principal components analysis

Independent component analysis

Hidden Markov models

Anisotropic diffusion

Partial differential equations

Self-organizing maps

Neural networks

2.3 Applications of Digital Image Processing

Some of the applications of digital image processing include:

Intelligent transportation systems

Film

Digital camera images

Medical applications

Restorations and enhancements

Digital cinema

Image transmission and coding

Colour processing

Remote sensing

High-resolution display

High-quality Colour representation

Super-high-definition image processing

Impact of standardization on image processing


Digital image processing, the manipulation of images by

computer, is a relatively recent development in terms of man's ancient

fascination with visual stimuli. In its short history, it has been applied

to practically every type of image with varying degrees of success. The

inherent subjective appeal of pictorial displays attracts perhaps a

disproportionate amount of attention from scientists and also from

laymen. Digital image processing, like other glamour fields, suffers

from myths, misconceptions, misunderstandings and misinformation.

It is a vast umbrella under which fall diverse aspects of optics,

electronics, mathematics, photography, graphics and computer

technology.

Several factors combine to indicate a lively future for digital

image processing. A major factor is the declining cost of computer

equipment. Several new technological trends promise to further

promote digital image processing. These include parallel processing,

made practical by low-cost microprocessors, and the use of charge-

coupled devices (CCDs) for digitizing, storage during processing and

display, and large, low-cost image storage arrays.

2.4 Fundamental Steps in Digital Image Processing

Image Acquisition

Image Acquisition is to acquire a digital image. To do so requires

an image sensor and the capability to digitize the signal produced by

the sensor. The sensor could be a monochrome or color TV camera that

produces an entire image of the problem domain every 1/30 sec. The

image sensor could also be a line-scan camera that produces a single

image line at a time. In this case, the object's motion past the line

scanner produces a two-dimensional image. If the output of the

camera or other imaging sensor is not in digital form an analog to

digital converter digitizes it. The nature of the sensor and the image it

produces are determined by the application.


Figure 2.1 Fundamental Steps in Image processing

Image Enhancement

Image enhancement is among the simplest and most appealing

areas of digital image processing. Basically, the idea behind

enhancement techniques is to bring out detail that is obscured, or

simply to highlight certain features of interest in an image. A familiar

example of enhancement is when we increase the contrast of an image

because “it looks better.” It is important to keep in mind that

enhancement is a very subjective area of image processing.
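As a concrete example of an enhancement technique, contrast stretching linearly remaps the intensities between an image's minimum and maximum onto the full display range. A Python/NumPy sketch (the names are illustrative, not from the report):

```python
import numpy as np

def contrast_stretch(img, out_min=0, out_max=255):
    """Linearly stretch intensities to the full output range.

    Values between the image's current min and max are remapped to
    [out_min, out_max], spreading the histogram so detail that was
    compressed into a narrow band of grays becomes visible.
    """
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:                        # flat image: nothing to stretch
        return np.full_like(img, out_min)
    out = (img - lo) / (hi - lo) * (out_max - out_min) + out_min
    return out.round().astype(np.uint8)
```

For example, an image whose intensities span only 50–200 would be stretched so that 50 maps to 0 and 200 maps to 255.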

Image Restoration

Image restoration is an area that also deals with improving the

appearance of an image. However, unlike enhancement, which is

subjective, image restoration is objective, in the sense that restoration

techniques tend to be based on mathematical or probabilistic models

of image degradation.

Enhancement on the other hand is based on human subjective

preferences regarding what constitutes a “good” enhancement result.

For example, contrast stretching is considered an enhancement

technique because it is based primarily on the pleasing aspects it


might present to the viewer, whereas removal of image blur by

applying a de-blurring function is considered a restoration technique.

Colour Image Processing

The use of color in image processing is motivated by two principal

factors. First, color is a powerful descriptor that often simplifies object

identification and extraction from a scene. Second, humans can

discern thousands of color shades and intensities, compared to about

only two dozen shades of gray. This second factor is particularly

important in manual image analysis.

Wavelets and Multi – Resolution Processing

Wavelets are the foundation for representing images in various

degrees of resolution. Although the Fourier transform has been the

mainstay of transform-based image processing since the late 1950s, a

more recent transformation, called the wavelet transform, is now

making it even easier to compress, transmit, and analyse many

images. Unlike the Fourier transform, whose basis functions are

sinusoids, wavelet transforms are based on small waves, called

wavelets, of varying frequency and limited duration.

Wavelets were first shown to be the foundation of a powerful new

approach to signal processing and analysis called Multi-resolution

theory. Multi-resolution theory incorporates and unifies techniques

from a variety of disciplines, including sub band coding from signal

processing, quadrature mirror filtering from digital speech recognition,

and pyramidal image processing.
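The pyramidal processing mentioned above (and used in the report's Gaussian-pyramid step, Figure 4.1.4) repeatedly smooths and downsamples an image to obtain coarser resolutions. A minimal Python sketch, using 2×2 block averaging as an assumed stand-in for a true Gaussian filter:

```python
import numpy as np

def pyramid(img, levels=4):
    """Build a simple image pyramid by repeated 2x2 block averaging.

    A true Gaussian pyramid convolves with a Gaussian kernel before
    each downsampling; 2x2 averaging is used here only to show the
    multi-resolution idea. Returns a list of arrays, finest first.
    """
    out = [img.astype(float)]
    for _ in range(levels - 1):
        cur = out[-1]
        h, w = cur.shape[0] // 2 * 2, cur.shape[1] // 2 * 2
        cur = cur[:h, :w]               # trim odd rows/columns
        down = (cur[0::2, 0::2] + cur[1::2, 0::2] +
                cur[0::2, 1::2] + cur[1::2, 1::2]) / 4.0
        out.append(down)                # each level is half the size
    return out
```

Text detection can then be run at every level, so that characters of very different sizes all appear at a workable scale at some resolution.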

Compression

Compression, as the name implies, deals with techniques for

reducing the storage required to save an image, or the bandwidth

required for transmitting it. Although storage technology has improved

significantly over the past decade, the same cannot be said for

transmission capacity. This is true particularly in uses of the Internet,

which are characterized by significant pictorial content. Image


compression is familiar to most users of computers in the form of

image file extensions, such as the jpg file extension used in the JPEG

(Joint Photographic Experts Group) image compression standard.

Morphological Processing

Morphological processing deals with tools for extracting image

components that are useful in the representation and description of

shape. The language of mathematical morphology is set theory. As

such, morphology offers a unified and powerful approach to numerous

image processing problems. Sets in mathematical morphology

represent objects in an image. For example, the set of all black pixels

in a binary image is a complete morphological description of the

image.
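For instance, dilation, one of the basic morphological operations (the step shown in Figure 4.1.9), grows the set pixels of a binary image by a structuring element. A small illustrative Python sketch with a k×k square element (names are our own, in Python rather than the report's MATLAB):

```python
import numpy as np

def dilate(binary, k=3):
    """Binary dilation with a k x k square structuring element.

    An output pixel is set if any input pixel under the structuring
    element is set -- this thickens regions and fuses nearby edge
    fragments, as in an edge-dilation step before text localization.
    """
    pad = k // 2
    padded = np.pad(binary.astype(bool), pad)   # False border
    h, w = binary.shape
    out = np.zeros((h, w), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w]  # OR over the element
    return out
```

A single set pixel therefore becomes a k×k block, which is exactly what merges broken character edges into solid candidate text regions.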

Segmentation

Segmentation procedures partition an image into its constituent

parts or objects. In general, autonomous segmentation is one of the

most difficult tasks in digital image processing. A rugged segmentation

procedure brings the process a long way toward successful solution of

imaging problems that require objects to be identified individually.

On the other hand, weak or erratic segmentation algorithms

almost always guarantee eventual failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.

Representation and Description

Representation and description almost always follow the output of

a segmentation stage, which usually is raw pixel data, constituting

either the boundary of a region (i.e., the set of pixels separating one

image region from another) or all the points in the region itself. In

either case, converting the data to a form suitable for computer

processing is necessary. The first decision that must be made is

whether the data should be represented as a boundary or as a

complete region. Boundary representation is appropriate when the


focus is on external shape characteristics, such as corners and

inflections.

Regional representation is appropriate when the focus is on

internal properties, such as texture or skeletal shape. In some

applications, these representations complement each other. Choosing

a representation is only part of the solution for transforming raw data

into a form suitable for subsequent computer processing. A method

must also be specified for describing the data so that features of

interest are highlighted. Description, also called feature selection,

deals with extracting attributes that result in some quantitative

information of interest or are basic for differentiating one class of

objects from another.

Object Recognition

The last stage involves recognition and interpretation. Recognition

is the process that assigns a label to an object based on the

information provided by its descriptors. Interpretation involves

assigning meaning to an ensemble of recognized objects.

Knowledge Base

Knowledge about a problem domain is coded into image processing

system in the form of a knowledge database. This knowledge may be

as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base also can be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications. In addition to guiding the

operation of each processing module, the knowledge base also

controls the interaction between modules. The system must be

endowed with the knowledge to recognize the significance of the

location of the string with respect to other components of an address


field. This knowledge guides not only the operation of each module, but

it also aids in feedback operations between modules through the

knowledge base. We implemented pre-processing techniques using

MATLAB.

2.5 Components of Image Processing System

As recently as the mid-1980s, numerous models of image

processing systems being sold throughout the world were rather

substantial peripheral devices that attached to equally substantial

host computers. Late in the 1980s and early in the 1990s, the market

shifted to image processing hardware in the form of single boards

designed to be compatible with industry standard buses and to fit into

engineering workstation cabinets and personal computers. In addition

to lowering costs, this market shift also served as a catalyst for a

significant number of new companies whose specialty is the

development of software written specifically for image processing.

Although large-scale image processing systems still are being

sold for massive imaging applications, such as processing of satellite

images, the trend continues toward miniaturizing and blending of

general-purpose small computers with specialized image processing

hardware. Figure 2.2 shows the basic components comprising a

typical general-purpose system used for digital image processing. The

function of each component is discussed in the following paragraphs,

starting with image sensing.

Figure 2.2 Components of Image Processing System


Image Sensors

With reference to sensing, two elements are required to acquire

digital images. The first is a physical device that is sensitive to the

energy radiated by the object we wish to image. The second, called a

digitizer, is a device for converting the output of the physical sensing

device into digital form. For instance, in a digital video camera, the

sensors produce an electrical output proportional to light intensity.

The digitizer converts these outputs to digital data.

Specialised Image Processing Hardware

Specialized image processing hardware usually consists of the

digitizer just mentioned, plus hardware that performs other primitive

operations, such as an arithmetic logic unit (ALU), which performs

arithmetic and logical operations in parallel on entire images. One

example of how an ALU is used is in averaging images as quickly as

they are digitized, for the purpose of noise reduction. This type of

hardware sometimes is called a front-end subsystem, and its most

distinguishing characteristic is speed.
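Frame averaging reduces noise because independent zero-mean noise partially cancels while the underlying signal is reinforced; averaging N frames reduces the noise standard deviation by roughly a factor of the square root of N. A small sketch of the idea, with synthetic noise rather than real camera data:

```python
import random

random.seed(0)
true_value = 100.0                      # the "scene" intensity of one pixel
frames = 64                             # number of digitized frames to average

# Each digitized frame observes the true value plus zero-mean Gaussian noise.
samples = [true_value + random.gauss(0.0, 10.0) for _ in range(frames)]

single_error = abs(samples[0] - true_value)
averaged = sum(samples) / frames
averaged_error = abs(averaged - true_value)

print(f"one frame off by {single_error:.2f}, "
      f"average of {frames} frames off by {averaged_error:.2f}")
```

The expected error of the 64-frame average is about one eighth of the single-frame noise level, which is why front-end hardware averages frames as fast as they arrive.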

Computer

The computer in an image processing system is a general-purpose

computer and can range from a PC to a supercomputer. In dedicated

applications, sometimes specially designed computers are used to

achieve a required level of performance, but our interest here is in

general-purpose image processing systems. In these systems, almost

any well-equipped PC-type machine is suitable for offline image

processing tasks.

Image Processing Software

Software for image processing consists of specialized modules that

perform specific tasks. A well-designed package also includes the

capability for the user to write code that, as a minimum, utilizes the

specialized modules. More sophisticated software packages allow the


integration of those modules and general-purpose software commands

from at least one computer language.

Mass Storage

Mass storage capability is a must in image processing

applications. An image of size 1024×1024 pixels, in which the

intensity of each pixel is an 8-bit quantity, requires one megabyte of

storage space if the image is not compressed. When dealing with

thousands, or even millions, of images, providing adequate storage in

an image processing system can be a challenge. Digital storage for

image processing applications falls into three principal categories: (1) short-term storage for use during processing, (2) on-line storage

for relatively fast recall, and (3) archival storage, characterized by

infrequent access. Storage is measured in bytes (eight bits), Kbytes

(one thousand bytes), Mbytes (one million bytes), Gbytes (meaning

giga, or one billion, bytes), and Tbytes (meaning tera, or one trillion,

bytes).
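The storage figures above follow directly from the image dimensions; a quick check of the arithmetic:

```python
# An uncompressed 1024 x 1024 image with 8 bits (1 byte) per pixel.
width, height, bytes_per_pixel = 1024, 1024, 1

image_bytes = width * height * bytes_per_pixel
print(image_bytes, "bytes =", image_bytes // 2**20, "megabyte(s)")  # 1048576 bytes = 1 megabyte

# A thousand such images already approach a gigabyte of storage.
thousand = 1000 * image_bytes
print(round(thousand / 2**30, 2), "gigabytes for 1000 images")
```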

One method of providing short-term storage is computer

memory. Another is by specialized boards, called frame buffers that

store one or more images and can be accessed rapidly, usually at

video rates. The latter method allows virtually instantaneous image

zoom, as well as scroll (vertical shifts) and pan (horizontal shifts).

Online storage generally takes the form of magnetic disks or optical-

media storage. The key factor characterizing on-line storage is

frequent access to the stored data. Finally, archival storage is

characterized by massive storage requirements but infrequent need for

access. Magnetic tapes and optical disks housed in “jukeboxes” are

the usual media for archival applications.

Image Displays

Image displays in use today are mainly colour (preferably flat

screen) TV monitors. Monitors are driven by the outputs of image and


graphics display cards that are an integral part of the computer

system.

Hardcopy

Hardcopy devices for recording images include laser printers, film

cameras, heat-sensitive devices, inkjet units, and digital units, such

as optical and CD-ROM disks. Film provides the highest possible

resolution, but paper is the obvious medium of choice for written

material. For presentations, images are displayed on film

transparencies or in a digital medium if image projection equipment is

used. The latter approach is gaining acceptance as the standard for

image presentations.

Network

Networking is almost a default function in any computer system in

use today. Because of the large amount of data inherent in image

processing applications, the key consideration in image transmission

is bandwidth. In dedicated networks, this typically is not a problem,

but communications with remote sites via the Internet are not always

as efficient. Fortunately, this situation is improving quickly as a result

of optical fiber and other broadband technologies.


CHAPTER 3

EXISTING METHODS

There are three existing methods:

1. Portable bar code readers

2. KReader Mobile

3. Pen scanners

3.1 Portable Bar Code Readers

A barcode is an optical machine-readable representation of

data, which shows certain data on certain products. Originally,

barcodes represented data in the widths (lines) and the spacing of

parallel lines, and may be referred to as linear or 1D (one-dimensional) barcodes or symbologies. They also come in patterns of squares, dots, hexagons and other geometric patterns within images, termed 2D (two-dimensional) matrix codes or symbologies. Although 2D systems use

symbols other than bars, they are generally referred to as barcodes as

well. Barcodes can be read by optical scanners called barcode readers,

or scanned from an image by special software.

These are designed to help blind people identify different

products in an extensive product database. However, as shown in figure 3.1, a big limitation is that it is very hard for blind users to find the position

of the bar code and to correctly point the bar code reader at the bar

code.

Figure 3.1 Barcode on a product


Figure 3.2 Barcode machine scanning a unique code on a product

3.2 KReader Mobile

KReader Mobile runs on a cellphone and allows the user to read

mail, receipts, files and many other documents. The disadvantage of KReader Mobile is that the document must be nearly flat, placed on a clear, dark surface, and should contain mostly text. It accurately reads black print on a white background but has problems recognizing colored text or text on a colored background.

As shown in figure 3.3, it represents a major advancement in the portability and functionality of print access for struggling readers and those

learning a second language. Developed under the direction of Assistive

Technology pioneer Ray Kurzweil the kReader Mobile software package

runs on a multifunction cell phone and allows users to snap a picture

of virtually any document, including mail, receipts, handouts, memos

and many other documents. Its proprietary document analysis

technology determines the words and reads them aloud to the user.

Reading in other languages is available, along with translation

between languages. This is a truly portable solution to reading on the

go, allowing users to read what they want wherever they happen to be.

Figure 3.3 KReader mobile in mobiles


3.3 Pen Scanners

A scanner is a device that captures an image from a physical

object or document to create a digital copy of it. They come in a wide

range of designs and styles, but overall their purpose is to create a

digital backup of a physical image or document. Some of the most

common models on the market are flatbed designs that have a glass

bed that you lay a document onto and then a light is used to scan the

item and create a digital copy of it.

These have become increasingly powerful, with high-resolution

models often used for photographs and illustrations. There are also

specialized scanners used to capture images from older photographic

film and slides, as well as smaller versions that can quickly scan a

business card. Pen scanners provide small options for scanning

individual lines of text, while portable models can be used to scan any

document at any location. Consider what types of items you have to

scan, and where you'll be doing most of your scanning to find the best

model for your needs.

As shown in figure 3.4, pen scanners are reading-assistive systems. These systems are generally designed for reading and perform

best with document images with simple backgrounds, standard fonts,

a small range of font sizes and well organized characters rather than

commercial product boxes with multiple decorative patterns.

Figure 3.4 Pen scanners scanning a document


CHAPTER 4

PROPOSED METHOD

To overcome the problems defined in the problem definition, and to assist blind persons in reading text from the kinds of challenging patterns and backgrounds found on many everyday commercial hand-held products, this thesis conceives a camera-based assistive text reading framework to track the object of interest within the camera view and extract printed text information from the object. The proposed algorithm used in this system can effectively handle complex backgrounds and multiple patterns, and extract text information from both hand-held objects and nearby signage.

In existing assistive reading systems for blind persons, it is very challenging for users to position the object of interest within the center of the camera's view, and as of now there are still no acceptable solutions. This problem is approached in stages. To ensure the hand-held object appears in the camera view, as shown in figure 4.1, this thesis uses a camera with a sufficiently wide angle to accommodate users with only approximate aim. This may often result in other text objects appearing in the camera's view (for example, while shopping at a supermarket). To extract the hand-held object from the camera image, this system develops a motion-based method to obtain a region of interest (ROI) of the object, and then performs text recognition only on that ROI. As shown in figure 4.2, it is a

challenging problem to automatically localize objects and text ROIs

from captured images with complex backgrounds, because text in

captured images is most likely surrounded by various background

outlier “noise,” and text characters usually appear in multiple scales,

fonts, and colors. For the text orientations, this thesis assumes that

text strings in scene images keep approximately horizontal alignment.

Many algorithms have been developed for localization of text

regions in scene images. Here we are using this method:


Edge Based Text Region Extraction

In solving the task at hand, to extract text information from complex

backgrounds with multiple and variable text patterns, this thesis proposes a text localization algorithm that combines rule-based layout analysis and learning-based text classifier training, which defines novel feature

maps based on stroke orientations and edge distributions. These, in

turn, generate representative and discriminative text features to

distinguish text characters from background outliers.

Figure 4.1 Extraction of text from a product

By using this Edge Based Text Region Extraction algorithm we can

extract the text from the desired image. The extracted text can then be converted into speech by using a text-to-speech synthesizer.

Figure 4.1.1 Image with multiple background and multiple fonts


DESCRIPTION

4.1 Algorithm For Edge Based Text Region Extraction

The basic steps of the edge-based text extraction algorithm are

given below, and diagrammed in Figure 4.1.2. The details are explained

in the following sections.

1. Create a Gaussian pyramid by convolving the input image with

a Gaussian kernel and successively down-sample each direction

by half.

2. Create directional kernels to detect edges at 0°, 45°, 90° and 135° orientations.

3. Convolve each image in the Gaussian pyramid with each

orientation filter.

4. Combine the results of step 3 to create the Feature Map.

5. Dilate the resultant image using a sufficiently large structuring

element to cluster candidate text regions together.

6. Create final output image with text in white pixels against a

plain black background.

Figure 4.1.2 Basic Block diagram for edge based text extraction


The procedure for extracting a text region from an image can be

broadly classified into three basic steps:

1. Detection of the text region in the image.

2. Localization of the region.

3. Creating the extracted output character image.

4.1.1 Detection

This section corresponds to Steps 1 to 4 of the algorithm in Section 4.1. Given an input

image, the region with a possibility of text in the image is detected. A

Gaussian pyramid is created by successively filtering the input image

with a Gaussian kernel of size 3×3 and downsampling the image in

each direction by half. Down sampling refers to the process whereby

an image is resized to a lower resolution from its original resolution. A

Gaussian filter of size 3x3 will be used as shown in Figure 5.2. Each

level in the pyramid corresponds to the input image at a different

resolution. A sample Gaussian pyramid with 4 levels of resolution is shown in Figure 4.1.4. These images are next convolved with directional

filters at different orientation kernels for edge detection in the

horizontal (0°), vertical (90°) and diagonal (45°, 135°) directions. The

kernels used are shown in Figure 4.1.6.
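The pyramid construction can be sketched as follows. This is a Python illustration of the idea (the thesis implements it in Matlab); the kernel here is a normalized Gaussian with sigma = 0.5, an assumption mirroring Matlab's default fspecial('gaussian') filter:

```python
import math

def gaussian_kernel_3x3(sigma=0.5):
    """3x3 Gaussian kernel, normalized to sum to 1."""
    k = [[math.exp(-(r * r + c * c) / (2 * sigma * sigma))
          for c in (-1, 0, 1)] for r in (-1, 0, 1)]
    s = sum(map(sum, k))
    return [[v / s for v in row] for row in k]

def convolve_same(img, ker):
    """Naive 'same'-size convolution with zero padding."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += img[rr][cc] * ker[dr + 1][dc + 1]
            out[r][c] = acc
    return out

def gaussian_pyramid(img, levels):
    """Successively smooth, then downsample by half in each direction."""
    ker = gaussian_kernel_3x3()
    pyramid = [img]
    for _ in range(levels - 1):
        smoothed = convolve_same(pyramid[-1], ker)
        pyramid.append([row[::2] for row in smoothed[::2]])
    return pyramid

img = [[float((r + c) % 2) for c in range(16)] for r in range(16)]
pyr = gaussian_pyramid(img, 4)
print([(len(p), len(p[0])) for p in pyr])   # sizes halve per level
```

Each level is a smoothed, half-resolution copy of the previous one, so the same text appears at several scales across the pyramid.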

Figure 4.1.3 Default filter returned by the fspecial Gaussian function


Figure 4.1.4 Sample Gaussian pyramid with 4 levels

Figure 4.1.5 Each resolution image resized to original image size

Figure 4.1.6 The directional kernels


Figure 4.1.7 Sample image from Figure 4.1.4 after convolution with each directional kernel. Note how the edge information in each direction is highlighted

Figure 4.1.8 Sample resized image of the pyramid after convolution

with 0º kernel

After convolving the image with the orientation kernels, a

feature map is created. A weighting factor is associated with each pixel

to classify it as a candidate or non-candidate for a text region. A pixel is

a candidate for text if it is highlighted in all of the edge maps created

by the directional filters. Thus, the feature map is a combination of all

edge maps at different scales and orientations with the highest

weighted pixels present in the resultant map.
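The combination of directional edge maps into a feature map can be sketched as below. The 3x3 difference kernels here are common illustrative choices for the four orientations, not necessarily the exact kernels of the thesis, and the threshold is an assumed constant:

```python
# Illustrative 3x3 directional edge kernels (an assumption for this sketch).
KERNELS = {
    0:   [[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]],   # horizontal edges
    90:  [[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]],   # vertical edges
    45:  [[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]],   # diagonal edges
    135: [[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]],
}

def convolve_abs(img, ker):
    """'Same'-size convolution, returning absolute responses."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += img[rr][cc] * ker[dr + 1][dc + 1]
            out[r][c] = abs(acc)
    return out

def feature_map(img, threshold=1.0):
    """A pixel is a text candidate only if every directional edge map
    responds above the threshold at that location."""
    maps = [convolve_abs(img, k) for k in KERNELS.values()]
    h, w = len(img), len(img[0])
    return [[1 if all(m[r][c] > threshold for m in maps) else 0
             for c in range(w)] for r in range(h)]

# A single bright dot produces edge responses in every direction at its location.
img = [[0.0] * 7 for _ in range(7)]
img[3][3] = 10.0
fm = feature_map(img)
print(fm[3][3])   # 1: candidate; flat background pixels remain 0
```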


4.1.2 Localization

This section corresponds to Step 5 of the algorithm in Section 4.1. The process of

localization involves further enhancing the text regions by eliminating

non-text regions. One of the properties of text is that usually all

characters appear close to each other in the image, thus forming a

cluster. By using a morphological dilation operation, these possible

text pixels can be clustered together, eliminating pixels that are far

from the candidate text regions. Dilation is an operation which

expands or enhances the region of interest, using a structuring element

of the required shape and/or size. The process of dilation is carried

out using a very large structuring element in order to enhance the

regions which lie close to each other. In this algorithm, a structuring

element of size 7×7 has been used. Figure 4.1.9 below shows the result before and after dilation.

Figure 4.1.9 (a) Before dilation (b) After dilation

The resultant image after dilation may consist of some non-text

regions or noise which need to be eliminated. An area-based filtering is carried out to eliminate noise blobs present in the image: only those regions in the final image are retained which have an area greater than or equal to 1/20 of the maximum region area.
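The localization stage, dilation with a large structuring element followed by area filtering, can be sketched in pure Python. The small image below is illustrative; the thesis applies the same idea with a 7×7 element on full-size feature maps:

```python
def dilate(img, size=7):
    """Binary dilation with a size x size square structuring element."""
    h, w, k = len(img), len(img[0]), size // 2
    return [[1 if any(img[rr][cc]
                      for rr in range(max(0, r - k), min(h, r + k + 1))
                      for cc in range(max(0, c - k), min(w, c + k + 1)))
             else 0 for c in range(w)] for r in range(h)]

def region_areas(img):
    """Label 4-connected foreground regions and return their areas."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                stack, area = [(r, c)], 0
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    area += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                areas.append(area)
    return areas

# Two candidate pixels close together merge into one region after dilation.
img = [[0] * 12 for _ in range(12)]
img[5][3] = img[5][6] = 1
areas = region_areas(dilate(img))
keep = [a for a in areas if a >= max(areas) / 20]   # area-based noise filtering
print(len(areas))   # 1 region after clustering
```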

4.1.3 Character Extraction

This section corresponds to Step 6 of the algorithm in Section 4.1. The common OCR

systems available require the input image to be such that the

characters can be easily parsed and recognized. The text and


background should be monochrome and background-to-text contrast

should be high. Thus this process generates an output image with

white text against a black background. A sample test image and its

resultant output image from the edge-based text detection algorithm are shown in Figure 4.1.10 (a) and (b) below.
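Producing white text on a plain black background amounts to thresholding the grayscale image inside the localized regions and fixing the polarity. A minimal sketch with an assumed fixed threshold (a practical system would choose it adaptively):

```python
def to_white_on_black(gray, threshold=128):
    """Binarize so text pixels become white (255) on a black (0) background.
    Assumes dark text on a lighter background: values below the threshold
    are treated as text."""
    return [[255 if p < threshold else 0 for p in row] for row in gray]

# Dark glyph strokes (values ~20) on a light label background (~200).
gray = [[200, 200, 200],
        [200,  20, 200],
        [ 20,  20,  20]]
out = to_white_on_black(gray)
print(out)   # [[0, 0, 0], [0, 255, 0], [255, 255, 255]]
```

The resulting monochrome, high-contrast image is exactly the input format that common OCR engines expect.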

Figure 4.1.10 (a) Original image (b) Result

4.2 IMPLEMENTATION

4.2.1 Software Requirement-MATLAB

Matlab is a high performance language for technical computing.

It integrates computation, visualization and programming in an easy-to-use environment. Matlab stands for matrix laboratory. It was written originally to provide easy access to matrix software developed by the LINPACK (linear system package) and EISPACK (eigensystem package) projects. Matlab is therefore built on a foundation of sophisticated matrix software in which the basic element is a matrix that does not require pre-dimensioning, which makes it possible to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time.

Matlab features a family of application-specific solutions called toolboxes. Very important to most users of Matlab, toolboxes allow learning and applying specialized technology. These are comprehensive collections

of Matlab functions that extend the Matlab environment to solve

particular classes of problems. Areas in which toolboxes are available

include signal processing, control system, neural networks, fuzzy

logic, wavelets, simulation and many others.


4.2.2 Typical Uses of MATLAB

The typical using areas of Matlab are:

1. Math and computation

2. Algorithm development

3. Data acquisition

4. Data analysis, exploration and visualization

5. Scientific and engineering graphics

6. Modelling

7. Simulation

8. Prototyping

9. Application development, including graphical user interface building

Matlab is an interactive system whose basic data element is an

array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or FORTRAN.

Matlab features a family of add-on application-specific solutions called toolboxes. Very important to most users of Matlab, toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of Matlab functions that extend the Matlab

environment to solve particular classes of problems. Areas in which

toolboxes are available include signal processing, control systems,

neural networks, fuzzy logic, wavelets, simulation and many others.

4.2.3 Features of MATLAB

1. Advanced algorithms for high-performance numerical computation, especially in the field of matrix algebra.

2. A large collection of predefined mathematical functions and the

ability to define one's own functions.


3. Two- and three-dimensional graphics for plotting and displaying data.

4. A powerful, matrix- or vector-oriented high-level programming language for individual applications.

5. Toolboxes available for solving advanced problems in several

application areas.

4.2.4 Basic Building Blocks of MATLAB

The basic building block of Matlab is the matrix. The fundamental data type is the array. Vectors, scalars, real matrices and complex matrices are handled as specific classes of this basic data type. The built-in functions are optimized for vector operations. No dimension statements are required for vectors or arrays.

4.3 MATLAB Window

Matlab works with the following windows:

1. Command window

2. Work space window

3. Current directory window

4. Command history window

5. Editor window

6. Graphics window

7. Online-help window

4.3.1 Command Window

The command window is where the user types Matlab

commands and expressions at the prompt (>>) and where the output

of those commands is displayed. It is opened when the application

program is launched. All commands, including user-written programs, are typed in this window at the Matlab prompt for execution.


4.3.2 Work Space Window

Matlab defines the workspace as the set of variables that the

user creates in a work session. The workspace browser shows these

variables and some information about them. Double clicking on a

variable in the work space browser launches the array editor, which

can be used to obtain information.

4.3.3 Current Directory Window

The current directory tab shows the contents of the current

directory, whose path is shown in the current directory window. For

example, in the windows operating system the path might be as

follows: c:\matlab\work, indicating that the directory “work” is a sub

directory of the main directory “Matlab”, which is installed in drive c.

Clicking on the arrow in the current directory window shows a list of

recently used paths. Matlab uses a search path to find M-files and

other Matlab-related files. Any file run in Matlab must reside in the current directory or in a directory that is on the search path.

4.3.4 Command History Window

The command history window contains a record of the

commands a user has entered in the command window, including

both current and previous Matlab sessions. Previously entered Matlab

commands can be selected and re-executed from the command history

window by right-clicking on a command. This is useful for selecting various options in addition to executing the commands, and is a useful feature when experimenting with various commands in work sessions.

4.3.5 Editor Window

The Matlab editor is both a text editor specialized for creating

M-files and a graphical Matlab debugger. The editor can appear in a

window by itself, or it can be a sub window in the desktop. In this

window one can write, edit, create and save programs in files called M-

files.


Matlab editor window has numerous pull-down menus for tasks

such as saving, viewing and debugging files. Because it performs some

simple checks and also uses color to differentiate between various

elements of code, this text editor is recommended as the tool of choice

for writing and editing M-files.


4.3.6 Graphics or Figure Window

The output of all graphic commands typed in the command window is

seen in this window.

4.3.7 Online Help Window

Matlab provides online help for its built in functions and

programming language constructs. The principal way to get help

online is to use the Matlab help browser, opened as a separate window

either by clicking on the question mark symbol on the desktop

toolbar, or by typing help browser at the prompt in the command

window. The help browser is a web browser integrated into the Matlab desktop that displays hypertext markup language (HTML) documents. The help browser consists of two panes: the help navigator pane, used to find information, and the display pane, used to view it. Self-explanatory tabs in the help navigator pane are used to perform a search.

4.4 MATLAB Files

Matlab has two types of files for storing information: M-files and MAT-files.


4.4.1 M-Files

These are standard ASCII text files with a .m extension to the file name. Users can create their own matrices and programs using M-files, which are text files containing Matlab code. The Matlab editor or another editor is used to create a file containing the same statements that would be typed at the Matlab command line, and the file is saved under a name that ends in .m. There are two types of M-files.

4.4.2 Script Files

A script file is an M-file with a set of Matlab commands in it, and is executed by typing the name of the file on the command line. Script files work on the global variables currently in the environment.

4.4.3 Function Files

A function file is also an M-file except that the variables in a

function file are all local. This type of file begins with a function definition line.

4.4.4 Mat-Files

These are binary files with a .mat extension, created by Matlab when data is saved. The data is written in a special format that only Matlab can read. These files are loaded back into Matlab with the load command.

4.5 MATLAB System

The Matlab system consists of five main parts.

4.5.1 Development Environment

This is the set of tools and facilities that help you use Matlab functions and files. Many of these tools are graphical user interfaces. It includes the Matlab desktop and command window, a command history, an editor and debugger, and browsers for viewing help, the workspace, files and the search path.


4.5.2 MATLAB Mathematical Function

This is a vast collection of computational algorithms ranging from

elementary functions like sum, sine, cosine and complex arithmetic to more sophisticated functions like matrix inverse, matrix eigenvalues, Bessel functions and fast Fourier transforms.

4.5.3 MATLAB Language

This is a high-level matrix/array language with control flow statements, functions, data structures, input/output, and object-oriented programming features. It allows both programming in the small, to rapidly create quick throw-away programs, and programming in the large, to create complete, large, and complex application programs.

4.5.4 GUI Construction

MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as for annotating and printing these graphs. It includes high-level functions for two-dimensional and three-dimensional data visualization, image processing, animation, and presentation graphics. It also includes low-level functions that allow you to fully customize the appearance of graphics as well as to build complete graphical user interfaces for your MATLAB applications.

4.5.5 MATLAB Application Interface

This is a library that allows you to write C and Fortran programs that interact with MATLAB. It includes facilities for calling routines from MATLAB, for calling MATLAB as a computational engine, and for reading and writing MAT-files.

4.6 MATLAB Working Environment

4.6.1 MATLAB Desktop

The MATLAB desktop is the main MATLAB application window. The desktop contains five subwindows: the Command Window, the workspace browser, the current directory window, the command history window, and one


or more figure windows, which are shown only when the user displays a graphic.

Figure 4.2 Representation of MATLAB Window

The Command Window is where the user types MATLAB commands and expressions at the prompt (>>) and where the output of those commands is displayed. MATLAB defines the workspace as the set of variables that the user creates in a work session. The workspace browser shows these variables and some information about them. Double-clicking on a variable in the workspace browser launches the Array Editor, which can be used to obtain information about, and in some instances edit, certain properties of the variable.

The current directory tab above the workspace tab shows the contents of the current directory, whose path is shown in the current directory window. Clicking on the arrow in the current directory window shows a list of recently used paths, and clicking on the button to the right of the window allows the user to change the current directory.


MATLAB uses a search path to find M-files and other related files, which are organized in directories in the computer file system. Any file run in MATLAB must reside in the current directory or in a directory that is on the search path. By default, the files supplied with MATLAB and the MathWorks toolboxes are included in the search path. The easiest way to see which directories are on the search path, or to add to or modify the search path, is to select Set Path from the File menu on the desktop and then use the Set Path dialog box. It is good practice to add any commonly used directories to the search path to avoid repeatedly having to change the current directory.
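The search path can also be inspected and changed programmatically; a brief sketch, where the directory name is a hypothetical placeholder:

```matlab
path                    % display the current search path
addpath('C:\mywork')    % prepend a directory to the path (hypothetical path)
which edit              % show which file MATLAB would run for a given name
rmpath('C:\mywork')     % remove the directory from the path again
```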

The command history window contains a record of the commands a user has entered in the Command Window, including both current and previous MATLAB sessions. Previously entered MATLAB commands can be selected and re-executed from the command history window by right-clicking on a command or sequence of commands.

4.6.2 Using MATLAB Editor To Create M-Files

The MATLAB editor is both a text editor specialized for creating M-files and a graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a subwindow in the desktop. M-files are denoted by the extension .m.

The MATLAB editor window has numerous pull-down menus for tasks such as saving, viewing, and debugging files. Because it performs some simple checks and also uses color to differentiate various elements of code, this text editor is recommended as the tool of choice for writing and editing M-functions.

To open the editor, type edit at the prompt; typing edit filename opens the M-file filename.m in an editor window, ready for editing. As noted earlier, the file must be in the current directory or in a directory on the search path.


4.6.3 Getting Help

The principal way to get help online is to use the MATLAB help browser, opened as a separate window either by clicking on the question mark symbol on the desktop toolbar or by typing helpbrowser at the prompt in the Command Window. The help browser is a web browser integrated into the MATLAB desktop that displays hypertext markup language (HTML) documents. The help browser consists of two panes: the help navigator pane, used to find information, and the display pane, used to view the information. Self-explanatory tabs in the navigator pane are used to perform a search.


CHAPTER 5

RESULTS

When the program is executed, the running GUI appears on the command window, as shown in Fig. 5.1.

Figure 5.1 Running GUI

Once it appears, the GUI window is opened, from which we select a desired picture, as shown in Fig. 5.2.

Figure 5.2 Popup window for selecting picture


The Open option allows selection of a desired picture in formats such as JPG, PNG, etc.

Figure 5.3 Browsing for a picture

After the desired picture is selected, it is displayed on the GUI axes, as shown in Fig. 5.4.

Figure 5.4 Loading of a Picture


The selected image's directory is displayed in the edit box, as shown in Fig. 5.5.

Figure 5.5 Directory of the picture

After the Close option is selected, "Finished finding picture" appears on the command window.

Figure 5.6 Finished finding of picture from directory

The text extracted from the image is converted into .txt file format, and the result is displayed in a Notepad window, as shown in Fig. 5.7.


Figure 5.7 Converted .txt file and result in notepad

The previously selected image appears again in a figure window so that it can be compared with the text result, as shown in Fig. 5.8.

Figure 5.8 Appeared picture and output text in notepad

The text extracted from the desired image is converted into speech, as shown in Fig. 5.9.


Figure 5.9 Extracted text to speech
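One common way to produce such speech output from MATLAB on Windows is through the SAPI ActiveX interface; the sketch below assumes the OCR result has already been written to a text file (the file name result.txt is a hypothetical placeholder):

```matlab
% Read the extracted text and speak it via the Windows speech engine.
txt = fileread('result.txt');        % hypothetical OCR output file
voice = actxserver('SAPI.SpVoice');  % COM handle to the SAPI text-to-speech voice
voice.Speak(txt);                    % synthesize the text as audio
delete(voice);                       % release the COM object
```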

5.1 ADVANTAGES

Advantages

- It is a portable device.
- Text with complex backgrounds can be recognized and extracted.
- It enhances independent living and social self-sufficiency for blind people.
- Automatic detection.
- Accuracy and flexibility.

Applications

- It enables blind people to learn about products.
- It also enables them to read signage.

5.2 CONCLUSION AND FUTURESCOPE

Conclusion

In this project, we have described a simulation method to read

printed text on hand-held objects for assisting blind people. In order

to solve the common aiming problem for blind users, we have


proposed a region-of-interest method to detect the object while the blind user simply shakes it for a couple of seconds. This method can effectively distinguish the object of interest from the background or other objects in the camera view. To extract text regions from complex backgrounds, we have proposed an edge-based text region extraction method. The corresponding feature maps estimate the global structural features of text at every pixel. Off-the-shelf OCR is used to perform word recognition on the localized text regions, and the output is transformed into audio for blind users by a text-to-speech synthesizer.

Future scope

We will also extend our algorithm to handle non-horizontal text strings. Furthermore, we will address the significant human interface issues associated with reading text by blind users. This will be done by eliminating the following limitations:

- It is difficult to recognize text from images that are not flat using this process.
- It cannot handle non-horizontal text strings.


REFERENCES

[1]. Chucai Yi, Yingli Tian, and Aries Arditi, "Portable camera-based assistive text and product label reading from hand-held objects for blind persons," IEEE/ASME Transactions on Mechatronics, vol. 19, no. 3, June 2014.

[2]. L. Ma, C. Wang, and B. Xiao, "Text detection in natural images based on multi-scale edge detection and classification," in Proc. Int. Congr. Image Signal Process., 2010, vol. 4, pp. 1961–1965.

[3]. C. Yi and Y. Tian, "Text string detection from natural scenes by structure-based partition and grouping," IEEE Trans. Image Process., vol. 20, no. 9, pp. 2594–2605, Sep. 2011.

[4]. X. Chen and A. L. Yuille, "Detecting and reading text in natural scenes," in Proc. CVPR, vol. 2, pp. II-366–II-373, 2004.

[5]. X. Chen, J. Yang, J. Zhang, and A. Waibel, "Automatic detection and recognition of signs from natural scenes," IEEE Trans. Image Process., vol. 13, no. 1, pp. 87–99, 2004.

[6]. "KReader Mobile User Guide," knfb Reading Technology Inc., http://www.knfbReading.com

[7]. J. Liang, D. Doermann, and H. Li, "Camera-based analysis of text and documents: A survey," Int. J. Document Analysis and Recognition (IJDAR), no. 2–3, pp. 84–104, 2005.

[8]. N. Nikolaou and N. Papamarkos, "Color reduction for complex document images," Int. J. Imaging Systems and Technology, vol. 19, pp. 14–26, 2009.

[9]. ScanTalker, Bar code scanning application to help blind identify over one million products, http://www.freedomscientific.com/fs_news/PressRoom/en/2006/ScanTalker2-Announcement_3-30-2006.asp