
International Journal of Video & Image Processing and Network Security IJVIPNS-IJENS Vol:12 No:05

120805-4343-IJVIPNS-IJENS © October 2012 IJENS I J E N S

Image to Excel Sheet Conversion and Measurement of Similarity Using VB.Net

Dr. Ebtesam Najim Abdullah Al-Shemmary College of Education for Girls

Computer Science Department

University of Kufa, An Najaf, Iraq

Email: [email protected]

Abstract: In this paper we present a new algorithm for the design and implementation of a fully automatic, highly accurate system that converts a digital image to an Excel sheet and compares two images to find the discrepancy between them. Matrix-based representations are a useful tool for exploring relationships between related records in a data set. A relationship can be any relation between two records, but is generally a similarity or dissimilarity measure. Converting an image to an Excel file makes the image matrix directly accessible and makes mathematical operations on the image much easier for the user. Experimental results on test images are given to demonstrate the performance of the proposed program and algorithms. The program was written in Visual Basic.Net.

Index Terms: Digital Image, Excel Workbook, Spreadsheet, Similarity Measurement.

I. INTRODUCTION

One might imagine that image processing only means retouching pictures, adding decorations and drawings, or deleting parts so that the result differs from the original. However, image processing as a discipline is hardly concerned with this aspect at all. Its focus is the digital coding of images and the appropriate ways to handle this digital data, so that the information carried by an image is usable by a computer, a robot, or another machine. Today, the medical industry, astronomy, physics, chemistry, forensics, remote sensing, manufacturing, and defense are just some of the many fields that rely upon images to store, display, and provide information about the world around us. The challenge to scientists, engineers and business people is to quickly extract valuable information from raw image data. This is the primary purpose of image processing: converting images to information [1]. An image that is processed by a computer is called a digital image. A digital image is composed of a grid of pixels and stored as an array. A single pixel represents a value of either light intensity or color. Images are processed to obtain information beyond what is apparent from the image's initial pixel values [2]. Figure 1 shows square pixels (picture elements) arranged in columns and rows.

Fig. 1. An image — an array or a matrix of pixels arranged in

columns and rows.

In the 8-bit grayscale image shown in Figure 2, each picture element has an assigned intensity that ranges from 0 (black) to 255 (white). A grayscale image is what people normally call a black and white image, but the name emphasizes that such an image also includes many shades of gray. The possible range of the pixel values depends on the color depth of the image, here 8 bits = 256 tones or grayscales.

Fig. 2. 8-bit grayscale image

A normal grayscale image has 8-bit color depth = 256 grayscales. A "true color" image has 24-bit color depth = 8 + 8 + 8 bits = 256 x 256 x 256 colors = ~16 million colors. Figure 3 shows a true-color image assembled from three grayscale images colored red, green and blue. Such an image may contain up to 16 million different colors.



Fig. 3. A true-color image

Some grayscale images have more grayscales, for instance 16 bits = 65,536 grayscales. In principle, three 16-bit grayscale images can be combined to form an image with 281,474,976,710,656 colors.
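The tone counts quoted above all follow from the rule that n bits encode 2^n distinct values. The short Python sketch below reproduces them; it is only an arithmetic check, and the function name `tone_count` is chosen here for illustration.

```python
def tone_count(bits):
    """Number of distinct values that a pixel of the given bit depth can take."""
    return 2 ** bits

# Bit depths mentioned in the text.
for label, bits in [("8-bit grayscale", 8),
                    ("24-bit true color (3 x 8 bits)", 24),
                    ("16-bit grayscale", 16),
                    ("three 16-bit channels (48 bits)", 48)]:
    print(f"{label}: {tone_count(bits):,}")
```

The last line of output corresponds to the 281,474,976,710,656 figure, which is 2^48.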

There are two general groups of 'images': vector graphics (or line art) and bitmaps (pixel-based 'images'). Table 1 lists some of the most common file formats [3].

Excel is perhaps the most important computer software program used in the workplace today. From the viewpoint of employers, particularly those in the field of information systems, the use of Excel as an end-user computing tool is essential. Excel dominates the spreadsheet product industry with a market share estimated at 95 percent. Excel 2007 supports worksheets of up to 1,048,576 rows by 16,384 columns, enabling the user to import and work with massive amounts of data, and it achieves faster calculation performance than earlier versions.
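Because one pixel is later mapped to one worksheet cell, the worksheet capacity bounds the image size that can be converted. The Python sketch below checks an image against the Excel 2007 worksheet limits (1,048,576 rows by 16,384 columns); the function name and the one-pixel-per-cell layout are assumptions made for illustration.

```python
# Excel 2007 worksheet limits.
EXCEL_2007_MAX_ROWS = 1_048_576
EXCEL_2007_MAX_COLUMNS = 16_384

def fits_in_worksheet(width, height):
    """True if an image of width x height pixels fits with one pixel per cell
    (image rows become worksheet rows, image columns become worksheet columns)."""
    return (0 < width <= EXCEL_2007_MAX_COLUMNS
            and 0 < height <= EXCEL_2007_MAX_ROWS)

print(fits_in_worksheet(1920, 1080))   # a full-HD frame
print(fits_in_worksheet(20000, 100))   # too wide for one worksheet
```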

TABLE I

Some Common File Formats

Group: Bitmap Format

GIF: an 8-bit (256 color), non-destructively compressed bitmap format. Mostly used for the web. Has several sub-standards, one of which is the animated GIF.

JPEG: a very efficient (i.e. much information per byte), destructively compressed 24-bit (16 million colors) bitmap format. Widely used, especially for web and Internet (bandwidth-limited).

TIFF: the standard 24-bit publication bitmap format. Compresses non-destructively with, for instance, Lempel-Ziv-Welch (LZW) compression.

Group: Vector Format

PS: PostScript, a standard vector format. Has numerous sub-standards and can be difficult to transport across platforms and operating systems.

PSD: a dedicated Photoshop format that keeps all the information in an image, including all the layers.

Outside the workplace, Excel is in broad use for everyday problem solving. There is a surprisingly wide range of mathematics involved in manipulating digital images, from simple arithmetic (e.g. increasing or decreasing the image brightness), to matrix algebra (applying a filter), to numerical PDEs (image sharpening), to more complex processes such as pattern recognition. Students and end users are familiar with digital images, most likely because of their mobile phone cameras, and many are also familiar with the post-processing possibilities, either using in-camera tools or through tools such as Photoshop. Many students have a good working knowledge of spreadsheets, particularly Microsoft Excel, as they are widely used as a support tool in their courses. This paper describes algorithms that allow students to use their spreadsheet skills in applying the mathematical techniques necessary to implement a variety of image modifications. The algorithms can also find the degree of similarity between images from the data in the worksheets [4].

In 2007, AlShemmary [5] gave a detailed description of fingerprint analysis algorithms, in which a new method of data analysis using Excel, aimed especially at very large datasets, was introduced. Data preparation can be easily accomplished in Excel.

Many techniques exist to create an Excel file, and each offers some unique advantages. Knowing and understanding the different techniques is essential for programmers to quickly and effectively produce a report that meets the customer's requirements [6]. This paper describes algorithms that generate an Excel file from a digital image using VB.NET, and indicates the appropriate method to use when an Excel output must be created.

The structure of this paper is as follows. Section 2 introduces computer drawings. Section 3 provides some background on algorithms that use patch matching and reviews approaches to measuring image patch similarity. Section 4 proposes a method for automatic similarity matrix calculation. Section 5 shows the results of our experiments and, finally, Section 6 interprets the results and indicates our future work.

II. COMPUTER DRAWINGS

Computer drawings are divided into two categories:

1- Vector Drawings

A vector drawing is made up of lines and curves that the computer stores as mathematical objects called "vectors". Vectors describe picture elements by their geometry. In other words, the computer reads the drawing as a set of mathematical equations and uses them to redraw it whenever the drawing is enlarged, reduced, or moved. Handling drawings in this way lets the computer work more intelligently. Lines and curves are the foundation of this approach: given a straight line, the computer can draw it between its two end points without storing every point in between. The advantages of this type are [7]:

- It does not require much storage space on the computer.

- The quality of the drawing is not affected by zooming in or out, even up to 10 times the original size.

The drawings are stored in formats such as eps, cdr, and ai. Programs that handle these drawings include Adobe Illustrator, Corel Draw, Freehand, and Macromedia Flash.

2- Bitmap Drawings

A bitmap drawing is a grid of colors representing the image, and every point in this grid is a unit called a "pixel". Every pixel is determined by two pieces of information: its location, given by its coordinates, and its color. These drawings are used for bitmap graphics, photographs and digital graphics. The characteristics of this type are [7]:

- It displays a huge spectrum of colors, so it can show color gradients, shadows, and complex interactions.

- The number of pixels representing the image is fixed. This is in fact a disadvantage, because zooming in or out reduces image quality.

The drawings are stored in formats such as gif, jpg, bmp, tiff, psd, and png. Programs that handle these drawings include Adobe Photoshop, Macromedia Fireworks, Paint Shop Pro, CorelDraw, and Photo Paint [7].

III. IMAGE PATCH SIMILARITY

The ability to compare image regions (patches)

has been the basis of many approaches to core

computer vision problems, including object, texture

and scene categorization. Developing

representations for image patches has also been in

the focus of much work. However, the question of

appropriate similarity measure between patches has

largely remained unattended.

The main context in which comparing image

patches has emerged in computer vision is that of

high-level vision tasks, that can be described as

scene understanding. This includes [8]:

Object recognition: This means finding a specific

object: a face of a certain person, a shoe of a

particular make, a magazine etc.

Object categorization and object class detection:

Rather than looking for a specific instance of an

object, the interest here is in all objects that belong

to a certain class, for an appropriate definition of the

latter: any face, any car, etc.

Entire image classification: Sometimes the goal is

not to localize, or determine the presence of an

object, but rather to assign the entire image to a

certain class. For instance, location recognition and

texture classification belong to this category of tasks

[5].

The question of measuring similarity between

patches has not received very much attention in the

computer vision literature. Usually, a standard

distance measure is adopted for whatever

representation is used:

1 Pixel-based distance

The simplest similarity measure consists of directly comparing the pixel values of the two regions, e.g. by means of the L1 distance [3]:

$$D_{L_1}(x_1, x_2) = \sum_{k} \left| x_1(k) - x_2(k) \right|$$

where $x_1(k)$ and $x_2(k)$ are the values of pixel $k$ in patches $x_1$ and $x_2$. This is rarely a useful measure, since it is extremely sensitive to minor transformations, both in geometry (shifts and rotations) and in imaging conditions (lighting or noise).
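The sensitivity to small shifts can be seen in a few lines of Python. This is a minimal sketch, assuming each patch is given as a flat list of gray levels; the striped test patch is invented for illustration.

```python
def l1_distance(patch_a, patch_b):
    """Sum of absolute pixel differences between two equally sized patches."""
    if len(patch_a) != len(patch_b):
        raise ValueError("patches must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(patch_a, patch_b))

# A one-row striped patch and the same pattern shifted by one pixel.
stripes = [0, 255, 0, 255]
shifted = [255, 0, 255, 0]

print(l1_distance(stripes, stripes))  # identical patches
print(l1_distance(stripes, shifted))  # same texture, one-pixel shift
```

Although both patches show the same texture, the one-pixel shift yields the largest L1 distance possible for a patch of this size.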

2 Correlation

Normalized correlation between patches x1 and x2 is defined as:

$$NC(x_1, x_2) = \frac{\sum_{k} \left( x_1(k) - \bar{x}_1 \right) \left( x_2(k) - \bar{x}_2 \right)}{N \sigma_1 \sigma_2},$$

where $\bar{x}_i$ and $\sigma_i$ are the mean and standard deviation of pixels in $x_i$, and $N$ is the number of pixels in a patch. Because the means are factored out, it is much more robust than the pixel-wise distance. Normalized correlation has been used extensively in fragment-based recognition, where it is assumed that viewing conditions are fixed, or alternatively that there exist examples from all viewing conditions. In other words, not matching a patch to a version of itself rotated by 90 degrees is acceptable. We would like to avoid such an assumption [9].
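The robustness to lighting changes can be illustrated with a short Python sketch. It assumes flat lists of gray levels, the population standard deviation, and division by the number of pixels; the sample patches are invented for illustration.

```python
import math

def normalized_correlation(patch_a, patch_b):
    """Normalized correlation of two equally sized patches (range -1 to 1)."""
    n = len(patch_a)
    if n == 0 or n != len(patch_b):
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    std_a = math.sqrt(sum((v - mean_a) ** 2 for v in patch_a) / n)
    std_b = math.sqrt(sum((v - mean_b) ** 2 for v in patch_b) / n)
    if std_a == 0 or std_b == 0:
        raise ValueError("correlation is undefined for a constant patch")
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(patch_a, patch_b))
    return covariance / (n * std_a * std_b)

patch = [10, 20, 30, 40]
brighter = [v + 50 for v in patch]   # same patch under brighter lighting
inverted = [50 - v for v in patch]   # contrast-inverted patch

print(normalized_correlation(patch, brighter))
print(normalized_correlation(patch, inverted))
```

The brightened patch differs from the original in every pixel, so its L1 distance is large, yet its normalized correlation with the original is 1 up to rounding.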

3 Descriptor distance

Another popular method is to compute a

descriptor of each patch, and then simply apply a

distance measure on the two descriptors. Most

commonly the descriptors are vectors in a metric

space of fixed dimensions and the distance of choice

is L1 [3].
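As an illustration of the descriptor approach, the Python sketch below uses a normalized intensity histogram as the descriptor and compares two descriptors with the L1 distance. The choice of a 4-bin histogram is an assumption made for this example; the text does not name a specific descriptor.

```python
def histogram_descriptor(patch, bins=4):
    """Normalized histogram of 8-bit gray levels (hypothetical descriptor)."""
    counts = [0] * bins
    for value in patch:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [c / len(patch) for c in counts]

def descriptor_distance(desc_a, desc_b):
    """L1 distance between two fixed-length descriptor vectors."""
    return sum(abs(a - b) for a, b in zip(desc_a, desc_b))

patch = [0, 10, 200, 255]
mirrored = list(reversed(patch))   # same pixels, mirrored layout

print(histogram_descriptor(patch))
print(descriptor_distance(histogram_descriptor(patch),
                          histogram_descriptor(mirrored)))
```

Unlike the pixel-based distance, this descriptor ignores where the pixels are, so the mirrored patch is at distance zero from the original.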

4 Probabilistic matching

A different approach is taken by methods that, instead of measuring the distance between patch representations, directly evaluate the probability that the two patches belong to the same class. This is usually limited to models in which a fixed number of patch classes, called parts, are combined in some framework. A well-known example of this kind is the family of constellation models [10].

IV. PROPOSED ALGORITHM

Matching covers the group of techniques based on similarity measures, in which the distance between the feature vector describing the extracted character and the description of each class is calculated. Different measures may be used, but the most common is the Euclidean distance. This minimum distance classifier works well when the classes are well separated, that is, when the distance between the means is large compared to the spread of each class [11]. Three common pattern arrangements used in practice are vectors, strings, and trees. In this paper vectors are used. The complete diagram of the proposed design is presented in Fig. 4.

Fig. 4. Flowchart of the proposed system.

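Decision-theoretic approaches to recognition are based on the use of decision functions. Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ represent an n-dimensional pattern vector. For W pattern classes $w_1, w_2, \ldots, w_W$, we want to find W decision functions $d_1(\mathbf{x}), d_2(\mathbf{x}), \ldots, d_W(\mathbf{x})$ with the property that, if a pattern $\mathbf{x}$ belongs to class $w_i$, then inequality (1) holds [3]. The decision boundary separating class $w_i$ from class $w_j$ is given by Eq. (2). This section decides the similarity between two images with the single function

$$d_{ij}(\mathbf{x}) = d_i(\mathbf{x}) - d_j(\mathbf{x}) = 0.$$

Thus the two images are declared similar when $d_{ij} \le 0.002$ (the similarity decision), and not similar otherwise.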

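Suppose each image is represented by a mean vector:

$$\mathbf{m}_j = \frac{1}{N_j} \sum_{\mathbf{x} \in w_j} \mathbf{x}, \qquad j = 1, 2, \ldots, W \qquad (3)$$

where:

$N_j$: the number of vectors from image $w_j$

$W$: the number of images

One way to find the closeness between images is the Euclidean distance. If the Euclidean distance is used for the closeness decision:

$$D_j(\mathbf{x}) = \lVert \mathbf{x} - \mathbf{m}_j \rVert, \qquad j = 1, 2, \ldots, W \qquad (4)$$

where $\lVert \mathbf{a} \rVert = (\mathbf{a}^T \mathbf{a})^{1/2}$ is the Euclidean norm. Selecting the smallest distance is equivalent to evaluating the functions

$$d_j(\mathbf{x}) = \mathbf{x}^T \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{m}_j, \qquad j = 1, 2, \ldots, W \qquad (5)$$

and choosing the largest value. From Equ.(2) and Equ.(5), the decision boundary between images $w_i$ and $w_j$ for a minimum distance classifier is:

$$d_{ij}(\mathbf{x}) = d_i(\mathbf{x}) - d_j(\mathbf{x}) = \mathbf{x}^T (\mathbf{m}_i - \mathbf{m}_j) - \frac{1}{2} (\mathbf{m}_i - \mathbf{m}_j)^T (\mathbf{m}_i + \mathbf{m}_j) = 0 \qquad (6)$$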
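The minimum distance classifier described above can be sketched in Python as follows. This is an illustrative sketch rather than the VB.NET program of this paper: the sample vectors are invented, and the helper names are assumptions. It computes the mean vector of each class, evaluates the decision function x^T m - (1/2) m^T m, and confirms that picking the largest decision value agrees with picking the smallest Euclidean distance.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def decision_value(x, mean):
    """d(x) = x^T m - 0.5 * m^T m for one class mean m."""
    return dot(x, mean) - 0.5 * dot(mean, mean)

def boundary_value(x, mean_i, mean_j):
    """d_ij(x) = d_i(x) - d_j(x); zero on the decision boundary."""
    return decision_value(x, mean_i) - decision_value(x, mean_j)

def classify(x, means):
    """Index of the class with the largest decision value."""
    values = [decision_value(x, m) for m in means]
    return values.index(max(values))

def nearest_mean(x, means):
    """Index of the class whose mean is closest in Euclidean distance."""
    distances = [math.dist(x, m) for m in means]
    return distances.index(min(distances))

# Hypothetical two-dimensional sample vectors for two classes.
class_0 = [[1, 1], [3, 1], [2, 4]]
class_1 = [[8, 8], [10, 8], [9, 11]]
means = [mean_vector(class_0), mean_vector(class_1)]

for x in ([3, 2], [8, 9]):
    print(x, classify(x, means), nearest_mean(x, means),
          boundary_value(x, means[0], means[1]))
```

A positive boundary value assigns the pattern to the first class and a negative value to the second, which matches the index returned by `classify`.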

In a classic paper, Fisher [12] reported the use of what was then a new technique, called discriminant analysis, to recognize three types of iris flowers (Iris setosa, virginica, and versicolor) by measuring the widths and lengths of their petals (see Figure 5). Figure 6 shows an example of two vectors extracted from the iris samples in Fig. 5. The two classes, Iris versicolor and Iris setosa, denoted w1 and w2 respectively, have mean vectors m1 = (4.3, 1.3)^T and m2 = (1.5, 0.3)^T.

The minimum distance classifier works well when the distance between the means is large compared to the spread or randomness of each image with respect to its mean.
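As a worked check, the decision boundary for the two iris mean vectors quoted above can be computed by hand from the decision function $d_j(\mathbf{x}) = \mathbf{x}^T \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{m}_j$. The sketch below assumes that $x_1$ is the petal length and $x_2$ the petal width.

```latex
\begin{align*}
d_{12}(\mathbf{x}) &= d_1(\mathbf{x}) - d_2(\mathbf{x})
  = \mathbf{x}^T(\mathbf{m}_1 - \mathbf{m}_2)
  - \tfrac{1}{2}\left(\mathbf{m}_1^T\mathbf{m}_1 - \mathbf{m}_2^T\mathbf{m}_2\right) \\
\mathbf{m}_1 - \mathbf{m}_2 &= (4.3 - 1.5,\; 1.3 - 0.3)^T = (2.8,\; 1.0)^T \\
\mathbf{m}_1^T\mathbf{m}_1 &= 4.3^2 + 1.3^2 = 18.49 + 1.69 = 20.18 \\
\mathbf{m}_2^T\mathbf{m}_2 &= 1.5^2 + 0.3^2 = 2.25 + 0.09 = 2.34 \\
d_{12}(\mathbf{x}) &= 2.8\,x_1 + 1.0\,x_2 - \tfrac{1}{2}(20.18 - 2.34)
  = 2.8\,x_1 + 1.0\,x_2 - 8.92 = 0
\end{align*}
```

A pattern with $d_{12}(\mathbf{x}) > 0$ is assigned to w1 and one with $d_{12}(\mathbf{x}) < 0$ to w2. For a hypothetical measurement $\mathbf{x} = (1.4, 0.2)^T$, $d_{12} = 3.92 + 0.2 - 8.92 = -4.8$, so the pattern is assigned to w2 (Iris setosa).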

Fig. 5. Three types of iris flowers described by two

measurements

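[Fig. 4 flowchart boxes: Start; Determine size of image from settings; Open Image; Convert Image to Gray Image; Create Excel Sheet; Convert Image to Excel; Vector of Gray Image; Save Vector in Notepad; Find m1 and m2 for two images; Find d1 and d2 for two images; Find d12 = d1 - d2; IF d12 <= 0.002 (Yes: Is similar; No: Not Similar); End]

$$d_i(\mathbf{x}) > d_j(\mathbf{x}), \qquad j = 1, 2, \ldots, W;\; j \neq i \qquad (1)$$

$$d_i(\mathbf{x}) = d_j(\mathbf{x}) \quad \text{or} \quad d_i(\mathbf{x}) - d_j(\mathbf{x}) = 0 \qquad (2)$$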



Fig. 6. Decision boundary of minimum distance classifier for

images of iris versicolor and iris setosa. The dark dot and square

are the means.

V. IMPLEMENTATION AND RESULTS

The image processing algorithms discussed above are implemented in Visual Basic.Net under the Windows 7 Ultimate operating system. The design runs on an LG laptop with a 2 GHz dual-core CPU and 2 GB of RAM. Visual Basic.Net is a powerful and effective tool for developing applications compatible with the Windows environment. It provides an integrated development environment for creating easy-to-use solutions, running commands, viewing output, editing, and managing files and variables. Screens and windows are designed with mouse clicks and movements, in the same way that boxes and circles are drawn in drawing programs. .NET is essentially a framework for software development.

In this section we describe the operations that are fundamental to digital image processing. These operations are divided into two lists: the file list, which contains the standard commands (open, save, save vector, and exit), and the process list, which contains all operations of the program (convert to gray and to binary, vector of gray and binary, create Excel sheet, convert image to Excel file, and measurement of similarity degree). We have evaluated our methods on several different standard images, as shown in Table 2.

The program starts by selecting the image size from the settings, then opening the images, converting them to gray and finding the vectors of those images. The vector can also be saved in (.dat) or (.txt) format to be used in other techniques, e.g. as input data to a neural network. Using these vectors we can determine whether the two images are similar or not. These operations and their execution are also illustrated by the GUI screenshots in Appendix-A.
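The sequence of operations described above can be sketched in Python as follows. This is not the VB.NET program of this paper but a minimal stand-in under stated assumptions: the image is a small hand-written grid of RGB tuples, the gray conversion uses the common 0.299/0.587/0.114 luminance weights (the paper does not state its formula), the binary threshold of 128 is hypothetical, and the sheet is written as CSV text, which Excel can open, rather than as a native workbook.

```python
import csv
import io

# Assumed luminance weights; the paper does not specify its gray formula.
GRAY_WEIGHTS = (0.299, 0.587, 0.114)

def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray levels."""
    wr, wg, wb = GRAY_WEIGHTS
    return [[round(wr * r + wg * g + wb * b) for (r, g, b) in row]
            for row in rgb_image]

def to_binary(gray_image, threshold=128):
    """Map gray levels to 0/1 using a hypothetical fixed threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray_image]

def write_sheet(matrix, out):
    """Write one image row per sheet row and one pixel per cell (CSV)."""
    csv.writer(out, lineterminator="\n").writerows(matrix)

def to_vector(matrix):
    """Flatten the image matrix row by row into a single vector."""
    return [v for row in matrix for v in row]

def write_vector(vector, out):
    """Save the vector as plain text, one value per line."""
    out.write("\n".join(str(v) for v in vector) + "\n")

# A 2 x 2 test image: white, black, red, blue.
rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]

gray = to_gray(rgb)
sheet = io.StringIO()          # stands in for a .csv file opened in Excel
write_sheet(gray, sheet)
vector_file = io.StringIO()    # stands in for a .txt or .dat file
write_vector(to_vector(gray), vector_file)

print(sheet.getvalue())
print(to_binary(gray))
```

Replacing the `io.StringIO` objects with files opened in text mode would save the sheet and the vector to disk.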

Our experiments showed that such a system is a powerful tool that provides programmers with a wide range of solutions and options for exporting data to Excel. Although several methods exist to accomplish the same result, they might not necessarily be appropriate for the task at hand. While some techniques offer better customization of reports, others provide simpler, more efficient syntax. This project gave us the opportunity to work with the tools and techniques we are studying and gave us valuable insight into how digital image processing integrates into the Excel workbook. Programming an Excel report is a common task for most programmers. Customers often want to be able to view, sort, and filter their data, and Excel is usually the tool they master best. Excel spreadsheets are used to manage and analyze data efficiently. Excel provides many functions and formulas that not only help to manage data records, but also make it possible to analyze all data in a constantly changing business environment. Excel is spreadsheet software that helps us organize and chart large amounts of data [13].

The experimental results show how image processing can be done in Microsoft Excel. The paper also explains how to read any image and obtain its pixel values. Thus it is shown that the whole set of image processing operations, such as reading, processing and printing of an image, can be done in Excel. Above all, this paper stresses the potential of Microsoft Excel as a scientific learning tool.

VI. CONCLUSIONS

In this paper, we have introduced a similarity measure that applies the Euclidean distance through decision functions. A system to acquire, analyze and compare two images using the minimum distance classifier algorithm has been proposed in this work. The approach is founded on similarity analysis, in which the smallest distance (dij) between images gives a good indication of similarity. The program design also makes it possible to resize images efficiently. In the context of the larger research project, this project has identified some possible avenues for future work. Future work will focus on applying this work to other techniques and applications. Adding a large database that stores the images found to be similar and those found to be different, so that they can be compared directly, is a promising direction for pattern recognition. Almost everyone has some experience with Excel. Therefore, data processing in Excel can be implemented easily, not only by people with specialist knowledge but also by people without it.

REFERENCES

[1] http://heritage.stsci.edu/commonpages/infoindex/ourimages/color_comp.html

[2] Thomas M. Lehmann, Claudia Gönner, Klaus Spitzer, "Survey: Interpolation Methods in Medical Image Processing", IEEE Transactions on Medical Imaging, Vol. 18, No. 11, November 1999.

[3] Rafael Gonzalez, Richard E. Woods, "Digital Image Processing /2E", Prentice Hall, Upper Saddle River, New Jersey 07458, 2002.

[4] Sheri Graves, "Where is Microsoft Excel Used?", http://ezinearticles.com, September 14, 2007.

[5] Ebtesam N. AlShemmary, "Fingerprint Image Enhancement and Recognition Algorithms", PhD Thesis, University of Technology, Baghdad, 2007.

[6] Musa J. Jafar, "A Tools-Based Approach to Teaching Data Mining Methods", Journal of Information Technology Education, Vol. 9, Canyon, TX, USA, 2010.

[7] http://ar.wikipedia.org/wiki

[8] Yu Shiu, Hong Jeong, C.C. Jay Kuo, "Similarity Matrix Processing for Music Structure Analysis", AMCMM '06, Santa Barbara, California, USA, October 27, 2006.

[9] IDL, "Image Processing", 0509IDL71IP, ITT Visual Information Solutions, May 2009.

[10] Ian T. Young, Jan J. Gerbrands, Lucas J. van Vliet, "Fundamentals of Image Processing", Delft University of Technology, 1997.

[11] T. Y. Young, K-S Fu, "Handbook of Pattern Recognition & Image Processing", Academic Press, 1986.

[12] R. A. Fisher, "The Use of Multiple Measurements in Taxonomic Problems", Ann. Eugenics, Vol. 7, Part 2, pp. 179-188, 1936.

[13] M. MacDonald, "Excel 2007: The Missing Manual", Pogue Press, O'Reilly, ISBN: 978-0-596-52759-4, 2006.

TABLE II

Experimental Results of Proposed Method

Columns: Original Image; Binary Image; Excel File of Image (10% Zoom)



Appendix-A: Program implementation and examples of acquired results

Fig. A-1. Start project screenshot

Fig. A-2. Project GUI screenshot

Fig. A-3. File list screenshot

Fig. A-4. Process list screenshot

Fig. A-5. Choose size of the image screenshot

Fig. A-6. Opening first and second image screenshot

Fig. A-7. Convert first and second image to gray screenshot

Fig. A-8. Convert first and second image to gray with their vectors (ticking the check boxes shows the vectors on the right) screenshot

Fig. A-9. Create Excel sheet of first and second image screenshot

Fig. A-10. Implementation of converting image to Excel file and its output

Fig. A-11. Two examples of similar images

Fig. A-12. An example of images that are not similar

Fig. A-13. Implementation command of saving image vector in Notepad

Fig. A-14. Dialog box to save image vector in another format

Fig. A-15. Storing the image vector in (.dat) file format
