smart snap - report
Smart Glass
Submitted in partial fulfillment of the requirements
of the degree of
Bachelor of Engineering
by
Jay Shah, 60003100048
Pooja Shah, 60003100043
Tapan Desai, 60003100012
Supervisors:
Prof. Vinaya Sawant
Prof. Anuja Nagare
Information Technology
Dwarkadas J. Sanghvi College of Engineering, University Of Mumbai
2013-2014
Project Report Approval for B. E.
This project report entitled “Smart Glass” by Jay Shah, Pooja Shah and
Tapan Desai is approved for the degree of Bachelor of Engineering in Information Technology.
Internal Guide (Prof. Vinaya Sawant)
Internal Examiner External Examiner
Vice Principal (Acad) and HOD, IT Dept. Principal
(Dr. A. R. Joshi) (Dr. Hari Vasudevan)
Declaration
We declare that this written submission represents our ideas in our own words and
where others' ideas or words have been included, we have adequately cited and
referenced the original sources. We also declare that we have adhered to all principles
of academic honesty and integrity and have not misrepresented or fabricated or
falsified any idea/data/fact/source in our submission. We understand that any violation
of the above will be cause for disciplinary action by the Institute and can also evoke
penal action from the sources which have thus not been properly cited or from whom
proper permission has not been taken when needed.
Jay Shah, 60003100048
----------------------------------------- -----------------------------------------
Pooja Shah, 60003100043
----------------------------------------- -----------------------------------------
Tapan Desai, 60003100012
----------------------------------------- -----------------------------------------
(Name of student and Roll No) (Signature)
Date: 28/04/2014
ACKNOWLEDGEMENTS
We are highly indebted to Dwarkadas J. Sanghvi College of Engineering for their
guidance and constant supervision as well as for providing necessary information
regarding the project & also for their support in completing the project.
We would like to express our heartfelt gratitude towards our guide Prof. Vinaya Sawant
and our co-guide Prof. Anuja Nagare for their kind co-operation and encouragement,
which helped us in the completion of this Synopsis.
We would like to express our special gratitude and thanks to faculty of Information
Technology Department for giving us such attention and time. Our thanks and
appreciations also go to our colleagues in developing the project and people who have
willingly helped us out with their abilities.
Jay Shah
Pooja Shah
Tapan Desai
Table of Contents
1. Introduction
2. Literature Review
3. Problem Definition
4. Proposed System
5. Project Management
5.1 Schedule
5.2 Project Resources
5.3 Project Estimates
5.4 Risk Mitigation Strategy
6. Project Design
6.1 System Architecture
6.2 Module/Component Description
6.3 User Interface Design
7. Implementation
7.1 Modules/Component Description
7.2 Module-wise Algorithm
8. Experiments and Project Testing
8.1 Test plan
8.2 Test cases
8.3 Methods used
8.4 Test Results
9. Maintenance
9.1 User Manual
9.2 Constraints
10. Conclusion and Future Scope
11. References and Bibliography
List of Figures
Figure 2.1: Screenshot for existing system
Figure 3.1: Use your mobile
Figure 3.2: Snap a picture
Figure 3.3: Get related content
Figure 4.1: Architecture of Proposed System
Figure 5.1: Gantt Chart for Project Schedule
Figure 6.1: Actual System Architecture based on modules
Figure 6.2: Use Case Diagram
Figure 6.3: Activity Diagram
Figure 6.4: State transition to capture image
Figure 6.5: State transition for text conversion
Figure 6.6: Deployment Diagram
Figure 6.7: Data Flow Diagram
Figure 6.8: Splash Screen
Figure 6.9: Smart Snap of Play Store
Figure 6.10: Main Menu
Figure 6.11: Slide Menu
List of Tables
Table 1: COCOMO Model
Table 2: Risk Mitigation Strategy
Table 3: Risk Sheet
Table 4: Test Case
Project Report: Smart Glass
Chapter 1
1. Introduction
Mobile photo + Image Recognition = Identification
Many existing systems provide text-based results for news, and e-commerce portals let users
buy products through text search. What both kinds of system have in common is text-based search.
Our project aims to eliminate text-based search and give results based on visual search.
Our project adds context to images. Its image recognition features connect images to
relevant information. The user snaps a picture; the application tells the user what's in it.
Now that communication has moved from text to images, technology needs to keep up. The project
is based on visual search technology. The application is an Android application that
identifies objects from images. Once objects are identified, it can provide additional information
about them, e.g. related web pages, descriptions and current statistics.
The application allows the user to capture an image and search for it over the web using our web engine,
or simply execute the actions they have specified in their own spreadsheet. The application is therefore
customizable to user requirements. Its main features include Document
Capture, News Aggregator, Real-Time Information and Visual Commerce.
Chapter 2
2. Literature Review
This section consists of the research conducted by the team. It includes a cursory preface of the existing
system and its drawbacks. It also gives an idea about all potential frameworks that could have been
used in building this application.
2.1 Existing Systems
Google Goggles:
Scan barcodes using Goggles to get product information
Scan QR codes using Goggles to extract information
Recognize famous landmarks
Translate by taking a picture of foreign language text
Add Contacts by scanning business cards or QR codes
Scan text using Optical Character Recognition (OCR)
Recognize paintings, books, DVDs, CDs, and just about any 2D image
Solve Sudoku puzzles
Find similar products
Smart Glance:
Support for secured cloud hosted service & on-premises Installation
Supports multiple mobile platforms
View Data in Rich Graphical formats
Analyse by Charts – trend & Column, Pie & Donut
Zoom In/Out on charts for better view
Tools to compare two elements instantly
Tools to compare against target/benchmark
Support for offline access
Localization in multiple languages – Spanish, German, Italian, French, Chinese &
Japanese.
Figure 1: Screenshots of existing system
2.2 Pitfalls
Google Goggles:
Redirects to the Google search engine while scanning an image instead of giving the
results on the website.
No visual commerce present. Scans the image and provides the web link for the product.
Users cannot customize the search based on their requirements. Gives all the
information, instead of providing a link which the user actually needs.
Smart Glance:
Provides information for all the machines instead of narrowing the search down to the
machine required.
Doesn’t include any other features.
Chapter 3
3. Problem Definition
Smart Glass is an Android application which takes in images using the mobile camera. The application uses
optical character recognition and image processing to scan the images and gives results based on the search.
The application has a number of features: Visual Commerce, Visual Search, real-time object statistics,
language translator, user-specified spreadsheets, form processing and document capture.
The application will run on all versions of the Android operating system. The user can get information based on the
image they’ve searched from a dedicated database.
Document Capture:
This feature converts a printed document to editable text. Scanned documents can be exported to
various formats. It saves time and resources, as the user doesn’t have to retype
content from images.
Visual Commerce:
The user can purchase products from eBay and Amazon directly from the application. The user just has
to take a picture of the product they want to purchase and the application will give them results.
News Aggregator:
The application will give results based on the images the user clicks. The application will retrieve
information from the dedicated databases which will get populated automatically with the latest content
every day.
Real-Time Information:
This feature will give real-time results for the captured products. This information is retrieved from a
user-defined spreadsheet or database. This system can be used to retrieve information from ERP systems as
well. This feature gives users complete customization over the results.
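The Real-Time Information lookup described above can be illustrated with a minimal sketch. It assumes the user's spreadsheet has been exported as CSV; the column names (`label`, `speed_rpm`, `temperature_c`), the sample rows and the helper names `normalise` and `lookup` are all hypothetical and are not taken from the project.

```python
import csv
import io

# Hypothetical spreadsheet contents exported as CSV (assumed column names).
SAMPLE_SHEET = """label,speed_rpm,temperature_c
LATHE-01,1200,65
FURNACE-02,0,910
"""


def normalise(text):
    """Remove all whitespace and uppercase, so spacing/case noise from OCR is tolerated."""
    return "".join(text.split()).upper()


def lookup(ocr_text, sheet_csv):
    """Return the first row whose label matches the OCR text, or None if no row matches."""
    key = normalise(ocr_text)
    for row in csv.DictReader(io.StringIO(sheet_csv)):
        if normalise(row["label"]) == key:
            return row
    return None


# Example: text recognised from a machine's name plate.
print(lookup("furnace-02 ", SAMPLE_SHEET))
```

Exact matching after normalisation is the simplest possible choice; a real deployment would probably need fuzzy matching to cope with OCR misreads.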
The basic idea of the application is:
Figure 2: Use your mobile Figure 3: Snap a picture Figure 4: Get related content
The user clicks a photo from his/her smart phone. The image is scanned using an OCR. The results from
the OCR are passed as a query to our database. This database is populated everyday using crawlers. The
data matched in the database is passed back to the mobile device and displayed to the user.
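The step in which the OCR output is passed as a query to the database might look roughly like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for the project's MySQL server; the `articles` table, its columns, the sample rows and the function names are assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the crawler-populated news database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (title TEXT, url TEXT, published TEXT)")
    conn.executemany(
        "INSERT INTO articles VALUES (?, ?, ?)",
        [
            ("Android update released", "http://example.com/a", "2014-04-20"),
            ("Stock markets close higher", "http://example.com/b", "2014-04-21"),
            ("New Android phones announced", "http://example.com/c", "2014-04-22"),
        ],
    )
    return conn


def search(conn, ocr_text, limit=10):
    """Return (title, url) rows whose title contains every OCR word, newest first."""
    words = ocr_text.split()
    if not words:
        return []
    # One parameterised LIKE clause per word keeps the query safe from injection.
    where = " AND ".join("title LIKE ?" for _ in words)
    params = [f"%{word}%" for word in words] + [limit]
    sql = (
        f"SELECT title, url FROM articles WHERE {where} "
        "ORDER BY published DESC LIMIT ?"
    )
    return conn.execute(sql, params).fetchall()


conn = build_demo_db()
for title, url in search(conn, "android"):
    print(title, url)
```

Ordering by publication date is only one of the ranking parameters the report mentions; relevancy and priority are left out of this sketch.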
Chapter 4
4. Proposed System
4.1 Draft of Proposed System
Project plan is basically a proposed approach to creating the application.
The basic concept of the application is that the user won’t have to type everything they want to
search. The user can just click the image of the object they want to search and the application will
give results directly.
This software can provide information such as machine specification, maintenance statistics,
production capacity, effectiveness etc. by just placing the camera on the machine. It can also help
detect real time machine speed or the temperature of the furnace just by connecting it to the
factory’s ERP system.
The application can be used by anyone. The interface is user-friendly and the user will get all the
results directly in the application. The user won’t be redirected to a web browser. Simply
pointing the camera at a brand name will retrieve information about the brand, including
its specialization, web pages and various branches.
Major technologies to be used:
• OCR Engine
• MATLAB
• PHP
• MySQL
• Eclipse
Today, more images than text get uploaded to the Internet. This application makes use of images and
retrieves news and important data as per the user’s requirement. It helps the user manage their
own spreadsheet through the app. The user can specify the action that they want performed when
a particular object is captured. After capturing a particular object, the user can retrieve the latest news on
that object. The user can also change the file format using this application. Users can even buy products
online using it.
4.2 Expected Modules:
The basic project plan is dividing the application into four modules:
Document Capture: This module will allow user to capture documents and edit them. The user
would be allowed to save them in a PDF format. This is done using an OCR engine. The text which
is read in the document is then sent to the server. The server will then send back an edited form of
the document.
Visual Search: This module will again be created using an OCR engine. This module will allow
user to click photos of nearby texts and retrieve results based on the text from the database. The
project plan includes creating our own database which will be populated using crawlers.
Visual Commerce: The basic idea of this module is to allow the user to shop using our application.
The application will be connected to various APIs from top e-commerce portals. Whenever the
user takes a photo of a nearby object they want to buy, the application will display products from
e-commerce portals.
Real-Time information: This module will include giving real-time information from a connected
spreadsheet. This module will make the application user customisable.
4.3 Architecture for Proposed System:
Figure 5: Architecture of Proposed System
The proposed architecture is based on a mobile device which is the central unit of the
application. The mobile device would include the OCR engine and the required Camera
Activity. The application then connects to the services of individual modules.
Visual Search: For this module the application is connected to the search engine which is
integrated with the application database.
Document Capture: This uses the Document Translator API.
Real-Time Information: The application is connected to the web services which are then
connected to the Spreadsheet and the ERP system.
4.4 Advantages of Proposed System
The application is an all-in-one system. Each module has its own advantage for the user.
They are as follows:
Document Capture: The user doesn’t have to type or re-type an entire document. The user has just
to click an image of a printed document and get results based on the text recognized. The main
advantage will be that the user will be saving a lot of time. Also the application converts the new
document into PDF format. This document could be shared using all major social networking
platforms.
Visual Search: The user doesn’t have to open a web browser and search for the news. The
main advantage of this module is that it allows the user to retrieve news within seconds. All the news
is aggregated and stored in a database, which again saves time for the user.
Visual Commerce: This module will allow the user to shop online without searching for an item they
like across all the e-commerce portals. The top e-commerce portals are aggregated in this one
application. Thus, it acts as a one-stop shopping solution. The main feature will be visual search
in e-commerce instead of traditional text-based search. The user can just take a photo of
the object they want to acquire and get the product price and availability from e-commerce portals.
Real-Time Information: This module will allow user to customise the application as per their
requirement. The application can be linked to user created spreadsheets or an ERP system. The
major advantage is that the user will get real-time information from the application.
Chapter 5
5. Project Management
Project management is the discipline of planning, organizing, securing and managing resources to
achieve specific goals.
5.1 Schedule
The schedule helped us track the project's milestones, activities, and deliverables, with start and
finish dates.
The project scope is defined and the appropriate methods for completing the project are determined.
The durations for the various tasks necessary to complete the work are listed.
The Gantt chart for our project is as follows:
Figure 6: Gantt Chart for Project Schedule
5.2 Project Resources
The project will require a limited amount of resources.
Hardware:
A mobile device running on the Android platform. The device should have a working camera.
A computer which could serve as a server for Database. The server needs to run 24x7.
Software:
Android SDK for development along with an Android IDE.
Google API
APIs from e-commerce portals to link them to the application.
Google Spreadsheets.
Other Requirements:
The mobile network should have an active internet service.
The mobile device should support Google Play Store.
5.3 Project Estimates
Estimation is basically identifying and acquiring necessary resources such as equipment,
materials, manpower etc. required for accomplishing the project successfully. The estimation
techniques used for our project are as follows:
Lines of Code
Lines of code (LOC) is a software metric used to measure the size of a computer program by
counting the number of lines in the text of the program's source code. It is typically used to
predict the amount of effort that will be required to develop a program, as well as to estimate
programming productivity or maintainability once the software is produced.
For our project, the estimated lines of code = 6.5 K
The above mentioned Lines of Code (LOC) include the following:
• Authentication to access the tool
• Code for adding/deleting questions and updating keywords of the questions
• Logic for highlighting the important sentences in the document
• Frequency calculation of the keywords in the document for report preparation
• Graph generation code to display the overall performance of the class
COCOMO Estimation Model
The Constructive Cost Model (COCOMO) is an algorithmic software cost estimation model
that computes software development effort and cost as a function of program size. Program
size is expressed in estimated thousands of source lines of code (SLOC). Basic COCOMO is
good for quick estimate of software cost.
COCOMO applies to three classes of software projects:
Organic projects: "small" teams with "good" experience working with "less than rigid"
requirements
Semi-detached projects: "medium" teams with mixed experience working with a mix of rigid
and less than rigid requirements
Embedded projects: developed within a set of "tight" constraints. It is also combination of
organic and semi-detached projects.
The basic COCOMO equations take the form:
Effort applied: $E = a \cdot (\mathrm{KLOC})^{b}$ [man-months]
Development time: $D = c \cdot E^{d}$ [months]
People required: $P = E / D$ [count]
where KLOC is the estimated number of delivered lines of code for the project, expressed in thousands.
The coefficients a, b, c and d are given in the following table:
Software Project    a      b      c      d
Organic             2.4    1.05   2.5    0.38
Semi-detached       3.0    1.12   2.5    0.35
Embedded            3.6    1.20   2.5    0.32
Table 1: COCOMO Model
Estimates of Effort, Cost, Duration:
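Using the semi-detached coefficients (a = 3.0, b = 1.12, c = 2.5, d = 0.35) and KLOC = 6.5:
$E = a \cdot (\mathrm{KLOC})^{b} = 3.0 \cdot (6.5)^{1.12} = 24.411$ man-months
$D = c \cdot E^{d} = 2.5 \cdot (24.411)^{0.35} = 7.65$ months
$P = E / D = 24.411 / 7.65 \approx 3$ people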
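The arithmetic above can be checked with a few lines of Python. This is only a sketch of the basic COCOMO formulas with the coefficients from Table 1; the function and variable names are illustrative and not part of the project code.

```python
# Basic COCOMO coefficients (a, b, c, d) as listed in Table 1.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, project_class):
    """Return (effort in man-months, duration in months, people required)."""
    a, b, c, d = COEFFICIENTS[project_class]
    effort = a * kloc ** b          # E = a * (KLOC)^b
    duration = c * effort ** d      # D = c * E^d
    people = effort / duration      # P = E / D
    return effort, duration, people


effort, duration, people = basic_cocomo(6.5, "semi-detached")
print(f"E = {effort:.3f} man-months")
print(f"D = {duration:.2f} months")
print(f"P = {people:.2f} people")
```

Rounding P to the nearest whole number gives the team size of 3 used in this report.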
5.4 Risk Mitigation Strategy
RISK                                      CATEGORY    PROBABILITY    IMPACT
Insufficient Accuracy                     PS          High           1
Inadequate knowledge about application    BU          Low            3
Table 2: Risk Mitigation Strategy
Impact Values:
1-catastrophic 2-critical
3-marginal 4-negligible
A project team begins by listing all risks in first column of table. Each risk is
categorized in the second column (PS-Project Risk, DE-Development Risk, BU-Business Risk,
and TE- Technical Risk). The probability of occurrence of each risk is entered in the next
column of the table.
RMMM Plan for each risk:
Risk Information Sheet
Project Name : Smart Glass
Risk Id:- 001 Date :- 4/8/2013 Probability :- High Impact :- catastrophic
Origin :- Jay Shah Assigned To :- Tapan Desai
Description :-
The chance of loss due to this risk could be 65%. Lack of accuracy, caused by a blurred or tilted image,
will lead to incorrect results and may crash the application.
Mitigation/Monitoring :-
The user can improve accuracy by using a camera with the specified megapixel resolution on their
Android smartphone, and should capture images where there is sufficient brightness.
Management :-
Once risk becomes active then we will provide the user with the facility to edit the text if he or
she feels that the converted image is incorrect.
Status :- Still left to implement.
Approval :- Vinaya Sawant Closing Date :- 17-10-2013
Table 3: Risk Sheet
Risk Information Sheet
Project Name : Smart Glass
Risk Id:- 002 Date :- 15/8/2013 Probability :- Low Impact :- serious
Origin :- Tapan Desai Assigned To :- Pooja Shah
Description :-
The chance of loss is 5%. If the user does not have prior knowledge of how
the application works, he or she may switch to some other application if they find one.
Mitigation/Monitoring :-
The GUI (graphical user interface) of the application would be simple enough for the user to
understand the basic operations that are to be carried out for successful working of the
application.
Management :-
Once risk becomes active then we will provide the user with the Help menu where he would
find all the steps that are to be carried out for using the application efficiently.
Status :- Still left to implement.
Approval :- Vinaya Sawant Closing Date :- 17-10-2013
Table 4: Risk Sheet 2
Chapter 6
6. Project Design
6.1 System Architecture
The main aim of the system is to retrieve data from the captured image. The data is then passed
on to the servers. The servers then check for relevant information of the searched module. This
information is then passed on to the mobile device which the user can read.
The architecture is based on an OCR engine and Camera Activity running on a mobile device.
The architecture is divided into units, one for each type of module:
a. Visual Search: The Visual Search module is based on database technology, crawlers and
OCR engines. The user captures an image of the text he/she wants to search. This text
is then read by the OCR engine and passed as a query to the server. The server
hosts a MySQL database which is populated daily by two crawlers. The database is then
searched for the query. Once results are found, they are sent back to the mobile device.
The response is quick and the results are displayed within a span of 3 seconds. The main
feature of Visual Search is that it does not use any ready-made search engine but aggregates news
in our own database and displays results to the user.
b. Document Capture: The text in this module is read by the OCR Engine. The recognized
text is then passed to the server. There is another OCR present in the server to improve the
text accuracy. The retrieved text is then passed back to the user mobile device in an editable
format. After the text is edited the document is converted into PDF Format using Document
Translator. The saved document can also be shared on all the social networking platforms.
c. Real Time Information: This feature makes the application customisable. The application
can be linked to user-defined spreadsheets or an on-site ERP system. This module again uses
the OCR engine to read the text from the image. The text is then matched against the linked
spreadsheet or ERP system and the results are displayed to the user. The spreadsheet can
contain any data filled in by the user. The application sends a query to the
spreadsheet and retrieves information from it. Real-time information works well for
large factories with heavy machinery, as information about the
machines can be retrieved quickly.
d. Visual Commerce [6]: Visual Commerce is a unique blend of Visual Search and
E-Commerce. This module is implemented using the Viola-Jones algorithm. In this module
the user doesn’t need to take a photo of the required object. Instead they just have to hover
their camera over the object. The object is recognized from a data set using the algorithm.
Once the object is recognized, it is searched for in the top e-commerce portals using their APIs.
These APIs are linked with the application beforehand to give easy access to the portals.
Matching products are then displayed to the user in an aggregated form,
with their price and availability. The data set can be expanded over time
using various machine learning algorithms.
e. Database: The database is a MySQL database populated via crawlers. It is
created as an aggregation of news articles. Two crawlers run daily on the
database. The first crawler runs through RSS links and stores the URLs in the database. The
other crawler then runs through those links and stores all the news articles. The search is
based on a few parameters, which include relevancy, precision, date of the article and
priority.
f. OCR Engine: The OCR engine used is Tesseract [2]. Tesseract is an optical character
recognition engine for various operating systems. It is free software, released under the
Apache License, Version 2.0, and development has been sponsored by Google since 2006.
Tesseract is considered one of the most accurate open source OCR engines currently
available. If Tesseract is used to process right-to-left text such as Arabic or Hebrew, the results
are ordered as though it is left-to-right text.
Tesseract is suitable for use as a backend engine.
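The two-crawler arrangement described under item (e) can be sketched as two small stages. The example below is an assumption-laden illustration, not the project's crawler: it parses an inline RSS 2.0 sample instead of fetching real feeds, uses a plain dictionary in place of the MySQL tables, and takes the page-fetching function as a parameter so that no network access is needed. All names (`SAMPLE_FEED`, `collect_links`, `store_articles`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Inline RSS 2.0 sample standing in for a real news feed (hypothetical URLs).
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Demo feed</title>
  <item><title>First story</title><link>http://example.com/1</link></item>
  <item><title>Second story</title><link>http://example.com/2</link></item>
</channel></rss>"""


def collect_links(feed_xml):
    """Stage 1: extract article URLs from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("link") for item in root.iter("item")]


def store_articles(links, fetch, store):
    """Stage 2: fetch each URL not yet stored and save its body keyed by URL."""
    for url in links:
        if url not in store:
            store[url] = fetch(url)
    return store


links = collect_links(SAMPLE_FEED)
# A fake fetcher is used here; a real crawler would download the page instead.
articles = store_articles(links, lambda url: f"body of {url}", {})
print(articles)
```

Keeping link collection separate from article download means a failed download can be retried later without re-reading the feeds.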
Module wise System Architecture:
Figure 7: Actual System Architecture based on modules
6.2 Module/Component Description
UML is the de facto standard notation for software design. It can be used for drawing diagrams and
also to generate code, apply design patterns, mine requirements and perform impact analysis.
UML is flexible and UML models are portable. UML is a well-known visual language that can
capture much of the information that one needs to communicate about the system.
Use Case Diagram: A use case diagram at its simplest is a representation of a user's interaction
with the system that depicts the specifications of a use case. A use case diagram can portray the
different types of users of a system and the various ways that they interact with the system. This
type of diagram is typically used in conjunction with the textual use case and will often be
accompanied by other types of diagrams as well.
Figure 8: Use Case Diagram
Activity Diagram: Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In the Unified Modelling
Language, activity diagrams are intended to model both computational and organisational
processes (i.e. workflows). Activity diagrams show the overall flow of control.
Figure 9: Activity Diagram
State Transition Diagram: A state diagram is a type of diagram used in computer science and
related fields to describe the behavior of systems. State diagrams require that the system described
is composed of a finite number of states; sometimes, this is indeed the case, while at other times
this is a reasonable abstraction. Many forms of state diagrams exist, which differ slightly and
have different semantics.
Figure 10: State transition to capture image
Figure 11: State transition for text conversion
Deployment Diagram: Deployment diagrams are used to visualize the topology of the physical
components of a system where the software components are deployed.
Deployment diagrams are thus used to describe the static deployment view of a system. Deployment
diagrams consist of nodes and their relationships.
Figure 12: Deployment Diagram
Class Diagram: The class diagram is a static diagram. It represents the static view of an
application. Class diagram is not only used for visualizing, describing and documenting different
aspects of a system but also for constructing executable code of the software application.
The class diagram describes the attributes and operations of a class and also the constraints
imposed on the system. The class diagrams are widely used in the modelling of object oriented
systems because they are the only UML diagrams which can be mapped directly with object
oriented languages.
Figure 15: Class Diagram
Component Diagram: Component diagrams are different in terms of nature and behaviour.
Component diagrams are used to model physical aspects of a system.
Now the question is: what are these physical aspects? Physical aspects are elements such as
executables, libraries, files, documents etc. which reside in a node.
So component diagrams are used to visualize the organization and relationships among
components in a system. These diagrams are also used to make executable systems.
Figure 16: Component Diagram
Data Flow Diagram: A data flow diagram (DFD) is a graphical representation of the "flow" of
data through an information system, modelling its process aspects. Often they are a preliminary
step used to create an overview of the system which can later be elaborated. DFDs can also be
used for the visualization of data processing (structured design).
A DFD shows what kinds of information will be input to and output from the system, where the
data will come from and go to, and where the data will be stored. It does not show information
about the timing of processes, or information about whether processes will operate in sequence
or in parallel (which is shown on a flowchart).
6.3 User Interface Design
Figure 14: Splash Screen Figure 15: Smart Snap of Play Store
Figure 16: Main Menu Figure 17: Slide Menu
Figure 18: Working of Visual Search. The selection screen.
Fig. 19: Visual Search results Fig. 20: News articles based on Visual search
Fig. 21 Share option in the application Fig. 22 Loading Screen for Document Capture
Fig. 23 Real-Time Information Fig. 24 Results for Real-Time Information
Chapter 7
7. Implementation
Modules/ Component Description:
Mobile Device:
This module contains three sub-modules and covers all the features that run on the user's
mobile phone. It contains the GUI and the OCR; the GUI contains one further module.
The mobile device connects the user's input to the backend.
GUI:
The GUI is the interface with which the user interacts. It is designed for a high-end
Android device and includes a slide menu. Visual Search has a UI that connects to
the database. Document Capture is designed to connect to the server and includes
a loading screen. Real Time Information connects to user-defined spreadsheets.
OCR:
This module recognises the data in an image. It retrieves the data from the image and
sends it back to be displayed to the user. This module works on
the mobile device of the user. A second OCR instance runs on the server for the Document
Capture module. The OCR engine used in the project is Tesseract.
Capture Image:
This module helps the user to capture images through the mobile device. The images captured
through this module are used for further processing. This module provides the input to the
whole system. The OCR module works on the captured image.
Image Processing [1]:
This module processes the image captured by the user and retrieves the images it contains.
The extracted images are then used to retrieve information according to the user's
request. The application uses the Viola-Jones algorithm to process the captured image
in Visual Commerce.
Visual Search:
In Visual Search, the images extracted or captured by the user are used for buying the products
in that image from sites such as Amazon and eBay. This module retrieves information about those
products from the shopping sites where the user can buy them.
Search Engine:
This module is a search engine from which data is extracted based on
the user's request. It is connected to a database that holds images, information and
news about products. The database is built on MySQL and is populated by two crawlers [5].
It stores the files and details about products, websites and RSS feeds covering
various fields, and data is retrieved from it according to the user's request.
Web Service:
This module establishes a link between the data source and the Android application. Built on a PHP
framework, it sends queries to the MySQL server and retrieves the query results, which are then
encoded in JSON format and passed to the application.
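The query-and-encode step described above can be sketched as follows. This is a minimal illustration only, not the project's code: it is written in Python rather than PHP, it uses an in-memory SQLite database as a stand-in for the MySQL server, and the table name `products`, its columns and the sample rows are assumptions made for the example.

```python
import json
import sqlite3


def build_demo_db():
    """Create a small in-memory database (hypothetical stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, url TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [
            ("Smart Phone", "http://example.com/phone"),
            ("Phone Case", "http://example.com/case"),
            ("Laptop", "http://example.com/laptop"),
        ],
    )
    return conn


def fetch_products_as_json(conn, keyword):
    """Run a parameterised query and return the matching rows as a JSON string."""
    cur = conn.execute(
        "SELECT name, url FROM products WHERE name LIKE ? ORDER BY name",
        (f"%{keyword}%",),
    )
    rows = [{"name": name, "url": url} for name, url in cur.fetchall()]
    return json.dumps({"count": len(rows), "results": rows})
```

The application would then decode the returned string, for example with `json.loads(fetch_products_as_json(build_demo_db(), "phone"))`. Passing the keyword as a query parameter rather than concatenating it into the SQL text avoids SQL injection.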
Spreadsheets:
Instead of hard-coding the actions for specific keywords, the application lets the user
specify their own actions for specific keywords by configuring a spreadsheet. To configure it, the user
enters the web publish URL provided by Google Spreadsheet. The
spreadsheet has columns such as products, action and a URL belonging to that particular action.
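A keyword lookup against such a spreadsheet could look roughly like the sketch below. It assumes the published sheet is available as CSV text with the header row `products,action,url`; the sample rows, the function names and the omission of the download step are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical content of a published spreadsheet, exported as CSV.
SAMPLE_SHEET = """products,action,url
Milk,reorder,http://example.com/reorder/milk
Printer,support,http://example.com/support/printer
"""


def load_actions(csv_text):
    """Map each product keyword (lower-cased) to its (action, url) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["products"].strip().lower(): (row["action"].strip(), row["url"].strip())
        for row in reader
    }


def action_for(keyword, actions):
    """Return the configured (action, url) for a scanned keyword, or None."""
    return actions.get(keyword.strip().lower())
```

With this layout the user changes the application's behaviour by editing rows in the sheet; no keyword is compiled into the application itself.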
Document Translator
This provides a convenient way to process an image and separate it into text and
images. It saves the time and effort of retyping and also makes it easy to
change the fonts and resize the images.
Module-wise Algorithm:
Visual Search
1. Scan the word content
2. Extract the text content from the image
3. Spell check the retrieved content
4. Text sent to the server
5. Server Side Programming
a. Split the sentence into words
b. Remove stop words
c. Search for relevant news articles based on scanned words
d. JSON encode the retrieved articles and pass them to the device
6. Decode the JSON response on the device
7. Display results.
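The server-side part of the steps above (5a to 5d) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the project's PHP code: the stop-word list is a tiny sample, the articles are passed in as a list of dictionaries instead of being read from the database, and ranking by the number of matching title words is an assumed scoring rule.

```python
import json
import re

# Small sample list; a real deployment would use a fuller stop-word list.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "is", "to", "for"}


def extract_keywords(text):
    """Steps 5a and 5b: split the text into words and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def search_articles(keywords, articles):
    """Step 5c: keep articles whose title shares a keyword, best match first."""
    scored = []
    for article in articles:
        title_words = set(extract_keywords(article["title"]))
        score = sum(1 for k in set(keywords) if k in title_words)
        if score:
            scored.append((score, article))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in scored]


def handle_request(scanned_text, articles):
    """Step 5d: JSON encode the keywords and the matching articles."""
    keywords = extract_keywords(scanned_text)
    return json.dumps(
        {"keywords": keywords, "articles": search_articles(keywords, articles)}
    )
```

On the device, steps 6 and 7 correspond to decoding this string and rendering the `articles` list.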
Visual Commerce
1. Scan the word content
2. Extract the text content from the image
3. Spell check the retrieved content
4. Text sent to the server
5. Server Side Programming
a. Split the sentence into words
b. Remove stop words
c. Aggregate deals from leading ecommerce portals
d. Standardize the results into a uniform format
e. JSON encode the result and pass it to the device
6. Decode the JSON response on the device
7. Display results
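Steps 5c to 5e, aggregating deals and standardizing them, might be sketched as below. The two portal response shapes (`portal_a` and `portal_b`), their field names and the choice to sort by price are hypothetical; the real portals' formats are not given in this report.

```python
import json


def from_portal_a(item):
    """Convert a hypothetical portal A record: title / price_inr / link."""
    return {
        "name": item["title"],
        "price": float(item["price_inr"]),
        "url": item["link"],
        "source": "portal_a",
    }


def from_portal_b(item):
    """Convert a hypothetical portal B record: productName / cost / href."""
    return {
        "name": item["productName"],
        # Portal B is assumed to send prices as text such as "1,299.00".
        "price": float(item["cost"].replace(",", "")),
        "url": item["href"],
        "source": "portal_b",
    }


def aggregate_deals(portal_a_items, portal_b_items):
    """Merge both portals into one uniform list, cheapest first, as JSON."""
    deals = [from_portal_a(i) for i in portal_a_items]
    deals += [from_portal_b(i) for i in portal_b_items]
    deals.sort(key=lambda deal: deal["price"])
    return json.dumps(deals)
```

Because every record reaches the device with the same four keys, the application needs only one rendering path regardless of which portal a deal came from.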
Real Time Information
1. Scan the word content
2. Extract the text content from the image
3. Spell check the retrieved content
4. Text sent to the server
5. Server Side Programming
a. Split the sentence into words
b. Remove stop words
c. Connect to the ERP / Excel / Spreadsheet from which the content is to
be monitored
d. Filter out the rows matching with the searched content
e. Standardize the results into a uniform format as per the required
information
f. JSON encode the result and pass it to the device
6. Decode the JSON response on the device
7. Display results
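Steps 5d to 5f, filtering the monitored rows and shaping the response, could be sketched as follows. The rows are assumed to have already been read from the ERP, Excel file or spreadsheet into dictionaries; the column names and the whole-word matching rule are assumptions for the example.

```python
import json


def filter_rows(rows, keywords):
    """Step 5d: keep rows in which any cell contains one of the keywords."""
    wanted = {k.lower() for k in keywords}
    matches = []
    for row in rows:
        cell_words = " ".join(str(value) for value in row.values()).lower().split()
        if wanted & set(cell_words):
            matches.append(row)
    return matches


def to_response(rows, fields):
    """Steps 5e and 5f: keep only the required fields and JSON encode them."""
    return json.dumps([{f: row.get(f, "") for f in fields} for row in rows])
```

For example, `to_response(filter_rows(rows, ["copper"]), ["item", "stock"])` returns only the item and stock columns of the rows that mention copper.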
Chapter 8
8. Experiments and Project Testing
8.1 Test plan
The objective of the plan is to break the product down into distinct areas and identify features
of the Smart Snap Application that are to be tested. The test plan approach that has been used
in our project includes the following:
1. Design verification or compliance test
This stage of testing was performed during the development or approval stage of
the product, typically on a small sample of units.
2. Test Coverage
The design verification tests have been performed at the point of reaching every milestone.
Test areas include testing of various features such as line segmentation, line and edge
detection, Binary image conversion, symbol detection, palm colour detection, etc.
3. Test Methods
Testing of diverse features has been performed in “Smart Snap”. For each module,
corresponding outputs were checked. For testing each module, the output produced from
running the code was checked with the test data set.
4. Test Responsibility
The team members working on their respective features performed the testing of those
features. Test responsibilities also include the data collected and how that data was used
and reported.
8.2 Test cases
A test case is a set of conditions or variables under which we will determine whether the Smart
Snap application is working correctly or not. We have used many test cases to determine that
the system is sufficiently scrutinized.
Test Case ID | Case Description | Expected Result | Actual Result | Pass/Fail
1 | User takes picture of a document | The image should be converted into editable text and displayed. | The image is converted into editable text and displayed. | Pass
2 | The user takes a picture of a text | The OCR should read the text and display the related news articles to the user. | The OCR reads the text correctly and displays it to the user. | Pass
3 | The user clicks a photo in portrait mode instead of landscape mode. | An error message is displayed and the user is prompted to click the image back in landscape mode. | The error message is displayed prompting user to click the photo in landscape mode. | Pass
4 | The user clicks the share button | The application should display all the available platforms on which the user can share the results | The application displays all the sharable platforms to the user. | Pass
5 | The user clicks the create PDF option. | A PDF should be created for the user in the Micro SD card. | A PDF is created for the user in the memory card. | Pass
Table 4: Test Case
8.3 Methods used
The methods used by us for testing are as given below:
1. Unit Testing
Unit testing is a method by which individual units of source code, or sets of one or more
program modules together, are tested to determine if they are fit for use. In our application, we
considered each module as one unit and tested these units with help of test cases and test
plan developed. Unit testing was carried out on each module and on every function within
the module. Output of each unit was assessed for accuracy and if found incorrect,
appropriate corrections were made.
2. Integration Testing
Integration testing is a phase in software testing in which individual software modules are
combined and tested as a group. The purpose of integration testing is to detect any
inconsistencies between the software units that are integrated together. The modules of our
application were integrated together in order to verify that they provide the required
functionalities appropriately. The various modules were tested together to check for their
accuracy and compatibility.
3. System Testing
System testing of software or hardware is testing conducted on a complete, integrated
system to evaluate the system's compliance with its specified requirements. As a rule,
system testing takes, as its input, all of the "integrated" software components that have
successfully passed integration testing and also the software system itself integrated with
any applicable hardware system. We performed system testing after integration testing to
ensure proper functioning of the project as a whole.
4. Acceptance Testing
Acceptance testing will be conducted to determine if the requirements and specifications
are met. It may involve performance tests. Acceptance testing performed by the customer
is known as user acceptance testing (UAT), end-user testing, site (acceptance) testing, or
field (acceptance) testing. Acceptance testing generally involves running a suite of tests on
the completed system. Each individual test, known as a case, exercises a particular
operating condition of the user's environment or feature of the system, and will result in a
pass or fail, or Boolean, outcome. Here the end user will use the Smart Snap Application
first in a tested environment and then in the environment of his/her own home.
5. Usability testing
Usability testing is a technique used to evaluate a product by testing it on users. Usability
testing focuses on measuring a human-made product's capacity to meet its intended
purpose. Usability testing measures the usability, or ease of use, of a specific object or set
of objects. The results of this review will help improve the end-user interaction of the
software. The purpose of usability testing is to ensure that the Smart Snap Application will
function in a manner that is acceptable to the user.
6. Performance testing
Performance testing is performed to determine how fast some aspect of
a system performs under a particular workload. It can also serve to validate and verify
other quality attributes of the system, such as scalability, reliability and resource usage. In
the Smart Snap Application, these tests ensure that the system provides acceptable response
times: the response time should not exceed 10 seconds once the user has finished loading the image and
has clicked on the ‘Output’ button.
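A response-time check against the 10-second limit can be sketched as below. This is a generic Python timing helper given for illustration, not the test harness used in the project; the names `timed_call` and `within_budget` are hypothetical, and the operation being timed is whatever function the tester passes in.

```python
import time

# Limit stated in the performance requirement above.
RESPONSE_BUDGET_SECONDS = 10.0


def timed_call(func, *args, **kwargs):
    """Call func and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def within_budget(elapsed, budget=RESPONSE_BUDGET_SECONDS):
    """Return True when the measured time does not exceed the budget."""
    return elapsed <= budget
```

Wrapping the complete request (upload, OCR, server query and JSON decoding) in `timed_call` measures what the user actually waits for, rather than any single stage.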
8.4 Test Results
Preliminary tests showed promising results. The tests included taking a
number of photographs at different angles to check the readability of the data. The text read
by the OCR engine was good; the data was read even at an angle of 30 degrees. However,
the data could not be read when the photo was blurred. The range of angles handled by the OCR
engine can be improved.
Further tests were conducted on the retrieval of data from the servers. For Visual Search the
data was retrieved within 3 seconds. Earlier results were poor
because the entire data set was loaded into the local database and then searched. The algorithms
were improved so that only the matching data is sent, which improved the speed
considerably.
Chapter 9
9. Maintenance
9.1 User Manual
Prerequisites
The user must have an Android Phone
The user must have an active data plan or any sort of internet connectivity
How to use
The user must start the application.
The user then selects one of the modules: Document Capture, Visual Search, Visual
Commerce or Real Time Information.
The user then clicks a photo of the content for which results are wanted.
The results are displayed and can be shared on various social networking platforms.
9.2 Constraints
The user must have internet connectivity.
Visual commerce does not work in portrait mode. The user must click the photo in
landscape mode only.
Document Capture works only in portrait mode. The image shouldn’t be clicked in
landscape mode.
While clicking the image the hand should be stable and the image clicked should not have
any disruptions. The environment should not be too dark.
Chapter 10
10. Conclusion and Future Scope
After working on the project for six months, we have proposed a new system of search
based on visual input instead of typed text. Text-based search is the traditional approach;
a new way of searching was required that is more precise and optimized.
Based on Visual Search, Image Recognition and Database Technology, a more optimized search
and conversion tool was created that helps the user save a lot of time.
The features are closely integrated and help the user convert images to text, buy new
products and get the latest news.
The application has considerable future scope. The precision of the application needs to be
improved for better results. The database can be optimized to give quicker search results.
The future scope also includes adding a number of features, such as a
Sudoku Solver, Location based Navigation and Voice Based Search.
Overall, we conclude that the application brings in new technology that is user-friendly and saves
a lot of time for the user.
References and Bibliography
A
Android: An operating system for mobile devices and tablets developed by Google.
D
Document Capture: A feature that allows users to convert images into editable text. This text
is copied to the device's clipboard.
F
Forward: When the recipient of an email message sends it on to someone he or she thinks might find
it interesting or benefit from it.
I
Image Recognition: Image recognition recognises relevant information from the clicked image
using various available technologies.
M
MATLAB: Developed by MathWorks, MATLAB allows matrix manipulations, plotting
of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with
programs written in other languages, including C, C++, Java, and Fortran.
N
News Aggregator: News aggregator is a feature which pulls in all the relevant information about a
topic from the net using custom search.
O
Optical Character Recognition: Optical character recognition, usually abbreviated to OCR, is
the mechanical or electronic conversion of scanned images of handwritten, typewritten or printed
text into machine-encoded text.
P
Page Rank: It assigns a weight to the web page depending on the ranking of the users.
Q
QR Codes: QR code (abbreviated from Quick Response Code) is the trademark for a type of matrix
barcode.
S
Spreadsheets: Spreadsheets are sheets made available online by Google. The user can create tables and
files within a spreadsheet and store custom information.
T
Test: An action taken to ensure an email will perform properly before it is sent. A test message is sent
to several “testing” accounts and allows marketers to identify problems such as broken links or
images and rectify them before sending the email to an entire list as well as a means of comparing
the results of different versions of an email.
V
Visual Commerce: A feature that allows users to purchase online not by typing but just by clicking
an image of the object they want to purchase.
Recommended sites, newsletters, blogs and books.
Websites
1. The website has all the details about image processing basics.
http://www.idi.ntnu.no/~blake/gbimpdet.htm
2. Google Codes-This is the place to find explanations related to Tesseract OCR.
http://code.google.com/p/tesseract-ocr/
3. http://seomojo.net/how_seo.htm
4. http://framework.zend.com/
5. http://visual.ly/google%E2%80%99s-hummingbird-algorithm-%E2%80%93-what%E2%80%99s-it-all-about
6. http://www.mathworks.in/help/vision/ref/vision.cascadeobjectdetectorclass.html
Books
Document image analysis: A Primer
Rangachar Kasturi, Department of Computer Science & Engineering, The Pennsylvania State
University, University Park, PA 16802, USA
Lawrence O’Gorman, Avaya Labs, Room 1B04, 233 Mt. Airy Road, Basking Ridge, NJ
07920, USA
Venu Govindaraju, CEDAR, State University of New York at Buffalo, Amherst, NY 14228, USA