1. ABSTRACT
This paper reviews the topic of a voice-recognizing web browser. Voice recognition
is a process that recognizes the human voice and produces sentences, words, or commands.
The output of voice recognition systems can be applied in various fields.
In this project it is applied by extending a web browser with speech
recognition.
Nowadays, most web browsers do not support speech recognition. Users whose
disabilities make typing difficult therefore face problems while surfing the web. This project
focuses on a method for developing a speech recognition prototype using VB.NET, in
which a software agent handles the speech
recognition.
VB.NET provides the necessary environment and libraries for voice recognition. The
main objective of the project is to build a speech recognition prototype that navigates a
web browser in English, with continuous, speaker-independent recognition. VB.NET is
used throughout the project to build the prototype, and a common research
methodology is applied to the project development.
2. INTRODUCTION
Web browsing is defined as finding information, documents, or web pages on the
Internet that match technical or other criteria of interest to the user. The
primary mechanism for finding specific web pages is to key search strings of
characters into a search engine or its equivalent in a commercially available browser. The
search returns a list of hits or matches, and the specific text or web pages can then be
displayed. Any of the listed web pages can be brought up on the screen by known
methods, e.g. “pointing and clicking” on words that are “linked” to the classes of
information desired. This brings up those web pages on the user’s screen or, if graphics
are not available to the user, at least their text.
A web browser is a software application for retrieving, presenting, and traversing
information resources on the World Wide Web. An information resource is identified by
a Uniform Resource Identifier (URI) and may be a web page, image, video, or other piece
of content. Hyperlinks present in resources enable users easily to navigate their browsers
to related resources.
The goal of this project is to develop a web browser that allows the user to navigate
by speaking the text of a link or an associated number instead of clicking with a mouse.
Our project is useful to people who have difficulty accessing the World
Wide Web, and to those who temporarily cannot use an existing web browser (for example,
because their eyes or hands are occupied or because they are not near their computer).
A voice browser is a web browser with the following capabilities:
Can interpret spoken input for navigation (speech recognition)
Can interpret spoken input to perform various tasks related to the web browser.
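
The navigation capability described above (following a link by speaking its text or its associated number) can be sketched as follows. This is a minimal illustration in Python rather than the project's VB.NET, and it assumes the recognizer has already turned the speech into a text string in which numbers appear as digits. The function names and the example URLs are hypothetical and are not part of the project.

```python
import re

def normalize(text):
    """Lowercase the text and drop punctuation so spoken and on-page text compare cleanly."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def number_links(links):
    """Assign each (text, url) link a 1-based number that the user can speak."""
    return {index: link for index, link in enumerate(links, start=1)}

def resolve_spoken_link(spoken, links):
    """Return the URL chosen by a spoken number or spoken link text, or None if nothing matches."""
    phrase = normalize(spoken)
    if phrase.isdigit():
        link = number_links(links).get(int(phrase))
        return link[1] if link else None
    for text, url in links:
        if normalize(text) == phrase:
            return url
    return None

# Hypothetical links found on the current page.
links = [
    ("Home", "http://example.com/"),
    ("Contact Us", "http://example.com/contact"),
]
```

With these example links, both `resolve_spoken_link("2", links)` and `resolve_spoken_link("contact us", links)` select the second link. A real implementation would also need a policy for links that share the same text.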
3. AIMS AND OBJECTIVES
The goal of this project is to develop a web browser that allows the user to navigate
by speaking the text of a link or an associated number instead of clicking with a mouse.
The voice-operated web browser is designed especially for people with disabilities.
This web browser will be able to search and navigate web pages on the World
Wide Web from a PC through speech input, i.e. the user will control web
browsing by voice.
Speech recognition is a process that recognizes the human voice and converts it
into text or commands that perform a specific task. The output of voice
recognition systems can be applied in various fields. In this project it is applied by
introducing a web browser with speech recognition.
Since operating systems such as Windows and Mac introduced speech recognition technology,
developers have concentrated more on it. Nowadays many smartphones
offer speech recognition capabilities. The main objective of the project is to build a
speech recognition prototype that navigates a web browser in English, with
continuous, speaker-independent recognition.
In our project we are going to develop a web browser that supports speech
recognition. This project is useful for those who have difficulty typing and for those
who face problems while surfing the web. We focus on the
method of developing a speech recognition prototype using VB.NET, in which a software
agent handles the speech recognition.
4. LITERATURE SURVEYED
A web browser is a software application for retrieving, presenting, and traversing
information resources on the World Wide Web. An information resource is identified by
a Uniform Resource Identifier (URI) and may be a web page, image, video, or other piece
of content. Hyperlinks present in resources enable users easily to navigate their browsers
to related resources.
A web browser can also be defined as application software or a program designed to
enable users to access, retrieve and view documents and other resources on the Internet.
Although browsers are primarily intended to access the World Wide Web, they can also
be used to access information provided by web servers in private networks or files in file
systems.
The major web browsers are Internet Explorer, Firefox, Google Chrome, Safari, and
Opera.
4.1 Web Browser History
Dozens of innovative web browsers have been created by various people and teams over
the years.
The first widely used web browser was NCSA Mosaic. The Mosaic programming team
then created the first commercial web browser called Netscape Navigator, later renamed
Communicator, then renamed back to just Netscape. The Netscape browser led in user
share until Microsoft Internet Explorer took the lead in 1999 due to its distribution
advantage. A free open source software version of Netscape was then developed called
Mozilla, which was the internal name for the old Netscape browser, and released in 2002.
Mozilla has since gained in market share, particularly on non-Windows platforms, largely
due to its open source foundation, and in 2004 was released in the quickly popular
Firefox version.
A chronological listing of some of the influential early web browsers that advanced the
state of the art is provided below:
World Wide Web . Tim Berners-Lee wrote the first web browser on a NeXT
computer, called World Wide Web, finishing the first version on Christmas day,
1990. He released the program to a number of people at CERN in March, 1991,
introducing the web to the high energy physics community, and beginning its spread.
libwww . Berners-Lee and a student at CERN named Jean-Francois Groff ported the
World Wide Web application from the NeXT environment to the more common C
language in 1991 and 1992, calling the new browser libwww. Groff later started the
first web design company, InfoDesign.ch.
Line-mode . Nicola Pellow, a math student interning at CERN, wrote a line-mode web
browser that would work on any device, even a teletype. In 1991, Nicola and the team
ported the browser to a range of computers, from UNIX to Microsoft DOS, so that
anyone could access the web, at that point consisting primarily of the CERN phone
book.
Erwise . After a visit from Robert Cailliau, a group of students at Helsinki University
of Technology joined together to write a web browser as a master's project. Since the
acronym for their department was called "OTH", they called the browser "erwise", as
a joke on the word "otherwise". The final version was released in April, 1992, and
included several advanced features, but wasn't developed further after the students
graduated and went on to other jobs.
ViolaWWW . Pei Wei, a student at the University of California at Berkeley, released
the second browser for Unix, called ViolaWWW, in May, 1992. This browser was
built on the powerful interpretive language called Viola that Wei had developed for
Unix computers. ViolaWWW had a range of advanced features, including the ability
to display graphics and download applets.
Midas . During the summer of 1992, Tony Johnson at SLAC developed a third
browser for Unix systems, called Midas, to help distribute information to colleagues
about his physics research.
Samba . Robert Cailliau started development of the first web browser for the
Macintosh, called Samba. Development was picked up by Nicola Pellow, and the
browser was functional by the end of 1992.
Mosaic . Marc Andreessen and Eric Bina from the NCSA released the first version of
Mosaic for X-Windows on UNIX computers in February, 1993. A version for the
Macintosh was developed by Aleks Totic and released a few months later, making
Mosaic the first browser with cross-platform support. Mosaic introduced support for
sound, video clips, forms support, bookmarks, and history files, and quickly became
the most popular non-commercial web browser. In August, 1994, NCSA assigned
commercial rights to Mosaic to Spyglass, Inc., which subsequently licensed the
technology to several other companies, including Microsoft for use in Internet
Explorer. The NCSA stopped developing Mosaic in January 1997.
Arena . In 1993, Dave Raggett at Hewlett-Packard in Bristol, England, developed a
browser called Arena, with powerful features for positioning tables and graphics.
Lynx . The University of Kansas had written a hypertext browser independently of the
web, called Lynx, used to distribute campus information. A student named Lou
Montulli added an Internet interface to the program, and released the web browser
Lynx 2.0 in March, 1993. Lynx quickly became the preferred web browser for
character mode terminals without graphics, and remains in use today.
Cello . Tom Bruce, cofounder of the Legal Information Institute, realized that most
lawyers used Microsoft PCs, and so he developed a web browser for that platform
called Cello, finished in the summer of 1993.
Opera . In 1994, the Opera browser was developed by a team of researchers at a
telecommunication company called Telenor in Oslo, Norway. The following year,
two members of the team -- Jon Stephenson von Tetzchner and Geir Ivarsøy -- left
Telenor to establish Opera Software to develop the browser commercially. Opera 2.1
was first made available on the Internet in the summer of 1996.
Internet in a box . In January, 1994, O'Reilly and Associates announced a product
called Internet in a Box which collected all of the software needed to access the web
together, so that you only had to install one application, instead of downloading and
installing several programs. While not a unique browser in its own right, this product
was a breakthrough because it distributed other browsers and made the web a lot
more accessible to the home user.
Navipress . In February, 1994, Navisoft released a browser for the PC and Macintosh
called Navipress. This was the first browser since Berners-Lee's WorldWideWeb
browser that incorporated an editor, so that you could browse and edit content at the
same time. Navipress later became AOLPress, and is still available in some download
locations on the Internet but has not been maintained since 1997.
Mozilla . In October, 1994, Netscape released the first beta version of their
browser, Mozilla 0.96b, over the Internet. On December 15, the final version was
released, Mozilla 1.0, making it the first commercial web browser. An open source
version of the Netscape browser released in 2002 was also named Mozilla in
tribute to this early version, and then released as the quickly popular Firefox in
November, 2004.
Internet Explorer . On August 23rd, 1995, Microsoft released their Windows 95
operating system, including a Web browser called Internet Explorer. By the fall of
1996, Explorer had a third of market share, and passed Netscape to become the
leading web browser in 1999.
4.2 Features
Available web browsers range in features from minimal, text-based user interfaces with
bare-bones support for HTML to rich user interfaces supporting a wide variety of file
formats and protocols. Browsers which include additional components to support e-mail,
Usenet news, and Internet Relay Chat (IRC), are sometimes referred to as "Internet
suites" rather than merely "web browsers".
All major web browsers allow the user to open multiple information resources at the
same time, either in different browser windows or in different tabs of the same window.
Major browsers also include pop-up blockers to prevent unwanted windows from
"popping up" without the user's consent.
Most web browsers can display a list of web pages that the user has bookmarked so that
the user can quickly return to them. Bookmarks are also called "Favorites" in Internet
Explorer. In addition, all major web browsers have some form of built-in web feed
aggregator. In Firefox, web feeds are formatted as "live bookmarks" and behave like a
folder of bookmarks corresponding to recent entries in the feed. In Opera, a more
traditional feed reader is included which stores and displays the contents of the feed.
Furthermore, most browsers can be extended via plug-ins, downloadable components that
provide additional features.
4.3 User interface
Most major web browsers have these user interface elements in common:[17]
Back and forward buttons to go back to the previous resource and forward
respectively.
A refresh or reload button to reload the current resource.
A stop button to cancel loading the resource. In some browsers, the stop button is
merged with the reload button.
A home button to return to the user's home page.
An address bar to input the Uniform Resource Identifier (URI) of the desired
resource and display it.
A search bar to input terms into a search engine.
A status bar to display progress in loading the resource and also the URI of links
when the cursor hovers over them, and page zooming capability.
Major browsers also possess incremental find features to search within a web page.
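
To illustrate how spoken commands could drive the interface elements listed above, the following sketch maps recognized words to back, forward, and home actions. It is written in Python for brevity rather than in the project's VB.NET. The `BrowserState` class, the command words, and the URLs are assumptions made for this example only; a real browser control would supply its own navigation methods.

```python
class BrowserState:
    """Minimal stand-in for a browser's navigation history (hypothetical)."""

    def __init__(self, home_url):
        self.home_url = home_url
        self.history = [home_url]
        self.position = 0

    @property
    def current(self):
        return self.history[self.position]

    def navigate(self, url):
        # Visiting a new page discards any forward history.
        self.history = self.history[: self.position + 1] + [url]
        self.position += 1

    def back(self):
        if self.position > 0:
            self.position -= 1

    def forward(self):
        if self.position < len(self.history) - 1:
            self.position += 1

    def home(self):
        self.navigate(self.home_url)


def handle_command(state, spoken):
    """Run the action for a recognized command word; return False if the word is unknown."""
    commands = {"back": state.back, "forward": state.forward, "home": state.home}
    action = commands.get(spoken.strip().lower())
    if action is None:
        return False
    action()
    return True
```

Keeping the commands in a dictionary means a new voice command can be supported by adding one entry, without changing the dispatch logic.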
4.4 Privacy and security
Most browsers support HTTP Secure and offer quick and easy ways to delete the web
cache, cookies, and browsing history.
5. EXISTING SYSTEM
Speech recognition has been at the forefront for several years now, although the technology
still requires training to reduce its error margin. Until now, speech recognition was used
mainly by dictation software and software for disabled people, but the most recent
advances have brought it to the brink of becoming an established
mass-market technology. With various operating systems bundling speech
recognition and biometric techniques, the way PCs are used has taken a big step forward.
Since operating systems such as Windows and Mac introduced speech recognition technology,
developers have concentrated more on it. Nowadays many smartphones offer
speech recognition capabilities.
5.1 Browsing web with Speech Recognition
With Windows Speech Recognition, browsing the Internet means figuring out what the
objects in the web browser window are called and then saying the commands you need to
interact with them. You can easily activate browsing features through various voice
commands. For Firefox, Firesay is an interesting add-on: it
adds speech recognition to the web browser so that users can issue
commands in the browser by voice.
5.2 What Chrome Offers
Google has also added the potential for voice recognition to the beta version of its
Chrome browser in the latest update to the software.
The tool works through the newly included HTML5 speech input API and, once
implemented, would give users the ability to add text to any input area on a website
without touching the keyboard. Google said that once recorded, the audio would be sent
to Google’s specialist speech servers for transcription, before being sent back to the
original computer and typed into the text box.
The move comes as Google continues to expand its voice recognition services from
portable devices onto the desktop. The already available Google Voice Search
(www.google.com/mobile/voice-search/) is one of the most prominent apps on the mobile
side, and is available across the Android, BlackBerry, iOS, Nokia S60 and Windows
platforms.
5.3 Using Opera with your voice
Opera Software has announced a voice-activated browser. The new browser, launch date
not yet announced, incorporates IBM’s ViaVoice software and will respond to voice
commands from the user.
As with other voice recognition programs, the software must be trained to learn the user’s
speech patterns and voice. The initial version will be targeted toward the English
language market, and Opera predicts the browser will increase accessibility for those
individuals with difficulties working a mouse or keyboard.
6. PROBLEM STATEMENT
This project relates to the general field of Internet web browsing, or searching for
particular web pages or other information references. More particularly, our project
concerns speech recognition, the identification and isolation of keywords from that
speech, and the passing of those words to the search functions found in a web browser.
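
The step of isolating keywords from recognized speech and passing them to a search function can be sketched as below. This is a simplified Python illustration rather than the project's VB.NET code, and it assumes the recognizer returns a plain text utterance. The filler-word list and the search URL template are placeholders chosen for the example, not the project's actual search provider.

```python
from urllib.parse import quote_plus

# Words assumed to carry no search meaning; a real list would be tuned by testing.
FILLER_WORDS = {"please", "search", "for", "find", "show", "me", "the", "a", "an", "about"}

def extract_keywords(utterance):
    """Keep only the words of the recognized utterance that are not filler words."""
    return [word for word in utterance.lower().split() if word not in FILLER_WORDS]

def build_search_url(utterance, template="http://www.example.com/search?q={query}"):
    """Build a search URL from the keywords; the template is a hypothetical endpoint."""
    keywords = extract_keywords(utterance)
    return template.format(query=quote_plus(" ".join(keywords)))
```

For example, the utterance "Please search for voice browsers" reduces to the keywords `voice` and `browsers`, which are then URL-encoded into the query string.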
Speech recognition is a process that recognizes the human voice and converts it
into text or commands that perform a specific task. The output of voice
recognition systems can be applied in various fields. In this project it is applied by
introducing a web browser with speech recognition.
The goal of this project is to develop a web browser that allows the user to navigate
by speaking the text of a link or an associated number instead of clicking with a mouse.
7. SCOPE OF THE PROJECT
A web browser is a software application for retrieving, presenting, and traversing
information resources on the World Wide Web. An information resource is identified by
a Uniform Resource Identifier (URI) and may be a web page, image, video, or other piece
of content. Hyperlinks present in resources enable users easily to navigate their browsers
to related resources.
Speech recognition is a process that recognizes the human voice and converts it into text
or commands that perform a specific task. The output of voice recognition systems
can be applied in various fields. In this project it is applied by
introducing a web browser with speech recognition.
The voice-operated web browser is designed especially for people with disabilities. The design of
such a web browser needs to consider many technical constraints, the users' disabilities, and
future requirements.
This web browser will be able to search and navigate web pages on the World Wide
Web from a PC through speech input, i.e. the user will control web
browsing by voice.
Information contained on the World Wide Web is inaccessible to many people. The web
is primarily a visual medium that requires a keyboard and mouse to navigate, and this
disenfranchises several types of users.
These include people who have difficulty accessing the World Wide Web and those who temporarily
cannot use a traditional web browser (for example, because their eyes or hands are
occupied or because they are not near their computer).
8. PROPOSED SYSTEM
This project has speech recognition and speech synthesis capabilities. It is
not a complete replacement for a conventional web browser, but it is still a good web browser to
use by voice. The software can also search web pages and perform tasks related to the web
browser.
The project also includes a voice-mark feature, which is very similar to the bookmark
feature available in existing web browsers. The following are the main modules
to be implemented in our project:
• Tab Control
• Search Provider
• Working with favorites
• Implementation of Voice-Mark feature
• Dealing with pop-ups
• Working with History and Cookies
• Simple RSS (Really Simple Syndication) reader
• Voice recognizing text editor
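
As an illustration of the voice-mark module, the sketch below stores pages under a spoken name and looks them up again by that name, in the same way a bookmark is looked up by its title. It is a Python sketch under assumed behavior, not the project's VB.NET design. The class name, its methods, and the matching rule (case-insensitive, whitespace-tolerant) are all hypothetical.

```python
class VoiceMarks:
    """Bookmarks keyed by the phrase the user speaks to open them (hypothetical design)."""

    def __init__(self):
        self._marks = {}

    @staticmethod
    def _key(spoken_name):
        # Ignore case and extra spaces so small recognition differences still match.
        return " ".join(spoken_name.lower().split())

    def add(self, spoken_name, url):
        """Remember the URL under the spoken name, replacing any earlier mark."""
        self._marks[self._key(spoken_name)] = url

    def lookup(self, spoken_name):
        """Return the URL saved under the spoken name, or None if there is no such mark."""
        return self._marks.get(self._key(spoken_name))
```

A mark added as "Daily News" would then be found when the recognizer returns "daily news", while an unknown name returns `None` so the browser can ask the user to repeat the command.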
9. METHODOLOGY AND ANALYSIS
A software process model is the model used for the development of
the project. Many software process models are available, and the choice should
depend on the project size, that is, whether it is an industry-scale,
large-scale, or medium-scale project.
The chosen model should suit the project, because the cost of the project changes with the
software process model: the steps in each
process model vary.
The software process model we will be using is the Incremental Software Model.
The incremental build model is a method of software development where the model is
designed, implemented and tested incrementally (a little more is added each time) until the
product is finished. It involves both development and maintenance. The product is defined as
finished when it satisfies all of its requirements. This model combines the elements of the
waterfall model with the iterative philosophy of prototyping.
The product is decomposed into a number of components, each of which is designed and built
separately (termed builds). Each component is delivered to the client when it is complete. This
allows partial utilization of the product and avoids a long development time. It also avoids a large
initial capital outlay and the subsequent long wait. This model of development also
helps ease the traumatic effect of introducing a completely new system all at once.
The incremental Model is an evolution of the waterfall model, where the waterfall model is
incrementally applied.
The series of releases is referred to as “increments”, with each increment providing more
functionality to the customers. After the first increment, a core product is delivered, which can
already be used by the customer. Based on customer feedback, a plan is developed for the next
increments, and modifications are made accordingly. This process continues, with increments
being delivered until the complete product is delivered. The incremental philosophy is also used
in the agile process model.
The Incremental model combines elements of the linear sequential model with the iterative
philosophy of the prototyping. This model has been explicitly designed to accommodate a
product that evolves over time.
When an incremental model is used, the first increment is often a core product. The core product
is used by the customer or undergoes a detailed review. As a result of use and/or evaluation, a
plan is developed for the next increment. The plan addresses the modification of the core product
to better meet the needs of the customer and the delivery of additional features and functionality.
Software is constructed in a step-by-step manner. While a software product is being developed,
each step adds to what has already been completed.
Analysis Phase:
Analysis attacks a problem by breaking it into sub-problems. The objective of analysis is to determine
exactly what must be done to solve the problem. Typically, the system’s logical elements (its
boundaries, processes, and data) are defined during analysis.
Design Phase:
The objective of design is to determine how the problem will be solved. During design the
analyst’s focus shifts from the logical to the physical. Data elements are grouped to form physical
data structures, screens, reports, files, and databases.
Coding Phase:
The system is created during this phase. Programs are coded, debugged, documented, and tested.
New hardware is selected and ordered. Procedures are written and tested. End-user
documentation is prepared. Databases and files are initialized. Users are trained.
Testing Phase:
Once the system is developed, it is tested to ensure that it does what it was designed to do. After
the system passes its final test and any remaining problems are corrected, the system is
implemented and released to the user.
10. DETAILS OF THE HARDWARE & SOFTWARE
10.1 SOFTWARE:-
Front-End:- VB.net
Back-End:- Database using File Handling and XML
(i) VB.net:-
Visual Basic .NET (VB.NET) is an object-oriented computer programming language that can be
viewed as an evolution of the classic Visual Basic (VB), which is implemented on the .NET
Framework. Microsoft currently supplies two major implementations of Visual Basic: Microsoft
Visual Studio, which is commercial software and Microsoft Visual Studio Express, which is free
of charge. Microsoft's implementation of Visual Basic .NET is called "Microsoft Visual Basic".
(ii) Database using File Handling and XML:-
VB.NET File Handling
With the File type in the .NET Framework, you can perform efficient and simple manipulations
of files, including reads, writes, and appends, and you can
programmatically test and modify files in the VB.NET language.
Here, we describe in brief many of the methods available on the File type in the VB.NET language
and .NET Framework. Some methods have been omitted; the System.IO namespace also
provides many other types that are separate from these shared methods.
File.ReadAllBytes: Useful for files not stored as plain text. You can open images or movies
with this method.
File.ReadAllLines: Microsoft: "Opens a file, reads all lines of the file with the specified
encoding, and closes the file."
File.ReadAllText: Returns the contents of the text file at the specified path as a string. Very
useful for plain text or settings files.
File.WriteAllBytes: Useful for files such as images that were created or mutated in memory.
File.WriteAllLines: Stores a string array in the specified file, overwriting the contents.
File.WriteAllText: Writes the contents of a string to a text file. One of the simplest ways to
persist text data.
File.AppendAllText: Use to append the contents string to the file at path. Microsoft: "Appends
the specified string to the file, creating the file if it doesn't already exist."
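
The write, append, and read pattern described by the methods above can be illustrated with a small browsing-history file. The sketch below is a Python analogue, not VB.NET: the three helper functions are hypothetical and only mirror the behavior of File.WriteAllLines, File.AppendAllText, and File.ReadAllLines as summarized in this section. The file name and URLs are made up for the example.

```python
import os
import tempfile

def write_all_lines(path, lines):
    """Overwrite the file with one entry per line (analogous to File.WriteAllLines)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.writelines(line + "\n" for line in lines)

def append_all_text(path, text):
    """Append text, creating the file if it is missing (analogous to File.AppendAllText)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(text)

def read_all_lines(path):
    """Return the file's lines without line endings (analogous to File.ReadAllLines)."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read().splitlines()

# Example: keep the browsing history in a plain text file inside a temporary folder.
history_path = os.path.join(tempfile.mkdtemp(), "history.txt")
write_all_lines(history_path, ["http://example.com/"])
append_all_text(history_path, "http://example.com/news\n")
```

After these calls, `read_all_lines(history_path)` returns the two visited addresses in the order they were recorded.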
XML
XML is a markup language for documents containing structured information.
Structured information contains both content (words, pictures, etc.) and some indication of what
role that content plays (for example, content in a section heading has a different meaning from
content in a footnote, which means something different than content in a figure caption or
content in a database table, etc.). Almost all documents have some structure.
A markup language is a mechanism to identify structures in a document. The XML specification
defines a standard way to add markup to documents.
Data is usually best stored in a structured format. In cases where a database is not necessary, you
can use XML files. With XmlReader and XmlWriter in the VB.NET language, you can read and
write XML in an efficient way.
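
As an example of storing structured data in XML without a database, the sketch below saves a list of favorites to an XML string and reads it back. It uses Python's standard `xml.etree.ElementTree` module as a stand-in for XmlWriter and XmlReader, so it shows the idea rather than the VB.NET API. The element and attribute names (`favorites`, `favorite`, `title`) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def favorites_to_xml(favorites):
    """Serialize (title, url) pairs as <favorite title="..."> elements under one root."""
    root = ET.Element("favorites")
    for title, url in favorites:
        item = ET.SubElement(root, "favorite", {"title": title})
        item.text = url
    return ET.tostring(root, encoding="unicode")

def favorites_from_xml(xml_text):
    """Parse the XML produced by favorites_to_xml back into (title, url) pairs."""
    root = ET.fromstring(xml_text)
    return [(item.get("title"), item.text) for item in root.findall("favorite")]
```

Because the library escapes special characters such as `&` when writing and restores them when reading, URLs with query strings survive the round trip unchanged.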
10.2 HARDWARE:-
Microphones:
A quality microphone is key when using the speech recognition system. Desktop microphones
are not well suited to speech recognition, because they tend to pick
up more ambient noise.
The best and most common choice is the headset style. It minimizes ambient noise
while keeping the microphone close to your mouth at all times.
Headsets are available without earphones and with earphones (mono or stereo).
Computer/ Processors:
Speech recognition applications can be heavily dependent on processing speed, because
voice recognition may run continuously. To avoid unnecessary delays, a processor
with good processing capability is required.
11. DESIGN DETAILS
The Interaction diagram of the proposed software project can be given as follows:-
The Flowchart of the proposed software project can be given as follows:-
12. IMPLEMENTATION PLAN FOR THE NEXT SEMESTER
In the next semester we will work on the different modules of the project. The
following are the main modules to be implemented in our project:
• Tab Control
• Search Provider
• Working with favorites
• Implementation of Voice-Mark feature
• Dealing with pop-ups
• Working with History and Cookies
• Simple RSS (Really Simple Syndication) reader
• Voice recognizing text editor
13. REFERENCES
Books:
Visual Basic .NET Black Book
Web Sites:
http://msdn.microsoft.com/en-us/library/ms990659.aspx
http://channel9.msdn.com/coding4fun/articles/Giving-Computers-a-Voice
http://www.google.com/patents/about/6311182_Voice_activated_web_browser.html?id=ZYsIAAAAEBAJ