
1. ABSTRACT

This paper reviews the design of a voice-recognizing web browser. Voice recognition is a process that recognizes the human voice and converts it into sentences, words, or commands. The output of voice recognition systems can be applied in various fields; in this project it is applied by extending a web browser with speech recognition.

Nowadays, most web browsers do not support speech recognition, so people who have difficulty typing face problems when surfing the web. In this project we focus on a method for developing a prototype speech recognizer using VB.NET, a technology in which an agent is implemented to handle the speech recognition.

VB.NET provides the necessary environment and libraries for voice recognition. The main objective of the project is to build a prototype speech recognizer for navigating a web browser in English that handles continuous, speaker-independent speech. Throughout the project, VB.NET is used to build the prototype, and a common research methodology is applied to the project development.


2. INTRODUCTION

Web browsing is defined as finding documents or web pages on the Internet that match technical or other criteria of interest to the user. The primary mechanism for finding specific web pages is to key search strings into a search engine or the equivalent in a commercially available browser. The search returns a list of hits or matches, and the specific text or web pages can then be displayed. Any of the listed web pages can be brought up on the screen by known methods, e.g. "pointing and clicking" on words that are "linked" to the classes of information desired; if graphics are not available to the user, at least the text is brought up on the user’s screen.

A web browser is a software application for retrieving, presenting, and traversing

information resources on the World Wide Web. An information resource is identified by

a Uniform Resource Identifier (URI) and may be a web page, image, video, or other piece

of content. Hyperlinks present in resources enable users to navigate their browsers easily

to related resources.

The goal of this project is to develop a web browser that allows the user to navigate

by speaking the text of a link or an associated number instead of clicking with a mouse.
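
The navigation goal above can be illustrated with a minimal sketch. It is written in Python rather than the project's VB.NET, purely to show the matching logic; the link list, the `resolve_spoken_link` function, and the small number-word table are hypothetical names introduced for illustration, and the utterance is assumed to have already been converted to text by the speech recognizer.

```python
from typing import List, Optional, Tuple

# Hypothetical page model: (visible link text, target URL) pairs in page order.
Link = Tuple[str, str]

# Assumed vocabulary for spoken link numbers; a real grammar would cover more.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}


def resolve_spoken_link(utterance: str, links: List[Link]) -> Optional[str]:
    """Return the URL of the link selected by the recognized utterance, or None."""
    # Normalize case and whitespace so "Contact  Us" and "contact us" compare equal.
    phrase = " ".join(utterance.lower().split())

    # Case 1: the user spoke the number associated with a link ("3" or "three").
    index = int(phrase) if phrase.isdecimal() else NUMBER_WORDS.get(phrase)
    if index is not None:
        return links[index - 1][1] if 1 <= index <= len(links) else None

    # Case 2: the user spoke the link text; require an exact match after normalizing.
    for text, url in links:
        if " ".join(text.lower().split()) == phrase:
            return url
    return None
```

Exact matching is the simplest choice here; a working browser would probably need a more tolerant comparison, since recognized text rarely matches link text word for word.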

Our project is useful to people who have difficulty accessing the World Wide Web, and to those who temporarily cannot use an existing web browser (for example, because their eyes or hands are occupied or because they are not near their computer).

A voice browser is a web browser with the following capabilities:

• Can interpret spoken input for navigation (speech recognition)

• Can interpret spoken input to perform various tasks related to the web browser
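
As a rough sketch of the second capability, recognized phrases can be looked up in a table that maps each spoken command to a browser action. The example is in Python rather than the project's VB.NET and only illustrates the dispatch idea; `BrowserStub`, `build_commands`, `handle_utterance`, and the command phrases themselves are assumptions made for illustration, not part of the project's design.

```python
from typing import Callable, Dict, List


class BrowserStub:
    """Hypothetical stand-in for the browser; it only records requested actions."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def back(self) -> None:
        self.log.append("back")

    def forward(self) -> None:
        self.log.append("forward")

    def reload(self) -> None:
        self.log.append("reload")

    def home(self) -> None:
        self.log.append("home")


def build_commands(browser: BrowserStub) -> Dict[str, Callable[[], None]]:
    """Map each spoken phrase (assumed vocabulary) to a browser action."""
    return {
        "go back": browser.back,
        "go forward": browser.forward,
        "reload": browser.reload,
        "go home": browser.home,
    }


def handle_utterance(utterance: str, commands: Dict[str, Callable[[], None]]) -> bool:
    """Run the action for a recognized phrase; return False if it is unknown."""
    action = commands.get(" ".join(utterance.lower().split()))
    if action is None:
        return False
    action()
    return True
```

Keeping the vocabulary in one table means a new voice command can be added without touching the lookup code.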


3. AIMS AND OBJECTIVES



The voice-operated web browser is designed especially for people with disabilities.

This web browser will be able to search and navigate web pages on the World Wide Web from a PC through speech input, i.e. the user will control web browsing by voice.

Speech recognition is a process that recognizes our human voice and converts it

into text or commands that would perform a specific task. The output of voice

recognition systems can be applied in various fields. Therefore, it will be implemented in

this project by introducing a web browser with speech recognition.

Since operating systems such as Windows and Mac OS introduced speech recognition technology, developers have concentrated more on it. Nowadays many smartphones offer speech recognition capabilities. The main objective of the project is to build a prototype speech recognizer for navigating a web browser in English that is continuous and speaker independent.

In our project we are going to develop a web browser that supports speech recognition. It is useful for people who have difficulty typing and therefore face problems when surfing the web. We focus on a method for developing a prototype speech recognizer using VB.NET, a technology in which an agent is implemented to handle the speech recognition.


4. LITERATURE SURVEYED






A web browser can also be defined as a software application designed to

enable users to access, retrieve and view documents and other resources on the Internet.

Although browsers are primarily intended to access the World Wide Web, they can also

be used to access information provided by web servers in private networks or files in file

systems.

The major web browsers are Internet Explorer, Firefox, Google Chrome, Safari, and

Opera.

4.1 Web Browser History

Dozens of innovative web browsers have been created by various people and teams over

the years.

The first widely used web browser was NCSA Mosaic. The Mosaic programming team

then created the first commercial web browser called Netscape Navigator, later renamed

Communicator, then renamed back to just Netscape. The Netscape browser led in user

share until Microsoft Internet Explorer took the lead in 1999 due to its distribution

advantage. An open source version of Netscape, called Mozilla after the internal name of the old Netscape browser, was then developed and released in 2002. Mozilla has since gained market share, particularly on non-Windows platforms, largely due to its open source foundation, and in 2004 it was released as the quickly popular Firefox.


A chronological listing of some of the influential early web browsers that advanced the

state of the art is provided below:

• WorldWideWeb. Tim Berners-Lee wrote the first web browser on a NeXT

computer, called WorldWideWeb, finishing the first version on Christmas day,

1990. He released the program to a number of people at CERN in March, 1991,

introducing the web to the high energy physics community, and beginning its spread.

• libwww. Berners-Lee and a student at CERN named Jean-Francois Groff ported the

WorldWideWeb application from the NeXT environment to the more common C

language in 1991 and 1992, calling the new browser libwww. Groff later started the

first web design company, InfoDesign.ch.

• Line-mode. Nicola Pellow, a math student interning at CERN, wrote a line-mode web

browser that would work on any device, even a teletype. In 1991, Nicola and the team

ported the browser to a range of computers, from UNIX to Microsoft DOS, so that

anyone could access the web, at that point consisting primarily of the CERN phone

book.

• Erwise. After a visit from Robert Cailliau, a group of students at Helsinki University

of Technology joined together to write a web browser as a master's project. Since the

acronym for their department was "OTH", they called the browser "erwise", as

a joke on the word "otherwise". The final version was released in April, 1992, and

included several advanced features, but wasn't developed further after the students

graduated and went on to other jobs.

• ViolaWWW. Pei Wei, a student at the University of California at Berkeley, released

the second browser for Unix, called ViolaWWW, in May, 1992. This browser was

built on the powerful interpretive language called Viola that Wei had developed for

Unix computers. ViolaWWW had a range of advanced features, including the ability

to display graphics and download applets.


• Midas. During the summer of 1992, Tony Johnson at SLAC developed a third

browser for Unix systems, called Midas, to help distribute information to colleagues

about his physics research.

• Samba. Robert Cailliau started development of the first web browser for the

Macintosh, called Samba. Development was picked up by Nicola Pellow, and the

browser was functional by the end of 1992.

• Mosaic. Marc Andreessen and Eric Bina from the NCSA released the first version of

Mosaic for X-Windows on UNIX computers in February, 1993. A version for the

Macintosh was developed by Aleks Totic and released a few months later, making

Mosaic the first browser with cross-platform support. Mosaic introduced support for

sound, video clips, forms support, bookmarks, and history files, and quickly became

the most popular non-commercial web browser. In August, 1994, NCSA assigned

commercial rights to Mosaic to Spyglass, Inc., which subsequently licensed the

technology to several other companies, including Microsoft for use in Internet

Explorer. The NCSA stopped developing Mosaic in January 1997.

• Arena. In 1993, Dave Raggett at Hewlett-Packard in Bristol, England, developed a

browser called Arena, with powerful features for positioning tables and graphics.

• Lynx. The University of Kansas had written a hypertext browser independently of the

web, called Lynx, used to distribute campus information. A student named Lou

Montulli added an Internet interface to the program, and released the web browser

Lynx 2.0 in March, 1993. Lynx quickly became the preferred web browser for

character mode terminals without graphics, and remains in use today.

• Cello. Tom Bruce, cofounder of the Legal Information Institute, realized that most

lawyers used Microsoft PCs, and so he developed a web browser for that platform

called Cello, finished in the summer of 1993.


• Opera. In 1994, the Opera browser was developed by a team of researchers at a

telecommunication company called Telenor in Oslo, Norway. The following year,

two members of the team -- Jon Stephenson von Tetzchner and Geir Ivarsøy -- left

Telenor to establish Opera Software to develop the browser commercially. Opera 2.1

was first made available on the Internet in the summer of 1996.

• Internet in a Box. In January, 1994, O'Reilly and Associates announced a product

called Internet in a Box which collected all of the software needed to access the web

together, so that you only had to install one application, instead of downloading and

installing several programs. While not a unique browser in its own right, this product

was a breakthrough because it distributed other browsers and made the web a lot

more accessible to the home user.

• Navipress. In February, 1994, Navisoft released a browser for the PC and Macintosh

called Navipress. This was the first browser since Berners-Lee's WorldWideWeb

browser that incorporated an editor, so that you could browse and edit content at the

same time. Navipress later became AOLPress, and is still available in some download

locations on the Internet but has not been maintained since 1997.

• Mozilla. In October, 1994, Netscape released the first beta version of their browser, Mozilla 0.96b, over the Internet. On December 15, the final version was released, Mozilla 1.0, making it the first commercial web browser. An open source version of the Netscape browser released in 2002 was also named Mozilla in tribute to this early version, and was then released as the quickly popular Firefox in November, 2004.

• Internet Explorer. On August 24th, 1995, Microsoft released their Windows 95

operating system, including a Web browser called Internet Explorer. By the fall of

1996, Explorer had a third of market share, and passed Netscape to become the

leading web browser in 1999.


4.2 Features

Available web browsers range in features from minimal, text-based user interfaces with

bare-bones support for HTML to rich user interfaces supporting a wide variety of file

formats and protocols. Browsers which include additional components to support e-mail,

Usenet news, and Internet Relay Chat (IRC), are sometimes referred to as "Internet

suites" rather than merely "web browsers".

All major web browsers allow the user to open multiple information resources at the

same time, either in different browser windows or in different tabs of the same window.

Major browsers also include pop-up blockers to prevent unwanted windows from

"popping up" without the user's consent.

Most web browsers can display a list of web pages that the user has bookmarked so that

the user can quickly return to them. Bookmarks are also called "Favorites" in Internet

Explorer. In addition, all major web browsers have some form of built-in web feed

aggregator. In Firefox, web feeds are formatted as "live bookmarks" and behave like a

folder of bookmarks corresponding to recent entries in the feed. In Opera, a more

traditional feed reader is included which stores and displays the contents of the feed.

Furthermore, most browsers can be extended via plug-ins, downloadable components that

provide additional features.

4.3 User interface

Most major web browsers have these user interface elements in common:

• Back and forward buttons to go back to the previous resource and forward respectively.

• A refresh or reload button to reload the current resource.

• A stop button to cancel loading the resource. In some browsers, the stop button is merged with the reload button.

• A home button to return to the user's home page.

• An address bar to input the Uniform Resource Identifier (URI) of the desired resource and display it.

• A search bar to input terms into a search engine.

• A status bar to display progress in loading the resource and also the URI of links when the cursor hovers over them, and page zooming capability.

Major browsers also possess incremental find features to search within a web page.

4.4 Privacy and security

Most browsers support HTTP Secure and offer quick and easy ways to delete the web

cache, cookies, and browsing history.


5. EXISTING SYSTEM

Speech recognition has been at the forefront for several years now, although the technology still requires training to reduce its error margin. Until now, mainly dictation software and software for disabled people made use of speech recognition, but the most recent advances have brought it to the brink of becoming an established mass-market technology. With various operating systems bundling speech recognition and biometric techniques, the way PCs are used has taken a big step forward.

Since operating systems such as Windows and Mac OS introduced speech recognition technology, developers have concentrated more on it. Nowadays many smartphones offer speech recognition capabilities.

5.1 Browsing web with Speech Recognition

With Windows Speech Recognition, browsing the Internet means figuring out what the objects in the web browser window are called and then saying the commands you need to interact with them. Browsing features can be activated through various voice commands. For Firefox, the Firesay add-on adds speech recognition to the browser so that users can issue browser commands by voice.

5.2 What Chrome Offers

Google has also added voice recognition to the beta version of its Chrome browser in the latest update to the software.


The tool works through the newly included HTML5 speech input API and, once implemented, would give users the ability to add text to any input area on a website without touching the keyboard. Google said that once recorded, the audio would be sent to Google’s specialist speech servers for transcription, before being sent back to the original computer and typed into the text box.

The move comes as Google continues to expand its voice recognition services from portable devices onto the desktop. The already introduced Google Voice Search (www.google.com/mobile/voice-search/) is one of the most prominent apps on the mobile side, and is available across the Android, BlackBerry, iOS, Nokia S60 and Windows platforms.

5.3 Using Opera with your voice

Opera Software has announced a voice-activated browser. The new browser, whose launch date has not yet been announced, incorporates IBM’s ViaVoice software and will respond to voice commands from the user.

As with other voice recognition programs, the software must be trained to learn the user’s

speech patterns and voice. The initial version will be targeted toward the English

language market, and Opera predicts the browser will increase accessibility for those

individuals with difficulties working a mouse or keyboard.


6. PROBLEM STATEMENT

This project relates to the general field of Internet web browsing, or searching for particular web pages or other information references. More particularly, it relates to speech recognition, the identification and isolation of keywords from that speech, and the passing of those keywords to the search functions found in a web browser.
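
A minimal sketch of the keyword step described above, assuming the recognizer has already produced a text transcript. It is written in Python rather than the project's VB.NET to show the logic only; the stop-word list, the placeholder `SEARCH_URL`, and the function names are illustrative assumptions, not part of the project.

```python
from typing import List
from urllib.parse import urlencode

# Assumed filler words to discard; a real system would use a fuller list.
STOP_WORDS = {"search", "find", "show", "please", "me", "for", "the",
              "a", "an", "of", "on", "in", "about"}

# Placeholder search endpoint, not a real search provider.
SEARCH_URL = "https://www.example.com/search"


def extract_keywords(transcript: str) -> List[str]:
    """Isolate keywords by lowercasing, stripping punctuation, and dropping stop words."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def build_search_url(transcript: str) -> str:
    """Pass the isolated keywords to the search function as a query string."""
    query = " ".join(extract_keywords(transcript))
    return SEARCH_URL + "?" + urlencode({"q": query})
```

For the transcript "Search for the history of web browsers", the keywords kept are "history", "web" and "browsers", and these form the query sent to the search provider.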

Speech recognition is a process that recognizes our human voice and converts it

into text or commands that would perform a specific task. The output of voice

recognition systems can be applied in various fields. Therefore, it will be implemented in

this project by introducing a web browser with speech recognition.




7. SCOPE OF THE PROJECT






Speech recognition is a process that recognizes our human voice and converts it into text

or commands that would perform a specific task. The output of voice recognition systems

can be applied in various fields. Therefore, it will be implemented in this project by

introducing a web browser with speech recognition.

The voice-operated web browser is designed especially for people with disabilities. The design of such a browser needs to consider many technical constraints, the users' disabilities, and future requirements.

This web browser will be able to search and navigate web pages on the World Wide Web from a PC through speech input, i.e. the user will control web browsing by voice.

Information contained on the World Wide Web is inaccessible to many people. The web

is primarily a visual medium that requires a keyboard and mouse to navigate, and this

disenfranchises several types of users.

These include people who have difficulty accessing the World Wide Web, and those who temporarily cannot use a traditional web browser (for example, because their eyes or hands are occupied or because they are not near their computer).


8. PROPOSED SYSTEM

This project has speech recognizing and speech synthesizing capabilities. Although it is not a complete replacement for a conventional web browser, it is still a good web browser to use by voice. The software can also search web pages and perform tasks related to the web browser.

This project also includes a voice-mark feature, which is very similar to the bookmark feature available in existing web browsers. The following are the main modules to be implemented in our project:

• Tab Control

• Search Provider

• Working with favorites

• Implementation of Voice-Mark feature

• Dealing with pop-ups

• Working with History and Cookies

• Simple RSS (Really Simple Syndication) reader

• Voice recognizing text editor


9. METHODOLOGY AND ANALYSIS

The software process model is the model we are going to use for developing the project. Many software process models are available, and the choice should depend on the size of the project, that is, whether it is an industry-scale, large-scale or medium-scale project.

Accordingly, the model we choose should be suitable for the project: as the software process model changes, the cost of the project also changes, because the steps in each software process model vary.

The software process model we will be using is the Incremental Software Model.


The incremental build model is a method of software development where the product is

designed, implemented and tested incrementally (a little more is added each time) until the

product is finished. It involves both development and maintenance. The product is defined as

finished when it satisfies all of its requirements. This model combines the elements of the

waterfall model with the iterative philosophy of prototyping.

The product is decomposed into a number of components, each of which is designed and built separately (termed builds). Each component is delivered to the client when it is complete. This allows partial utilization of the product and avoids a long development time. It also avoids a large initial capital outlay and the subsequent long wait. This model of development also helps ease the traumatic effect of introducing a completely new system all at once.

The incremental model is an evolution of the waterfall model, where the waterfall model is

incrementally applied.

The series of releases is referred to as “increments”, with each increment providing more

functionality to the customers. After the first increment, a core product is delivered, which can

already be used by the customer. Based on customer feedback, a plan is developed for the next

increments, and modifications are made accordingly. This process continues, with increments

being delivered until the complete product is delivered. The incremental philosophy is also used

in the agile process model.

The Incremental model combines elements of the linear sequential model with the iterative

philosophy of prototyping. This model has been explicitly designed to accommodate a

product that evolves over time.

When an incremental model is used, the first increment is often a core product. The core product

is used by the customer or undergoes a detailed review. As a result of use and/or evaluation a

plan is developed for the next increment. The plan addresses the modification to the core product


to better meet the needs of the customer and delivery of additional features and functionality.

Software is constructed in a step-by-step manner. While a software product is being developed,

each step adds to what has already been completed.

Analysis Phase:

Analysis attacks a problem by breaking it into sub-problems. The objective of analysis is to determine

exactly what must be done to solve the problem. Typically, the system’s logical elements (its

boundaries, processes, and data) are defined during analysis.

Design Phase:

The objective of design is to determine how the problem will be solved. During design the

analyst’s focus shifts from the logical to the physical. Data elements are grouped to form physical

data structures, screens, reports, files, and databases.

Coding Phase:

The system is created during this phase. Programs are coded, debugged, documented, and tested.

New hardware is selected and ordered. Procedures are written and tested. End-user

documentation is prepared. Databases and files are initialized. Users are trained.

Testing Phase:

Once the system is developed, it is tested to ensure that it does what it was designed to do. After

the system passes its final test and any remaining problems are corrected, the system is

implemented and released to the user.


10. DETAILS OF THE HARDWARE & SOFTWARE

10.1 SOFTWARE:-

Front-End:- VB.net

Back-End:- Database using File Handling and XML

(i) VB.net:-

Visual Basic .NET (VB.NET) is an object-oriented computer programming language that can be

viewed as an evolution of the classic Visual Basic (VB), which is implemented on the .NET

Framework. Microsoft currently supplies two major implementations of Visual Basic: Microsoft

Visual Studio, which is commercial software and Microsoft Visual Studio Express, which is free

of charge. Microsoft's implementation of Visual Basic .NET is called "Microsoft Visual Basic".

(ii) Database using File Handling and XML:-

VB.NET File Handling

With the File type in the .NET Framework, you can perform efficient and simple manipulations

of files, including reads, writes, and appends. Using a set of examples, we look at how you can

programmatically test and mutate files in the VB.NET language.

Here, we describe in brief many of the methods available in the VB.NET language and .NET

framework on the File type. Some methods have been omitted; also the System.IO namespace

provides many other types that are separate from these shared methods.

File.ReadAllBytes: Useful for files not stored as plain text. You can open images or movies

with this method.

File.ReadAllLines: Microsoft: "Opens a file, reads all lines of the file with the specified


encoding, and closes the file."

File.ReadAllText: Returns the contents of the text file at the specified path as a string. Very

useful for plain text or settings files.

File.WriteAllBytes: Useful for files such as images that were created or mutated in memory.

File.WriteAllLines: Stores a string array in the specified file, overwriting the contents.

File.WriteAllText: Writes the contents of a string to a text file. One of the simplest ways to

persist text data.

File.AppendAllText: Use to append the contents string to the file at path. Microsoft: "Appends

the specified string to the file, creating the file if it doesn't already exist."
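
The whole-file read, overwrite and append pattern listed above can be sketched for a browsing-history file. The sketch is in Python rather than the project's VB.NET, so it uses Python's `pathlib` calls as rough counterparts of the .NET `File` methods; the history file layout (one URL per line) and the function names are assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def append_history(path: Path, url: str) -> None:
    """Append one visited URL; append mode creates the file if it does not exist
    (the behaviour described for File.AppendAllText)."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(url + "\n")


def read_history(path: Path) -> List[str]:
    """Read the whole file and split it into lines (comparable to File.ReadAllLines)."""
    if not path.exists():
        return []
    return path.read_text(encoding="utf-8").splitlines()


def clear_history(path: Path) -> None:
    """Overwrite the file with empty content (comparable to File.WriteAllText)."""
    path.write_text("", encoding="utf-8")
```

Whole-file calls like these suit small files such as history or settings; they are not a substitute for a database when the data grows.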

XML

XML is a markup language for documents containing structured information.

Structured information contains both content (words, pictures, etc.) and some indication of what

role that content plays (for example, content in a section heading has a different meaning from

content in a footnote, which means something different than content in a figure caption or

content in a database table, etc.). Almost all documents have some structure.


A markup language is a mechanism to identify structures in a document. The XML specification

defines a standard way to add markup to documents.

Data is usually best stored in a structured format. In cases where a database is not necessary, you

can use XML files. With XmlReader and XmlWriter in the VB.NET language, you can read and

write XML in an efficient way.
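
As an illustration of storing structured data in an XML file instead of a database, the sketch below saves and reloads a set of voice-marks. It is written in Python with the standard `xml.etree.ElementTree` module rather than VB.NET's XmlReader and XmlWriter, and the document layout (a `voicemarks` root with `voicemark` elements carrying `phrase` and `url` attributes) is a hypothetical format chosen for the example, on the assumption that a voice-mark pairs a spoken phrase with a page address.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def save_voice_marks(path: str, marks: Dict[str, str]) -> None:
    """Write each spoken phrase and its URL as one <voicemark> element."""
    root = ET.Element("voicemarks")
    for phrase, url in marks.items():
        ET.SubElement(root, "voicemark", {"phrase": phrase, "url": url})
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_voice_marks(path: str) -> Dict[str, str]:
    """Read the file back into a phrase-to-URL dictionary."""
    root = ET.parse(path).getroot()
    return {m.get("phrase"): m.get("url") for m in root.findall("voicemark")}
```

Loading the file at start-up gives a dictionary in which a recognized phrase can be looked up directly to find the page to open.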

10.2 HARDWARE:-

Microphones:

A quality microphone is key when using a speech recognition system. Desktop microphones are not well suited to speech recognition because they tend to pick up more ambient noise.

The best and most common choice is the headset style. It minimizes ambient noise while keeping the microphone close to your mouth at all times. Headsets are available without earphones and with earphones (mono or stereo).

Computer/ Processors:

Speech recognition applications can depend heavily on processing speed, because voice recognition runs continuously. To avoid unnecessary delays, a processor with good processing capability is required.


11. DESIGN DETAILS

The Interaction diagram of the proposed software project can be given as follows:-


The Flowchart of the proposed software project can be given as follows:-


12. IMPLEMENTATION PLAN FOR THE NEXT SEMESTER

In the next semester we will work on the different modules of the project. The following are the main modules to be implemented in our project:

• Tab Control

• Search Provider

• Working with favorites

• Implementation of Voice-Mark feature

• Dealing with pop-ups

• Working with History and Cookies

• Simple RSS (Really Simple Syndication) reader

• Voice recognizing text editor


