Middle East Technical University Department of Computer Engineering
Fall 2011
SOFTWARE REQUIREMENTS SPECIFICATION
for
AUGMENTATIVE & ALTERNATIVE COMMUNICATION APPLICATION
FOR ANDROID
By
Coşkun Şahin 1631357
Gökhan Oğuz 1631076
Evren Pala 1631084
Sponsored by INNOVA
Table of Contents
1. INTRODUCTION
1.1. Problem Definition
1.2. Purpose
1.3. Scope
1.4. User and Literature Survey
1.5. Definitions and Abbreviations
1.6. References
1.7. Overview
2. OVERALL DESCRIPTION
2.1. Product Perspective
2.2. Product Functions
2.3. Constraints, Assumptions and Dependencies
3. SPECIFIC REQUIREMENTS
3.1. Interface Requirements
3.2. Functional Requirements
3.3. Non-Functional Requirements
4. DATA MODEL AND DESCRIPTION
4.1. Data Descriptions
5. BEHAVIORAL MODEL AND DESCRIPTION
5.1. Description for Software Behavior
5.2. State Transition Diagram
6. PLANNING
6.1. Team Structure
6.2. Estimation (Basic schedule)
6.3. Process Model
7. CONCLUSION
1. INTRODUCTION
This document contains the software requirements of the Augmentative and Alternative Communication
Application for Android, developed by group JAVATAR. We first give the purpose and scope of the project,
then continue with an overall description of the product. After introducing the system in general, we
state its specific requirements, covering both functional and non-functional requirements. We then
present the data model and the behavioral model, and close with the planning of the project.
1.1. Problem Definition
Disabilities are serious problems that can deprive people of participating in social life like
others. The integration of disabled people into society is an issue taken up by helpful and sensitive
people, and in today's world disabled people can gradually take a more active role in social life thanks to
technology.
Speech disorder is one type of disability with adverse effects on daily life. The inability to speak
prevents people from expressing themselves clearly, and it obstructs communicative activities in fields
such as education and commerce. In this sense, we aim to lessen the effects of speech disorders with the
Augmentative and Alternative Communication Application project.
1.2. Purpose
The purpose of this document is to provide a complete description of all the functions and
specifications of AAC-Droid, which will be developed by Javatar, a METU CENG490 project group,
in cooperation with INNOVA. This document is intended to decrease the effort needed for development and to
provide information for validation and verification. Its content will form a basis for the
functionality, external interfaces and design constraints of the system.
1.3. Scope
AAC-Droid, the name of our product, will be a Java-based Android application for mobile devices
that helps speech-impaired people communicate with others. It is intended to be a system
that produces sound output of the text the user has written. Moreover, users can build
sentences from images chosen from categorized pictures. When more pictures are needed for practical
usage, AAC-Droid users have the option to add images by matching them with new words or phrases.
On the other hand, the system will not check the correctness or meaningfulness of the input. In
addition, AAC-Droid generates speech in Turkish only.
1.4. User and Literature Survey
There are some AAC devices and software products currently available:
ECO2 [1] is PRC's advanced AAC device, a Windows 7-based computer with fast processing
speed and plenty of computing power for greater communication results. It offers "one touch"
transition from computing to speech output, and with its large 14.1" XGA TFT display and larger
keys it makes access easier for those with visual or motor challenges.
Fig. 1: ECO2
LightWRITER [2] is a portable text-to-speech communication device, developed by ZYGO
Industries, Inc. It is the only device that has dual displays, one facing the user, so he or she
can see what is being typed, and a second outfacing display to allow communication in a
natural face-to-face position.
EZ Keys [3], by Words+, is AAC software. It has time-saving features, including dual word
prediction and abbreviation expansion. When the user begins to type a word, EZ Keys
displays a table of the six most frequently used words that begin with those letters. The user
selects the appropriate word from the display, and EZ Keys instantly types the remainder of
the word. In addition, EZ Keys features next-word prediction: the program actually
learns the user's word patterns and displays a list of the last six words he or she has used in
conjunction with the previous word. The user selects the correct word and EZ Keys types it.
There are some other products with similar features, but none for the Turkish language.
As mentioned in Section 1.1, the potential users of our product are speech-impaired people. In Turkey,
the population of speech-impaired people is not negligible. According to a report prepared by TUIK in 2002,
the rate of people with speech disorders is 0.38 %, which indicates that there are almost 280 thousand such
people in our country. Another study published by the same institution in 2010 states that 54.4 % of the
speech-impaired demand improved educational facilities [4].
1.5. Definitions and Abbreviations
SRS Software Requirements Specification
AAC Augmentative and Alternative Communication
TUIK Turkish Statistical Institute (Türkiye İstatistik Kurumu)
NLP Natural Language Processing
TTS text-to-speech
GUI Graphical User Interface
OS Operating System
SDK Software Development Kit
1.6. References
[1]: Prentke Romich Company (PRC), ECO2, from https://store.prentrom.com/product_info.php/cPath/11/products_id/53
[2]: Toby Churchill, Lightwriter product, from http://www.toby-churchill.com/
[3]: Words+ (Simulations Plus), EZ Keys, from http://www.words-plus.com/website/products/soft/ezkeys.htm
[4]: Turkish Statistical Institute (TUIK), (2010). Özürlülerin Sorun ve Beklentileri Araştırması, from http://www.tuik.gov.tr/VeriBilgi.do?tb_id=5&ust_id=1
[5]: Laboratory for Computational Studies of Language, (2011), from http://www.ceng.metu.edu.tr/research/lcsl
[6]: Search Mobile Computing, text-to-speech, from http://searchmobilecomputing.techtarget.com/definition/text-to-speech
1.7. Overview
The following sections contain a detailed description of the product, explaining its features and
functions as well as the constraints, assumptions and dependencies of the system. The interface,
functional and non-functional requirements follow. Then come the data model and the behavioral
model of the system, with diagrams giving an intuition of how the system works in detail.
Finally, there is a conclusion at the end of the document.
2. OVERALL DESCRIPTION
2.1. Product Perspective
AAC-Droid is an application running on Android OS. The system consists of several
components. First, it has a user interface enabling the user to give input to the TTS engine. There
are preloaded images associated with words and phrases for many situations that appear
frequently in daily life. Users will be able to insert appropriate pictures, each corresponding to a
meaningful word or phrase, as well as type alphabetic characters from the keyboard. The GUI sends
the entered input to the central part of the system, the Turkish TTS engine. If the input consists of
images, the system first sends it to a phrase library to convert it to a suitable text before passing it to
the TTS. The TTS engine processes the text, forms the appropriate sound and sends this sound stream to the
related output device. In order to generate the sound output, the TTS interacts with the sound database
as well as with the GUI and the output device. AAC-Droid is an independent and totally
self-contained software system.
Diagram 1. System Interfaces of AAC-Droid
Diagram 1 shows the general structure of the system. The TTS engine is its main part and
deals with the pronunciation details. For realistic pronunciation, the sounds for words and syllables will be
optimized according to correct usage in the language, so the database connection is important in
terms of performance. The database contains some words as a whole, but not all of them; if a required word is
not in the database, the TTS engine syllabizes the word and retrieves the sounds of its syllables from the database.
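The fallback just described (whole-word lookup first, per-syllable lookup otherwise) can be sketched roughly as follows. This is an illustrative sketch, not the actual engine: the in-memory maps and the `.snd` identifiers are placeholders for the real sound database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SoundLookup {
    // Hypothetical in-memory stand-ins for the sound database tables.
    private final Map<String, String> wordSounds = new HashMap<>();
    private final Map<String, String> syllableSounds = new HashMap<>();

    public SoundLookup() {
        wordSounds.put("merhaba", "merhaba.snd"); // stored as a whole word
        syllableSounds.put("ma", "ma.snd");
        syllableSounds.put("sa", "sa.snd");
    }

    /** Returns the sound streams for one word: whole-word hit first,
     *  otherwise falls back to per-syllable lookups. */
    public List<String> soundsFor(String word, List<String> syllables) {
        List<String> result = new ArrayList<>();
        String whole = wordSounds.get(word);
        if (whole != null) {
            result.add(whole);           // word found in the database
        } else {
            for (String s : syllables)   // syllabized fallback
                result.add(syllableSounds.get(s));
        }
        return result;
    }
}
```

The combined streams would then be handed to the output device for vocalization.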
2.2. Product Functions
The user shall provide a specific sequence of inputs to make the system operate properly, and the
system shall produce the corresponding sound stream via a sound output device. The
functions of AAC-Droid can be categorized into three parts.
Diagram 2. System Use Cases
In the above diagram, the functions of AAC-Droid are shown: writing the sentence or statement
directly, forming sentences and phrases with images, and adding new pictures bound to an expression.
There is only one type of actor using the system; there is no need for an admin role or any classification of users.
2.2.1. Writing the sentence or statement directly
The keyboard of the mobile device and the text box on the GUI enable users to write their sentences;
the system operates like a text editor in this sense. The editor includes a rich set of keyboard commands
for manipulating entire words, lines and paragraphs at a time. Shortcuts such as copy/paste
and undo/redo are provided, and the text editor has an auto-complete feature.
Diagram 3. Text typing use case
The user is able to type text via the editor of the application. After completion, the user clicks the
submit button to make the application vocalize the input.
2.2.2. Forming statements and sentences with images
Besides text typing, AAC-Droid provides the functionality of composing sentences and phrases using
categorized images.
Diagram 4. Image selecting use case
After the user selects images, the system generates related sentences. The user chooses one of them and
submits it for vocalization.
2.2.3. Adding new images
In the image selection interface, users can also add new images and related texts. In this way, the
user can customize the program and tailor it to his or her needs. An appropriate image expressing the
situation to be described must exist. After the user types the text for the image, both are inserted
into the database and are ready to use for new inputs.
Diagram 5. Adding image use case
The text of the image must be syllabized according to Turkish language rules. The system shall try to parse
the text and vocalize it; the user can cancel the operation if the sound output is not satisfactory.
2.3. Constraints, Assumptions and Dependencies
As mentioned in Section 2.1, the system is compatible with Android OS and will operate only in
Turkish. It is assumed that the user understands Turkish and can write meaningful sentences or
expressions in it.
There are two constraints: the user must have a mobile device running Android OS, and the
device must have enough disk space to store the sound data.
3. SPECIFIC REQUIREMENTS
3.1. Interface Requirements
3.1.1. Text Editing Interface
When the application is launched, the text editing interface appears by default. This interface is
related to the function of writing sentences and statements directly with the keyboard, explained in
Section 2.2.1. The screen contains a text box displaying the typed text, a submit button to send the
data to be processed, and another button to switch to the other interface.
This interface operates like a traditional text editor with some additional properties, such as useful
keyboard shortcuts and font format editing options. While the user is typing, the system does not convert
the text to speech; it waits until the user clicks the "submit" button.
3.1.2. Image Selection Interface
This is the second user interface of the system. In this interface, users can select any image
from a catalog in which images are categorized into groups according to their characteristics (e.g.
colors, places, foods). In fact, these images are used as buttons for selecting words: each image
corresponds to a common word, and when the image is selected, the related word is inserted into the text
area. To be more specific, images are bound to word stems, because images are selected according to their
frequency of occurrence in everyday speech. Binding separate images to suffixed forms of a word would
not only waste space but also cause ambiguity (for example, the image-button corresponding to the words
"masayı", "masadan" and "masalar" should be the same, and specifically corresponds to the stem "masa").
Therefore, when an image is selected, suffixed forms of the word are suggested by the system; otherwise,
the simple noun form is selected by default.
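The stem-based suggestion could look roughly like the sketch below. The form table is purely illustrative; in the real system the suffixed forms would come from the phrase library or an NLP tool rather than a hard-coded map.

```java
import java.util.List;
import java.util.Map;

public class SuffixSuggester {
    // Hypothetical table: each image-button's stem maps to its stored
    // suffixed forms; a real system would generate these automatically.
    private static final Map<String, List<String>> FORMS = Map.of(
        "masa", List.of("masa", "masayı", "masadan", "masalar")
    );

    /** Forms offered when the image bound to this stem is tapped;
     *  the bare stem (the default simple noun form) comes first. */
    public static List<String> suggest(String stem) {
        return FORMS.getOrDefault(stem, List.of(stem));
    }
}
```

A stem with no stored forms simply falls back to the stem itself, matching the "simple noun form by default" behavior above.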
3.1.3. New Image Addition Interface
New images are added by the user via a separate interface. It gives the user the chance to add a new
image and attribute meaning to it by binding a word, a phrase or a full sentence. In the word form, the
user can bind the image directly to a particular word only, or also add its word stem if desired.
This interface also lets the user add the image to any category of pictures; creating a new image group
(in fact, a new group of buttons) is another facility it provides.
3.2. Functional Requirements
3.2.1. Text-to-speech
Description: This function of the application converts the text typed by the user into a sound stream.
Basic Data Flow:
1. User types the text.
2. User clicks the submit button.
3. Whole text is parsed into words.
4. Matching words’ sounds are retrieved from the database.
5. Remaining words are parsed into syllables.
6. Corresponding sound for each syllable is retrieved.
7. Sound streams are combined and vocalized.
Alternative Data Flow 1:
1. User types any text.
2. User clicks the submit button.
3. The system finds unrecognized characters.
4. Operation is cancelled.
5. The GUI shows an error message asking the user to check the spelling.
Alternative Data Flow 2:
1. User types any text.
2. User clicks the submit button.
3. The system cannot parse the text with Turkish language rules.
4. Operation is cancelled.
5. The GUI shows an error message asking the user to check the spelling.
Alternative Data Flow 3:
1. User types any text.
2. User clicks the submit button.
3. The text contains more than 100 characters.
4. Operation is cancelled.
5. The GUI shows an error message telling the user that the text is too long.
Functional Requirements:
REQ-1 The user must type a text of no more than 100 characters.
REQ-2 It must be possible to syllabize the text according to Turkish language rules.
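The syllabization required by REQ-2 could follow the common Turkish rule that a syllable boundary falls before any consonant immediately followed by a vowel. The sketch below is an assumption about how this might be done, not the project's actual engine; it handles regular lowercase words but not loanwords with consonant clusters.

```java
import java.util.ArrayList;
import java.util.List;

public class Syllabizer {
    private static final String VOWELS = "aeıioöuü"; // Turkish vowels

    private static boolean isVowel(char c) {
        return VOWELS.indexOf(Character.toLowerCase(c)) >= 0;
    }

    /** Splits a Turkish word into syllables: a boundary is placed
     *  before any consonant directly followed by a vowel. */
    public static List<String> syllabize(String word) {
        List<String> syllables = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length() - 1; i++) {
            // consonant followed by a vowel starts a new syllable
            if (!isVowel(word.charAt(i)) && isVowel(word.charAt(i + 1))) {
                syllables.add(word.substring(start, i));
                start = i;
            }
        }
        syllables.add(word.substring(start)); // final syllable
        return syllables;
    }
}
```

For example, "merhaba" splits into mer-ha-ba and "istanbul" into is-tan-bul under this rule.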
3.2.2. Image-to-speech
Description: This function of the application inserts the text corresponding to the chosen image.
Basic Data Flow:
1. User selects an image.
2. System suggests several different forms of the word.
3. User chooses one of them.
4. Selected word is inserted into text area.
5. User clicks on the submit button.
6. Whole text is parsed into words.
7. Matching words’ sounds are retrieved from the database.
8. Remaining words are parsed into syllables.
9. Corresponding sound for each syllable is retrieved.
10. Sound streams are combined and vocalized.
Alternative Data Flow 1:
1. User selects an image.
2. System suggests several different forms of the word.
3. User chooses one of them.
4. Selected word is inserted into text area.
5. User clicks on the submit button.
6. Whole text is parsed into words.
7. The text contains more than 100 characters.
8. Operation is cancelled.
9. The GUI shows an error message telling the user that the text is too long.
Functional Requirements:
REQ-1 The user is able to select more than one image.
3.2.3. Adding new image
Description: This function of the application adds a new image and its related text into the database.
Basic Data Flow:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. User fills the text box with the meaning of the image.
5. User optionally fills the other text box with the word stem if a word is bound to the image.
6. User clicks on the submit button.
7. Whole text is parsed into words.
8. Matching words’ sounds are retrieved from the database.
9. Remaining words are parsed into syllables.
10. Corresponding sound for each syllable is retrieved.
11. Sound streams are combined and vocalized.
12. User clicks verify button.
13. Image is inserted into the database.
Alternative Data Flow 1:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User clicks on cancel button.
4. Operation is cancelled.
Alternative Data Flow 2:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. Image already exists in the database.
5. Operation is cancelled.
6. The GUI shows an error message asking the user to choose another image.
Alternative Data Flow 3:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. User fills the text box with the meaning of the image.
5. The text already exists in the database, associated with an image.
6. Operation is cancelled.
7. The GUI shows an error message telling the user that the text already exists.
Alternative Data Flow 4:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. User fills the text box with the meaning of the image.
5. User optionally fills the other text box with the word stem if a word is bound to the image.
6. User clicks on the submit button.
7. Whole text is parsed into words.
8. Matching words’ sounds are retrieved from the database.
9. Remaining words are parsed into syllables.
10. Corresponding sound for each syllable is retrieved.
11. Sound streams are combined and vocalized.
12. User clicks the cancel button.
13. Operation is cancelled.
Functional Requirements:
REQ-1 The user must type a text of no more than 100 characters.
REQ-2 It must be possible to syllabize the text according to Turkish language rules.
REQ-3 The image should preserve its main form after being minimized.
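The duplicate checks in alternative flows 2 and 3, together with REQ-1's length limit, might be enforced at insertion time roughly as follows. The in-memory maps are illustrative stand-ins for the real database tables.

```java
import java.util.HashMap;
import java.util.Map;

public class ImageCatalog {
    // Hypothetical stand-ins for the image and text tables.
    private final Map<String, String> textByImage = new HashMap<>();
    private final Map<String, String> imageByText = new HashMap<>();

    /** Inserts an image/text pair, rejecting duplicates as in
     *  alternative flows 2 and 3. Returns false when cancelled. */
    public boolean addImage(String imagePath, String text) {
        if (textByImage.containsKey(imagePath)) return false; // image exists
        if (imageByText.containsKey(text)) return false;      // text taken
        if (text.length() > 100) return false;                // REQ-1 limit
        textByImage.put(imagePath, text);
        imageByText.put(text, imagePath);
        return true;
    }
}
```

Checking both directions (image and text) mirrors the two distinct error messages the GUI shows in the alternative flows.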
3.3. Non-Functional Requirements
3.3.1. Performance Requirements
AAC-Droid provides real-time communication to its users, so it converts text to speech in less than
two seconds.
Because the application runs on a mobile device, it must work efficiently enough that it does not slow
down other processes or applications executing on the same device.
3.3.2. Design Constrains
The Java programming language is used to implement the system, which is developed for Android OS;
no other platforms are considered during development. The Eclipse integrated development environment is
used because it is compatible with the Android SDK and well suited to Java programming. The system is
tested on mobile devices running Android OS. The application is not web-based or network-attached.
No pirated software is used in the development process, and Java coding conventions are followed.
4. DATA MODEL AND DESCRIPTION
This section describes the information domain for the software.
4.1. Data Descriptions
In this project, there are five types of data objects in the system, namely word objects, syllable
objects, image objects, sound objects and the TTSEngine object.
4.1.1. Data Objects
Word object: Obtained from the parsed input. The system searches for the word in the database.
It also contains its syllables and the corresponding sound objects.
Syllable object: The word is syllabized and syllable objects are formed. Each contains
the object of its sound.
Image object: Images are obtained as input. The system converts the images into text that
will be parsed. This object holds the words related to the image.
Sound object: All text is matched with sound objects. In addition to the stream of the
sound, each contains a unique ID.
TTSEngine object: It parses the input and generates word objects. It then parses the
word objects to form syllable objects. In addition, this object finds the appropriate versions of
the sound objects according to the input.
4.1.2. Relationships
This section describes the relationships between the data objects described in the previous
section.
Word – Syllable: Each word may be associated with one or more syllables. In the same way,
a syllable may be associated with multiple words.
Word – Sound: Each word is associated with more than one sound object. However, a sound
object represents only one word.
Syllable – Sound: Each sound represents only one syllable, but a syllable may own more than
one sound because of different versions of pronunciations.
Image – Word: Each image represents a set of words.
TTSEngine – Word & Syllable: Since there is only one TTSEngine, all words and
syllables are associated with it.
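The objects and relationships above can be summarized as a class sketch. All field names here are illustrative rather than taken from the actual design.

```java
import java.util.List;

// Minimal sketch of the five data objects and their relationships.
class Sound {
    int id;              // unique ID
    byte[] stream;       // raw sound stream
}

class Syllable {
    String text;
    List<Sound> sounds;  // several pronunciation variants per syllable
}

class Word {
    String text;
    List<Syllable> syllables; // one or more syllables per word
    List<Sound> sounds;       // whole-word recordings, if stored
}

class Image {
    String path;
    List<Word> words;    // each image represents a set of words
}

class TTSEngine {
    List<Word> words;        // the single engine owns all words
    List<Syllable> syllables;
}
```

The list-valued fields encode the one-to-many relationships listed in Section 4.1.2.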
4.1.3. Complete Data Model
Diagram 6. Complete Data Model
4.1.4. Data Dictionary
Stream: It is multimedia that is constantly received by and presented to an end-user while being
delivered by a streaming provider. The name refers to the delivery method of the medium rather than
to the medium itself.
NLP: Natural Language Processing (also known as Computational Linguistics) aims to bring
together technological and scientific studies of human languages, in particular, their processing by
machine, with the dual goals of advancing language technology and understanding the computational
basis of the human language capacity [5].
TTS: Text-to-speech is a type of speech synthesis application that is used to create a spoken sound
version of the text in a computer document, such as a help file or a Web page. TTS can enable the
reading of computer display information for the visually challenged person, or may simply be used to
augment the reading of a text message [6].
5. BEHAVIORAL MODEL AND DESCRIPTION
5.1. Description for Software Behavior
The program has two main parts: text typing and image selection. After the program starts, the
text typing screen is shown by default. In this state, users can write any text they want to
listen to. Two functions can be chosen in this interface: submitting the text to the TTS engine, and
switching to the image selection interface. If the text is submitted, it is processed by the TTS engine,
which generates the sound by interacting with the database.
In the image selection state, users can select images that refer to the situation they want to
describe; the program then adds the text bound to each image to the text box. If the text is
submitted, the same procedure as in the first state is applied. The user can also switch back to the first
state using the same button. In addition, new images can be added and related words assigned to them in
this state: after clicking the "add image" button, the user browses to the image and types the text.
The image and text are saved in the database and can be used to form new sentences more quickly.
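The two-state behavior described above can be sketched as a small state machine. The state and event names are illustrative, not taken from the actual design.

```java
// Minimal sketch of the two-screen behavior of Section 5.1.
enum State { TEXT_TYPING, IMAGE_SELECTION }
enum Event { SWITCH_SCREEN, SUBMIT, ADD_IMAGE }

class AppStateMachine {
    State state = State.TEXT_TYPING; // default screen on startup

    State handle(Event e) {
        switch (e) {
            case SWITCH_SCREEN:      // the same button toggles both ways
                state = (state == State.TEXT_TYPING)
                        ? State.IMAGE_SELECTION : State.TEXT_TYPING;
                break;
            case SUBMIT:             // TTS runs; the screen is unchanged
            case ADD_IMAGE:          // only meaningful in image selection
                break;
        }
        return state;
    }
}
```

Submitting text or adding an image leaves the current screen unchanged; only the switch button moves between the two states, matching the transition diagram.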
5.2. State Transition Diagram
Diagram 7. State Transition Diagram
6. PLANNING
6.1. Team Structure
Our team consists of three people:
Gökhan OĞUZ – Head, Researcher, Database Developer, Software Tester, Public Relations
Coşkun ŞAHİN – Researcher, Database Developer, GUI Developer, Android OS Specialist
Evren PALA – Researcher, Language Specialist, GUI Developer, Software Tester
Our project mainly consists of three parts. The first and main part is building a TTS engine. Since it
requires the most work and energy, we need to collaborate on it, and all team members should
participate in the research in this field. When the technical part of the TTS engine is over, the remaining
work will be divided: Evren Pala is responsible for the correct usage of the words, while Gökhan Oğuz and
Coşkun Şahin are responsible for the database interactions of the TTS engine. The second part is generating
phrases or sentences from images; Coşkun Şahin and Gökhan Oğuz are mainly responsible for building the
phrase library. The third part is the GUI, for which Evren Pala and Coşkun Şahin are responsible.
Although we have a team leader, decisions are made collaboratively, as they should be. We
meet with our teaching assistant once a week (more often if needed) to decide on the details of weekly
progress and to make a schedule for working together.
6.2. Estimation (Basic schedule)
Diagram 8. Main Parts of the Project
Diagram 8. Gantt Chart
6.3. Process Model
In our project, we will use the waterfall process model, in which specification and development
proceed in separate, distinct phases.
It is the most widely used model in the field of software development: it is a linear model and
simple to follow; documentation is produced at every stage, which makes the design procedure easy to
understand; and testing after each coding step helps eliminate possible errors at each stage. Because of
these advantages, we decided to use this model in our project.
Diagram 9. Waterfall Model
7. CONCLUSION
This report has been prepared to show the requirement details of the AAC project from several aspects.
First, a brief description of AAC-Droid is introduced. Then, market and technology research is carried out
and the results are presented. In the body, the requirement details of the project are described and the
behavioral model is introduced. Scheduling and the timeline have also been specified in this
document. This specification will hopefully constitute the basis for the design, development and testing
of the project.