Middle East Technical University Department of Computer Engineering
Fall 2011
SOFTWARE REQUIREMENTS SPECIFICATION
for
AUGMENTATIVE & ALTERNATIVE COMMUNICATION APPLICATION
FOR ANDROID
By
Coşkun Şahin 1631357
Gökhan Oğuz 1631076
Evren Pala 1631084
Sponsored by INNOVA
Table of Contents
1. INTRODUCTION
1.1. Problem Definition
1.2. Purpose
1.3. Scope
1.4. User and Literature Survey
1.5. Definitions and Abbreviations
1.6. References
1.7. Overview
2. OVERALL DESCRIPTION
2.1. Product Perspective
2.2. Product Functions
2.3. Constraints, Assumptions and Dependencies
3. SPECIFIC REQUIREMENTS
3.1. Interface Requirements
3.2. Functional Requirements
3.3. Non-Functional Requirements
4. DATA MODEL AND DESCRIPTION
4.1. Data Descriptions
5. BEHAVIORAL MODEL AND DESCRIPTION
5.1. Description for Software Behavior
5.2. State Transition Diagram
6. PLANNING
6.1. Team Structure
6.2. Estimation (Basic schedule)
6.3. Process Model
7. CONCLUSION
1. INTRODUCTION
This document contains the software requirements of the Augmentative and Alternative Communication
Application for Android, developed by group JAVATAR. We first give the purpose and scope of the project,
then continue with an overall description of the product. After introducing the system in general, we
state its specific requirements, covering both functional and non-functional requirements. We then
present the data model and the behavioral model, and close with the planning of the project.
1.1. Problem Definition
Disabilities are serious problems that can deprive people of participating in social life like
others. The integration of disabled people into society is an issue taken up by helpful and sensitive
people, and in today's world disabled people can gradually take a more active role in social life thanks to
technology.
Speech disorder is one type of disability with adverse effects on daily life. The inability to speak
prevents people from expressing themselves clearly, and it obstructs communicative activities in fields
such as education and commerce. In this sense, we aim to lessen the effects of speech disorders with the
Augmentative and Alternative Communication Application project.
1.2. Purpose
The purpose of this document is to provide a complete description of all the functions and
specifications of AAC-Droid, which will be developed by Javatar, a METU CENG490 project group,
in cooperation with INNOVA. This document is intended to decrease the effort needed for development and to
provide information for validation and verification. Its content will form a basis for the
functionality, external interfaces and design constraints of the system.
1.3. Scope
AAC-Droid, the name of our product, will be a Java-based Android application for mobile devices
that helps speech-impaired people communicate with others. It is intended to be a system
that produces sound output of the text the user has written. Moreover, users can build
sentences from images chosen from categorized pictures. When more pictures are needed for practical
usage, AAC-Droid users have the option to add images by matching them with new words or phrases.
On the other hand, the system will not check the correctness or meaningfulness of the input. In
addition, AAC-Droid generates speech in Turkish only.
1.4. User and Literature Survey
There are some AAC devices and software products currently available:
ECO2 [1] is PRC's advanced AAC device, a Windows 7-based computer with fast processing
speed and plenty of computing power for greater communication results. It offers "one touch"
transition from computing to speech output, and with its large 14.1" XGA TFT display and larger
keys it makes access easier for those with visual or motor challenges.
Fig. 1: ECO2
LightWRITER [2] is a portable text-to-speech communication device, developed by ZYGO
Industries, Inc. It is the only device that has dual displays, one facing the user, so he or she
can see what is being typed, and a second outfacing display to allow communication in a
natural face-to-face position.
EZ Keys [3], by Words+, is AAC software. It has time-saving features, including dual word
prediction and abbreviation expansion. When the user begins to type a word, EZ Keys
displays a table of the six most frequently used words that begin with those letters. The user
selects the appropriate word from the display, and EZ Keys instantly types the remainder of
the word. In addition, EZ Keys features next-word prediction: the program actually
learns the user's word patterns and displays a list of the last six words he or she has used in
conjunction with the previous word. The user selects the correct word and EZ Keys types it.
There are some other products with similar features, but none for the Turkish language.
As mentioned in Section 1.1, the potential users of our product are speech-impaired people. In Turkey,
the population of speech-impaired people is not negligible. According to a report prepared by TUIK in 2002,
the rate of people with speech disorders is 0.38 %, which indicates that there are almost 280 thousand such
people in our country. Another study published by the same institution in 2010 states that 54.4 % of the
speech-impaired demand improved educational facilities [4].
1.5. Definitions and Abbreviations
SRS Software Requirements Specification
AAC Augmentative and Alternative Communication
TUIK Turkish Statistical Institute (Türkiye İstatistik Kurumu)
NLP Natural Language Processing
TTS text-to-speech
GUI Graphical User Interface
OS Operating System
SDK Software Development Kit
1.6. References
[1]: Prentke Romich Company (PRC), ECO2, from https://store.prentrom.com/product_info.php/cPath/11/products_id/53
[2]: Toby Churchill, Lightwriter product, from http://www.toby-churchill.com/
[3]: Words+ (Simulations Plus), EZ Keys, from http://www.words-plus.com/website/products/soft/ezkeys.htm
[4]: Turkish Statistical Institute (TUIK), (2010). Özürlülerin Sorun ve Beklentileri Araştırması, from http://www.tuik.gov.tr/VeriBilgi.do?tb_id=5&ust_id=1
[5]: Laboratory for Computational Studies of Language, (2011), from http://www.ceng.metu.edu.tr/research/lcsl
[6]: Search Mobile Computing, text-to-speech, from http://searchmobilecomputing.techtarget.com/definition/text-to-speech
1.7. Overview
The following sections contain a detailed description of the product, explaining its features and
functions as well as the constraints, assumptions and dependencies of the system. The interface,
functional and non-functional requirements follow. Then come the data model and the behavioral
model of the system, with diagrams giving an intuition of how the system works in detail.
Finally, there is a conclusion at the end of the document.
2. OVERALL DESCRIPTION
2.1. Product Perspective
AAC-Droid is an application running on Android OS. The system consists of several
components. First, it has a user interface enabling the user to give input to the TTS engine. There
are preloaded images associated with words and phrases for many situations that appear
frequently in daily life. Users will be able to insert appropriate pictures, each corresponding to a
meaningful word or phrase, as well as type alphabetic characters from the keyboard. The GUI sends
the entered input to the central part of the system, the Turkish TTS engine. If the input consists of
images, the system first sends it to a phrase library to convert it to a suitable text before passing it to
the TTS. The TTS engine processes the text, forms the appropriate sound and sends this sound stream to the
related output device. In order to generate the sound output, the TTS interacts with the sound database
as well as with the GUI and the output device. AAC-Droid is an independent and totally
self-contained software system.
Diagram 1. System Interfaces of AAC-Droid
Diagram 1 shows the general structure of the system. The TTS engine is its main part and
deals with the pronunciation details. For realistic pronunciation, the sounds for words and syllables will be
optimized according to correct usage in the language, so the database connection is important in
terms of performance. The database contains some words as a whole, but not all of them; if a required word is
not in the database, the TTS engine syllabizes the word and retrieves the sounds of its syllables from the database.
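The fallback just described (whole-word lookup first, per-syllable lookup otherwise) can be sketched roughly as follows. This is an illustrative sketch, not the actual engine: the in-memory maps and the `.snd` identifiers are placeholders for the real sound database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SoundLookup {
    // Hypothetical in-memory stand-ins for the sound database tables.
    private final Map<String, String> wordSounds = new HashMap<>();
    private final Map<String, String> syllableSounds = new HashMap<>();

    public SoundLookup() {
        wordSounds.put("merhaba", "merhaba.snd"); // stored as a whole word
        syllableSounds.put("ma", "ma.snd");
        syllableSounds.put("sa", "sa.snd");
    }

    /** Returns the sound streams for one word: whole-word hit first,
     *  otherwise falls back to per-syllable lookups. */
    public List<String> soundsFor(String word, List<String> syllables) {
        List<String> result = new ArrayList<>();
        String whole = wordSounds.get(word);
        if (whole != null) {
            result.add(whole);           // word found in the database
        } else {
            for (String s : syllables)   // syllabized fallback
                result.add(syllableSounds.get(s));
        }
        return result;
    }
}
```

The combined streams would then be handed to the output device for vocalization.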
2.2. Product Functions
The user shall provide a specific sequence of inputs to make the system operate properly, and the
system shall produce the corresponding sound stream via a sound output device. The
functions of AAC-Droid can be categorized into three parts.
Diagram 2. System Use Cases
In the above diagram, the functions of AAC-Droid are shown: writing the sentence or statement
directly, forming sentences and phrases with images, and adding new pictures bound to an expression.
There is only one type of actor using the system; there is no need for an admin role or any classification of users.
2.2.1. Writing the sentence or statement directly
The keyboard of the mobile device and the text box on the GUI enable users to write their sentences;
the system operates like a text editor in this sense. The editor includes a rich set of keyboard commands
for manipulating entire words, lines and paragraphs at a time. Shortcuts such as copy/paste
and undo/redo are provided, and the text editor has an auto-complete feature.
Diagram 3. Text typing use case
The user is able to type text via the editor of the application. After completion, the user clicks the
submit button to make the application vocalize the input.
2.2.2. Forming statements and sentences with images
Besides text typing, AAC-Droid provides the functionality of composing sentences and phrases using
categorized images.
Diagram 4. Image selecting use case
After the user selects images, the system generates related sentences. The user chooses one of them and
submits it for vocalization.
2.2.3. Adding new images
In the image selection interface, users can also add new images and related texts. In this way, the
user can customize the program and tailor it to his or her needs. An appropriate image expressing the
situation to be described must exist. After the user types the text for the image, both are inserted
into the database and are ready to use for new inputs.
Diagram 5. Adding image use case
The text of the image must be syllabized according to Turkish language rules. The system shall try to parse
the text and vocalize it; the user can cancel the operation if the sound output is not satisfactory.
2.3. Constraints, Assumptions and Dependencies
As mentioned in Section 2.1, the system is compatible with Android OS and will operate only in
Turkish. It is assumed that the user understands Turkish and can write meaningful sentences or
expressions in it.
There are two constraints: the user must have a mobile device running Android OS, and the
device must have enough disk space to store the sound data.
3. SPECIFIC REQUIREMENTS
3.1. Interface Requirements
3.1.1. Text Editing Interface
When the application is launched, the text editing interface appears by default. This interface is
related to the function of writing sentences and statements directly with the keyboard, explained in
Section 2.2.1. The screen contains a text box displaying the typed text, a submit button to send the
data to be processed, and another button to switch to the other interface.
This interface operates like a traditional text editor with some additional properties, such as useful
keyboard shortcuts and font format editing options. While the user is typing, the system does not convert
the text to speech; it waits until the user clicks the "submit" button.
3.1.2. Image Selection Interface
This is the second user interface of the system. In this interface, users can select any image
from a catalog in which images are categorized into groups according to their characteristics (e.g.
colors, places, foods). In fact, these images are used as buttons for selecting words: each image
corresponds to a common word, and when the image is selected, the related word is inserted into the text
area. To be more specific, images are bound to word stems, because images are selected according to their
frequency of occurrence in everyday speech. Binding separate images to suffixed forms of a word would
not only waste space but also cause ambiguity (for example, the image-button corresponding to the words
"masayı", "masadan" and "masalar" should be the same, and specifically corresponds to the stem "masa").
Therefore, when an image is selected, suffixed forms of the word are suggested by the system; otherwise,
the simple noun form is selected by default.
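The stem-based suggestion could look roughly like the sketch below. The form table is purely illustrative; in the real system the suffixed forms would come from the phrase library or an NLP tool rather than a hard-coded map.

```java
import java.util.List;
import java.util.Map;

public class SuffixSuggester {
    // Hypothetical table: each image-button's stem maps to its stored
    // suffixed forms; a real system would generate these automatically.
    private static final Map<String, List<String>> FORMS = Map.of(
        "masa", List.of("masa", "masayı", "masadan", "masalar")
    );

    /** Forms offered when the image bound to this stem is tapped;
     *  the bare stem (the default simple noun form) comes first. */
    public static List<String> suggest(String stem) {
        return FORMS.getOrDefault(stem, List.of(stem));
    }
}
```

A stem with no stored forms simply falls back to the stem itself, matching the "simple noun form by default" behavior above.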
3.1.3. New Image Addition Interface
New images are added by the user via a separate interface. It gives the user the chance to add a new
image and attribute meaning to it by binding a word, a phrase or a full sentence. In the word form, the
user can bind the image directly to a particular word only, or also add its word stem if desired.
This interface also lets the user add the image to any category of pictures; creating a new image group
(in fact, a new group of buttons) is another facility it provides.
3.2. Functional Requirements
3.2.1. Text-to-speech
Description: This function of the application converts the text typed by the user into a sound stream.
Basic Data Flow:
1. User types the text.
2. User clicks the submit button.
3. Whole text is parsed into words.
4. Matching words’ sounds are retrieved from the database.
5. Remaining words are parsed into syllables.
6. Corresponding sound for each syllable is retrieved.
7. Sound streams are combined and vocalized.
Alternative Data Flow 1:
1. User types any text.
2. User clicks the submit button.
3. The system finds unrecognized characters.
4. Operation is cancelled.
5. The GUI shows an error message asking the user to check the spelling.
Alternative Data Flow 2:
1. User types any text.
2. User clicks the submit button.
3. The system cannot parse the text with Turkish language rules.
4. Operation is cancelled.
5. The GUI shows an error message asking the user to check the spelling.
Alternative Data Flow 3:
1. User types any text.
2. User clicks the submit button.
3. The text contains more than 100 characters.
4. Operation is cancelled.
5. The GUI shows an error message telling the user that the text is too long.
Functional Requirements:
REQ-1 The user must type a text of no more than 100 characters.
REQ-2 It must be possible to syllabize the text according to Turkish language rules.
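The syllabization required by REQ-2 could follow the common Turkish rule that a syllable boundary falls before any consonant immediately followed by a vowel. The sketch below is an assumption about how this might be done, not the project's actual engine; it handles regular lowercase words but not loanwords with consonant clusters.

```java
import java.util.ArrayList;
import java.util.List;

public class Syllabizer {
    private static final String VOWELS = "aeıioöuü"; // Turkish vowels

    private static boolean isVowel(char c) {
        return VOWELS.indexOf(Character.toLowerCase(c)) >= 0;
    }

    /** Splits a Turkish word into syllables: a boundary is placed
     *  before any consonant directly followed by a vowel. */
    public static List<String> syllabize(String word) {
        List<String> syllables = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length() - 1; i++) {
            // consonant followed by a vowel starts a new syllable
            if (!isVowel(word.charAt(i)) && isVowel(word.charAt(i + 1))) {
                syllables.add(word.substring(start, i));
                start = i;
            }
        }
        syllables.add(word.substring(start)); // final syllable
        return syllables;
    }
}
```

For example, "merhaba" splits into mer-ha-ba and "istanbul" into is-tan-bul under this rule.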
3.2.2. Image-to-speech
Description: This function of the application inserts the text corresponding to the chosen image.
Basic Data Flow:
1. User selects an image.
2. System suggests several different forms of the word.
3. User chooses one of them.
4. Selected word is inserted into text area.
5. User clicks on the submit button.
6. Whole text is parsed into words.
7. Matching words’ sounds are retrieved from the database.
8. Remaining words are parsed into syllables.
9. Corresponding sound for each syllable is retrieved.
10. Sound streams are combined and vocalized.
Alternative Data Flow 1:
1. User selects an image.
2. System suggests several different forms of the word.
3. User chooses one of them.
4. Selected word is inserted into text area.
5. User clicks on the submit button.
6. Whole text is parsed into words.
7. The text contains more than 100 characters.
8. Operation is cancelled.
9. The GUI shows an error message telling the user that the text is too long.
Functional Requirements:
REQ-1 The user is able to select more than one image.
3.2.3. Adding new image
Description: This function of the application adds a new image and its related text into the database.
Basic Data Flow:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. User fills the text box with the meaning of the image.
5. User optionally fills the other text box with the word stem if a word is bound to the image.
6. User clicks on the submit button.
7. Whole text is parsed into words.
8. Matching words’ sounds are retrieved from the database.
9. Remaining words are parsed into syllables.
10. Corresponding sound for each syllable is retrieved.
11. Sound streams are combined and vocalized.
12. User clicks verify button.
13. Image is inserted into the database.
Alternative Data Flow 1:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User clicks on cancel button.
4. Operation is cancelled.
Alternative Data Flow 2:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. Image already exists in the database.
5. Operation is cancelled.
6. The GUI shows an error message asking the user to choose another image.
Alternative Data Flow 3:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. User fills the text box with the meaning of the image.
5. The text already exists in the database, associated with an image.
6. Operation is cancelled.
7. The GUI shows an error message telling the user that the text already exists.
Alternative Data Flow 4:
1. User clicks on the new image button.
2. GUI shows a file browse frame.
3. User chooses the image.
4. User fills the text box with the meaning of the image.
5. User optionally fills the other text box with the word stem if a word is bound to the image.
6. User clicks on the submit button.
7. Whole text is parsed into words.
8. Matching words’ sounds are retrieved from the database.
9. Remaining words are parsed into syllables.
10. Corresponding sound for each syllable is retrieved.
11. Sound streams are combined and vocalized.
12. User clicks the cancel button.
13. Operation is cancelled.
Functional Requirements:
REQ-1 The user must type a text of no more than 100 characters.
REQ-2 It must be possible to syllabize the text according to Turkish language rules.
REQ-3 The image should preserve its main form after being minimized.
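The duplicate checks in alternative flows 2 and 3, together with REQ-1's length limit, might be enforced at insertion time roughly as follows. The in-memory maps are illustrative stand-ins for the real database tables.

```java
import java.util.HashMap;
import java.util.Map;

public class ImageCatalog {
    // Hypothetical stand-ins for the image and text tables.
    private final Map<String, String> textByImage = new HashMap<>();
    private final Map<String, String> imageByText = new HashMap<>();

    /** Inserts an image/text pair, rejecting duplicates as in
     *  alternative flows 2 and 3. Returns false when cancelled. */
    public boolean addImage(String imagePath, String text) {
        if (textByImage.containsKey(imagePath)) return false; // image exists
        if (imageByText.containsKey(text)) return false;      // text taken
        if (text.length() > 100) return false;                // REQ-1 limit
        textByImage.put(imagePath, text);
        imageByText.put(text, imagePath);
        return true;
    }
}
```

Checking both directions (image and text) mirrors the two distinct error messages the GUI shows in the alternative flows.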
3.3. Non-Functional Requirements
3.3.1. Performance Requirements
AAC-Droid provides real-time communication to its users, so it converts text to speech in less than
two seconds.
Because the application runs on a mobile device, it must work efficiently enough that it does not slow
down other processes or applications executing on the same device.
3.3.2. Design Constrains
The Java programming language is used to implement the system, which is developed for Android OS;
no other platforms are considered during development. The Eclipse integrated development environment is
used because it is compatible with the Android SDK and well suited to Java programming. The system is
tested on mobile devices running Android OS. The application is not web-based or network-attached.
No pirated software is used in the development process, and Java coding conventions are followed.
4. DATA MODEL AND DESCRIPTION
This section describes the information domain for the software.
4.1. Data Descriptions
In this project, there are five types of data objects in the system, namely word objects, syllable
objects, image objects, sound objects and the TTSEngine object.
4.1.1. Data Objects
Word object: Obtained from the parsed input. The system searches for the word in the database.
It also contains its syllables and the corresponding sound objects.
Syllable object: The word is syllabized and syllable objects are formed. Each contains
the object of its sound.
Image object: Images are obtained as input. The system converts the images into text that
will be parsed. This object holds the words related to the image.
Sound object: All text is matched with sound objects. In addition to the stream of the
sound, each contains a unique ID.
TTSEngine object: It parses the input and generates word objects. It then parses the
word objects to form syllable objects. In addition, this object finds the appropriate versions of
the sound objects according to the input.
4.1.2. Relationships
This section describes the relationships between the data objects described in the previous
section.
Word – Syllable: Each word may be associated with one or more syllables. In the same way,
a syllable may be associated with multiple words.
Word – Sound: Each word is associated with more than one sound object. However, a sound
object represents only one word.
Syllable – Sound: Each sound represents only one syllable, but a syllable may own more than
one sound because of different versions of pronunciations.
Image – Word: Each image represents a set of words.
TTSEngine – Word & Syllable: Since there is only one TTSEngine, all words and
syllables are associated with it.
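The objects and relationships above can be summarized as a class sketch. All field names here are illustrative rather than taken from the actual design.

```java
import java.util.List;

// Minimal sketch of the five data objects and their relationships.
class Sound {
    int id;              // unique ID
    byte[] stream;       // raw sound stream
}

class Syllable {
    String text;
    List<Sound> sounds;  // several pronunciation variants per syllable
}

class Word {
    String text;
    List<Syllable> syllables; // one or more syllables per word
    List<Sound> sounds;       // whole-word recordings, if stored
}

class Image {
    String path;
    List<Word> words;    // each image represents a set of words
}

class TTSEngine {
    List<Word> words;        // the single engine owns all words
    List<Syllable> syllables;
}
```

The list-valued fields encode the one-to-many relationships listed in Section 4.1.2.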
4.1.3. Complete Data Model
Diagram 6. Complete Data Model
4.1.4. Data Dictionary
Stream: It is multimedia that is constantly received by and presented to an end-user while being
delivered by a streaming provider. The name refers to the delivery method of the medium rather than
to the medium itself.
NLP: Natural Language Processing (also known as Computational Linguistics) aims to bring
together technological and scientific studies of human languages, in particular, their processing by
machine, with the dual goals of advancing language technology and understanding the computational
basis of the human language capacity [5].
TTS: Text-to-speech is a type of speech synthesis application that is used to create a spoken sound
version of the text in a computer document, such as a help file or a Web page. TTS can enable the
reading of computer display information for the visually challenged person, or may simply be used to
augment the reading of a text message [6].
5. BEHAVIORAL MODEL AND DESCRIPTION
5.1. Description for Software Behavior
The program has two main parts: text typing and image selection. After the program starts, the
text typing screen is shown by default. In this state, users can write any text they want to
listen to. Two functions can be chosen in this interface: submitting the text to the TTS engine, and
switching to the image selection interface. If the text is submitted, it is processed by the TTS engine,
which generates the sound by interacting with the database.
In the image selection state, users can select images that refer to the situation they want to
describe; the program then adds the text bound to each image to the text box. If the text is
submitted, the same procedure as in the first state is applied. The user can also switch back to the first
state using the same button. In addition, new images can be added and related words assigned to them in
this state: after clicking the "add image" button, the user browses to the image and types the text.
The image and text are saved in the database and can be used to form new sentences more quickly.
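The two-state behavior described above can be sketched as a small state machine. The state and event names are illustrative, not taken from the actual design.

```java
// Minimal sketch of the two-screen behavior of Section 5.1.
enum State { TEXT_TYPING, IMAGE_SELECTION }
enum Event { SWITCH_SCREEN, SUBMIT, ADD_IMAGE }

class AppStateMachine {
    State state = State.TEXT_TYPING; // default screen on startup

    State handle(Event e) {
        switch (e) {
            case SWITCH_SCREEN:      // the same button toggles both ways
                state = (state == State.TEXT_TYPING)
                        ? State.IMAGE_SELECTION : State.TEXT_TYPING;
                break;
            case SUBMIT:             // TTS runs; the screen is unchanged
            case ADD_IMAGE:          // only meaningful in image selection
                break;
        }
        return state;
    }
}
```

Submitting text or adding an image leaves the current screen unchanged; only the switch button moves between the two states, matching the transition diagram.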
5.2. State Transition Diagram
Diagram 7. State Transition Diagram
6. PLANNING
6.1. Team Structure
Our team consists of three people:
Gökhan OĞUZ – Head, Researcher, Database Developer, Software Tester, Public Relations
Coşkun ŞAHİN – Researcher, Database Developer, GUI Developer, Android OS Specialist
Evren PALA – Researcher, Language Specialist, GUI Developer, Software Tester
Our project mainly consists of three parts. The first and main part is building a TTS engine. Since it
requires the most work and energy, we need to collaborate on it, and all team members should
participate in the research in this field. When the technical part of the TTS engine is over, the remaining
work will be divided: Evren Pala is responsible for the correct usage of the words, while Gökhan Oğuz and
Coşkun Şahin are responsible for the database interactions of the TTS engine. The second part is generating
phrases or sentences from images; Coşkun Şahin and Gökhan Oğuz are mainly responsible for building the
phrase library. The third part is the GUI, for which Evren Pala and Coşkun Şahin are responsible.
Although we have a team leader, decisions are made collaboratively, as they should be. We
meet with our teaching assistant once a week (more often if needed) to decide on the details of weekly
progress and to make a schedule for working together.
6.2. Estimation (Basic schedule)
Diagram 8. Main Parts of the Project
Diagram 8. Gantt Chart
6.3. Process Model
In our project, we will use the waterfall process model, in which specification and development
proceed in separate, distinct phases.
It is the most widely used model in the field of software development: it is a linear model and
simple to follow; documentation is produced at every stage, which makes the design procedure easy to
understand; and testing after each coding step helps eliminate possible errors at each stage. Because of
these advantages, we decided to use this model in our project.
Diagram 9. Waterfall Model
7. CONCLUSION
This report has been prepared to show the requirement details of the AAC project from several aspects.
First, a brief description of AAC-Droid is introduced. Then, market and technology research is carried out
and the results are presented. In the body, the requirement details of the project are described and the
behavioral model is introduced. Scheduling and the timeline have also been specified in this
document. This specification will hopefully constitute the basis for the design, development and testing
of the project.