ORIGINAL ARTICLE

Comparing the impact of the OO-DFD and the Use Case methods for modeling functional requirements on comprehension and quality of models: a controlled experiment

Michal Dahan • Peretz Shoval • Arnon Sturm

Received: 2 April 2011 / Accepted: 4 May 2012 / Published online: 31 May 2012

© Springer-Verlag London Limited 2012

Abstract Users' requirements of an information system are modeled in the analysis phase of the development process. The requirements can be modeled with various modeling methods. In this study, we compare two alternative methods for modeling the functional requirements: one is the UML Use Case (UC) model; the other is the OO-DFD transaction (Object-Oriented DFD is a variant of DFD that includes data classes rather than "traditional" data stores). Each of these modeling methods consists of diagrams accompanied by narrative, semi-structured descriptions explaining their details. We conducted a controlled experiment that compared the comprehension of the two models (i.e., the diagrams and their descriptions) of a certain system and the quality of models created for a certain system with each of the two modeling methods. The main results of the experiment are that models created with the UC method are of better quality than models created with the OO-DFD transaction method, because the former are simpler and less detailed; creating highly detailed models is error-prone. Interestingly, in spite of the difference in the level of detail and structure, the experiment reveals no significant difference in comprehension of models of the two methods. The results call for improvement of the modeling methods in a way that considers the advantages of each of them, and thus we propose an improved method sketch that we call Enhanced Use Case, which will be evaluated in future work.

Keywords Functional analysis · IS development · Modeling methods · User requirements · Use Case · UML · FOOM · OO-DFD

1 Introduction

Eliciting and modeling users' requirements for an information system is a very important task, for which many methods have been introduced over the years. A few examples are discussed below.

i* [37] is a framework that attempts to deal with activities that precede the formulation of the initial requirements. These "early phase" requirements engineering activities include those that consider how the intended system would meet organizational goals, why the system is needed, what alternatives might exist, what the implications of the alternatives are for various stakeholders, and how the stakeholders' interests and concerns might be addressed. The emphasis in i* is on understanding the "whys" that underlie system requirements, rather than on the precise and detailed specification of "what" the system should do, which is modeled with other modeling methods.

Business Process Modeling Notation (BPMN) [35] provides a notation that is readily understandable by all business users, from business analysts who create the initial drafts of the processes, to technical developers who are responsible for implementing the technology that will perform those processes, and finally, to the business people who will manage and monitor those processes. BPMN defines a Business Process Diagram (BPD), which is based on a flowcharting technique tailored for creating graphical models of business process operations. Business process modeling is used to communicate a wide variety of information to different audiences and provides much more information on the business process than just modeling the functional requirements of the users from the system.

M. Dahan · P. Shoval (corresponding author) · A. Sturm
Department of Information Systems Engineering, Ben-Gurion University, 84105 Beer-Sheva, Israel
e-mail: [email protected] (P. Shoval); [email protected] (M. Dahan); [email protected] (A. Sturm)

Requirements Eng (2014) 19:27–43. DOI 10.1007/s00766-012-0155-2

The early systems analysis and design methodologies, which emerged in the late 1970s, utilized Data Flow Diagrams (DFD) as their main modeling technique. The most notable examples of such methodologies are DeMarco [10] and Gane and Sarson [12]. One of the biggest problems with the "traditional" methodologies was the transition from the analysis phase to the design phase, which was not smooth and caused many problems. Specifically, it was not easy to convert the DFDs, products of the analysis phase, to Structure Charts, which were then a common method for designing the application programs. In order to deal with this and other problems, Shoval [30] developed the ADISSA methodology, which enables a smooth transition from the analysis phase and its DFDs to the design phase. This was done by defining the concept of DFD transactions. A DFD transaction is an independent process performed by the information system to support a user's task. According to ADISSA, all transactions of a system are derived from the DFDs: each transaction consists of one or more elementary functions that are chained by data flows, and of external entities and data stores that are also connected by data flows to these functions. The process logic of each transaction is defined and described using pseudo-code. Later on in the design phase, the transaction descriptions are made more detailed and other components of the system are designed, notably the application programs, the user interface (menu trees), the input and output screens and reports, and the relational database schema.

With the emergence of Object-Oriented (OO) development methodologies, Shoval enhanced ADISSA and created FOOM—Functional and Object-Oriented Methodology, which combines the functional and OO approaches [31, 32]. According to FOOM, two main models are created in the analysis phase: a conceptual data model and a functional model. The conceptual data model is an initial class diagram that consists of the data classes, attributes, and various relationships between classes (but with no methods yet). The functional model consists of OO-DFDs, which are similar to traditional DFDs, but instead of data stores, they include data classes, which are taken from the initial class diagram. As in ADISSA, the OO-DFDs are then decomposed into transaction diagrams, and their process logic is described. Later on, the transaction descriptions are decomposed into class methods, which are attached to the proper classes in the class diagram.

In the early 1990s, many OO analysis and design methodologies emerged, replacing the traditional, functional-oriented methodologies. Some of the most famous were OOA/OOD [7], OMT [29], and the Booch method [4]. Rumbaugh's and Booch's methodologies became very popular, and in 1995, they joined forces and created the Unified Method (UM). Later on, Jacobson joined the two with his comprehensive method ObjectOry [17], thus creating the Unified Modeling Language (UML) [5]. Since the mid-1990s, UML has become a "de facto standard" for modeling systems based on the OO approach. UML consists of over 15 types of diagrammatic modeling methods. One of the most popular, which is used for modeling the users' functional requirements, is the Use Case (UC) model.

In this study, we compare the UC model of UML and the OO-DFD transaction diagrams of FOOM. The two models have many things in common. In essence, a UC and a transaction have the same objective: to define and describe an independent task performed by the system to support a user. The similarity between the two modeling methods was also noted by Ramsin and Paige [27], who compared various object-oriented development methodologies. Yet, the two methods differ in the variety of components included in the respective diagrams, and in the structure of the descriptions accompanying the diagrams (as will be elaborated later on). The main objective of this study is to compare these alternative functional requirements modeling methods and to analyze the strengths and weaknesses of each.

The importance of comparing alternative modeling methods stems also from the evolution of the model-driven development approach, which claims that models (rather than the code) serve as the system's backbone. Accordingly, it is important to compare and analyze the differences between "competing" modeling methods. Modeling methods can be similar or different from various perspectives; in this study, we compare the two modeling methods mainly from the point of view of understanding models and creating correct models. Such comparisons are important not only from an "academic" viewpoint but also from a pragmatic one, because they might guide analysts/designers regarding which modeling method to adopt and use. Furthermore, due to the insights gained regarding the strengths and weaknesses of the two modeling methods, we are able to propose how to improve each of them and also to propose a hybrid modeling method that combines the strong parts of each.

The rest of this paper is structured as follows: Sect. 2 provides background on the two modeling methods and surveys related studies on empirical evaluations of modeling methods, specifically on comparisons between the DFD and UC methods. Section 3 describes the planning of the controlled experiment, and Sect. 4 presents the results. Section 5 discusses the evaluation of the results, presents a new proposal for a combined modeling method, and discusses threats to the validity of the experiment. Finally, in Sect. 6, we discuss the experiment conclusions and set the plans for future research.

2 Background

2.1 Two functional modeling methods: OO-DFD and Use Case

2.1.1 OO-DFD

An OO-DFD, according to FOOM, is a DFD adapted to the OO world: instead of data stores, it includes data classes—classes that are already defined in the initial class diagram of the application, which is created prior to that. Then, using certain rules, the OO-DFDs are decomposed into transaction diagrams. Each transaction consists of one or more elementary functions that are chained by data flows, and of the external entities and data classes that are also connected with data flows to those functions. After this decomposition, the process logic of each transaction is defined and described. At first, a top-level (i.e., general) description is created. Later on, the transaction description is made more detailed, taking into consideration the inputs and outputs/reports involved in each transaction, and the reading and writing of data from/to the data classes. In a later stage, each transaction description is decomposed into methods that are attached to the proper classes. This paper, however, deals only with the functional modeling aspect of the methodology, as expressed by transaction diagrams and their top-level descriptions.

Figure 1 shows an example of a transaction diagram that has been extracted from an OO-DFD that models the IFIP conference system [22], while Table 1 shows its top-level description. Note that the description refers to the various components included in the transaction diagram.
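To make the structure of a transaction concrete, the following sketch encodes the transaction of Fig. 1 as a small graph of elementary functions, user entities, data classes, and directed data flows, and checks the well-formedness rule stated above: every external entity and data class must be linked by a data flow to one of the transaction's functions. The representation is ours (Python, with invented names); FOOM itself prescribes diagrams and descriptions, not code.

```python
from dataclasses import dataclass, field

# Hypothetical, minimal encoding of an OO-DFD transaction;
# names and structure are illustrative, not FOOM's actual notation.

@dataclass
class Transaction:
    name: str
    functions: set = field(default_factory=set)      # elementary functions
    entities: set = field(default_factory=set)       # user/external entities
    data_classes: set = field(default_factory=set)   # classes from the initial class diagram
    flows: list = field(default_factory=list)        # directed data flows: (source, target, data)

def is_well_formed(t: Transaction) -> bool:
    """A transaction must contain at least one function, and every
    entity or data class must be linked to a function by a data flow."""
    if not t.functions:
        return False
    linked = set()
    for src, dst, _ in t.flows:
        for node in (src, dst):
            if node in t.entities | t.data_classes:
                other = dst if node == src else src
                if other in t.functions:
                    linked.add(node)
    return linked == t.entities | t.data_classes

# The transaction of Fig. 1, encoded with this sketch:
t = Transaction(
    name="Assign reviewers to a submitted article",
    functions={"F1", "F2"},
    entities={"U1", "U2"},
    data_classes={"C1", "C3", "C6"},
    flows=[
        ("C1", "F1", "non-assigned articles"),
        ("C3", "F1", "reviewers' details"),
        ("F1", "U1", "details of reviewers and non-assigned articles"),
        ("F1", "F2", "details of reviewers and non-assigned articles"),
        ("U1", "F2", "chosen article and assigned reviewers"),
        ("F2", "C1", "status = 'sent for review'"),
        ("F2", "C6", "assignment details"),
        ("F2", "U2", "article's details and last date to submit review report"),
    ],
)
print(is_well_formed(t))  # prints True
```

A structure of this kind could also drive the generation of a top-level description skeleton like the one in Table 1, since each flow maps to a Read/Write/Input/Output line.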

2.1.2 Use Case

The Use Case (UC) method was introduced by Jacobson [16] and then became part of UML. Nowadays, UC is one of the most popular techniques for describing the functional requirements of a system in the OO world. The main components of a UC diagram are the UC bubbles and the actors. A UC bubble represents a process of the system as seen from an external point of view, while a UC actor represents a role of an entity that interacts with the system, that is, that operates the UC. A common way to define UCs is to start by identifying the actors of the system and then think of all the ways those actors will use the system [1]. Actors are connected with the UC bubbles that they interact with, but the connections are (usually) undirected. Sometimes, a UC diagram contains a UC bubble that is connected to other UC bubbles; the main types of connections are "include" and "extend." (More details about these types of UCs are beyond the scope of this paper.)

Many ways to describe UCs have been published over the years, for example, [8, 9, 21]. Figure 2 shows a UC diagram that is assumed to portray the same transaction as that of Fig. 1, while Table 2 shows the description of this UC.

2.1.3 Differences between the two methods

As said, the above two modeling methods have the same purpose: to describe the functional requirements of an information system. As can be seen in the above examples, the methods differ both in the diagrams and in the descriptions. The major difference is in the details included in each diagram. A process in the UC diagram is just a bubble (sometimes, as mentioned, a few bubbles connected to each other with "include"/"extend" connections) that is connected to the role (actor) or roles that interact with the system in order to operate the process. In contrast, an OO-DFD transaction diagram usually consists of a series of functions that are chained to each other with directed data flows, and of data classes and user/external entities that are connected to those functions with data flows. The data flows carry data elements from a source component to a destination component. Note that the data classes included in the OO-DFD transaction diagrams are taken from the initial class diagram. Another difference relates to the external entities: in an OO-DFD transaction, the external entities, also termed user entities, are the sources of input data or the destinations of information produced by the transaction—but they are not necessarily the actual operators of the transaction. In contrast, actors in the UC diagrams signify the operators of the UC who interact with it. The difference between the descriptions follows from the differences between the diagrams: a UC description includes many details that are not included in the diagram; this is actually the only place where the "meaning" of the UC is defined. In contrast, the OO-DFD transaction description repeats details that can be seen in the diagram, that is, there is some redundancy. Another difference is in the description of the process logic: in the UC description (where it appears under the "Main success scenario" and "Extensions" compartments), the process logic is written in a natural-language-like fashion, while in the OO-DFD transaction, the process logic (appearing under the "Top-level description" compartment) is described in a more pseudo-code-like style, referring to the various components of the OO-DFD transaction and using standard structured programming patterns (not shown in the above example). In essence, the UC diagram includes very little information, while the UC description includes the details; on the other hand, an OO-DFD transaction diagram includes many details, while the transaction description includes redundant information that is presented in a structured format.

2.2 Related studies

2.2.1 Empirical evaluations of modeling methods

Over the years, many studies have been published on comparisons between different modeling methods.

[Fig. 1 Transaction diagram extracted from an OO-DFD. It shows function 1, "Display details of reviewers and non-assigned articles," and function 2, "Assign reviewers to a chosen article," connected by data flows to the user entities U1 (PC Chair) and U2 (Reviewer) and to the data classes C1 (Article), C3 (Reviewer), and C6 (Article review).]

Table 1 Top-level description of the transaction

Transaction name: Assign reviewers to a submitted article
Transaction type: User transaction {a user transaction is an interactive transaction, in which a user/operator interacts with the system}
Top-level description:
  Begin transaction
    Read from class C1 (Article): non-assigned articles
    Read from class C3 (Reviewer): reviewers' details
    Execute function 1 - display details of reviewers and non-assigned articles
    Output to U1 (PC Chair): details of reviewers and non-assigned articles
    Move to function 2: details of reviewers and non-assigned articles
    Input from user U1 (PC Chair): chosen article and assigned reviewers
    Execute function 2 - assign reviewers to a chosen article
    Write to class C1 (Article): Status = "sent for review"
    Write to class C6 (Article review): assignment details
    Output to U2 (Reviewer): article's details and last date to submit review report
  End transaction

[Fig. 2 UC diagram. It shows the UC "Assign reviewers to a submitted article" connected to the actors PC Chair and Electronic mail system.]

Table 2 Description of the UC

UC name: Assign reviewers to a submitted article
Description: The PC Chair assigns reviewers to a submitted but not yet assigned article
Actors: PC Chair, electronic mail system
Preconditions: There are submitted articles in the system that have not yet been assigned to reviewers
Postconditions: The chosen article is assigned to reviewers
Main success scenario:
  1. The system displays details of reviewers and non-assigned articles to the PC Chair
  2. The PC Chair selects and inputs to the system an article from the list and the reviewers assigned to review it
  3. The assignment details are saved in the system
  4. The system sends the article's details and the last date to submit review reports to the reviewers, by the electronic mail system
Extensions: None
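The compartments of a UC description such as Table 2 form a fixed record structure, and since the UC diagram itself carries little information, essentially all of the UC's "meaning" lives in this structured text. The sketch below encodes Table 2's UC as such a record; the class and field names are ours for illustration, not part of UML or any tool's API.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative encoding of a UC description's compartments (cf. Table 2);
# field names follow the table's compartments, not a standard schema.

@dataclass
class UseCaseDescription:
    name: str
    description: str
    actors: List[str]
    preconditions: str
    postconditions: str
    main_success_scenario: List[str]           # numbered steps, natural language
    extensions: List[str] = field(default_factory=list)

uc = UseCaseDescription(
    name="Assign reviewers to a submitted article",
    description="The PC Chair assigns reviewers to a submitted "
                "but not yet assigned article",
    actors=["PC Chair", "Electronic mail system"],
    preconditions="There are submitted articles in the system that have "
                  "not yet been assigned to reviewers",
    postconditions="The chosen article is assigned to reviewers",
    main_success_scenario=[
        "The system displays details of reviewers and non-assigned "
        "articles to the PC Chair",
        "The PC Chair selects and inputs to the system an article from "
        "the list and the reviewers assigned to review it",
        "The assignment details are saved in the system",
        "The system sends the article's details and the last date to "
        "submit review reports to the reviewers, by the electronic mail system",
    ],
)
print(len(uc.main_success_scenario))  # prints 4
```

Note the contrast with the OO-DFD transaction: here the scenario steps are free natural-language strings, whereas a transaction's top-level description references named diagram components (functions, classes, entities).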


These include comparisons between different methods and frameworks for modeling users' requirements.

Bass et al. [3] evaluated five methods for the elicitation and expression of requirements with respect to their ability to capture architecturally significant requirements. The evaluated methods included: natural language, use case, quality attribute workshop (QAW), global analysis, and O'Brien's approach. The main conclusions were that QAW is the most expressive method for specifying unambiguous requirements, and that O'Brien's approach is the only method that addresses business goals explicitly. The other methods rely on the stakeholders to translate business and mission goals into requirements.

Hadar et al. [14] present an empirical study that compares Use Case and Tropos, a goal-oriented approach. The authors' objective was to evaluate different levels of comprehension of requirements models expressed in both methods, as well as to estimate the time required to perform simple analysis tasks. Preliminary results show that Tropos models seem to be more comprehensible to novice requirements analysts, although more time-consuming than Use Case models.

Topi and Ramesh [34] surveyed many studies published between 1978 and 2001 that employed laboratory experiments to evaluate the usability of data models/methods. They found that the most frequent independent variable in those studies was the data model; the next category of independent variables consists of user characteristics, for example, experience, education, and intellectual ability; other independent variables are task characteristics, for example, comprehension and task complexity. The dependent variables are mostly model correctness, time used to create the model, declarative knowledge (understanding of the notation), and attitude. In most cases, the correctness of a model has been measured according to the degree to which it corresponds to a predefined solution. Attitude includes mainly preference to use a certain model and perceived ease of use.

There are studies that take an analyst/designer/modeler perspective; these are mainly concerned with measuring performance—usually model correctness. Other studies take a user perspective and are mainly concerned with measuring the comprehensibility of models and users' preferences among models. For example, Peleg and Dori [26] compared two methodologies: OPM/T, a variant of OPM for real-time systems, and OMT/T, a similar variant of OMT. The study compared both data modeling and process specifications from the points of view of comprehension and quality. The authors found that in most cases, OPM specifications were more comprehensible than OMT specifications. The authors also found that OPM analysts produced significantly better diagrams than OMT analysts. Kabeli and Shoval [19] compared OPM and FOOM in two controlled experiments from the points of view of users and analysts. Their main results were that FOOM analysis specifications are more comprehensible to users than those based on OPM, whereas in terms of the data model, there is no significant difference. From the analysts' point of view, they compared quality, that is, the correctness of specifications. The main result of these experiments was that analysts create more correct specifications with the FOOM methodology than with OPM.

2.2.2 Empirical comparisons of DFD and UC modeling methods

As we have seen, many studies have evaluated and compared different modeling methods that may be used for the same purpose. Therefore, it is only natural to compare the DFD and UC modeling methods, since they have the same purpose: to define the users' functional requirements. In this section, we review studies that have compared these two methods.

Millet and Nelson [23] compared DFDs and UCs using questionnaires. During 2003–2006, a systems analysis course was offered 12 times, each time to different students but all times by the same lecturers. In each semester, the course was given to two different groups in parallel, and the researchers split the students into two groups: students in one group had to perform a DFD task and then a UC task, and students in the other group had to perform a UC task and then a DFD task. All tasks included only diagrams, without descriptions. The first task of the first group was to analyze, with several DFDs, a simple "work order" system including only 3 processes and 3 external entities, while using a certain CASE tool; the participants had 2 days to complete this task. Afterward, the participants learned the UC method and were given the second task—to analyze the same system with UCs using another CASE tool; the participants had 2 days to complete this task. For the second group, the order of learning and tasks was the opposite. After completing the two tasks, each participant was asked to answer a questionnaire with five claims about each of the methods, using a 1–7 point scale. For example, one claim was: "This methodology is easy to understand." The results showed that the participants found the DFD method better in its effect on helping systems analysts communicate requirements to programmers, but they did not find differences between the methods on any of the other claims.

This research has several drawbacks. One is the use of a questionnaire only; that is, the results are based solely on the participants' opinions regarding the five claims. Although the tasks of the participants were to analyze (though very small) systems, the correctness or quality of their solutions was neither examined nor compared. Another problem is that the students had to use two different CASE tools; there is a possibility that the learning process and usability of these tools had some impact on the students' opinions about the modeling methods. Note also that the students had 2 days to complete each task, meaning that there was not enough control over the process of solving the tasks by the students. Finally, this experiment dealt only with comprehension of diagrams.

Jeyaraj and Sauter [18] also compared the DFD and UC modeling methods. In their experiment, the participants were 4 classes of business administration students, who had not learned the examined methods, and 5 classes of MIS students, who had learned and practiced the examined methods. The business administration students were considered novice users, while the MIS students were considered experienced users. The differences between the methods were examined from the point of view of comprehension of diagrams. A student registration system was modeled as a DFD diagram and a UC diagram. (Each diagram included several DFDs/UCs, but there is no mention of their descriptions.) Each participant was asked to explain in free text what he/she understood from each diagram. The participants were given a week to complete the first task and then another week to complete the second task. The participants (of either type) were split into two groups, one receiving the DFD task prior to the UC task and the other receiving the UC task prior to the DFD task. The overall conclusion of this research was that for experienced users, the DFD method provides a better tool to describe functional requirements, while for novice users, no difference was found between the methods.

This research, too, has several drawbacks. First, the participants were asked to provide free-text explanations of what is seen in the given diagrams, which were then coded into numbers; this is not a precise way to measure comprehension. Second, the students had a week to complete each task, so here too there was not enough control over the process. Third, this experiment, too, dealt only with comprehension of diagrams.

In summary, the above studies on the comparison between DFDs and UCs suffer from many limitations. In particular, they deal only with comprehension of diagrams. In our research, we compare the two alternative modeling methods from the points of view of both comprehension and quality of models created by analysts. In our comparison, we use OO-DFDs, which include data classes rather than the data stores of traditional DFDs; this enables us to compare two methods that may be used equivalently for defining functional requirements in an object-oriented development process. Note that we do not compare UCs with OO-DFDs, but rather with the transactions that are derivable from them, because it is actually the transactions that are equivalent to UCs, while DFDs have other components and characteristics that do not exist in UCs (for example, DFDs express hierarchical decomposition of functionality, which has no equivalent in UCs). In the next sections, we describe the experiment planning and analyze the results, following the guidelines for reporting experimental research as specified by Wohlin et al. [36].

3 Experiment planning

3.1 Goals

As said, our objective is to compare the two alternative functional modeling methods: UCs and their descriptions versus OO-DFD transactions and their descriptions. The comparison between the methods is performed based on six goals:

G1: Comprehension of models—we ask whether there is a difference in the understanding of the models (i.e., diagrams and descriptions) by users.

G2: Quality of created models—we ask whether there is a difference in the quality of models created by modelers.

G3: Time taken to understand—we ask whether there is a difference in the time it takes to understand given models.

G4: Time taken to model—we ask whether there is a difference in the time it takes to create models.

G5: Perceived comprehension of modeling methods—we ask whether there is a difference in the comprehension of the modeling methods as perceived by users.

G6: Perceived quality of modeling methods—we ask whether there is a difference in the quality of the modeling methods as perceived by modelers.

We compared the two methods in a controlled experi-

ment in quasi-laboratory setting, where we address the

above goals. The experiment involved two tasks: the first
task was to understand diagrams and descriptions
representing the functional requirements of a certain system;
the second task was to create diagrams and descriptions
from a requirements document of a certain (other) system.

The participants were divided into two homogeneous

groups; each participant performed the
two tasks using one of the two methods: the first task
involved understanding given models, and the second,
creating models. Thus, we designed an experiment of one factor

with two treatments.

32 Requirements Eng (2014) 19:27–43


3.2 Hypotheses and the research models

3.2.1 Comprehension of models

The research model for the comprehension task is pre-

sented in Fig. 3.

Following the research goals, the conjectures for this

experiment are as follows:

1. There are no differences between the two methods

regarding the level of comprehension (i.e., the under-

standing) of the diagrams and their descriptions.

2. The UC method is superior to the OO-DFD transaction

method regarding the time it takes to understand the

diagrams and their descriptions.

3. The UC method is superior to the OO-DFD transaction

method regarding the perceived comprehensibility of

the model.

Theoretical rationale for the conjectures: According to a

framework suggested by Gemino and Wand [13] for

evaluating modeling techniques, both grammar-based and

cognitive-based approaches should be considered comple-

mentary in order to evaluate two (or more) different

modeling techniques. Grammar-based approaches identify

the differences and generate predictions regarding gram-

mar efficacy, while cognitive-based approaches suggest

ways to observe the effects of grammar differences and test

the predictions. We follow this framework and rationalize

our hypotheses by emphasizing the differences between the

two methods and then considering cognitive research in
order to predict the effect of a grammar's expressiveness

(the differences between the methods) on the effectiveness

of its use by individuals (i.e., understanding the model

presented to the participants). The grammar- and cognitive-

based rationales for our hypotheses are presented below.

An OO-DFD transaction diagram contains many details;

it usually includes several functions, data classes, external

entities, and directed data flows between them. In contrast,

a UC diagram contains only UC bubbles, actors, and

connections between them. Cognitive theories regarding

the comprehensibility and quality of diagrams can help

predict which kinds of diagrams are more or less
comprehensible. Several studies were done on comprehension

and quality of diagrams in conceptual modeling based on

cognitive theories, for example, [15, 20, 28]. One important

framework based on cognitive theory is COGEVAL pre-

sented by Rockwell and Bajaj [28]. The framework states

that a model that has less relationship information is less
comprehensible than a model that has more relationship

information. In our case, a UC diagram has less relation-

ship information compared to an OO-DFD transaction

diagram. An experimental study for evaluating the influ-

ence of the number of concepts (NOC) in a model on the

readability of the model [2] also supports this assumption;

the experiment’s results showed that a higher NOC will

lead to a higher percentage of questions about the domain

that the subject can answer (we refer to this as higher

comprehension). Moreover, one of the guidelines for
creating good modeling diagrams is that a diagram should not
exceed perceptual and cognitive limits, in the sense that it
should not show too much information on a single diagram,

which may result in ‘‘absurdly complex diagrams’’ that are

a barrier rather than an aid to communication [24]. In our

context, the OO-DFD transaction diagram contains a lot of

information that can make it more complex for under-

standing than the UC diagram. This is also supported by the

COGEVAL framework according to which a model with

more elements will be more complicated to understand

than a model with fewer elements, due to the limited

capacity of the short-term memory. An OO-DFD transac-

tion diagram has more elements than a UC diagram; hence,

according to this proposition, it is expected to be less well understood.

These arguments also hold for the descriptions of
the diagrams: an OO-DFD transaction description is full of

details; it mentions the various components of the OO-DFD

transaction and specifies the process logic using pseudo-

code. In contrast, a UC description is less structured—more

similar to natural language. So, on the one hand, an

OO-DFD transaction description includes more informa-

tion than a UC description, but on the other hand, it might

be complicated to ‘‘process.’’ Hence, the dilemma is which

of the conflicting ‘‘forces’’ is stronger: simplicity but less

information versus complexity but more information? A more
''natural'' description or a more structured one?

Recall that in the above two surveyed studies that dealt

with comprehension, the researchers concluded that DFDs

are to a certain degree superior to UCs. We speculate that

due to the above contrasts and based on cognitive theories,

there will be no difference between the two modeling

methods in understanding the diagrams and their descrip-

tions. Our speculation is further supported by Gemino and

Wand [13], who claim that ''while one modeling grammar
may be highly expressive and hence superior in a grammar-based
evaluation (in our case, OO-DFD transactions are
grammatically superior to use cases), a representation
created with that grammar might be overly complicated,
leading to difficulties in developing domain understanding
by the person viewing the model.''

Fig. 3 The comprehension research model

As for the time it might take to understand the models,

we believe that due to the simplicity of the UC diagram and

the more natural language used in the description, it will

take less time to understand UC models. Bajaj [2] also

shows that a higher NOC will lead to more time to answer

questions regarding schemas. This finding supports our

speculation that it will take less time to answer compre-

hension questions in the UC model. For the same reasons,

we speculate that users will perceive the UC method as

better than the alternative.

To examine the various conjectures, we formulate the

hypotheses as appears in Table 3.

3.2.2 Creation of models

The research model for the model creation task is presented

in Fig. 4.

Following the research goals, the conjectures for this

experiment are as follows:

1. The UC method is superior to the OO-DFD transaction

method regarding the quality of the diagrams and

descriptions created by analysts.

2. The UC method is superior to the OO-DFD transaction

method regarding the time it takes to create the

diagrams and descriptions.

3. The UC method is superior to the OO-DFD transaction

method regarding the perceived quality of the created

model.

Theoretical rationale for the conjectures: Considering the
grammatical differences between the two methods in light of the
cognitive theories mentioned above: the COGEVAL
framework states that when a modeling method requires a greater
number of simultaneous items to create a diagram, the
resulting diagram is of lower quality than one created with a
method that requires fewer items. An OO-DFD transaction

diagram includes more types of components compared to a

UC diagram, and so does the OO-DFD transaction descrip-

tion compared to a UC description. Moreover, a UC

description is simpler to create because it is mostly in natural

language, less detailed and less structured, while an

OO-DFD transaction description includes all the details/

components of the respective diagram, and the process logic

is expressed in pseudo-code. This implies that with UCs,

there are fewer ‘‘chances’’ to commit errors while writing the

descriptions. Hence, we speculate that the diagrams and

descriptions created with the UC method will be of better

quality than the alternative. Our speculation is also supported

by Bajaj [2], who claimed that increasing the number of

concepts (NOC) in a model makes it harder to create model

schemas from an analyst's point of view. Hence, OO-DFD
transaction diagrams and descriptions, which have more

concepts than UC diagrams and descriptions, will be harder

to create and thus will have more errors. For the same reason,

we assume that it will take less time to create UC models and

that analysts will perceive the UC method as better for

defining user requirements.

To examine the various conjectures, we formulate the

hypotheses as appears in Table 4.

3.3 The independent variables

The independent variable is the modeling method: UC or

OO-DFD transaction.

3.4 The dependent variables

a. In the comprehension experiment, the dependent

variables are as follows:

1. Comprehension of diagrams and their descriptions.

This variable is measured by a comprehension

questionnaire consisting of true/false/can't tell statements
about information included in the diagrams and
their descriptions. The same questions were used to
measure the comprehension of the two models.

Table 3 The comprehension hypotheses

Aspect | H0 | H1
Comprehension of models | H01: Comp (UC) = Comp (DFD) | H11: Comp (UC) ≠ Comp (DFD)
Time taken to understand models | H03: TComp (UC) = TComp (DFD) | H13: TComp (UC) ≠ TComp (DFD)
Perceived comprehension of modeling methods | H05: PComp (UC) = PComp (DFD) | H15: PComp (UC) ≠ PComp (DFD)

Fig. 4 The quality of model creation research model

2. Time taken to comprehend. This variable is measured

by the time (in minutes) taken by participants to

complete the comprehension task.

3. Perceived comprehensibility of models, that is, the

participants’ opinion regarding the ease of understand-

ing the diagrams and descriptions. This variable is

measured using a 7-point scale.

b. In the modeling experiment, the dependent variables

are as follows:

1. Quality of the diagrams and their descriptions. The

participants create models based on a requirements

document of a certain (other) system. The quality of a

created model is the degree to which the diagrams and

descriptions describe correctly and accurately the

requirements of the system. Quality is determined by

graders who were assisted with an expert solution.

2. Time taken to create models. This variable is measured

by the time (in minutes) taken by participants to

complete the model creation task.

3. Perceived quality of modeling methods, that is, the

participants’ opinion regarding how good is the

modeling method for requirements specification. This

variable too is measured using a 7-point scale.
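To make the comprehension measure concrete, the scoring of the true/false/can't tell questionnaire can be sketched as follows; the answer key and responses below are hypothetical, not those used in the experiment:

```python
# Illustrative sketch: scoring a true/false/can't-tell comprehension
# questionnaire. The answer key below is hypothetical.
ANSWER_KEY = ["true", "false", "true", "cant_tell"]

def comprehension_grade(answers):
    """Return the fraction of statements answered correctly."""
    correct = sum(a == k for a, k in zip(answers, ANSWER_KEY))
    return correct / len(ANSWER_KEY)

# One hypothetical participant's answers: 3 of the 4 match the key
grade = comprehension_grade(["true", "false", "cant_tell", "cant_tell"])  # 0.75
```

In the experiment itself, the grade is simply the number of the 22 statements answered correctly.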

3.5 The controlled variables

The controlled variables in the two experiments are the

participants and the tasks. The participants are a homogeneous
group of third-year students of software engineering who

studied the same courses. They were split randomly into the

two treatment groups, where members in each group used

only one of the two modeling methods. This way we avoided

possible biases that might have affected this variable (e.g.,

age, sex, and experience). Note that this kind of assignment

of participants to treatment groups is often used in experi-

mental evaluation of modeling techniques.

The control over the tasks variable is achieved by giving

the participants in the two groups the same tasks; only the

modeling methods they used were different: in the com-

prehension task, all participants had to answer exactly the

same questionnaire while given different diagrams and

descriptions of a certain information system; in the model

creation task, all participants received the same require-

ments document of a certain (other) system and had to

create the models using one of the two modeling methods

(the same method that they used in the comprehension

task).

3.5.1 The subjects

Fifty-three students participated in the experiment that took

the form of a ''bonus test'' at the end of the semester in
which they studied the two modeling methods in two

mandatory courses that were given in parallel: in the

‘‘Information Systems Analysis & Design’’ course, they

studied OO-DFDs and transactions as part of the FOOM

methodology; in the ‘‘Object-Oriented Analysis & Design’’

course, they studied the UC method as part of UML. Each

method was taught by an experienced lecturer.

Before the experiment began, the subjects were ran-

domly divided into two treatment groups as shown in

Table 5. In these kinds of empirical experiments, there are

many factors that may influence the results, such as the

personal characteristics of the subjects and their experi-

ence. As in most experiments of this form, neutralizing

such factors is done by randomization. Note also that third-

year students have almost no experience, so this variable
could have had no effect in this study.

As said, subjects in each group were assigned to work

with one modeling method: subjects in group G1 per-

formed the two tasks using the OO-DFD transaction

method, while subjects in group G2 performed the two

tasks using the UC method. The subjects were motivated

by adding bonus points to their final grades in the course in

which they studied the method that they were assigned to in

the experiment. The number of bonus points given to each

student was determined according to his/her performance

in the two tasks. (The procedure was approved by the

Ethics committee).

Table 4 The model creation hypotheses

Aspect | H0 | H1
Quality of models | H02: Create (UC) = Create (DFD) | H12: Create (UC) ≠ Create (DFD)
Time taken to create the models | H04: TCreate (UC) = TCreate (DFD) | H14: TCreate (UC) ≠ TCreate (DFD)
Perceived quality of models | H06: PCreate (UC) = PCreate (DFD) | H16: PCreate (UC) ≠ PCreate (DFD)

Table 5 Assignment of subjects into treatment groups

Group | Size | Tasks 1 (comprehension of models) and 2 (model creation)
G1 | 28 | OO-DFD transaction diagrams and descriptions
G2 | 25 | Use Case diagrams and descriptions


3.5.2 The tasks

In the first task—comprehension of models—each subject

received diagrams and their respective descriptions of an

information system, according to his/her group. The system

for this task was Greeting Cards Ordering (similar to the

one used in Dori and Goodman [11] and in Kabeli and

Shoval [19]). The domain of this system was not familiar to
the students (unlike, for example, a university management
system would have been), and thus there was no concern that
the participants would answer the questions from background

knowledge of the domain [6]. This guideline is also men-

tioned in Parsons and Cole: ‘‘subject matter experts should

not be used, it is critical that participants can answer

questions by using only the script, rather than by using

background knowledge’’ [25]. For the OO-DFD transaction

method, we created a flat OO-DFD of the system consisting

of 5 transactions, along with a description of each trans-

action. Similarly, we created an equivalent UC model that

included respective UC diagram and descriptions. The

diagram and descriptions were examined by experts of the

two methods to verify that they are correct and in accord

with the respective method.1

Along with the diagrams and descriptions, each partic-

ipant received a questionnaire consisting of 22 questions.

The same questionnaire was given to all subjects in the two

groups. The questions were actually statements dealing

with facts appearing in the diagrams and their descriptions.

For each statement, each subject had to mark whether it is

‘‘true,’’ ‘‘false,’’ or ‘‘can’t tell.’’ As already said, the

methods are not ‘‘information equivalent’’ [6, 33]. For

example, data classes appear in transactions, but not in

UCs. Parsons and Cole claim: ‘‘…if one form provides

enough information to answer selected questions correctly

while a second form does not, it would not be surprising to

find that participants receiving the first form outperform

those receiving the second form on those questions’’ [25].

We agree with this claim and overcame the problem by

asking questions that do not refer to components not

appearing in the two models. For example, instead of

referring to a certain data class used to store or retrieve

data, as might appear in a transaction only, we just refer to

information items. Table 6 presents a few examples of

statements included in the questionnaire.

After completing the questionnaire, each subject was

given a post-test questionnaire where he/she was asked to

express his/her opinion about the comprehensibility of the

model (diagrams and descriptions) he/she used. This

questionnaire consisted of two parts: (a) a 1–7 ordinal scale

question, where 1 means ‘‘the model is very difficult to

understand’’ and 7 means ‘‘the model is very easy to

understand’’; (b) an open-ended question, where the subject

was asked to write comments about the comprehensibility

of the modeling method he/she worked with.

In the second task of the experiment—model creation—

each subject received a narrative requirements document of

an information system. The task was to create diagrams

(OO-DFD transaction or UCs) and respective descriptions.

(Recall that each subject continued to work with the same

modeling method as in the first task, so that he/she would not be

confused with a different method). The requirements doc-

ument was part of the IFIP Conference System (used also

in [19, 22]). Prior to the experiment, two detailed solutions,

one for each method, were created by experts of each

method. These solutions were used after the experiment by

the graders who graded the models that were created by the

subjects.

Along with the requirements document, each subject

received an instructions form that included explanations

about what he/she is expected to do, and an example of a

diagram (a transaction or a UC) and a respective descrip-

tion. The example was provided in order to help the par-

ticipants remember the notations and structures used in the

modeling method.

After completing this task, each participant was given

another post-test questionnaire, where he/she was asked to

express his/her opinion about how good the method is for
creating a functional model of a system. Here too we used a

1–7 ordinal scale question (1—‘‘very bad,’’ 7—‘‘very

good’’) and an open question to write comments about the

method.

It should be noted that a few weeks prior to the actual
experiment, we conducted a pilot test with 4 senior

students who studied and practiced with the two methods.

The pilot test helped us to verify the correctness of the

tasks and the clarity of the questions, and to assess the time

needed to complete the two tasks. Based on that, we
decided to allocate a total of 3 h for the two tasks. We told

the subjects in advance that they had 3 h to complete the
two tasks, but we advised them to dedicate about an hour to
the first task (comprehension) and the rest to the second
task (model creation).

Table 6 Examples of statements in the questionnaire

When a client's order is updated, first the updated order details are inserted and then the card details are read
The client does not need to approve the chosen design details before ordering the desired amount
A message is sent to the client if his order is not updated according to his request
When a client wants to choose a card from the existing variety, the system will display to the client all the cards existing in the collection

1 In the comprehension task, we face a potential bias, since the
researchers created the models. This problem is well known, as in
many other experimental studies the researchers create the modeling
tasks. We created the best possible models with each of the methods,
with no bias toward either (as the results of the comprehension task
in the next sections indicate).

3.6 The analysis procedure: grading the solutions

The grades of the solutions of the two tasks were deter-

mined as follows:

• Comprehension task: The comprehension task included

a questionnaire containing 22 statements having one

correct answer each (True or False). For each solution,

the number of correct answers was counted.

• Model creation task: Grading the solutions of this task

was more complicated, because for a given require-

ments document, many correct solutions (i.e., func-

tional models) are possible. This requires deep

examination of the model created by each subject.

Each solution, within each of the two groups, was

evaluated and graded by two experienced graders.2 The

two graders graded each solution
separately (using separate copies of the solution sheets). To

assist the graders, we prepared ‘‘expert solutions’’ of

the tasks (prior to the experiment). Since there might be

other correct solutions, as discussed in Sect. 3.5, the

graders were instructed to use the expert solutions as

guidelines only and to grade each solution compared to

the requirements document. At the beginning of their

grading work, the two graders of each treatment group

worked together on a few solutions in order to

synchronize their work; then they continued working

separately. This way, each solution received two

independent grades. We checked the correlation

between the two grades using Pearson correlation

statistic, which is a means of measuring correlation among
graders when the grades are on an ordered scale.3 The

results showed high correlations: for the graders of the

OO-DFD transaction, the Pearson correlation coeffi-

cient was 0.81; for the UCs, it was 0.76. Therefore, we

could average the two grades and declare it as the

quality grade of each solution.
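For illustration only, the agreement check and grade averaging described above can be sketched as follows; the grades are hypothetical, and in practice a statistics package would compute the coefficient:

```python
# Illustrative sketch (hypothetical grades): checking inter-grader
# agreement with the Pearson correlation coefficient before averaging
# the two independent grades of each solution.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical grades (0-100 scale) given by the two graders to the same solutions
grader1 = [85, 70, 92, 60, 78, 88]
grader2 = [80, 72, 95, 55, 75, 90]

r = pearson_r(grader1, grader2)
if r >= 0.7:  # high correlation, as found in the experiment (0.81 and 0.76)
    quality = [(a + b) / 2 for a, b in zip(grader1, grader2)]  # averaged grades
```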

4 Analysis

Prior to testing the differences between the results using

parametric t tests, some assumptions had to be validated. The
t test assumes that: (a) the samples are independent;
(b) the samples are normally distributed; and (c) the
homoscedasticity (equal variances) assumption holds. The
independence assumption is satisfied because the participants
were split randomly into two groups, creating
independent samples. The normal distribution

assumption was examined using Kolmogorov–Smirnov test

for each of the groups on each dependent variable (i.e.,

comprehension, comprehension time, comprehension pref-

erence, quality of model, modeling time, modeling prefer-

ence). We found out that all groups were normally

distributed, except for two variables: perceived
comprehensibility (p value < 0.05) and perceived quality for the
OO-DFD transaction group, which has a weak normal
distribution (p value = 0.07). Therefore, we conclude that

t tests are suitable for all variables except these two. For

these, we used the Wilcoxon test, which is suitable when one

of the assumptions of the parametric test does not hold. The

homoscedasticity assumption is validated, using the Levene
test, each time a t test is executed. When the assumption is not
fulfilled, a version of the t test that does not assume
homoscedasticity (e.g., Welch's t test) is performed.
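As an illustration of the comparison itself, the pooled two-sample t statistic (the variant that assumes homoscedasticity) can be computed as follows; the grades are hypothetical, and in practice library routines would also supply the p value:

```python
# Illustrative sketch (hypothetical data): the pooled two-sample
# t statistic used to compare the two treatment groups on a dependent
# variable such as the comprehension grade.
import math

def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (homoscedasticity)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # unbiased sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

# Hypothetical comprehension grades of the two groups
dfd_group = [72, 68, 75, 70, 66, 74]
uc_group = [71, 69, 73, 70, 67, 72]

t = pooled_t_statistic(dfd_group, uc_group)
# |t| is then compared with the critical value at alpha = 0.05
```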

4.1 Results of the comprehension experiment

Table 7 summarizes the results of the comprehension

experiment. It shows that there was no significant differ-

ence in comprehension, but it took less time to comprehend

the UC model, while users perceived that the OO-DFD

model is easier to comprehend. Following the results, we

accept H01 and reject H03 and H05.

2 Because of the criticality of this grading, we decided to use two

experienced graders for each solution. They were graduate students of

Information Systems Engineering who served as TAs of the two

involved courses (IS Analysis & Design, and OO Analysis & Design).

3 We cannot use the Kappa coefficient, since it is suitable only for
qualitative/categorical items; grades on a 0–100 scale are not
categorical.

Table 7 The comprehension results

Measure | OO-DFD transactions | UC | Statistical analysis (α = 0.05) | Conclusion
Comprehension grade | 70.71 % | 70.68 % | p = 0.98 (t test) | No difference in comprehension
Time to complete comprehension task | 43 min | 35 min | p = 0.008 (t test) | Difference in favor of UC method
Perceived comprehensibility of models | 5.18 | 4.28 | p = 0.023 (Wilcoxon) | Difference in favor of OO-DFD transaction method


4.2 Results of the model creation experiment

As said, prior to the analysis of the model creation results,

we computed the Pearson correlation between the pairs of
grades given by the two graders; we found them highly

correlated (p value \ 0.05), thus we could average the two

grades of each participant. Table 8 presents the results of

this experiment. As can be seen, there was a significant

difference in the quality of the created models in favor of

the UC method, but there was no significant difference in

the time taken to create the models, and no difference in the

users’ perceptions about how good are the models for

modeling requirements. Following the results, we reject

H02 and accept H04 and H06.

We wanted to explore in more detail the reasons for the

difference in quality of the created models. For this, we

classified the possible types of errors in the diagrams and

descriptions into five categories:

A—missing or redundant function/transaction/UC

B—missing or wrong name of function/UC in diagram

or description

C—missing, redundant, or wrong external entity/actor in

diagram or description

D—missing, redundant, or wrong-direction data flow/connection

E—wrong logical description of transaction/UC

We counted the number of errors made by each partic-

ipant in each of the above categories. Then we tested the

differences in the number of errors within each of the

categories, using t tests. The results are summarized in

Table 9. As can be seen, in categories A and B, there were

no significant differences between the two methods, while

in categories C and D, there were significant differences in

favor of the UC method. These results are reasonable

because in OO-DFD transactions, there are more external

entities and data flows compared to actors and connections

in UCs; hence, there are more chances to make errors when

using the OO-DFD transaction method, making the UC

method superior in those categories. Conversely, in category

E, there was a significant difference in favor of the OO-

DFD transaction method, possibly because the logical

descriptions of the transactions are more structured com-

pared to the informal descriptions of UCs.
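The per-category analysis can be sketched as follows; the error listings are hypothetical and only illustrate how the group averages in Table 9 are obtained:

```python
# Illustrative sketch (hypothetical data): tallying each participant's
# errors by category (A-E) and averaging them per treatment group.
from collections import Counter

CATEGORIES = "ABCDE"

def average_errors(solutions):
    """solutions: one string per participant listing the categories of
    his/her errors, e.g. 'CDD' = one category-C and two category-D errors."""
    totals = Counter()
    for errors in solutions:
        totals.update(errors)  # counts each category letter
    n = len(solutions)
    return {c: totals[c] / n for c in CATEGORIES}

# Hypothetical graded solutions of one treatment group
group = ["CD", "CDD", "", "ACD", "DDE"]
avg = average_errors(group)  # e.g. avg["D"] == 1.2
```

The per-category averages of the two groups are then compared with t tests, as reported in Table 9.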

5 Discussion

5.1 Evaluation of results and implications

5.1.1 Analysis of the comprehension results

(A) Comprehension of models. The results showed no

significant difference between the comprehensibility of the

two methods. This result accords with our speculations
prior to the experiment, given the contrary considerations
and their support from cognitive theories. On one hand, detailed

diagrams and descriptions, as in the OO-DFD transaction
method versus the UC method, may aid
understandability, but on the other hand, too many details

might cause confusion. As described in the hypotheses

section, the COGEVAL framework proposes cognitive

principles that on the one hand support the assumption that

transaction diagrams are easier to understand than UC

diagrams (more relationship information makes a model more
comprehensible), while on the other hand, they support the
opposite assumption that transaction diagrams are less well
understood (more elements make a model less comprehensible).

Table 8 The quality of modeling results

Measure | OO-DFD transactions | UC | Statistical analysis (α = 0.05) | Conclusion
Model creation grade | 73.87 % | 87.52 % | p = 0.001 (t test) | Difference in favor of UC method
Time to complete model creation task | 112 min | 108 min | p = 0.53 (t test) | No difference in time
Perceived quality of modeling method | 4.04 | 3.92 | p = 0.96 (Wilcoxon) | No difference in perceived quality

Table 9 Results per category

Category | Avg. no. of errors in OO-DFD transactions | Avg. no. of errors in UCs | Statistical analysis (α = 0.05) | Conclusion
A | 1.46 | 1.71 | p = 0.56 | No significant difference
B | 0.15 | 0.19 | p = 0.81 | No significant difference
C | 2.38 | 1.05 | p = 0.001 | Significant difference in favor of UC method
D | 3.58 | 0.43 | p = 0 | Significant difference in favor of UC method
E | 1.35 | 2.76 | p = 0.014 | Significant difference in favor of OO-DFD transaction method

In order to get more insight into the results, we analyzed

the comments written by the participants in the post-test

questionnaires. We first reviewed the comments and

extracted phrases that appeared many times, that is, written

by many participants. Then, we counted the number of

times each phrase was written about each of the two

methods. The results of this analysis, which are summa-

rized in Table 10, support our earlier explanations. As can

be seen, the phrase ‘‘The diagram contains many details

and is confusing and complicated’’ was written 3 times

about the OO-DFD transaction method, but never about the

UC method, while the phrase ‘‘The diagram is clear,

understandable and visual’’ was written 16 times about the

UC method and only 8 times about the OO-DFD transac-

tion method. On the other hand, the phrases ‘‘The diagram

is too general; not enough details’’ and ‘‘The descriptions

are not detailed enough and it is hard to extract specific

information from them’’ were written many times about the

UC method (9 and 12, respectively), but never about the

OO-DFD transaction method. Table 10 also shows some

contradictions: the phrase ‘‘The transaction/use case

descriptions are clear and understandable’’ was written

only 2 times about the OO-DFD transaction and 10 times

about the use cases, while the phrase ‘‘The transaction/use

case descriptions add information to the understanding of

the system’’ was written 8 times about the OO-DFD

transaction and only 3 times about the UCs. These con-

tradicting comments too support the result that we found,

that is, that there is no difference in comprehension

between the two modeling methods.

(B) Time to comprehend models. We found that it takes

less time to comprehend a model expressed in a UC model.

This finding is also in accord with our assumptions prior to

the experiment and can be explained as before: because

OO-DFD transaction diagrams and descriptions include

many more components and details compared to the

alternative, it takes more time to read and understand them.

(C) Perceived comprehensibility of models. Surpris-

ingly, the participants perceive the OO-DFD transaction

model as more comprehensible—contrary to our a priori

assumption. This outcome is difficult to explain, in par-

ticular, if we compare it to the opposite outcome regarding

the time dimension. A possible explanation may be that the

subjects thought that since the method is more detailed and

the descriptions are more structured—it ‘‘must be more

comprehensible.’’ Another possible explanation for this

result might be that each participant only assessed one of

the two modeling techniques; therefore, the difference in

perceived comprehensibility of the models is threatened by

the variation in individual appreciation of
comprehensibility. What is ''difficult'' or ''easy'' for some may be
judged differently by others. We address this limitation

in Sect. 5.2—Threats to validity.

5.1.2 Analysis of the model creation results

(A) Quality of created models. We found a significant difference between the quality of the created models, in favor of the UC method. As discussed earlier, this finding can be explained by cognitive theories: according to the COGEVAL framework, when a model requires a greater number of simultaneous items to create a diagram, the resulting diagram is of lower quality than one created with a model that requires fewer items. Indeed, an OO-DFD transaction diagram includes more types of components than a UC diagram, and an OO-DFD transaction description likewise includes more types of components than a UC description. Moreover, the process logic of an OO-DFD transaction is described in pseudo-code, while a UC description is a less structured, more natural-language-like description. Hence, an OO-DFD transaction modeler has many more "chances" to commit errors than a UC modeler.

Table 10 Summary of the participants' comments

| Phrase | Times said about OO-DFD transactions | Times said about UCs |
|---|---|---|
| The diagram is clear, understandable and visual | 8 | 16 |
| The diagram contains many details and is confusing and complicated | 3 | 0 |
| The diagram describes clearly the data flow in the system | 4 | 0 |
| The transaction/use case descriptions add information to the understanding of the system | 8 | 3 |
| The transaction/use case descriptions are clear and understandable | 2 | 10 |
| The diagram is too general, not enough details | 0 | 9 |
| The descriptions are not detailed enough and it is hard to extract specific information from them | 0 | 12 |

Requirements Eng (2014) 19:27–43 39


Note, however, that we found significant differences in certain error categories of the models: in category C (missing, redundant, or wrong external entity or actor in a diagram or description) and in category D (missing, redundant, or wrong direction of data flow/connection), we found an advantage for the UC method. This is reasonable because, as already explained, the OO-DFD transaction method requires defining external entities, data classes, and data flows, which are not required in the UC model. Conversely, in category E (wrong logical description of a transaction/UC), we found an advantage for the OO-DFD transaction method. The reason is that the OO-DFD transaction description is more structured, thus causing fewer mistakes.

(B) Time to create models. Contrary to our a priori assumptions, we found no significant difference in the time to create the models. This result is somewhat surprising, because a more demanding task (transactions) would be expected to take more time. A possible explanation is that the participants, who were allocated the same amount of time for the two tasks, were advised to use the first hour for the comprehension task; therefore, they reserved about the same amount of the remaining time for the model creation task.

(C) Perceived quality of modeling methods. We found no significant difference in the perceived quality of the two modeling methods. This result is somewhat inconsistent with the corresponding outcome regarding perceived comprehensibility of the models, where we found a significant difference. It can be explained by the contradiction in the participants' opinions shown in Table 10: on one hand, many subjects thought that the OO-DFD transaction method is too detailed and confusing, while on the other hand, many others thought the UC method is too general and not detailed enough. As with perceived comprehensibility, we can also attribute this inconsistent result to the subjects' differing individual notions of comprehensibility, since in our experimental setting each subject used and assessed only one of the two modeling methods. Section 5.2 further addresses this issue.

5.1.3 An improved method for modeling functional requirements

We have seen that there are differences between the two methods, and that each has advantages and disadvantages. Our observations led us to conclude that each method can be improved by adopting the advantages of the other. The OO-DFD method can be improved by simplifying the redundant, overly detailed, and rigid description of transactions, while retaining the structured description of the transaction's process logic. The UC method can be improved by adding the missing components to the UC diagram and referring to them in the UC description.

In the following, we briefly describe an enhanced UC method, termed EUC (Enhanced Use Case), which combines the advantages of the two:

A. An EUC diagram will include the components that exist in an OO-DFD transaction diagram, that is, one or more functions, external entities signifying sources of input data and destinations of output, data classes from which the functions can retrieve data or in which they can store data, and directed data flows between the respective components. Proponents of existing UC-driven methodologies may ask: where can the data classes come from at this stage? Two answers are possible: (a) if we also adopt the FOOM approach, we create an initial class diagram prior to creating the EUCs and use these classes in the respective EUC diagrams; (b) even if we do not adopt the FOOM approach, while creating an EUC diagram we may expect the modeler to define not only the functions of the EUC but also the required data classes, as well as the external entities.

B. There is no need to include the "traditional" actors in the EUC diagram; we can list them in the description of the EUC, where we also define other things that are not part of the diagram (e.g., pre- and post-conditions). Another reason for this change is that an EUC may sometimes be operated by many different types of operators, each with different access privileges; there is simply not enough room to include such details in the diagram.

C. There is no need to include the special "uses" and "includes" types of UCs in the EUC diagram. Instead, we may simply have chained functions: a data flow from one function to another means that the first triggers the latter and may pass some data/parameters to it.

D. The description of an EUC will have to distinguish between the external entities that are sources of input or destinations of output and the operators of the use case at runtime. However, in contrast to the current overly structured description of a transaction, we adopt a less structured, more natural-language-like description, as in "traditional" UC descriptions.

We complete this sketch of the enhanced method by showing a possible description of an EUC. We assume that the EUC is the same as the transaction diagram in Fig. 1; Table 11 shows its description.


5.2 Threats to validity

Like any quasi-laboratory experiment, this one has limitations and threats to the validity of its results. We distinguish between four types of threats: conclusion, internal, construct, and external validity.

Conclusion validity: Conclusion validity refers to whether the conclusions reached in a study are correct. For controlled experiments, it is directly related to the correct application of statistical tests to the data; incorrectly applied tests threaten conclusion validity. In the presented experiment, we believe that all statistical tests were applied correctly. Still, the reliability of the measure of model quality might introduce some threat to validity; to address this concern, each solution was graded by two independent graders.
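Agreement between two independent graders is commonly checked with a chance-corrected statistic such as Cohen's kappa. The paper does not state which reliability statistic, if any, was computed; the following is only an illustrative sketch with hypothetical grades:

```python
def cohen_kappa(grades_a, grades_b):
    """Chance-corrected agreement between two graders (Cohen's kappa)."""
    assert len(grades_a) == len(grades_b)
    n = len(grades_a)
    categories = set(grades_a) | set(grades_b)
    # Observed agreement: fraction of solutions given the same grade.
    p_observed = sum(a == b for a, b in zip(grades_a, grades_b)) / n
    # Expected chance agreement, from each grader's marginal distribution.
    p_expected = sum(
        (grades_a.count(c) / n) * (grades_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical quality grades assigned by two graders to four solutions:
grader1 = ["A", "A", "B", "B"]
grader2 = ["A", "A", "B", "A"]
print(cohen_kappa(grader1, grader2))  # 0.5
```

A kappa near 1 would indicate that the two graders applied the quality measure consistently; a low kappa would signal the reliability threat discussed above.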

Internal validity: Internal validity refers to the extent to which the independent variable(s) were actually responsible for the effects observed in the dependent variable(s). A possible threat is that unknown factors may have influenced the results and thereby limited the internal validity of the study. This threat, which exists in all experiments of this kind, was minimized by randomly assigning the subjects to the two treatment groups.
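Random assignment of this kind amounts to shuffling the participant list and splitting it in half. The paper does not describe how the assignment was performed; this is a minimal illustrative sketch:

```python
import random

def assign_to_groups(participants, seed=0):
    """Randomly split participants into two (nearly) equal treatment groups."""
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # e.g., OO-DFD group, UC group

oo_dfd_group, uc_group = assign_to_groups(range(20))
print(len(oo_dfd_group), len(uc_group))  # 10 10
```

Because each participant's group is determined by chance rather than by any personal attribute, unknown confounding factors are expected to spread evenly across the two groups.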

Construct validity: Construct validity is the degree to which the independent and dependent variables accurately measure the concepts they are supposed to measure. Threats to construct validity were identified in both experimental tasks:

(a) In the comprehension task, a threat to validity is the measure of understandability. To deal with this threat, we used a questionnaire consisting of 22 statements, thus addressing many facts appearing in the presented models. Of course, counting and summing the number of correct answers is not a perfect measure, because it gives equal weight to all statements. In spite of this weakness, we argue that the measure of understandability we used is reasonable and follows "best practice" in other comparative experiments on comprehension.

(b) Another threat to construct validity, in the model creation task, is that each subject was given a requirements document of a certain case study (part of the IFIP Conference system). This does not represent a realistic situation: in reality, analysts usually elicit requirements from real users in an interactive and iterative process. We skipped this important aspect and gave the subjects a "ready" requirements document. It can thus be argued that we mainly measured how well analysts convert predefined textual requirements into diagrammatic specifications. We cannot be sure that the same results would have been obtained had the subjects worked in a more realistic environment; on the other hand, we have no reason to assume that this would have affected the results of the two methods differently.

(c) In both the comprehension and modeling tasks, a major threat to validity lies in the fact that each participant was asked to assess only one of the modeling methods on a 1–7 ordinal scale. Individuals differ in how they appreciate the comprehensibility of a model and the suitability of a modeling method for describing functional requirements: what seems "difficult" to some may seem "easy" to others, and a method that seems "extremely suitable" to some may seem "not at all suitable" to others. The ideal solution would have been a within-subject experimental setting, in which each subject experiments with both modeling methods. In that case, we could have asked direct questions such as "Which of the two methods do you understand better?" or "Which method do you think is more suitable for defining functional requirements?" But this would have caused other problems. One is practicality: each participant would have had to perform the two tasks using one method and then two different tasks using the other method, so we would either need many more (homogeneous) participants or have each participant perform all the tasks, which would take at least twice as long (6 h instead of the 3 h in our experiment); this is unrealistic in an exam setting, besides the problem of fatigue. Another problem is ensuring that the different tasks given to each subject are similar in comprehensibility, complexity, etc., which is not easy to achieve. A further problem with a within-subject design is possible bias due to the order effect (since each subject performs each task with a different method); to overcome it, we would need to double the number of treatment groups (for example, subjects in one group would perform one task using OO-DFD first and then another task using the UC method, while subjects in another group would perform the same tasks in the opposite order). For these practical reasons, we chose the between-subject setting, and we are aware of its limitations, which may explain the inconsistent and surprising results regarding perceived comprehensibility and quality.

Table 11 The EUC description

| Component | Description |
|---|---|
| EUC name | Assign reviewers to a submitted article |
| Operators | PC Chair, electronic mail system |
| Preconditions | There are submitted articles in the system that have not yet been assigned to reviewers |
| Postconditions | The chosen article is assigned to reviewers |
| Main success scenario | 1. The system reads the details of reviewers from class "Reviewer" and non-assigned articles from class "Article" and displays them to the PC Chair. 2. The PC Chair selects and inputs to the system an article from the list and the reviewers assigned to review it. 3. The assignment details are saved in class "Article review". 4. The system sends the article's details and the last date to submit review reports to the reviewers, via the electronic mail system |
| Extensions | None |

External validity: External validity is the degree to which the results of the experiment can be generalized to the population under study and to other research settings. The following possible threats were identified:

(a) The case studies used in the experiment are relatively small and may not be representative in terms of size and complexity. However, this limitation holds for almost all controlled experiments conducted in the areas of software engineering and method evaluation.

(b) The subjects who participated in the experiment were students. In the comprehension task, they played the role of users who read and evaluate analysis specifications, but they were not real users evaluating specifications of real systems developed for them. However, we have no reason to assume that the use of students as surrogates for real users biased the comparative results. Similarly, in the model creation task, the same students played the role of analysts, but they were not real, experienced analysts of real requirements. Again, in spite of this limitation, we have no reason to assume that real analysts would perform differently with the two modeling methods. Recall that our subjects were senior software engineering students trained to become systems analysts, and that each subject learned his/her method from an expert in that method. We note again that in almost all experimental research in this area the subjects were students, as it is not feasible to conduct such experiments in the real world.

6 Conclusions and future work

We compared two alternative methods for modeling the functional requirements of an information system: OO-DFD transaction diagrams and descriptions versus UC diagrams and descriptions. In a controlled experiment, we examined the differences between the two methods with respect to comprehension, quality, time, and perceived comprehensibility and quality of the modeling methods. The main result of the comparative experiment is that the quality of the models created with the UC method is significantly better than the quality of the models created with the OO-DFD transaction method. This result arises mainly because UCs do not include components that are included in OO-DFD transactions, and therefore UC analysts avoid errors that OO-DFD transaction analysts cannot. We also concluded that the OO-DFD transactions are overly detailed, while the UC diagrams are too general and not detailed enough. This led us to propose an improved method, a hybrid of the two, for defining and describing the functional requirements.

Note, however, that the EUC method has not yet been tested empirically. In the future, we plan to evaluate the enhanced method against the original methods. In that research, we plan to repeat the experiments using tasks of different size and complexity.

References

1. Arlow J, Neustadt I (2001) UML and the unified process: practical object-oriented analysis & design. Addison-Wesley, Reading
2. Bajaj A (2004) The effect of the number of concepts on the readability of schemas: an empirical study with data models. Working paper, University of Tulsa, OK, USA
3. Bass L, Bergey J, Clements P, Merson P, Ozkaya I, Sangwan R (2006) A comparison of requirements specification methods from a software architecture perspective. Technical report, Software Engineering Institute, Carnegie Mellon
4. Booch G (1991) Object-oriented design with applications. Benjamin/Cummings, Menlo Park, CA
5. Booch G, Rumbaugh J, Jacobson I (1996) The unified modeling language for object-oriented development. Rational Software Corporation
6. Burton-Jones A, Weber R, Wand Y (2009) Guidelines for empirical evaluations of conceptual modeling grammars. J Assoc Inf Syst 10:495–532
7. Coad P, Yourdon E (1991) Object oriented analysis, 2nd edn. Prentice Hall, Englewood Cliffs, NJ
8. Cockburn A (2001) Writing effective use cases. Addison-Wesley, Boston
9. Cox K, Phalp K, Shepperd M (2001) Comparing use case writing guidelines. In: Proceedings of the 7th international workshop on requirements engineering: foundation for software quality, pp 101–112
10. DeMarco T (1978) Structured analysis and system specification. Yourdon Press, New York
11. Dori D, Goodman M (1996) From object-process analysis to object-process design. Ann Softw Eng 2:25–50
12. Gane C, Sarson T (1979) Structured system analysis: tools and techniques. Prentice Hall, Englewood Cliffs, NJ
13. Gemino A, Wand Y (2003) Evaluating modeling techniques based on models of learning. Commun ACM 46:79–84
14. Hadar I, Kuflik T, Perini A, Reinhartz-Berger I, Ricca F, Susi A (2010) An empirical study of requirements model understanding: use case vs. Tropos models. In: SAC'10, Sierre, Switzerland, pp 22–26
15. Hahn J, Kim J (1999) Why are some diagrams easier to work with? Effects of diagrammatic representation on the cognitive integration process of systems analysis and design. ACM Trans Comput Hum Interact 6(3):181–213
16. Jacobson I (1987) Object-oriented development in an industrial environment. In: Proceedings of the conference on object-oriented programming systems, languages and applications, pp 183–191
17. Jacobson I (1992) Object-oriented software engineering: a use case driven approach. Addison-Wesley Professional, Reading
18. Jeyaraj A, Sauter VL (2007) An empirical investigation of the effectiveness of system modeling and verification tools. Commun ACM 50(6):63–76
19. Kabeli J, Shoval P (2005) Comprehension and quality of analysis specifications—a comparison of FOOM and OPM methodologies. Inf Softw Technol 47(4):271–290
20. Kim J, Hahn J, Hahn H (2000) How do we understand a system with (so) many diagrams? Cognitive integration processes in diagrammatic reasoning. Inf Syst Res 11(3):284–303
21. Larman C (2002) Applying UML and patterns: an introduction to object-oriented analysis and design, and the unified process, 2nd edn. Prentice Hall, Englewood Cliffs, NJ
22. Mathiassen L, Munk-Madsen A, Nielsen P, Stage J (2000) Object-oriented analysis and design. Marko Publishing, Aalborg
23. Millet I, Nelson R (2007) Data flow diagrams vs. use cases—student perceptions. Int J Inf Commun Technol Educ 3(1):70–78
24. Moody D (2006) What makes a good diagram? Improving the cognitive effectiveness of diagrams in IS development. In: Proceedings of the 15th international conference on information systems development (ISD)
25. Parsons J, Cole L (2005) What do the pictures mean? Guidelines for experimental evaluation of representation fidelity in diagrammatical conceptual modeling techniques. Data Knowl Eng 55:327–342
26. Peleg M, Dori D (2000) The model multiplicity problem: experimenting with real-time specification methods. IEEE Trans Softw Eng 6:1–18
27. Ramsin R, Paige R (2008) Process-centered review of object oriented software development methodologies. ACM Comput Surv 40(1):3–89
28. Rockwell S, Bajaj A (2005) COGEVAL: applying cognitive theories to evaluate conceptual models. Adv Top Database Res 4:255–282
29. Rumbaugh J, Blaha M, Premerlani W, Eddy F, Lorensen W (1991) Object-oriented modeling and design. Prentice Hall, Englewood Cliffs, NJ
30. Shoval P (1988) ADISSA: architectural design of information systems based on structured analysis. Inf Syst 13(2):193–210
31. Shoval P (2007) Functional and object-oriented analysis and design: an integrated methodology. Idea (IGI) Publishers, Hershey
32. Shoval P, Kabeli J (2001) FOOM: functional- and object-oriented analysis & design of information systems—an integrated methodology. J Database Manag 12(1):15–25
33. Siau K (2004) Informational and computational equivalence in comparing information modeling methods. J Database Manag 15:73–86
34. Topi H, Ramesh V (2002) Human factors research on data modeling: a review of prior research, an extended framework and future research directions. J Database Manag 13(2):3–19
35. White SA (2004) Introduction to BPMN. IBM Corporation, Business Process Trends
36. Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2000) Experimentation in software engineering: an introduction. Kluwer, Boston
37. Yu E (1997) Towards modeling and reasoning support for early-phase requirements engineering. In: Proceedings of the 3rd IEEE international symposium on requirements engineering, Annapolis, USA, pp 226–235
