Building Families of Software Products for e-Learning Platforms: A Case Study



64 IEEE REVISTA IBEROAMERICANA DE TECNOLOGIAS DEL APRENDIZAJE, VOL. 9, NO. 2, MAY 2014

Building Families of Software Products for e-Learning Platforms: A Case Study

Pablo Sánchez Barreiro, Diego García-Saiz, and Marta Elena Zorrilla Pantaleón

Abstract— Applications for e-learning platforms must deal with certain variability inherent to their domain. For example, these applications must be adapted to the variations of each teaching-learning process. Thus, they must be changed manually, according to the particular environment in which they will be deployed. This manual adaptation process is costly and error-prone. Our hypothesis is that software product line (SPL) engineering, whose goal is the effective production of similar software systems, can help to alleviate this problem. This paper illustrates this idea by refactoring an e-learning application named E-Learning Web Miner into an SPL. The benefits obtained are highlighted and analyzed.

Index Terms— Software product line, refactoring, e-learning, data mining.

I. INTRODUCTION

Most educational institutions nowadays use some kind of e-learning platform. Examples of these e-learning platforms are Moodle, WebCT/Blackboard and Sakai. These platforms aim to ease and improve the teaching-learning process by taking advantage of internet technologies. They offer different resources and tools that enable students to develop both autonomous and collaborative learning. Moreover, this kind of teaching offers advantages to both students and teachers. The former can set their own pace, time and place of learning; the latter can design content and activities for different student profiles and learning styles [1] using the interactive and multimedia capabilities that these tools provide.

A market of auxiliary applications has been created around these e-learning platforms. For instance, the Moodle platform currently lists 786 complementary modules and plugins on its website. These modules provide different utilities, such as eCalendar synchronisation or advanced support for the development of eportfolios [2], as Mahara [3] does.

All these e-learning platforms are quite similar. All of them support concepts such as activity, assignment, quiz or grade. However, they also present a wide range of differences. For instance, the database schemas they use in their back-ends are different, although they refer to the same concepts. Therefore, the companies developing auxiliary products for e-learning systems must deal with this inevitable variability.

Manuscript received January 14, 2013; revised February 13, 2013; accepted March 14, 2013. Date of publication April 16, 2014; date of current version May 15, 2014. This work was supported in part by the University of Cantabria by means of a Ph.D. Grant, in part by the TIN2008-01942/TIN Project through the Spanish Science and Technology Ministry, and in part by the TIC-5131 Regional Project through the Junta de Andalucía.

The authors are with the Department of Mathematics, Statistics and Computation, University of Cantabria, Cantabria E-39005, Spain (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/RITA.2014.2317531

Moreover, even when a company or a product targets a single e-learning platform, developers must also deal with some variability inherent to the domain, since the design of each course is highly dependent on the instructor and the subject. For instance, managing student groups as a single unit may be helpful in courses where collaborative activities are proposed, whereas this capability would not be required for courses based on individual activities.

Furthermore, the environment where these applications are deployed is not always the same, and it can add further variations. For instance, depending on the country where these applications are deployed, student data can be subject to different regulations and laws [4].

Therefore, auxiliary applications for e-learning platforms should be customised in order to meet customer needs. These applications must support different variations in order to adapt to: (1) different e-learning platforms; (2) different learning strategies, subjects and organisation of courses; and (3) different environments.

Although development teams might create flexible and easily adaptable software architectures that support the quick adaptation of e-learning applications to the variations enumerated above, these adaptations would still have to be performed manually. This can be a tedious, time-consuming and error-prone task.

The goal of Software Product Line engineering [5] is the effective production of similar software systems. In this paradigm, specific products are constructed, as automatically as possible, from a set of reusable software assets by means of generative techniques. Therefore, we believe that the use of a Software Product Line approach for developing auxiliary software applications for e-learning platforms can be highly beneficial.

This article describes how to develop a Software Product Line for a family of applications named E-learning Web Miner (ElWM), which aims to extract knowledge, in the form of rules and patterns, from activity data stored on e-learning platforms.

After this introduction, this article is structured as follows. Section II introduces our case study, E-learning Web Miner, and provides some background on Software Product Line engineering. Section III explains how to construct a Software Product Line for E-learning Web Miner. Section IV shows how to specify products such that particular settings of these can be automatically derived from a Software Product Line. Section V comments on related work. Finally, Section VI analyses the strengths and weaknesses of our approach and Section VII summarises the article and outlines future work.

1932-8540 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

II. BACKGROUND

This section provides background on those concepts that will be used throughout this paper. First, we introduce the case study, E-learning Web Miner. Then, we describe how a Software Product Line works.

A. Case Study: E-Learning Web Miner

The lack of face-to-face communication in distance learning presents a number of problems, reported in the literature, which affect both teachers and students [6]. On the one hand, students feel isolated and disoriented in the course hyperspace, rapidly losing motivation [7]. On the other hand, teachers lack tools to adequately monitor their students, identify the difficulties they encounter and follow how they evolve over the course [8], [9].

E-learning Web Miner (ElWM) [6], a tool developed at the University of Cantabria, aims to assist instructors involved in virtual education by extracting and providing useful information that these instructors can use to improve their virtual courses and learning activities. To extract this information, ElWM relies on data mining techniques. Nevertheless, instructors, in general, lack knowledge of data mining. A key goal of ElWM is therefore to provide a simple and easy-to-use interface that hides all the details of the data mining techniques from the end-user and presents the computed results in a user-friendly and easily understandable way.

ElWM currently offers instructors answers to three different queries: (1) what kinds of resources (e.g., forum, mail) are frequently used together by students in each learning session?; (2) what is the profile, or the most relevant features, of the different sessions carried out by students?; and (3) what is the profile of the students who enrol in the course?

As an example, we briefly describe how the students’ profile is extracted. The goal is to group students according to their demographic data and the activity carried out in a specific course hosted on an e-learning platform. This template uses the following input variables: gender, age, number of sessions carried out in the course, time spent on the course, average sessions per week and average time spent per week of each student enrolled in a particular course. Before the mining process (also named KDD [10], an acronym that stands for Knowledge Discovery from Databases) starts, input data is pre-processed to evaluate its quality. For example, correlated or highly unbalanced data is eliminated. To group the students in clusters according to their similarity, ElWM uses the EM (Expectation Maximization) algorithm combined with SimpleKMeans. Nevertheless, in certain circumstances, due to some particularities of the input data, some variants of the SimpleKMeans algorithm, such as xMeans or kMedoids, might produce more accurate results. In these cases, SimpleKMeans is replaced by the algorithm that provides better results.

ElWM relies on third-party implementations of these algorithms. These implementations are provided by different data mining suites, such as Weka [10], KNime [12] or RapidMiner, formerly called YALE (Yet Another Learning Environment) [13]. ElWM was designed and developed following a Service Oriented Architecture (SOA) and implemented by means of Web Services [14].

ElWM has been assessed by the teachers responsible for two online courses taught at the University of Cantabria during four academic years. In their opinion, this tool allows them to discover their students’ behaviour in relation to time spent and resources used in the course, enabling them to validate or refute the assumptions made in the design of the learning process [14].

B. Software Product Line Engineering

A Software Product Line aims to create the infrastructure for the rapid and effective production of software systems for a specific market segment [5], [15]. These software systems are similar and therefore share a subset of common features, but variations may also be present. Therefore, the main goal of a Software Product Line is to construct, as automatically as possible, specific software products from the selection of a set of features available in the product family.

A Software Product Line Engineering process comprises two different phases: Domain Engineering and Application Engineering (see Fig. 1).

Domain Engineering deals with the creation of the infrastructure that will enable the rapid, or even automatic, construction of specific software systems belonging to the same family. This step creates a set of reusable software assets, as well as a set of procedures to automatically assemble these software assets, according to the needs of each particular customer.

Application Engineering is concerned with the engineering of single software systems using the infrastructure previously created at the Domain Engineering level. The specific steps within each phase are detailed in the following sections of this article.

We would like to point out that the Domain Engineering phase is executed just once in order to construct the product-line infrastructure, whereas the Application Engineering phase is executed each time a new product is derived from such an infrastructure. So, the less expensive the Application Engineering phase is, the higher the benefits of a Software Product Line will be.

Fig. 2 illustrates this idea with a typical software product line scenario. The plot represents how the accumulated development cost of a set of similar but slightly different software products grows with the number of released software products. Curves are shown for both a software product line engineering approach and a single-system engineering approach.

In the single-system approach, a particular software product is developed first. When we need a second software product, we can reuse the software assets created for the first software product, but we will need to adapt them manually. Moreover, we might need to develop some specific portions of this new software product. In any case, thanks to reuse, the cost of constructing the second software product is lower than that of the first one. For the third product, the cost would be more or less the same as for the second, and so on. Consequently, the accumulated cost grows more slowly, but steadily, after the first product, as can be seen in Fig. 2.

Fig. 1. A bird’s-eye view of the SPL engineering process.

Fig. 2. Cost-effectiveness of a Software Product Line.

In the software product line engineering case, a higher investment is required for the first product, because we need to develop the domain engineering infrastructure, which has a considerable cost. However, as the development of specific customised products is done practically for free once the domain engineering phase has been completed, the accumulated cost grows much more slowly than in the single-system engineering approach. Thus, there will be a point where both curves meet. From this point forward, the cost of the software product line approach will be lower than in the single-system engineering case, and we will get a return on our initial investment.
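The crossover between the two cost curves can be made concrete with a small calculation. The following is a minimal sketch that assumes, purely for illustration, linear cost growth and invented cost figures in arbitrary units; none of the function names or numbers come from the case study.

```python
def single_system_cost(n, first_cost, reuse_cost):
    """Accumulated cost of n products built one by one with manual reuse."""
    if n == 0:
        return 0
    # The first product is built from scratch; later ones are adapted by hand.
    return first_cost + (n - 1) * reuse_cost


def product_line_cost(n, domain_cost, derivation_cost):
    """Accumulated cost of n products derived from a product-line infrastructure."""
    if n == 0:
        return 0
    # The domain engineering investment is paid once, before the first product.
    return domain_cost + n * derivation_cost


def break_even(first_cost, reuse_cost, domain_cost, derivation_cost, limit=1000):
    """Smallest number of products for which the product line is no more expensive."""
    for n in range(1, limit + 1):
        spl = product_line_cost(n, domain_cost, derivation_cost)
        single = single_system_cost(n, first_cost, reuse_cost)
        if spl <= single:
            return n
    return None  # no crossover within the limit


# Hypothetical figures: the product line pays off from the sixth product on.
print(break_even(first_cost=100, reuse_cost=40, domain_cost=250, derivation_cost=5))
```

Under these assumed figures, the sketch also shows why the per-product derivation cost matters: if deriving a product costs as much as adapting one by hand, the function returns `None` because the curves never meet.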

In conclusion, it is key in a Software Product Line that the Application Engineering phase is as automated as possible, so that its cost is as low as possible. In the best case, the users would simply have to specify which features they want to include in a product and this product would be automatically generated.

The following sections describe how this approach has been used to refactor the ElWM application into a Software Product Line.

III. DOMAIN ENGINEERING

This section describes how the Domain Engineering activity has been carried out to refactor ElWM into a Software Product Line. This phase is performed in three steps, as described below. We also point out the benefits obtained.

A. Variability Analysis

The first step in building a Software Product Line is to carry out an analysis of the variability inherent to the domain that the software product family covers (Fig. 1, label 1). The goal is to identify which features are mandatory, which are optional, which are alternatives, and so forth. To accomplish this task, several variability analysis techniques, such as FODA (Feature-Oriented Domain Analysis) [16], are available. In our case, we obtained our feature model using FODA.

A feature model is a tree where each node represents a feature of the system. The children of a node represent the different features which are part of the parent feature. There are several parent-child relationships, depending on whether the child is an optional feature or an alternative one. The root of this tree is the system itself, which is decomposed into its most prominent features. Then, these prominent features are decomposed into their prominent subfeatures, and so on, until the system is fully decomposed into features.

For space reasons, we cannot display the full feature model. Therefore, Fig. 3 shows a part of our feature model for ElWM. It specifies that the ElWM application has three main features, which are: (1) the e-learning platforms it uses (Platform); (2) the kind of queries which can be deployed in a specific installation (Queries); and (3) the third-party data mining software it uses (DataMining Suite).

The Platform can be either WebCT or Moodle, but not both at the same time, since these features are mutually exclusive. ElWM requires third-party data mining software to work. Three different data mining suites can be used: (1) Weka; (2) Knime; or (3) RapidMiner. The software selected will depend on which algorithms are chosen to answer each query.

ElWM can answer three different queries: Resources Usage, Session Profile and Student Profile. According to the user needs, certain queries can be excluded from a specific installation of this product. Thus, each one of these queries is an optional feature, but at least one of them must be chosen.

Fig. 3. A feature model for ElWM.

Fig. 4. External constraints for the ElWM feature model.

Each query can be answered using slightly different versions of the same data mining technique. For instance, to answer the question “Which kinds of sessions are there in the course?” (SessionProfile), ElWM can use one of the available clustering techniques, such as kMeans or kMedoids. Similarly, there are different alternatives for the ResourcesUsage and StudentProfile queries.
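The tree structure described in this section can be sketched as a small data structure with a structural validity check. This is only an illustration: the class, the function and the group semantics chosen for the data mining suites ("at least one") are assumptions, and the algorithm subfeatures of each query are omitted because the full feature model is not reproduced in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    # Relationship to the parent when the parent is an "and" group.
    kind: str = "mandatory"  # or "optional"
    # Constraint over the children: "and", "xor" (exactly one), "or" (at least one).
    group: str = "and"
    children: list = field(default_factory=list)


def violations(feature, selected, parent_selected=True, parent_group="and"):
    """Return a list of structural problems found in a set of selected feature names."""
    errors = []
    chosen = feature.name in selected
    if chosen and not parent_selected:
        errors.append(f"{feature.name} selected without its parent")
    if (parent_selected and parent_group == "and"
            and feature.kind == "mandatory" and not chosen):
        errors.append(f"{feature.name} is mandatory")
    if chosen and feature.children:
        count = sum(child.name in selected for child in feature.children)
        if feature.group == "xor" and count != 1:
            errors.append(f"{feature.name}: choose exactly one child")
        if feature.group == "or" and count < 1:
            errors.append(f"{feature.name}: choose at least one child")
    for child in feature.children:
        errors.extend(violations(child, selected, chosen, feature.group))
    return errors


# Partial model following the description of Fig. 3 (assumed encoding).
ELWM = Feature("ElWM", children=[
    Feature("Platform", group="xor",
            children=[Feature("WebCT"), Feature("Moodle")]),
    Feature("Queries", group="or",
            children=[Feature("ResourcesUsage"), Feature("SessionProfile"),
                      Feature("StudentProfile")]),
    Feature("DataMiningSuite", group="or",  # assumption: one or more suites
            children=[Feature("Weka"), Feature("Knime"), Feature("RapidMiner")]),
])

bad = {"ElWM", "Platform", "WebCT", "Moodle", "Queries", "DataMiningSuite", "Weka"}
for problem in violations(ELWM, bad):
    print(problem)
```

In this sketch, selecting both platforms and no query is reported as two separate problems, one per violated group.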

Finally, we must take into account that not all relationships between features can be expressed using the feature model syntax. For instance, the kMedoids algorithm is only provided by the RapidMiner platform. Consequently, if we want to use kMedoids, we must also select RapidMiner as a data mining suite to be installed with our product. This kind of constraint is usually expressed using propositional formulae where the atoms of these formulae are the features of the feature model. An atom is evaluated as true when the corresponding feature has been selected. Otherwise, it is evaluated as false.

Fig. 4 shows the external constraints we have had to specify for the ElWM feature model. These constraints basically specify dependencies between data mining algorithms and data mining suites.
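Constraints of this kind can be evaluated mechanically once a set of selected features is known. The sketch below encodes only the one dependency stated in the text (kMedoids requires RapidMiner); the helper names are hypothetical, and the remaining constraints of Fig. 4 are not reproduced here.

```python
def implies(premise, conclusion):
    """Build the propositional formula `premise -> conclusion` over feature names.

    An atom is true when the corresponding feature is in the selection.
    """
    return lambda selected: (premise not in selected) or (conclusion in selected)


# Only this dependency is stated in the text; further entries would be added
# in the same way for the other constraints of the feature model.
CONSTRAINTS = {
    "kMedoids -> RapidMiner": implies("kMedoids", "RapidMiner"),
}


def violated(selected):
    """Return the labels of all external constraints that the selection breaks."""
    return [label for label, formula in CONSTRAINTS.items() if not formula(selected)]


print(violated({"SessionProfile", "kMedoids", "Weka"}))        # constraint broken
print(violated({"SessionProfile", "kMedoids", "RapidMiner"}))  # constraint satisfied
```

Keeping each formula as a function of the selection means that more complex formulae (conjunctions, exclusions) could be added without changing `violated`.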

Once the variability of a software product family has been analysed and specified, the next step is to design a flexible software architecture that supports the different variations identified (Fig. 1, label 3). In the next section, we explain how this goal was achieved in our case.

B. Reference Architecture Design

Fig. 5 depicts a bird’s-eye overview of the ElWM architecture. It is an adaptable architecture that can be modified according to the set of features each customer wants to include in a specific ElWM installation.

Fig. 5. Reference architecture for ElWM.

ElWM has been designed as a three-layer web-based application. Several static HTML forms serve as the front-end for the application (Fig. 5, label 1). The user fills in these forms and requests a query to be answered. This request is sent to a web service (Fig. 5, label 2), which is implemented by a component that contains the core logic for the ElWM system (Fig. 5, label 3).

This core logic retrieves the data to be analysed from an e-learning platform. To isolate the retrieval of this data from the particularities of each e-learning platform, data is retrieved using a well-defined interface (Fig. 5, label 4), independent of any e-learning platform. For each specific e-learning platform, there is a software component which implements this interface (Fig. 5, label 5).
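The separation between the platform-independent interface and the platform-specific components follows the usual adapter idea. The sketch below illustrates it under assumed names: the interface, its single method and the in-memory stand-in are hypothetical and do not reflect the actual ElWM interface or any Moodle schema.

```python
from abc import ABC, abstractmethod


class PlatformAdapter(ABC):
    """Platform-independent data access interface (hypothetical)."""

    @abstractmethod
    def sessions(self, course_id):
        """Return a list of dicts with the keys 'student' and 'minutes'."""


class InMemoryMoodleAdapter(PlatformAdapter):
    """Stand-in for a Moodle adapter; a real one would query the platform."""

    def __init__(self, rows):
        # Each row: (course_id, user_id, session_duration_in_seconds)
        self._rows = rows

    def sessions(self, course_id):
        return [{"student": user, "minutes": seconds / 60}
                for course, user, seconds in self._rows
                if course == course_id]


def total_minutes_per_student(adapter, course_id):
    """Core-logic style function that depends only on the interface."""
    totals = {}
    for session in adapter.sessions(course_id):
        name = session["student"]
        totals[name] = totals.get(name, 0) + session["minutes"]
    return totals


adapter = InMemoryMoodleAdapter([
    ("ds", "ana", 600), ("ds", "ana", 300), ("ds", "luis", 120), ("os", "ana", 60),
])
print(total_minutes_per_student(adapter, "ds"))
```

Because `total_minutes_per_student` only sees the interface, supporting another platform in this sketch would mean adding one more subclass and nothing else.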

Using this source data, ElWM launches a prebuilt KDD process for computing the answer to the request (Fig. 5, label 6). This KDD process uses data mining algorithms provided by third-party data mining suites. The results of these processes are returned to the core logic as raw data. This raw data is processed by dynamic web server pages (Fig. 5, label 7), which render it in a form that is easier to understand, for instance, as a pie chart.

This reference architecture supports the variability specified by the feature model of the previous section. For example, new e-learning platforms can be added by simply implementing a new adapter component.


Fig. 6. Static HTML form for requesting queries.

This reference architecture is not the specific architecture of any product; therefore, it is neither operative nor deployable. Before it can be used, it must be instantiated according to the feature selection specified by each customer. For instance, if the customer does not use WebCT, the corresponding adapter component must simply be removed. This configuration process must be automated if the Software Product Line is to produce benefits.

The next section describes how to define the rules that specify how to instantiate the architecture according to a set of selected features.

C. Automating the Process of Instantiation of a Reference Architecture

As commented above, the main goal of a Software Product Line is that, once a customer selects the features he or she wants to include in a specific product, the construction of this product is performed as automatically as possible. To achieve this goal, the last step is to specify what actions must be performed on our reference architecture in order to adapt it to a certain set of features (Fig. 1, label 2). For example, Fig. 6 shows the web form that a user would use to launch queries on a product in which the three types of questions were included. If we do not wish to include one of them, for example, the query about the sessions’ profile, we should remove that part. This part should be removed from both the graphical interface and the application logic.

These modifications can be performed by parameterising, by means of code generation templates, the different artifacts that comprise ElWM. These templates allow us to specify whether a certain piece of code will appear in the final version of the artifact or not, depending on the values assigned to certain input parameters. In our case, these input parameters are the set of features we want to include in a particular product. These code generation templates were implemented using the Epsilon language [17].

Fig. 7. Template for the generation of customised HTML pages.

Fig. 7 displays part of the code generation template used to parameterise the web form of Fig. 6. It works similarly to a JSP page.

The code between the escape characters [% and %] is written in Epsilon, which regulates the text produced as a result of the execution of the template.

For example, line 2 of Fig. 7 specifies that, if the SessionProfile feature has been selected (featureModel.isSelected(“SessionProfile”)), then the HTML code of lines 2 and 3 corresponding to the button “tools used together” must be included in a particular version of this artifact. If not, the button should not appear on the form.
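The effect of such feature-guarded templates can be imitated in a few lines. The sketch below is a Python stand-in rather than Epsilon code: the fragment list, button labels and form attributes are invented for illustration and do not reproduce the template of Fig. 7.

```python
# Each entry pairs a guarding feature (or None for unconditional text)
# with the HTML fragment it controls. All fragments are hypothetical.
FORM_PARTS = [
    (None, '<form action="query" method="post">'),
    ("ResourcesUsage", '  <button name="q" value="resources">Resources used together</button>'),
    ("SessionProfile", '  <button name="q" value="sessions">Session profile</button>'),
    ("StudentProfile", '  <button name="q" value="students">Student profile</button>'),
    (None, '</form>'),
]


def render_form(selected):
    """Emit only the fragments whose guarding feature has been selected."""
    return "\n".join(html for feature, html in FORM_PARTS
                     if feature is None or feature in selected)


# A product without the ResourcesUsage query gets a form without that button.
print(render_form({"SessionProfile", "StudentProfile"}))
```

The same guard-per-fragment idea would apply to the other artifacts, so that a deselected query disappears from the interface and from the logic together.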

By using templates for code generation, all software artifacts are parameterised: the web interface, the core logic and the data access components.


Fig. 8. Configuration of a feature model for Data Structures course.

After this last step, the Domain Engineering phase ends. As a result of this phase, we do not yet have any concrete artifact that can be deployed. To obtain one, we must select a set of features and use them as input to the code generation templates. These code generation templates will produce the set of artifacts that will be deployed. This process is described in the next section.

IV. APPLICATION ENGINEERING

The Application Engineering phase, whose goal is the construction of specific software products, comprises two main steps (see Fig. 1): (1) the creation of a configuration of the feature model built at the Domain Engineering level; and (2) the execution of the customisation and adaptation process that automatically builds the specific software product. Both steps are explained in the following subsections.

A. Product Configuration

The goal of this first step is to determine what features best meet the needs of a particular customer. In our case, this step involves two main stakeholders: (1) the customer who wants to acquire a specific setting of ElWM; and (2) a data mining expert who assists the customer in the configuration process.

The customer will specify the e-learning platforms which his or her organisation uses, and the queries he or she wants to include. Then, the data mining expert will analyse the source data and suggest the most suitable algorithm for each kind of query.

Fig. 8 shows a configuration of the feature model depicted in Fig. 3. This configuration corresponds to an ElWM setting for a Data Structures course at the University of Cantabria hosted in Moodle. Thus, the Moodle feature is selected and the WebCT feature discarded. Due to the characteristics of this course, the ResourcesUsage query is not of interest, so it is discarded. For the StudentProfile and SessionProfile queries, according to the data mining expert’s opinion, the EM+kMeans algorithm is the most suitable choice. Hence, it is selected. For this algorithm, the data mining expert suggested relying on Weka. Therefore, Weka was selected as the data mining suite for our particular setting.

Once a configuration of a feature model has been created, we must validate it. That is, we must check that no external constraint is violated. The validated configuration is used as input for the code generation templates, as described in the following section.
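The validation step can be sketched as follows in Python. The feature names are those mentioned for Fig. 8; the three constraints are hypothetical examples, since the actual constraints of the feature model are not listed here, and the names CONSTRAINTS and violated_constraints are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

Constraint = Callable[[Set[str]], bool]

# Hypothetical constraints; the real ones depend on the feature model of Fig. 3.
CONSTRAINTS: Dict[str, Constraint] = {
    "at least one platform": lambda f: bool(f & {"Moodle", "WebCT"}),
    "at least one query": lambda f: bool(
        f & {"ResourcesUsage", "StudentProfile", "SessionProfile"}
    ),
    "EM+kMeans requires Weka": lambda f: "EM+kMeans" not in f or "Weka" in f,
}


def violated_constraints(selected: Set[str]) -> List[str]:
    """Return the names of the constraints that the configuration breaks."""
    return [name for name, holds in CONSTRAINTS.items() if not holds(selected)]


if __name__ == "__main__":
    # Configuration described for the Data Structures course.
    data_structures = {
        "Moodle", "StudentProfile", "SessionProfile", "EM+kMeans", "Weka",
    }
    print(violated_constraints(data_structures))
```

A configuration is considered valid when the returned list is empty; only then would it be passed on to the code generation templates.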

B. Automatic Product Derivation

Building a final customised product from a valid configuration model is straightforward, thanks to the automation provided by code-generation techniques. It is only necessary to indicate the configuration we want to use, and the code generation templates will produce the set of personalised software artifacts according to the selected features. These software artifacts are operational and ready to be deployed.
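As a rough illustration of this derivation step, the following Python sketch runs every registered template over one configuration and collects the resulting artifacts. It is only an analogue of the process, not the Epsilon-based implementation; the template functions, file names and generated contents are all assumptions.

```python
from typing import Callable, Dict, FrozenSet

Template = Callable[[FrozenSet[str]], str]


def web_form_template(features: FrozenSet[str]) -> str:
    """Hypothetical template for the web interface."""
    button = "<button>tools used together</button>" if "SessionProfile" in features else ""
    return f"<form>{button}</form>\n"


def data_access_template(features: FrozenSet[str]) -> str:
    """Hypothetical template for the data access component."""
    platform = "Moodle" if "Moodle" in features else "WebCT"
    return f"platform={platform}\n"


# One entry per generated artifact; the file names are illustrative only.
TEMPLATES: Dict[str, Template] = {
    "query_form.html": web_form_template,
    "data_access.properties": data_access_template,
}


def derive_product(features: FrozenSet[str]) -> Dict[str, str]:
    """Apply every template to the configuration and return the artifacts."""
    return {name: template(features) for name, template in TEMPLATES.items()}


if __name__ == "__main__":
    product = derive_product(frozenset({"Moodle", "SessionProfile"}))
    for name, content in product.items():
        print(name, "->", content.strip())
```

The only input that changes from one product to another is the set of selected features, which is why the per-product effort stays low once the templates exist.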

V. RELATED WORK

To the best of our knowledge, there is no other Software Product Line for mining log data from e-learning platforms like the one presented in this article. This is natural, as we are addressing a very specific topic. Thus, we analyse previous work that: (1) has used Software Product Line engineering in the e-learning domain; or (2) uses data mining techniques in the e-learning domain.

Regarding the first point, Oberweis et al. [18] and Pankratius and Stucky [19] used a Software Product Line engineering approach to improve both reusability and maintenance of the contents of virtual courses offered through an e-learning platform. In this case, the features refer to the different materials included in a family of courses. Thus, to create a specific course, the instructor selects what topics he or she wants to include, so that the material for such a course is (automatically) constructed and adapted. This is a very intelligent solution that reinforces our hypothesis that Software Product Line engineering has many potential applications in the e-learning domain, even in certain facets not directly related to the development of software products.

Sierra et al. [20] and Martínez-Ortiz et al. [21] propose an approach similar to ours, but from a different point of view. Their main focus is on building domain-specific languages, using a well-defined language engineering process. The goal is that instructors, using these languages, can develop certain aspects of the e-learning application by themselves, such as the specification of the flow and learning activities [21]. These languages feed a set of code generators which produce the desired e-learning applications [20]. Although this work is similar to our proposal, there are several differences. Whereas Software Product Line engineering aims to increase productivity in the development of similar software systems, the approaches based on domain-specific languages are more oriented towards raising the level of abstraction at which software applications are developed, and these applications do not need to be similar.

Both approaches are complementary. For example, a Software Product Line could be used for controlling the variations existing in a learning process designed by means of a domain-specific language.


Díez et al. [22] use feature trees to analyse and classify software services used to implement e-learning applications. However, they only use feature trees as a mechanism to characterise these services, and not to configure the automatic construction of software. On the other hand, Zhou et al. [23] propose a reference architecture, with a certain degree of modularity and flexibility, to develop e-learning applications. This architecture could be used as the base to construct a Software Product Line for e-learning platforms, reinforcing the usefulness of the Software Product Line approach in this domain.

As mentioned, to the best of our knowledge this is the first work which implements a data mining application following a Software Product Line approach. Thus, we can say we have laid the first stone on this road. We would like to emphasise that data mining is also a domain with a wide range of inherent variability. For example, there are sets of slightly different algorithms that share a common goal, but produce more accurate results or exhibit better performance under certain particular conditions. Therefore, adopting a Software Product Line approach for the development of data mining applications seems an attractive idea.

Regarding the last point, we must mention two interesting papers, Romero and Ventura [24] and Castro et al. [25], which detail and summarise the application of data mining techniques to educational systems. The final objective pursued is to improve the teaching-learning process and to reduce the high dropout rate that occurs in virtual courses. For instance, Hung et al. [26] analyse various patterns of online learning behaviours and make predictions on learning outcomes; Ueno et al. [27] provide instructional messages to learners with the aim of improving the effectiveness of the course; Obsivac et al. [28] predict dropouts from student activity data enriched with data derived from their social behaviour; and Zhang et al. [29] personalise recommendations about the learning contents according to the learning style and way of browsing. Finally, works such as [30] and [31], carried out by two of the authors of this paper, focus on developing user-friendly data mining tools which help instructors to better understand the learning process and to analyse the effectiveness of the course organisation (design, tasks, resources used, and so on), so that they can assess the situation and take the corresponding corrective actions.

VI. DISCUSSION

This article has explained how a Software Product Line for the development of similar but customised data mining applications for e-learning platforms can be constructed. Specifically, we have constructed a Software Product Line in order to illustrate with an example how Software Product Line engineering can provide benefits when developing applications associated with e-learning platforms.

As discussed throughout this article, a Software Product Line approach provides benefits if: (1) a certain number of similar but slightly different applications is expected to be constructed, and (2) the cost of constructing a specific product, that is, the cost associated with carrying out the Application Engineering phase, is as low as possible.

Regarding the second point, in our case, once the configuration which meets an instructor’s needs for a course has been obtained and validated, the generation of the artifacts necessary to deploy the application is performed automatically. As a result, the cost of the Application Engineering phase is practically negligible.

Using the infrastructure described in this article, we have developed different versions of ElWM according to the needs of different virtual courses without experiencing further problems.

Nevertheless, we would like to highlight that a Software Product Line only provides benefits when we are planning to develop a certain number of similar but slightly different software systems. If only a couple of similar applications are going to be developed, this approach will probably not be useful.

In the e-learning domain, due to the huge number of educational institutions existing in the world, each one with its own particular needs, we firmly believe that the adoption of a Software Product Line approach is justified and highly recommended for the development of auxiliary applications for e-learning platforms.

VII. SUMMARY AND FINAL CONCLUSIONS

This article has illustrated by means of an example how Software Product Line engineering can help the development of software applications associated with e-learning platforms. We have used a data mining application which analyses log data from e-learning platforms, called E-learning Web Miner. This tool was developed by two of the authors of this paper in 2010, and it has now been refactored into a Software Product Line.

The main advantage of adopting this approach is that the adaptation process of the product is performed practically without cost, since it is carried out automatically. Furthermore, as the adaptation process is performed by a computer, the human errors typical of manual adaptation are avoided. This helps to assure the quality of the product and avoids the additional costs related to the resolution of these errors.

In order to assess our proposal, several configurations of the ElWM application were implemented to meet the needs of different courses using the Software Product Line developed.

As future work, we will add more features to this Software Product Line and build more components to render the query results in a friendly way. Likewise, we are working on new queries of interest for instructors involved in virtual teaching, such as predicting students’ performance according to the activity performed in the course [31], and on developing tools which help us automate the KDD process [32].

REFERENCES

[1] P. Brusilovsky and E. Millán, “The adaptive web,” in User Models for Adaptive Hypermedia and Adaptive Educational Systems, P. Brusilovsky, A. Kobsa, and W. Nejdl, Eds. Berlin, Germany: Springer-Verlag, 2007, pp. 3–53.

[2] L. Stefani, R. Mason, and C. Pegler, The Educational Potential of e-Portfolios. Evanston, IL, USA: Routledge, Jun. 2007.

[3] E. M. Murphy, Mahara 1.4 Cookbook. Birmingham, U.K.: Packt Publishing, Sep. 2011.


[4] P. Carey, Data Protection: A Practical Guide to UK and EU Law, 2nd ed. Oxford, U.K.: Oxford Univ. Press, Mar. 2009.

[5] A. Rashid, J. C. Royer, and A. Rummler, Aspect-Oriented Model-Driven Software Product Lines. Cambridge, U.K.: Cambridge Univ. Press, 2011.

[6] M. E. Zorrilla and D. García-Saiz, “Business intelligence applications and the web: Models, systems and technologies. Information science reference,” in Mining Service to Assist Instructors Involved in Virtual Education. Hershey, PA, USA: IGI Global Publishers, Sep. 2011.

[7] R. Mazza and V. Dimitrova, “Coursevis: A graphical student monitoring tool for supporting instructors in web-based distance courses,” Int. J. Mach. Studies, vol. 65, no. 2, pp. 125–139, 2007.

[8] L. P. Macfadyen and S. Dawson, “Mining LMS data to develop an early warning system for educators: A proof of concept,” Comput. Educ., vol. 54, no. 2, pp. 588–599, 2010.

[9] R. Hijon and A. Velazquez, “E-learning platforms analysis and development of students tracking functionality,” in Proc. World Conf. Educ. Multimedia, Hypermedia Telecommun., 2006, pp. 2823–2828.

[10] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “Advances in knowledge discovery and data mining,” in From Data Mining to Knowledge Discovery: An Overview, U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Eds. Menlo Park, CA, USA: Amer. Assoc. Artif. Intell., 1996, pp. 1–34.

[11] I. H. Witten, E. Frank, M. A. Hall, and G. Holmes, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed. San Mateo, CA, USA: Morgan Kaufmann, Feb. 2011.

[12] M. R. Berthold et al., “KNIME: The Konstanz information miner,” in Proc. 31st Conf. Gesellschaft Klassifikation, Mar. 2008, pp. 319–326.

[13] I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz, and T. Euler, “YALE: Rapid prototyping for complex data mining tasks,” in Proc. 12th Int. Conf. KDD Mining, Philadelphia, PA, USA, Aug. 2006, pp. 935–940.

[14] M. Zorrilla and D. García-Saiz, “A service oriented architecture to provide data mining services for non-expert data miners,” Decision Support Syst., vol. 52, no. 2, pp. 464–473, 2012.

[15] K. Pohl, G. Böckle, and F. J. van der Linden, Software Product Line Engineering: Foundations, Principles and Techniques. New York, NY, USA: Springer-Verlag, Oct. 2005.

[16] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson, “Feature-oriented domain analysis (FODA) feasibility study,” Software Eng. Inst. (SEI), Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep. CMU/SEI-90-TR-21, Nov. 1990.

[17] L. M. Rose, R. F. Paige, D. S. Kolovos, and F. Polack, “The Epsilon generation language,” in Proc. 4th ECMDAFA, vol. 5095. Jun. 2008, pp. 1–16.

[18] A. Oberweis, V. Pankratius, and W. Stucky, “Product lines for digital information products,” Inf. Syst., vol. 32, no. 6, pp. 909–939, Nov. 2007.

[19] V. Pankratius and W. Stucky, “A strategy for content reusability with product lines derived from experience in online education,” in Proc. Softw. Educ. Training Sessions ICSE, vol. 4309. May 2005, pp. 128–146.

[20] J. L. Sierra, A. Fernández-Valmayor, and B. Fernández-Manjón, “From documents to applications using markup languages,” IEEE Softw., vol. 25, no. 2, pp. 68–76, Mar./Apr. 2008.

[21] I. Martínez-Ortiz, J. L. Sierra, B. Fernández-Manjón, and A. Fernández-Valmayor, “Language engineering techniques for the development of e-learning applications,” J. Netw. Comput. Appl., vol. 32, no. 5, pp. 1092–1105, Sep. 2009.

[22] D. Díez, A. Malizia, I. Aedo, P. Díaz, C. Fernández, and J. M. Dodero, “A methodological approach to encourage the service-oriented learning systems development,” Educ. Technol. Soc., vol. 12, no. 4, pp. 138–148, 2009.

[23] D. Zhou, Z. Zhang, S. Zhong, and Y. P. Xie, “The design of software architecture for e-learning platforms,” in Proc. 3rd Int. Conf. Technol. E-Learn. Digital Entertainment (Edutainment), Jun. 2008.

[24] C. Romero and S. Ventura, “Educational data mining: A review of the state-of-the-art,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 6, pp. 601–618, Nov. 2010.

[25] F. Castro, A. Vellido, Á. Nebot, and F. Múgica, “Applying data mining techniques to e-learning problems,” in Evolution of Teaching and Learning Paradigms in Intelligent Environment (Studies in Computational Intelligence), vol. 62, J. Kacprzyk, L. Jain, R. Tedman, and D. Tedman, Eds. New York, NY, USA: Springer-Verlag, 2007, pp. 183–221.

[26] J. L. Hung and K. Zhang, “Revealing online learning behaviors and activity patterns and making predictions with data mining techniques in online teaching,” J. Online Learn. Teaching, vol. 8, no. 4, pp. 426–436, Dec. 2008.

[27] M. Ueno and T. Okamoto, “Bayesian agent in e-learning,” in Proc. 7th ICALT, Niigata, Japan, Jul. 2007, pp. 282–284.

[28] T. Obsivac, L. Popelinsky, J. Bayer, J. Geryk, and H. Bydzovska, “Predicting drop-out from social behaviour of students,” in Proc. EDM, 2012, pp. 103–109.

[29] L. Zhang, X. Liu, and X. Liu, “Personalized instructing recommendation system based on web mining,” in Proc. 9th ICYCS, 2008, pp. 2517–2521.

[30] D. García-Saiz and M. Zorrilla, “E-learning web miner: A data mining application to help instructors involved in virtual courses,” in Proc. 4th Int. Conf. EDM, Jul. 2011, pp. 323–324.

[31] D. García-Saiz and M. Zorrilla, “A promising classification method for predicting distance students performance,” in Proc. 5th Int. Conf. Educ. Data Mining, 2012, pp. 206–207.

[32] R. Espinosa, D. García-Saiz, J. J. Zubcoff, J. Mazón, and M. Zorrilla, “Towards the development of a knowledge base for realizing user-friendly data mining,” in Proc. 6th Metadata Semantics Res. Conf., 2012.

Pablo Sánchez Barreiro is an Associate Professor with Universidad de Cantabria. His principal research interests are in aspect-oriented software development, model-driven development, and software product lines. He has been an Active Member of the AOSD Network of Excellence and the AMPLE European projects. His work can be found at conferences like MODELS, ECMDA, ECSA, or SLE, and international journals, such as Information and Software Technology. He has been involved in events, such as the AOM, MOMPES, or MoDRE workshop series. He has also reviewed articles for journals, such as Software and System Modeling and Journal of Systems and Software.

Diego García-Saiz has been a Computer Science Engineer since 2010. He is currently pursuing the Ph.D. degree with the University of Cantabria. His research field is data mining applied to educational contexts. He has several publications, notably his article in the Decision Support Systems journal. His other fields of interest are software engineering and databases.

Marta Elena Zorrilla Pantaleón is an Associate Professor of Computer Science with the University of Cantabria, Spain, where she received the bachelor’s degree in telecommunication engineering and the Ph.D. degree in computer science in 1994 and 2001, respectively. She has participated in and managed more than 20 research projects, most of them with companies. She has authored a database book and more than 50 works published in international journals, books, and conferences. She is an active reviewer of several international journals and conferences (DSS, IJCSA, IEEE-Education, IEEE-RITA, SCI, and BEWEB). Her research interests are the design and development of information systems and intelligent systems for companies, and, inside the educational area, the application of data mining techniques and OLAP technologies to analyze and improve the teaching-learning process.