Universidad de Chile
Facultad de Ciencias Físicas y Matemáticas
Departamento de Ciencias de la Computación

Improving Learning-Object Metadata Usage during Lesson Authoring

by

Olivier Motelet

Advisors: Nelson Baloian, José A. Pino

Committee: Sergio Ochoa
           Rosa Alarcon (DCC, Pontificia Universidad Católica de Chile)
           Erik Duval (External Professor, Dept. Computerwetenschappen, Katholieke Universiteit Leuven, Belgium)

This work has been supported in part by MECESUP Project, Grant UCH0109, Chile

Santiago - Chile
October 2007

Abstract

In order to achieve coherence and flexibility in multimedia-based learning units, many authors recommend that the lesson components be structured as a graph. In a lesson graph, educational resources are encapsulated into learning objects (LO) along with their respective metadata (LOM) and are interconnected through different kinds of rhetorical and semantic relationships. Lesson graph LOs are stored within repositories where their metadata is used to ease their retrieval. Nevertheless, such systems face serious problems with LOM usage: metadata is difficult to instantiate, and lesson authors are generally reluctant to perform such tedious tasks since they do not benefit from the metadata they generate.

Generating metadata automatically solves this problem. Nevertheless, this approach is limited to a restricted set of LOM attributes that generally excludes subjective metadata such as the educational characteristics of LOs. This limitation motivates this thesis to focus on a complementary method: a hybrid approach based on the synergy between automatic processes and human interventions. Hybrid LOM generation can be applied to metadata attributes that are not automatically generated. Nevertheless, it still relies on the contribution of generally uncooperative users, who need to see benefits in order to participate.

We propose to study LOM usage during lesson authoring not only from the perspective of hybrid LOM generation but also from the perspective of the benefits resulting from generating LOM. This strategy aims to support a positive feedback loop in which the benefits motivate good-quality LOM generation while good-quality LOM increases these benefits.

In particular, this thesis investigates methods for (1) seamlessly integrating hybrid LOM generation and hybrid LOM validation in a lesson authoring tool, (2) processing LOM even when some metadata attribute values are missing, incomplete, or incorrect, and (3) enhancing the performance of classical LO retrieval methods using the metadata of authored-lesson LOs.

We developed an open-source lesson authoring tool, LessonMapper2, for testing the feasibility and validity of the proposals of this thesis. Preliminary experiments show that LOM generated by users significantly enhances LO retrieval during lesson authoring.

Resumen

To achieve coherence and flexibility in learning units based on multimedia documents, several authors have recommended structuring course components as graphs. In a course graph, educational resources are encapsulated as learning objects (LO) with their respective metadata (LOM, Learning-Object Metadata) and are interconnected through relations of various rhetorical and/or semantic types. Resource graphs are stored in repositories in which metadata serves to ease their retrieval and reuse. However, such systems face serious problems regarding LOM usage: metadata is difficult to instantiate, and course authors generally have no incentive to carry out this tedious task since they themselves do not benefit from the metadata they generate.

Automatic metadata generation solves this problem. However, this method is limited to certain metadata and excludes most subjective metadata, such as educational metadata. This limitation motivated the focus of this thesis on a complementary technique: a hybrid method based on the synergy between automatic processes and human intervention. Hybrid LOM generation can be applied to the attributes that cannot be generated automatically. However, this approach relies on the contribution of users who are not always cooperative and who would need to see benefits to motivate their participation.

We propose to study the uses of LOM during course creation, not only from the perspective of hybrid generation but also from the perspective of the benefits that LOM can provide. This strategy aims to support a positive feedback loop in which the benefits can motivate the generation of good-quality LOM, and good-quality LOM can increase the benefits.

In particular, this thesis investigates methods for (1) seamlessly integrating hybrid LOM generation into a course authoring tool, (2) processing a set of LOM even when some metadata remain incomplete, incorrect, or missing, and (3) improving the results of classical LO retrieval methods using the metadata of the LOs that compose a course.

We developed an open-source tool to validate the proposals of this thesis. Preliminary experiments showed that LOM can significantly improve the retrieval of additional LOs during the course authoring process.

Acknowledgments

I would like to gratefully acknowledge the supervision of Prof. José A. Pino, for his constant support and understanding, and of Nelson Baloian, for his optimistic guidance and his genuine sympathy.

I thank Georges Dupret, Sergio Ochoa, Rodrigo Paredes, and Benjamin Piwowarski for their great scientific and human collaboration. I am also grateful to my colleagues Karina Figueroa, Johan Fabry, Gilberto Gutierrez, Matthieu Morrel, Andres Neyem, Daniel Perovic, and Andres Vignaga for their friendship and advice.

This PhD thesis was the opportunity for me to live and learn with the very dear friends who lovingly bear my daily presence: Karla Anavalon, Sylvie Lamarre, Guillaume Pothier, Cristian and Cristopher Segura, Roberto Smith, and Eric Tanter. It is a privilege to know them.

I am also sincerely grateful to my parents, my sister, and Matisse. First, because they are the models of perseverance, faith, tranquility, and love that I attempt to imitate in this life. Second, because they are traveling from so far away to share with me the intense moments decorating this spring month.

I am always impressed with life’s patience to teach me how to walk, while I persist in walking like an inebriate. Hand in hand with my wife, we mutually support each other in finding balance. Thank you so much, Ina, for sharing this colorful adventure with me.

Contents

Abstract
Resumen
Acknowledgments

1 Introduction
  1.1 The Problem
    1.1.1 Facts
    1.1.2 The Problem
  1.2 Objectives
  1.3 Outline

2 Basic Concepts
  2.1 Lesson Authoring Context
  2.2 Learning Objects
    2.2.1 Origins
    2.2.2 Learning-Object Definition
  2.3 Learning-Object Organization
    2.3.1 Learning-Object Hierarchy Models
    2.3.2 Learning-Object Sequencing Models
    2.3.3 Visual Structures for Learning Objects
  2.4 Learning-Object Indexing
  2.5 Learning-Object Repositories
  2.6 Summary

3 Learning-Object Metadata
  3.1 IEEE LOM
  3.2 LOM Characteristics
    3.2.1 Data Types of LOM Elements
    3.2.2 Educational Metadata
    3.2.3 Objectiveness and Context Dependency of LOM elements
  3.3 LOM Usage
    3.3.1 LOM Creation
    3.3.2 State of existing LOM Instances
    3.3.3 LOM Usefulness
    3.3.4 Conclusion
  3.4 Supporting LOM Generation
    3.4.1 Templates and Editors
    3.4.2 Information Sources for automatic LOM Generation
    3.4.3 Generation Techniques based on Content Analysis
    3.4.4 Generation based on Context Analysis
    3.4.5 Conclusion
  3.5 Supporting LOM Validation
    3.5.1 LOM Completeness
    3.5.2 LOM Correctness
    3.5.3 LOM Accessibility
    3.5.4 Conclusion
  3.6 Supporting Learning-Object Retrieval with LOM
    3.6.1 Keyword Queries for Learning Object Retrieval
    3.6.2 Semantically Rich Queries
    3.6.3 Query Result Processing
    3.6.4 Conclusion
  3.7 Summary

4 The Learning-Object Metadata Usage Problem
  4.1 Problem Statement
  4.2 Existing Approaches
  4.3 Remaining Issues
  4.4 Work Hypothesis
  4.5 Work Strategy
  4.6 Case Study
  4.7 Conclusion

5 Introducing Human LOM Usage into Lesson Authoring
  5.1 Organizing a Lesson with Learning Objects
  5.2 Lesson Graph
  5.3 Learning-Object Graph
    5.3.1 Existing Types for the relation LOM Category
    5.3.2 Type Proposal for the relation LOM Category
  5.4 Authoring a LO Graph
    5.4.1 LO as Node
    5.4.2 Nested LO Graphs
  5.5 LOM and LO Graphs
    5.5.1 Instantiating LOM during Lesson Authoring
    5.5.2 Characterizing LOs with LOM
  5.6 LO Sharing and Retrieval in a LO Graph
    5.6.1 Seamless Sharing of LOs when Authoring a Lesson
    5.6.2 Seamless Sharing of LOs when Modifying Colleague’s Lesson
    5.6.3 Seamless Retrieval of LOs during Lesson Authoring
  5.7 System Deployment
  5.8 Conclusion

6 Processing LOM during Lesson Authoring
  6.1 Classical Approach
  6.2 Diffusion-based approach
    6.2.1 Context Diffusion
    6.2.2 Conceptual Model
    6.2.3 Framework
  6.3 Generic Propagation Protocol
    6.3.1 CMV Update Process Characteristics
  6.4 CMV Model Examples
    6.4.1 Attribute Similarities and Value Suggestion
    6.4.2 Graph Consistency and Value Restriction
  6.5 Context Diffusion Impact
  6.6 Framework Implementation
    6.6.1 Avoiding Propagation Side-Effect
    6.6.2 Java Library
  6.7 Conclusion

7 Hybrid LOM Generation and Validation
  7.1 Hybrid LOM Generation
  7.2 Hybrid LOM Validation
  7.3 System Deployment
  7.4 Conclusion

8 Rewarding LOM Generation
  8.1 Using LOM for supporting Lesson Design
  8.2 Using LOM for Enhancing LO Retrieval
    8.2.1 Querying a LO Repository from within the Lesson Graph
    8.2.2 Evaluating Potential Results using the authored-Lesson Semantics
    8.2.3 Experiments
    8.2.4 Machine Learning System for combining Classifiers
    8.2.5 Performance Analysis
    8.2.6 Conclusion
  8.3 System Deployment
  8.4 Conclusion

9 Contributions and Perspectives
  9.1 LOM Generation and Validation Integration
    9.1.1 Contributions
    9.1.2 Limitations and Future Work
  9.2 LOM Processing
    9.2.1 Contributions
    9.2.2 Limitations and Future Work
  9.3 LOM Benefits
    9.3.1 Contributions
    9.3.2 Limitations and Future Work
  9.4 Conclusion

A Suggestion Probabilities
B Restriction Rules
C Restriction Rule Definition
D Experiment Data
E RankBoost Results

Publications of this Thesis
  Book Chapter
  International Journal
  International Conferences
  Doctoral Consortium

References

List of Figures

1.1 IEEE Learning-Object Metadata specification

2.1 Search interface for the Merlot learning object repository

3.1 Distribution of data types in the IEEE LOM specification
3.2 Interface of Reload, a multi-format editor for LOM

4.1 Human-based usage of learning-object metadata
4.2 Automatic usage of learning-object metadata
4.3 Hybrid usage of learning-object metadata

5.1 Start of a lesson graph about “object instantiation”.
5.2 The relation category in the IEEE LOM specification
5.3 Start of a LO graph about “object instantiation”
5.4 Simple LO graph with LessonMapper2
5.5 Nested LO graph in LessonMapper2
5.6 Simultaneous edition with LessonMapper2
5.7 Edition “in comparison” with LessonMapper2
5.8 Visual Characterization of some metadata attributes of the Edad en dias LO.
5.9 Query node with title constructor and associated results
5.10 LessonMapper2 deployment model

6.1 Conceptual model for systems taking advantage of lesson graph semantics
6.2 Conceptual model using context diffusion for taking advantage of lesson graph semantics
6.3 Context diffusion-based framework for taking advantage of lesson graph semantics
6.4 Extract of the LO graph shown Figure 5.3.
6.5 Suggestion weights for general/keyword attribute.
6.6 Suggestion update process.
6.7 Extract of the LO graph shown Figure 5.3.
6.8 Restriction interval update process.
6.9 Restriction rule context-free grammar

7.1 Suggestions and restriction displayed into the LOM editor
7.2 Displaying the validation state of the metadata of a LO with LessonMapper2.
7.3 LessonMapper2 deployment model including rule customization

8.1 Start of a LO graph about “object instantiation” (from Figure 5.3)
8.2 Top – Recall-precision for the original classifiers based on restriction compliance. Bottom – Recall-precision for the mixed classifiers based on restriction compliance.
8.3 Top – Recall-precision for the original classifiers based on suggestion similarity. Bottom – Recall-precision for the mixed classifiers based on suggestion similarity.
8.4 Precision-recall results for Lucene, context-only diffusion, and full diffusion
8.5 Precision-recall results of Lucene and the RankBoost combinations.
8.6 Box-plots of the improvement of RankBoost (context-only diffusion version) over Lucene
8.7 Cumulated hypothesis generated by the context-only diffusion version of RankBoost for all the situations
8.8 Box-plots of the improvement of context-only diffusion and full-diffusion version of RankBoost.
8.9 Box-plots of the improvement of context-only diffusion and topology version of RankBoost.
8.10 LessonMapper2 deployment model including search module

A.1 Probabilities for a certain LOM attribute to have the same value in two related LOs. The nature of the relation linking the LOs is defined in horizontal axis.

B.1 Restriction rules for LOM attribute values based on the values of related LOs. These rules depend on the relation type relating the LOs.

C.1 Restriction rule context-free grammar
C.2 XML configuration of some restriction rule examples

D.1 Help sheet for interviewed people.
D.2 First LO graph that the interviewed teachers had to complete.
D.3 Second LO graph (containing the first one) that the interviewed teachers had to complete.
D.4 Third LO graph (containing the second one) that the interviewed teachers had to complete.
D.5 Fourth LO graph (containing the third one) that the interviewed teachers had to complete.
D.6 Interview sheet for each LO graph.

E.1 Hypothesis of the context-only diffusion version of RankBoost for situations 5, 9, and 13.
E.4 Hypothesis of the context-only diffusion version of RankBoost for situations 1, 5, and 9.
E.5 Hypothesis of the full-diffusion version of RankBoost for all situations
E.6 Hypothesis of the topology diffusion version of RankBoost for all situations

List of Tables

2.1 Examples of hierarchy models for organizing LOs

3.1 LOM consistency recommendations

5.1 Proposal of relation types.

6.1 Examples of restriction rules.
6.2 Restriction update process for single-element attributes.
6.3 Restriction update process for element set attributes.

8.1 T-tests on precision differences between RankBoost and Lucene

Chapter 1

Introduction

The last decade has witnessed the creation of various specialized databases storing digital entities intended to support learning and teaching situations. These databases are generally called learning-object repositories, where learning object (LO for short) refers to digital entities that have an educational purpose. One of the main motivations for such LO repositories is to allow the stored LOs to be reused by as many people as possible (Wiley, 2000). To make this possible, the characteristics of the LOs should be exposed so that other people can locate and retrieve them. Two critical issues in this process are (1) how to describe an educational resource so that it can be retrieved and (2) how to search for resources in order to find those that really match the needs of potential users.

Document retrieval is largely dominated today by search engines like Google or Yahoo! (Google, 2007; Yahoo!, 2007). Basically, these systems automatically index documents by mining their textual content. The algorithms used for analyzing and indexing documents are so efficient that they make it possible to retrieve documents among billions of Web pages in a few milliseconds. This seems to make text-mining systems the ideal tools for supporting LO retrieval. However, several factors make these systems inapplicable for retrieving learning material in a dedicated repository. First, learning material is not only text-based; it may be a video, a picture, an audio file, a simulation program, or a file with another structure that is difficult to mine efficiently with content analysis methods (see the content of (Merlot, 2007) for a sample). Second, educational resources may be copyrighted and their access restricted. In this case, documents cannot be indexed since their content is not available. Third, the LO retrieval process should also deal with information about the usage of a LO, such as its educational context, its pedagogical methodology, and/or its interactivity (Hiddink, 2001). This kind of data is generally not contained in the document itself, and text-mining systems cannot consider such implicit information. An alternative method to document content analysis is necessary.

The use of metadata for describing the educational contribution of a learning object is a widely accepted approach for classifying educational material (IMS, 2007; SCORM, 2007; Hiddink, 2001). Metadata is data about data. Typically, the metadata of a document describes its relationships with other documents, its content, its life cycle, its technical characteristics, its potential usage, and its expected position inside higher-level processes. Kim (2005) defines metadata as data which “helps the user to understand the semantics and lineage of stored data and to properly run the applications in support of the business needs”. In order to remain independent from the format and accessibility of the learning material to which it relates, metadata should be stored as an individual file. Metadata can also record information not present in the described document. Metadata with these features can be used to solve the problems that arise when trying to use indexing engines to find educational material.

Metadata for educational resources aims at facilitating the retrieval and reuse of learning material. Standard document metadata specifications such as basic Dublin Core (DC) (DublinCore, 2007) do not completely satisfy this requirement. Although DC attributes holding content data such as authors, title, or granularity are definitely useful for describing educational resource content, DC does not contain any field for describing the pedagogical aspects of a document, such as its educational purpose, its intended learner profile, its difficulty, or its interactivity type. In order to cope with educational concerns, various metadata sets were defined, such as the DC Educational extension (DublinCoreEdu, 2007), CANCORE (Cancore, 2007), IMS Metadata (IMS, 2007), SCORM Metadata (SCORM, 2007), and IEEE Learning Object Metadata (LOM) (LOM, 2002). Among these specifications, IEEE LOM is commonly emphasized as the standard reference from which the other specifications derive.

1.1 The Problem

1.1.1 Facts

The LOM specification contains almost 60 attributes (Figure 1.1). These elements cover various aspects of the learning material including data about general content, life cycle, meta-metadata, technical characteristics, educational usage, rights, relations, annotation, and classification. Nevertheless, the following four facts about LOM usage currently stand out:

Figure 1.1: IEEE Learning-Object Metadata specification

Generating LO metadata is a tedious task. Manually instantiating LO metadata means generating values for a large number of metadata attributes (see Figure 1.1). Instantiating metadata may be a straightforward task for some objective attributes such as author, technical format, or language. They can even be generated automatically if the LO content format or the setting in which the LO is used enables it (Downes, 2004; Duval and Hodgins, 2004a; Simon et al., 2004; Brooks et al., 2006). However, subjective attributes such as the description, the coverage, the interactivity type, the semantic density, the difficulty, the intended end-user role, or the typical learning time of a LO remain complex to instantiate automatically. Moreover, they often require a serious evaluation effort from the user defining their values (Hodgins, 2001; Kabel et al., 2003).

LO authors are generally responsible for generating LO metadata. Correct instantiation of LO metadata requires combined educational and technical skills (Kabel et al., 2003). For that reason, third-party professionals dedicated to generating LO metadata are generally too costly to be hired in all educational institutions. Moreover, studies show that LO authors cannot be completely replaced by those metadata professionals (Currier et al., 2004). Therefore, LO authors are generally responsible for generating LO metadata.

Produced metadata are generally incomplete or missing; they may also be incorrect. Surveys about metadata usage (Friesen, 2004; Currier et al., 2004; Heath et al., 2005; Najjar and Duval, 2006) confirm that metadata instantiation is a difficult task which is generally ignored or done in a hurry. Authors frequently leave default values in place, which may not correspond to the real values.

LO authors receive little direct reward for the metadata they generate. Generating metadata is mostly a selfless concern for LO authors: only learners or other instructional designers looking for specific LOs benefit from the metadata generated by the authors of the retrieved LOs. Tools supporting LO and metadata authoring do not provide a direct reward to the LO authors (Currier et al., 2004). Moreover, when searching for a LO, most users do not specify metadata attributes in their queries, but limit them to keyword sets (Najjar et al., 2005). In fact, LOM is mostly used for characterizing the search results.
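
The contrast drawn in these facts between objective attributes, which are easy to fill, and subjective attributes, which are often left empty, can be illustrated with a minimal sketch. The attribute keys below are a hypothetical, simplified subset loosely named after the elements mentioned in the text; they are not the official IEEE LOM element names, and the completeness measure is an assumption made for illustration, not a method taken from this thesis.

```python
from typing import Dict, List, Optional

# Hypothetical, simplified attribute keys in "category/attribute" form.
# The objective/subjective split follows the examples given in the text.
OBJECTIVE: List[str] = ["lifecycle/author", "technical/format", "general/language"]
SUBJECTIVE: List[str] = [
    "general/description",
    "general/coverage",
    "educational/interactivity_type",
    "educational/semantic_density",
    "educational/difficulty",
    "educational/intended_end_user_role",
    "educational/typical_learning_time",
]


def missing_attributes(record: Dict[str, Optional[str]], attributes: List[str]) -> List[str]:
    """Return the attributes that are absent, None, or blank in the record."""
    return [a for a in attributes if not (record.get(a) or "").strip()]


def completeness(record: Dict[str, Optional[str]], attributes: List[str]) -> float:
    """Fraction of the given attributes that carry a non-empty value."""
    if not attributes:
        return 1.0
    return 1.0 - len(missing_attributes(record, attributes)) / len(attributes)


# Invented example record: objective attributes filled, most subjective ones not.
record: Dict[str, Optional[str]] = {
    "lifecycle/author": "A. Teacher",
    "technical/format": "application/pdf",
    "general/language": "es",
    "general/description": "",
    "educational/difficulty": "medium",
}

print("objective completeness:", completeness(record, OBJECTIVE))
print("subjective completeness:", round(completeness(record, SUBJECTIVE), 2))
print("missing subjective attributes:", missing_attributes(record, SUBJECTIVE))
```

Such a count only detects missing values. It cannot tell whether a filled value is a default left untouched or an incorrect one, which is part of what makes the metadata quality problem described above hard to address.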

1.1.2 The Problem

Teaching communities are generally interested in sharing material between their members (Recker, 2006). Nevertheless, this interest does not focus on using entire lessons prepared by other teachers, because most teachers prefer that their lessons reveal their own teaching style. Instead, lesson authors look to reuse small pieces of material, e.g., a meaningful example of a certain topic or an interesting diagram. Such fine-grained educational resources are better suited to the teaching style of the teacher than coarse-grained material (Wiley, 2001b).


Lesson authors need to be able to retrieve such fine-grained LOs in order to reuse them. If we consider LOM as the base indexation medium for LOs, the LOM of the shared educational material should be properly instantiated to enable LO reuse during lesson authoring. However, according to the previous hypothesis, LOM usage during lesson authoring is currently caught in a vicious circle: Generating LOM is a tedious task imposed on lesson authors. Consequently, the quality of the LOM they produce is generally low. Since the inaccuracy of these metadata limits their use in beneficial applications, the lack of rewards decreases the motivation of lesson authors to generate LOM. Therefore, LOM usage during lesson authoring should first be improved in order to enable teacher communities to share and reuse LOs indexed with LOM.

1.2 Objectives

The main goal of this thesis is to study the possibility of technically improving LOM usage during lesson authoring.

The literature shows that manual metadata instantiation is a tedious process for lesson authors. Although automatic generation systems provide the best solution to the LOM generation problem, these technologies do not work accurately with attributes having subjective values such as the description, the coverage, the interactivity type, the semantic density, the difficulty, the intended end-user role, or the typical learning time. For these attributes, a hybrid approach integrating both human and automatic processes appears to be the best strategy (Greenberg, 2004). This thesis focuses on this hybrid approach in order to study LOM usage during lesson authoring.

From a technical perspective, improving LOM usage during lesson authoring involves not only supporting users in generating LOM during lesson authoring, but also rewarding them for this effort. For this reason, the specific objectives of this thesis consist of studying the following three research questions:

RQ1 How to seamlessly integrate hybrid LOM generation and hybrid LOM validation into a lesson authoring tool? In order to answer this research question, this thesis studies LOM generation and validation interfaces suiting the lesson authoring context. In particular, these interfaces should use automatic systems for facilitating metadata generation and metadata validation.

RQ2 How can processes dealing with LOM semantics cope with incomplete, missing, or incorrect metadata? Since hybrid generation of LO metadata depends on automatic processes, this thesis investigates LOM processing techniques in the lesson authoring setting. In particular, it focuses on processing the LO context, i.e., the metadata of the other lesson LOs. This approach is reasonable during lesson authoring where LOs are seldom used in isolation. Nevertheless, some lesson LO metadata may also be missing, incomplete, or incorrect during lesson authoring. In order to cope with this situation, we propose to extend the web metadata propagation model of (Marchiori, 1998) with the rich semantics of LOM and the definition of sophisticated rules for processing these metadata.

RQ3 How can lesson authors benefit from the LOM they generate? The hybrid approach also involves the contribution of lesson authors. However, there are currently no direct rewards encouraging their participation in this process. This thesis studies the use of LOM for facilitating lesson design and for enhancing LO retrieval during lesson authoring.

1.3 Outline

Chapter 2 introduces the basic concepts related to educational material reuse: The lesson authoring context and LO-related technologies are presented.

Chapter 3 describes the state of the art related to LOM usage. In particular, it focuses on LOM generation, LOM validation, and the use of LOM for querying LO repositories.

Chapter 4 summarizes the existing LOM usage issues. It also states the hypothesis and research questions of the thesis.

Chapter 5 introduces lesson graphs as favorable structures for authoring lessons based on LOs characterized with LOM. A tool for authoring such lesson graphs is introduced and described. Then, human-based LOM usage in such an environment is discussed.

Chapter 6 introduces a formal model for analyzing the metadata of lesson graph LOs and taking advantage of their semantics. A context diffusion method is proposed in order to cope with the problem of incomplete or missing metadata during LOM graph analysis. This method aims at answering RQ2. Two applications of the model are also described.

Chapter 7 discusses the use of the LOM processing model presented in Chapter 6 for enabling hybrid LOM usage during lesson authoring. In particular, the integration of hybrid LOM generation and hybrid LOM validation into lesson authoring is explored in order to answer RQ1.

Chapter 8 deals with rewarding lesson authors for generating LOM, thus answering RQ3. First, the use of LOM for facilitating lesson design is discussed. Next, a system based on our LOM processing model is proposed for supporting LO retrieval during lesson authoring. Preliminary experiments comparing the performance of this approach with that of classical information retrieval systems are described and analyzed.

Chapter 9 discusses the contributions and perspectives of this thesis.

Chapter 2

Basic Concepts

Learning Object (LO) is the expression of an emerging fusion between two separate domains: (1) the educational domain (learning) and (2) the software engineering domain (objects). Because of the distance between these two domains, the literature contains contradictory points of view on the LO concept: “there is no common definition of terms and related work rarely reference or build upon one another” (Wiley, 2007). This section attempts to define the terms and concepts related to LOs from the perspective of the particular context of this research: the lesson authoring process. These definitions underpin the remainder of this thesis.

2.1 Lesson Authoring Context

During the lesson authoring process, lesson authors (e.g., teachers, instructional designers) may benefit from the work of their colleagues. For instance, they could integrate a relevant example or an interesting simulation that was done by another teacher into their lesson.

(Recker, 2006) summarizes the judgment of teachers about authoring lessons with shared resources into 5 categories:

Ease Teachers generally highlight the benefits of using resources shared in a repository compared with web-based search of external resources.

Enrichment Many teachers appreciate online resources for enriching classroom activities. In particular, they generally value the use of interactive and engaging online resources to provide meaningful learning activities for students, or to provide supplemental information to students.

Research Teachers generally think that online resources are useful for supporting research. They describe the important role of online resources in furthering their content knowledge as well as their teaching knowledge.

Networking Few teachers mention the importance of using the network to find out what other teachers are doing.

Barriers Nonetheless, many teachers comment on difficulties associated with using online resources. Many of these barriers were associated with technical problems (e.g., insufficient access, draconian filters, outdated technology). Others mentioned problems with managing and sifting through the large amount of content available on the Web.

(From (Recker, 2006))

Sharing and reusing resources thus appears to be a concrete advantage for teachers, although the technical aspects surrounding these practices still limit them. A group of teachers sharing and reusing their resources can be seen as a community of practice in the sense of (Wenger, 1998). This community may be as small as a group of teachers of a single university. It may also be large, as with the communities using open LO repositories like Merlot or EdNA.

From the perspective of a community, sharing resources is not only a problem of accessing these resources but “it involves a process of appropriation. To appropriate a resource the new user must imagine how the designers intended it to be used and adapt their practice accordingly, or else must conceive of some new use for the material. Thus, appropriation may be easy across related contexts but can involve considerable creativity in other situations.” (Oliver, 2005).

(Fill et al., 2006) describes an attempt to re-purpose a particular learning activity, developed in one university and taken up enthusiastically by two others. The experience confirms the theory: “Teachers are highly unlikely to reuse materials created by others without extensive investigation and a varying amount of modification to satisfy their own perceptions of the context, learner attributes and motivation, aims and desirable outcomes. This is part of pedagogic expertise and goes to the heart of what it means to be a teacher.” (Fill et al., 2006).

In addition to an appropriation process, the reuse of educational material implies a kind of re-contextualization. For instance, it could involve the definition of new learning material in order to introduce the reused educational resource, or the modification of the reused material in order to suit the style of the authored lesson. For these reasons, this thesis defines the reuse of educational material during the lesson authoring process as:


Definition 2.1 (Educational-Material Reuse) Re-contextualization of an existing educational material in order to fit a new lesson in which it will be used.

2.2 Learning Objects

2.2.1 Origins

In 1994, Wayne Hodgins introduced the concept of learning objects in naming the CedMa (Cedma, 2007) working group “Learning Architectures, APIs and Learning Objects”. His primary intentions were made explicit in (Hodgins and Conner, 2000):

“All LEGO blocks [, the plastic-brick assembly game for children,] adhere to one absolute standard for pin size. Every LEGO piece, no matter what shape, color, size, age, or purpose can always be snapped together with any other piece because of their uniformly shaped pins. This allows children of all ages to create, deconstruct, and reconstruct LEGO structures easily and into most any form they can imagine. If we map this to the world of learning content, we start to see the opportunities that would result if we were able to have the same standards and capabilities to reuse and assemble or disassemble content drawn from any source at any time.”

The LEGO model has been widely criticized since it was proposed. (Friesen, 2003) describes three of the most popular criticisms about LOs: (1) The community of interest seems incapable of reaching agreement on a common set of terms. (2) Specifications and standards related to LOs are almost completely technical and fail to directly engage pedagogy. (3) The major influence of military and industrial training purposes on the specifications and standards makes them unsuitable for public and higher education. Friesen summarizes, “objects and infrastructures for learning cannot simultaneously be both pedagogically neutral [so that they could be reused in various pedagogical contexts] and pedagogically valuable.”

(Wiley et al., 2004) insists on the influence of context on LOs: For a LO to be easily reused in various educational contexts, its original context should have no strong influence on its content. Thus, (Wiley et al., 2004) shows that: “a paradox arises because learning theorists are increasingly emphasizing the preeminence of context in learning, using language such as social context (Vygotsky, 1981); cultural, historical, and institutional setting (e.g., (Wertsch, 1993)), and situatedness (e.g., (Lave and Wenger, 1991; Jonassen, 1991)).” (Jonassen and Churchill, 2004) confirms this issue by saying that the LO approach is supporting outdated constructivist ways of thinking about teaching and learning.

In response to these criticisms, some research efforts have tried to look for a better metaphor for LOs than the LEGO model. For instance, (Wiley, 1999; Norman, 2004) suggests a molecular metaphor: Learning objects can be seen as small molecules containing educational material content. According to their semantic and structural makeup, these molecules have stronger affinities for binding with some learning objects and weaker affinities for binding with others. (Wiley and Edwards, 2002) makes another proposal with a brick-and-mortar metaphor. In this metaphor, LOs are considered as small bricks containing learning material. Coming in a variety of shapes and sizes, these bricks are difficult to assemble in a meaningful way without some kind of contextual glue (the mortar) to hold them together and give meaning to the aggregation.

2.2.2 Learning-Object Definition

There have been various attempts to define LOs. Some of them are listed below:

• “Learning objects are defined as any entity, digital or non-digital, which can be used, reused, or referenced during technology supported learning.” (LOM, 2002).

• “A learning object is a digital resource that can be reused to facilitate learning.” (Wiley,2001a)

• “A learning object is an independent and self-standing unit of learning content that is predisposed to reuse in multiple instructional context.” (Polsani, 2003).

All these definitions seem to agree on the idea of a LO as an entity intended to support learning. Nevertheless, they diverge on the type of material that a LO contains (only digital or both digital and non-digital) and also on the type of reuse intended for the LOs: (LOM, 2002; Wiley, 2001a) do not specify the notion of reuse, whereas (Polsani, 2003) considers LOs as predisposed to be reused in various educational contexts.

In the lesson authoring context, reuse is considered as a re-contextualization process (Definition 2.1) of digital or non-digital entities. Accordingly, the definition of IEEE LOM appears to be the most adequate:


Definition 2.2 (Learning Object) Any entity, digital or non-digital, which can be used, reused, or referenced during technology supported learning.

The next section reviews the different approaches dealing with authoring a lesson based on LOs.

2.3 Learning-Object Organization

An important part of the lesson authoring process consists of organizing the set of educational material and activities aiming at supporting learning. Various research efforts attempt to structure this task. Three main topics emerge from the literature about educational-resource organization: hierarchical models, sequencing, and visual structure. This section reviews each of them.

2.3.1 Learning-Object Hierarchy Models

Definition 2.3 (Learning-Object Hierarchy Model) Classification of learning-object types according to their relative inclusiveness.

Various research efforts attempt to specify a hierarchy for organizing a lesson. Table 2.1 shows a sample of hierarchy models. The nine models presented are: Netg (L’Allier, 1997), CISCO (Barrit et al., 1999), SCORM (Dodds, 2001), Learnativity (Duval and Hodgins, 2003), CBCD (Ochoa et al., 2003), Hierarchical Graphs (Santos et al., 2004), ALOCoM (Verbert and Duval, 2004), Learning-Activity Toolkit (Conole and Fill, 2005) and IMS LD (IMS, 2007). For each model, the proposed granularity levels are named from the finest to the coarsest. In general, the finest level corresponds to an atomic resource fragment (e.g., a video, an image, a text, a presentation slide) while the coarsest generally corresponds to an entire lesson.

As this table shows, there are notable differences among the proposed models. For instance, it may be observed that most of these approaches make explicit the idea of a reusable entity as a self-contained ready-to-be-used learning resource. This tendency follows a trend rooted in Hodgins’ LEGO model. On the one hand, some proposals are based on


Model Name | Granularity levels, from fine-grained items (left) to coarse-grained items (right)
Netg | Raw Contents | Objective + Activity + Assessment | Topics | Lessons | Units | Courses
CISCO | Content Items | Reusable Information Objects (Fact + Concept + Procedure + Process + Principle) | Reusable Learning Objects
SCORM | Assets | Sharable Content Objects | Learning Resource Aggregation
Learnativity | Raw Media | Information Objects (Procedure + Principle, Concept + Process + Fact + Overview + Summary) | Learning Objects (Information Objects + Objective) | Aggregate Assemblies | Collections
CBCD | Basic Contents | Contents Items | Chapters | Sections | Courseware
Hierarchical Graphs | Learning Resources | Learning Units (Learning Resources or Graphs of Learning Units)
ALOCoM | Content Fragments | Content Objects (Content Fragment + Navigation) | Learning Objects (Content Objects + Navigation + Objective)
LA Toolkit | Resources | Learning Activities (Task (including Resources) + Learning and Teaching Approaches + Context)
IMS LD | Learning Objects | Environments | Role Parts | Acts | Plays | Units

Table 2.1: Examples of hierarchy models for organizing LOs


informal recommendations like the Sharable Content Objects of SCORM. On the other hand, various hierarchy models attempt to formalize what should be the content of a self-contained ready-to-use learning resource. For instance, Horn’s Information Blocks (Horn, 1995) are used to formalize this notion in the Learnativity and CISCO models. Horn argues that self-contained, complete information can be structured with seven fundamental parts: Fact, Concept, Process, Procedure, Overview, Principle and Summary. ALOCoM uses the DITA XML model (Priestley, 2001) where Task, Concept, and Reference are defined as reusable. (Conole and Fill, 2005) defines a self-contained educational resource as being a learning activity composed of a set of Tasks defined for a certain Context and applying a certain Learning and Teaching Approach.

IMS LD suggests a hierarchy based on a social approach of how to structure a learning activity: the role of each actor of the described activity has to be defined. This generic hierarchy can be seen as a language for modeling an educational activity (IMS LD was formerly called Educational Modeling Language (EML, 2007)). (McCalla, 2004; Koper, 2004; Allert, 2004) suggest that languages for describing the educational processes could serve as a base for the recognition and reuse of design patterns, such as those defined by Alexander (Alexander et al., 1977), into the structure of a lesson or a learning activity. According to this trend, some research efforts aim at defining such learning design patterns (e.g., (Hernández-Leo et al., 2005; Patterns, 2007)). Others try to support learning design pattern authoring (e.g., (Inaba and Mizoguchi, 2004)). Finally, learning management systems integrating IMS LD like the Learning Activity Management System (LAMS, 2007) are interesting supports for popularizing design pattern usage.

In contrast with the previous models that attempt to be exhaustive, the Hierarchical Graph model suggests a simple generic structure for organizing a course. In this model, learning units are organized in nested graphs: A learning unit may be a learning resource or a graph of learning units. (Baloian et al., 2004b) uses a similar approach for organizing graphs of learning resources.
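The recursive definition above (a learning unit is either a learning resource or a graph of learning units) can be illustrated with a minimal Python sketch. The class and function names (LearningResource, LearningGraph, list_resources, nesting_depth) are hypothetical and are not taken from (Santos et al., 2004); the sketch only shows the nesting, not any particular edge semantics.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple, Union


@dataclass
class LearningResource:
    """Leaf learning unit: a single educational resource."""
    title: str


@dataclass
class LearningGraph:
    """Composite learning unit: a directed graph whose nodes are learning units."""
    title: str
    units: Dict[str, "LearningUnit"] = field(default_factory=dict)
    edges: Set[Tuple[str, str]] = field(default_factory=set)  # (source key, target key)

    def add_unit(self, key: str, unit: "LearningUnit") -> None:
        self.units[key] = unit

    def link(self, source: str, target: str) -> None:
        # Both endpoints must already be nodes of this graph.
        if source not in self.units or target not in self.units:
            raise KeyError("both endpoints must be units of this graph")
        self.edges.add((source, target))


# A learning unit is either a resource or a (nested) graph of learning units.
LearningUnit = Union[LearningResource, LearningGraph]


def list_resources(unit: LearningUnit) -> List[str]:
    """Return the titles of all resources reachable through nested graphs."""
    if isinstance(unit, LearningResource):
        return [unit.title]
    titles: List[str] = []
    for child in unit.units.values():
        titles.extend(list_resources(child))
    return titles


def nesting_depth(unit: LearningUnit) -> int:
    """Number of graph levels above the deepest resource (0 for a resource)."""
    if isinstance(unit, LearningResource):
        return 0
    return 1 + max((nesting_depth(c) for c in unit.units.values()), default=0)
```

A course can then be built by nesting graphs, e.g., a course graph containing an introduction resource and a lesson graph that itself contains an example and an exercise.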

2.3.2 Learning-Object Sequencing Models

Definition 2.4 (Learning-Object Sequencing Model) Set of processes and apparatus for sequencing learning objects.

Another aspect of the lesson authoring process consists of defining the sequence of LOs composing the lesson. This sequence may be linear like in Netg, CBCD and ALOCoM. The sequence may also be defined by a conditional process like in SCORM, CISCO and IMS LD. A conditional process is generally based on the results of assessments: For instance, if a learner reaches a very good score in an MCQ test, he will pass to another topic; if he obtains a medium score, he will have to solve more exercises on the topic being taught; or, if his score is very low, he will have to study learning resources dealing with the basics of the lesson. IMS has defined Simple Sequencing, a standard specification for defining such conditional sequencing processes (IMS, 2007).
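The assessment-based example above can be written as a small decision rule. The following Python sketch is only an illustration: the function name, the step labels, and the two thresholds are assumptions chosen for the example, and the code does not implement the IMS Simple Sequencing specification.

```python
def next_step(score: float, high: float = 0.8, low: float = 0.4) -> str:
    """Choose the next activity from a normalized test score in [0, 1].

    The thresholds `high` and `low` are hypothetical; a real course would
    tune them per assessment.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= high:
        return "next_topic"      # very good score: move on to another topic
    if score < low:
        return "basics"          # very low score: study the basics of the lesson
    return "more_exercises"      # medium score: more exercises on the same topic
```

With these default thresholds, a learner scoring 0.9 moves to the next topic, a learner scoring 0.5 receives more exercises, and a learner scoring 0.1 is sent back to the basics.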

Some research efforts suggest that the sequencing of LOs should be done manually at run-time: For instance, it may consist of leaving the sequencing choice to the learners as in (Conole and Fill, 2005). It may also be a dynamic choice of the teacher to deal with the reactions of her students as suggested in (Winn, 1997). In such situations, the visual organization of the educational resources may be an important support as shown in the next section.

In some other approaches, the sequencing of LOs is based on the semantics of the relations linking them. This is typically the case in intelligent tutoring systems in which the sequencing of the educational resources is done in a way that fits a certain learner profile (Murray, 1999). Similarly, in adaptive hypermedia systems, it is the navigational model that dynamically adapts to the learner profile (Bruvilosky, 2003; Bra et al., 2004). Typically, intelligent tutoring systems and adaptive hypermedia are built upon a set of correspondence rules between the learner profile and the semantics of the relations existing in sets of educational resources and concepts. These approaches aim at automatically providing tutoring for learners. In contrast, other research efforts focus on helping the teacher in sequencing the learning material. For instance, (Baloian et al., 2004b) suggest analyzing the rhetorical and semantic relationships between various educational resources in order to choose a potential teaching path. When a teacher asks for a path corresponding to her preferred teaching style (e.g., deductive, inductive, short path), the system automatically calculates it in the space of available educational resources.

2.3.3 Visual Structures for Learning Objects

Definition 2.5 (Visual Structure for Learning Objects) Methods for visually organizing a set of learning objects.

Most supports for lesson authoring are based on a tree view, generally limiting the navigation in the hierarchy of lesson units to a linear process (see (Moodle, 2007; Blackboard, 2007) for a sample). (LAMS, 2007) features a visual organization of the activities of the lesson in a kind of basic concept map. Concept maps are tools for organizing and representing knowledge (Novak, 1998). The cognitive advantage of such maps is mainly due to their visual aspect enabling dual coding in memory: Visual memory makes it easier to find an element in a concept map than in a list. Concept maps are generally known as constructivist learning tools or assessment tools. Nevertheless, they can also be used as planning tools: (Martin, 1994; McDaniel et al., 2005) show that concept maps used as planning tools provide educators with a more comprehensive understanding of what students need to learn and help eliminate sequencing errors in the development of lesson plans. (Baloian et al., 2004b) proposes a special type of concept map called Didactic Networks to organize learning material. A Didactic Network is a graph of educational resources based on rhetorical and semantical relations between them.

2.4 Learning-Object Indexing

Definition 2.6 (Learning-Object Indexing) Process of building a data structure to speed up searching of learning objects.

The most popular indexing mechanisms rely on document content analysis. This is typically what search engines like Google or Yahoo! do. However, several factors make these systems inapplicable for retrieving learning material in a dedicated repository: (1) The format of the document can be difficult to mine efficiently with content analysis methods (e.g., images, sounds, proprietary formats). (2) Educational resources could be copyrighted and the indexation system may not have access to them. (3) Information about LO usage, such as its educational context, its pedagogical methodology, or its interactivity, needs to be indexed, but this data is generally not contained in the document itself. For these reasons, an alternative to document content analysis is necessary. Metadata is intended to address these problems.

Metadata is literally data about data. More specifically, “metadata is data associated with objects which relieves their potential users of having full advance knowledge of their existence or characteristics” (Dempsey and Heery, 1997). Metadata is a systematic method for describing resources and thereby improving access to them, just as the cards in a card catalog help library users find books. If a resource is worth making available, then it is worth describing it with metadata, so as to maximize the ability to locate it.

The five most important metadata specifications are the IEEE Learning Object Metadata standard (LOM, 2002), the IMS Learning Resource Metadata specification (IMS, 2007), the ARIADNE metadata specification (Ariadne, 2007), the Dublin Core metadata standard (DublinCore, 2007), and the SCORM metadata specification (SCORM, 2007). At the beginning of their activity, the Dublin Core, IMS, and ARIADNE projects were working independently, developing separate specifications of the type of metadata to use and the way it should be expressed. Afterward, the IMS and ARIADNE projects agreed that they should harmonize their efforts under the auspices of a true international standards organization. In such a way, they could finally guarantee that their work would inter-operate. The IEEE LOM standard was chosen as the basis for this collaboration. Dublin Core later agreed to participate under similar terms. The ADL’s SCORM inherits the benefits of these interoperability agreements because the SCORM package includes the LOM standard.

Practical efforts to create interoperability between these varying specifications and standards have been quite successful. (Najjar et al., 2003) describe how they transform ARIADNE metadata into LOM metadata using XSLT. IMS (IMS, 2007) has also released a best practice guide for using XSLT to transform IMS Learning Resource Meta-data into IEEE LOM.

For these reasons, this thesis considers the LOM standard as the basic specification when dealing with metadata for educational resources.

Definition 2.7 (Learning-Object Metadata) Standard set of data for describing and indexing learning objects. This data set is based on the IEEE LOM specification.

2.5 Learning-Object Repositories

Definition 2.8 (Learning-Object Repository) Dedicated database for storing learning objects together with their respective metadata. Learning object indexation is based on these metadata.

In order to share LOs, several dedicated repositories have been developed, e.g., the European Knowledge Pool Ariadne (Ariadne, 2007), the Canadian repositories Careo (Careo, 2007) and Lornet (Lornet, 2007), the American resource base Merlot (Merlot, 2007), the Japanese digital library NIME (Nime, 2007), and the Australian repository EdNA (EdNa, 2007). These repositories are large databases cataloging learning resources with the metadata defined in the LOM specification.


In practice, repositories do not use the whole LOM specification, but define their own subset of LOM adapted to the needs of their potential users. These subsets are called LOM Profiles. LOM Profiles define which LOM elements are essential to the local requirements for sharing and retrieving LOs. They also specify which elements are recommended or optional. In addition, LOM Profiles may propose a specific vocabulary to be used when instantiating certain LOM attributes. In case a LOM Profile changes the name of an attribute (e.g., in Ariadne general/aggregationLevel is defined as pedagogical/granularity), correspondences between the local specification and the LOM standard are defined so that this LOM Profile remains interoperable with LOM (Najjar et al., 2003).
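The role of such correspondences can be sketched in a few lines of Python. Only the pair pedagogical/granularity and general/aggregationLevel comes from the text; the set of mandatory elements, the record layout (a flat mapping from element path to value), and the function names are hypothetical simplifications, and attribute values are left untouched even though a real profile may also map vocabularies.

```python
# Correspondences from a local profile to the LOM standard.
# Only this entry is mentioned in the text; a real profile would list more.
ARIADNE_TO_LOM = {"pedagogical/granularity": "general/aggregationLevel"}

# Hypothetical set of elements that the profile declares mandatory.
MANDATORY = {"general/title", "general/aggregationLevel"}


def to_lom(record, correspondences):
    """Rename profile-specific element paths to their LOM counterparts."""
    return {correspondences.get(path, path): value for path, value in record.items()}


def missing_mandatory(lom_record, mandatory):
    """Return the mandatory element paths that are absent or empty."""
    return sorted(path for path in mandatory if not lom_record.get(path))
```

For example, a local record holding a title and a pedagogical/granularity value is exposed with the LOM path general/aggregationLevel, after which the same completeness check can be applied to records from any profile.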

As argued in a survey about LO repository usage (Neven and Duval, 2002), reaching a critical mass is crucial for LO repositories to become functional. In order to reach this critical mass, the major repositories have collaborated on building an interoperable query mechanism allowing them to gather their data. Therefore, Merlot, EdNA, Ariadne, Lornet and NIME offer a common Federated Search service: A query on the federated search engine returns a merged list of the best results of the five repositories.
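The merging step can be sketched as follows. This is a toy Python illustration, not the actual Federated Search service: each repository is modeled as a local function returning (score, title) pairs, the scoring (number of query terms found in the title) is invented for the example, and scores are assumed to be comparable across repositories, which a real federation would have to ensure.

```python
import heapq


def make_repository(titles):
    """Build a toy repository: a search function over a list of LO titles."""
    def search(query):
        terms = query.lower().split()
        hits = []
        for title in titles:
            # Hypothetical relevance score: number of query terms in the title.
            score = sum(term in title.lower() for term in terms)
            if score:
                hits.append((score, title))
        return hits
    return search


def federated_search(query, repositories, limit=10):
    """Query every repository and merge the best results into one ranked list."""
    merged = []
    for name, search in repositories.items():
        for score, title in search(query):
            merged.append((score, name, title))
    # Highest scores first; ties are broken by repository name, then title.
    return heapq.nlargest(limit, merged)
```

Each result keeps the name of the repository it came from, so that the user can follow the merged list back to the original source.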

This Federated Search is based on a SOAP architecture similar to the proposals of web services for LO repositories of (Ternier and Duval, 2003; Xiang et al., 2003). The Open Archives Initiative (OAI, 2007) has defined a specification for an interoperable XML-based query-answer mechanism. Finally, the IMS consortium (IMS, 2007) has released the Digital Repository Interoperability specification defining a general framework for interoperability including SOAP, XQueries and OAI as base technologies.

From a user perspective, searching a repository generally means browsing to the search page of a repository and listing some keywords for the query, as one does in Google. Advanced search consists of filling in a form containing the information held by the metadata (e.g., see the Merlot search interface in Figure 2.1). Nevertheless, advanced search is a tedious task which is not commonly used in LO repositories (Najjar et al., 2005): Users prefer keywords for querying a repository as they do in classical search engines like Google or Yahoo. More details about LO retrieval are discussed in Section 3.6.

2.6 Summary

This section has reviewed the basic concerns of this thesis. LOs and LO reuse were defined in the context of lesson authoring. These definitions consider that learning-resource reuse inside a community of teachers implies necessary processes of appropriation and re-contextualization. Next, the available methods for organizing educational resources during lesson authoring were described. In particular, it was shown that there is no consensus on which method is the best one. Furthermore, LO indexing was introduced and IEEE LOM was chosen as the standard for indexing LOs in this thesis. Finally, LO repositories were presented: Opportunities for interoperability between LO repositories were described, and retrieval issues were introduced.

Figure 2.1: Search interface for the Merlot learning object repository

Chapter 3

Learning-Object Metadata

3.1 IEEE LOM

The Learning Object Metadata standard specification (LOM, 2002) is a product of the IEEE Learning Technology Standards Committee (LTSC). This project, identified as IEEE 1484.12.1, was approved in November 2002. Other work around LOM concerns (1) 1484.12.2: Standard for ISO/IEC 11404 binding for Learning Object Metadata data model, (2) 1484.12.3: Standard for Learning Technology-Extensible Markup Language (XML) Schema Definition Language Binding for Learning Object Metadata, (3) 1484.12.4: Standard for Resource Description Framework (RDF) binding for Learning Object Metadata data model. These projects are still at the evaluation stage (1484.12.2 and 1484.12.3 have shown no sign of activity since 2002).

LTSC defines the official purposes of the LOM standard and its related projects as:

Retrieval To enable learners or instructors to search, evaluate, acquire, and utilize LOs.

Interoperability To enable the sharing and exchange of LOs across any technology supported learning systems.

Modularity To enable the development of LOs in units that can be combined and decomposed in meaningful ways.

Computability To enable computer agents to automatically and dynamically compose personalized lessons for an individual learner.

Ability to be composed To complement the direct work on standards that are focused on enabling multiple LOs to work together within an open distributed learning environment.


Sharing of educational methods To enable, where desired, the documentation and recognition of the completion of existing or new learning and performance objectives associated with LOs.

Reuse economics To enable a strong and growing economy for LOs that supports and sustains all forms of distribution; non-profit, not-for-profit and for profit.

Standardization To enable education, training and learning organizations, both government, public and private, to express educational content and performance standards in a standardized format that is independent of the content itself.

Standardization To provide researchers with standards that support the collection and sharing of comparable data concerning the applicability and effectiveness of LOs.

Extensible To define a standard that is simple yet extensible to multiple domains and jurisdictions so as to be most easily and broadly adopted and applied.

Secure To support necessary security and authentication for the distribution and use of LOs.

(From (LOM, 2002))

The IEEE Learning Object Metadata specification is a set of about 60 attributes describing technical, educational and general aspects of educational resources (see Figure 1.1). All data elements are grouped into nine categories:

General The General category groups the general information that describes the LO as a whole (e.g., its title, its keywords, its description).

Lifecycle The Lifecycle category groups the features related to the history of the LO (e.g., its version, its current state, the participants in its evolution).

Meta-Metadata The Meta-Metadata category groups information about the metadata instance itself rather than the resource.

Technical The Technical category groups the technical requirements and characteristics of the LO (e.g., its format, its size).

Educational The Educational category groups the educational and pedagogic characteristics of the LO (e.g., its interactivity type, its difficulty, its intended end-users).

Rights The Rights category groups the intellectual property rights and conditions of use for the LO.


Relation The Relation category groups the relationships between the LO and other related LOs. Each relation should refer to a link type (e.g., isPartOf, isBackgroundFor, introducesTo).

Annotation The Annotation category provides comments on the educational use of the LO. Each comment is decorated with its author and its creation date.

Classification The Classification category describes this LO in relation to a particular classification system (e.g., ACM Taxonomy).

Categories have no values: Only leaf elements have values. In this thesis, the LOM attributes are identified by a series of names separated by slashes, e.g., general/title, where “general” is the category and “title” the attribute name.
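The slash-separated naming convention can be illustrated with a minimal sketch. It assumes, purely for illustration, that a LOM instance is held in memory as a nested dictionary; the sample instance and the function name get_attribute are hypothetical and not part of the LOM specification.

```python
# Hypothetical, heavily abbreviated LOM instance held as a nested dictionary.
lom_instance = {
    "general": {"title": "Introduction to Sorting", "keyword": ["sorting", "algorithms"]},
    "technical": {"format": "application/pdf", "size": 48213},
    "educational": {"difficulty": "medium"},
}


def get_attribute(instance, path):
    """Return the value found at a slash-separated path, or None if it is not instantiated."""
    node = instance
    for name in path.split("/"):
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    # Categories have no values of their own: only leaf elements do.
    return None if isinstance(node, dict) else node


title = get_attribute(lom_instance, "general/title")
```

Asking for a category alone (e.g., "general") or for an attribute that was never instantiated yields None in this sketch, which mirrors the rule that only leaf elements carry values.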

3.2 LOM Characteristics

In this part, the main characteristics of LOM are explored. First, a list of the data types used in these metadata is presented. Then, a special focus is given to the educational category of LOM. Finally, objectiveness and context dependency of LOM attributes are discussed.

3.2.1 Data Types of LOM Elements

The LOM specification defines a data type for each metadata attribute. The distribution of data types in the LOM specification is summarized in Figure 3.1. Almost a third of LOM attributes deal with primitive types such as size, format, id, URL, time stamp, or date. Such types concern attributes like general/identifier, technical/duration, or technical/location.

Another third of LOM attributes take as values free text associated with a tag specifying the language used in the text. This data type is called Langstring. Some LOM elements like educational/description, general/keyword, or general/title use this type so that they can accept values in several languages. For instance, this data type enables the title or the description of a learning resource to be defined in, e.g., English, French or Spanish and later displayed in the preferred language of the user.

Figure 3.1: Distribution of data types in the IEEE LOM specification

Another third of LOM attributes deal with vocabulary instances. For instance, the vocabulary for the attribute educational/interactivityType is active, expositive and mixed according to the LOM RDF Binding draft specification available at (LOMRDF, 2003). Technically, vocabulary instances are defined as source-value pairs: The source refers to a local glossary of terms for the concerned LOM attribute (e.g., a LOM Profile defined by a certain institution) while the value corresponds to a specific instance of this vocabulary. For instance, the attribute educational/interactivityType of a certain LO may be defined as (source=(LOMRDF, 2003), value=active).

Finally, we note that one element is related to a complex data type: vCard. It is used for describing the contributors of the educational resource.
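The Langstring and vocabulary data types described in this subsection could be modeled roughly as follows. This is only a sketch: the class names, the fallback policy of in_language, and the source label "LOMRDF-2003" are assumptions made for illustration, not definitions taken from the LOM specification.

```python
from dataclasses import dataclass, field


@dataclass
class LangString:
    """Free text stored per language tag (e.g., "en", "fr", "es")."""
    texts: dict = field(default_factory=dict)

    def in_language(self, preferred, fallback="en"):
        # Preferred language first, then the fallback, then any available text.
        if preferred in self.texts:
            return self.texts[preferred]
        if fallback in self.texts:
            return self.texts[fallback]
        return next(iter(self.texts.values()), None)


@dataclass(frozen=True)
class VocabularyTerm:
    """A source-value pair: the glossary that defines the term, and the term itself."""
    source: str
    value: str


# Vocabulary quoted in the text for educational/interactivityType.
INTERACTIVITY_TYPES = {"active", "expositive", "mixed"}


def interactivity_type(value, source="LOMRDF-2003"):
    """Build a vocabulary instance, rejecting values outside the vocabulary."""
    if value not in INTERACTIVITY_TYPES:
        raise ValueError(f"{value!r} is not in the interactivityType vocabulary")
    return VocabularyTerm(source=source, value=value)


title = LangString({"en": "Sorting", "es": "Ordenamiento"})
spanish_title = title.in_language("es")
french_title = title.in_language("fr")  # no French text, so the English one is used
term = interactivity_type("active")
```

Restricting values to a closed set is what distinguishes a vocabulary instance from a Langstring, whose text is free.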

3.2.2 Educational Metadata

The LOM specification suggests 11 elements to describe the educational context of learning material. This information falls into the category of “usage metadata” (Kim, 2005), i.e., a description of how and for what purposes the data is to be used by users and applications. In enterprise settings, this kind of data is called business data. When retrieving a LO, the teacher must appropriate it, i.e., she must “imagine how the designers intended it to be used and adapt their practice accordingly, or else must conceive of some new use for the material” (Oliver, 2005). LOM may support this appropriation task since it describes the LO usage with the educational metadata. The information described in the educational metadata is generally implicit in most LOs. For instance, the difficulty, the typical age range of the intended students, or the interactivity type of a LO are rarely explicit in its content.

An interesting issue about this kind of metadata is that the same educational resource could be used in various ways, so that different values may apply to its educational metadata. The LOM specification takes this situation into account and allows the educational category to be instantiated more than once.

3.2.3 Objectiveness and Context Dependency of LOM elements

(Hodgins, 2001) distinguishes subjective metadata from objective metadata. Objective metadata are factual data such as physical attributes, date, author, operational requirements, costs, identification numbers, or ownerships. Subjective attributes are those concerning the description, the keywords, or the educational usage. About 20% of the LOM specification deals with subjective attributes. Within the educational category of LOM, about 80% of the attributes are subjective. (Kabel et al., 2003) differentiates “tangible” and “abstract” values for metadata. Tangible values are ones that exclude each other, such as learning resource type and interactivity type. Abstract data elements (e.g., educational/semanticDensity, which is a subjective measure of the LO’s usefulness as compared to its size or duration) have values that depend on the context. For instance, it is hard to decide whether the semantic density of an element is medium or high: For most authors, this measure is relative. Therefore, abstract values are more difficult to determine than tangible ones.

3.3 LOM Usage

3.3.1 LOM Creation

3.3.1.1 Who?

Creating a LOM instance consists of filling in the almost 60 attributes of the IEEE LOM specification (see Figure 1.1 on page 19). (Greenberg, 2003) distinguishes four roles of people who may be involved in metadata instantiation: professional metadata creators, technical metadata creators, content creators, and community enthusiasts. Applied to the LO field, these roles can be described as follows:

Professional metadata creators are third party metadata creators because they produce metadata for content created by other individuals. Typically, they are specialized catalogers, indexers or pedagogical engineers. They have sufficient pedagogical, domain and technical skills “to make sophisticated interpretative metadata-relative decisions” (Greenberg, 2003). They work with classification systems such as vocabularies and ontologies, LOM Profile schemas and the LOM standard.

Technical metadata creators include paraprofessionals (e.g., library assistants, or planning assistants in educational institutions) who have had metadata training but whose hiring costs are lower than those of metadata professionals. They are generally responsible for instantiating simple metadata such as authors, title, format, or creation date.

Content creators are typically instructors developing the intellectual content of the educational resource. Nevertheless, they generally have no special training on LOM Profiles, vocabularies and ontologies. Facilitating the creation of LOM by content creators may be less expensive than hiring metadata professionals.

Community enthusiasts are typically instructors with special knowledge of and interest in the shared educational resources. They typically have no formal LOM creation training, but they may participate in the completion of the metadata. Collaborative creation has of course many economic advantages, but the coordination required is often difficult to achieve.

As stated in Section 3.2.3, about one fifth of LO metadata attributes are subjective. In particular, educational attributes are mostly subjective. This makes the task of instantiating those metadata attributes difficult: It often requires a serious evaluation effort from the user defining their values (Hodgins, 2001; Kabel et al., 2003). Correct instantiation of metadata for learning objects requires combined educational and technical skills (Kabel et al., 2003). As a consequence, professional metadata creators are generally too costly to be hired in all educational institutions.

(Currier et al., 2004) reports on a small study carried out to investigate whether the creators of the resources could also create metadata for their resources, and to assess how well they could do this in comparison with the information specialists (technical metadata creators) involved in the project. The key findings of the study are as follows:

• “In general terms, resource creators did not have a good understanding of the purpose of metadata or an appreciation of its value;”

• “Resource creators did understand and appreciate the context of their resources and focused on these elements within the metadata;”

• “Information specialists had a better understanding of the purpose of metadata and included a wider range of metadata elements;”

• “Information specialists struggled with contextual aspects of the metadata;”

• “Neither the resource creators nor the information specialists handled pedagogic aspects of the resources well.”

The study concludes that a collaborative approach to metadata creation is needed to optimize the quality of the metadata in this context.

In practice, LO authors are generally responsible for generating metadata. Nevertheless, instantiating the large number of metadata fields that characterize a LO (see Figure 1.1) is definitely a heavy load for them (Duval and Hodgins, 2004b; Currier et al., 2004).

3.3.1.2 When?

In most situations, LOM creation for a given LO takes place when the LO is shared, or when it is packaged in an interoperable standard (e.g., SCORM (SCORM, 2007)) in order to be shared.


In Learning Management Systems (e.g., (Moodle, 2007), (Blackboard, 2007)), general metadata like the title, the author, the description, and the information related to a distance-learning situation (display style, publication dates, etc.) have to be instantiated during lesson authoring. However, the other metadata attributes of LOM are instantiated only when packaging the learning material for sharing in a repository.

3.3.2 State of existing LOM Instances

(Friesen, 2004) analyzes sets of learning resources varying in size from 75 to 3000 and retrieved from four LO repositories. This survey results in various conclusions. First, the XML structure is too complex to implement. Interfaces on top of the technical infrastructure are definitely required for facilitating access to the data. Second, the vCard element is almost never used because of its heavy structure. Third, there are notable differences in the way metadata are instantiated. The elements describing the intellectual content (general/keywords, classification [with classification/purpose set to discipline]) and the characteristics of the resource as media and Internet files (technical/format, educational/learningResourceType) are well utilized. Those which attempt to describe the resource as a software "object" or to associate with it an educational context or level are much less frequently used (e.g., lifecycle/version, general/aggregationLevel, educational/semanticDensity, educational/context). Finally, the vocabulary-based metadata seem to suffer from deficient identification and definition of the suggested values.

(Kabel et al., 2003) points out a real lack of consistency when users instantiate what the authors call abstract metadata (see Section 3.2.3). This criticism also applies to most of the educational metadata.

(Najjar and Duval, 2006) also studies the actual use of LOM in the Ariadne repository (Ariadne, 2007). In their analysis, apart from the basic metadata (title, keywords, author, creation date), they notice only one LOM element that is almost always instantiated: general/aggregationLevel (granularity in the LOM Profile of Ariadne). The elements educational/context, educational/interactivityLevel, educational/semanticDensity, and educational/difficulty are used about half the time and are mostly instantiated at a medium level, which looks like a default value. The rest of the metadata are almost never used in the indexing process.

(Currier et al., 2004) reports the generally poor quality of the metadata analyzed in three different collections of educational material built in the UK: Again, most of the metadata are incomplete or default values are overused.


3.3.3 LOM Usefulness

In (Heath et al., 2005), a five-year experience with the iLumina digital library is analyzed. This work concludes that the most used metadata elements are those coming directly from the DublinCore specification (e.g., the general category). This analysis could not find any evidence of the usefulness of the educational metadata. In fact, it notes that the semantic ambiguity of the subjective attributes of the educational category makes them difficult to use and understand outside their original context. This work argues that communities should precisely define the semantic meaning of the subjective attributes in order to make them useful.

Some usability studies about LO retrieval (Najjar et al., 2004; Najjar et al., 2005) show that metadata attributes are poorly used for generating queries: Most people are lost in their complex structure and the specific vocabulary required. In fact, people mostly use the default value of the metadata attributes. In particular, the subjective attributes are not understood by users wanting to search for a LO. Finally, (Najjar et al., 2005) advocates the use of simple keyword queries, like in Google, instead of using the LOM attributes to retrieve LOs. (Currier et al., 2004) also

In an attempt to develop a system, called distance measure, based on the educational attributes of LOM in order to retrieve LOs, (Hiddink, 2001) concludes that “The fact that the distance measure is unable to predict the usability of learning materials, means that apparently some teachers base their judgement upon other characteristics than encoded in the metadata used in the experiment. A possible explanation could be that these other characteristics are related to pedagogical principles that are difficult to explicitly describe. From conversations with teachers about this subject, it appeared that sometimes teachers choose a ULM [, i.e., a learning object,] just because it contains a picture that perfectly explains the relationships between concepts of the subject matter, or because a ULM contains a very good exercise. Even if other characteristics of the material are less optimal (such as amount of interaction, or size) then the teacher will still choose the ULM.”

3.3.4 Conclusion

Studies about LOM usage confirm a quotation from (Currier et al., 2004) “garnered from conversations with e-learning colleagues around the world: for both technology and pedagogy experts, metadata creation is seen as a tedious chore rather than as a complex intellectual skill which is essential for unlocking access to resources”.

Looking in detail at the analyses of LOM usage, we note that a special deficit is attributed to the educational metadata, apparently due to their subjective aspect. On this issue, (Duval and Hodgins, 2003) advocate local appropriation and rationalization of the subjective elements. This topic is related to the notion of LOM profile described in Section 2.5. A local vocabulary supporting the instantiation of subjective attributes is definitely a central requirement for using LOM.

The remainder of this chapter reviews the existing research on facilitating LOM usage: support for LOM generation, LOM validation, and LO retrieval with LOM are discussed in turn.

3.4 Supporting LOM Generation

(Greenberg, 2003) distinguishes three types of agents for supporting metadata creation:

• Human beings.

• Standards and documentation (e.g., the LOM specification, a LOM Profile including recommendations for the instantiation or some vocabulary for a certain LOM attribute).

• Tools (i.e., technical support for capturing and storing LOM in its conformant XML format). Tools for supporting LOM creation are templates, editors and generators.

Section 3.3.1 discussed the role of human agents. Sections 3.1 and 3.2 presented the LOM standard and its characteristics. This section now introduces the tools for supporting LOM generation.

3.4.1 Templates and Editors

Templates are sometimes used for creating LOM. They consist of an empty instance of XML code including all the LOM attributes. However, they are not used very frequently, since LOM users are generally more familiar with higher-level tools such as editors.
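As a rough illustration of what such a template amounts to, the sketch below builds an empty XML skeleton with Python's standard xml.etree.ElementTree module. The element names simply reuse the category and attribute names used in this chapter and cover only a small subset of attributes; they are assumptions for illustration and do not reproduce the normative names or namespaces of the LOM XML binding.

```python
import xml.etree.ElementTree as ET

# Categories follow the nine categories of Section 3.1. The attributes listed per
# category are an illustrative subset, not the complete LOM schema.
TEMPLATE_ATTRIBUTES = {
    "general": ["title", "keyword", "description"],
    "lifecycle": ["version"],
    "metametadata": [],
    "technical": ["format", "size", "location"],
    "educational": ["interactivityType", "difficulty", "semanticDensity"],
    "rights": [],
    "relation": [],
    "annotation": [],
    "classification": ["purpose"],
}


def build_template():
    """Return an XML tree in which every listed attribute is present but empty."""
    root = ET.Element("lom")
    for category, attributes in TEMPLATE_ATTRIBUTES.items():
        category_element = ET.SubElement(root, category)
        for name in attributes:
            ET.SubElement(category_element, name)  # empty leaf, to be filled by the user
    return root


template = build_template()
xml_text = ET.tostring(template, encoding="unicode")
```

A user of such a template would then fill in the text of each leaf element by hand, which is precisely the low-level work that form-based editors hide.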

LOM editors may be embedded in a web page, as in LO repository web portals. They may also be stand-alone applications such as the Reload editor (Reload, 2007). Some of the available LOM editors expose the XML tree directly as the interface to the LOM standard. Nevertheless, most of them (including most LO repositories) use the form metaphor to interface with the XML standard (see Figure 3.2). Because it hides the technical structure of the metadata, the form interface is generally preferred over the tree interface.


Some of these editors can be used for creating LOM instances according to several LOM Profiles. For example, the Reload editor lets the user choose between the IEEE profile (LOM, 2002) and the IMS Metadata Profile (IMS, 2007) (see Section 2.4).

Figure 3.2: Interface of Reload, a multi-format editor for LOM

In addition to interfacing with a specific XML structure, some editors offer features that support the generation process. For example, in the Reload editor, attributes bound to a vocabulary are instantiated through a combo-list offering the vocabulary items authorized for the attribute. Moreover, an explanation of each metadata attribute is accessible as a tool-tip.

The remainder of this section deals with generators for automatic production of metadata.

3.4.2 Information Sources for automatic LOM Generation

A generator for automatic production of educational metadata needs information sources in order to deduce metadata values. We distinguish two typical sources of information:

Document contents, i.e., the educational resource itself (e.g., its textual content) and its technical characteristics (e.g., its format, size).

Document context, i.e., the resource usage (e.g., time of use), the environment in which it is used (e.g., Is it part of a course? If it is, what are the characteristics of this course?), and the related material (e.g., referenced LOs).

LO content may consist of structured text, free text, images, sound, video, animations, simulations, or hybrid media. The difficulty of using the content as a source for generating the metadata lies mainly in the fact that not all media can be easily analyzed. Technical characteristics of a document are also part of this source. They are the format, the size, the modification date and the access rights.
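The technical characteristics listed above are the easiest part of the content source to collect automatically. The following sketch reads them from the file system with the Python standard library; the function name and the dictionary keys (including the non-LOM keys modification_date and access_rights) are hypothetical choices made for this example.

```python
import mimetypes
import os
import stat
import tempfile
from datetime import datetime, timezone


def technical_characteristics(path):
    """Collect format, size, modification date and access rights of a file."""
    info = os.stat(path)
    mime_type, _ = mimetypes.guess_type(path)  # guessed from the file extension only
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "technical/format": mime_type,
        "technical/size": info.st_size,  # size in bytes
        "modification_date": modified.isoformat(),
        "access_rights": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--"
    }


# Usage on a throw-away file, so that the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello world")
    sample_path = handle.name
record = technical_characteristics(sample_path)
os.remove(sample_path)
```

Note that the format is inferred from the file extension here; inspecting the content itself would be more reliable but is beyond this sketch.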

Document context presents other opportunities for generating metadata. Typically, the usage of a certain learning resource reveals the learning time, the level of interactivity, or the density of the learning material. The learning environment also gives information on the profile of the learner who uses it, the language used, or the necessary material. The lesson or learning unit in which a learning object is being (re)used and the related material may also give some interesting information about the pedagogical context of the learning material. Nevertheless, the document context is not always computable. Electronic learning environments such as most Learning Management Systems attempt to rationalize and trace this information as much as possible. Standards like SCORM (SCORM, 2007) and IMS (IMS, 2007) are initiatives that contribute by giving information about the context in which the LO is intended to be used. Moreover, they are especially intended to define uniform interfaces for electronic learning environments.

3.4.3 Generation Techniques based on Content Analysis

In this subsection, various metadata generation techniques based on document content analysis are reviewed. Methods using LOM as well as methods using other types of metadata definition schema are presented, mentioning their advantages and drawbacks. The generic approaches and the semantic-based approaches are presented separately.

3.4.3.1 Generic Approaches

Collecting metadata from a document by retrieving previously created metadata within the LO is called metadata harvesting. Most popular file formats contain automatically created metadata (like document creation date, author name), and metadata which the user needs to fill out manually (like comments). Such formats are available for images (e.g., JPEG), videos (e.g., AVI), sounds (e.g., MP3) and text (e.g., PDF). Some formats allow manually added metadata and multiple metadata formats within a single file (e.g., MS Word). However, the way metadata are stored inside a file can change depending on its format. Therefore, metadata extraction may be complicated when dealing with proprietary file formats and metadata schemas. Nevertheless, some applications are available which extract harvestable metadata from selected file types (e.g., (Greenberg, 2004), (Greenstone, 2007)).

Search engines like Google or Yahoo! are mainly based on document content analysis. For that purpose, they usually extract a so-called standard logical view from the documents. The most used logical view for documents in search engines is the “bag of words” model, in which each document is seen only as an unordered set of words. In modern Web search engines, this view is extended with extra information concerning word frequencies and text formatting attributes, as well as meta-information about Web pages including embedded descriptions and explicit keywords in the HTML markup. Several text normalization operations are executed for extracting keywords. The most used ones are: tokenization (dividing a stream of text into words), stopword removal (removing function words that carry no semantic information) and stemming (extracting the morphological root of every word). These three methods work well with languages similar to English. In contrast, tokenization does not work with languages like Chinese, in which word decomposition is not obvious, and stemming is very difficult with Arabic.
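The three normalization operations can be sketched in a few lines of Python. The stopword list and the suffix-stripping rule below are deliberately naive stand-ins chosen for this example (a real system would use a full stopword list and an established stemmer such as Porter's); the result is a bag of words extended with word frequencies.

```python
import re
from collections import Counter

# Tiny illustrative stopword list and suffix list; both are assumptions of this sketch.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}
SUFFIXES = ("ing", "es", "s")


def tokenize(text):
    """Divide a stream of text into lower-case words."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip one common English suffix, keeping at least three letters of the root."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text):
    """Tokenize, remove stopwords, stem, and count the remaining terms."""
    tokens = [token for token in tokenize(text) if token not in STOPWORDS]
    return Counter(stem(token) for token in tokens)


bag = bag_of_words("Sorting algorithms: the algorithm sorts a list of lists.")
```

In this example the variants "sorting" and "sorts", "algorithms" and "algorithm", and "list" and "lists" collapse onto the same three terms, each counted twice.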

After text normalization, keyword ranking is done. The most used methods for text ranking are: Vector Space Models (representing natural language documents in a formal manner by the use of vectors in a multi-dimensional space), and Naïve Bayes Classifiers (based on probability models incorporating strong independence assumptions which often do not hold in reality). Some libraries implementing such methods are, e.g., Classifier4J (SourceForge, 2007), which enables keyword extraction and summarization from a text-based document, or KEA (KEA, 2007) for automatic key phrase extraction. Generic content analysis may be useful for generating keywords or a description. More specific characteristics like title, author, size, creation date, etc. are generally retrieved from already existing metadata embedded in the educational resource itself. This method is used by DC.dot (DC.dot, 2007), a web application generating DublinCore (DublinCore, 2007) metadata for a web resource. This system mainly works by retrieving text embedded in tags targeting DC-related information.
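As an illustration of keyword ranking in a vector space model, the sketch below weights each term of a document by TF-IDF and keeps the highest-weighted terms as candidate keywords. TF-IDF is one common weighting scheme and is an assumption of this example; it is not claimed to be the method used by the libraries cited above. Documents are assumed to be non-empty lists of already normalized tokens.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of non-empty token lists. Returns one {term: weight} dict per document."""
    total = len(documents)
    document_frequency = Counter()
    for tokens in documents:
        document_frequency.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(total / document_frequency[term])
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=2):
    """Return the k terms with the highest weight (ties broken alphabetically)."""
    return sorted(vector, key=lambda term: (-vector[term], term))[:k]


corpus = [
    ["sort", "list", "sort", "algorithm"],
    ["graph", "algorithm", "path"],
    ["list", "graph", "tree"],
]
vectors = tf_idf(corpus)
keywords = top_keywords(vectors[0], k=1)
```

Here "sort" is ranked first for the first document because it is frequent there and absent from the other two, whereas "list" and "algorithm" also occur elsewhere in the corpus.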

3.4.3.2 Approaches based on Domain-specific Semantics

Metadata contains semantic information about the documented resources. Consequently, using the domain semantics for generating metadata may offer interesting opportunities.

Concerning document classification, (Jenkins et al., 1999) describes a system generating metadata like DublinCore (DC for short) by finding correspondences with a classification hierarchy (in this case the Dewey Decimal Classification System). (Stuckenschmidt and van Harmelen, 2001) also proposes semantic-based classification by referring to an ontology. Some available commercial tools enable semantic classification of documents as well. The Klarity API (Klarity, 2007) and Metatagger (Interwoven, 2007), e.g., enable such a classification process based on a user-defined ontology.

(Yilmazel et al., 2004) introduces natural language processing methods in order to extract pedagogical information from an educational resource. These methods succeed in efficiently generating DC metadata and GEM Metadata (GEM, 2007), which are quite similar to the educational category of LOM. However, their system focuses on a specific type of educational resource since it exclusively deals with lesson outlines.

3.4.4 Generation based on Context Analysis

This subsection discusses metadata generation techniques using document context analysis. As above, we present generic approaches first and then the semantic-based approaches.

3.4.4.1 Generic Approaches

There are only a few generic approaches for generating metadata based on the analysis of the context, the most interesting being the work of (Marchiori, 1998). He suggests a propagation system based on simple fuzzy logic. Basically, metadata values are propagated from one document to another wherever a hyperlink between the two documents exists. A weight variable is associated with each metadata value. At each diffusion step, this weight is decreased. Since a document can be related to various elements of the graph, the propagation system may generate multiple values for the same metadata. In these cases, a composition is made in which the value of the metadata with the maximum weight is kept. This approach does not consider the semantics of the elements, but it performs well: The propagation method drastically increases the number of documents with instantiated metadata whereas the produced metadata values remain reasonably correct. However, these results are valid in systems using ontologies based on one category, since fuzzy logic is not efficient with various, possibly ambiguously overlapping categories. If these ambiguities are carefully identified for the case of metadata for educational resources, it may be very interesting to investigate the application of this method to the elements of LO repositories.
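The propagation principle described above (weighted values, a weight decreased at each diffusion step, and composition by maximum weight) can be sketched as follows. This is not Marchiori's implementation: the multiplicative decay factor, the number of steps, the direction of propagation along hyperlinks, and all names are assumptions made for this illustration.

```python
def propagate(links, seeds, decay=0.5, steps=2):
    """Propagate one metadata attribute through a hyperlink graph.

    links: {document: [documents it links to]}
    seeds: {document: (value, weight)} for documents whose metadata is already instantiated
    Returns {document: (value, weight)}, keeping the maximum-weight value per document.
    """
    best = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = {}
        for document, (value, weight) in frontier.items():
            for target in links.get(document, []):
                candidate = (value, weight * decay)  # weight decreases at each step
                # Composition: keep only the value with the maximum weight.
                if target not in best or candidate[1] > best[target][1]:
                    best[target] = candidate
                    next_frontier[target] = candidate
        frontier = next_frontier
    return best


links = {"a": ["b"], "b": ["c"], "d": ["c"]}
seeds = {"a": ("algebra", 1.0), "d": ("geometry", 0.75)}
result = propagate(links, seeds)
```

In this example, document c is reachable from a in two steps (weight 0.25) and from d in one step (weight 0.375), so the value inherited from d is kept.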


3.4.4.2 Approaches based on Domain-specific Semantics

Several research works focus on using domain semantics for context-analysis-based generation of metadata. In order to enable metadata generation, these tools process sets of rules concerning the semantics of (1) the other metadata of the resource, (2) the associated learning environment, and (3) the related material.

(Lattner and Gehrke, 2004) proposes to use Inductive Logic Programming in order to create sets of rules relating the semantics of different elements of the same metadata record. They then suggest applying these rules to generate missing metadata values from the values of the instantiated elements. As learned rules might not be completely correct or might miss values for attributes of objects, they provide a rule relaxation algorithm for applying the learned rules. This algorithm makes it possible to deduce metadata systematically. However, it weakens the overall quality of the generated metadata. Thus, the authors argue that this feature has to be used in the scope of post-verification of metadata. There are no available results of experiments using this methodology on metadata for educational resources. The commercial tool Metatagger (Interwoven, 2007) enables the processing of similar rules in order to generate metadata values. Nevertheless, the rules it uses are not induced from system usage but are user-defined.

Learning Management Systems (LMS) are also a semantically rich source of information for generating metadata. (Ochoa et al., 2005) suggests deducing metadata values from the information available in the LMS. Metadata such as author-related information and educational context (course level, area, prerequisites, student level, etc.) may be generated this way. (Hatala and Richards, 2003) proposes a similar approach but bases its implementation on the SCORM and IMS Packaging standards. Since this system uses standards instead of specific implementations, it could be easily applied to most learning management systems.

Another method for generating educational metadata is based on using the metadata semantics of the related material. (Hatala and Richards, 2003) also develops this approach in their system. In particular, they define a set of specific rules concerning inheritance between educational resources (from parent to children), accumulation (from children to parent), and content similarity between educational resources. (Brase, 2005) proposes a very similar process in order to infer metadata values for the elements of a set of related LOs. (Hatala and Richards, 2003) notes that the inference rules may not be valid in all potential uses and that this work should provide suggestions for metadata values during the metadata creation process instead of automatically instantiating metadata values. These systems present interesting results, especially for the educational metadata, which content-based systems find difficult to infer from the material content. Nevertheless, these systems depend on the quality of the metadata of the neighboring LOs: if the directly related LOs are poorly instantiated (a common situation according to Section 3.3), the lack of metadata directly affects the inference rules since they may have no input.

3.4.5 Conclusion

Most metadata providing technical information (e.g., size, format, creation date, duration, etc.) about an educational resource can be generated automatically by the applications used to create it. Approaches for generating metadata using document contents are aimed at extracting information identifying the subject of the educational resource (title, keywords, description, classification). The efficiency of these methods may be improved by using a domain ontology. However, these techniques are based on text mining, and not all educational resources are text-based. Moreover, most education-related metadata remain implicit in the learning material. Document context offers other opportunities for generating such data. In particular, techniques using the semantics of this context seem to give interesting results also for generating educational metadata.

On the one hand, human-based metadata generation does not work in real life (Duval and Hodgins, 2004a). Metadata creation is too constraining for content authors who do not see any immediate benefit. Professional metadata creators are considered too expensive to be employed in most educational institutions. Automatic generation does not produce perfect metadata, but it may be good enough for sharing material (Duval and Hodgins, 2004a).

On the other hand, studies about metadata generation methods (Greenberg, 2004; Greenberg et al., 2006) conclude that “best metadata generation option is to integrate both human and automatic processes”. According to this trend, metadata for educational resources should remain instantiated by content authors or professional metadata creators, but this instantiation process should be supported by the suggestions and restrictions of various automatic generation methods. Tools supporting such a process are hybrid engines between editors and generators.

3.5 Supporting LOM Validation

Whereas the quality of the LO metadata is an important characteristic for enabling effective LO retrieval and interoperability between repositories (Barton et al., 2003), studies reveal the low quality of the metadata of most LOs available at popular repositories (Friesen, 2004; Najjar and Duval, 2006; Heath et al., 2005; Currier et al., 2004). “While there has been a lot of studies about how metadata quality could be measured [e.g. see (Shreeves et al., 2005)], the fuzziness of quality parameters, mainly designed to guide human reviewers, makes unfeasible to use them consistently and scaleably in real repositories” (Ochoa and Duval, 2006a). In the context of lesson authoring, quality validation needs to be done at authoring time, i.e., a third party cannot validate the metadata and automatic processes should be considered. This thesis considers three main aspects revealing metadata quality: completeness, correctness, and accessibility. These topics are described in the next subsections, followed by some conclusions.

3.5.1 LOM Completeness

According to (Stuckenschmidt and van Harmelen, 2004), “in order to provide full access to an educational resource, it has to be ensured that all the information is annotated with the metadata. Otherwise, important or useful parts of an information source may be missed or cannot be indexed correctly”. For that reason, LOM completeness is an important topic. This characteristic should be assessed in order to decide the validity of metadata. Basically, it means checking whether each metadata element is effectively instantiated. (Ochoa and Duval, 2006a) proposes two metrics to evaluate the completeness of LOM. The first one is based on a binary measure of completeness: a metadata attribute that is instantiated scores 1, and one that is not instantiated scores 0. The metric is the mean of the scores given to each attribute. The second metric uses the same binary scoring system for each individual attribute, but the final result is a weighted mean in which a certain weight is given to each individual attribute.
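The two completeness metrics can be sketched in a few lines of Python. The flat dictionary representation of a LOM record, the element names used as keys, and the weights are assumptions made for this illustration; (Ochoa and Duval, 2006a) should be consulted for the exact definitions.

```python
def is_instantiated(value):
    """A value counts as instantiated unless it is missing or empty."""
    return value not in (None, "", [])


def simple_completeness(record, fields):
    """Mean of the binary scores: 1 if instantiated, 0 otherwise."""
    scores = [1 if is_instantiated(record.get(f)) else 0 for f in fields]
    return sum(scores) / len(scores)


def weighted_completeness(record, weights):
    """Weighted mean of the same binary scores."""
    total = sum(weights.values())
    filled = sum(w for f, w in weights.items()
                 if is_instantiated(record.get(f)))
    return filled / total


# Hypothetical record: keywords are empty and difficulty is absent.
record = {
    "general/title": "Beavers",
    "general/keyword": [],
    "educational/context": "school",
}
weights = {
    "general/title": 3,
    "general/keyword": 1,
    "educational/context": 2,
    "educational/difficulty": 2,
}
print(simple_completeness(record, list(weights)))
print(weighted_completeness(record, weights))
```

The weighted variant makes it possible to express that a missing title matters more than a missing keyword list, which the simple mean cannot distinguish.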

LOM Profiles provide useful information about the relevance of metadata elements. For instance, in the application profile of CANCORE, the metadata element general/coverage is not considered although it is part of the LOM specification. In this specific context, the evaluation of completeness should not take this particular metadata element into account. On the other hand, if the CANCORE resources are retrieved from another context, completeness evaluation may count this missing metadata negatively.

3.5.2 LOM Correctness

Metadata about an educational resource is only useful if it correctly describes its contents and pedagogical contexts. In fact, inconsistent metadata may be a more difficult problem than missing metadata, “because mechanisms relying on metadata will produce wrong results without warning” (Stuckenschmidt and van Harmelen, 2004). Therefore, the validity of metadata should depend on the evaluation of the correctness of this metadata.


general/structure = atomic ⇒ general/aggregationLevel = 1

educational/interactivityType = active ⇒ high values of general/interactivityLevel

educational/learningResourceType = narrative text ⇒ general/interactivityLevel = expositive

educational/semanticDensity = high or very high ⇒ educational/difficulty is difficult or very difficult

educational/context = higher education ⇒ educational/typicalAgeRange ≥ 17 years

Table 3.1: LOM consistency recommendations (adapted from (Ochoa and Duval, 2006a))

Correctness evaluation is more complex than completeness assessment because it deals with the semantics of the metadata values. A first method consists of checking the validity of the data type. Such a test can be done at a low level using the XML Schema validation mechanism. Most tools for editing educational metadata (e.g., Reload editor (Reload, 2007)) and most metadata creation forms available in current LO repositories provide this kind of type checking. Moreover, these tools make it possible to check whether a given value belongs to a set of vocabulary terms. The Reload editor, for example, can check the vocabulary corresponding to a different version of the IMS Metadata definition. Nevertheless, none of the available editors provides a deep check of the validity of the meaning, because this has to do with both the semantics of the metadata and the semantics of the educational resource.

Analyzing the semantics of an educational resource means being able to mine this document. This issue is related to the automatic generation of metadata, which was discussed in the previous section. (Ochoa et al., 2005) suggests a framework that uses various automatic metadata generation methods for cross-validating metadata. The idea is to compare the value of a metadata element with the ones provided by several generators. They also propose developing a system based on Bayesian networks in order to fairly attribute weights to the wide range of produced values.

Rule-processing-based systems may also be relevant for checking the correctness of the metadata. For instance, the system developed by (Lattner and Gehrke, 2004) applies rules by processing semantic relationships between the various elements of the metadata of the same resource. This approach may provide interesting information on the mutual consistency of the metadata elements. The IEEE LOM specification suggests certain combinations of values to maintain the internal consistency of the record. For instance, if the value of general/structure is atomic, general/aggregationLevel should have the value 1. Table 3.1 lists other consistency recommendations between LOM attributes that are presented in (Ochoa and Duval, 2006a).
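Consistency recommendations such as those of Table 3.1 lend themselves to a simple rule check. The sketch below is only an illustration: the flat dictionary record, the encoding of each rule as a tuple, and the treatment of educational/typicalAgeRange as a single minimum age are assumptions, and only three of the five rules are encoded.

```python
# Each rule: (premise field, premise values, target field, validity test).
RULES = [
    ("general/structure", {"atomic"},
     "general/aggregationLevel", lambda v: v == 1),
    ("educational/semanticDensity", {"high", "very high"},
     "educational/difficulty",
     lambda v: v in {"difficult", "very difficult"}),
    # Assumption: the age range is reduced to its minimum age in years.
    ("educational/context", {"higher education"},
     "educational/typicalAgeRange", lambda v: v >= 17),
]


def inconsistencies(record):
    """Return the (premise, target) pairs whose rule is violated."""
    found = []
    for premise_field, premise_values, target_field, is_valid in RULES:
        if record.get(premise_field) not in premise_values:
            continue  # the rule does not apply to this record
        if target_field not in record:
            continue  # a missing value is a completeness issue
        if not is_valid(record[target_field]):
            found.append((premise_field, target_field))
    return found


record = {
    "general/structure": "atomic",
    "general/aggregationLevel": 2,
    "educational/context": "higher education",
    "educational/typicalAgeRange": 18,
}
print(inconsistencies(record))
```

Skipping rules whose target element is absent keeps correctness separate from completeness, so that a missing value is not reported twice.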

None of the available generation methods for metadata values provides sufficiently accurate results to safely validate the correctness of the metadata. “Is the perception of the authors, that while a simple accuracy metric for metadata (especially for metadata that describe text documents) could be implemented with existing algorithms, a more meaningful and general accuracy metric that could grade any metadata record is a complex task that could not be implemented with the current state of Information Extraction technologies.” (Ochoa and Duval, 2006a). Nevertheless, combining various generation techniques may be an interesting way to build a validation system with a minimum of fairness. No research has yet seriously studied this issue, and human-based verification (e.g., professional metadata creators/evaluators) remains the most reliable means to ensure metadata correctness.

3.5.3 LOM Accessibility

Metadata for educational resources has to be accessible to the people and applications wanting to use it in order to be useful. Accessibility measures the degree to which a LOM attribute is accessible, both in terms of cognitive accessibility and of physical/logical accessibility.

Cognitive accessibility deals with the ease with which a user may understand the information contained in the metadata. It could consist of measuring spelling errors or the difficulty of the text. While spelling errors are easily computable, the difficulty assessment is harder to automate. (Foltz et al., 1998) presents interesting results in trying to measure the coherence of a text using Latent Semantic Analysis techniques. This measure may be used for evaluating the difficulty of a text as proposed by (Ochoa and Duval, 2006a).

Physically accessible metadata has the following features: (1) it is defined within an interoperable format, (2) it uses an accessible vocabulary, and (3) it is possible to localize. Evaluating the accessibility of metadata for educational resources means assessing the extent to which these three characteristics are fulfilled.

Most XML-based standards such as LOM or IMS Metadata attempt to offer a concrete means to achieve interoperability for educational resource metadata. To be interoperable, metadata should be defined in a format transformable to an existing standard or format that an end-user system can understand. Consider some metadata M defined in a certain format F1. M is interoperable with another format F2 if there exists a direct transformation function F1 → F2 between the two languages, or a standard format F and two transformation functions F1 → F and F → F2 (and their inverse functions, of course) so that F can be used as a pivot. This simple definition may be used as a proof of the interoperability of a metadata record. XML (XML, 2007) is a powerful tool for implementing interoperability of formats. In particular, it permits the use of XSLT, a language for defining transformation sheets between XML documents (e.g., (Najjar et al., 2003)). Since LOM is based on XML, it is recommended that metadata be defined in a format based on XML or its extensions (e.g., RDF (W3C, 2007)).
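The interoperability definition above amounts to a small lookup over the known transformation functions. The following sketch assumes that available transformations are recorded as (source, target) pairs and that the candidate pivot formats are listed explicitly; the format names are hypothetical, and inverse transformations are not modelled.

```python
def is_interoperable(source, target, transformations, pivots=()):
    """Check the definition: direct transformation, or one via a pivot.

    transformations: set of (from_format, to_format) pairs for which a
    transformation function is assumed to exist.
    pivots: standard formats that may serve as an intermediate step.
    """
    if (source, target) in transformations:
        return True  # direct transformation F1 -> F2
    return any(
        (source, pivot) in transformations
        and (pivot, target) in transformations
        for pivot in pivots  # F1 -> F and F -> F2
    )


# Hypothetical formats: a local schema, LOM as pivot, Dublin Core.
transformations = {("local", "LOM"), ("LOM", "DC")}
print(is_interoperable("local", "DC", transformations, pivots=["LOM"]))
print(is_interoperable("DC", "local", transformations, pivots=["LOM"]))
```

In practice each pair would stand for an actual transformation, for instance an XSLT sheet, rather than a mere entry in a set.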


Accessibility of metadata also depends on the accessibility of the vocabulary used. A vocabulary is accessible when (1) it is a standard, or (2) a translation table from this vocabulary to a standard vocabulary is available, or (3) a translation table from this vocabulary to the end-user vocabulary is available. Current definitions of LOM vocabulary are generally based on RDF (W3C, 2007). RDF is an XML-based language enabling the definition of semantic relations between elements. In particular, it permits the definition of ontologies, classes or instances of vocabulary. RDF may also be used in order to define a translation table between vocabularies.

Finally, metadata is said to be accessible if it is localizable. Basically, metadata for educational resources is localizable when it is distributed through a medium (e.g., the Internet or CD-ROM). Localizable metadata should also succeed in reaching the intended public (push protocol, e.g., mailing or RSS) or be easily found by them (pull protocol, e.g., electronic libraries, LO repositories). Accessibility may also be improved with advanced visualization techniques for specialized databases like the ones proposed in (Klerkx et al., 2004).

3.5.4 Conclusion

Validating metadata for educational resources is a difficult task. Completeness and accessibility are well-defined concepts and their validity is easily computable. However, correctness deals with the semantics of the educational resource. Thus, metadata correctness faces the same issues and problems as metadata generation. In particular, both processes face the difficulty of deriving educational information from the educational resource contents.

Metadata correctness depends on the expectations of metadata users. For example, if metadata are used for locating LOs during a teaching session where responsiveness is essential, incorrect metadata may be undesirable. However, if LOM is used for LO retrieval in a less constraining situation (e.g., lesson authoring), approximation may be sufficient to offer great results (Duval and Hodgins, 2004a; Ochoa et al., 2005).

3.6 Supporting Learning-Object Retrieval with LOM

The challenge of finding appropriate LOs is a key issue for end-users of LO repositories. Usability studies (Najjar et al., 2005) show that the user search interfaces of the available repositories are not satisfactory. In particular, those that use the metadata fields to generate queries are poorly used: most people get lost in the complex structure and the specific vocabulary required. As a consequence, research on the retrieval of educational resources has three main directions. First, end-users need simple interfaces like Google's in order to make queries on learning objects. Such a retrieval method may radically simplify the query process for end-users, but it might also lose the advantage of metadata as a semantically rich information source. Thus, the second research direction deals with the generation of semantically rich queries. Finally, a third direction deals with the design of comfortable methods for navigating query results.

3.6.1 Keyword Queries for Learning Object Retrieval

Indexing engines such as Google index the WWW using text normalization operations like those described in Section 3.4.3.1. Elaborate indexing methods are used to find query results in a very short time. Query interfaces for such systems are based on simple keywords; users enter a sequence of expressions in order to retrieve documents containing them. Then, the resulting documents are ranked considering the order of the keywords and their occurrence in the retrieved contents. Other document characteristics such as keyword frequency and related web documents also play an important role in the ranking process.

Most people are accustomed to using the keyword queries of search engines. Consequently, interfaces implementing such queries have appeared in most LO repositories. In these systems, the sequence of keywords is matched against one specific metadata element (e.g., title) or a limited set of elements (e.g., title, keywords, description). Some research (Najjar et al., 2005) indicates that it is more convenient to search for occurrences of the expressions given by the users in all the metadata elements.

In most LO repositories, keyword queries apply to the metadata content of the resources available in the repository. Therefore, they strongly depend on the validity of these metadata. Sometimes, the incompleteness of available LOM documents is a serious drawback for processing queries. Thus, in some situations, it is still faster to look for an educational resource with an indexing engine like Google than with repositories for educational resources. Metadata and queries on this metadata should not be restricted to data describing the contents, because educational context is an important characteristic to consider when reusing learning material. However, in order to query this kind of data, the semantics of LOM must be considered.

3.6.2 Semantically Rich Queries

Metadata for educational resources usually contains more information than a content description. Such information (e.g., educational usage) is generally implicit in the document, and search engines are not able to grasp it. Educational metadata was especially designed for improving the retrieval of learning material. However, users are not comfortable with querying such data. First, most people are used to querying document content and not document context. Second, it requires understanding the semantics characterizing the context and knowing the associated vocabulary. In order to support users in formulating semantically rich queries, most LO repositories provide forms. These are restricted versions of the forms used for metadata editing, except that they accept restrictions instead of values as input. Lists of vocabulary terms attached to each metadata element are intended to guide users in formulating their queries (e.g., Figure 2.1 on page 33). Nevertheless, (Najjar et al., 2005) claims that the complex structure of the forms and the wide range and unfamiliarity of vocabulary terms hinder the generation of semantically rich queries. This section reviews the available methods supporting users in generating semantically rich queries without running into these problems. First, keyword queries targeting all the semantics of metadata for educational resources are discussed. Then, recommender systems are introduced since their purpose is the generation of semantically rich queries. Finally, approaches supporting the user in generating such queries are presented.

3.6.2.1 Keyword Queries targeting Metadata Semantics

According to (Najjar et al., 2005), keyword queries should target all the elements of LO metadata. In particular, they may focus on semantically different attributes of the metadata (e.g., keywords, title, and also interactivity type and learning resource type). In keyword queries, targeted attributes are not specified. For instance, a user looking for an educational resource in order to make an interactive simulation in Java about the life of beavers could formulate her request as active java simulation beaver life. Here, active stands for the interactivity type, java simulation for the learning resource type, and beaver life for the keyword or title. This approach has the advantage that users do not have to design long queries; they just need to summarize their demand. However, to generate such queries users obviously need sufficient knowledge of the semantics of learning object metadata, because some specific vocabulary (e.g., active) is not especially intuitive. Moreover, not specifying the semantics of the query terms can lead to confusion due to possibly overlapping categories. Indeed, the example query active java simulation beaver life may match a simulation, intended for life-long learning, on using ActiveXML (a Java library based on another Java library called Beaver) for pushing information into XML documents. Considering such cases, it seems preferable to limit the use of keyword queries to basic requests. Natural language processing is an interesting trade-off between keyword queries and semantically rich queries (Woodley and Geva, 2004). However, this approach has not yet been explored in the field of LO metadata retrieval.


3.6.2.2 Recommender Systems

Recommender systems are basically programs that try to match a certain document with a user. Thus, the main task they accomplish is trying to answer the question: “given a certain document, how interesting or useful may it be for a certain user?”. Recommender systems may be based on content analysis or collaborative filtering. In the first case, the content of a document is usually analyzed automatically to extract its relevant characteristics by applying different metrics. These characteristics are then compared with those desired by the user. In the second case, the systems base their decisions on opinions previously given by other users. Both types of systems apply user-modeling methodologies. Some techniques that have been successfully used to accomplish this task are Markov chains (Wong and Butz, 2000) and similarity metrics (Savia et al., 1998).

A variety of collaborative filters or recommender systems have been designed and deployed. The Tapestry system relied on each user to manually identify like-minded users (Goldberg et al., 1992), and it is one of the earliest implementations of collaborative filtering-based recommender systems. However, since this system depends on each person knowing the others, it is not suitable for large user communities. Thereafter, several rating-based automated recommender systems were developed. The work of (Breese et al., 1998) identifies a general class of collaborative filtering algorithms called model-based algorithms. The authors describe and evaluate two probabilistic models, which they term the Bayesian clustering and Bayesian network models. In the first model, like-minded users are clustered together into classes. Given her class membership, a user's ratings are assumed to be independent (i.e., the model structure is that of a Naive Bayes network). The second model also employs Bayesian networks, but of a different form. Variables in the network are titles and their values are the allowable ratings. Bayesian networks create a model based on a training set with a decision tree at each node and edges representing user information. The model can be built off-line in a matter of hours or days. The resulting model is very small, very fast, and essentially as accurate as nearest neighbor methods (Breese et al., 1998). Another technology that has been used is Horting, a graph-based technique in which nodes are users and edges between nodes indicate the degree of similarity between the users (Aggarwal et al., 1999). Predictions are produced by walking the graph to nearby nodes and combining the opinions of the nearby users.

As with any document, recommender systems can be used with learning objects. They would make a difference in locating appropriate learning material, especially if they are based on the opinions of other users who have similar profiles and/or are “culturally” near, thus reducing the subjectivity of interpreting the metadata describing the educational contribution of the object. For example, (Recker and Wiley, 2001) describes a collaborative recommender system for retrieving LOs. This system is based on a set of metadata designed for that purpose, including usability, authoritativeness, educational relevance, description and quality. (Baloian et al., 2004a) proposes a recommender system for learning material based on the characteristics of the metadata as well as the recommendations of other users with similar profiles. The system uses the LOM metadata schema for characterizing the LOs.

(Brooks and McCalla, 2006; Brooks et al., 2006) proposes to extend the LOM model in order to capture information about the usage context of a LO. This extension is based on a user model ontology and a learning activity ontology. This information is used to support the learning process (e.g., recognizing weak learning paths, distinguishing between successful and unsuccessful learning paths, uncovering difficulties).

(Ochoa and Duval, 2006b) suggests a set of metrics for recommending LOs. These metrics deal with attention metadata, i.e., metadata characterizing the interactions between users, documents, applications, and surrounding contexts. This work is based on the attention metadata set defined in (Najjar et al., 2006). This set takes into account the actions of creating, labeling, inserting, searching, recommending, browsing, selecting, publishing, sequencing, viewing, annotating, and retaining. Analyzing these actions, (Ochoa and Duval, 2006b) proposes a method to measure the similarity between LOs, users, or authors. The concept of a recommender system based on these similarity metrics is also presented in this work.

3.6.2.3 Supporting the Generation of semantically rich Queries

Generation of semantically rich queries may also be supported at user level.

(Känsälä and Hyvönen, 2006) presents a semantic view-based portal utilizing LOM. In this system, the user interface has several views (facets), offering different perspectives on the content. A view (facet) may be a category hierarchy linked to ontology concepts. For each category, the user can see in advance the number of hits that would result if that category were selected next to constrain the search. Categories that produce no results are hidden in the user interface, which eliminates dead-end situations during searching. “A benefit of the view-based search paradigm is that user can select categories of different views simultaneously. Each selected category works as search constrain so that only [documents] belonging to all selected categories are shown.” (Känsälä and Hyvönen, 2006).
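The view-based search principle can be illustrated with a short Python sketch. It does not reproduce the cited portal: documents are assumed to be flat dictionaries with one category per facet, whereas the actual system relies on category hierarchies linked to an ontology, and all names are hypothetical.

```python
def matches(doc, selected):
    """A document is shown only if it belongs to all selected categories."""
    return all(doc.get(facet) == value for facet, value in selected.items())


def facet_counts(docs, selected, facet):
    """Number of hits each category of `facet` would yield if selected next.

    Categories that would produce no result never appear in the returned
    dictionary, which corresponds to hiding them in the interface.
    """
    counts = {}
    for doc in docs:
        if matches(doc, selected) and facet in doc:
            counts[doc[facet]] = counts.get(doc[facet], 0) + 1
    return counts


docs = [
    {"subject": "biology", "type": "simulation"},
    {"subject": "biology", "type": "text"},
    {"subject": "physics", "type": "text"},
]
print(facet_counts(docs, {"subject": "biology"}, "type"))
print(facet_counts(docs, {"type": "simulation"}, "subject"))
```

With the second call, the physics category is absent from the result because no physics simulation exists, so the user cannot reach an empty result set.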

(Pinkwart et al., 2004) defines a system complying with this proposition. Their work is implemented over the CoolMode platform, which enables collaborative learning. A user working within this environment has the option of asking “is there something in the repository similar to the object I am working with?” without providing additional information. Then, the metadata values of the current document are used to formulate the query to the resource repository. Users may also specify the required level of similarity by qualifying metadata elements as “free”, “required”, or “not required”. To summarize, the current material defines a specific working context, which is used by the system in order to automatically generate a semantically rich query tuned to the user's expectations. Nonetheless, this system is implemented for supporting a specific context of collaborative learning.

(Farrell et al., 2004) also suggests a system intended to support learners in retrieving LOs. The learner enters topic keywords (e.g., “java xml”), an optional desired course duration (e.g., “30mn”), and an optional search scope (e.g., “overview”); the system then dynamically generates a course based on several fine-grained learning objects according to these characteristics. This system uses a concept graph and LOM semantics in order to make this assembly. In particular, the LOM semantics are used to identify the rhetorical relations linking the LOs. Based on these relations, the system may build a logical sequence of LOs that could directly be used by the learners.

(Verbert et al., 2005) proposes a system allowing authors to retrieve LOs directly from the lesson authoring environment. The considered LOs conform to the ALOCoM model described in Section 2.3.1. In particular, it proposes an MS PowerPoint add-in that enables authors to search the repository for LO components they wish to repurpose in the slide presentation they are working on. The lesson author using this system can specify the type of component she is interested in (e.g., reference, definition, example, slide, or image), as well as keywords that best describe the component. All the components satisfying the specified search criteria are added to an additional panel integrated in PowerPoint. The content of the elements of the panel can be pasted into the authored lesson with a single mouse click.

3.6.3 Query Result Processing

Result processing basically consists of organizing and presenting the elements retrieved by a query in order to facilitate their analysis. As a result of this analysis, one or more learning resources are chosen and (re)used. The user can also reformulate the query in order to better match her expectations.

3.6.3.1 Results Ranking

Results ranking is the most common task of information retrieval. In web search engines, web pages are indexed via web crawling. When a user issues a query, term indexes are used to locate the documents where the query terms appear. The documents are then sorted in decreasing order of relevance according to different criteria: page authoring, frequency and importance of query terms in the page, and so on. It should be noted that a document could be relevant even if not all keywords appear in it (Baeza-Yates and Ribeiro-Neto, 1999).

In databases and LO repositories, the answer to a query basically consists of the set of documents exactly matching the query. (Hiddink, 2001) suggests a distance measure to rank the results in a finer manner than “perfect matching”. This distance measure relies on assigning weights to the search criteria, i.e., the LOM attributes used in the search. These criterion weights are defined for each search in order to make the ranking meaningful. Nevertheless, (Hiddink, 2001) notes that criterion weights sometimes penalize LOs that do not perfectly suit the teaching situation but are of interest to the teacher.
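To make the idea of criterion weights concrete, the following Python sketch ranks records by a weighted mismatch distance. It is an assumed, simplified formula and not the measure defined by (Hiddink, 2001): the distance is taken here as the share of the total weight carried by the unmatched criteria, and the element names and weights are hypothetical.

```python
def distance(record, criteria, weights):
    """Share of the total criterion weight that the record fails to match.

    0.0 means a perfect match; 1.0 means that no criterion is satisfied.
    """
    total = sum(weights[c] for c in criteria)
    missed = sum(weights[c] for c, wanted in criteria.items()
                 if record.get(c) != wanted)
    return missed / total


def rank(records, criteria, weights):
    """Sort records from the closest to the farthest from the query."""
    return sorted(records, key=lambda r: distance(r, criteria, weights))


criteria = {"educational/context": "higher education",
            "technical/format": "text/html"}
# Weights chosen for this particular search: context matters most.
weights = {"educational/context": 3, "technical/format": 1}
records = [
    {"id": 1, "educational/context": "school",
     "technical/format": "text/html"},
    {"id": 2, "educational/context": "higher education",
     "technical/format": "application/pdf"},
    {"id": 3, "educational/context": "higher education",
     "technical/format": "text/html"},
]
print([r["id"] for r in rank(records, criteria, weights)])
```

Unlike exact matching, which would return only the third record, this ordering also keeps the partially matching records and places the one with the wrong format ahead of the one with the wrong context.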

(Ochoa and Duval, 2006b) presents an alternative to the PageRank algorithm used in most search engines: Link Analysis-based Ranking. This method deals with attention metadata recording the interactions of LOs with users, applications and other LOs. In particular, it combines various perspectives for defining the popularity of a LO, e.g., popularity based on the LO reuse frequency or popularity based on user votes.

Result ranking may also be related to the evaluation techniques of recommender systems. In such approaches, the evaluation of resources is based on profile analysis and/or pattern matching.

3.6.3.2 Results Visualization

Since metadata of educational resources have explicit semantics, the use of information visualization methods can help the user to analyze the result set. For instance, metadata elements referring to a hierarchy may be used to visually organize sets of documents. (Klerkx et al., 2004) suggests using advanced hierarchy visualization systems like a tree-map. A tree-map is a visualization technique for hierarchical structures that attempts to use the complete display area. With this method, they made a portal for the Ariadne repository (Ariadne, 2007) in which the results of a query are visualized within the tree-map reflecting the domain taxonomy of the whole repository. Hence, the results of a query are contextualized.

Another interesting approach may focus on facilitating result overview. In particular, the various semantics of metadata should be explored and compared. For example, users may benefit from comparing the characteristics of various instances of metadata. In most e-commerce web sites, this feature consists in displaying the characteristics as lists. Using more refined techniques may certainly enhance results comparison. The Periscope 3D Search System (Wiza et al., 2004) enables a flexible visualization of various characteristics of large sets of documents. However, this complex technique may require a substantial training period.


3.6.3.3 Query Reformulation

When the user of a LO search engine (e.g., a teacher) is not satisfied with her results, she can modify her query. As a result of frequent web search engine use, users tend to reformulate their queries more often. Query reformulation can also be an automatic process, or a semi-automatic one through user interaction (Baeza-Yates and Ribeiro-Neto, 1999). Most people are able to reformulate a keyword query when using search engines like Google or Yahoo!. However, when using semantically rich queries, the reformulation process may be a tedious task, since the user has to consider each element of the metadata. An alternative to this process is relaxation. Relaxing a query means weakening some of its constraints in order to widen the scope of results. This process is commonly used when querying the semantic web (Stuckenschmidt, 2003). An interesting point is that the relaxation of semantically rich queries for LO metadata may follow some predefined patterns. For instance, the elements of the query constraining the categories general, lifeCycle, technical, and classification may be relaxed in order to enlarge the search to educational resources matching a certain pedagogical context but not limited to a specific discipline or format. The LOs resulting from this relaxation process may offer interesting hints for defining methods supporting the particular educational objective of the authored lesson. The development of a set of pedagogically sound relaxation strategies may be a powerful tool for using the semantic aspect of LOM.
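The relaxation pattern mentioned above can be sketched in a few lines of Python. The query representation (a dictionary keyed by "category/element" paths), the element names, and the values are assumptions chosen for illustration; only the four relaxed categories come from the text.

```python
# Categories that the text suggests relaxing to keep only the pedagogical context.
RELAXABLE_CATEGORIES = {"general", "lifeCycle", "technical", "classification"}


def relax(query, relaxable=RELAXABLE_CATEGORIES):
    """Drop every constraint whose top-level LOM category is relaxable."""
    return {
        path: value
        for path, value in query.items()
        if path.split("/", 1)[0] not in relaxable
    }


# Hypothetical query: keys are "category/element" paths, values are required values.
query = {
    "general/language": "en",
    "technical/format": "application/pdf",
    "classification/keyword": "java",
    "educational/difficulty": "easy",
    "educational/interactivityType": "active",
}
relaxed = relax(query)
```

Here the relaxed query keeps only the two educational constraints, so the search is no longer limited to a specific language, format, or discipline.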

3.6.4 Conclusion

Creating semantically rich queries may be an even more tedious process than creating metadata for sharing learning material. Therefore, it is necessary to offer simple means to query LO repositories, like keyword query interfaces. However, such approaches generally tend to disregard the semantic aspect of metadata for educational resources. Discarding this characteristic results in under-using the possibilities of LOM.

Other approaches like recommender systems and support for semantic-based queries can offer new perspectives to the teachers and learners wanting to retrieve learning material. Restricting users to keyword queries may improve the usability of LO repositories, but semantics support is still needed in order to enhance the retrieval process.

3.7 Summary

LO metadata is without doubt a keystone for promoting reuse of LOs. Paradoxically, it is also the bottleneck of this process, since many authors are not comfortable with the work of producing meaningful metadata for their learning objects (Section 3.3).

Automatic generation of metadata seems to be, if not a definitive solution, a significant help to simplify this work (Section 3.4). In fact, it is possible for some metadata to be automatically generated by the authoring tool used to create the associated LOs. For example, digital format, weight, author’s name, and even language can be derived from the environment in which the object is being created. However, there is a set of metadata that is difficult to generate automatically: it includes information related to the educational characteristics of the LO such as difficulty, interactivity type, and required learning time. Nevertheless, there are some possibilities of inferring the values for this metadata from the context in which the learning object has been, is, or is intended to be used (Section 3.4.4). This information can be supplied by people who have already used it in a so-called recommender system. Another interesting approach for deriving the values for the metadata of one LO is to gather information from the metadata of other LOs that are used in conjunction with this one. This approach is reasonable because LOs are seldom used in isolation: a learning unit typically contains several LOs. Furthermore, learning units are generally structured, thus their LOs are inter-related (Section 2.3).
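As a minimal sketch of how an authoring tool might derive such objective metadata, the Python function below reads the format and weight from the file itself and takes the author and language from the tool settings. The function name, the dictionary keys, and the sample file are hypothetical; they do not describe any tool discussed in this thesis.

```python
import mimetypes
import os
import tempfile


def derive_objective_metadata(path, author, language):
    """Collect metadata that can be read from the file and the tool settings."""
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "format": mime_type,            # None when the file extension is unknown
        "size": os.path.getsize(path),  # weight of the resource in bytes
        "author": author,               # assumed to be known by the authoring tool
        "language": language,           # assumed to be known by the authoring tool
    }


# Hypothetical learning resource written to a temporary file for the example.
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as handle:
    handle.write(b"<p>Constructors</p>")
    sample_path = handle.name
metadata = derive_objective_metadata(sample_path, author="A. Teacher", language="en")
os.remove(sample_path)
```

Educational attributes such as difficulty or required learning time have no counterpart in this function, which reflects the limitation stated above: they cannot be read from the file or its environment.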

LOM validation relies on three aspects: completeness, accessibility, and correctness (Section 3.5). The first two points can be supported by automatic processes. Nevertheless, the third aspect, correctness, is difficult to manage: Automatic processes can only be a partial support for this task, and human verification remains the most reliable method.

In order to retrieve learning material, data mining methods typically used in information retrieval are not sufficient. Indeed, these methods cannot extract implicit information, e.g., most pedagogical data, from the material contents. In contrast, metadata have the advantage of holding such information. Moreover, LO metadata are semantically rich data. This characteristic offers interesting perspectives for LO retrieval (Section 3.6.3). Nevertheless, creating semantically rich queries may be an even more tedious process than creating metadata for sharing learning material. Therefore, it is necessary to offer simple means to query LO repositories, like keyword query interfaces. Restricting users to keyword queries may improve the usability of LO repositories, but semantics support is still needed in order to enhance the retrieval process. Consequently, complementary approaches like recommender systems and automatic systems for generating semantically rich queries may emerge as fundamental tools in order to support users in retrieving learning material.

Chapter 4

The Learning-Object Metadata Usage Problem

This chapter relates the LOM usage during lesson authoring to the literature review presented in the previous chapter. This analysis states the work hypothesis of this thesis as well as its main research questions. Finally, our approach is presented.

4.1 Problem Statement

The literature identifies three main issues concerning LOM usage based on human contribution: First, LO metadata generation is a tedious process (Section 3.3.1). Second, produced metadata are generally incomplete or missing; they may also be incorrect (Section 3.3.2). Third, LO authors receive little direct reward for the metadata they generate (Section 3.3.3).

Linking these issues together with causal relations, it appears that LO metadata usage presently stands in the process flow depicted in Figure 4.1: The process of generating metadata is tedious and complex (Step 1). For that reason, produced metadata has poor quality in general (Step 2). Therefore, it cannot be correctly processed when trying to retrieve a suitable LO using LOM (Step 3). Thus, metadata may not provide interesting benefits for the user (Step 4). Consequently, generating metadata is considered a tedious process (Step 1). As we can see, unfortunately, this sequence looks like a vicious circle. In the remainder of this thesis, such LOM usage based on human contribution will be called human LOM usage.


[Figure 4.1 diagram: a cycle of four steps. 1. tedious process of generating metadata → 2. poor quality of produced metadata → 3. incorrect/incomplete processing of metadata → 4. few benefits from metadata → 1.]

Figure 4.1: Human-based usage of learning-object metadata

4.2 Existing Approaches

Chapter 3 showed two main research directions attempting to solve the LOM usage problem. The first direction consists in automating the whole metadata process: human beings should not be responsible anymore for generating LOM, and the use of LOM for querying LO repositories should be transparent. Following this direction, the process of generating metadata is entirely automatic, the quality of the produced metadata is controlled up to a certain acceptable level, and the processing of metadata reaches a constant quality. In this scenario, the benefits of metadata do not influence the generation process anymore (see Figure 4.2). This LOM usage exclusively based on automatic processes is called automatic LOM usage.

[Figure 4.2 diagram: a chain of four steps. 1. automatic process of generating metadata → 2. known quality of produced metadata → 3. constant-quality processing of metadata → 4. constant-quality benefits from metadata.]

Figure 4.2: Automatic usage of learning-object metadata

The second research direction focuses on combining automatic processes and human intervention when using LOM: It is called hybrid LOM usage. Such a usage aims at supporting the following process flow: Metadata generation relies on a human intervention mostly facilitated by automatic systems. The metadata quality validation also depends on this interaction. In this situation, metadata processing reaches a reliable level. Thus, metadata can provide noticeable benefits for both metadata creators and end-users. In turn, these benefits motivate the generation of accurate metadata (see Figure 4.3).

[Figure 4.3 diagram: a cycle of four steps. 1. supported metadata generation process → 2. supported metadata validation process → 3. reliable processing of metadata → 4. noticeable benefits from metadata → 1.]

Figure 4.3: Hybrid usage of learning-object metadata


4.3 Remaining Issues

In practice, both research approaches present limitations as this section describes.

On the one hand, several research efforts focus on automatic LOM usage. The automatic approach provides significant support for generating most objective metadata attributes, but generally fails at accurately generating the attributes having subjective values such as description, coverage, interactivity type, semantic density, difficulty, intended end-user role, and typical learning time (Section 3.4).

Automatic LOM usage is a great improvement over human LOM usage, as shown in Figure 4.2. Nevertheless, it implies abandoning the metadata that it fails to handle. For now, it is difficult to evaluate the impact of this loss, since neither the usefulness of subjective metadata like the educational ones has been proved (Section 3.3.3), nor has their uselessness been stated and demonstrated. In fact, the usage of such metadata attributes remains to be investigated.

On the other hand, hybrid LOM usage is free of the computational limitations of automatic processes since it considers human participation. Therefore, hybrid LOM usage can support generation and validation of subjective LOM attributes, as explored by various studies (Sections 3.4.4 and 3.5). In this sense, hybrid LOM usage is complementary to automatic LOM usage. Nevertheless, it involves the contribution of often uncooperative users who need rewards to motivate their participation (Section 3.3).

This thesis chooses to focus on hybrid LOM usage. It studies this approach not only from the perspective of hybrid LOM generation and validation, but also from the perspective of the rewards for lesson authors generating LOM.

Hybrid LOM usage implies human participation. Furthermore, LO authors are generally responsible for characterizing their production (Section 3.3.1.1). Therefore, we propose to study the LOM usage problem directly from the lesson authoring setting. However, this strategy requires removing the technical barriers that LOM usage induces. The literature shows various efforts trying to remove these obstacles (Sections 3.4, 3.5, and 3.6). Nevertheless, LOM generation support and LOM validation support still remain outside the lesson authoring process in the available authoring tools (Section 3.3.1.2).

Some hybrid systems present accurate support for generating LOM attributes that fully automatic systems fail to produce (Section 3.4). The most promising systems for the lesson authoring context gather information from the metadata of other LOs that are used in conjunction with the LO for which the metadata generation is facilitated (Section 3.4.4.2). This approach is reasonable because LOs are seldom used in isolation: A learning unit typically contains several LOs. It is also reasonable to consider that the neighboring LOs may lack metadata or be instantiated with incomplete or incorrect metadata during lesson authoring (Section 3.3). However, the existing systems do not deal with this situation.

LOM is intended to enhance LO retrieval, but taking advantage of the LOM semantics for querying LO repositories is a difficult task (Section 3.6.2). For that reason, automatic generation of semantically rich queries appears as a fundamental direction to enhance LO retrieval (Section 3.6.2.3). However, no LO retrieval improvement during lesson authoring has been measured with the existing approaches. In addition, no other beneficial use of LOM is proposed for lesson authors than facilitating LO retrieval or query result exploration.

4.4 Work Hypothesis

Transforming a vicious circle like the one of Figure 4.1 into the process flow of Figure 4.3 requires transforming all the flow steps at once if we want to avoid entering the vicious circle again.

Steps 1 and 2 imply supporting LOM generation and LOM validation. In the context of lesson authoring, these requirements concern the lesson authoring tools. This topic leads to the research question: How to seamlessly integrate hybrid LOM generation and hybrid LOM validation into a lesson authoring tool? (RQ1). Since hybrid LOM usage requires human intervention, this research question is confronted with the current lack of lesson authoring tools removing the barrier between human LOM generation and validation, and the lesson authoring process. For that reason, the introduction of human LOM usage into lesson authoring should be studied first. Next, hybrid LOM generation and validation can be discussed.

Step 3 involves the reliability of processing LOM. In the lesson authoring context, this reliability is mostly reduced by missing, incomplete, or incorrect metadata perturbing LOM-based processes. For that reason, a study of hybrid LOM usage during lesson authoring should consider the following research question: How can processes dealing with LOM semantics cope with incomplete, missing, or incorrect metadata? (RQ2). However, as stated above, no existing system for LOM processing brings an acceptable answer to this question.

Step 4 deals with LOM benefits. In the lesson authoring context, benefits should concern lesson authors. Thus, the following research question is stated: How can lesson authors benefit from the LOM they generate? (RQ3). Nevertheless, the literature does not suggest other LOM benefits for lesson authors than the potential to enhance LO retrieval during lesson authoring and query result characterization. Furthermore, no work about LO retrieval has shown that LOM generated by lesson authors can improve the retrieval of new LOs for the authored lesson.

Answering these three research questions would show that it is technically possible to turn each step of the vicious circle of LOM usage during lesson authoring into a positive situation. The aim of this thesis is to study these three research questions.

4.5 Work Strategy

This thesis makes proposals for answering the three research questions presented above. The feasibility of the proposals is validated by a formal model when dealing with algorithmic processes, and also by software prototypes. The prototypes also serve to evaluate the performance of the proposals in comparison with existing approaches, when such approaches exist.

Since this work focuses on the lesson authoring context, we develop an open-source lesson authoring tool that serves as a test platform for experimenting with hybrid LOM usage during lesson authoring. In practice, all the software prototypes developed in this thesis are built upon this tool.

The proposals of this thesis are sufficiently generic to adapt to various teaching styles. Nevertheless, this thesis applies them to the specific case study described below in order to experiment with them.

4.6 Case Study

At the Engineering School of the Universidad de Chile, 600 first-year students take a two-semester introductory course on Java programming. Students are divided into six groups. Five different teachers are responsible for guiding them. Each teacher can organize the course as she wishes. Nevertheless, in order to ensure a similar learning pace, common exams are taken every two months. Each teacher uses her own material but remains interested in reusing material from her colleagues. Such reuse would save her much time, which could be used, e.g., for defining elaborate pedagogic activities. Moreover, teachers are also interested in having access to additional material when students react unexpectedly to the planned course.


Such a teacher community needs to share precise material about the same general topic: All community participants teach Java programming and they would like to benefit from the work of their colleagues. On the one hand, they are not especially interested in using entire lessons prepared by other teachers, since most teachers prefer that their lessons reveal their own teaching style. On the other hand, they are really motivated to reuse small pieces of material, e.g., a meaningful example of a certain topic or an interesting diagram.

4.7 Conclusion

This chapter stated hybrid LOM usage as the main research direction of this thesis. In order to know if it is possible to improve such a LOM usage during lesson authoring, three research questions should be answered: RQ1: How to seamlessly integrate hybrid LOM generation and hybrid LOM validation into a lesson authoring tool? RQ2: How can processes dealing with LOM semantics cope with incomplete, missing, or incorrect metadata? RQ3: How can lesson authors benefit from the LOM they generate?

The remainder of this thesis focuses on four main topics in order to investigate these research questions. First, the human LOM usage during lesson authoring is studied in order to prepare the answer to RQ1. A favorable setting for introducing LOM usage into lesson authoring is discussed. Next, methods for integrating human LOM usage in such a setting are proposed. Second, LOM processing during lesson authoring is explored. This study results in a formal model for processing semantics of possibly incomplete, missing, or incorrect LO metadata during lesson authoring. This model aims at answering RQ2. Concrete model applications are also proposed. Third, methods for supporting hybrid LOM usage during lesson authoring are studied. In particular, the LOM processing model applications together with the methods for integrating human LOM usage into lesson authoring are used for enabling hybrid LOM generation and validation during lesson authoring. These proposals serve to answer RQ1. Fourth, applications rewarding lesson authors for generating good-quality LOM are presented. This last part deals with answering RQ3.

Chapter 5

Introducing Human LOM Usage into Lesson Authoring

This chapter deals with the research question RQ1: How to seamlessly integrate hybrid LOM generation and hybrid LOM validation into a lesson authoring tool? Since hybrid LOM usage requires human intervention, the integration of hybrid LOM usage requires first integrating human LOM usage into lesson authoring. This is the aim of this chapter.

Seamlessly integrating human LOM usage into lesson authoring implies that human LOM instantiation should occur directly in the lesson authoring tool and during the lesson authoring task. For that purpose, this chapter discusses (1) the creation of a lesson authoring tool in which a lesson is a LO made of smaller-grained LOs and (2) a LOM instantiation interface considering the specific situation of lesson authoring.

First, the choice of a certain setting for organizing lessons based on LOs is discussed with regard to the related work presented in Section 2.3. This entails the definition of the notions of lesson graphs and LO graphs. Afterward, the authoring of LO graphs is described from a practical point of view: A tool for authoring LO graphs is introduced. Then, the human LOM usage in a LO graph is discussed. Finally, LO sharing and retrieval in such a setting are considered.


5.1 Organizing a Lesson with Learning Objects

In Section 2.3, educational resource organization was reviewed. It was shown that the organization of learning objects as a lesson concerns three main topics: (1) the hierarchical model, (2) the sequencing model, and (3) the visual structure. The literature about learning object organization contains various diverging models. Finding which model best suits the lesson authoring context is out of the scope of this thesis. Instead, this thesis focuses on finding a generic organization model that suits as many settings as possible. This model consists of:

A hierarchical structure based on nested graphs. Section 2.3.1 showed that nested graphs constitute a simple, generic model for organizing learning objects. Since this thesis does not impose a certain granularity on the content of learning objects (Section 2.2), the tolerance of the nested graph model on this topic seems appropriate.

A sequencing strategy based on the semantics of the relations. Section 2.3.2 described that semantically typed relations are used to serve various purposes. While we define a set of semantic, organizational, and rhetorical relations in order to suit the needs of the teacher community concerned by our case study, all the methods that are proposed in this thesis and that use the relation semantics do not depend on this special set: They can adapt to other specific needs expressed by other relation semantics.

A visual structure based on graphs. Section 2.3.3 showed graphs to be the most generic structure. Indeed, graphs can be used in order to display linear or hierarchical views, while linear or hierarchical views may fail to represent graphs.

5.2 Lesson Graph

Definition 5.1 (Lesson Graph) A lesson graph is a graph in which the nodes are educational resources or activities standing as the content of a lesson. The edges of a lesson graph represent the relations linking the educational resource content.

Many authors have chosen the graph as the most suitable way of structuring the learning material of computer-based learning systems whenever adaptability and flexibility of the learning material is required (e.g., (McCalla, 1992; Fischer, 2001; Baloian et al., 2000)). Indeed, the graph structure enables a non-linear exploration of the lesson. This feature is particularly interesting when a system needs to adapt the lesson to the profile of a certain learner (Murray, 1999; Brusilovsky, 2004) or a certain teacher style (Baloian et al., 2000), since various paths may be followed in the same graph.

While the graph is a flexible view on a lesson, it is also more complex than classical, linear or hierarchical perspectives. On the one hand, the edges of a lesson graph may be untyped or decorated with free text like in most concept maps (Novak, 1998). On the other hand, the edges may be defined with machine-readable, predefined types. This is the case in most systems processing the relations linking graph nodes in order to deduce meaningful information (e.g., intelligent tutoring systems or adaptive hypermedia). This separation between untyped and typed relations linking the nodes of the graph might be compared with the difference between the actual structure of the Web (untyped graph of documents and concepts) and the typed structure of the semantic web (Berners-Lee, 1989).

[Figure 5.1 diagram: four nodes, L1 ("Traffic Light" Problem), L2 ("Traffic Light" Implementation), L3 (Object Instantiation), and L4 (Constructors), linked by edges labeled Solved By, Introduces To, Abstracted By, and Background For.]

Figure 5.1: Start of a lesson graph about “object instantiation”.

Figure 5.1 illustrates a lesson graph consisting of educational resources and relations among them. Four educational resources, labeled from L1 to L4, describe a part of a programming course for an object-oriented language. L1 describes the problem of how to coordinate traffic lights at a crossroad, and L2 presents the Java code of a program implementing the simulation. This problem is used to teach object instantiation in a program. L3 and L4 refer to documents defining object instantiation and the concept of constructors respectively. In this lesson graph, the relations linking the four educational resources together have explicit types.

5.3 Learning-Object Graph

Definition 5.2 (Learning-Object Graph) A learning object graph is a lesson graph in which the nodes are learning objects decorated with metadata (LOM). The edges are defined as part of the learning object metadata.


Figure 5.2: The relation category in the IEEE LOM specification

As described in Section 3.1, the IEEE LOM specification includes a category called relation (see Figure 5.2). Instantiating the relation category for a certain LO makes it possible to define a link between this LO and another LO: The relation/resource/identifier attribute references the target LO. The relation/kind element specifies the type of the relation that links the two LOs.
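The way the relation category defines the edges of a LO graph can be sketched as follows in Python. The nested-dictionary representation of a LOM record is an assumption made for this example; only the element names relation, kind, resource, and identifier come from the text, and the single edge shown (L1 introducesTo L3) is the one discussed later for Figure 5.3.

```python
def edges_from_lom(records):
    """Turn the relation entries of each LOM record into (source, kind, target) edges."""
    edges = []
    for source_id, lom in records.items():
        for relation in lom.get("relation", []):
            target_id = relation["resource"]["identifier"]
            edges.append((source_id, relation["kind"], target_id))
    return edges


# Hypothetical records keyed by LO identifier; only the relation category is shown.
records = {
    "L1": {"relation": [{"kind": "introducesTo", "resource": {"identifier": "L3"}}]},
    "L3": {"relation": []},
}
edges = edges_from_lom(records)
```

Each edge is thus stored with its source LO, which is why a learning object graph can be rebuilt from the metadata alone.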

5.3.1 Existing Types for the relation LOM Category

The relation types are generally based on a predefined vocabulary imported from DublinCore. The DublinCore specification recommends the use of six pairs of relation types for instantiating the relation/kind attribute:

Relation Type | Description
IsPartOf / HasPart | one resource is a physical or logical part of another.
IsVersionOf / HasVersion | one resource is an historical state or edition of another resource by the same creator.
IsFormatOf / HasFormat | one resource has been derived from another by a reproduction or reformatting technology which is not fundamentally an interpretation but is intended to be a representation.
References / IsReferencedBy | one resource cites, acknowledges, disputes or otherwise refers to another resource.
IsBasedOn / IsBasisFor | one resource is a performance, production, derivation, translation, adaptation or interpretation of another resource.
Requires / IsRequiredBy | one resource requires another resource for its functioning, delivery, or content and cannot be used without the related resource being present.

However, various authors claim that the relations proposed by DublinCore were not originally designed for educational purposes and are not well suited to cope with the requirements of lesson authoring (Engelhardt et al., 2006; Fischer, 2001). Therefore, other research work focuses on defining relation types for LO graphs. For instance, (Engelhardt et al., 2006) adds four relation types to those of DublinCore:



isNarrowerThan / isBroaderThan | standard taxonomic relation.
isAlternativeTo | alternative LOs are meant to be of equivalent content, pedagogical and structural properties, but may deviate in formats.
illustrates / IsIllustratedBy | expresses illustration in an open fashion.
isLessSpecific / isMoreSpecific | a more specific object may cover sub-aspects or the identical subject in more detail or exhibit a thematic overlap while being more specific.

Another definition proposal can be found in (Fischer, 2001). This work defines the basis of MultiBook, an adaptive hypermedia system used to teach multimedia technology. MultiBook uses metadata to create course sequences semi-automatically. In MultiBook, a graph of concepts stands in parallel with the graph of learning objects. Concepts are related with semantic relations while learning objects are related with rhetorical/didactic relations. This work does not use any relations of DublinCore for linking LOs. It defines its own relations: example, illustrates, restricts, amplifies, continues, deepens, opposite, alternative, and analogy.

In order to support the definition of Didactic Networks (Section 2.3.2), (Baloian et al., 1999) makes a proposal of relation types based on the extended taxonomy of link types proposed by Trigg (Trigg, 1983). These relation types were used in order to automatically define various teaching paths inside a lesson graph. According to the teacher preferences, the generated path may follow a deductive strategy (preferring paths sequencing theoretical material before practical material), an inductive strategy (preferring paths sequencing practical material before theoretical material), or a short version strategy (reducing the path to a few nodes defining an overview of the lesson). The relations used in this work are:


introducesTo | recommended for the beginning of a lesson and introduction of new topics.
refinedBy | the subject of the source is a part or a detail of the subject of the target (may be used to split a topic in various sub topics).
explainedBy | may be used to justify or support the idea of the source.
exemplifiedBy | may be used to illustrate the idea of the source’s subject.
summarizedBy | the target is a summary for all the nodes linked by this type of link.

5.3.2 Type Proposal for the relation LOM Category

The set of relation types suggested in (Baloian et al., 1999) was used in order to define a LO graph of almost 180 learning objects intended for use in an introductory computer programming course for freshmen in our University. During this experience, other relations were suggested and added to the set. Finally, this process resulted in a new set of relations emphasizing the semantical, rhetorical, and organizational aspects of course authoring. This relation set is shown in Table 5.1.

Relation Type | Description
introducesTo / introducesBy | This pair is recommended for the beginning of a lesson and the introduction of new topics.
assessedBy / asseses | They are used to link with a LO specifically intended to assess learners’ knowledge.
supportedBy / supports | They are recommended for linking with a LO that may serve as complementary support for achieving the pedagogical goal of the source LO.
abstractedBy / exemplifiedBy | They are used to link with a LO that generalizes or formalizes the concept induced in the source LO.
comparableWith | It is recommended when it is pedagogically relevant to compare two LOs.
backgroundFor / hasBackgroundIn | They are used to define that the knowledge provided by the source LO is necessary for using the target LO.
summarizedBy / summarizes | They are used when linking a LO that summarizes the content of one or various LOs.
solvedBy / solves | They are recommended for LOs that are tightly coupled: the definition of a problem given by a certain LO is solved by the content of a second LO.
hasPart / isPartOf | These relations are not defined at user level. They are used in the sense given by the DublinCore recommendation.
IsVersionOf / HasVersion | These relations are not defined at user level. They are used in the sense given by the DublinCore recommendation.

Table 5.1: Proposal of relation types. These types are used in the remainder of this thesis.

In this set, each relation has an inverse relation (e.g., isPartOf is the inverse relation of hasPart). Note that the comparableWith relation is its own inverse. In the remainder of this thesis, a relation from a LO a to another LO b implies an inverse relation between b and a.

Definition 5.3 (Learning-Object Graph) Learning-object graphs are graphs with bidirected edges.
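The inverse-relation convention and Definition 5.3 can be illustrated with a minimal Python sketch. It is not the LessonMapper2 implementation: the LOGraph class, its relate method, and the LO names L1 to L3 are hypothetical, and only the relation names come from Table 5.1.

```python
# Inverse pairs from Table 5.1 (relation names as spelled in this sketch).
PAIRS = [
    ("introducesTo", "introducesBy"),
    ("assessedBy", "assesses"),
    ("supportedBy", "supports"),
    ("abstractedBy", "exemplifiedBy"),
    ("backgroundFor", "hasBackgroundIn"),
    ("summarizedBy", "summarizes"),
    ("solvedBy", "solves"),
    ("hasPart", "isPartOf"),
    ("isVersionOf", "hasVersion"),
]

INVERSE = {"comparableWith": "comparableWith"}  # its own inverse
for left, right in PAIRS:
    INVERSE[left] = right
    INVERSE[right] = left


class LOGraph:
    """Hypothetical LO graph whose edges are always stored in both directions."""

    def __init__(self):
        self.edges = set()  # triples (source LO, relation type, target LO)

    def relate(self, source, relation, target):
        # Adding a relation implicitly adds its inverse (bidirected edge).
        self.edges.add((source, relation, target))
        self.edges.add((target, INVERSE[relation], source))


graph = LOGraph()
graph.relate("L1", "introducesTo", "L3")
graph.relate("L1", "comparableWith", "L2")
```

Storing both directions explicitly keeps lookups simple; deriving the inverse on demand would be an equally valid design.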

The proposed relation set deliberately leaves out relations indicating continuation (e.g., isFollowedBy). Since most lesson authors think about a lesson in its linear form, the possibility of expressing continuation is generally misused: continuation is treated as a generic relation although it gives almost no information on the context surrounding a certain LO in the graph.




[Figure 5.3 diagram: nodes L1 to L6, including the LOs Constructors and Constructors Overloading, linked by Solved By, Introduces To, Abstracted By, Exemplified By, and Background For relations.]

Figure 5.3: Start of a LO graph about “object instantiation”

Figure 5.3 illustrates a lesson graph showing the start of a programming course for an object-oriented language. In this lesson graph, the L1 to L4 educational resources have content similar to those of Figure 5.1. However, in this case, L1 to L4 are LOs decorated with associated metadata (LOM). Each relation is now defined with the relation category of LOM and follows the relation types defined in Table 5.1. L5 is a node to which the actual learning material has not been assigned yet. L6 is a LO of coarser granularity and acts as a container for L1 to L5. This characteristic implies that there are implicit, mutually inverse hasPart and isPartOf relations between L6 and all the other LOs of the lesson graph. Similarly, when a relation between two LOs is defined using a certain relation type of our relation set, its inverse relation is implicitly created.

In general, a LO graph as a whole is a lesson unit. This lesson unit is also a LO. Part/whole relations link this coarse-grained LO to the graph of fine-grained LOs: all the LOs of a LO graph are directly or transitively connected to the lesson LO with isPartOf/hasPart relations. Consequently, since all edges of a LO graph are bidirected, most LO graphs contain cycles. For instance, in Figure 5.3, L1 is connected to L3 with an introducesTo relation, L3 is connected to L6 with an isPartOf relation, and L6 is connected to L1 with a hasPart relation.

Since the proposed set of relations is the result of an inductive process based on the development of a computer science course at our institution, it is probable that this set does not suit all pedagogical contexts. For this reason, the proposals of the next chapters work with the relation semantics but do not depend on a particular set of relation types. Nevertheless, the relation set that this section suggests is used for describing these proposals and experimenting with them.


5.4 Authoring a LO Graph

This section deepens the concept of LO graph from a practical perspective. It presents LessonMapper2, a software prototype developed for this thesis. LessonMapper2 is a Java-based graphical application for building LO graphs. First, this section explores the notion of LO in LessonMapper2. Next, it discusses the notion of nested graphs, which is the hierarchical structure chosen in Section 5.1.

5.4.1 LO as Node

In a LO graph, each node refers to a LO and an associated LOM. In LessonMapper2, a LO may be an external resource of any type (e.g., an image, a slide, a web site, or an activity designed with the learning activity builder LAMS) or a nested LO graph, as described in the next subsection. As shown in Figure 5.4, the title and a thumbnail of the LO content decorate each node. Relations are chosen from the following subset of Table 5.1: introducesTo, assessedBy, supportedBy, abstractedBy, exemplifiedBy, comparableWith, backgroundFor, summarizedBy, solvedBy. This set includes neither inverse relations (apart from abstractedBy/exemplifiedBy) nor isPartOf/hasPart/isVersionOf/hasVersion relations. The excluded relations are automatically generated by the system but are not displayed on the LO graph in LessonMapper2.

Figure 5.4 shows a LO graph in which the sequence is visually defined by the user. In this case, LOs are sequenced clockwise. Nevertheless, the relations linking the objects suggest that there is more than one possible navigation path in this LO graph. In fact, only two parts of the graph restrict the sequence of LOs: (1) The introducesTo relation between the LOs Historia and Versiones implies that the former LO should be presented before the latter one. (2) The backgroundFor relation between the Servlet and JSP LOs also implies that the former LO should be presented before the latter one.
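The two ordering constraints above can be checked mechanically. The following Python sketch (Python 3.9 or later, for the standard graphlib module) is only an illustration: it assumes that introducesTo and backgroundFor are the only relation types that force an order, and the function names are hypothetical. Only the four LO names and the two relations come from the example.

```python
from graphlib import TopologicalSorter

# Assumption of this sketch: only these relation types force an order.
ORDERING_RELATIONS = {"introducesTo", "backgroundFor"}

edges = [
    ("Historia", "introducesTo", "Versiones"),
    ("Servlet", "backgroundFor", "JSP"),
]


def precedence_pairs(edges):
    """Return (before, after) pairs implied by the ordering relations."""
    return [(s, t) for s, rel, t in edges if rel in ORDERING_RELATIONS]


def respects_relations(sequence, edges):
    """True when every 'before' LO appears earlier than its 'after' LO."""
    position = {lo: index for index, lo in enumerate(sequence)}
    return all(position[a] < position[b] for a, b in precedence_pairs(edges))


def some_valid_sequence(nodes, edges):
    """Build one sequence compatible with the constraints (many may exist)."""
    sorter = TopologicalSorter({node: set() for node in nodes})
    for before, after in precedence_pairs(edges):
        sorter.add(after, before)  # 'after' depends on 'before'
    return list(sorter.static_order())


nodes = ["Historia", "Versiones", "Servlet", "JSP"]
sequence = some_valid_sequence(nodes, edges)
```

Any sequence accepted by respects_relations is a legitimate navigation path, which reflects the point that the graph admits several of them.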

In LessonMapper2, each LO is associated with a concrete document (e.g., via drag and drop). This association may consist of a reference to the document or a copy of this document included in the LO graph package. Once associated with a graph node, documents may be opened directly from the graph (e.g., by double-clicking on the node). They can also be edited with the application that the user's operating system has assigned to this type of document (e.g., slides will be opened and edited with MS PowerPoint, OpenOffice, or Apple Keynote). Further work is planned in order to embed the visualization and editing features of the open-source project OpenOffice (OpenOffice API, 2007).


Figure 5.4: Simple LO graph with LessonMapper2


5.4.2 Nested LO Graphs

This thesis suggests nested graphs as a hierarchical model for organizing LOs. Like other LOs, nested graphs have LOM associated with them. In LessonMapper2, any new lesson starts at a root graph. This root graph has a LOM instance which characterizes the lesson.

There are various techniques to visualize nested graphs (see (Herman et al., 2000) for a sample). Nevertheless, according to the study of (Hornbaek et al., 2002), it seems that multilevel graphs like LO graphs benefit from being displayed using a mix of overview+detail and zoomable interfaces. (Good, 2003) studies the impact of a zoom-based visualization on slide show authoring and presentation. It reports that authors using zoom-based visualization spend less time authoring a slide show than with classical presentation tools, while the final result is generally better organized.

For these reasons, zoom-based visualization of LO graphs is also used in LessonMapper2. Figure 5.5 shows this feature: (1) The active LO graph (i.e., the LO graph to which new LOs can be added and linked together) is automatically scaled to suit the display. (2) When navigating nested graphs in LessonMapper2, the active LO graph is situated in an overview of its parent graph. (3) Nested graphs look like other LOs, and opening a nested LO graph makes it the new active graph. (4) Clicking on the overview makes the parent of the active LO graph the new active graph. These navigation features are implemented in Java with the Piccolo zoomable interface toolkit (Bederson et al., 2004).

5.5 LOM and LO Graphs

This section discusses the instantiation of LOM in a lesson authoring setting based on LO graphs. Then it presents the visual characterization of the graph LOs using LOM.

5.5.1 Instantiating LOM during Lesson Authoring

Section 3.3 showed that instantiating LOM is a tedious task generally consisting of filling in a form listing the almost 60 attributes of LOM. This practice should definitely be reviewed in a lesson authoring system based on LOM.

It often happens that the LOs of the same lesson have characteristics in common. For instance, the language used in the various LOs of a lesson is generally the same. In such a situation, we believe that human-based LOM generation may benefit from instantiating the general/language attribute of various nodes of the graph simultaneously instead of independently repeating this task for each piece of material.

[Figure 5.5 annotations: (1) The active LO graph is automatically scaled to suit the display size. (2) Overview of the parent graph: the colored box shows the place of the active graph in its parent. (3) Opening a nested graph makes it active. (4) Clicking on the overview of the parent graph makes it active.]

Figure 5.5: Nested LO graph in LessonMapper2

Figure 5.6: Simultaneous edition of the general/language attribute for four LOs with LessonMapper2. Each LOM element has the color of the selection border of its corresponding LO. To ease gray-scale visualization, a numbered flag tags each color.

When editing LOM with LessonMapper2, the user selects a LO (i.e., a node) and then one or more attributes she wants to edit. This process opens a classical form for LOM editing where the metadata elements are displayed in a list. In addition, the user can select several LOs at once. This feature enables the LOM of all the selected LOs to be edited simultaneously. In this case, the values of all the selected LOs are grouped together by LOM attribute. In Figure 5.6, the general/language attribute is edited simultaneously for various LOs. In this example, all the LOs are written in Spanish. With the Simultaneous Edition feature of LessonMapper2, the language identifier can easily be replicated from one LO to the others by dragging and dropping it.
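The grouping of values by attribute and the replication of a value across the selected LOs can be sketched as follows. This is an illustrative data model only, in Python rather than in the Java of LessonMapper2: the function names, the LO names LO1 to LO4, and the dictionary representation of LOM are assumptions.

```python
def group_by_attribute(lom_by_lo, selected, attributes):
    """Group the values of the selected LOs by LOM attribute."""
    return {
        attribute: {lo: lom_by_lo[lo].get(attribute) for lo in selected}
        for attribute in attributes
    }


def replicate(lom_by_lo, attribute, source, targets):
    """Copy one attribute value from a source LO to the target LOs."""
    value = lom_by_lo[source][attribute]
    for lo in targets:
        lom_by_lo[lo][attribute] = value


# Hypothetical selection of four LOs; only the first one has a language set.
lom_by_lo = {
    "LO1": {"general/language": "es"},
    "LO2": {},
    "LO3": {},
    "LO4": {},
}
selected = ["LO1", "LO2", "LO3", "LO4"]

before = group_by_attribute(lom_by_lo, selected, ["general/language"])
replicate(lom_by_lo, "general/language", "LO1", ["LO2", "LO3", "LO4"])
after = group_by_attribute(lom_by_lo, selected, ["general/language"])
```

The grouped view is what makes missing values visible side by side, which is the benefit the simultaneous form is meant to provide.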

Figure 5.7: Edition “in comparison” of the educational/difficulty attribute for two LOs with LessonMapper2.


Editing the same attribute simultaneously for various resources also eases the comparison among their values. For example, Figure 5.7 presents the instantiation of the educational/difficulty attribute for two LOs. Since the notion of difficulty is directly related to the usage context (e.g., the background of the students or the topic being taught), instantiating this attribute in isolation does not really make sense. The LOs presented in Figure 5.7 belong to a programming language course. The Problema de billetes and Edad en dias LOs are two problems putting into practice the notions of declaration and primitive types. Both problems are quite easy. Nevertheless, one of them is easier than the other. Thus, the instantiation of the educational/difficulty attribute for the two LOs should consider this fact. The Edition in Comparison feature of LessonMapper2 enables it. The same principle can be applied to most subjective attributes of LOM, i.e., those attributes that are difficult to instantiate in an objective and rational manner. This category covers almost all educational attributes (Section 3.2.3).

The proposed interface takes advantage of the context of lesson authoring based on LO graphs: it enables simultaneous and “in comparison” instantiation of LOM attributes. Moreover, it does not require all the LOM attributes to be instantiated at once. For instance, the title or the pedagogical characteristics of a LO may be instantiated soon after the introduction of this LO into the lesson. This task may encourage the lesson authors to characterize the elements of their lesson from a pedagogical perspective during lesson authoring. In contrast, some objective elements, e.g., the technical format or the aggregation level of lesson LOs, may be instantiated afterward since they may have little impact on the lesson design. In fact, this kind of attribute can be automatically generated using existing systems (Section 3.4).

5.5.2 Characterizing LOs with LOM

In LessonMapper2, the LOs are visually characterized with the values of their metadata. Figure 5.8 shows this characterization. In this example, only four metadata attributes are visually characterized: educational/difficulty, educational/semanticDensity, educational/intendedUserRole, and educational/interactivityType. These attributes have been arbitrarily chosen in order to give an insight into the educational aspect of the LOs. Nevertheless, this attribute list can be modified, extended, or reduced.

We limited the number of displayed attributes in order to avoid cognitive overload due to a large amount of visual information. However, further investigation should explore the ideal number of LOM characteristics to display.

The displayed characteristics should reflect the users' needs. E.g., it may be useful to let users configure which attributes to display.

[Figure 5.8 annotations: Difficulty is Medium; Intended user is the Learner; Interactivity is Active; Semantic density is Low.]

Figure 5.8: Visual characterization of some metadata attributes of the Edad en dias LO.

5.6 LO Sharing and Retrieval in a LO Graph

This section deals with the seamless integration of LO sharing and retrieval during the authoring of a LO graph. Methods for sharing and retrieving LOs with LessonMapper2 are described. The case of reused LOs is considered in these processes.

5.6.1 Seamless Sharing of LOs when Authoring a Lesson

When authoring a course as a LO graph, the teacher may want to share her lesson on the repository of the teaching community she is affiliated with. Since a LO graph is based on standard metadata, the learning material of the lesson is ready to be packaged for sharing (Section 2.4).

The lesson defined by the LO graph can be shared not only at coarse levels of granularity (e.g., a chapter or the entire lesson), but also at fine-grained levels (e.g., a learning activity, an assessment, or a problem statement). Therefore, the teacher can share her creation as an entire lesson, as intermediary chapters, and also as atomic learning resources on the same repository. In LessonMapper2, LO sharing is done seamlessly inside the authoring tool: one button shares all the levels of granularity of the current LO graph. With such a system, if the teacher later updates the content and metadata of her lesson graph, this update will be replicated in the repository the next time she decides to share her lesson.


Such a sharing feature should ensure that LOs with incomplete or incorrect metadata are not stored on the repository. For instance, a certain teaching community may define a LOM Profile in which some LOM attributes are compulsory, others are recommended, and the remaining ones are optional. In such a case, the system would not allow storing LOs having at least one compulsory LOM attribute that is incomplete or incorrect. Nor would it permit storing LOs in which a certain number of recommended LOM attributes are incomplete or incorrect. The topic of checking LOM validity during LO graph authoring is discussed in Section 7.2.
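The storage rule described above can be sketched as a simple check. The Python fragment below is only an illustration under stated assumptions: the LomProfile class, the recommended_limit threshold, the is_usable stand-in test, and the attribute names of the example profile are all hypothetical and do not come from LessonMapper2 or from a specific community profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LomProfile:
    compulsory: frozenset
    recommended: frozenset
    # Hypothetical threshold: a LO is rejected once this many recommended
    # attributes are incomplete or incorrect.
    recommended_limit: int


def is_usable(value):
    """Stand-in check: a value is usable when it is a non-empty string."""
    return isinstance(value, str) and value.strip() != ""


def can_be_stored(lom, profile):
    """Apply the compulsory rule, then the recommended-attribute rule."""
    bad_compulsory = [a for a in profile.compulsory if not is_usable(lom.get(a))]
    bad_recommended = [a for a in profile.recommended if not is_usable(lom.get(a))]
    return not bad_compulsory and len(bad_recommended) < profile.recommended_limit


profile = LomProfile(
    compulsory=frozenset({"general/title", "general/language"}),
    recommended=frozenset({"educational/difficulty", "educational/intendedUserRole"}),
    recommended_limit=2,
)

complete = {"general/title": "Edad en dias", "general/language": "es",
            "educational/difficulty": "easy"}
missing_title = {"general/language": "es", "educational/difficulty": "easy"}
too_sparse = {"general/title": "Edad en dias", "general/language": "es"}
```

Here can_be_stored(complete, profile) accepts the LO, while the two other examples are rejected, the first for a missing compulsory attribute and the second for too many missing recommended ones.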

5.6.2 Seamless Sharing of LOs when Modifying Colleague’s Lesson

Let us assume a teacher is reusing a lesson stored by another teacher on the repository. She will certainly modify the content of the lesson in order to adapt the material to her specific teaching style. LessonMapper2 considers this situation by enabling the teacher to decide whether: (1) she will change the educational context of the LO, (2) she will modify the LO, or (3) she will use the unchanged material in a similar context.

In the first case, since the educational context of the LO is modified, the LOM associated with the original LO is replaced by a new instance. This new instance starts from the values of the former one, but the teacher should modify them to fit the new educational context. Only the metametadata/identifier attribute is automatically modified, so that the repository holds two LOMs describing different ways of using the same LO.

In the second case, since the LO is modified, a new instance of LOM is created and the modified LO has a new identifier. This means there will be a new LOM-LO pair in the repository. isVersionOf/hasVersion relations are automatically created between the original LO and the new one.

In the third case, neither the LOM nor the LO is replaced by a new instance. New relations may be added to the LOM, but the other attributes cannot be modified. If they are modified, the LOM is processed as in the first case.
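The three reuse cases can be summarized in a small Python sketch. It is a hypothetical data model, not the LessonMapper2 code: the Entry class, the function names, and the use of random identifiers are assumptions, as is the choice to pre-fill the new LOM of the second case with the former values.

```python
import copy
import uuid
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical repository entry: one LOM instance describing one LO."""
    lo_id: str
    lom_id: str  # stands for metametadata/identifier
    lom: dict
    relations: list = field(default_factory=list)  # (relation type, target LO id)


def new_id():
    return uuid.uuid4().hex


def reuse_in_new_context(original):
    """Case 1: same LO, new LOM instance pre-filled with the former values."""
    return Entry(original.lo_id, new_id(), copy.deepcopy(original.lom),
                 list(original.relations))


def reuse_modified(original):
    """Case 2: new LO and new LOM, linked to the original as a version."""
    # Assumption: the new LOM starts from a copy of the former values.
    derived = Entry(new_id(), new_id(), copy.deepcopy(original.lom),
                    list(original.relations))
    derived.relations.append(("isVersionOf", original.lo_id))
    original.relations.append(("hasVersion", derived.lo_id))
    return derived


def reuse_unchanged(original, extra_relations=()):
    """Case 3: same LO and same LOM; only new relations may be added."""
    original.relations.extend(extra_relations)
    return original


original = Entry("lo-1", "lom-1", {"educational/difficulty": "easy"})
contextualized = reuse_in_new_context(original)
derived = reuse_modified(original)
same = reuse_unchanged(original, [("comparableWith", "lo-2")])
```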

5.6.3 Seamless Retrieval of LOs during Lesson Authoring

In the context of a teaching community sharing material, when a teacher needs a certain learning/teaching resource, she may create it or she may search for existing LOs that would, at least partially, suit her needs. After retrieving existing material, she can reuse it, i.e., contextualize it in order to make it suitable for the new learning/teaching situation.


Figure 5.9: Query node with title constructor and associated results


In the graph of Figure 5.3, the L5 LO does not refer to any concrete document, i.e., L5 is a node without an associated LO. Nevertheless, this node is positioned inside the LO graph: it was defined during the lesson authoring process in order to specify a required additional item for the lesson. This is a top-down approach to lesson authoring: first, the lesson graph is designed, i.e., semantic, rhetorical, and organizational relations are defined between the lesson graph nodes. Then, the node content is built.

In a LO graph, a node without associated material is regarded as a query node. The title of the node is then used to seamlessly query the community repository. For instance, in LessonMapper2, a query is generated that looks for the terms of the query node title in all the metadata fields of the repository LOs. Then, the results (i.e., LOM-LO pairs) are displayed in a list (see Figure 5.9). Each result can be previewed and adopted. Adopting a result means replacing the LOM and the LO of the query node with the LOM and LO of the result. Later, when sharing the LO graph, this reused LO will have to comply with one of the strategies described in the previous subsection.
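The query-node mechanism can be sketched as follows. This Python fragment is a simplified illustration, not the query engine of LessonMapper2: it assumes case-insensitive substring matching, treats a result as relevant when at least one title term matches, and uses a hypothetical in-memory repository and hypothetical function names.

```python
def matches(lom, terms):
    """True when at least one term occurs in some metadata field of the LOM."""
    text = " ".join(str(value) for value in lom.values()).lower()
    return any(term in text for term in terms)


def query_repository(node_title, repository):
    """Return the (LOM, LO) pairs matching the terms of a query node title."""
    terms = node_title.lower().split()
    return [(lom, lo) for lom, lo in repository if matches(lom, terms)]


def adopt(query_node, result):
    """Replace the LOM and LO of the query node with those of a result."""
    query_node["lom"], query_node["lo"] = result


# Hypothetical repository content.
repository = [
    ({"general/title": "Constructors", "general/language": "en"}, "constructors.pdf"),
    ({"general/title": "Loops", "general/description": "for and while"}, "loops.pdf"),
]

query_node = {"title": "constructor", "lom": None, "lo": None}
results = query_repository(query_node["title"], repository)
adopt(query_node, results[0])
```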

Section 8.2 discusses the possibility of taking into account the context of the query node, i.e., the other nodes of the LO graph and their relations to the query node, in order to enhance the precision of the LO retrieval process. That section also describes an experiment in which 11 teachers had to complete 4 predefined LO graphs with existing LOs of a certain repository. For that purpose, they had to build 8 queries (2 per graph). For each query, the teachers had to formulate a set of keywords as they would do when querying a popular search engine like Google, and also to position the corresponding query node in the LO graph they had to complete. A recurrent behavior was observed during this experiment: at the beginning, all the teachers preferred to define the keyword query before positioning the query node in the graph. At the end of the experiment, the majority of the teachers began by defining the query node before the keyword query. This observation is encouraging since it suggests that defining a query node may be a more natural process than defining a keyword query when authoring a lesson.

5.7 System Deployment

As described in the previous section, LessonMapper2 can be used in symbiosis with a LO repository. Teachers access the repository (sharing and retrieval) directly from this authoring tool. Nevertheless, LessonMapper2 has to use a certain LOM profile for describing authored LOs.

If a teacher community uses our system in order to share LO graphs, then it has to define a certain LOM profile (Section 2.5). This LOM profile will hold a LOM attribute vocabulary adapted to the particular teaching context of this community. It may also specify the relation types with which LO graphs are defined.

Since the definition of a LOM profile requires the cohesion of community members, the system needs a person responsible for ensuring this cohesion and configuring the system as depicted in Figure 5.10. In our prototype implementation, the repository holding this configuration is built upon eXist (exist, 2007), an open-source Java-based XML database.

An automatic system replacing this person must be able to manage the collaboration among community members and ensure the needed cohesion.

[Figure 5.10 diagram labels: LOR (LOMs + LOs); Configuration (LOM Profile); Admin with LM2; Teachers with LM2.]

Figure 5.10: LessonMapper2 deployment model

LOM profile modifications concerning language translation or attribute definition in natural language can occur while the system is deployed. If the system repository already contains some LOs using a certain LOM profile, then deep LOM profile modifications, e.g., new attribute vocabulary or new metadata attributes, will entail compatibility issues. These compatibility problems can be solved using the interoperability methods described in Section 3.5.3.

5.8 Conclusion

This chapter presented LO graphs as favorable settings for introducing human LOM usage into the lesson authoring process (Section 5.1). It also described a tool for supporting LO graph creation (Section 5.4).


In a LO graph, user interfaces for LOM instantiation can differ from classical form-based interfaces. E.g., this chapter introduced an interface supporting Instantiation in Comparison of LOM attributes. This feature may be particularly helpful for instantiating subjective attributes, which are difficult to instantiate in isolation (Section 5.5.1). LOM was also used to visually characterize the elements of a LO graph so that the characteristics of the lesson LOs can be seen at a glance (Section 5.5.2).

The integration of the LO sharing process into lesson authoring was also discussed. In particular, concrete proposals were made for sharing new or reused LOs on a community repository (Sections 5.6.1 and 5.6.2). This chapter also introduced lesson graph construction as a favorable metaphor for querying LO repositories directly from the lesson authoring tool (Section 5.6.3). Finally, the deployment of the proposed system in a teacher community was discussed.

This chapter has defined a setting for facilitating human LOM usage during lesson authoring. This definition was necessary in order to study the research question RQ1. Nevertheless, answering this research question also requires considering methods for processing LOM in a lesson authoring setting such as the one defined in this chapter. This work, together with the study of RQ2, is the topic of the next chapter.

Chapter 6

Processing Learning-Object Metadata during Lesson Authoring

This chapter aims at answering RQ2: How can processes dealing with LOM semantics cope with incomplete, missing, or incorrect metadata?

Sections 3.4, 3.5, and 3.6 showed systems automatically processing metadata in order to exploit the context of the LOs. This chapter proposes a three-layer extensible framework for taking advantage of lesson graph semantics. This framework provides an original support for diffusing the metadata-based processes along the edges of the lesson graph. This technique aims at coping with the metadata processing issues arising when some graph metadata are missing, incorrect, or incomplete. As part of the framework, two original types of metadata processes are introduced. The first one takes advantage of the metadata attribute similarities between related LOs. The second one focuses on the lesson graph consistency.

6.1 Classical Approach

(Hatala and Richards, 2003) and (Brase, 2005) present systems taking advantage of the semantics of a set of related LOs in order to infer possible values for some missing elements. These systems are based on a set of rules defining the way to combine the semantics of LOM and the nature of the links between the related LOs. For instance, (Hatala and Richards, 2003) proposes the rule: when there is an ascendancy (i.e., isPartOf) relation between two LOs, the value of the educational/intendedUserRole attribute of the parent may be suggested to the child.

[Figure 6.1 diagram labels: Metadata Values of a Learning-Object Graph; Influence Rules (Rules expressing Influence of Graph Semantics on Metadata Value); Contextualized Metadata Values for this LO Graph; Support For LOM Generation.]

Figure 6.1: Conceptual model for systems taking advantage of lesson graph semantics

Figure 6.1 shows a conceptual model of such systems. This model is based on the metadata values and the relation semantics of a LO graph. The central component of the model is the set of influence rules. Influence rules express the influence of the graph semantics on the metadata values.

Definition 6.1 (Influence Rule) An influence rule is a triplet from A × T × I where A is the LOM attribute set, T is the relation type set, and I is the set of influence types.

E.g., in the influence rule (a, t, i) of A × T × I, i specifies the influence of the a attribute values of a set of LOs on the a attribute value of another LO, the latter LO being related to the first LOs with relations of type t.

Influence rules are used to infer information (e.g., a probable value) for the metadata of a certain LO. The rules proposed by the previous work of (Hatala and Richards, 2003) and (Brase, 2005) can be considered as influence rules. For instance, the previously presented rule of (Hatala and Richards, 2003) can be defined as the influence rule:

(educational/intendedUserRole, isPartOf, ’likely equals to’)

This means that the educational/intendedUserRole values of the parents of a certain LO influence the educational/intendedUserRole value of this LO in such a way that the former values are likely equal to the latter value.
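As an illustration, the triplet of Definition 6.1 and the example rule can be written as a small Python sketch. It is not the rule engine of the cited systems: the InfluenceRule class and the suggest function are hypothetical, the way 'likely equals to' is interpreted (suggesting the values of the related LOs, most frequent first) is an assumption, and the LO names and the "learner" value are only sample data.

```python
from collections import Counter
from typing import NamedTuple


class InfluenceRule(NamedTuple):
    attribute: str  # element of A
    relation: str   # element of T
    influence: str  # element of I


RULE = InfluenceRule("educational/intendedUserRole", "isPartOf", "likely equals to")


def suggest(rule, lo, edges, lom_by_lo):
    """Suggest values for rule.attribute of `lo`, most frequent first.

    Only the 'likely equals to' influence type is handled in this sketch.
    """
    if rule.influence != "likely equals to":
        raise ValueError("unsupported influence type: " + rule.influence)
    # LOs that `lo` is related to with a relation of the rule's type.
    related = [t for s, rel, t in edges if s == lo and rel == rule.relation]
    values = [lom_by_lo[r][rule.attribute] for r in related
              if rule.attribute in lom_by_lo.get(r, {})]
    return [value for value, _ in Counter(values).most_common()]


edges = [("L1", "isPartOf", "L6"), ("L6", "hasPart", "L1")]
lom_by_lo = {"L6": {"educational/intendedUserRole": "learner"}, "L1": {}}
suggestions = suggest(RULE, "L1", edges, lom_by_lo)
```

With this sample data, the parent L6 proposes its value to the child L1, while L6 itself receives no suggestion because it is not part of another LO.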

The results of influence rules are called contextualized metadata values (CMVs) since they are inferred from the context of a LO, i.e., the metadata values of the other LOs related to it and the semantics of the relations between them.

Definition 6.2 (Contextualized Metadata Value (CMV)) Applying influence rules to the metadata of a certain LO results in special data called contextualized metadata values.

E.g., the CMVs generated in (Hatala and Richards, 2003) are suggestions of metadata values with a relevance level: Strong or Medium. In (Brase, 2005), the generated CMVs are instantiated as new metadata values. In these two systems, metadata processing results in different types of CMV. In fact, both systems have their own metadata processing strategy. We call a metadata processing strategy a CMV model. Defining a CMV model consists in specifying some influence rules for processing metadata and the nature of the CMVs resulting from applying those rules.

6.2 Diffusion-based approach

Surveys about metadata usage (Friesen, 2004; Currier et al., 2004; Heath et al., 2005; Najjar and Duval, 2006) state that human metadata instantiation (in contrast to automatic metadata instantiation) is a difficult task which is generally ignored or done in a hurry. Frequently, authors leave default values, which may not correspond to real values. Therefore, it is very probable that the LOs of a lesson graph suffer from missing, incomplete, and even incorrect metadata values.

In the existing approaches, the scope of the rules is generally limited to the metadata values of the neighboring LOs. Therefore, computation of the rules may suffer from a lack of metadata. In (Hatala and Richards, 2003), this problem is considered in the inheritance and accumulation rules since they are applied to the whole hierarchy of LOs instead of being limited to the direct parent or children. Nevertheless, this principle is not applied to all the rules.

6.2.1 Context Diffusion

In order to cope with a lack of metadata, (Marchiori, 1998) suggests diffusing the metadata values over an untyped graph using fuzzy logic. We also propose a diffusion mechanism in order to cope with a lack of metadata, but unlike this work, we use the semantics of the graph. Thus, instead of a fixed propagation mechanism based on fuzzy logic, our propagation mechanism is based on the results of influence rules. In this process, the influence rules are applied not only to the metadata values of the neighboring LOs, but also recursively to the results of applying the rules to these neighboring nodes, i.e., the CMVs. We call this recursive process context diffusion. During context diffusion, influence rules are first applied to the original values of the graph. Thus, a first set of CMVs is generated. Afterwards, the influence rules are applied to both the generated CMVs and the original metadata values of the graph. Thus, a new set of CMVs is generated. This process is repeated iteratively until the generated CMVs converge; the update process should ensure this convergence.
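The iteration described above can be illustrated with a deliberately simple Python sketch. It is not one of the CMV models of this thesis: it assumes a single attribute, a single rule that passes candidate values along one relation type, CMVs represented as sets of candidate values, and hypothetical LO names. With this particular rule the candidate sets can only grow, which guarantees that the loop stops.

```python
def context_diffusion(edges, original, relation, max_rounds=100):
    """Diffuse one attribute along `relation` edges until the CMVs stabilize.

    edges: (source, relation type, target) triples.
    original: LO -> original value, for the LOs that have one.
    Returns LO -> frozenset of candidate values (the CMVs of this sketch).
    """
    nodes = {n for s, _, t in edges for n in (s, t)} | set(original)
    related = {n: [t for s, rel, t in edges if s == n and rel == relation]
               for n in nodes}
    # Start from the original values of the graph.
    cmv = {n: frozenset([original[n]]) if n in original else frozenset()
           for n in nodes}
    for _ in range(max_rounds):
        updated = {}
        for n in nodes:
            if n in original:
                updated[n] = cmv[n]  # an original value is never overridden
            else:
                # Apply the rule to the previous round's CMVs and originals.
                updated[n] = frozenset().union(*(cmv[r] for r in related[n]))
        if updated == cmv:
            return cmv  # converged
        cmv = updated
    raise RuntimeError("CMVs did not stabilize")


edges = [("slide", "isPartOf", "chapter"), ("chapter", "isPartOf", "lesson")]
cmvs = context_diffusion(edges, {"lesson": "learner"}, "isPartOf")
```

In this example the value reaches the slide in the second round although its direct parent has no original value, which is the extension of scope that context diffusion aims at.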

The context diffusion process extends the scope of the influence rules to the metadata of the whole graph. Therefore, the impact of missing, incomplete, or incorrect metadata values is decreased compared with methods taking into account only a reduced neighborhood.

This process is non-monotonic because the CMVs for the same LO may change at each iteration of the context diffusion process. Defining such a non-monotonic process over an existing inference rule system for the semantic web such as Jena (JENA, 2007) is a non-trivial task. Therefore, we preferred to use a simple push protocol based on update propagation: a modification of the CMVs of a graph LO induces the CMV update of the neighboring LOs. If the updated CMVs are different from those of the previous iteration, the process is repeated for the neighboring nodes. Otherwise, the propagation stops for that node. Two events outside the diffusion process can trigger the propagation of the CMVs of a LO: (1) a change in the original LOM values of the LO and (2) a change in the relation semantics linking this LO to other LOs of the graph.
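A push protocol of this kind can be sketched with a work queue. The Python fragment below is only an illustration under stated assumptions: the push_updates function, the toy update process, and the three-LO chain are hypothetical, and termination relies on the update process converging, as the text requires.

```python
from collections import deque


def push_updates(changed_lo, neighbors, update, cmv):
    """Propagate CMV updates starting from a LO whose CMVs just changed."""
    queue = deque([changed_lo])
    while queue:
        lo = queue.popleft()
        for neighbor in neighbors.get(lo, ()):
            new_value = update(neighbor, cmv)
            if new_value != cmv.get(neighbor):
                cmv[neighbor] = new_value
                queue.append(neighbor)  # keep pushing from this LO
            # otherwise the propagation stops for this neighbor
    return cmv


# Hypothetical three-LO chain with bidirected edges: A - B - C.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
original = {"C": "es"}  # only C has an original general/language value


def update(lo, cmv):
    """Toy update process: keep an original value, else merge neighbor CMVs."""
    if lo in original:
        return frozenset([original[lo]])
    return frozenset().union(*(cmv.get(n, frozenset()) for n in neighbors[lo]))


cmv = {"A": frozenset(), "B": frozenset(), "C": frozenset(["es"])}
push_updates("C", neighbors, update, cmv)
```

Only the LOs whose CMVs actually change are revisited, so a local modification does not force a recomputation of the whole graph.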

6.2.2 Conceptual Model

Figure 6.2 depicts a conceptual model for taking advantage of lesson graph semantics that takes context diffusion into account. As in the model of Figure 6.1, the main components are the metadata values and relation semantics of the LO graph, the influence rules, and the resulting contextualized metadata values (CMVs). Nevertheless, this model adds a new core component: context diffusion. As described above, context diffusion is responsible for iteratively applying the influence rules to both the original metadata values and the already inferred CMVs until the CMVs stabilize.

The generated CMVs are not limited to supporting LOM generation: the next sections show that CMVs can also be used to facilitate metadata validation and LO graph consistency checking during authoring, and to enhance the retrieval of LOs. Nevertheless, these features may depend on the particular teaching style and strategy of the corresponding teacher community. This is due to the fact that some LOM attributes are subjective (LOM, 2002). Therefore, it is difficult to define influence rules suiting every teaching situation.

[Figure 6.2 diagram labels: Metadata Values of a Learning-Object Graph; Influence Rules (Rules expressing Influence of Graph Semantics on Metadata Value); Context Diffusion (Propagation until Stabilization); Update; Contextualized Metadata Values for this LO Graph; Support For Lesson Graph Authoring and Usage (Metadata Generation and Validation / Graph Consistency Checking / Retrieval of Additional Material); Authoring and Use; Teacher Community; Customization (Teacher Community Refinement and Correction / Analysis of existing Lesson Graphs).]

Figure 6.2: Conceptual model using context diffusion for taking advantage of lesson graph semantics

For that reason, teacher communities are also part of the model. First, they benefit from the support provided by the CMVs when authoring and using the lesson graphs. Second, they customize the influence rules in order to adapt the system to their teaching style and preferences.

6.2.3 Framework

[Figure 6.3 diagram labels: Influence Rules; CMV Models (Influence Rule Scope, CMV Update Process); CMV Propagation Protocol (Scope).]

Figure 6.3: Context diffusion-based framework for taking advantage of lesson graph semantics

In order to support this conceptual model, we introduce a metadata processing framework implementing our context diffusion approach. Figure 6.3 shows the three-layer structure of this framework. The bottom layer is the base of the framework. It consists of a generic propagation protocol ensuring performance and convergence of our model. It also defines the scope of the update processes that can use this propagation protocol.

The middle layer concerns the implementation of different CMV models, i.e., different metadata processing strategies. Implementing a CMV model means defining the nature of the model CMVs and the update process of these CMVs. This update process must fall within the scope defined in the bottom layer. In this layer, the scope of the influence rules for the defined CMV model is specified. In practice, this layer targets researchers or engineers wishing to use the diffusion propagation system defined in this chapter with new types of influence rules and CMVs. The remainder of this chapter presents two CMV models.

The top layer deals with the definition of customized influence rules according to the scope specified in the middle layer. This customization process is intended for teacher communities that need to adapt the metadata processing system to their specific context. Customization may be manual: for example, new influence rules could be defined using a domain-specific language (DSL), or existing rules could be refined. Customization may also be automatic: for example, the system could analyze existing lesson graphs in order to deduce the information needed to customize the influence rules.

The next section presents the generic propagation protocol, i.e., the bottom layer of the framework. Then, two example CMV models are introduced. For each of them, the implementation of the middle and top layers is discussed.

6.3 Generic Propagation Protocol

Ensuring the convergence of a propagation process is a difficult task. For this reason, our framework aims at guaranteeing this property directly in the propagation protocol. The base layer of the framework consists in defining such a converging propagation protocol. Nevertheless, convergence also depends on the characteristics of the CMV update processes that are defined in the middle layer. In order to let users easily define new implementations of the middle layer, the scope of these characteristics should be defined in the base layer. Therefore, this section first outlines the basic characteristics of CMV update processes that are necessary in order to define a converging propagation protocol. On the basis of these characteristics, the remainder of this section presents a propagation protocol ensuring convergence and a guaranteed level of performance.

6.3.1 CMV Update Process Characteristics

Definition 6.3 (Update Process) An update process is a set of update functions $C \times C \to C$, where $C$ is the set of CMVs.


Updating the LO $i$ with the LO $j$ consists in the assignment $c_i \leftarrow \varphi(c_i, c_j)$, where $c_i$ and $c_j$ are the CMVs of $i$ and $j$, respectively, and $\varphi : C \times C \to C$ is an update function for the relation connecting $i$ to $j$.

Definition 6.4 (Active Update) The update of one LO $j$ with another LO $i$ is active if it induces a modification of the CMVs of $j$. Such an active update is noted $i \to j$. We note $j^{i \to j}$ the state of the LO $j$ after being updated by $i$.

Therefore, the update of the LO $j$ with the LO $i$ is active ($i \to j$) if $c_j \neq \varphi(c_i, c_j)$. The CMVs of $j^{i \to j}$ are $\varphi(c_i, c_j)$.

Definition 6.5 (Passive Update) The update of one LO $j$ with another LO $i$ is passive if it does not induce any modification of the CMVs of $j$. Such a passive update is noted $i \not\to j$.

Therefore, the update of the LO $j$ with the LO $i$ is passive ($i \not\to j$) if $c_j = \varphi(c_i, c_j)$.

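We note $i \rightsquigarrow j$ the transitive active update of the LO $i$ on the LO $j$ via a certain number of intermediary LOs. That is, if $i \rightsquigarrow j$, then there exist $k_0, \ldots, k_n \in G$ such that $i \to k_0$, $k_0^{i \to k_0} \to k_1$, ..., and $k_n^{k_{n-1} \to k_n} \to j$.

We note $i \not\rightsquigarrow j$ the transitive passive update of the LO $i$ on the LO $j$: if $i \rightsquigarrow j$ does not hold, then $i \not\rightsquigarrow j$.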

Definition 6.6 (Stable Graph) A LO graph is stable with respect to a certain regular update process if, for all LOs $i$ and $j$, $i \not\to j$.

Corollary 6.7 In a stable graph, no active update can occur until a CMV or the structure (nodes, edges) of the graph is arbitrarily modified.

PROOF. If an active update occurs without any modification of the graph structure or of one of the CMVs, then the graph is not stable. ∎

Definition 6.8 (Regular Update Process) An update process is regular if it complies with the following properties:

Stability For any LOs $i$ and $j$, $i \not\to j^{i \to j}$.

Acyclism For any LO $i$, $i \not\rightsquigarrow i$.

Asymmetry For any LOs $i$ and $j$, if $i \rightsquigarrow j$, then $j \not\rightsquigarrow i$.

Cumulation For any LOs $i$, $j$, and $k$, if $i \not\to j$ and $k \to j$, then $i \not\to j^{k \to j}$.

Initialization There is an initialization procedure for this update process such that any initialized graph is stable.

Max There is a function $\mathrm{Max} : C \times C \to C$ for this update process such that, for any LOs $i$ and $j$, if $\mathrm{Max}(i, j) = i$ then $j \not\rightsquigarrow i$.

Property 6.9 (Max Function) In a regular update process, a Max function can always be defined.

PROOF. A Max function cannot be defined if there exist $i$ and $j$ such that $\mathrm{Max}(i, j)$ has no result. This case means that $i \rightsquigarrow j$ and $j \rightsquigarrow i$. However, by asymmetry, if $i \rightsquigarrow j$ then $j \not\rightsquigarrow i$, and conversely. Thus, the previous case cannot occur. ∎

6.3.1.1 Update Propagation Protocol

Algorithm 1: Regular-Update Propagation Protocol
Data: a stable LO graph and some modified LOs
Result: update propagation of the modifications into the LO graph
begin
    h ← a Fibonacci heap based on the Max function;
    insert all modified LOs in h;
    while h is not empty do
        L ← extract Max of h;
        foreach Lr related to L do
            if L actively updates Lr then
                insert or update Lr in h;
end
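As an illustration, Algorithm 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in this work: Python's standard heapq module is a binary heap rather than a Fibonacci heap, so the "insert or update" step is emulated by re-inserting the LO and skipping stale entries, and the cost analysis given for the Fibonacci heap does not carry over. The names propagate, update, and priority are hypothetical; priority stands in for the Max function and is assumed to map a CMV to a number.

```python
import heapq
from itertools import count


def propagate(graph, cmv, modified, update, priority):
    """Propagate updates from the `modified` LOs until no active update remains.

    graph:    dict mapping a LO to a list of (neighbor, relation_type) pairs
    cmv:      dict mapping a LO to its current CMV (mutated in place)
    update:   update(relation_type, cmv_source, cmv_target) -> new target CMV
    priority: priority(cmv) -> number; the largest is extracted first (Max)
    Returns the number of active updates performed.
    """
    tie = count()  # tie-breaker so heap entries never compare LOs directly
    heap = []

    def push(node):
        # heapq is a min-heap, so the priority is negated to extract the Max.
        heapq.heappush(heap, (-priority(cmv[node]), next(tie), node))

    for node in modified:
        push(node)

    active_updates = 0
    while heap:
        neg_prio, _, node = heapq.heappop(heap)
        if -neg_prio != priority(cmv[node]):
            continue  # stale entry: the LO was re-inserted with a newer CMV
        for neighbor, relation in graph.get(node, []):
            new_value = update(relation, cmv[node], cmv[neighbor])
            if new_value != cmv[neighbor]:  # active update
                cmv[neighbor] = new_value
                active_updates += 1
                push(neighbor)  # lazy "insert or update"
    return active_updates
```

With a scalar CMV and an update function that never decreases the target value, a second call on the resulting graph performs no active update, which matches the notion of a stable graph.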

Retrieving the Max of a heap has order O(1), and extracting it has order O(log n), where n is the number of elements in the heap. Inserting a new element in a heap also has order O(log n). In Algorithm 1, an element already in the heap can be updated again. In such a case, the heap should be reorganized in order to take the new value into account. This operation requires removing the element and inserting it again with the new value. With a classical heap implementation, this operation has order O(n). Algorithm 1 uses another data structure for which the update operation has order O(1): a Fibonacci heap. A Fibonacci heap is a data structure that supports all the basic heap operations in O(1) amortized time, with the exception of the remove and remove-Max operations, which take O(log n) amortized time (Cormen et al., 2001).


Theorem 6.10 Regular-update propagation of k modifications in a stable graph has order O(m + n log n)·O(Max), with m the number of relations in the graph and n the number of LOs.

PROOF. Each LO propagates its update only once. Suppose there is a LO $i$ propagating its update twice; then $i$ is extracted from the heap twice, i.e., $i$ has twice been the Max of the heap. Let $H_1$ and $H_2$ be the states of the heap when $i$ is popped for the first and second time, respectively. Let $i_1$ and $i_2$ be the states of $i$ in $H_1$ and $H_2$, respectively. By definition of Max, for all $j_1$ of $H_1$, $j_1 \not\rightsquigarrow i_1$ (1). However, since $i_2 \in H_2$, there exists a LO $l$ such that $l \to i_1$; otherwise $i_1$ would not have re-entered the heap. According to the algorithm, all changes in the input graph are due to the propagation of updates registered in the heap. Thus, $l$ was also in the heap. Transitively, there exists $j_1 \in H_1$ such that $j_1 \rightsquigarrow l$. Since $l \to i_1$, then $j_1 \rightsquigarrow i_1$ (2). If $j_1 \neq i_1$, then (2) contradicts (1). If $j_1 = i_1$, then by (2), $i_1 \rightsquigarrow i_1$, which is impossible by the acyclism property of regular update processes. Therefore, the same LO cannot propagate its update twice.

Each edge is visited only once. Since each updated LO is propagated only once, the relations between each LO and its respective neighbors are visited only once.

Cost of adding an element in the Fibonacci heap. In a Fibonacci heap, inserting an element costs O(1). If this element is already in the data structure, the key associated with this element is increased, since it was updated by the propagation protocol. This operation also costs O(1). However, a Fibonacci heap for the Max function depends on the complexity of Max. Thus, the cost of adding or updating an element in a Fibonacci heap has order O(Max).

Cost of removing the Max of the Fibonacci heap. In a Fibonacci heap, removing the Max element costs O(log p), where p is the number of elements in the heap. In the worst case, p = n, where n is the number of nodes of the graph. This operation also depends on the complexity of Max. Thus, the cost of removing the Max of the Fibonacci heap has order O(Max)·O(log n).

Total cost. Since the edges are visited only once, the worst case consists in inserting m times in the Fibonacci heap, with m being the number of edges. Since each node diffuses its update only once, the Max of the Fibonacci heap is removed n times at worst. Consequently, the total cost of the algorithm has order O(m + n log n)·O(Max). ∎

Theorem 6.11 Regular-update propagation results in a stable graph.

PROOF. If the graph resulting from a regular update process is not stable, then there exist $i$ and $j$ such that $i \to j$. Since the graph was stable before the update propagation, $i_{\text{before}} \not\to j_{\text{before}}$. But since $i_{\text{after}} \to j_{\text{after}}$, then $i_{\text{after}} \neq i_{\text{before}}$ (1) or $j_{\text{after}} \neq j_{\text{before}}$ (2). If (1) occurs, then according to the propagation protocol, $i_{\text{after}}$ has propagated its update to all its neighbors. Thus, $i_{\text{after}} \not\to j^{i_{\text{after}} \to j}$ by the stability property of regular update processes. If $j^{i_{\text{after}} \to j}$ is modified by other updates, then $i_{\text{after}} \not\to j_{\text{after}}$ still holds by the cumulation property of regular update processes. If (1) does not occur but (2) does, then by the cumulation property, since $i_{\text{before}} \not\to j_{\text{before}}$, $i_{\text{before}} \not\to j_{\text{after}}$. Therefore, for all $i$ and $j$ of the graph, $i \not\to j$. Consequently, the graph is stable. ∎

6.4 CMV Model Examples

This section describes two implementations of the second and third layers of our framework. In the first one, similarities among the attribute values of graph LOs are used to generate suggestions for the metadata values of the LOs. In the second one, the graph consistency is analyzed in order to generate restrictions for some of the metadata values. For both proposals, this section defines

1. the scope of the influence rules and, where applicable, a language to specify them,

2. the nature of the CMVs,

3. a regular update process for these CMVs,

4. an influence rule customization strategy.

6.4.1 Attribute Similarities and Value Suggestion



[Figure 6.4 content: L1 (general/keyword = {instantiation, object, method}) is linked by an introducesTo relation to L3 (general/keyword = {instantiation, object, new}).]

Figure 6.4: Extract of the LO graph shown in Figure 5.3.

As stated in (Hatala and Richards, 2003), graph semantic analysis may be used to identify similarities between LOM attribute values of related LOs. For instance, in the graph of Figure 6.4, since L1 introduces L3 we may expect that the general/keyword attribute of L1 and the general/keyword attribute of L3 share some values. In this particular case, the values of this attribute are {instantiation, object, method} for L1 and {instantiation, object, new} for L3, sharing two common values out of three. In general, attribute similarity may concern only some of the values.


6.4.1.1 Influence Rule Scope

In order to generalize the predefined rules of (Hatala and Richards, 2003), we propose to weight similarities for each “attribute / relation type” pair according to the context of use. In our case study, we analyze a repository of lesson graphs (containing about 170 LOs) developed at our institution. For instance, if a LO L is related to a LO L’ with a link of type introducesTo, we found that there is a probability of 0.54 that a given value of the general/keyword attribute of L also belongs to the general/keyword attribute of L’. In order to obtain such a value, all the pairs of repository LOs having an introducesTo relation between them are taken into account. Then we calculate the mean of the probabilities

$P\big(k \in \mathrm{keyword}(L') \mid k \in \mathrm{keyword}(L)\big)$

where L introduces L’ and k is a possible value for the general/keyword attribute. This calculation is done on the stemmed version of the attribute values, i.e., their morphological root, in order to avoid common spelling problems (e.g., car and cars have the same stemmed value).
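The estimation described above might be sketched as follows. This is only one plausible reading of the computation, stated here as an assumption: for each pair (L, L’), the conditional probability is taken to be the share of the values of L that also appear in L’, and the result is the mean of these shares over all pairs. The function name suggestion_weight is hypothetical, and the attribute values are assumed to be already stemmed.

```python
def suggestion_weight(pairs):
    """Mean, over pairs (L, L'), of the share of L's values also found in L'.

    pairs: iterable of (values_of_L, values_of_L_prime); each element is a set
           of already-stemmed attribute values for one related pair of LOs.
    Pairs in which L has no value are skipped; returns 0.0 if none remain.
    """
    shares = [len(src & dst) / len(src) for src, dst in pairs if src]
    return sum(shares) / len(shares) if shares else 0.0
```

For the single pair of Figure 6.4, the share is two values out of three.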

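[Figure 6.5 content: reduced lesson graph with the nodes L1 to L6 and an empty node (?); edge weights: hasPart 0.57, isPartOf 0.45, abstractedBy 0.72, introducesTo 0.54, resolvedBy 0.85, backgrdFor 0.70, explainedBy 0.75.]

Figure 6.5: Suggestion weights for general/keyword attribute.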

Figure 6.5 is a reduced view of the lesson graph of Figure 5.3 showing the probabilities of shared values between neighboring nodes, calculated on our corpus for the general/keyword attribute. The complete set of generated probabilities can be found in Appendix A.

In this application of our conceptual model, influence rules are called suggestion rules.

Definition 6.12 (Suggestion Rules) Suggestion rules are functions $p : A \times T \to [0, 1]$, where $A$ is the set of LOM attributes and $T$ is the set of relation types.

Example: According to the weights depicted in Figure 6.5, p takes the following values:

p(general/keyword, introducesTo) = 0.54


p(general/keyword, explainedBy) = 0.75

6.4.1.2 CMV Nature

It is possible to generate some CMVs called suggestions for the LOs of the lesson graph using the rules introduced above.

Definition 6.13 (Suggestions) A suggestion is a set {(v, w(v)) : v ∈ Va} associating a weight w(v) to each of the possible values Va for a certain LOM attribute a.

The weight is 0 when the value is not at all appropriate for the LO, while it is 1 when it describes it perfectly. At the beginning of the context diffusion process, we set w(v) = 0 for all possible values v of the attribute a, except for the original values of the attribute, which have a weight of 1.

6.4.1.3 Update Process

[Figure 6.6 content: the suggestions {(v, w(v))} of L are propagated along a relation of type t to the suggestions {(v, w′(v))} of L’.]

Figure 6.6: Suggestion update process.

Figure 6.6 depicts the suggestion update process defined below.

Definition 6.14 (Suggestion Update) Consider an attribute a and Va the set of possible values for a. Consider also two LOs L and L’, and a relationship of type t connecting L with L’. Let {(v, w(v)) : v ∈ Va} and {(v, w′(v)) : v ∈ Va} be the suggestions of L and L’, respectively, for the a attribute. The suggestion update of L’ with respect to the suggestions of L consists of replacing the suggestions of L’ with

{(v, max(w′(v), p(a, t) × w(v))) : v ∈ Va}
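The suggestion update can be sketched as a small Python function. This is a minimal sketch, assuming that suggestions are stored as dictionaries from attribute values to weights and that a value absent from a dictionary has weight 0; the name update_suggestions and this representation are hypothetical.

```python
def update_suggestions(source, target, weight):
    """Return the updated suggestions of L' given the suggestions of L.

    source, target: dicts mapping an attribute value v to its weight in [0, 1];
                    a value missing from a dict is treated as having weight 0.
    weight:         p(a, t), the suggestion-rule weight for the relation type t.
    """
    values = set(source) | set(target)
    return {
        v: max(target.get(v, 0.0), weight * source.get(v, 0.0))
        for v in values
    }
```

Applying the function a second time with the same arguments leaves the result unchanged, which illustrates the stability property required of a regular update process.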

Theorem 6.15 The suggestion update process is regular.


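PROOF. Consider $a \in A$ and $v \in V_a$. Let $p_{ij}$ be the influence rule $p(a, t_{ij})$, where $t_{ij}$ is the relation type between the LOs $i$ and $j$. Let also $w_i$ be the weight $w_i(v) \in [0, 1]$ of the value $v$ of $a$ for the LO $i$. If $i \to j$, then, according to the definition of the suggestion update process, $p_{ij} w_i > w_j$. In the following, $\prod_{\text{from } i \text{ to } j} p$ denotes the product of the rule weights along the chain of updates from $i$ to $j$. The suggestion diffusion process complies with the following properties:

Stability $i \not\to j^{i \to j}$ since $p_{ij} w_i \leq p_{ij} w_i$.

Acyclism $i \not\rightsquigarrow i$ since $\left(\prod_{\text{from } i \text{ to } i} p\right) w_i \leq w_i$ for all $p \in [0, 1]$.

Asymmetry $i \rightsquigarrow j \Rightarrow j \not\rightsquigarrow i$ since if $\left(\prod_{\text{from } i \text{ to } j} p\right) w_i > w_j$, then $\left(\prod_{\text{from } j \text{ to } i} p\right) w_j \leq w_i$ because $\left(\prod_{\text{from } j \text{ to } i} p\right) w_j \leq w_j < \left(\prod_{\text{from } i \text{ to } j} p\right) w_i \leq w_i$ for all $p \in [0, 1]$.

Cumulation $i \not\to j$ and $k \to j \Rightarrow i \not\to j^{k \to j}$ since if $p_{ij} w_i \leq w_j$ and $p_{kj} w_k > w_j$, then $p_{ij} w_i < p_{kj} w_k$ because $w_j < p_{kj} w_k$.

Initialization By initializing all weights with 0, the graph is stable since for any $i$ and $j$, $p_{ij} w_i = 0 \leq w_j = 0$.

Max For any $i$ and $j$, $\mathrm{Max}(i, j) = \max(w_i, w_j)$. Therefore if $\mathrm{Max}(i, j) = i$, then $j \not\rightsquigarrow i$ since $\left(\prod_{\text{from } j \text{ to } i} p\right) w_j \leq w_j < w_i$.

Therefore, the suggestion update process is regular. ∎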

We note that the Max function for the suggestion update has order O(1).

Corollary 6.16 Suggestion update propagation has order O(m + n log n), where n is the number of LOs in the graph and m the number of relations between these LOs.

6.4.1.4 Influence Rule Customization

We propose that the customization of suggestion rules be mostly automatic. The strategy consists in analyzing existing lesson graphs and extracting the probabilities that attribute values are repeated between related LOs.

In order to suit the teaching style of a particular community, this analysis should be performed locally. For example, each lesson graph added or updated in a local repository used by a particular community triggers a new analysis of the suggestion probabilities. The results of this analysis can then be downloaded by community users when they connect to the repository.

Nevertheless, such a strategy requires that the analyzed repository contain enough lesson graphs to extract statistically significant probabilities. Therefore, default probabilities should be used instead of the analysis results when there is too little data. The probabilities calculated on our repository of 170 interrelated LOs could serve as default values.


6.4.2 Graph Consistency and Value Restriction




[Figure 6.7 content: nodes L6, L1, and L3, with an introducesTo relation; displayed metadata values: educational/difficulty = easy, educational/difficulty = medium, general/coverage = {Java, OOP}, general/coverage = {Java}, general/coverage = {OOP}.]

Figure 6.7: Extract of the LO graph shown in Figure 5.3.

Let us consider the lesson graph of Figure 6.7. It is plausible that, for a certain teacher community, the fact that L1 introduces L3 implies that the content of the LO L1 is simpler than that of L3 (otherwise the lesson graph would not be consistent). In terms of LOM semantics, this means that the value of the LOM attribute educational/difficulty of L1 should be lower than or equal to the educational/difficulty of L3. If L1 introduces more than one LO, its level of educational/difficulty should be compatible with (i.e., no higher than) that of each element it introduces. Since the value of educational/difficulty is taken from a predefined vocabulary, we can define an order among the terms of this vocabulary to test the consistency.
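The consistency test described above can be sketched in Python. This is a minimal sketch under stated assumptions: the ordered vocabulary below is assumed to be the one used for educational/difficulty (a LOM profile may define another one), and the names DIFFICULTY_ORDER and inconsistent_introductions are hypothetical.

```python
# Assumed ordered vocabulary for educational/difficulty, from lowest to highest.
DIFFICULTY_ORDER = ["very easy", "easy", "medium", "difficult", "very difficult"]
RANK = {term: position for position, term in enumerate(DIFFICULTY_ORDER)}


def inconsistent_introductions(difficulty, introduces):
    """List the (L, L') pairs where L introduces L' but is rated harder than L'.

    difficulty: dict mapping a LO identifier to its educational/difficulty term
    introduces: iterable of (L, L') pairs, one per introducesTo relation
    """
    return [
        (src, dst)
        for src, dst in introduces
        if RANK[difficulty[src]] > RANK[difficulty[dst]]
    ]
```

An empty result means that every introducing LO is rated no harder than each LO it introduces.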

Similarly, it seems reasonable to imagine that for a certain teacher community, since L6 is a container for L1 and L3, the content coverage of L6, i.e., the extent or scope of the content of this LO, should include the content coverage of its children. In terms of LOM semantics, it means the value of the general/coverage attribute of L6 should be a superset of all the general/coverage values of its children (including L1 and L3).

In some cases, the semantics of a LO graph gives sufficient information to directly infer restrictions on the metadata values of a LO without having to take into account the metadata values of the neighborhood. For instance, it seems reasonable to state that most LOs which assess other LOs are intended to be directly used by the learners. In terms of LOM semantics, it means that if there is an assesses relation between a certain LO and others, then the educational/intendedUserRole of this LO should be learner.


[Table 6.1 layout: rows are the relation types isPartOf, hasPart, summarizes, summarizedBy, introducesTo, introducedBy, assessedBy, and assesses; columns are LOM attributes, grouped into General (Coverage, Aggregation Level) and Educational (Intended User Role, Semantic Density, Typical Learning Time, Difficulty).]

Table 6.1: Examples of restriction rules.

6.4.2.1 Influence Rule Scope

The previous assumptions about the consistency of LOM attribute values are special influence rules called restriction rules. When defining restriction rules, we distinguish between two kinds of LOM attributes:

1. the attributes dealing with a single value (e.g., educational/difficulty or educational/interactivityLevel), whose values are ordered in the LOM specification or can easily be ordered in a LOM profile (e.g., the easy value of the educational/difficulty attribute can be set as inferior to the difficult value),

2. the attributes which have a set of elements as value (e.g., general/coverage or general/keyword).

The LOM specification makes this distinction.

When dealing with metadata attributes having a single element as value, the restriction rules are functions taking a metadata attribute name and a relation name as parameters, and returning an element of the set $O = \{\leq, \geq\} \times \{\max, \min, V\}$, where $V$ is the set of all the possible values for the LOM attributes. For example, the restriction rule for the educational/difficulty attribute and the introducesTo relation is defined as $\leq \max_i v_i$, where the $v_i$ are the educational/difficulty values of the LOs related through the introducesTo relation.

When dealing with a metadata attribute based on a set of elements, restriction rules are functions taking a metadata attribute name and a relation name as parameters, and returning an element of the set $O_s = \{(\subseteq, \cap), (\subseteq, 2^V), (\supseteq, \cup), (\supseteq, 2^V)\}$, where $2^V$ is the power set of all the possible values for the LOM attributes. For example, the restriction rule for the general/coverage attribute and the relation hasPart is defined as $\supseteq \bigcup_i v_i$, where the $v_i$ are the general/coverage values of the LOs related through the hasPart relation. The restriction rule for the educational/intendedUserRole attribute and the relation assesses is defined as $\subseteq \{\text{learner}\}$.

Table 6.1 shows other examples of restriction rules. Note that the rules can be modified in order to adapt them to other educational contexts. An extended set of restriction rules defined in our institution can be found in Appendix B.

Definition 6.17 (Restriction Rules) Restriction rules are functions $\gamma : A \times T \to O \cup O_s$, where $O = \{\leq, \geq\} \times \{\max, \min, V\}$, $O_s = \{(\subseteq, \cap), (\subseteq, 2^V), (\supseteq, \cup), (\supseteq, 2^V)\}$, $A$ is the set of LOM attributes, $T$ is the set of relation types, and $V$ is the set of all the possible values for the LOM attributes.

Example: The restriction rules for the assumptions stated above have the following values:

γ(educational/difficulty, introducesTo) = (≤, max)

γ(general/coverage, hasPart) = (⊇,∪)

γ(educational/intendedUserRole, assesses) = (⊆, {learner})

6.4.2.2 CMV Nature

Applying restriction rules to the LO graph results in a set of special CMVs called restriction intervals.

Definition 6.18 (Restriction Interval) A restriction interval for a certain attribute a reduces the range of possible values for a.

We note $r_a(L)$ the restriction interval for a LO $L$ and the $a$ attribute. $r_a^{low}(L)$ and $r_a^{up}(L)$, the lower and upper bounds of $r_a(L)$, respectively, are elements of $V_a$ for single-element value attributes and of $2^{V_a}$ for element set value attributes.

We detect an anomaly when the value that the user set for the $a$ attribute does not belong to this interval. Another source of anomaly is when the interval becomes incoherent, i.e., it no longer complies with $r_a^{low}(L) \leq r_a^{up}(L)$ for single-element values or $r_a^{low}(L) \subseteq r_a^{up}(L)$ for element set values.

At the beginning of the propagation process, the restriction interval for the a attribute and the L LO is initialized to the whole interval of the possible values for a, i.e., [−∞, +∞] for single-element values or [∅, Va] for element set values where Va is the power set of all the values for a. If the user has set a value for the a attribute, the interval reduces to [a(L), a(L)] where a(L) is the actual value of a for L. If an interval becomes incoherent, then the propagation process excludes it.
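The initialization and the two sources of anomaly for single-element attributes might be sketched as follows. This is a minimal sketch, assuming that the ordered attribute values are encoded as numbers (for instance, ranks in the vocabulary); the names initial_interval and anomalies, and the anomaly labels, are hypothetical.

```python
import math


def initial_interval(user_value=None):
    """Return [-inf, +inf] when no value is set, else the interval [v, v]."""
    if user_value is None:
        return (-math.inf, math.inf)
    return (user_value, user_value)


def anomalies(interval, user_value=None):
    """Report the anomalies detected for one LO and one attribute."""
    low, up = interval
    found = []
    if low > up:
        # The interval no longer satisfies low <= up.
        found.append("incoherent interval")
    elif user_value is not None and not (low <= user_value <= up):
        # The value set by the user falls outside the restriction interval.
        found.append("value outside interval")
    return found
```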

6.4.2.3 Update Process

[Figure 6.8 content: the restriction intervals ra(Li), 0 ≤ i ≤ n, of the LOs L0, L1, ..., Ln connected to L by relations of type t are used to update the restriction interval ra(L) of L.]

Figure 6.8: Restriction interval update process.

Figure 6.8 depicts a step of the propagation process: the changes in the restriction interval for a certain attribute a of the LO $L_0$ are propagated to the restriction interval of its neighbor L. Restriction rules are applied to the inverse of the relation used for propagating the changes: in this case, the relation of type t connecting the LO L (receiving the update notification) to $L_0$ (propagating changes) is used. The update process of a restriction rule considers the n+1 LOs to which L is connected with relations of type t. Those LOs are denoted $L_i$ with $0 \leq i \leq n$, where n is a natural number.

If the t relation type imposes a restriction rule ≤ max for the values of the a attribute (i.e., γ(a, t) = (≤, max)), the update function consists of replacing the restriction interval $r_a(L)$ of the LO L by $r_a(L) \cap \left]-\infty, \max_i r_a^{up}(L_i)\right]$. This means that the restriction interval of L is intersected with an interval consisting of an infinite lower boundary (no effect on $r_a(L)$) and an upper boundary equal to the maximum of the upper boundaries of the restriction intervals of the LOs $L_i$ (which may lower the upper boundary of $r_a(L)$). The other restriction update functions follow the same principles, as defined below.


Definition 6.19 (Restriction Update Process) Consider an attribute a and Va the set of possible values for a. Consider also a LO L, a set of LOs Li, and a relationship of type t connecting L with the Li. If the attribute a has single-element values, then the restriction update process consists of the functions defined in Table 6.2. If the attribute a has element set values, then the restriction update process consists of the functions defined in Table 6.3.

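γ(a, t)      Restriction Update Functions
(≤, max)     $r_a(L) = r_a(L) \cap \left]-\infty, \max_i r_a^{up}(L_i)\right]$
(≤, min)     $r_a(L) = r_a(L) \cap \left]-\infty, \min_i r_a^{up}(L_i)\right]$
(≤, v)       $r_a(L) = r_a(L) \cap \left]-\infty, v\right]$, with $v \in V_a$
(≥, max)     $r_a(L) = r_a(L) \cap \left[\max_i r_a^{low}(L_i), +\infty\right[$
(≥, min)     $r_a(L) = r_a(L) \cap \left[\min_i r_a^{low}(L_i), +\infty\right[$
(≥, v)       $r_a(L) = r_a(L) \cap \left[v, +\infty\right[$, with $v \in V_a$

Table 6.2: Restriction update process for single-element attributes.

γ(a, t)      Restriction Update Functions
(⊆, ∩)       $r_a(L) = r_a(L) \cap \left[\emptyset, \bigcap_i r_a^{up}(L_i)\right]$
(⊆, v)       $r_a(L) = r_a(L) \cap \left[\emptyset, v\right]$, with $v \in 2^{V_a}$
(⊇, ∪)       $r_a(L) = r_a(L) \cap \left[\bigcup_i r_a^{low}(L_i), V_a\right]$
(⊇, v)       $r_a(L) = r_a(L) \cap \left[v, V_a\right]$, with $v \in 2^{V_a}$

Table 6.3: Restriction update process for element set attributes.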

The intersection between two intervals is defined as

[xlow, xup] ∩ [ylow, yup] = [max(xlow, ylow), min(xup, yup)]
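The interval intersection and the (≤, max) update function of Table 6.2 can be sketched in Python for single-element attributes. This is a minimal sketch, assuming that intervals are (low, up) tuples over numerically encoded attribute values; the names intersect, is_coherent, and update_le_max are hypothetical.

```python
import math


def intersect(x, y):
    """Intersection of two intervals given as (low, up) tuples."""
    return (max(x[0], y[0]), min(x[1], y[1]))


def is_coherent(interval):
    """An interval is coherent while its lower bound does not exceed its upper bound."""
    return interval[0] <= interval[1]


def update_le_max(r_l, related):
    """Apply the (<=, max) rule to r_a(L).

    r_l:     current restriction interval of L
    related: non-empty list of the restriction intervals of the LOs L_i
    The upper bound of r_a(L) is capped by the largest upper bound of the L_i.
    """
    cap = max(up for _, up in related)
    return intersect(r_l, (-math.inf, cap))
```

When the value set by the user lies above the cap, the resulting interval is no longer coherent, which signals an anomaly.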


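We define the min and max functions for the element set values a and b as

$$\min(a, b) = \begin{cases} a & \text{if } a \subseteq b \\ b & \text{if } a \supseteq b \\ \text{a composition error is thrown} & \text{otherwise} \end{cases}$$

and

$$\max(a, b) = \begin{cases} a & \text{if } a \supseteq b \\ b & \text{if } a \subseteq b \\ \text{a composition error is thrown} & \text{otherwise} \end{cases}$$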
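For element set values, the min and max functions and the (⊇, ∪) update function of Table 6.3 can be sketched as follows. This is a minimal sketch, assuming that element sets are Python frozensets and that intervals are (low, up) tuples; the names CompositionError, set_min, set_max, and update_superset_union are hypothetical.

```python
class CompositionError(ValueError):
    """Raised when two element sets are not comparable by inclusion."""


def set_min(a, b):
    """Return the smaller of two sets with respect to inclusion."""
    if a <= b:
        return a
    if a >= b:
        return b
    raise CompositionError(f"{set(a)} and {set(b)} are not comparable")


def set_max(a, b):
    """Return the larger of two sets with respect to inclusion."""
    if a >= b:
        return a
    if a <= b:
        return b
    raise CompositionError(f"{set(a)} and {set(b)} are not comparable")


def update_superset_union(r_l, related, all_values):
    """Apply the (superset, union) rule to r_a(L).

    r_l:        current restriction interval of L as (low, up)
    related:    list of the restriction intervals of the LOs L_i
    all_values: the full set of possible values, used as upper bound
    """
    lower = frozenset().union(*(low for low, _ in related))
    # Interval intersection: max of the lower bounds, min of the upper bounds.
    return (set_max(r_l[0], lower), set_min(r_l[1], all_values))
```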

Definition 6.20 A composition error thrown when updating a certain LO L means that
1. there are some incorrect relations between this LO L and the other LOs of the graph, and/or
2. there is a contradiction between two restriction rules that should be resolved.

Definition 6.21 If a restriction interval becomes empty when updating a certain LO L, i.e., if $r_a^{low}(L) \leq r_a^{up}(L)$ (for single-element values) or $r_a^{low}(L) \subseteq r_a^{up}(L)$ (for element set values) no longer holds, this means that
1. there is an incorrect value for the metadata of L, and/or
2. there are some incorrect relations between L and the other LOs of the graph, and/or
3. there is a contradiction between two restriction rules that should be resolved.

Theorem 6.22 The restriction update process is regular.

PROOF. The restriction update process has two parts that work on different types of attributes and thus do not interact. In each of these parts, half of the restriction rules lower the upper limit of the restriction interval, and the other half raise its lower limit. These processes can be considered independent since interval coherence is checked outside the update process, as described in Definition 6.21.

Lemma 6.23 The update process for (≤, max), (≤, min) and (≤, v) is regular.

PROOF (PROOF OF LEMMA). The restriction rules (≤, max), (≤, min), and (≤, v) consist in lowering the upper limit of the restriction interval with an external value. Consider $a \in A$. We denote by $r_i$ the upper bound of the restriction interval $r_a(i)$ for the LO $i$. If $i \to j$, then $r_i < r_j$, with $r_i$ being the max or min of the upper bounds of the restriction intervals of the related LOs, or $v$, for the rules (≤, max), (≤, min), or (≤, v), respectively. The restriction propagation process for these rules complies with the following properties:

Stability: $i \nrightarrow j_{i \to j}$ since $r_i \ge r_{j_{i \to j}} = r_i$.

Acyclism: $i \not\leadsto i$ since for all $k$ such that $i \leadsto k$, $r_{k_{i \leadsto k}} = r_i$ and $k_{i \leadsto k} \nrightarrow i$ because $r_i \ge r_i$.

Asymmetry: $i \leadsto j \Rightarrow j \not\leadsto i$ since if $r_i < r_j$, then $r_j \ge r_i$.

Cumulation: $i \nrightarrow j$ and $k \to j$ $\Rightarrow$ $i \nrightarrow j_{k \to j}$ since if $r_i \ge r_j$ and $r_k < r_j$ then $r_i \ge r_{j_{k \to j}}$ because $r_{j_{k \to j}} = r_k < r_j \le r_i$.

Initialization: By initializing all weights with the highest value for $a$, the graph is stable since for any $i$ and $j$, $r_i = a_{high} \ge r_j = a_{high}$.

Max: For any $i$ and $j$, $Max(i, j) = \min(r_i, r_j)$. Therefore if $Max(i, j) = i$, then $j \not\leadsto i$ since $r_j \ge r_i$. In this case, Max has order O(1).

Therefore, the lemma is proved. □

Lemma 6.24 The update process for (≥, max), (≥, min) and (≥, v) is regular.

PROOF (PROOF OF LEMMA). The restriction rules (≥, max), (≥, min), and (≥, v) consist in increasing the lower limit of the restriction interval with an external value. Consider $a \in A$. We denote by $r_i$ the lower bound of the restriction interval $r_a(i)$ for the LO $i$. If $i \to j$, then $r_i > r_j$, with $r_i$ being the max or min of the lower bounds of the restriction intervals of the related LOs, or $v$, for the rules (≥, max), (≥, min), or (≥, v), respectively. The restriction propagation process for these rules complies with the following properties:

Stability: $i \nrightarrow j_{i \to j}$ since $r_i \le r_{j_{i \to j}} = r_i$.

Acyclism: $i \not\leadsto i$ since for all $k$ such that $i \leadsto k$, $r_{k_{i \leadsto k}} = r_i$ and $k_{i \leadsto k} \nrightarrow i$ because $r_i \le r_i$.

Asymmetry: $i \leadsto j \Rightarrow j \not\leadsto i$ since if $r_i > r_j$, then $r_j \le r_i$.

Cumulation: $i \nrightarrow j$ and $k \to j$ $\Rightarrow$ $i \nrightarrow j_{k \to j}$ since if $r_i \le r_j$ and $r_k > r_j$ then $r_i \le r_{j_{k \to j}}$ because $r_{j_{k \to j}} = r_k > r_j \ge r_i$.

Initialization: By initializing all weights with the lowest value for $a$, the graph is stable since for any $i$ and $j$, $r_i = a_{low} \le r_j = a_{low}$.

Max: For any $i$ and $j$, $Max(i, j) = \max(r_i, r_j)$. Therefore if $Max(i, j) = i$, then $j \not\leadsto i$ since $r_j \le r_i$. In this case, Max has order O(1).


Lemma 6.25 The update process for (⊆,∩) and (⊆, v) is regular.

PROOF (PROOF OF LEMMA). The restriction rules (⊆, ∩) and (⊆, v) consist in lowering the upper limit of the restriction interval with an external value. Consider $a \in A$. We denote by $r_i$ the upper bound of the restriction interval $r_a(i)$ for the LO $i$. If $i \to j$, then $r_i \subset r_j$, with $r_i \cap \left(\bigcap r_{\text{other neighbors}}\right) = r_i$ for the update to be due to $i$. The restriction propagation process for these rules complies with the following properties:

Stability: $i \nrightarrow j_{i \to j}$ since $r_i \supseteq r_{j_{i \to j}} = r_i$.

Acyclism: $i \not\leadsto i$ since for all $k$ such that $i \leadsto k$, $r_{k_{i \leadsto k}} = r_i$ and $k_{i \leadsto k} \nrightarrow i$ because $r_i \supseteq r_i$.

Asymmetry: $i \leadsto j \Rightarrow j \not\leadsto i$ since if $r_i \subset r_j$, then $r_j \supset r_i$.

Cumulation: $i \nrightarrow j$ and $k \to j$ $\Rightarrow$ $i \nrightarrow j_{k \to j}$ since if $r_i \supseteq r_j$ and $r_k \subset r_j$ then $r_i \supseteq r_{j_{k \to j}}$ because $r_{j_{k \to j}} = r_k \subset r_j \subseteq r_i$.

Initialization: By initializing all weights with the whole set of possible values for the attribute $a$, the graph is stable since for any $i$ and $j$, $r_i = V_a \supseteq r_j = V_a$.

Max: For any $i$ and $j$, $Max(i, j) = \min(r_i, r_j)$. Therefore if $Max(i, j) = i$, then $j \not\leadsto i$ since $r_j \supseteq r_i$. In this case, Max has order O(1).


Lemma 6.26 The update process for (⊇,∪) and (⊇, v) is regular.

PROOF (PROOF OF LEMMA). The restriction rules (⊇, ∪) and (⊇, v) consist in increasing the lower limit of the restriction interval with an external value. Consider $a \in A$. We denote by $r_i$ the lower bound of the restriction interval $r_a(i)$ for the LO $i$. If $i \to j$, then $r_i \supset r_j$, with $r_i \cup \left(\bigcup r_{\text{other neighbors}}\right) = r_i$ for the update to be due to $i$. The restriction propagation process for these rules complies with the following properties:

Stability: $i \nrightarrow j_{i \to j}$ since $r_i \subseteq r_{j_{i \to j}} = r_i$.

Acyclism: $i \not\leadsto i$ since for all $k$ such that $i \leadsto k$, $r_{k_{i \leadsto k}} = r_i$ and $k_{i \leadsto k} \nrightarrow i$ because $r_i \subseteq r_i$.

Asymmetry: $i \leadsto j \Rightarrow j \not\leadsto i$ since if $r_i \supset r_j$, then $r_j \subset r_i$.

Cumulation: $i \nrightarrow j$ and $k \to j$ $\Rightarrow$ $i \nrightarrow j_{k \to j}$ since if $r_i \subseteq r_j$ and $r_k \supset r_j$ then $r_i \subseteq r_{j_{k \to j}}$ because $r_{j_{k \to j}} = r_k \supset r_j \supseteq r_i$.

Initialization: By initializing all weights with an empty set, the graph is stable since for any $i$ and $j$, $r_i = \emptyset \subseteq r_j = \emptyset$.

Max: For any $i$ and $j$, $Max(i, j) = \max(r_i, r_j)$. Therefore if $Max(i, j) = i$, then $j \not\leadsto i$ since $r_j \subseteq r_i$. In this case, Max has order O(1).


According to the last four lemmas, the update process of all the restriction functions is regular. □

We note that the Max functions for all the restriction update processes have order O(1).

Corollary 6.27 Restriction update propagation has order O(m + n log n) where n is the number of LOs in the graph, and m the number of relations between these LOs.
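To give an intuition for this bound, the following Python sketch propagates upper bounds in a Dijkstra-like manner. It is not the propagation algorithm of this chapter: it assumes a simplified setting in which each edge `(i, j)` only states that the upper bound of `j` must not exceed that of `i`, and it uses the binary heap of Python's `heapq` module, which gives O((m + n) log n) instead of the O(m + n log n) obtained with a Fibonacci heap. All names are hypothetical.

```python
import heapq

def propagate_upper_bounds(n, edges, initial):
    """Lower the upper bounds of n nodes along directed edges.

    edges:   list of (i, j) pairs; the bound of j must not exceed that of i.
    initial: list of n starting upper bounds.
    """
    adjacency = [[] for _ in range(n)]
    for i, j in edges:
        adjacency[i].append(j)

    bounds = list(initial)
    heap = [(bounds[i], i) for i in range(n)]
    heapq.heapify(heap)
    settled = [False] * n

    while heap:
        value, i = heapq.heappop(heap)
        if settled[i] or value > bounds[i]:
            continue  # stale heap entry left by an earlier update
        # The node with the smallest bound cannot be lowered any further.
        settled[i] = True
        for j in adjacency[i]:
            if not settled[j] and value < bounds[j]:
                bounds[j] = value
                heapq.heappush(heap, (value, j))
    return bounds
```

Each node is settled once and each edge is examined once, which is where the n and m terms of the bound come from; the logarithmic factor is the cost of the heap operations.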


R → Csimple Osimple | CONTAINS Ocontains | CONTAINED Ocontained

Csimple → INFEQ | SUPEQ

Osimple → MAX | MIN | Vsimple

Ocontains → UNION | Vset

Ocontained → INTERSECTION | Vset

Vset → Vsimple (, Vset)∗

Vsimple → V

Figure 6.9: Restriction rule context-free grammar

6.4.2.4 Influence Rule Customization

Table 6.1 (on page 103) shows a sample of restriction rules defined in our institution. In order to suit other teaching communities, new restriction rules can be defined or existing restriction rules can be refined. Rules are defined in a domain-specific language following the grammar shown in Figure 6.9.
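A small recursive-descent parser for the grammar of Figure 6.9 could look as follows. This is only a sketch: the concrete token spellings (upper-case keywords, whitespace-separated tokens, comma-separated values) are assumptions, since the grammar does not fix the lexical level, and the function names are hypothetical.

```python
KEYWORDS = {"INFEQ", "SUPEQ", "CONTAINS", "CONTAINED",
            "MAX", "MIN", "UNION", "INTERSECTION"}

def _set_operand(tokens, keyword):
    """Ocontains / Ocontained: the given keyword, or Vset = V (, V)*."""
    if tokens == [keyword]:
        return keyword
    if len(tokens) % 2 == 0:
        raise ValueError("expected a comma-separated list of values")
    values, commas = tokens[0::2], tokens[1::2]
    if any(c != "," for c in commas):
        raise ValueError("values must be separated by commas")
    if any(v == "," or v in KEYWORDS for v in values):
        raise ValueError("unexpected keyword or comma in value list")
    return frozenset(values)

def parse_rule(text):
    """Parse a rule R and return a (comparator, operand) pair."""
    tokens = text.replace(",", " , ").split()
    if not tokens:
        raise ValueError("empty rule")
    head, rest = tokens[0], tokens[1:]
    if head in ("INFEQ", "SUPEQ"):
        # Osimple: MAX, MIN, or a single value.
        if len(rest) != 1 or rest[0] == ",":
            raise ValueError("expected MAX, MIN, or a single value")
        if rest[0] in KEYWORDS - {"MAX", "MIN"}:
            raise ValueError(f"unexpected keyword: {rest[0]}")
        return (head, rest[0])
    if head == "CONTAINS":
        return (head, _set_operand(rest, "UNION"))
    if head == "CONTAINED":
        return (head, _set_operand(rest, "INTERSECTION"))
    raise ValueError(f"unknown comparator: {head}")
```

Under these assumptions, `parse_rule("INFEQ MIN")` corresponds to the rule (≤, min) and `parse_rule("CONTAINS UNION")` to the rule (⊇, ∪).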

Restriction rules may be the result of a negotiation process inside a teaching community. The restriction rules not suiting the community style are reported by users and new rules are proposed. New rules or rule modifications are thus negotiated between community members. Finally, changes may be centrally defined in order to preserve restriction rule consistency inside the community.

6.5 Context Diffusion Impact

The impact of the context on the generated information depends on the number of influence rules which are defined.

When few rules are defined for the same attribute and/or these rules only concern infrequent relations, the impact of the diffusion process is limited. This may be the case with the restriction process. For instance, in our institution, only two restriction rules were defined for the interactivity/type attribute: (interactivity/type, examplifiedBy, (≤, min)) and (interactivity/type, abstractedBy, (≥, max)). Since there are no other restriction rules for this attribute, restriction diffusion is limited to the sets of LOs linked together with examplifiedBy or abstractedBy relation types.

In contrast, there are suggestion rules for almost all the “LOM attribute / relation type” couples since these rules are tangible probabilities that are automatically calculated. Therefore, the metadata of each LO of the graph can receive the influence of the metadata of all the LOs which are directly or indirectly connected with it. Without context diffusion, this scope would have been limited to the direct neighborhood. With the accumulation and inheritance functions defined in (Hatala and Richards, 2003; Brase, 2005), the scope would have been limited to the direct and indirect neighbors related with isPartOf and hasPart relations, respectively. In contrast to these approaches, the impact of the suggestion diffusion process is spread over the entire graph.

6.6 Framework Implementation

6.6.1 Avoiding Propagation Side-Effect

In practice, our propagation system may present an undesirable side effect when decreasing metadata values according to the Max function. Indeed, consider the active propagation $i \to j$. Let $i_{new}$ be a decreasing value for $i$ such that $j_{i \to j} \to i_{new}$. The memory of the old $i$ value in the CMV of $j$ ($j_{i \to j}$) may be inconvenient: e.g., if the old $i$ value was incorrect and $i_{new}$ rectifies it, it is not acceptable to have the CMV of $i$ modified by a CMV deriving from the old value of $i$.

In order to cope with this behavior, the update process implementation must consider two rules. First, the update process of a node should not take into account the neighboring CMVs deriving from this node. Applying this rule to our example, we have $j_{i \to j} \nrightarrow i_{new}$ because $j_{i \to j}$ derives from $i$. This additional rule is consistent with the acyclism property of regular update processes. Second, update processes should always take into account all the neighboring CMVs and the original value of the updated node. Applying this rule to our example, when $i_{new}$ is assigned for $i$, this new value can update $j$ whereas there should be $i_{new} \nrightarrow j_{i \to j}$ by asymmetry of regular update processes applied to $j_{i \to j} \to i_{new}$. Therefore, the value of $j$ can reflect the changes of $i$. In fact, this property is natural since $j$ derived from $i$. Moreover, the asymmetry property remains respected since the first rule enforces $j_{i \to j} \nrightarrow i_{new}$.

The implementation of these rules implies that each CMV is decorated with the set of nodes from which it derives. When making an active update, the resulting CMV is associated with a copy of the node set associated with the CMV originating the update. Then, a reference to the updated node is added to the copy.

This requirement can be met using hash sets. The cost of a hash set is constant on average for access, e.g., with the “contains” function, and for insertion. Nevertheless, copying a hash set has a cost of O(p) where p is the number of elements of the copied set. In the worst case, the node set decorating a CMV has n − 1 elements where n is the number of nodes in the graph. Indeed, the updated element cannot be part of the set by the acyclism property of regular update processes. Consequently, implementing these two rules adds a factor of O(n) to the worst-case cost of update propagation.
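The decoration of CMVs with their provenance can be sketched in Python as follows. The sketch is not the code of the Java library; the class and function names are hypothetical. It assumes that the node recorded in the copied set is the node originating the update, which is the reading under which the first rule can be checked with a simple membership test and under which a set never contains the node that owns the CMV.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CMV:
    """A value together with the set of nodes it derives from."""
    value: float
    derived_from: frozenset = frozenset()

def derive(source_node, source_cmv, new_value):
    """CMV produced on a neighbor by an active update coming from source_node.

    Building the new frozenset copies the p elements of the source set,
    which is the O(p) cost discussed in the text.
    """
    return CMV(new_value, source_cmv.derived_from | {source_node})

def usable_cmvs(node, neighbor_cmvs):
    """First rule: ignore the neighboring CMVs that derive from `node`."""
    return [cmv for cmv in neighbor_cmvs if node not in cmv.derived_from]
```

With this structure, a CMV that node j received from node i is skipped when i is updated, while any other neighbor of j can still use it.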

6.6.2 Java Library

The framework presented in this chapter is implemented as an open source Java library available at (diffuse, 2007). This library is independent of the type of metadata processed. To achieve this independence, a set of interfaces reify the metadata types: relation semantics between metadata sets, metadata attributes, and metadata values are defined. Other interfaces reify the CMV models: the CMV nature and the CMV update processes are defined. The library also implements the update propagation algorithm based on a Fibonacci heap and the CMV models previously described. It also includes the implementation of the two CMV model applications defined in this chapter.

6.7 Conclusion

This chapter presented a framework for building systems taking advantage of the graph semantics (Section 6.2.2). It also described two CMV models dealing with graph consistency and attribute value similarity analysis. These applications make it possible to generate suggestions and restrictions for the LOM values of the elements of a LO graph (Sections 6.4.1 and 6.4.2).

In this framework, graph semantic analysis is based on a context diffusion process that attempts to cope with the problem of lesson graphs where some metadata are missing, incorrect, or incomplete. This process consists in applying influence rules, i.e., rules formalizing the influence of the graph semantics on the LOM semantics, not only to the metadata values of the neighborhood but also to the information already generated by such rules (Section 6.2.1). Depending on the lesson graph size, context diffusion can considerably decrease the effect of missing and incorrect metadata values compared with the existing methods (Section 6.5). For that reason, our LOM processing model is an answer to the research question RQ2.

Each influence rule relates a certain LOM attribute and a certain relation type with a certain behavior. The scope of this behavior needs to be clearly defined in order to ensure context diffusion convergence. This scope was defined as a set of properties to comply with. Within this scope, influence rules can be tailored without having to modify the diffusion process. When the system is used in a teacher community, influence rule tailoring can be manual: e.g., teachers may define the most appropriate rules for their teaching/learning context. Tailoring can also be automatic: e.g., some existing LO graphs may be analyzed in order to define the rules.

Furthermore, metadata types and graph relation types are parameters of the diffusion process: adapting them to other specific needs only has an effect on the influence rules but does not affect the diffusion process. For this reason, this model can be applied to various learning/teaching settings.

The next chapters use this LOM processing framework in order to answer the research questions RQ1 and RQ3. First, hybrid LOM generation and validation based on the LOM processing model are discussed. Next, the rewarding issue associated with hybrid LOM usage is considered: systems based on the LOM processing model are presented in order to reward lesson authors for generating metadata for their lesson LOs.

Chapter 7

Hybrid LOM Generation and Validation during Lesson Authoring

This chapter attempts to answer RQ1: How to seamlessly integrate hybrid LOM generation and hybrid LOM validation into a lesson authoring tool? For that purpose, it presents two hybrid systems based on the proposals of Chapter 5 for introducing human LOM usage into lesson authoring and also on the LOM processing model introduced in Chapter 6.

First, a system for integrating hybrid LOM generation during lesson authoring is discussed. Next, a system for integrating hybrid LOM validation during lesson authoring is explored.

7.1 Hybrid LOM Generation

The information produced by the automatic systems reviewed in Section 3.4 and presented in Chapter 6 may be classified in three groups:

The very probable values. Such values are the results of accurate automatic generation systems. They concern objective attributes like the technical format or the size of the document (Section 3.4.3).


The probable values. These values generated by automatic systems are generally not reliable enough to serve automatic instantiation purposes, but they may support the task of defining metadata values. These values are generally extracted from the context, like the learning management systems or the related LOs. They concern both objective and subjective attributes (Section 3.4.4 and Chapter 6).

The forbidden values. Chapter 6 proposed a method to infer some restrictions about the LOM values of a LO by analyzing the context around this LO. These restrictions may serve to reduce the range of possible values.

From the point of view of a hybrid instantiation of LOM, all these groups of information are relevant. First, the very probable values may be automatically instantiated without human intervention. Second, suggestions may be displayed to help the user in the process of metadata value instantiation. Third, the restrictions may also be used to speed up the instantiation and to limit the cognitive overload reported by (Friesen, 2004), which is due to the large range of possible values for the same attribute.

We have implemented part of these concepts in our LOM-based lesson graph builder, LessonMapper2. In particular, LessonMapper2 integrates the generator of suggestions and restrictions presented in Chapter 6. This system uses the semantics of lesson graphs based on LOs for generating potential values (suggestions) and value boundaries (restrictions) for LOM attributes.

Figure 7.1 shows the semantic density of three LOs: Base de Java, Primer Programa and Segundo Programa. The first LO contains the other two. The semantic density of Primer Programa is not defined. In this example, suggestions and restrictions were requested for this attribute. Since Primer Programa introduces Segundo Programa, the system can deduce a restriction for the educational/semanticDensity attribute: the semantic density of Primer Programa should be less than or equal to the semantic density of Segundo Programa. This is due to the fact that the rule set initializing the system contained the restriction rule (educational/semanticDensity, introducesTo, (≤, min)). As described in Chapter 6, this restriction rule may not suit the pedagogical considerations of all teacher communities. Thus, teacher communities using such a system should customize the restriction rules to suit their own needs.

Since Primer Programa is part of Base de Java, the system makes suggestions for instantiating the semantic density of Primer Programa:

medium density, i.e., the value of the semantic density of Base de Java

low density, i.e., the semantic density of Segundo Programa

Figure 7.1: Suggestions and restriction for the educational/semanticDensity attribute value of the Primer Programa LO. The size of each suggestion depends on its probability of being relevant. Note that the font color differentiates the suggested values that do not comply with the inferred restriction from the suggested value (LowDensity) that does.


high and very high density, i.e., the semantic density of other LOs not displayed in Figure 7.1 but being also part of the lesson and indirectly related to Primer Programa.

Suggestions are sized according to their relevance: suggestions having a high probability of being chosen are larger than the others. For instance, according to the probabilities listed in Appendix A, the probability that Primer Programa has the same semantic density as its parent, Base de Java, is about 50% whereas the probability of such similarity with the LO it introduces, Segundo Programa, is about 75%. Therefore, the semantic density of Segundo Programa is displayed as the largest one. As described in Section 6.4.1, these probabilities are based on the analysis of a base of 170 LOs we developed at our institution (see (Motelet, 2007) for a snapshot of the repository used). Such an analysis should be repeated locally when the system is applied to the context of another community of teachers in order to suit its specific needs. The suggested values that are forbidden, i.e., that do not comply with the inferred restrictions, are painted in a different color from the one assigned to the values that do. In LessonMapper2, a simple drag and drop lets users adopt one of the suggested values if they consider it the most appropriate. This support system benefits from human work since it is based on the metadata values of the lesson elements that the user has already instantiated in the lesson. Therefore, the more values the user assigns to the lesson metadata, the better the system support should be.
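The ranking and flagging of suggestions described above can be sketched in Python. The sketch is not the LessonMapper2 code: the function name is hypothetical, the ordered vocabulary is assumed to be the five-level semantic density scale, and only the two probabilities quoted in the text (about 50% and 75%) are used.

```python
# Assumed ordered vocabulary for educational/semanticDensity.
SCALE = ["very low", "low", "medium", "high", "very high"]

def rank_suggestions(suggestions, upper_limit):
    """Sort suggested values by decreasing probability and flag forbidden ones.

    suggestions: dict mapping a suggested value to its probability.
    upper_limit: largest value allowed by the inferred (<=) restriction.
    Returns a list of (value, probability, forbidden) triples.
    """
    cap = SCALE.index(upper_limit)
    ranked = sorted(suggestions.items(), key=lambda item: item[1], reverse=True)
    return [(value, prob, SCALE.index(value) > cap) for value, prob in ranked]
```

Called with `{"medium": 0.5, "low": 0.75}` and the upper limit `"low"`, the function places low density first and marks medium density as forbidden, which matches the display of Figure 7.1.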

This section proposed an interface that integrates suggestions (i.e., probable values) and restrictions into the lesson authoring process in order to support the LOM instantiation process. While LessonMapper2 does not implement existing methods for generating very probable values, these values can be integrated in the LessonMapper2 LOM editor as default values. Such default values are colored differently in order to differentiate them from the values instantiated by the user. Combinations of various techniques for generating very probable values can follow the framework of (Ochoa et al., 2005).

Such automatically generated values are very useful for objective values, e.g., technical attributes or author-related information, that are generally not of interest to lesson authors. In contrast, instantiating educational metadata during lesson authoring may help lesson authors to articulate their pedagogical approach. This topic is further discussed in Section 8.1.

7.2 Hybrid LOM Validation

The validity of the metadata of a certain educational resource has to be ensured in order to facilitate the access to this resource (Section 3.5). Valid metadata should satisfy a minimum level of completeness and correctness. The analysis of completeness of LOM simply consists of checking the number of instantiated attributes. Complete LOM have all their attributes instantiated (Section 3.5.1). Correctness evaluation is complex because it deals with the semantics of the metadata values. Some methods attempt to cope with this problem by comparing with results of automatic instantiation systems or by checking the consistency of the LOM attributes inside the same LO. These systems can determine that a metadata value is incorrect. Nevertheless, none of them is able to ensure the correctness of a metadata value: if a system cannot detect that a certain metadata value is incorrect, it does not mean that this metadata value is correct (Section 3.5.2). Therefore, these systems test metadata value incorrectness.

In the context of lesson authoring based on LO graphs, we propose a new method based on a different source of information: the LO graph semantics. The method uses the restrictions inferred from the analysis of the graph semantics for checking the correctness of the values.

Figure 7.2: Displaying the validation state of the metadata of a LO with LessonMapper2. Callout 1: global validation state on the tricolor bar. Callout 2: validation state for each attribute.

LOM validation is integrated into the lesson authoring process as a visual tag decorating the graph LOs. As shown in Figure 7.2, in LessonMapper2, all the learning material items of the graph are decorated with tricolor bars representing the proportion of invalid, undefined, and not invalid elements for the LOM attributes of each learning material item. Invalid elements hold values not satisfying the restrictions deduced by the system. Undefined elements do not yet have any value assigned. Finally, not invalid elements hold values successfully passing both the completeness and incorrectness tests.
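The proportions displayed by the tricolor bar can be computed as in the following Python sketch. It is only an illustration with hypothetical names: metadata are assumed to be a dictionary mapping each attribute to its value (or `None` when undefined), and restrictions are assumed to be given as a set of permitted values per attribute.

```python
def validation_state(metadata, allowed):
    """Proportion of invalid, undefined, and not invalid attributes of a LO.

    metadata: dict attribute -> value, with None for an undefined value.
    allowed:  dict attribute -> set of values permitted by the restrictions;
              an attribute missing from this dict is unrestricted.
    """
    counts = {"invalid": 0, "undefined": 0, "not invalid": 0}
    for attribute, value in metadata.items():
        if value is None:
            counts["undefined"] += 1
        elif attribute in allowed and value not in allowed[attribute]:
            counts["invalid"] += 1
        else:
            counts["not invalid"] += 1
    total = len(metadata)
    if total == 0:
        return {state: 0.0 for state in counts}
    return {state: count / total for state, count in counts.items()}
```

The three proportions sum to 1 for any non-empty LO, so they can be mapped directly to the widths of the three colored segments.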

The validity bar highlights the elements forgotten by the instantiation process and helps prevent incoherencies as far as they can be detected by the rule-based system generating the restrictions. The bar is automatically refreshed when a change occurs in the metadata values of graph elements. Figure 7.2 shows the validity results of the Primer Programa learning resource in their detailed form. This view exhibits the list of reasons for possible issues. In this example, we attached a high semantic density to the Primer Programa element. Nevertheless, as described before, there is a restriction deduced for this LO imposing that its semantic density should be less than or equal to the low semantic density of Segundo Programa. Since the current value does not comply with this restriction, the attribute semanticDensity is tagged as invalid. As discussed in Section 6.4.2.3, this error can have three different causes:

• The semantic density of this LO is not well defined. In this case, the metadata value should be changed.

• The relation introducesTo between these two LOs is not appropriate. In this case, the graph organization should be re-evaluated.

• The teacher does not agree with the rule originating the conflict. In this case, the rule should be customized.

Metrics like those of (Ochoa and Duval, 2006a) may also be used to test the incorrectness of some metadata values. While not implemented in the present LessonMapper2 version, incorrect metadata values detected with this approach will also be tagged as “invalid”.

As described in Section 5.5.1, it may be expected that lesson authors instantiating LOM during lesson authoring do not instantiate all LOM attributes at once. In this context, it is important to ensure that lesson authors can be aware of the completeness rate of their LOs. The system proposed in this section enables such awareness.


The presented methods for easing LOM generation and LOM validation require a list of suggestion probabilities and a list of restriction rules. Consequently, we extend the deployment model of LessonMapper2 presented in Section 5.7 with these requirements (Figure 7.3).

The configuration of suggestion probabilities and restriction rules depends on the chosen LOM profile. This configuration is expected to be regularly updated while the community uses the system. Thus, the suggestion probabilities are automatically updated when the repository content changes. This feature is implemented as an extension package for the eXist DB available at (LessonMapper2, 2007). Besides, observed incoherencies produced by the restriction rules as well as recurring patterns identified in the LO graphs produced by community members are reasons to modify the system restriction rules. This feature requires the intervention of an administrator collecting this information and modifying the rules accordingly.

Figure 7.3: LessonMapper2 deployment model including rule customization. Elements shown: the LOR (LOMs + LOs), the Admin and the Teachers, each using LM2, and the Configuration (LOM Profile + Suggestion Proba + Restriction Rules).

7.4 Conclusion

This chapter has presented concrete methods to support hybrid LOM generation and validation during lesson authoring and to answer the research question RQ1.

These methods are seamlessly integrated into lesson authoring. This integration is based on the LOM instantiation interface proposed in Chapter 5. Furthermore, the LOM processing model of Chapter 6 is used in order to infer probable and forbidden values for all metadata of the lesson graph LOs.

Probable values can be directly used for instantiating LOM: lesson authors can simply drag and drop probable values from a ranked list in order to instantiate LOM values. Forbidden values restrict this list of probable values. Forbidden values are also used to check the incorrectness of instantiated LOM. This test, together with a completeness test, makes it possible to report a metadata validation state for each LO of the lesson graph. This validation state constantly decorates the lesson LOs. Nevertheless, experiments are still needed to measure the impact of this hybrid approach on LOM generation and validation.

In order to complete our study of hybrid LOM usage during lesson authoring, the next chapter focuses on the last research question, RQ3, dealing with LOM benefits for lesson authors. In particular, the use of LOM for facilitating lesson design and LO retrieval during lesson authoring is studied.

Chapter 8

Rewarding LOM Generation during Lesson Authoring

The previous chapters of this thesis have focused on introducing hybrid LOM usage into lesson authoring: metadata processing of lesson LOs has been improved and hybrid LOM generation and validation have been seamlessly integrated into lesson authoring. Nevertheless, as described in Chapter 4, LOM benefits for lesson authors should also be improved in order to create a positive feedback loop where LOM benefits motivate good-quality LOM generation and good-quality LOM generation increases benefits. This is the aim of RQ3: How can lesson authors benefit from the LOM they generate?

Rewarding LOM generation during lesson authoring is the focus of this chapter. This study attempts to answer the research question RQ3. First, the use of LOM for supporting lesson design is discussed. Next, the use of our LOM processing model for enhancing LO retrieval during lesson authoring is studied. In particular, a system is proposed and compared with classical information retrieval methods.

8.1 Using LOM for supporting Lesson Design

Section 5.5.2 presented a method for visually characterizing LOs with LOM. This method consists in decorating the LOs with a visual representation of some of their educational metadata values: difficulty, intendedUserRole, interactivityType, and semanticDensity (see Figure 5.8 on page 83).


This visual information may ease the task of checking the coherence of the learning design of a lesson. E.g., if a teacher wants her lesson to progress linearly in terms of difficulty, the visual characteristics associated with the educational/difficulty attribute allow her to check this property for all the lesson items at a glance. As a consequence, this feature may be used to help the teacher ensure the difficulty progression she is planning. The same method can be applied to the semantic density of the lesson LOs. Besides, visually characterizing LOs with their intended user role and their interactivity type may also help learning design. E.g., the teacher can ensure at a glance that learner participation respects a certain pace and regularity in her lesson.

As described in Section 7.2, our hybrid validation system can detect some “invalid” metadata values. Consider that a metadata value is tagged “invalid” whereas this value suits its associated LO well. This situation indicates that the LO is not appropriately located inside the lesson graph. Suppose, e.g., that the error is due to the restriction rule (≤, max) for the assesses relation and the educational/difficulty attribute. Then, it means that the LO concerned by the error is an assessment that is more difficult than the most difficult material it assesses. A teacher community defining such a rule considers this situation as a lesson design mistake.

Other lesson design mistakes can be detected with the restriction rules defined in our case study. E.g., the restriction rules defined for the general/aggregationLevel attribute make it possible to detect whether a summary is of coarser granularity than the material it summarizes or whether an introduction is of coarser granularity than the material it introduces. Similarly, our restriction rules for the educational/difficulty attribute make it possible to detect whether an introduction is more difficult than the material it introduces or whether a summary is more difficult than the material it summarizes.

In this sense, our LOM processing model can be used to support the detection of lesson design mistakes. Lesson design mistake detection depends on the restriction rules given to the system. The more numerous the restriction rules, the more potential lesson design mistakes can be detected. Restriction rules reflect the characteristics of a teaching community. Communities sharing strict teaching/learning patterns will define more restriction rules than communities having few constraints on the teaching/learning style. The community of our case study falls into the latter case. In this community, the lesson design mistakes that can be detected by the system are those within the scope of the restriction rules defined in Annex B.
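As an illustration of such a detection, the following Python sketch applies the (≤, max) rule for the assesses relation and the educational/difficulty attribute to a small graph. The function name, the triple-based graph encoding, and the five-level difficulty vocabulary are assumptions made for the example, not the LessonMapper2 implementation.

```python
from collections import defaultdict

# Assumed ordered vocabulary for educational/difficulty.
DIFFICULTY = ["very easy", "easy", "medium", "difficult", "very difficult"]
RANK = {name: position for position, name in enumerate(DIFFICULTY)}

def find_violations(difficulty, relations, relation_type="assesses"):
    """LOs that are more difficult than the most difficult LO they relate to.

    difficulty: dict LO name -> difficulty value.
    relations:  list of (source, relation type, target) triples.
    Implements the (<=, max) rule for the given relation type.
    """
    targets = defaultdict(list)
    for source, relation, target in relations:
        if relation == relation_type and target in difficulty:
            targets[source].append(RANK[difficulty[target]])
    return sorted(
        source
        for source, ranks in targets.items()
        if source in difficulty and RANK[difficulty[source]] > max(ranks)
    )
```

In a graph where a difficult quiz assesses an easy and a medium LO, the quiz is reported; whether this reveals a wrong metadata value or a design mistake remains for the lesson author to decide.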

If the lesson author tries to generate correct metadata, most of the “invalid” flags, i.e., the red segments of the validity bar of some LOs, will indicate lesson design mistakes for the flagged LOs. This feature as well as the visual characterization of LOs can be used for facilitating the lesson design. The better the quality of the lesson LO metadata, the better the lesson design support. For that reason, the proposed lesson design support is a concrete reward for generating LOM.

8.2 Using LOM for Enhancing LO Retrieval

While authoring, the teacher is looking for LOs (e.g., an exercise, a simulation, a definition or a motivating activity) in order to support her course with proper learning/teaching material. Her primary objective is not the reuse of material per se but its integration into her own teaching, also depending on the profile of her students. As described in Section 5.6.3, we view the authoring process as follows. The teacher starts her course by building a graph where each node is a LO. These LOs are connected by semantic and/or rhetorical relationships. Moreover, we suppose that the teacher will fill in some metadata values. This graph and this metadata will be useful when searching for new material: a query corresponds to a new node in this graph, a node whose metadata values and associated material are empty but which is connected to some other nodes of the graph, thus providing a query context. In this section, we propose a way to use the context influence for enhancing LO retrieval.

The idea of using the query context in a graph setting is also present in work related to concept maps (Carvalho et al., 2001; Leake et al., 2004). These approaches propose that the user can seamlessly query a common search engine by setting out a concept node as a search topic during the knowledge construction process. Query results are then ranked according to the structure of the concept map. As suggested in (Novak and Gowin, 1984), a concept map should be defined as a tree. The distance between a concept node and the root concept of the tree influences its importance in the map. Based on this structural assumption and a few others, their system ranks the retrieved web pages. However, this work takes into account neither the semantics of the links between concept nodes nor characteristics other than node titles. The experiments of this section show that such additional characteristics are also useful for retrieving LOs.

In this section, we present a novel method making use of the graph structure to enhance LO retrieval. First, we show that in a repository of interconnected LOs, queries can be expressed in two orthogonal dimensions: (1) topologically, i.e., according to the location of the node within the graph, and (2) by a set of keywords. We also describe how to score each LO of the repository using these two dimensions. Finally, experiments show the significance of this approach.





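[Graph not reproduced. Recoverable labels: nodes L1 to L6, with the node titles “Constructors” and “Constructors Overloading”; relations “Solved By”, “Introduces To”, “Abstracted By”, “Examplified By”, and “Background For”; a “?” marks the query node.]

Figure 8.1: Start of a LO graph about “object instantiation” (from Figure 5.3)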

8.2.1 Querying a LO Repository from within the Lesson Graph

According to the proposal stated in Section 5.6.3, querying a LO repository can be done from a purely graphical point of view. For instance, consider the node L5 of Figure 8.1. This node is not associated with an existing document but is a query node: A query reflecting the need for a LO with certain characteristics is thus expressed by the position of the query node in the lesson graph. In Figure 8.1, the LOs satisfying the query node L5 should be examples of the concepts introduced by L4. Note that suggestions and restrictions can be deduced from the query node position in the graph. This information is used when comparing the query with each LO of the repository, as explained in the next section.

From a more traditional perspective, querying can be done by searching the metadata directly, as with a set of keywords. To search for a LO to substitute for L5 in the same manner as with a common search engine like Google, we could for example choose the keyword query “constructors overloading”. The metadata of the searched LO should contain this expression or at least a part of it. In our system, the Lucene search engine (Lucene, 2007) is used to process such keyword queries. Lucene is a system that indexes the textual content of documents in order to search them. When querying Lucene, results are ranked according to a classical search algorithm (e.g., the Vector Space Model). We use Lucene for indexing the metadata of the LOs contained in the repository. During the preprocessing, metadata values are stemmed, i.e., only the morphological roots of the words are kept, and tags are removed. Only metadata are indexed since most of the educational resources of the repository are multimedia documents in proprietary formats that are difficult to process. Similarly, keyword queries are stemmed before being processed by Lucene.


8.2.2 Evaluating Potential Results using the authored-Lesson Semantics

Searching documents within a repository with a set of keywords is performed by Lucene. We now describe how to evaluate the similarities between a query node and the repository LOs. We use two different methods for that purpose: one based on restrictions, the other one using suggestions.

8.2.2.1 Evaluating with inferred Restrictions

As described in Section 6.4.2, graph analysis can result in a set of restrictions for each attribute of the query node. We postulate that the searched LOs should comply with these restrictions. Repository elements satisfying the restrictions will have the best ranks. For each attribute, the score of a LO is

$$\frac{\#\mathrm{CompliedRestrictions}}{\#\mathrm{Restrictions}}$$

If no restriction is generated for a certain LOM attribute, all the evaluated LOs have the same score: 1. Since a list of restrictions can be calculated for each metadata attribute, this evaluation process can be repeated for each of them.
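The scoring rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: restrictions are modeled here as boolean predicates over an attribute value, and the function name `restriction_score` is hypothetical; the actual restriction representation is the one defined in Section 6.4.2.

```python
from typing import Callable, Sequence


def restriction_score(value: object,
                      restrictions: Sequence[Callable[[object], bool]]) -> float:
    """Fraction of restrictions that an attribute value complies with.

    `restrictions` is an assumed representation: one predicate per
    restriction inferred for the attribute of the query node.
    """
    if not restrictions:
        # No restriction generated for this attribute: every LO scores 1.
        return 1.0
    complied = sum(1 for restriction in restrictions if restriction(value))
    return complied / len(restrictions)


# Hypothetical restrictions on a learning-resource-type value.
is_not_lecture = lambda v: v != "lecture"
is_example = lambda v: v == "example"
print(restriction_score("exercise", [is_not_lecture, is_example]))
```

With a single restriction per attribute, the score degenerates to a binary value, which is the situation reported later for most attributes of the experiment.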

8.2.2.2 Evaluating with inferred Suggestions

Value suggestions can be generated by the process described in Section 6.4.1. In our example, we can diffuse a set of suggestions for each metadata attribute of L5. These suggestions are based on the context surrounding L5. Similarly, in a repository containing graphs of LOs rather than isolated LOs, it is possible to calculate a set of suggestions for each metadata attribute of each LO of the repository. We postulate that the searched LOs should have a context similar to the query node context. Therefore, the LOs of the repository where the query takes place are evaluated with respect to their context similarity and ranked accordingly. Context similarity is evaluated with the cosine between the two sets of suggestions for each metadata attribute. This comparison method is commonly used in the vector model of information retrieval (Baeza-Yates and Ribeiro-Neto, 1999). Therefore, the similarity between the context of the query node q and the context of the repository element e based on attribute att is defined as the cosine between the attribute values for q and e:

$$\mathrm{sim}_{att}(q, e) = \frac{\sum_{v_i} w_q(v_i) \times w_e(v_i)}{\sqrt{\sum_{v_i} w_q(v_i)^2} \times \sqrt{\sum_{v_i} w_e(v_i)^2}}$$

where the $v_i$ are value instances for the attribute att, and $w_q(v_i)$ and $w_e(v_i)$ are the weights associated with $v_i$ in the suggestions of q and e respectively.
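As a minimal sketch of this similarity, suggestions for one attribute can be represented as a mapping from value instances to weights. The dictionary representation, the function name `suggestion_cosine`, and the convention of returning 0 when one side has no suggestion are assumptions made for illustration only.

```python
import math
from typing import Dict


def suggestion_cosine(wq: Dict[str, float], we: Dict[str, float]) -> float:
    """Cosine between two weighted suggestion sets for one attribute.

    wq maps each suggested value to its weight for the query node q,
    we does the same for the repository element e.
    """
    dot = sum(weight * we.get(value, 0.0) for value, weight in wq.items())
    norm_q = math.sqrt(sum(weight ** 2 for weight in wq.values()))
    norm_e = math.sqrt(sum(weight ** 2 for weight in we.values()))
    if norm_q == 0.0 or norm_e == 0.0:
        # Assumed convention: an empty suggestion set matches nothing.
        return 0.0
    return dot / (norm_q * norm_e)


# Hypothetical suggestions for a learning-resource-type attribute.
query_suggestions = {"example": 0.8, "exercise": 0.2}
element_suggestions = {"example": 0.5, "definition": 0.5}
print(suggestion_cosine(query_suggestions, element_suggestions))
```

Because the cosine normalizes both vectors, only the relative weights of the suggested values matter, not their absolute magnitude.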

The diffusion mechanism that generates the suggestions is applied to the query node q and to the repository LO e to evaluate. Then, we compare the generated suggestions. Two problems arise with this approach:

Homogeneity. Compared objects are not homogeneous: A query node generally has no defined metadata value (apart from the title) whereas the repository element has. Consequently, the generated suggestions for the query node are limited to the information coming from the context (the neighborhood) while the suggestions for the repository LO are composed of both the context-based information and the original metadata values, which are always preferred over the values coming from the context. Because we are attempting to match the lesson graph with the repository graph, it is natural to simulate what the state of e would be if it were the currently introduced node. Setting e as a new node in its environment, with the same kind of information as the teacher provides for q, makes the representations of the two associated LOs more homogeneous and improves retrieval, as shown by our numerical experiments.

Orthogonality. This approach is not orthogonal enough to Lucene: Lucene indexes the LOs in the repository according to the original attributes, and thus there is an information redundancy if we re-introduce these values in the classifiers. Using only the value suggestions originating from the node neighbors takes the context into account better.

Consequently, we propose that the suggestion calculus for repository LOs should not take into account their original metadata values. Nevertheless, removing the suggestions due to the original values of a LO is not sufficient: The context may also contain values similar to the original values, whose contribution to the similarities was overwritten by the original values, which always have the highest weight (recall that suggestion diffusion solves weight conflicts using the max operator). Therefore, a special diffusion process should be considered for each repository LO in which the original values of this LO are not considered.

With such a method, the suggestions calculated for the query node and the repository LOs may deal with the two previous problems better than the classical approach. In the rest of this chapter, this diffusion method is called context-only diffusion to distinguish it from the first method, which is called full diffusion. Section 8.2.5 shows the benefit of context-only diffusion over full diffusion.


8.2.2.3 Result Classifiers

As explained before, restriction compliance and suggestion similarity can be used to evaluate the relevance of repository LOs for each metadata attribute. While some research argues that some LOM attributes depend on each other (e.g., see Table 3.1 on page 52), we assume for simplicity in this section that they are independent. Therefore, we can consider having at most two independent classifiers per LOM attribute: one based on restriction compliance and the other on suggestion similarity. In practice, we based our experiment on 17 attributes (see Appendix A for a listing). Thus, we have a total of 34 classifiers.

8.2.3 Experiments

8.2.3.1 Description

Within our institution, a LO repository was implemented and populated with 170 LOs about a single topic: an introductory Java course (see (Motelet, 2007) for a database snapshot). This repository contains LO graphs, i.e., interrelated LOs of various granularities. Most of the components of this repository are fine-grained, each one corresponding to about 5 minutes of teaching material. In contrast with other repositories, relation semantics in this repository were based on the proposal made in Section 5.3.2.

Eleven teachers of Java Programming, working in two Chilean universities and not directly involved in our project, were asked to complete a lesson about object instantiation and method call. In order to help them, they were informed that the lesson should tackle the following topics: constructor, new, method call, ’.’, constructors with arguments, method overloading, delegation, object design, separation of concerns (see Figure D.1 in Appendix D). They were also informed that topic ordering was flexible. Topics were purposely defined at various levels of granularity and without apparent coherence so that each teacher felt free to give an original contribution without being restricted by their usual way of teaching those topics.

The lesson was presented as a graph built with the LessonMapper2 tool and having the characteristics discussed in Chapter 5. The tool was previously introduced to the teachers, along with examples of lesson graphs and available relation types. Each teacher was confronted with 4 different situations: (1) a graph with one LO, (2) a graph with five LOs (including the first graph), (3) a graph with nine LOs (including the second one), and (4) a graph with thirteen LOs (including the third one) (see Figures D.2, D.3, D.4, and D.5 in Appendix D). The content of the presented graphs was not previously known by the teachers: The presented LOs were original and not similar to the teachers’ own courses. The repository did not contain any LOs used in the presented graphs.

The proposed graphs were purposely incomplete in order to prompt the teachers to complete them. For each situation, teachers were asked to complete the corresponding lesson graph with two new LOs of their choice. The teachers had to thoroughly describe the required LOs so that the interviewer could identify which repository LO matched the teacher’s intent. The matching LOs were not communicated to the teachers. Instead, they were asked to search for them in the repository using common keyword queries and to locate the expected material inside the graphs, i.e., by defining query nodes (see Figure D.6 in Appendix D). Teachers knew that keyword queries were used to search the metadata of the available LOs of the repository and not their content. Where necessary, query terms referring to some special vocabulary values were defined in natural language and then replaced by the interviewer with the appropriate vocabulary value.

The four situations gave rise to 23, 21, 22, and 22 test cases respectively, each including a keyword query and a position in the lesson graph along with the relevant results (see (Motelet, 2007) for test case records). In the first situation, one teacher formulated three queries instead of two. In the second situation, one query had no answer in the repository and was subsequently ignored.

8.2.3.2 Preliminary Results

The performance of each evaluation system is analyzed from the perspective of Precision-Recall graphs (Salton and Mcgill, 1986). Recall is the fraction of the relevant documents which have been retrieved, while Precision is the fraction of the retrieved documents which are relevant. Since the recall levels for each test case may differ from the standard recall levels (from 0% to 100% with steps of 5%), the precision-recall data of the experiment are interpolated to fit these standard levels. In the remainder of this chapter, the presented precision-recall curves are the average of the interpolated precision-recall results of all the test cases.
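The interpolation step can be illustrated with the following sketch. The text does not state which interpolation rule was applied, so the code assumes the rule customary in information retrieval: the interpolated precision at a standard recall level is the highest precision observed at any recall greater than or equal to that level. The function name and the input representation (a ranked list of relevance flags) are illustrative assumptions.

```python
from typing import List, Optional, Sequence


def interpolated_precision(ranked_relevance: Sequence[bool],
                           n_relevant: int,
                           levels: Optional[Sequence[float]] = None) -> List[float]:
    """Interpolated precision of one test case at standard recall levels.

    ranked_relevance[k] tells whether the result at rank k+1 is relevant;
    n_relevant is the total number of relevant LOs for the test case.
    """
    if levels is None:
        levels = [i / 20 for i in range(21)]  # 0%, 5%, ..., 100%
    points = []  # observed (recall, precision) pairs
    hits = 0
    for position, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / n_relevant, hits / position))
    # Assumed rule: best precision at any recall >= the requested level.
    return [max((p for r, p in points if r >= level - 1e-12), default=0.0)
            for level in levels]


curve = interpolated_precision([True, False, True, False], n_relevant=2)
print(curve[0], curve[-1])
```

Averaging such curves level by level over all test cases gives one curve per classifier, comparable across test cases that have different numbers of relevant LOs.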

Lucene and our restriction-based and suggestion-based classifiers are somewhat orthogonal classifiers: the former focuses on the content while the latter focus on the context. We propose to promote the LOs that are relevant for both Lucene and these classifiers. To this end, this section discusses both the original classifiers and new classifiers whose score is the score of one context-based classifier multiplied by the Lucene score. We call the latter mixed classifiers, in contrast with the former, which are called original classifiers.
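A mixed classifier can be sketched as a simple product of scores. The dictionaries below are hypothetical data and the function names are illustrative; the only element taken from the text is the multiplication of a context-based score by the Lucene score.

```python
from typing import Dict, List


def mixed_scores(context_scores: Dict[str, float],
                 lucene_scores: Dict[str, float]) -> Dict[str, float]:
    """Score of a mixed classifier: context-based score times Lucene score."""
    return {lo: context_scores.get(lo, 0.0) * score
            for lo, score in lucene_scores.items()}


def ranking(scores: Dict[str, float]) -> List[str]:
    """LO identifiers sorted from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scores for three repository LOs.
lucene = {"lo_a": 0.5, "lo_b": 0.4, "lo_c": 0.1}
context = {"lo_a": 0.2, "lo_b": 1.0, "lo_c": 1.0}
print(ranking(mixed_scores(context, lucene)))
```

In this toy case, lo_b overtakes lo_a because it is supported by both the keyword match and the graph context, which is the promotion effect the mixed classifiers aim for.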


Restriction-based Classifiers. Figure 8.2 shows the performance of the restriction-based classifiers in comparison with Lucene. The top of the figure illustrates that all the original restriction-based classifiers have very low performance compared with the Lucene classifier. Nevertheless, some of the mixed restriction-based classifiers perform better than Lucene (in particular for the attributes educational/learningResourceType and general/aggregationLevel). The other mixed classifiers are less efficient than Lucene, e.g., the educational/intendedEndUserRole-based mixed classifier. Note also that some classifiers have a null precision; these classifiers concern LOM attributes for which no restriction rules were defined (e.g., the educational/typicalAgeRange-based classifier).

Suggestion-based Classifiers. Figure 8.3 shows the performance of the suggestion-based classifiers in comparison with Lucene. As with the original restriction-based classifiers, the original suggestion-based classifiers have low performance compared with Lucene (see the top of the figure). Nevertheless, some of the mixed suggestion-based classifiers, i.e., classifiers combined with Lucene, perform better than Lucene. This is the case of the mixed suggestion-based classifiers for the attributes educational/interactivityType, educational/intendedEndUserRole, and general/description. The other mixed classifiers have lower performance than Lucene. This is the case of the mixed suggestion-based classifiers for the attributes general/coverage, general/title, educational/learningResourceType, and general/aggregationLevel.

The classifiers shown in Figure 8.3 are based on context-only diffusion, i.e., the values of the evaluated node are not taken into account in the computation of the suggestions (Section 8.2.2.2). Figure 8.4 plots the average performance of the mixed classifiers for both context-only diffusion and full diffusion. While the graph shows a slight advantage for context-only diffusion over full diffusion in our experiments, this benefit is too small to be generalized. Note that neither of the two methods performs better than Lucene on average.

8.2.3.3 Conclusion

Depending on the attribute, both restriction-based and suggestion-based methods multiplied by Lucene show results which are better than, equivalent to, or worse than Lucene. In order to enhance LO retrieval, the best combination of classifiers should be found. Therefore, the conditions under which a certain classifier works best have to be defined.

However, it seems reasonable to think that the classifiers performing well in these experiments may not produce the same result in other educational settings. Therefore, manual tuning is of little use without larger experiments generalizing classifier behavior. Nevertheless, such extended experiments are out of the scope of this thesis. An alternative to manual tuning consists of using an automatic process in order to find the best combination of classifiers. Such a method has the advantage of being repeatable without human intervention. Therefore, the system can adapt to other teaching/learning situations. The next section introduces such a system.

[Plots not reproduced. Axes: Recall (0% to 95%) against Precision (0% to 60%). Legend: general/title, general/keyword, general/description, general/language, general/coverage, general/structure, general/aggregationlevel, educational/interactivitytype, educational/learningresourcetype, educational/interactivitylevel, educational/semanticdensity, educational/intendedenduserrole, educational/context, educational/typicalagerange, educational/difficulty, educational/description, educational/language, and Lucene.]

Figure 8.2: Top: Recall-precision for the original classifiers based on restriction compliance. Bottom: Recall-precision for the mixed classifiers based on restriction compliance.

[Plots not reproduced. Axes: Recall (0% to 95%) against Precision (0% to 60%); Lucene is plotted as a reference curve.]

Figure 8.3: Top: Recall-precision for the original classifiers based on suggestion similarity. Bottom: Recall-precision for the mixed classifiers based on suggestion similarity.

[Plot not reproduced. Axes: Recall (0% to 95%) against Precision (0% to 60%). Curves: Lucene, mean for context-only diffusion, mean for full context diffusion.]

Figure 8.4: Average precision-recall results for Lucene alone, the classifiers based on context-only diffusion multiplied by Lucene, and the classifiers based on full diffusion multiplied by Lucene.

8.2.4 Machine Learning System for combining Classifiers

We use the RankBoost algorithm to combine the classifiers (Freund et al., 2003). RankBoost is a machine learning algorithm that tries to learn the optimal combination of several weak or uncertain classifiers. RankBoost is based on the Boosting principle introduced in (Schapire, 1990). Boosting occurs in stages: At every stage, a weak learner (i.e., one that has an accuracy only slightly greater than chance) is trained with the data. The output of the weak learner is then added to the learned function, with some strength (proportional to how accurate the weak learner is). Then, the data is reweighted: examples that the current learned function gets wrong are "boosted" in importance, so that future weak learners will attempt to fix the errors. Compared with other techniques, RankBoost offers good results for optimizing the combination of classifiers (Freund et al., 2003).

Since relevance can be considered binary (a repository element is either relevant or irrelevant), we can use the simplest version of the RankBoost algorithm: At each iteration i of the algorithm, a certain classifier is considered, along with a certain threshold rank and a certain weight α, proportional to how accurate the classifier is. This triplet is chosen because it performs best on the training set. After each iteration, the weights of the elements that the chosen triplet gets wrong are deliberately increased, while the weights of the elements that the chosen triplet gets right are deliberately decreased. Therefore, the next iteration will focus on the elements that are not well ranked by the previous triplet.
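The reweighting step can be sketched as follows, in the spirit of the bipartite variant of RankBoost described by Freund et al. (2003). This is a simplified illustration, not the implementation used in the experiments: the function name `reweight`, the set-based representation, and the example data are assumptions. The weak ranker is assumed to output 1 for the elements it ranks at or above the threshold rank and 0 otherwise.

```python
import math
from typing import Dict, Set


def reweight(weights: Dict[str, float], relevant: Set[str],
             fires: Set[str], alpha: float) -> Dict[str, float]:
    """One boosting-style weight update for binary relevance.

    fires: elements the chosen classifier ranks at or above its threshold.
    Relevant elements that fire lose weight (handled correctly); irrelevant
    elements that fire gain weight (handled wrongly). Each group is then
    renormalized to sum to 1.
    """
    updated = {}
    for element, weight in weights.items():
        h = 1.0 if element in fires else 0.0
        sign = -1.0 if element in relevant else 1.0
        updated[element] = weight * math.exp(sign * alpha * h)
    irrelevant = set(weights) - relevant
    for group in (relevant, irrelevant):
        total = sum(updated[element] for element in group)
        for element in group:
            updated[element] /= total
    return updated


start = {"r1": 0.5, "r2": 0.5, "i1": 0.5, "i2": 0.5}
after = reweight(start, relevant={"r1", "r2"}, fires={"r1", "i1"}, alpha=1.0)
print(after)
```

After the update, the missed relevant element r2 and the wrongly promoted irrelevant element i1 carry more weight than their well-handled counterparts, so the next triplet is selected mainly on its ability to fix them.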

The result of learning is a set of triplets (classifier, threshold rank, weight), which were chosen by the algorithm. This set is accessed through convenience step functions $f_i$, one for each classifier. Each step function maps a rank to a weight: the sum of the weights of all the triplets of the concerned classifier whose threshold rank is lower than or equal to the considered rank. The final score of a LO $L$ is given by

$$\sum_i f_i(r_i(q, L))$$

where $r_i(q, L)$ is the rank of $L$ according to the $i$th classifier and $f_i$ is the function learnt by RankBoost for this classifier.
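The step functions and the final score can be sketched as below. Ranks follow the convention used later for the RankBoost hypotheses, where the best rank is -1 and the worst is -170, so that a better rank is numerically greater. The triplet values, classifier names, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

Triplet = Tuple[str, int, float]  # (classifier, threshold rank, weight alpha)


def build_step_functions(triplets: Iterable[Triplet]) -> Dict[str, Callable[[int], float]]:
    """One step function per classifier, built from the learnt triplets."""
    by_classifier = defaultdict(list)
    for classifier, threshold, alpha in triplets:
        by_classifier[classifier].append((threshold, alpha))

    def make(pairs):
        # f(rank): sum of the weights whose threshold rank is <= rank.
        return lambda rank: sum(alpha for threshold, alpha in pairs
                                if threshold <= rank)

    return {name: make(pairs) for name, pairs in by_classifier.items()}


def final_score(step_functions: Dict[str, Callable[[int], float]],
                ranks: Dict[str, int]) -> float:
    """Sum over classifiers i of f_i(r_i(q, L)) for one LO L."""
    return sum(f(ranks[name]) for name, f in step_functions.items())


# Hypothetical learnt triplets.
functions = build_step_functions([
    ("sug-title", -30, 1.0),
    ("sug-title", -150, 0.5),
    ("res-type", -10, 0.25),
])
print(final_score(functions, {"sug-title": -5, "res-type": -20}))
```

Since all weights are positive in the variant used here, each step function is non-decreasing in the rank: an LO can only gain score by being ranked better by a classifier.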

A number of iterations should be chosen when running the algorithm. The more iterations are done, the better the generated combination of classifiers works on the training set. Nevertheless, when the algorithm is used with too many iterations, it tends to fall into over-fitting: The generated combination becomes too specific and loses efficiency in the test cases outside the training set. In order to avoid this phenomenon, (Freund et al., 2003) proposes that each function $f_i$ should monotonically increase with respect to the rank $r_i$ (given that the best ranks are superior to the other ones): This is the “positive cumulative weight” version of RankBoost. The remaining experiments are based on this version.

8.2.5 Performance Analysis

We use RankBoost twice:

Context-only diffusion version. The RankBoost algorithm is used for combining the results of (1) mixed restriction-based classifiers and (2) mixed suggestion-based classifiers using context-only diffusion.

Full-diffusion version. The RankBoost algorithm is used for combining the results of (1) mixed restriction-based classifiers and (2) mixed suggestion-based classifiers using full diffusion.

We train and test the two versions using a 4-fold cross-validation. Each fold corresponds to one of the four situations of the experiment: The data of three situations is used for training and the fourth one is used for testing.
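The fold construction can be sketched as a leave-one-situation-out split. The test cases are placeholders here; only the number of cases per situation (23, 21, 22, and 22) comes from the experiment description, and the function name is an illustrative assumption.

```python
from typing import Dict, Iterator, List, Tuple


def situation_folds(cases_by_situation: Dict[int, List[object]]
                    ) -> Iterator[Tuple[int, List[object], List[object]]]:
    """Yield (held-out situation, training cases, test cases) for each fold."""
    for held_out in sorted(cases_by_situation):
        train = [case
                 for situation, cases in cases_by_situation.items()
                 if situation != held_out
                 for case in cases]
        test = list(cases_by_situation[held_out])
        yield held_out, train, test


# Placeholder test cases, identified by (situation, index).
sizes = {1: 23, 2: 21, 3: 22, 4: 22}
cases = {s: [(s, i) for i in range(n)] for s, n in sizes.items()}
for held_out, train, test in situation_folds(cases):
    print(held_out, len(train), len(test))
```

Splitting by situation rather than at random means each fold is tested on a graph size that was never seen during training.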

8.2.5.1 RankBoost-based Combination of Classifiers Vs. Lucene alone

Figure 8.5 shows the standard precision-recall curves for the data of the experiment described above. This graph shows that the RankBoost algorithm performs better than Lucene alone for both the full and context-only diffusions. The difference is even larger with the context-only diffusion. Table 8.1 summarizes t-tests confirming the statistical significance of this difference.

Recalls     Mean          95% Confidence Interval       p-value
0%-25%      0.05226301    0.01948787    0.08503815      0.002109
0%-50%      0.05020075    0.02076609    0.07963540      0.001053
0%-75%      0.04613934    0.01865014    0.07362853      0.001251
0%-100%     0.04106811    0.01429855    0.06783768      0.003039

Table 8.1: T-tests on precision differences between RankBoost (context-only diffusion version)and Lucene.

[Plot not reproduced. Axes: Recall (0% to 100%) against Precision (0% to 60%). Curves: RankBoost for context-only diffusion, RankBoost for full-context diffusion, Lucene.]

Figure 8.5: Precision-recall results of Lucene and the RankBoost combinations.

[Box-plots not reproduced. Axes: Recall (0% to 100%) against Percentage of Precision Gain (−10% to 60%).]

Figure 8.6: Box-plots of the improvement of RankBoost (context-only diffusion version) over Lucene. A boxplot consists of the smallest observation, lower quartile (the box base), median (the middle line in the box), upper quartile (the box top), and largest observation; in addition, the boxplot indicates which observations, if any, are considered unusual, or outliers (the isolated round points).

In order to gain more insight into the results, Figure 8.6 depicts a sequence of box-plots corresponding to the mean proportional differences between the evaluation based on RankBoost for the context-only diffusion version and the evaluation based on Lucene alone. These results are based on 100 samples drawn randomly with replacement from the 88 test cases (Bootstrap Sampling). According to these statistics, RankBoost for the context-only diffusion version presents a consistently positive gain over Lucene alone. On average, this gain is more than 10%. We also note that almost none of the generated samples shows a decrease of precision compared with Lucene up to 70% of recall. Beyond 70% of recall, the presented systems (including Lucene) have a precision too low to be of interest to the user. Therefore, it can be stated that in these test cases the use of our method is always either beneficial or equivalent to Lucene alone. The increase in gain between 35% and 65% of recall coincides with the performance decrease of both systems beyond 35% of recall.
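The bootstrap procedure can be sketched as follows for one recall level. The sample size is not stated in the text; the code assumes the customary choice of drawing as many cases as there are test cases (88). The gains are placeholder values, and the function name and fixed seed are illustrative assumptions.

```python
import random
from typing import List, Sequence


def bootstrap_mean_gains(gains: Sequence[float],
                         n_samples: int = 100,
                         seed: int = 0) -> List[float]:
    """Mean gain of each bootstrap sample drawn with replacement.

    gains holds, for one recall level, the proportional precision
    difference between the combined ranking and Lucene per test case.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n = len(gains)
    means = []
    for _ in range(n_samples):
        sample = [gains[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    return means


# Placeholder gains for 88 test cases (not the experimental data).
placeholder_gains = [0.05 * (i % 5) for i in range(88)]
means = bootstrap_mean_gains(placeholder_gains)
print(min(means), max(means))
```

The distribution of the 100 sample means at each recall level is what a box-plot then summarizes.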

8.2.5.2 RankBoost Hypothesis Analysis

As described in Section 8.2.4, RankBoost generates a step function for each classifier. Figure 8.7 shows the step functions of the selected classifiers when applying RankBoost to the context-only diffusion version of our system. The step functions are presented over a fixed range of ranks corresponding to the size of the repository (-170 is the worst rank whereas -1 is the best rank). The associated weights are positive and monotonically increasing since we use the “positive cumulative weight” version of RankBoost to avoid over-fitting.

The presented results accumulate the hypotheses generated over the four training sets, that is to say, using the LO graphs with 5, 9, and 13 elements, then using the graphs with 1, 9, and 13 elements, next using the graphs with 1, 5, and 13 elements, and finally using the graphs with 1, 5, and 9 elements (4-fold cross-validation). Appendix E details the results of RankBoost in these four situations.

In the cumulative results of Figure 8.7, we observe three types of classifiers:

Weak classifiers. These are the classifiers which are selected by RankBoost but have little influence on the final ranking score. In our test cases, they are the mixed restriction-based classifiers for educational/intendedEndUserRole, general/aggregationLevel, and general/structure, and the mixed suggestion-based classifiers for educational/language, general/coverage, and general/keyword.

Depreciating classifiers. These are mixed classifiers giving the same positive weight to most ranks but depreciating the few elements that have very bad ranks (worse than the 150th rank). These classifiers depreciate the repository LOs having a bad score in both Lucene and the original suggestion-based or restriction-based classifiers. In our test cases, depreciating classifiers are the mixed suggestion-based classifiers for the attributes educational/difficulty, educational/intendedEndUserRole, educational/learningResourceType, educational/typicalAgeRange, educational/interactivityLevel, educational/interactivityType, educational/semanticDensity, and general/language.

Overrating classifiers. These are mixed classifiers favoring elements having a good rank (better than 30). These classifiers bring out the repository LOs having a good score in both Lucene and the original suggestion-based or restriction-based classifiers. In our test cases, overrating classifiers are the mixed restriction-based classifier for educational/learningResourceType and the mixed suggestion-based classifiers for educational/description, general/description, and general/title. We note that the mixed suggestion-based classifiers for educational/interactivityLevel and educational/interactivityType are both overrating and depreciating classifiers.

[Plots not reproduced. One panel per selected classifier, with axes Rank (−150 to 0) against Alpha (0.0 to 2.0). Panel titles: Res−educational/intendedenduserrole*Lucene, Res−educational/learningresourcetype*Lucene, Res−general/aggregationlevel*Lucene, Res−general/structure*Lucene, Sug−educational/description*Lucene, Sug−educational/difficulty*Lucene, Sug−educational/intendedenduserrole*Lucene, Sug−educational/interactivitylevel*Lucene, Sug−educational/interactivitytype*Lucene, Sug−educational/language*Lucene, Sug−educational/learningresourcetype*Lucene, Sug−educational/semanticdensity*Lucene, Sug−educational/typicalagerange*Lucene, Sug−general/aggregationlevel*Lucene, Sug−general/coverage*Lucene, Sug−general/description*Lucene, Sug−general/keyword*Lucene, Sug−general/language*Lucene, Sug−general/title*Lucene.]

Figure 8.7: Cumulated hypotheses generated by the context-only diffusion version of RankBoost when training with every combination of 3 situations taken from the four situations of the test cases.

Note that some classifiers have bad results when taken individually, but are nevertheless included in the classifier combination generated by RankBoost. This occurs with the mixed suggestion-based classifiers for educational/learningResourceType or general/aggregationLevel. This point illustrates that the RankBoost process tends to benefit from weak classifiers.

In our test cases, the system prefers the mixed suggestion-based classifiers over the mixed restriction-based classifiers. This is probably due to the fact that the experiments are based on few restriction rules (see Appendix B). Indeed, the system generally generates only one restriction, or even none, per LOM attribute in this experiment. Thus, the results of the mixed restriction-based classifiers are often binary: A LO either complies with the restriction (if it exists) or does not. In such a case, only the LOs sharing the first ranks are interesting. For that reason, the restriction-based classifiers are overrating classifiers.

We also note that the overrating mixed suggestion-based classifiers mostly concern free-text attributes (e.g., general/title, general/keyword, general/description, or educational/description) while the depreciating mixed suggestion-based classifiers deal with vocabulary-based attributes (e.g., educational/intendedEndUserRole, educational/learningResourceType, or educational/difficulty). In the former case, this means that Lucene and the original suggestion-based classifiers for free-text attributes tend to give good ranks to the same relevant results. In the latter case, this means that Lucene and the original suggestion-based classifiers for attributes based on vocabulary tend to give bad ranks to the same irrelevant results.

On the one hand, the scope of possible values for the attributes based on free text is not restricted. Therefore, these values can be diverse enough to identify LOs. For that reason, classifiers for free-text attributes can rank repository LOs pertinently. Thus, the mixed suggestion-based classifiers for free-text attributes included in the RankBoost combination are overrating classifiers.

On the other hand, the scope of possible values for the attributes based on vocabulary is restricted to a predefined vocabulary list. Thus, several repository LOs can share the same vocabulary term without being closely related. For that reason, classifiers for attributes based on vocabulary may not be sufficiently selective to ensure the pertinence of well-ranked LOs. However, this information can be used to exclude the repository LOs that do not fall within this selection. This is probably why mixed suggestion-based classifiers for attributes based on vocabulary are depreciating classifiers.

Since most educational attributes are based on vocabulary, classifiers for educational attributes are depreciating classifiers. While not sufficiently accurate to be used independently, these classifiers have a significant impact on the final RankBoost result. E.g., the mixed suggestion-based classifiers for the attributes educational/interactivityLevel, educational/interactivityType, and educational/semanticDensity are among the five most important classifiers (alpha over 1.5) in our experiments. Consequently, these experiments show that the educational attributes can be useful for enhancing LO retrieval when automatically processed.
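To make the role of a depreciating classifier concrete, the sketch below combines per-classifier scores as an alpha-weighted sum. It is a simplified illustration, not the RankBoost implementation used in the experiments: the function name, the data structures, the scores, and the alpha values are all assumptions chosen for the example.

```python
def combine_rankings(scores_by_classifier, alphas):
    """Rank LOs by the alpha-weighted sum of their per-classifier scores.

    scores_by_classifier: dict classifier name -> dict LO id -> score in [0, 1]
    alphas: dict classifier name -> weight of that classifier
    Both structures are hypothetical; they only illustrate the combination.
    """
    combined = {}
    for name, scores in scores_by_classifier.items():
        for lo_id, score in scores.items():
            combined[lo_id] = combined.get(lo_id, 0.0) + alphas[name] * score
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


# Illustrative values only (not taken from the experiments).
scores = {
    "lucene": {"lo_a": 0.9, "lo_b": 0.8, "lo_c": 0.7},
    # A vocabulary-based classifier is barely selective: it cannot order
    # lo_a and lo_c, but it singles out lo_b as not matching.
    "educational/difficulty": {"lo_a": 1.0, "lo_b": 0.0, "lo_c": 1.0},
}
alphas = {"lucene": 1.0, "educational/difficulty": 1.6}
ranking = combine_rankings(scores, alphas)
```

In this toy setting the vocabulary-based classifier does not decide the first rank, but it pushes lo_b below lo_c, which corresponds to the excluding behavior described above.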

8.2.5.3 Full Diffusion vs. Context-only Diffusion

Figure 8.8 plots the results of Bootstrap Sampling comparing the performance of the RankBoost combination for the full-diffusion version (in white) with the context-only diffusion version (in gray) over Lucene alone. The boxplots include notches around the median line. Notches represent the normalized standard deviation of the records, i.e., $\pm 1.58 \times \mathit{interquartileRange} / \sqrt{n}$, where n is the number of records displayed by the boxplot. Therefore, if the notches of two plots do not overlap, this is “strong evidence” that the two medians differ (Chambers et al., 1983). This is the case when comparing the context-only diffusion version with the full-diffusion version for a recall under 70%. Therefore, in that recall range, the benefits of the context-only version over the full-diffusion version are significant.
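The notch comparison can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names are assumptions, and the quartiles are computed with the default method of `statistics.quantiles`, which may differ slightly from the quartile definition of the plotting tool used for the figures.

```python
import math
import statistics


def notch_interval(values):
    """Return the (lower, upper) bounds of the notch around the median."""
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    median = statistics.median(values)
    return median - half_width, median + half_width


def notches_overlap(sample_a, sample_b):
    """True if the notch intervals of the two samples intersect."""
    low_a, high_a = notch_interval(sample_a)
    low_b, high_b = notch_interval(sample_b)
    return low_a <= high_b and low_b <= high_a
```

When `notches_overlap` returns False for two sets of precision gains at a given recall level, the two medians can be considered different in the sense of Chambers et al. (1983).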

Figure E.5 of Appendix E shows the hypothesis generated by RankBoost when combining the full-diffusion version of the classifiers. We remark that the suggestion-based classifier for general/title is overused compared with the other classifiers.

As stated in Section 8.2.2.2, the compared data of the full-diffusion version are less homogeneous than in the context-only version since the attributes of the query nodes are generally not instantiated. However, the general/title attribute is used for defining the keyword query. Thus, in the full-diffusion version, the classifier dealing with this attribute works on more homogeneous data than the other classifiers. Consequently, RankBoost tends to emphasize it.

Figure 8.8: Boxplots of the improvement of the context-only diffusion version of RankBoost over Lucene alone (gray boxplots) and of the full-diffusion version of RankBoost over Lucene alone (white, slim boxplots). [Horizontal axis: Recall, 0% to 100%. Vertical axis: Percentage of Precision Gain, −10% to 60%.]

8.2.5.4 Semantics vs. Topology

Our diffusion scheme for the suggestion calculus may be compared to the work of Marchiori (1998), which proposes a diffusion algorithm based on fuzzy logic (see Section 3.4.4.1). In that approach, the reduction rate at each diffusion step is fixed. In contrast, the reduction rate used in our approach depends on the attribute and the relation type (these rates are listed in Appendix A). We built a new version of the diffusion algorithm to simulate Marchiori's proposal: This version takes a fixed reduction rate of 70%, i.e., p(a, t) = 0.7 for any attribute a and any relation type t. This rate was chosen because it gave the best precision-recall results on our test cases. With a fixed reduction rate, the semantics of the graph do not influence the results of the suggestion diffusion process. Indeed, only the shape of the graph is taken into account in the suggestion computation. For that reason, this suggestion diffusion version is called topology diffusion.
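The difference between the two variants can be illustrated with a small propagation sketch. It is not the diffusion algorithm of Chapter 6: the breadth-first propagation, the cut-off threshold, the function names, and the semantic rate values are assumptions made for this example (the actual rates are those listed in Appendix A). Only the fixed value 0.7 comes from the text.

```python
from collections import deque


def diffuse(graph, source, rate, attribute, min_weight=0.05):
    """Propagate a suggestion weight from `source` along typed edges.

    graph: dict node -> list of (neighbor, relation_type) pairs
    rate: function (attribute, relation_type) -> factor p(a, t) in [0, 1]
    Returns a dict node -> strongest weight that reached the node.
    """
    weights = {source: 1.0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor, relation in graph.get(node, []):
            weight = weights[node] * rate(attribute, relation)
            # Keep only improvements that stay above the cut-off.
            if weight >= min_weight and weight > weights.get(neighbor, 0.0):
                weights[neighbor] = weight
                queue.append(neighbor)
    return weights


def topology_rate(attribute, relation):
    """Topology diffusion: the same factor for every attribute and relation."""
    return 0.7


# Hypothetical semantic rates, for illustration only.
HYPOTHETICAL_RATES = {
    ("general/aggregationLevel", "explains"): 0.9,
    ("general/aggregationLevel", "assesses"): 0.0,
}


def semantic_rate(attribute, relation):
    """Semantic diffusion: the factor depends on attribute and relation type."""
    return HYPOTHETICAL_RATES.get((attribute, relation), 0.0)


lesson = {"lo1": [("lo2", "explains")], "lo2": [("lo3", "assesses")]}
topology = diffuse(lesson, "lo1", topology_rate, "general/aggregationLevel")
semantic = diffuse(lesson, "lo1", semantic_rate, "general/aggregationLevel")
```

With the fixed rate, lo3 is reached whatever the relation types are, because only the path length matters. With the semantic rates, the assesses edge blocks the propagation, so lo3 receives no suggestion.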

Figure 8.9: Boxplots of the improvement of the context-only diffusion version of RankBoost over Lucene alone (gray boxplots) and of the topology diffusion version of RankBoost over Lucene alone (white, slim boxplots). [Horizontal axis: Recall, 0% to 100%. Vertical axis: Percentage of Precision Gain, −10% to 60%.]


We again apply the RankBoost algorithm for combining the results of the mixed restriction-based classifiers and the mixed suggestion-based classifiers using topology diffusion. Figure 8.9 compares the results of the context-only diffusion version (in gray) with the topology diffusion version (in white). Since the notches of the two plots do not overlap for a recall under 70%, the benefits of the context-only version over the topology version are significant in that range.

Figure E.6 of Appendix E shows the hypothesis generated by RankBoost when combining the topology versions of the classifiers. We remark that the hypothesis for the general/aggregationLevel suggestion-based classifier increases more regularly when using topology diffusion than when using context-only diffusion. In this case, a constant reduction rate performs better than the reduction rates inferred from the semantic-based analysis of our repository. We believe that this result reveals an irregularity of our repository. Indeed, the general/aggregationLevel attributes of two repository LOs linked together with assessedBy/assesses or summarizedBy/summarizes relations never share the same value. While it is reasonable to think that these attributes may share the same values, repository analysis inferred a maximum reduction rate (0.0) for these relation types and the general/aggregationLevel attribute. This inferred rate is essentially an artifact of that irregularity.

Repository irregularities are due to a lack of heterogeneous data. Therefore, when the repository data are too homogeneous, a fixed reduction rate should be used instead of the repository analysis results.
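A possible way to combine repository analysis with a fixed fallback rate is sketched below. This is an assumption-laden illustration, not the analysis procedure of the thesis: it estimates, for one attribute, the share of linked LO pairs holding exactly the same value per relation type, and it uses a minimum number of comparable pairs as a crude proxy for "sufficiently heterogeneous data". The function name, the `min_pairs` threshold, and the equality test are all hypothetical.

```python
from collections import defaultdict

FIXED_RATE = 0.7  # fixed rate used for topology diffusion


def estimate_rates(links, attribute, min_pairs=5):
    """Estimate, per relation type, how often linked LOs share a value.

    links: iterable of (lom_a, lom_b, relation_type), where lom_a and lom_b
           are dicts mapping LOM attribute names to values.
    Relation types observed on fewer than `min_pairs` comparable pairs fall
    back to FIXED_RATE (an assumed criterion, see the lead-in).
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for lom_a, lom_b, relation in links:
        if attribute in lom_a and attribute in lom_b:
            total[relation] += 1
            if lom_a[attribute] == lom_b[attribute]:
                same[relation] += 1
    return {
        relation: same[relation] / total[relation]
        if total[relation] >= min_pairs
        else FIXED_RATE
        for relation in total
    }
```

For free-text attributes, a similarity measure would probably replace the strict equality test used here.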

8.2.6 Conclusion

We have described how graph semantics can be used as an additional source of evidence when searching in LO repositories. We proposed a method in which a query is defined not only as a classical keyword set, but also according to the location of the expected LO within the current lesson graph. The former is processed by a standard IR system (in our case Lucene) (Section 8.2.1). The latter information is processed differently, by comparing the context of the query node with the context of each repository LO (Section 8.2.2).

Experiments show that, when comparing the repository LO context with the query context, not only the graph structure but also its semantics are useful. Moreover, we should not consider the metadata values of the LO itself for the computation of its context, thereby making it much more comparable to the query context. Our first experiments have shown that the approach is sound. Nevertheless, larger-scale experiments with bigger and more heterogeneous repositories are needed in order to show that our approach is scalable.

This study has shown that metadata can be useful for retrieving LOs even if the user does not specify them in the query. In our case study, educational metadata play an important role in LO retrieval although they were not specified in the keyword query. However, our method uses the metadata of the authored-lesson LOs for enhancing LO retrieval. Consequently, it requires the user to generate metadata for the authored-lesson LOs. Nevertheless, the improved precision of the query results rewards this effort.


Figure 8.10: LessonMapper2 deployment model including search module. [Diagram components: Search Module (RankBoost + Lucene); LOR (LOMs + LOs); Configuration (LOM Profile + Suggestion Proba + Restriction Rules); Admin (LM2); Teachers (LM2, LM2).]

Our proposals for supporting lesson design only require access to the configuration of the restriction rules. This requirement can be met with the deployment diagram described in Section 7.3. Nevertheless, this deployment diagram needs to be extended in order to integrate the LO retrieval system proposed in this chapter.

This extension consists of a search module managing both the Lucene indexing of repository LOM instances and our RankBoost-based approach. The search module uses information coming from the rule configuration and from the LO repository. Figure 8.10 depicts the deployment of the system including the search extension.

Lucene acts as an alternative indexing process for the LO repository. As detailed in this section, queries made in LessonMapper2 consist of a query node and a set of keywords. When querying the search module, LessonMapper2 sends the current lesson graph in terms of metadata, the identifier of the query node, and the keyword set. Since only metadata are used in our search method, LOs are not communicated to the search module. A ranked result list is then returned to the authoring tool. This list is explored by the user. If she chooses one LO to reuse in her own lesson, then LessonMapper2 sends the reference of her choice to the search module. Next, queries documented with their preferred results are used in order to train our RankBoost-based approach. RankBoost training can be scheduled, e.g., after a certain number of queries have been processed by the system, or during low-access periods. Furthermore, the Lucene index is updated each time a new LO is inserted or updated. This feature is implemented as an extension of the eXist DB available at (LessonMapper2, 2007).
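The exchange between the authoring tool and the search module can be summarized by the following sketch. The class and field names are hypothetical and do not reflect the actual LessonMapper2 or eXist extension code; the sketch only mirrors the information flow described above (lesson graph metadata, query node identifier, keyword set, and the reference of the LO finally chosen) and a count-based training schedule.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """What the authoring tool sends: metadata only, no LO content."""
    lesson_metadata: dict  # node id -> dict of LOM attribute values
    query_node: str        # identifier of the node to fill
    keywords: list         # keyword set processed by the IR system


@dataclass
class TrainingLog:
    """Stores queries with their chosen result and schedules retraining."""
    retrain_every: int = 3  # assumed schedule: retrain after N choices
    examples: list = field(default_factory=list)
    pending: int = 0

    def record_choice(self, query, chosen_lo):
        """Store the pair; return True when a training run is due."""
        self.examples.append((query, chosen_lo))
        self.pending += 1
        if self.pending >= self.retrain_every:
            self.pending = 0
            return True
        return False
```

A schedule based on low-access periods, the other option mentioned above, would replace the counter with a time-based trigger.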

8.4 Conclusion

This chapter has shown that it is possible to reward lesson authors for generating LOM, and thus answers RQ3 positively.

First, generating valid LOM during lesson authoring can facilitate lesson design: Lesson design can be analyzed using a visual representation of the educational characteristics of the lesson LOs. Furthermore, processing the metadata of the lesson LOs can facilitate the automatic detection of lesson design mistakes. Second, generating metadata for the authored-lesson LOs can enhance the retrieval of additional LOs during lesson authoring compared with classical information retrieval systems. It has been shown that both general and educational metadata participate in this improvement. Rewards like those presented in this chapter are necessary in order to motivate lesson authors to generate LOM.

This thesis has now reached its aim: to answer the three research questions stated in Chapter 4. The next chapter discusses the contributions and perspectives brought by the research presented in this thesis.

Chapter 9

Contributions and Perspectives

This thesis made some practical proposals in order to answer the following three research questions. RQ1: How to seamlessly integrate hybrid LOM generation and hybrid LOM validation into a lesson authoring tool? RQ2: How can processes dealing with LOM semantics cope with incomplete, missing, or incorrect metadata? RQ3: How can lesson authors benefit from the LOM they generate? These proposals showed that it is possible to turn every step of the vicious circle in which LOM usage currently stands into a positive situation (Chapter 4). For that reason, this thesis can state that it is technically possible to improve LOM usage during lesson authoring.

In order to support this study, LessonMapper2, a Java-based open-source lesson authoring tool, has been developed. In LessonMapper2, lessons are authored as hierarchical graphs of LOs characterized with LOM (Section 5.1). A zoomable user interface facilitates navigation within nested LO graphs (Section 5.4). LessonMapper2 is based on a generic framework for processing metadata organized as graphs, so that it can be used with metadata other than IEEE LOM (Section 6.6). Besides, a basic repository implementation was introduced for integrating seamless LO sharing and LO retrieval into LessonMapper2. A repository extension was also implemented in order to deploy the proposals of this thesis to a teacher community. LessonMapper2, the repository configuration, and the repository extension are available at (LessonMapper2, 2007). These tools were used as a platform in order to validate the feasibility of the proposals of this thesis.

The remainder of this chapter discusses the contributions of this thesis, their limitations, and future work for each of the research questions stated above.


9.1 LOM Generation and Validation Integration

9.1.1 Contributions

Regarding the integration of LOM generation and validation, this thesis made the following contributions:

LOM instantiation interface. This thesis proposed to include a new interface for supporting manual LOM instantiation while authoring lessons based on LO graphs. This interface permits the simultaneous editing of the same LOM attribute for various LOs. This feature enables the instantiation of subjective LOM attributes in comparison with the attribute values of other LOs of the lesson (Section 5.5.1).

Hybrid LOM generation. It also proposed to assist LOM generation with a new method providing weighted probable values and identifying inconsistent values. This method has the advantage of being applicable to subjective attributes (e.g., educational attributes) for which automatic generation is generally impossible. Our LOM generation interface manages three kinds of information for supporting LOM instantiation: the very probable values, the probable values, and the forbidden values. Therefore, it can integrate not only our generation method, but also the existing automatic generation methods (Section 7.1).

Hybrid LOM validation. This system also serves LOM validation: Values identified as inconsistent receive an “invalid” flag and missing values receive an “undefined” flag. This information is used by a system tracing LOM validity states at the lesson level so that lesson authors can constantly see which metadata is missing or invalid (Section 7.2).
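The validity flags mentioned in the last contribution can be illustrated as follows. This is a minimal sketch under stated assumptions, not the LessonMapper2 implementation: only the “invalid” and “undefined” flags come from the text, while the “valid” state, the function names, and the dictionary-based data layout are hypothetical.

```python
def validity_state(lom, attribute, forbidden_values):
    """Return "undefined", "invalid", or "valid" for one LOM attribute.

    lom: dict attribute name -> value for one LO
    forbidden_values: dict attribute name -> set of forbidden values,
                      e.g. as produced by restriction rules
    """
    value = lom.get(attribute)
    if value is None:
        return "undefined"
    if value in forbidden_values.get(attribute, set()):
        return "invalid"
    return "valid"


def lesson_validity(lesson, attributes, forbidden_by_lo):
    """Lesson-level trace: for each LO, the attributes that need attention."""
    report = {}
    for lo_id, lom in lesson.items():
        forbidden = forbidden_by_lo.get(lo_id, {})
        states = {a: validity_state(lom, a, forbidden) for a in attributes}
        report[lo_id] = {a: s for a, s in states.items() if s != "valid"}
    return report
```

The returned report lists, per LO, only the attributes flagged as missing or inconsistent, which is the kind of overview a lesson author would consult.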

This work proposed methods to seamlessly integrate hybrid LOM generation and validation during lesson authoring. It has been published in (Motelet et al., 2007b). Nevertheless, there are still various limitations to tackle, as described in the next section.

9.1.2 Limitations and Future Work

This work should be continued on the following topics:

Evaluating the impact of the “generation in comparison” feature of the LOM editor interface. Future work is needed to analyze the benefits and drawbacks of instantiating LOM attributes in comparison. Usability studies of this approach as well as quality measures of the metadata produced with this method should be performed. Metrics like those proposed in (Ochoa and Duval, 2006a) may support this work.

Evaluating the impact of the suggestions on the LOM instantiation process. Further studies are required in order to measure the impact of using suggestions for instantiating LOM. E.g., when using the suggestion system, the number of LOM values instantiated with the suggestions should be compared to the number of LOM values built by hand. Moreover, the quality of LOM instantiated with the suggestions should be compared with the quality of manually generated LOM.

Evaluating the cost of authoring a lesson with rhetoric and semantic relations. The approach proposed by this thesis assumes that lessons are based on LO graphs with semantic and rhetoric relation types. This requirement involves additional work for lesson authors: They need to identify the relations between the LOs of the authored lesson. Nevertheless, defining relations between the LOs of a lesson may help the teacher to clarify her aims (Section 5.6.3). This benefit was also observed when designing lessons with paper-based concept maps (Martin, 1994). A study is needed on the trade-off between the cost of defining relations among LOs and the benefits they bring to lesson design.

9.2 LOM Processing

9.2.1 Contributions

This thesis presented a conceptual model and its implementation for processing LOM (Chapter 6).

Coping with missing, incomplete, or incorrect metadata. This model formally defines the diffusion of LOM-based processes along the edges of a lesson graph. This diffusion method prevents missing and incomplete metadata from blocking these processes. Moreover, it reduces the impact of incorrect metadata values on the final result of processing LOM (Section 6.5). Two implementations of this model are proposed: one for generating weighted probable LOM values and one for generating restrictions about the scope of LOM values. In the first case, processing LOM is configured by a set of probabilities that can be deduced from the analysis of an existing repository of LO graphs. In our case study, this probabilistic method provided more valuable information than a fuzzy-logic-based approach when the analyzed repository contained sufficient data for calculating the probabilities (Section 8.2.5.4). In the second case, rules are defined in a domain-specific language introduced by this thesis.


Adapting LOM processing rules. LOM processing deals with lesson semantics. Therefore, it is strongly related to the teaching style of the teacher community using it. For that reason, LOM processing needs to be adaptable in order to suit this teaching style. Since our model separates the diffusion process from the rule definition, rules can be tailored without having to modify the diffusion process (Section 6.2.2).

This model provides a practical solution to cope with missing, incomplete, and incorrect metadata values. It has been published in (Motelet et al., 2007c). The next section describes the future work necessary to complete this research.


9.2.2 Limitations and Future Work

This work on LOM processing requires further effort on the following topics:

Supporting restriction rule customization. Restriction rule customization implies that users agree on the characteristics of an inconsistent lesson graph. In our case study, this process was done iteratively by a single person synthesizing the requirements and generating the rules. Nevertheless, this strategy requires designating a person responsible for administrating the rule list. Further work should try to support this process in order to avoid such a centralized task. E.g., the analysis of a corpus may give interesting probabilistic information for automatically generating restriction rules. Instructional patterns about best and worst instructional practices may also support this task. Finally, feedback from the users could be automatically taken into account. E.g., the lesson author may not agree with the system's detection of an inconsistency between an inferred value restriction and some instantiated metadata values. This disagreement means that the restriction rules concerned do not suit the author's teaching style. Thus, this conflict should be reported back to the configuration server. If the same conflict occurs with various teachers, then the rule could be canceled.

Automatically generating LO graphs from existing LO sets. Since our model is based on the LO graph characteristics, further work is needed in order to transform existing LO sets into lesson graphs. E.g., it may be investigated which relation types and LOM extensions are necessary in order to transform an IMS LD-based or SCORM-based organization into a lesson graph. Nevertheless, the lesson graph generated from an IMS LD or SCORM model will have different semantics than the lesson graphs presented in this thesis. In particular, this lesson graph will contain few rhetorical relations since these popular models contain little rhetorical information. Automating the characterization of rhetorical relations between LOs based on content analysis and LOM characteristics, as done in (Farrell et al., 2004; Engelhardt et al., 2006), should also be studied.


Applying the model to other metadata types. The proposed metadata processing model can be used with any type of metadata for which a graph can be defined. Future work should focus on new model applications considering metadata different from LOM.
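As an illustration of the user-feedback idea raised above for restriction rule customization, the sketch below counts how many distinct teachers reported a conflict with a given rule and signals when the rule becomes a candidate for cancellation. This mechanism is proposed future work, so everything here is hypothetical: the class name, the identifiers, and in particular the threshold of three teachers.

```python
from collections import defaultdict


class RuleFeedback:
    """Collects conflict reports sent back to the configuration server."""

    def __init__(self, cancel_threshold=3):
        # Assumed policy: a rule is questioned once this many distinct
        # teachers have disagreed with it.
        self.cancel_threshold = cancel_threshold
        self._teachers_by_rule = defaultdict(set)

    def report_conflict(self, rule_id, teacher_id):
        """Record a disagreement; return True if the rule could be canceled."""
        self._teachers_by_rule[rule_id].add(teacher_id)
        return len(self._teachers_by_rule[rule_id]) >= self.cancel_threshold
```

Counting distinct teachers rather than raw reports avoids canceling a rule because of a single author who repeatedly disagrees with it.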

9.3 LOM Benefits

9.3.1 Contributions

This thesis made some practical proposals for (1) using LOM in order to support the lesson design process, and (2) using LOM in order to enhance the LO retrieval process.

Using LOM for facilitating lesson design. In LessonMapper2, the educational characteristics of the LOs of a lesson are visually rendered. This feature gives the lesson author an insight into the educational organization of her lesson. Moreover, our restriction mechanism makes it possible to automatically detect some potential lesson design mistakes during lesson authoring (Section 8.1).

Enhancing LO retrieval with LOM. We propose to use the LOM characteristics of the lesson graph structure as an additional source of evidence when searching in LO repositories during lesson authoring. In this approach, a query is defined not only as a classical keyword set, but also according to the location of the expected LO within the current lesson graph. The former is processed by a standard IR system (in our case Lucene) (Section 8.2.1). The latter information is processed differently, by comparing the context of the query node with the context of each repository LO (Section 8.2.2). Information from both sources is then combined with a machine learning system (in our case RankBoost) in order to enhance the LO retrieval precision and recall. This system trains with the queries performed by the community members using LessonMapper2. Preliminary experiments show that the approach is sound (Section 8.2). They also reveal that educational attributes were necessary for enhancing LO retrieval in the context of our case study. This method and its results have been published in (Motelet et al., 2007d).

The above facts confirmed that lesson authors can benefit from the LOM they generate. Nevertheless, these proposals also require further work, as described in the next section.



9.3.2 Limitations and Future Work

The research directions proposed below should be considered in order to enhance LOM benefits for lesson authors.

Measuring the benefits of instantiating educational LOM attributes during lesson authoring. Instantiating the educational characteristics of lesson LOs may help lesson authors to state their pedagogical strategy, but further work is needed to measure this effect. E.g., feedback from users designing lessons with and without instantiating the educational characteristics of their LOs may be analyzed. In particular, it should be determined which LOM attributes are the most helpful for lesson authors.

Measuring the benefits of visually characterizing LOs. Further studies are needed in order to evaluate the visual characterization feature of LessonMapper2. This analysis should also try to give insight into the average number of LOM characteristics to display per LO. E.g., various configurations may be tried in order to establish when this feature helps the users and when it entails cognitive overload. User control over the visualization feature may also be considered since some LO characteristics may be more relevant in some settings than in others.

Evaluating the training delay of the LO retrieval system. Before becoming beneficial, our system needs to train with a certain critical mass of LOs. This critical mass needs to be estimated. For now, our case study can be considered as an upper limit for this critical mass. It consists of 170 LOs of a specific domain linked with about 800 typed relations. This corpus deals with a 10-hour introductory course on Java programming based on material from 3 teachers.

Testing the scalability of the LO retrieval system. This thesis does not use a well-populated LO repository to test our proposals because there are few explicit relations linking the LOs contained in the available repositories. This lack of semantics negates the benefits of our system. We suggest two possibilities in order to scale the results. First, the system may be deployed and used in a teacher community until a larger and more heterogeneous repository than the case study repository can be built. Second, popular structuring frameworks may be transformed into LO graphs, as proposed in Section 9.2.2.

9.4 Conclusion

This thesis has shown that improving LOM usage during lesson authoring is technically possible. Nevertheless, other, non-technical factors also need to be studied in order to improve LOM usage during lesson authoring. These additional factors include organizational issues (e.g., lesson author time management, or lesson author skills in information technology (Pajo and Wallace, 2001)) and sociological issues (e.g., negative attitudes and beliefs of lesson authors, innovation acceptance (Rogers, 2003)).

This thesis has also shown that LOM can be useful not only for understanding the LOs of other teachers, but also for serving LOM generation, LOM validation, lesson design, and LO retrieval during lesson authoring. These results include the educational metadata. While educational metadata are generally considered useless because of their context dependency, they can be useful when used in the lesson context, i.e., in comparison with the metadata of other LOs. This finding confirms the current research trend stating that usage context is valuable information for retrieving and using LOs (Brooks and McCalla, 2006; Jovanovic et al., 2006; Brooks et al., 2006; Najjar et al., 2006; Ochoa and Duval, 2006b).

Appendix A

Suggestion Probabilities


Figure A.1: Probabilities for a certain LOM attribute to have the same value in two related LOs. The nature of the relation linking the LOs is defined on the horizontal axis.

[The figure contains one panel per LOM attribute: general/title, general/keyword, general/description, general/language, general/coverage, general/structure, general/aggregationlevel, educational/interactivitytype, educational/learningresourcetype, educational/interactivitylevel, educational/semanticdensity, educational/intendedenduserrole, educational/context, educational/typicalagerange, educational/difficulty, educational/description, educational/language. Each panel plots Similarity (0.0 to 1.0) for the relation types notdefined, introducesto, introducedby, assessedby, assesses, supportedby, supports, abstractedby, examplifiedby, comparablewith, isbackgroundfor, hasbackgroundin, summarizedby, summarizes, ispartof, haspart, isversionof, explainedby, explains, resolvedby, resolves, refutedby, refutes.]


[Figure panels: general/title, general/keyword, general/description, general/language, general/coverage, general/structure, educational/interactivitytype, educational/interactivitylevel, educational/intendedenduserrole, educational/context, educational/typicalagerange, educational/difficulty. Each panel plots Similarity (0.0 to 1.0) for the relation types notdefined, introducesto, introducedby, assessedby, assesses, supportedby, supports, abstractedby, examplifiedby, comparablewith, isbackgroundfor, hasbackgroundin, summarizedby, summarizes, ispartof, haspart, isversionof, explainedby, explains, resolvedby, resolves, refutedby, refutes.]



[Figure panels: general/title, general/keyword, general/description, general/language, general/coverage, general/structure, educational/interactivitytype, educational/interactivitylevel, educational/intendedenduserrole, educational/context, educational/typicalagerange, educational/difficulty. Each panel plots Similarity (0.0 to 1.0) for the relation types notdefined, introducesto, introducedby, assessedby, assesses, supportedby, supports, abstractedby, examplifiedby, comparablewith, isbackgroundfor, hasbackgroundin, summarizedby, summarizes, ispartof, haspart, isversionof, explainedby, explains, resolvedby, resolves, refutedby, refutes.]


Appendix B

Restriction Rules


[Figure B.1 (table): pairs the LOM attributes general (language, coverage, aggregationLevel), educational (interactivityType, semanticDensity, typicalLearningTime, difficulty, intendedUserRole, learningResourceType), and rights (cost, copyrightAndOtherRestrictions) with the relation types isPartOf, hasPart, summarizes, summarizedBy, introducesTo, introducedBy, assessedBy, assesses, examplifiedBy, abstractedBy, explains, explainedBy, isBackgroundFor, hasBackgroundIn, supports.]

Figure B.1: Restriction rules for LOM attribute values based on the values of related LOs. These rules depend on the relation type relating the LOs.

Appendix C

Restriction Rule Definition


R → Csimple Osimple | CONTAINS Ocontains | CONTAINED Ocontained

Csimple → INFEQ | SUPEQ

Osimple → MAX | MIN | Vsimple

Ocontains → UNION | Vset

Ocontained → INTERSECTION | Vset

Vset → Vsimple (, Vset)∗

Vsimple → V

Figure C.1: Restriction rule context-free grammar
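The grammar of Figure C.1 is small enough to check by hand, but a recognizer makes its three alternatives explicit. The following is a minimal sketch in Python, not the thesis implementation. The names `parse_rule` and `parse_values` are hypothetical, and several choices are assumptions that the grammar does not state: tokens are separated by whitespace, a value list may be wrapped in quotes, and the grammar keywords may not be used as values.

```python
# Sketch of a recognizer for the restriction rule grammar of Figure C.1.
# Assumptions (not stated by the grammar): whitespace-separated tokens,
# optionally quoted value lists, and keywords reserved (not usable as values).

KEYWORDS = {"INFEQ", "SUPEQ", "CONTAINS", "CONTAINED",
            "MAX", "MIN", "UNION", "INTERSECTION"}
SIMPLE_CONSTRAINTS = {"INFEQ", "SUPEQ"}        # Csimple
SIMPLE_OPERANDS = {"MAX", "MIN"}               # keyword forms of Osimple
SET_OPERANDS = {"CONTAINS": "UNION",           # keyword form of Ocontains
                "CONTAINED": "INTERSECTION"}   # keyword form of Ocontained


def parse_values(text):
    """Parse Vset: one or more comma-separated values, optionally quoted."""
    text = text.strip().strip("'\"\u2019")
    values = [v.strip() for v in text.split(",")]
    if not all(values) or any(v in KEYWORDS for v in values):
        raise ValueError(f"invalid value list: {text!r}")
    return values


def parse_rule(rule):
    """Return (constraint, operand) for a rule R, or raise ValueError."""
    parts = rule.split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"expected '<constraint> <operand>': {rule!r}")
    constraint, operand = parts[0], parts[1].strip()
    if constraint in SIMPLE_CONSTRAINTS:
        # R -> Csimple Osimple, with Osimple -> MAX | MIN | Vsimple
        if operand in SIMPLE_OPERANDS:
            return constraint, operand
        values = parse_values(operand)
        if len(values) != 1:
            raise ValueError(f"{constraint} takes a single value: {rule!r}")
        return constraint, values[0]
    if constraint in SET_OPERANDS:
        # R -> CONTAINS Ocontains | CONTAINED Ocontained
        if operand == SET_OPERANDS[constraint]:
            return constraint, operand
        return constraint, parse_values(operand)
    raise ValueError(f"unknown constraint: {constraint!r}")


if __name__ == "__main__":
    for text in ("INFEQ MIN", "CONTAINS UNION", "CONTAINED 'exercise,slide'"):
        print(text, "->", parse_rule(text))
```

The sketch only recognizes well-formed rules. How a rule is evaluated against the values of related LOs is left out, since that depends on the attribute vocabulary.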

<rules>
  <restriction attribute="general/coverage" relation="summarizedBy">CONTAINS UNION</restriction>
  <restriction attribute="general/coverage" relation="ispartof">CONTAINED INTERSECTION</restriction>
  <restriction attribute="general/aggregationlevel" relation="introducesTo">INFEQ MIN</restriction>
  <restriction attribute="general/aggregationlevel" relation="assessedby">SUPEQ MAX</restriction>
  <restriction attribute="educational/learningresourcetype" relation="abstractedby">CONTAINED 'exercise,simulation,diagram,figure,graph,slide,table,narrativeText,experiment,problemStatement,selfAssesment'</restriction>
</rules>

Figure C.2: XML configuration of some restriction rule examples
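A configuration in the style of Figure C.2 can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` and is an illustration only: the function name `load_rules` and the embedded `CONFIG` sample are hypothetical, and lower-casing the relation name is an assumption made because the figure mixes spellings such as `summarizedBy` and `ispartof`.

```python
import xml.etree.ElementTree as ET

# Two rules copied from Figure C.2, embedded so the sketch is self-contained.
CONFIG = """<rules>
  <restriction attribute="general/coverage" relation="summarizedBy">CONTAINS UNION</restriction>
  <restriction attribute="general/aggregationlevel" relation="introducesTo">INFEQ MIN</restriction>
</rules>"""


def load_rules(xml_text):
    """Map (attribute, relation) to the raw rule string of each <restriction>."""
    rules = {}
    for node in ET.fromstring(xml_text).iter("restriction"):
        attribute = node.get("attribute")
        # Assumption: relation names are matched case-insensitively.
        relation = (node.get("relation") or "").lower()
        # Collapse line breaks and repeated spaces inside the rule text.
        rules[(attribute, relation)] = " ".join((node.text or "").split())
    return rules


if __name__ == "__main__":
    for key, rule in load_rules(CONFIG).items():
        print(key, "->", rule)
```

Keying the rules by attribute and relation mirrors the two XML attributes of each `restriction` element, so a lookup for a given LOM attribute and relation type is a single dictionary access.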

Appendix D

Experiment Data


Object instantiation. Method calls.

- constructor

- new

- method call

- .

- constructor with arguments

- method overloading

- delegation

- object design

- problem separation

Available relations

R: (Activity a) R (Activity b)

introduces: activity a introduces b

assessed by: activity b makes it possible to assess the knowledge acquired in a

supported by: activity b serves as support for a

abstracted by: activity b abstracts the content of a

exemplified by: activity b exemplifies the abstract content described in a

comparable with: it is didactically interesting to compare the content of a with the content of b

background for: activity a has content required to do b

summarized by: activity b summarizes the content of a

explained by: activity b explains the content of a (tightly coupled; e.g., a is an example and b is a text explaining concepts but referring specifically to the example)

resolved by: activity b solves activity a (tightly coupled; typically a is the definition of a problem and b its solution)

refuted by: activity b refutes activity a (tightly coupled; typically b is a critique or a counterexample of a)

Figure D.1: Help sheet for interviewed people. It contains (1) the topics to be taught and (2) the description of the relation types for building LO graphs.


Figure D.2: First LO graph that the interviewed teachers had to complete.


Figure D.3: Second LO graph (containing the first one) that the interviewed teachers had to complete.


Figure D.4: Third LO graph (containing the second one) that the interviewed teachers had to complete.


Figure D.5: Fourth LO graph (containing the third one) that the interviewed teachers had to complete.


Lesson: 1 LO

1- Describe the content and the educational purpose of an educational material that you want to add to this lesson:

Write the Google-style query you would use to find such material:

Position the material you want to add in the graph:

2- Describe the content and the educational purpose of another educational material that you want to add to this lesson:

Write the Google-style query you would use to find such material:

Position the material you want to add in the graph:

Figure D.6: Interview sheet for each LO graph. The answers were built directly with LessonMapper2 or told to the interviewer.

Appendix E

RankBoost Results


[Figure E.1 panels: Alpha (0.0 to 1.2) plotted against Rank (−150 to 0). Labelled panel: Sug−general/aggregationlevel*Lucene.]

Figure E.1: Hypothesis generated by the context-only diffusion version of RankBoost when training on the test cases based on the LO graphs of 5, 9, and 13 elements.


[Figure panels: Alpha (0.0 to 1.2) plotted against Rank (−150 to 0). Labelled panel: Sug−educational/language*Lucene.]




[Figure panels: Alpha (0.0 to 1.2) plotted against Rank (−150 to 0). Labelled panel: Sug−educational/typicalagerange*Lucene.]




[Figure panels: Alpha (0.0 to 1.2) plotted against Rank (−150 to 0).]




[Figure E.5 panels: Alpha (0.0 to 3.0) plotted against Rank (−150 to 0). Labelled panels: Res−educational/context*Lucene, Sug−general/structure*Lucene.]

Figure E.5: Cumulated hypothesis generated by the full-diffusion version of RankBoost when training with all combinations of 3 situations taken from the four situations of the test cases.


[Figure E.6 panels: Alpha (0.0 to 2.0) plotted against Rank (−150 to 0).]

Figure E.6: Cumulated hypothesis generated by the topology version of RankBoost when training with all combinations of 3 situations taken from the four situations of the test cases.

Publications of this Thesis

Book Chapter

Motelet, O., Baloian, N., and Pino, J. A. (2007a). Learning-object metadata and automatic processes: Issues and perspectives. In Harman, K. and Koohang, A., editors, Learning Object: Standards, Metadata, Repositories and LCMS, Learning Object Book Series, pages 185–220. Informing Science Press.

International Journal

Motelet, O., Baloian, N. A., and Pino, J. A. (2007b). Hybrid system for generating learning object metadata. Journal of Computers, 2(3).

International Conferences

Motelet, O., Piwowarski, B., Dupret, G., Pino, J. A., and Baloian, N. (2007d). Enhancing educational-material retrieval using authored-lesson metadata. In Symposium on String Processing and Information Retrieval (SPIRE07), Lecture Notes in Computer Science. Springer.

Motelet, O., Baloian, N. A., Piwowarski, B., and Pino, J. A. (2007c). Taking advantage of the semantics of a lesson graph based on learning objects. In The 13th International Conference on Artificial Intelligence in Education (AIED 2007). IOS Press.

Motelet, O. and Baloian, N. A. (2006). Hybrid system for generating learning object metadata. In Kinshuk and Koper, R., editors, Proceedings of the 6th IEEE International Conference on Advanced Learning Technologies (ICALT 2006), pages 563–567. IEEE Computer Society.

Motelet, O. and Baloian, N. A. (2005). Taking advantage of lom semantics for supporting lesson authoring. In Meersman, R. et al., editors, On the Move to Meaningful Internet Systems 2005: OTM 2005 Workshops, Workshop on Ontologies Semantics and E-Learning (WOSE 2005), volume 3762 of Lecture Notes in Computer Science, pages 1159–1168. Springer.

Doctoral Consortium

Motelet, O. (2005). Relation-based heuristic diffusion framework for lom generation. In Artificial Intelligence in Education (AIED 2005) - Young Researcher Track.


References

Aggarwal, C. C., Wolf, J. L., Wu, K.-L., and Yu, P. S. (1999). Horting hatches an egg: a newgraph-theoretic approach to collaborative filtering. In Fifth ACM SIGKDD internationalconference on Knowledge discovery and data mining (KDD ’99), pages 201–212, New York,NY, USA. ACM Press.

Alexander, C., Ishikawa, S., and Silverstein, M. (1977). A Pattern Language: Towns, Buildings,Construction (Center for Environmental Structure Series). Oxford University Press, NewYork.

Allert, H. (2004). Coherent social systems for learning: An approach for contextualized andcommunity-centred metadata. Journal of Interactive Media in Education - Special Issueon the Educational Semantic Web, 2.

Ariadne (2007). Ariadne fundation for the european knowledge pool. http://www.ariadne-eu.org.

Baeza-Yates, R. and Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison Wesley.Baloian, N., Pino, J. A., and Hoppe, U. (1999). Intelligent navigation support for lecturing

in an electronic classroom. In Lajoie, S. and Vivet, M., editors, Artificial Intelligence inEducation, pages 606–610. IOS Press.

Baloian, N. A., Galdames, P., Collazos, C., and Guerrero, L. (2004a). A model for a collaborativerecommender system for multimedia learning material. In de Vreede, G.-J., Guerrero,L. A., and Raventós, G. M., editors, CRIWG, volume 3198 of Lecture Notes in ComputerScience, pages 281–288. Springer.

Baloian, N. A., Hoppe, H. U., and Pino, J. A. (2000). A teaching/learning approach to cscl. In33rd Hawaii International Conference on System Sciences (HICSS).

Baloian, N. A., Luther, W., Hoppe, U., and Motelet, O. (2004b). Implementing teachingstrategies in the classroom. In World Conference on Educational Multimedia, Hypermedia& Telecommunications (EdMedia 2004). Association for the Advancement of Computingin Education.

Barrit, C., Lewis, D., and Wieseler, W. (1999). Cisco systems reusable information objectstrategy version 3.0. Technical report, http://www.cisco.com.

Barton, J., Currier, S., and Hey, J. M. N. (2003). Building quality assurance into metadatacreation: an analysis based on the learning objects and e-prints communities of practice.In DublinCore International Conference.

Bederson, B. B., Grosjean, J., and Meyer, J. (2004). Toolkit design for interactive structuredgraphics. IEEE Transactions on Software Engineering, 30(8):535–546.

Berners-Lee, T. (1989). Information management: A proposal. Technical report, CERN.Blackboard (2007). Blackboard learning management system. http://www.blackboard.

com.

187

http://www.ariadne-eu.org

http://www.ariadne-eu.org

http://www.cisco.com

http://www.blackboard.com

http://www.blackboard.com

188 References

Bra, P. D., Aroyo, L., and Chepegin, V. I. (2004). The next big thing: Adaptive web-basedsystems. Journal of Digital Information, 5(1).

Brase, J. (2005). Usage of Metadata. PhD thesis, University of Hannover.Breese, J., Heckerman, D., and Kadie, C. (1998). Empirical analysis of predictive algorithms for

collaborative filtering. In 14th Annual Conference on Uncertainty in Artificial Intelligence(UAI-98), pages 43–52, San Francisco, CA. Morgan Kaufmann.

Brooks, C., Bateman, S., Liu, W., McCalla, G., Greer, J., GaÅaevic, D., Eap, T., Richards, G.,Hammouda, K., Shehata, S., Kamel, M., Karray, F., and Jovanovic, J. (2006). Issues anddirections with educational metadata. In 3rd Annual Scientific Conference of the LORNETResearch Network (I2LOR 2006).

Brooks, C. and McCalla, G. (2006). Towards flexible learning object metadata. InternationalJournal of Continuing Engineering and Lifelong Learning, 16(1):50–63.

Brusilovsky, P. (2004). Adaptive educational hypermedia. In 4th Hellenic Conference onInformation and Communication Technologies in Education, pages 19–33, Athens, Greece.

Bruvilosky, P. (2003). Developing adaptive educational hypermedia systems: From designmodels to authoring tools. In Murray, T., Blessing, S., and Ainsworth, S., editors, AuthoringTools for Advanced Technology Learning Environment, pages 377–409. Dordrecht: KluwerAcademic Publishers.

Cancore (2007). http://www.cancore.ca.Careo (2007). Campus alberta repository of educational objects. http://careo.ucalgary.

ca.Carvalho, M., Hewett, R., and Canas, A. (2001). Enhancing web searches from concept

map-based knowledge models. In Fifth Multi-Conference on Systems, Cybernetics andInformatics (SCI’01). AAAI Press.

Cedma (2007). Computer education management association. http://www.cedma.org.Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983). Graphical Methods for

Data Analysis. Wadsworth & Brooks/Cole.Conole, G. and Fill, K. (2005). A learning design toolkit to create pedagogically effective

learning activities. Journal of Interactive Media in Education, 8.Cormen, T. H., Stein, C., Rivest, R. L., and Leiserson, C. E. (2001). Introduction to Algorithms.

McGraw-Hill Higher Education.Currier, S., Barton, J., O’Beirne, R., and Ryan, B. (2004). Quality assurance for digital learning

object repositories: issues for the metadata creation process. ALT-J, 12(1):5–20.DC.dot (2007). Metadata generation tool. http://www.ukoln.ac.uk/metadata/dcdot.Dempsey, L. and Heery, R. (1997). Specification for resource description methods. Technical

report, DESIRE Project - Development of a European Service for Information on Researchand Education.

diffuse (2007). Metadata diffusion framework. http://metadatadiffuse.googlecode.com.

Dodds, P. (2001). Advanced distributed learning sharable content object reference modelversion 1.2. the scorm content aggregation model. Technical report, http://www.adlnet.org.

Downes, S. (2004). Ressource profiles. Journal of Interactive Media in Education, SpecialIssue on the Educational Semantic Web., 5(ISSN: 1365-893X).

DublinCore (2007). Metadata initiative. http://www.dublincore.org last visit on 05/2006.

http://www.cancore.ca

http://careo.ucalgary.ca

http://careo.ucalgary.ca

http://www.cedma.org

http://www.ukoln.ac.uk/metadata/dcdot

http://metadatadiffuse.googlecode.com

http://metadatadiffuse.googlecode.com

http://www.adlnet.org


http://www.dublincore.org

References 189

DublinCoreEdu (2007). Dublin core metadata initiative education working group. http://dublincore.org/groups/education.

Duval, E. and Hodgins (20-24 May 2003). A lom research agenda. In WWW Conference.Duval, E. and Hodgins (2004a). Metadata matters. In International Conference on metadata

and Dublin Core specifications (DC 2004).Duval, E. and Hodgins, W. (2004b). Making metadata go away: Hiding everything but the

benefits. In DublinCore International Conference (DC-2004).EdNa (2007). Education network australia. http://www.edna.edu.au/.EML (2007). Educational modelling language. http://eml.ou.nl.Engelhardt, M., Hildebrand, A., Lange, D., and Schmidt, T. C. (2006). Reasoning about

elearning multimedia objects. In Proc. of WWW 2006, Intern. Workshop on Semantic WebAnnotations for Multimedia (SWAMM).

exist (2007). eXist DB. http://exist.sourceforge.org.Farrell, R., Liburd, S. D., and Thomas, J. C. (2004). Dynamic assembly of learning objects. In

World-Wide Web International Conference (WWW 2004).Fill, K., Leung, S., DiBiase, D., and Nelson, A. (2006). Repurposing a learning activity on

academic integrity: the experience of three universities. Journal of Interactive Media inEducation, 1.

Fischer, S. (2001). Course and exercise sequencing using metadata in adaptive hypermedialearning systems. J. Educ. Resour. Comput., 1(1es):5.

Foltz, P. W., Kintsch, W., and Landauer, T. K. (1998). The measurement of textual coherencewith latent semantic analysis. Discourse Processes, 25:285–307.

Freund, Y., Iyer, R. D., Schapire, R. E., and Singer, Y. (2003). An efficient boosting algorithmfor combining preferences. Journal of Machine Learning Research, 4:933–969.

Friesen, N. (2003). Three objections to learning objects. Online education using LearningObjects.

Friesen, N. (2004). International lom survey report. Technical report, ISO/IEC JTC1/SC36.

GEM (2007). Gateway to educational material. http://www.thegateway.org/.

Goldberg, D., Nichols, D., Oki, B. M., and Terry, D. (1992). Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61–70.

Good, L. E. (2003). Zoomable Interfaces for the authoring and delivery of slide presentations. PhD thesis, University of Maryland.

Google (2007). Web search engine. http://www.google.com.

Greenberg, J. (2003). Metadata generation: Processes, people and tools. Bulletin of American Society for Information Science and Technology, 29(2).

Greenberg, J. (2004). Metadata extraction and harvesting: A comparison of two automatic metadata generation applications. Journal of Internet Cataloging: The International Quarterly of Digital Organization, Classification, and Access, 6(4):58–82.

Greenberg, J., Spurgin, K., and Crystal, A. (2006). Functionalities for automatic metadata generation applications: a survey of metadata experts’ opinions. International Journal of Metadata, Semantics and Ontologies, 1(1):3–20.

Greenstone (2007). Greenstone digital library software. http://www.greenstone.org.

Hatala, M. and Richards, G. (2003). Value-added metatagging: Ontology and rule based methods for smarter metadata. In Schroeder, M. and Wagner, G., editors, RuleML, volume 2876 of Lecture Notes in Computer Science, pages 65–80. Springer.

Heath, B. P., McArthur, D. J., McClelland, M. K., and Vetter, R. J. (2005). Metadata lessons from the ilumina digital library. Communications of the ACM, 48(7):68–74.

Herman, I., Melançon, G., and Marshall, M. S. (2000). Graph visualization and navigation in information visualization: A survey. IEEE Transactions on Visualization and Computer Graphics, 6(1):24–43.

Hernández-Leo, D., Asensio-Pérez, J., Dimitriadis, Y., Bote-Lorenzo, M., Jorrín-Abellán, I., and Villasclaras-Fernández, E. (2005). Reusing ims-ld formalized best practices in collaborative learning structuring. Advanced Technology for Learning, 208.

Hiddink, G. (2001). Educational Multimedia Databases. PhD thesis, University of Twente.

Hodgins, W. (2001). The future of learning objects. In Wiley, D., editor, The Instructional Use of Learning Objects. Association for Instructional Technology.

Hodgins, W. and Conner, M. (2000). Everything you ever wanted to know about learning standards and were always afraid to ask. LiNE Zine’s, Fall Issue.

Horn, R. E. (1995). Structured writing as a paradigm. In Romiszowski, A. and Dills, C., editors, Instructional development: State of the art. Englewood Cliffs, NJ: Educational Technology Publications.

Hornbaek, K., Bederson, B. B., and Plaisant, C. (2002). Navigation patterns and usability of zoomable user interfaces with and without an overview. ACM Transactions on Computer-Human Interaction, 9(4):362–389.

IMS (2007). IMS global learning consortium. http://www.imsglobal.org, last visited 05/2006.

Inaba, A. and Mizoguchi, R. (2004). Learning design palette: An ontology-aware authoring system for learning design. In International Conference on Computers in Education (ICCE2004).

Interwoven (2007). Metatagger. http://www.interwoven.com.

JENA (2007). A semantic web framework for java. http://jena.sourceforge.net/.

Jenkins, C., Jackson, M., Burden, P., and Wallis, J. (1999). Automatic rdf metadata generation for resource discovery. Comput. Networks, 31(11-16):1305–1320.

Jonassen, D. and Churchill, D. (2004). Is There a Learning Orientation in Learning Objects? International Journal on E-Learning, 3(2):32–41.

Jonassen, D. H. (1991). Objectivism versus constructivism: Do we need a new philosophical paradigm? Educational Technology Research and Development, 39(3):5–14.

Jovanovic, J., Knight, C., Gasevic, D., and Richards, G. (2006). Learning object context on the semantic web. In Sixth IEEE International Conference on Advanced Learning Technologies (ICALT ’06), pages 669–673, Washington, DC, USA. IEEE Computer Society.

Kabel, S., de Hoog, R., and Wielinga, B. (2003). Consistency in indexing learning objects: an empirical investigation. In ED-MEDIA 2003, Learning Objects 2003 Symposium: Lessons Learned Questions Asked, pages 26–31. Association for the Advancement of Computing in Education.

Känsälä, T. and Hyvönen, E. (2006). A semantic view-based portal utilizing learning object metadata. In 1st Asian Semantic Web Conference (ASWC2006) - Semantic Web Applications and Tools Workshop.

KEA (2007). Automatic keyphrase extraction. http://www.nzdl.org/Kea.

Kim, W. (2005). On metadata management technology: Status and issues. Journal of Object Technology, 4(2):41–47.

Klarity (2007). API. http://www.intology.com.au.

Klerkx, J., Duval, E., and Meire, M. (2004). Using information visualization for accessing learning object repositories. In 8th International Conference on Information Visualisation, IV 2004, 14-16 July 2004, London, UK. IEEE Computer Society.

Koper, R. (2004). Use of the semantic web to solve some basic problems in education: Increase flexible, distributed lifelong learning, decrease teachers’ workload. Journal of Interactive Media in Education - Special Issue on the Educational Semantic Web, 2.

L’Allier, J. (1997). A frame of reference: Netg’s map to its products, their structures and core beliefs. http://www.im.com.tr/framerefer.htm.

LAMS (2007). Learning activity management system. http://www.lamsfoundation.org/.

Lattner, A. D. and Gehrke, J. D. (2004). Applying inductive logic programming and rule relaxation for the generation of metadata. In FGWM’04 Workshop, Annual Meeting of the Special Interest Group on Knowledge Management of the German Society for Computer Science (GI), pages 267–273.

Lave, J. and Wenger, E. (1991). Situated Learning: Legitimate Peripheral Participation (Learning in Doing: Social, Cognitive & Computational Perspectives). Cambridge University Press.

Leake, D., Maguitman, A., Reichherzer, T., Canas, A. J., Carvalho, M., Arguedas, M., and Eskridge, T. (2004). “googling” from a concept map: Towards automatic concept-map-based query formation. In Canas, A. J., Novak, J. D., and Gonzalez, F. M., editors, Concept Maps: Theory, Methodology, Technology - First Int. Conference on Concept Mapping.

LessonMapper2 (2007). Lessonmapper2 web site. http://lm2.eduforge.org.

LOM (2002). IEEE LTSC p1484.12.1 learning object metadata specification final draft. http://ieeeltsc.org/wg12LOM/.

LOMRDF (2003). The rdf binding of lom. http://kmr.nada.kth.se/el/ims/metadata.html.

Lornet (2007). http://www.lornet.org.

Lucene (2007). Full-featured text search engine library in java. http://lucene.apache.org.

Marchiori, M. (1998). The limits of web metadata, and beyond. In Seventh International Conference on World Wide Web (WWW98), pages 1–9, Amsterdam, The Netherlands. Elsevier Science Publishers B. V.

Martin, D. (1994). Concept mapping as an aid to lesson planning: A longitudinal study. Journal of Elementary Science Education, 6(2):11–30.

McCalla, G. (1992). The search for adaptability, flexibility, and individualization: Approaches to curriculum in intelligent tutoring systems. In Jones, M. and Winne, P., editors, Foundations and Frontiers of Adaptive Learning Environments, pages 91–122. Springer.

McCalla, G. (2004). The ecological approach to the design of e-learning environments: Purpose-based capture and use of information about learners. Journal of Interactive Media in Education - Special Issue on the Educational Semantic Web, 2.

McDaniel, E., Roth, B., and Miller, M. (2005). Concept mapping as a tool for curriculum design. In Informing Science IT Education Conference (InSITE2005).

Merlot (2007). Multimedia educational resource for learning and on-line teaching. http://www.melot.org.

Moodle (2007). http://moodle.org.

Motelet, O. (2005). Relation-based heuristic diffusion framework for lom generation. In Artificial Intelligence in Education (AIED 2005) - Young Researcher Track.

Motelet, O. (2007). Experiment data (repository snapshot and test cases). http://reflex.dcc.uchile.cl/lm/lessonMapper2/IRTests.zip.

Motelet, O., Baloian, N., and Pino, J. A. (2007a). Learning-object metadata and automatic processes: Issues and perspectives. In Harman, K. and Koohang, A., editors, Learning Object: Standards, Metadata, Repositories and LCMS, Learning Object Book Series, pages 185–220. Informing Science Press.

Motelet, O. and Baloian, N. A. (2005). Taking advantage of lom semantics for supporting lesson authoring. In Meersman, R. and al., editors, On the Move to Meaningful Internet Systems 2005: OTM 2005 Workshops, Workshop on Ontologies Semantics and E-Learning (WOSE 2005), volume 3762 of Lecture Notes in Computer Science, pages 1159–1168. Springer.

Motelet, O. and Baloian, N. A. (2006). Hybrid system for generating learning object metadata. In Kinshuk and Koper, R., editors, Proceedings of the 6th IEEE International Conference on Advanced Learning Technologies (ICALT 2006), pages 563–567. IEEE Computer Society.

Motelet, O., Baloian, N. A., and Pino, J. A. (2007b). Hybrid system for generating learning object metadata. Journal of Computers, 2(3).

Motelet, O., Baloian, N. A., Piwowarski, B., and Pino, J. A. (2007c). Taking advantage of the semantics of a lesson graph based on learning objects. In The 13th International Conference on Artificial Intelligence in Education (AIED 2007). IOS Press.

Motelet, O., Piwowarski, B., Dupret, G., Pino, J. A., and Baloian, N. (2007d). Enhancing educational-material retrieval using authored-lesson metadata. In Symposium on String Processing and Information Retrieval (SPIRE07), Lecture Notes in Computer Science. Springer.

Murray, T. (1999). Authoring intelligent tutoring systems: an analysis of state of the art. International Journal of Artificial Intelligence in Education, 10:98–129.

Najjar, J. and Duval, E. (2006). Actual use of learning objects and metadata: An empirical analysis. IEEE Technical Committee on Digital Libraries Bulletin (TCDL), 2(2).

Najjar, J., Duval, E., Ternier, S., and Neven, F. (2003). Towards interoperable learning object repositories: the ariadne experience. In Isaias, P. and Karmakar, N., editors, IADIS International Conference WWW/Internet, volume 1, pages 219–226.

Najjar, J., Klerkx, J., Vuorikari, R., and Duval, E. (2005). Finding appropriate learning objects: An empirical evaluation. In World Conference on Educational Multimedia, Hypermedia and Telecommunications ED-MEDIA 2005, pages 1407–1414. Association for the Advancement of Computing in Education.

Najjar, J., Ternier, S., and Duval, E. (2004). User behavior in learning object repositories: An empirical analysis. In World Conference on Educational Multimedia, Hypermedia and Telecommunications (ED-MEDIA 2004), pages 4373–4379. Association for the Advancement of Computing in Education.

Najjar, J., Wolpers, M., and Duval, E. (2006). Attention metadata: Collection and management. In WWW2006 workshop on Logging Traces of Web Activity: The Mechanics of Data Collection.

Neven, F. and Duval, E. (2002). Reusable learning objects: a survey of lom-based repositories. In Tenth ACM international conference on Multimedia (MULTIMEDIA’02), pages 291–294, New York, NY, USA. ACM Press.

Nime (2007). http://nime-glad.nime.ac.jp/en/.

Norman, D. (2004). Learning objects as molecular compounds.

Novak, J. D. (1998). Learning, Creating, and Using Knowledge: Concept Maps as Facilitative Tools in Schools and Corporations. Lawrence Erlbaum, Mahwah, NJ.

Novak, J. D. and Gowin, D. B. (1984). Learning How to Learn. New York: Cambridge University Press.

OAI (2007). Open archive initiative. http://www.openarchives.org.

Ochoa, S. F., Pino, J. A., Baloian, N., and Fuller, D. A. (2003). Icesee: A tool for developing engineering courseware. Computer Applications in Engineering Education, 11(2):53–66.

Ochoa, X., Cardinaels, K., Meire, M., and Duval, E. (2005). Frameworks for the automatic indexation of learning management systems content into learning object repositories. In World Conference on Educational Multimedia, Hypermedia and Telecommunications (ED-MEDIA 2005), pages 1407–1414. Association for the Advancement of Computing in Education.

Ochoa, X. and Duval, E. (2006a). Quality metrics for learning object metadata. In World Conference on Educational Multimedia, Hypermedia & Telecommunications (EdMedia 2006). Association for the Advancement of Computing in Education.

Ochoa, X. and Duval, E. (2006b). Use of contextualized attention metadata for ranking and recommending learning objects. In 1st international workshop on Contextualized attention metadata: collecting, managing and exploiting of rich usage information (CAMA ’06), pages 9–16, New York, NY, USA. ACM Press.

Oliver, M. (2005). Metadata vs. educational culture: roles, power and standardisation. In Land, R. and Bayne, S., editors, Education in Cyberspace, pages 112–138. London: RoutledgeFalmer.

OpenOffice API (2007). Openoffice.org api. http://api.openoffice.org.

Pajo, K. and Wallace, C. (2001). Barriers to the uptake of web-based technology by university teachers. Journal of Distance Education, 16(1).

Patterns, P. (2007). Project. http://www.pedagogicalpatterns.org/.

Pinkwart, N., Jansen, M., Oelinger, M., Korchounova, L., and Hoppe, U. (2004). Partial generation of contextualized metadata in a collaborative modeling environment. In 2nd International Workshop on Applications of Semantic Web Technologies for E-Learning AH 2004, Eindhoven, Netherlands.

Polsani, P. (2003). Use and abuse of reusable learning objects. Journal of Digital Information, 3(4).

Priestley, M. (2001). Dita xml: a reuse by reference architecture for technical documentation. In 19th annual international conference on Computer documentation (SIGDOC’01), pages 152–156, New York, NY, USA. ACM Press.

Recker, M. (2006). Perspectives on teachers as digital library users. D-Lib Magazine, 12(9).

Recker, M. and Wiley, D. (2001). A non-authoritative educational metadata ontology for filtering and recommending learning objects. Interactive Learning Environments: Special issue on metadata, pages 1–17.

Reload (2007). Educational metadata editor. http://www.reload.ac.uk/editor.html.

Rogers, E. M. (2003). Diffusion of Innovations. Free Press, fifth edition.

Salton, G. and McGill, M. J. (1986). Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York, NY, USA.

Santos, S. G., Rioja, R. M. G., Pardo, A., and Kloos, C. D. (2004). Beyond simple sequencing: Sequencing of learning activities using hierarchical graphs. In International Conference on Web-Based Education (IASTED’04).

Savia, E., Kurki, T., and Jokela, S. (1998). Metadata based matching of documents and user profiles, in human and artificial information processing. In 8th Finnish Artificial Intelligence Conference, pages 61–70.

Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5:197–227.

SCORM (2007). Advanced distributed learning scorm (sharable content object reference model). http://www.adlnet.org.

Shreeves, S., Knutson, E., Stvilia, B., Palmer, C., Twidale, M., and Cole, T. (2005). Is “quality” metadata “shareable” metadata? The implications of local metadata practices for federated collections. In Twelfth National Conference of the Association of College and Research Libraries. Association of College and Research Libraries.

Simon, B., Dolog, P., Miklós, Z., Olmedilla, D., and Sintek, M. (2004). Conceptualising smart spaces for learning. Journal of Interactive Media in Education - Special Issue on the Educational Semantic Web, 5(ISSN: 1365-893X).

SourceForge (2007). Classifier4j. http://classifier4j.sourceforge.net.

Stuckenschmidt, H. (2003). Query processing on the semantic web. Künstliche Intelligenz: Special Issue on the Semantic Web, 3/03:22–26.

Stuckenschmidt, H. and van Harmelen, F. (2001). Ontology-based metadata generation from semi-structured information. In 1st international conference on Knowledge capture (K-CAP’01), pages 163–170, New York, NY, USA. ACM Press.

Stuckenschmidt, H. and van Harmelen, F. (2004). Generating and managing metadata for web-based information systems. Knowledge-Based Systems, 17(5-6):201–206.

Ternier, S. and Duval, E. (2003). Web services for the ARIADNE knowledge pool system. In 3rd Annual ARIADNE Conference, pages 1–9. ARIADNE Foundation. URL: http://www.cs.kuleuven.ac.be/~hmdb/publications/publicationDetails.php?id=41233.

Trigg, R. (1983). A Network-Based Approach to Text Handling for the Online Scientific Community. PhD thesis, University of Maryland.

Verbert, K. and Duval, E. (2004). Towards a global architecture for learning objects: a comparative analysis of learning object content models. In World Conference on Educational Multimedia, Hypermedia and Telecommunications (EDMEDIA2004), pages 202–208. Association for the Advancement of Computing in Education.

Verbert, K., Jovanovic, J., Gasevic, D., and Duval, E. (2005). Repurposing learning object components. In Meersman, R. and al., editors, OTM Workshops, volume 3762 of Lecture Notes in Computer Science, pages 1169–1178. Springer.

Vygotsky, L. S. (1981). The genesis of higher mental functions. In Wertsch, J. V., editor, The Concept of Activity in Soviet Psychology. Armonk, NY: Sharpe.

W3C (2007). Resource description framework specification. http://www.w3.org/RDF.

Wenger, E. (1998). Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press, Cambridge.

Wertsch, J. V. (1993). Voices of the Mind: A Sociocultural Approach to Mediated Action. Harvard University Press.

Wiley, D. (2000). Learning Object Design and Sequencing Theory. PhD thesis, Brigham Young University.


Wiley, D. (2001a). Connecting learning objects to instructional design theory: a definition, a metaphor, and a taxonomy. In Wiley, D., editor, The Instructional Use of Learning Objects. Association for Instructional Technology.

Wiley, D. (2001b). The reusability paradox. Technical report, Reusability, Collaboration, and Learning Troupe.

Wiley, D. (2007). Learning objects literature review. http://opencontent.org/docs/lo-lit-review-draft.doc.

Wiley, D., Waters, S., Dawson, D., Lambert, B., Barclay, M., Wade, D., and Nelson, L. (2004). Overcoming the Limitations of Learning Objects. Journal of Educational Multimedia and Hypermedia, 13(4):507–521.

Wiley, D. A. (1999). The post-lego learning object.

Wiley, D. A. and Edwards, E. K. (2002). Online Self-Organizing Social Systems: The Decentralized Future of Online Learning. Quarterly Review of Distance Education, 3(1):33–46.

Winn, W. (1997). Advantages of a theory-based curriculum in instructional technology. Educational Technology, 37(1):34–41.

Wiza, W., Walczak, K., and Cellary, W. (2004). Periscope: a system for adaptive 3d visualization of search results. In Ninth international conference on 3D Web technology (Web3D ’04), pages 29–40, New York, NY, USA. ACM Press.

Wong, S. and Butz, C. (2000). A bayesian approach to user profiling in information retrieval. Technology Letters, 4/01:50–56.

Woodley, A. and Geva, S. (2004). Nlpx - an xml-ir system with a natural language interface. In Bruza, P., Moffat, A., and Turpin, A., editors, ADCS, pages 71–74. University of Melbourne, Department of Computer Science.

Xiang, X., Guo, L., and Shi, Y. (2003). Search and delivery of standardized learning resources based on soap messaging and native xml databases. In 3rd Dublin Core Conference (DC 2003).

XML (2007). Extensible markup language. http://www.w3.org/xml.

Yahoo! (2007). Web search engine. http://www.yahoo.com.

Yilmazel, O., Finneran, C. M., and Liddy, E. D. (2004). Metaextract: an nlp system to automatically assign metadata. In 4th ACM/IEEE-CS joint conference on Digital libraries (JCDL ’04), pages 241–242. ACM Press.
