

Semantic Web 0 (2014) 1–17, IOS Press

WYSIWYM – Integrated Visualization, Exploration and Authoring of Semantically Enriched Un-structured Content

Editor(s): Roberto Garcia, Universitat de Lleida, Spain; Heiko Paulheim, University of Mannheim, Germany; Paola Di Maio, Universal Interfaces Research Lab, ISTCS, Edinburgh, UK

Solicited review(s): Heiko Hornung, University of Campinas, Brazil; Haofen Wang, Shanghai Jiao Tong University, China; one anonymous reviewer

Ali Khalili a, Sören Auer b

a AKSW, Universität Leipzig, Augustusplatz 10, 04109 Leipzig, Germany, [email protected]

b CS/EIS, Universität Bonn, Römerstraße 164, 53117 Bonn, Germany, [email protected]

Abstract. The Semantic Web and Linked Data have gained traction in recent years. However, the majority of information is still contained in unstructured documents. This cannot be expected to change, since text, images and videos are the natural way in which humans interact with information. Semantic structuring, on the other hand, enables the (semi-)automatic integration, repurposing and rearrangement of information. NLP technologies and formalisms for the integrated representation of unstructured and semantic content (such as RDFa and Microdata) aim at bridging this semantic gap. However, in order for humans to truly benefit from this integration, we need ways to author, visualize and explore unstructured and semantically enriched content in an integrated manner. In this paper, we present the WYSIWYM (What You See is What You Mean) concept, which addresses this issue and formalizes the binding between semantic representation models and UI elements for authoring, visualization and exploration. With RDFaCE, Pharmer and conTEXT we present and evaluate three complementary showcases implementing the WYSIWYM concept for different application domains.

Keywords: Visualization, Authoring, Exploration, Semantic Web, WYSIWYM, WYSIWYG, Visual Mapping

1. Introduction

The Semantic Web and Linked Data movements, with the aim of creating, publishing and interconnecting machine-readable information, have gained traction in recent years. However, the majority of information is still contained in and exchanged using unstructured documents, such as Web pages, text documents, images and videos. This cannot be expected to change, since text, images and videos are the natural way in which humans interact with information. Semantic structuring, on the other hand, provides a wide range of advantages compared to unstructured information. It facilitates a number of important aspects of information management:

– For search and retrieval, enriching documents with semantic representations helps to create more efficient and effective search interfaces, such as faceted search [33] or question answering [20].

– In information presentation, semantically enriched documents can be used to create more sophisticated ways of flexibly visualizing information, such as by means of semantic overlays as described in [2].

1570-0844/14/$27.50 © 2014 – IOS Press and the authors. All rights reserved


– For information integration, semantically enriched documents can be used to provide unified views on heterogeneous data stored in different applications by creating composite applications such as semantic mashups [1].

– To realize personalization, semantically enriched documents provide customized and context-specific information which better fits user needs and results in customized applications such as personalized semantic portals [29].

– For reusability and interoperability, enriching documents with semantic representations facilitates exchanging content between disparate systems and enables building applications such as executable papers [24].

Natural Language Processing (NLP) technologies (e.g. named entity recognition and relationship extraction) as well as formalisms for the integrated representation of unstructured and semantic content (such as RDFa and Microdata) aim at bridging the semantic gap between unstructured and semantic representation formalisms. However, in order for humans to truly benefit from this integration, we need ways to author, visualize and explore unstructured and semantically enriched content in an integrated manner.

In this paper, we present an approach inspired by the WYSIWYM metaphor (What You See Is What You Mean), which addresses the issue of integrated visualization, exploration and authoring of semantically enriched un-structured content. Our WYSIWYM concept formalizes the binding between semantic representation models and UI elements for authoring, visualization and exploration. We analyse popular tree, graph and hyper-graph based semantic representation models and elicit a list of semantic representation elements, such as entities, various relationships and attributes. We provide a comprehensive survey of common UI elements for authoring, visualization and exploration, which can be configured and bound to individual semantic representation elements. Our WYSIWYM concept also comprises cross-cutting helper components, which can be employed within a concrete WYSIWYM interface for the purpose of automation, annotation, recommendation, personalization etc.

With RDFaCE, Pharmer and conTEXT we present and evaluate three complementary showcases implementing the WYSIWYM concept for different domains. RDFaCE is a domain-agnostic editor for text content with embedded semantics in the form of RDFa or Microdata. Pharmer is a WYSIWYM interface for the authoring of semantic prescriptions, thus targeting the medical domain. conTEXT is a Linked-Data-based lightweight text analytics platform supporting different views for semantic analytics. Our evaluation of these tools with end users (in the case of RDFaCE and conTEXT) and domain experts (in the case of Pharmer) shows that WYSIWYM interfaces provide good usability while retaining the benefits of a truly semantic representation.

The contributions of this work are in particular:

1. A formalization of the WYSIWYM concept based on definitions for the WYSIWYM model, binding and concrete interfaces.

2. A survey of semantic representation elements of tree, graph and hyper-graph knowledge representation formalisms as well as UI elements for authoring, visualization and exploration of such elements.

3. Three complementary use cases, which evaluate different, concrete WYSIWYM interfaces in a generic as well as domain-specific context.

The WYSIWYM formalization can be used as a basis for implementations, allows existing user interfaces to be evaluated and classified in a defined way, and provides a terminology for software engineers, user interface and domain experts to communicate efficiently and effectively. With this work we aim to contribute to making Semantic Web applications more user friendly and ultimately to create an ecosystem of flexible UI components, which can be reused, repurposed and choreographed to accommodate the UI needs of dynamically evolving information structures.

The remainder of this article is structured as follows: In Section 2, we describe the background of our work and discuss the related work. Section 3 describes the fundamental WYSIWYM concept proposed in the paper. Subsections of Section 3 present the different components of the WYSIWYM model. In Section 4, we introduce three implemented WYSIWYM interfaces together with their evaluation results. Finally, Section 5 concludes with an outlook on future work.

2. Related Work

WYSIWYG. The term WYSIWYG, an acronym for What-You-See-Is-What-You-Get, is used in computing to describe a system in which content (text and graphics) displayed on-screen during editing appears in a form closely corresponding to its appearance when printed or displayed as a finished product. The first usage of the term goes back to 1974 in the print industry to express the idea that what the user sees on the screen is what the user gets on the printer. Xerox PARC’s Bravo was the first WYSIWYG editor-formatter [23]. It was designed by Butler Lampson and Charles Simonyi, who had started working on these concepts around 1970 while at Berkeley. Later, with the emergence of the Web and HTML technology, the WYSIWYG concept was also utilized in Web-based text editors. The aim was to reduce the effort required by users to express the formatting directly as valid HTML markup. In a WYSIWYG editor users can edit content in a view which matches the final appearance of published content with respect to fonts, headings, layout, lists, tables, images and structure. Because using a WYSIWYG editor may not require any HTML knowledge, such editors are often easier for an average computer user to get started with. The first programs for building Web pages with a WYSIWYG interface were Netscape Gold, Claris HomePage, and Adobe PageMill.

WYSIWYG text authoring is by now ubiquitous on the Web and part of most content creation and management workflows. It is part of content management systems (CMS), weblogs, wikis, fora, product data management systems and online shops, just to mention a few. However, the WYSIWYG model has been criticized, primarily for the verbosity, poor support of semantics and low quality of the generated code, and there have been voices advocating a change towards a WYSIWYM (What-You-See-Is-What-You-Mean) model [32,30].

WYSIWYM. The first use of the WYSIWYM term occurred in 1995, aiming to capture the separation of presentation and content when writing a document. The LyX editor¹ was the first WYSIWYM word processor for structure-based content authoring. Instead of focusing on the format or presentation of the document, a WYSIWYM editor preserves the intended meaning of each element. For example, page headers, sections, paragraphs, etc. are labeled as such in the editing program, and displayed appropriately in the browser. Another usage of the WYSIWYM term was by Power et al. [28] in 1998 as a solution for Symbolic Authoring. In symbolic authoring the author generates language-neutral “symbolic” representations of the content of a document, from which documents in each target language are generated automatically, using Natural Language Generation technology. In this What-You-See-Is-What-You-Meant approach, the language generator was used to drive the user interface (UI) with support for localization and multilinguality. Using the WYSIWYM natural language generation approach, the system generates a feedback text for the user that is based on a semantic representation. This representation can be edited directly by the user by manipulating the feedback text.

¹ http://www.lyx.org/

The WYSIWYM term as defined and used in this paper targets the novel aspect of integrated visualization, exploration and authoring of unstructured and semantic content. The rationale of our WYSIWYM concept is to enrich the existing WYSIWYG presentational view of the content with UI components revealing the semantics embedded in the content and to enable the exploration and authoring of semantic content. Instead of separating presentation, content and meaning, our WYSIWYM approach aims to integrate these aspects to facilitate the process of Semantic Content Authoring. The two “You”s in our WYSIWYM concept refer to the end user (with no or limited knowledge of the Semantic Web) who views unstructured content that he or she has semantically enriched. The “Mean” refers to the metadata or semantics encoded in the unstructured content viewed by the user. There are already some approaches (i.e. visual mapping techniques) which go in the direction of integrated visualization and authoring of structured content.

Visual Mapping Techniques. Visual mapping techniques are knowledge representation techniques that graphically represent knowledge structures. Most of them have been developed as paper-based techniques for brainstorming, learning facilitation, outlining or to elicit knowledge structures. According to their basic topology, most of them can be related to the following fundamentally different primary approaches [5,31]:

– Mind-Maps. Mind-maps are created by drawing one central topic in the middle together with labeled branches and sub-branches emerging from it. Instead of distinct nodes and links, mind-maps only have labeled branches. A mind-map is a connected directed acyclic graph with hierarchy as its only type of relation. Outlines are a similar technique to show hierarchical relationships using a tree structure. Mind-maps and outlines are not suitable for relational structures because they are constrained to the hierarchical model.

– Concept Maps. Concept maps consist of labeled nodes and labeled edges linking all nodes to a connected directed graph. The basic node and link structure of a connected directed labeled graph also forms the basis of many other modeling approaches like Entity-Relationship (ER) diagrams and Semantic Networks. These forms have the same basic structure as concept maps but with more formal types of nodes and links.

– Spatial Hypertext. A spatial hypertext is a set of text nodes that are not explicitly connected but implicitly related through their spatial layout, e.g., through closeness and adjacency, similar to a pin-board. Spatial hypertext can show fuzzily related items. To fuzzily relate two items in a spatial hypertext schema, they are simply placed near to each other, but possibly not quite as near as to a third object. This allows for so-called “constructive ambiguity” and is an intuitive way to deal with vague relations and orders. Spatial hypertext abandons the concept of explicitly interrelating objects. Instead, it uses spatial positioning as the basic structure.

Binding data to UI elements. There are already many approaches and tools which address the binding between data and UI elements for visualizing and exploring structured content. Dadzie and Rowe [3] present the most exhaustive and comprehensive survey to date of these approaches. For example, Fresnel [27] is a display vocabulary for core RDF concepts. Fresnel’s two foundational concepts are lenses and formats. Lenses define which properties of an RDF resource, or group of related resources, are displayed and how those properties are ordered. Formats determine how resources and properties are rendered and provide hooks to existing styling languages such as CSS.

Parallax, Tabulator, Explorator, Rhizomer, Sgvizler, Fenfire, RDF-Gravity, IsaViz and i-Disc for Topic Maps are examples of tools available for visualizing and exploring structured data. In these tools the binding between semantics and UI elements is mostly performed implicitly, which limits their versatility. However, an explicit binding as advocated by our WYSIWYM model can potentially be added to some of these tools.

Beyond purely structured content, there are also many approaches and tools which allow binding semantic data to UI elements within semantically enriched unstructured content (cf. our comprehensive literature study [11]). As an example, Dido [10] is a data-interactive document which lets end users author semantic content mixed with unstructured content in a web page. Dido inherits data exploration capabilities from the underlying Exhibit² framework. Loomp, a proof-of-concept for the One Click Annotation [6] strategy, is another example in this context. Loomp is a WYSIWYG web editor for enriching content with RDFa annotations. It employs a partial mapping between UI elements and data to hide the complexity of creating semantic data.

[Figure 1: diagram with the blocks Semantic Representation Data Models; Visualization Techniques, Exploration Techniques and Authoring Techniques (each with Configs); Helper components; Bindings; WYSIWYM Interface.]

Fig. 1. Schematic view of the WYSIWYM model.

3. WYSIWYM Concept

In this section we introduce the fundamental WYSIWYM concept and formalize key elements of the concept. Formalizing the WYSIWYM concept has a number of advantages: First, the formalization can be used as a basis for the design and implementation of novel applications for authoring, visualization, and exploration of semantic content (cf. Section 4). The formalization serves the purpose of providing a terminology for software engineers and UI designers to communicate efficiently and effectively. It provides insights into and an understanding of the requirements as well as corresponding UI solutions for the proper design and implementation of semantic content management applications. Secondly, it allows existing user interfaces to be evaluated and classified according to the conceptual model in a defined way. This will highlight the gaps in existing applications dealing with semantically enriched documents and will help to optimize them based on the defined requirements.

² http://simile-widgets.org/exhibit/


Figure 1 provides a schematic overview of the WYSIWYM concept. The rationale is that elements of a knowledge representation formalism (or data model) are connected to suitable UI elements for visualization, exploration and authoring. Formalizing this conceptual model results in three core definitions: (1) for the abstract WYSIWYM model, (2) for bindings between UI and representation elements, and (3) for a concrete instantiation of the abstract WYSIWYM model, which we call a WYSIWYM interface.

Definition 1 (WYSIWYM model). The WYSIWYM model can be formally defined as a quintuple $(D, V, X, T, H)$ where:

– $D$ is a set of semantic representation data models, where each $D_i \in D$ has an associated set of data model elements $E_{D_i}$;

– $V$ is a set of tuples $(v, C_v)$, where $v$ is a visualization technique and $C_v$ a set of possible configurations for the visualization technique $v$;

– $X$ is a set of tuples $(x, C_x)$, where $x$ is an exploration technique and $C_x$ a set of possible configurations for the exploration technique $x$;

– $T$ is a set of tuples $(t, C_t)$, where $t$ is an authoring technique and $C_t$ a set of possible configurations for the authoring technique $t$;

– $H$ is a set of helper components.

Semantic representation data models are techniques to define the meaning of data within the context of its interrelationships with other data (cf. Section 3.1). Tree, Graph and Hypergraph are examples of commonly used data models. Visualization techniques include UI techniques for highlighting, associating and detail viewing of semantic entities (cf. Section 3.2). Exploration techniques include UI techniques for efficient browsing and navigating of semantic data (cf. Section 3.3). Authoring techniques include UI techniques for adding and editing semantic entities and their relations (cf. Section 3.4). Helper components are cross-cutting aspects to enhance and customize the user/application requirements of a WYSIWYM interface (cf. Section 3.6).
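As an illustration only, the quintuple of Definition 1 can be written down as plain data structures. The following Python sketch is not part of the original formalization: the class names, the field names and the example configuration strings are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DataModel:
    """A semantic representation data model D_i with its element set E_{D_i}."""
    name: str
    elements: FrozenSet[str]


@dataclass(frozen=True)
class Technique:
    """A UI technique together with its possible configurations, e.g. (v, C_v)."""
    name: str
    configs: FrozenSet[str]


@dataclass(frozen=True)
class WysiwymModel:
    """The quintuple (D, V, X, T, H)."""
    data_models: FrozenSet[DataModel]    # D
    visualization: FrozenSet[Technique]  # V
    exploration: FrozenSet[Technique]    # X
    authoring: FrozenSet[Technique]      # T
    helpers: FrozenSet[str]              # H


# Example values; the configuration strings are hypothetical.
tree = DataModel("tree", frozenset({
    "item", "item type", "item-subitem relationship",
    "item property value", "related items",
}))
framing = Technique("framing and segmentation",
                    frozenset({"border colour per item type"}))
faceting = Technique("faceting", frozenset({"facet per item type"}))
form = Technique("form editing", frozenset({"one field per property"}))

model = WysiwymModel(
    data_models=frozenset({tree}),
    visualization=frozenset({framing}),
    exploration=frozenset({faceting}),
    authoring=frozenset({form}),
    helpers=frozenset({"automation", "recommendation"}),
)
```

Frozen dataclasses are used so that data models and techniques are hashable and can be members of sets, mirroring the set-based definition.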

The WYSIWYM model represents an abstract concept from which concrete interfaces can be derived by means of bindings between semantic representation model elements and configurations of particular UI elements.

Definition 2 (Binding). A binding $b$ is a function which maps each element $e$ of a semantic representation model ($e \in E_{D_i}$) to a set of tuples $(ui, c)$, where $ui$ is a user interface technique ($ui \in V \cup X \cup T$) and $c$ is a configuration ($c \in C_{ui}$).

Figure 4 gives an overview of all data model elements (columns) and UI elements (rows) and how they can be bound together using a certain configuration (cells). The shades of gray in a certain cell indicate the suitability of a certain binding between a particular UI and data model element.

For example, given a tree-based semantic representation model, framing and segmentation UI techniques can be used as external augmentation to visualize the items in the text. It is also possible to use text formatting techniques as inline augmentation for highlighting the items, but since they might interfere with the current text format, we assume a partial binding for them. A possible configuration for this example binding is to set different border and text colors to distinguish different item types.
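The example binding above can be sketched as a mapping from data model elements to sets of (technique, configuration) pairs. This is a minimal Python sketch under stated assumptions: the dictionary representation, the helper names `lookup` and `is_valid`, and the configuration labels are all hypothetical and only illustrate the constraint $c \in C_{ui}$ from Definition 2.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Elements of a tree-based model (cf. Section 3.1).
TREE_ELEMENTS: FrozenSet[str] = frozenset({
    "item", "item type", "item-subitem relationship",
    "item property value", "related items",
})

# Hypothetical configuration sets C_ui for two visualization techniques.
TECHNIQUE_CONFIGS: Dict[str, FrozenSet[str]] = {
    "framing and segmentation": frozenset({"border colour per item type"}),
    "text formatting": frozenset({"text colour per item type"}),
}

Binding = Dict[str, Set[Tuple[str, str]]]

# The binding b: each element maps to a set of (ui, c) tuples.
binding: Binding = {
    "item": {
        ("framing and segmentation", "border colour per item type"),
        ("text formatting", "text colour per item type"),
    },
}


def lookup(b: Binding, element: str) -> Set[Tuple[str, str]]:
    """Return the (ui, c) tuples bound to an element (empty set if none)."""
    return b.get(element, set())


def is_valid(b: Binding, elements: FrozenSet[str],
             configs: Dict[str, FrozenSet[str]]) -> bool:
    """Check that only known elements are bound and that every c is in C_ui."""
    if not set(b) <= set(elements):
        return False
    return all(ui in configs and c in configs[ui]
               for pairs in b.values() for ui, c in pairs)
```

The suitability shading mentioned for Figure 4 (full versus partial binding) is deliberately left out of this sketch; it could be modelled as an additional value attached to each pair.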

Once a selection of data models and UI elements has been made and both are bound to each other, encoding a certain configuration in a binding, we attain a concrete instantiation of our WYSIWYM model called a WYSIWYM interface.

Definition 3 (WYSIWYM interface). An instantiation $I$ of the WYSIWYM model, called a WYSIWYM interface, is a hextuple $(D_I, V_I, X_I, T_I, H_I, b_I)$, where:

– $D_I$ is a selection of semantic representation data models ($D_I \subset D$);

– $V_I$ is a selection of visualization techniques ($V_I \subset V$);

– $X_I$ is a selection of exploration techniques ($X_I \subset X$);

– $T_I$ is a selection of authoring techniques ($T_I \subset T$);

– $H_I$ is a selection of helper components ($H_I \subset H$);

– $b_I$ is a binding which binds a particular occurrence of a data model element to a visualization, exploration and/or authoring technique.

Note that we limit the definition to one binding, which means that only one semantic representation model is supported in a particular WYSIWYM interface at a time. It would also be possible to support several semantic representation models (e.g. RDFa and Microdata) at the same time. However, this can be confusing to the user, which is why we deliberately excluded this case from our definition. In the remainder of this section we discuss the different parts of the WYSIWYM concept in more detail.
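To make Definition 3 and the single-model restriction concrete, the following Python sketch checks whether a selection of components is an instantiation of a given model. It is an assumption-laden illustration: components are identified by name only, `Selection`, `Interface` and `is_instantiation` are hypothetical names, and the subset test uses `<=` (subset or equal), which reads the paper's $\subset$ loosely.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set, Tuple


@dataclass(frozen=True)
class Selection:
    """Names of data models, techniques and helpers (used for model and interface)."""
    data_models: FrozenSet[str]
    visualization: FrozenSet[str]
    exploration: FrozenSet[str]
    authoring: FrozenSet[str]
    helpers: FrozenSet[str]


@dataclass
class Interface:
    """A WYSIWYM interface: the selected components plus one binding b_I."""
    selection: Selection
    binding: Dict[str, Set[Tuple[str, str]]]


def is_instantiation(interface: Interface, model: Selection) -> bool:
    s = interface.selection
    # Every selected component must come from the abstract model.
    subsets_ok = (
        s.data_models <= model.data_models
        and s.visualization <= model.visualization
        and s.exploration <= model.exploration
        and s.authoring <= model.authoring
        and s.helpers <= model.helpers
    )
    # One binding, hence one semantic representation model at a time.
    single_model = len(s.data_models) == 1
    # The binding may only refer to UI techniques that were selected.
    selected_ui = s.visualization | s.exploration | s.authoring
    binding_ok = all(ui in selected_ui
                     for pairs in interface.binding.values()
                     for ui, _config in pairs)
    return subsets_ok and single_model and binding_ok


# Hypothetical abstract model and one interface derived from it.
model = Selection(
    data_models=frozenset({"tree", "graph", "hypergraph"}),
    visualization=frozenset({"framing", "text formatting", "marking"}),
    exploration=frozenset({"faceting"}),
    authoring=frozenset({"form editing"}),
    helpers=frozenset({"automation", "recommendation"}),
)
interface = Interface(
    selection=Selection(
        data_models=frozenset({"graph"}),
        visualization=frozenset({"framing"}),
        exploration=frozenset({"faceting"}),
        authoring=frozenset({"form editing"}),
        helpers=frozenset({"recommendation"}),
    ),
    binding={"instance": {("framing", "border colour per class")}},
)
```

Supporting several data models at once would only require relaxing the `single_model` check, which matches the remark that this case was excluded deliberately rather than for technical reasons.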


[Figure 2: chart with the axes "Semantic expressiveness" (low to high) and "Complexity of visual mapping" (low to high). Labels: Mind-maps, Concept maps, Spatial hypertext; Tree-based (Microdata, Microformat), Graph-based (RDF: RDF/XML, Turtle/N3/N-Triples, RDFa, JSON-LD), Hypergraph-based (Topic Maps: XTM, LTM, CTM, AsTMa).]

Fig. 2. Comparison of existing visual mapping techniques in terms of semantic expressiveness and complexity of visual mapping.

3.1. Semantic Representation Models

Semantic representation models are conceptual data models to express the meaning of information, thereby enabling representation and interchange of knowledge. Based on their expressiveness, we can roughly divide popular semantic representation models into the three categories tree-based, graph-based and hypergraph-based (cf. Figure 2). Each semantic representation model comprises a number of representation elements, such as various types of entities and relationships. For visualization, exploration and authoring it is of paramount importance to bind the most suitable UI elements to the respective representation elements. In the sequel we briefly discuss the three different types of representation models.

Tree-based. This is the simplest semantic representation model, where semantics is encoded in a tree-like structure. It is suited for representing taxonomic knowledge, such as thesauri, classification schemes, subject heading lists, concept hierarchies or mind-maps. It is used extensively in biology and life sciences, for example, in the APG III system (Angiosperm Phylogeny Group III system) of flowering plant classification, as part of the Dimensions of the XBRL (eXtensible Business Reporting Language) or generically in the SKOS (Simple Knowledge Organization System). Elements of tree-based semantic representation usually include:

– E1: Item – e.g. Magnoliidae, the item representing all flowering plants.

– E2: Item type – e.g. biological term for Magnoliidae.

– E3: Item-subitem relationships – e.g. Magnoliidae referring to subitem magnolias.

– E4: Item property value – e.g. the synonym flowering plant for the item Magnoliidae.

– E5: Related items – e.g. the sibling item Eudicots to Magnoliidae.

Tree-based data can be serialized as Microdata or Microformats.
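The five tree-based elements can be illustrated with a small in-memory tree. The Python sketch below is only an illustration: the `Item` class and its method names are assumptions, the root node is a hypothetical placeholder needed to express the sibling relation, and the item types assigned to Eudicots and magnolias are assumed rather than taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Item:
    name: str                                                   # E1: item
    item_type: str                                              # E2: item type
    properties: Dict[str, str] = field(default_factory=dict)    # E4: property values
    subitems: List["Item"] = field(default_factory=list)        # E3: item-subitem
    parent: Optional["Item"] = field(default=None, repr=False)

    def add(self, child: "Item") -> "Item":
        """Attach a subitem and return it, so that trees can be built step by step."""
        child.parent = self
        self.subitems.append(child)
        return child

    def related(self) -> List["Item"]:
        """E5: related items, here simply the siblings under the same parent."""
        if self.parent is None:
            return []
        return [s for s in self.parent.subitems if s is not self]


# Hypothetical root; only used to make Magnoliidae and Eudicots siblings.
root = Item("classification root", "hypothetical root")
magnoliidae = root.add(Item("Magnoliidae", "biological term",
                            {"synonym": "flowering plant"}))
eudicots = root.add(Item("Eudicots", "biological term"))
magnolias = magnoliidae.add(Item("magnolias", "biological term"))
```

Because hierarchy is the only relation type, every element other than E5 is stored directly on the node, and E5 is derived from the parent link.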

Graph-based. This semantic representation model adds more expressiveness compared to simple tree-based formalisms. The most prominent representative is the RDF data model, which can be seen as a set of triples consisting of subject, predicate and object, where each component can be a URI, the object can also be a literal, and subject as well as object can be a blank node. The features that most distinguish RDF from a simple tree-based model are the distinction of entities into classes and instances as well as the possibility to express arbitrary relationships between entities. The graph-based model is suited for representing combinatorial schemes such as concept maps. Graph-based models are used in a very broad range of domains, for example, in FOAF (Friend of a Friend) for describing people, their interests and interconnections in a social network, in MusicBrainz to publish information about music albums, in the medical domain (e.g. DrugBank, Diseasome, ChEMBL, SIDER) to describe the relations between diseases, drugs and genes, or generically in the SIOC (Semantically-Interlinked Online Communities) vocabulary. Elements of RDF as a typical graph-based data model are:

– E1: Instances – e.g. Warfarin as a drug.

– E2: Classes – e.g. anticoagulant drugs for Warfarin.

– E3: Relationships between entities (instances or classes) – e.g. the interaction between Aspirin as an antiplatelet drug and Warfarin, which will increase the risk of bleeding.

– E4: Literal property values – e.g. the half-life of Amoxicillin.

∗ E4₁: Value – e.g. 61.3 minutes.

∗ E4₂: Language tag – e.g. en.

∗ E4₃: Datatype – e.g. xsd:float.

RDF-based data can be serialized in various formats, such as RDFa, RDF/XML, JSON-LD or Turtle/N3/N-Triples.
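The graph-based elements can be illustrated with plain subject-predicate-object tuples. This Python sketch deliberately avoids any RDF library; the `ex:` prefix, the property names `ex:interactsWith` and `ex:halfLife`, and the helper functions are hypothetical and chosen only to mirror the examples E1 to E4 above.

```python
from typing import List, NamedTuple, Optional, Set, Union


class Literal(NamedTuple):
    value: str                      # E4_1: value
    language: Optional[str] = None  # E4_2: language tag
    datatype: Optional[str] = None  # E4_3: datatype


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, Literal]


graph: List[Triple] = [
    # E1/E2: the instance Warfarin belongs to a class of anticoagulant drugs.
    Triple("ex:Warfarin", "rdf:type", "ex:AnticoagulantDrug"),
    # E3: an arbitrary relationship between two entities.
    Triple("ex:Aspirin", "ex:interactsWith", "ex:Warfarin"),
    # E4: a literal property value with a datatype.
    Triple("ex:Amoxicillin", "ex:halfLife", Literal("61.3", datatype="xsd:float")),
]


def classes_of(instance: str, triples: List[Triple]) -> Set[str]:
    """Return the classes an instance is typed with (rdf:type objects)."""
    return {t.obj for t in triples
            if t.subject == instance and t.predicate == "rdf:type"
            and isinstance(t.obj, str)}


def literal_triples(triples: List[Triple]) -> List[Triple]:
    """Return the triples whose object is a literal value."""
    return [t for t in triples if isinstance(t.obj, Literal)]
```

In contrast to a tree, nothing prevents a second relationship between the same two entities, which is what makes arbitrary relationships (E3) expressible.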


Hypergraph-based. A hypergraph is a generalization of a graph in which an edge can connect any number of vertices. Since hypergraph-based models allow n-ary relationships between an arbitrary number of nodes, they provide a higher level of expressiveness compared to tree-based and graph-based models. The most prominent representative is the Topic Maps data model, developed as an ISO/IEC standard, which consists of topics, associations and occurrences. The semantic expressivity of Topic Maps is, in many ways, equivalent to that of RDF, but the major differences are that Topic Maps (i) provide a higher level of semantic abstraction (providing a template of topics, associations and occurrences, while RDF only provides a template of two arguments linked by one relationship) and (hence) (ii) allow n-ary relationships (hypergraphs) between any number of nodes, while RDF is limited to triples. The hypergraph-based model is suited for representing complex schemes such as spatial hypertext. Hypergraph-based models are used for a variety of applications. Amongst them are musicDNA³ as an index of musicians, composers, performers, bands, artists, producers, their music, and the events that link them together, TM4L (Topic Maps for e-Learning), clinical decision support systems and enterprise information integration. Elements of Topic Maps as a typical hypergraph-based data model are:

– E1: Topic name – e.g. University of Leipzig.

– E2: Topic type – e.g. organization for University of Leipzig.

– E3: Topic associations – e.g. member of a project which has other organization partners.

– E4: Topic role in association – e.g. coordinator.

– E5: Topic occurrences – e.g. address.

∗ E5₁: value – e.g. Augustusplatz 10, 04109 Leipzig.

∗ E5₂: datatype – e.g. text.
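The elements listed above can be sketched as follows, with an association acting as a hyperedge that connects any number of topics through roles. This Python sketch is an illustration under assumptions: the class names, the role names other than coordinator, and the project and partner topics are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Occurrence:
    kind: str       # E5: occurrence, e.g. address
    value: str      # E5_1: value
    datatype: str   # E5_2: datatype


@dataclass
class Topic:
    name: str                                                      # E1: topic name
    topic_type: str                                                # E2: topic type
    occurrences: List[Occurrence] = field(default_factory=list)   # E5


@dataclass
class Association:
    """E3: an n-ary association; each member is a (role, topic) pair (E4)."""
    assoc_type: str
    members: List[Tuple[str, Topic]]

    def arity(self) -> int:
        return len(self.members)

    def players(self, role: str) -> List[str]:
        """Return the names of all topics playing the given role."""
        return [topic.name for r, topic in self.members if r == role]


leipzig = Topic("University of Leipzig", "organization", [
    Occurrence("address", "Augustusplatz 10, 04109 Leipzig", "text"),
])
# Hypothetical project and partners, used only to show an n-ary relationship.
project = Topic("Project P", "project")
partner_a = Topic("Partner A", "organization")
partner_b = Topic("Partner B", "organization")

membership = Association("project membership", [
    ("project", project),
    ("coordinator", leipzig),
    ("partner", partner_a),
    ("partner", partner_b),
])
```

A single `Association` here links four topics at once; expressing the same fact with binary relationships only would require several separate edges plus an auxiliary node.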

Topic Maps-based data can be serialized in an XML-based syntax called XTM (XML Topic Map) as well as in LTM (Linear Topic Map Notation), CTM (Compact Topic Maps Notation) and AsTMa (Asymptotic Topic Map Notation).

³ http://www.musicdna.info/

3.2. Visualization

The primary objectives of visualization are to present, transform, and convert semantic data into a visual representation, so that humans can read, query and edit them efficiently. We divide existing techniques for visualization of knowledge encoded in text, images and videos into the three categories Highlighting, Associating and Detail view. Highlighting includes UI techniques which are used to distinguish or highlight a part of an object (i.e. text, image or video) from the whole object. Associating deals with techniques that visualize the relation between some parts of an object. Detail view includes techniques which reveal detailed information about a part of an object. For each of the above categories, the related UI techniques are as follows:

- Highlighting.

– V1: Framing and Segmentation (borders, overlays and backgrounds). This technique can be applied to text, images and videos: we enclose a semantic entity in a coloured border, background or overlay. Different border styles (colours, width, types), background styles (colours, patterns) or overlay styles (when applied to images and videos) can be used to distinguish different types of semantic entities (cf. Figure 3 no. 1, 2). The technique is already employed in social networking websites such as Google Plus and Facebook to tag people within images.

– V2: Text formatting (color, font, size, margin, etc.). In this technique different text styles such as font family, style, weight, size, colour, shadows, margin and other text decoration techniques are used to distinguish semantic entities within a text (cf. Figure 3 no. 6). The problem with this technique is that in an HTML document, the applied semantic styles might overlap with existing styles in the document and thereby make it ambiguous which spans are semantic entities.

– V3: Image color effects. This technique is similar to text formatting but applied to images and videos. Different image color effects such as brightness/contrast, shadows, glows and bevel/emboss are used to highlight semantic entities within an image (cf. Figure 3 no. 7). This technique suffers from the problem that the applied effects might overlap with the existing effects in the image, thereby making it hard to distinguish the semantic entities.

– V4: Marking (icons appended to text or image). In this technique, which can be applied to text, images and videos, we append an icon as a marker to the part of the object which includes the semantic entity (cf. Figure 3 no. 9). The most popular use of this technique is currently within maps to indicate specific points of interest. Different types of icons can be used to distinguish different types of semantic or correlated entities.

– V5: Bleeping. A bleep is a single short high-pitched signal in videos. Bleeping can be used to highlight semantic entities within a video. Different types of bleep signals can be defined to distinguish different types of semantic entities.

– V6: Speech (in videos). In this technique a video is augmented by some speech indicating the semantic entities and their types within the video.

Fig. 3. Screenshots of user interface techniques for visualization and exploration: 1-framing using borders, 2-framing using backgrounds, 3-video subtitle, 4-line connectors and arrow connectors, 5-bar layouts, 6-text formatting, 7-image color effects, framing and line connectors, 8-expandable callout, 9-marking with icons, 10-tooltip callout, 11-faceting.

- Associating.

– V7: Line connectors. Using line connectors is the simplest way to visualize the relation between semantic entities in text, images and videos (cf. Figure 3 no. 4). If the value of a property is available in the text, line connectors can also reflect the item property values. A drawback is that plain line connectors cannot express the direction of a relation.

– V8: Arrow connectors. Arrow connectors are extended line connectors with arrows to express the direction of a relation in a directed graph.
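The difference between V7 and V8 can be illustrated with a small Python sketch. The class names and the candidate_triples helper are assumptions made for this illustration. A line connector leaves two possible readings of a relation, whereas an arrow connector leaves exactly one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineConnector:
    """V7: an undirected link; the order of the two ends carries no meaning."""
    ends: frozenset[str]
    label: str


@dataclass(frozen=True)
class ArrowConnector:
    """V8: a directed link from source to target."""
    source: str
    target: str
    label: str


def candidate_triples(connector):
    """Return every (subject, predicate, object) reading the connector allows."""
    if isinstance(connector, ArrowConnector):
        return [(connector.source, connector.label, connector.target)]
    a, b = sorted(connector.ends)
    return [(a, connector.label, b), (b, connector.label, a)]


line = LineConnector(frozenset({"Berlin", "Germany"}), "isCapitalOf")
arrow = ArrowConnector("Berlin", "Germany", "isCapitalOf")

print(candidate_triples(line))   # two possible readings
print(candidate_triples(arrow))  # exactly one reading
```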

Besides the line and arrow connector techniques, which explicitly visualize the association between entities, implicit techniques defined as Gestalt principles [9] can be used for modeling association. These techniques are psychological assumptions that impose structure on human visual perception. Principles such as proximity, similarity, continuity, closure, symmetry, figure/ground and common fate can be used to affect our perception of whether and how the objects are organized into groups. Discussing these principles is out of the scope of this paper.

- Detail view.

– V9: Callouts. A callout is a string of text connected by a line, arrow, or similar graphic to a part of text, image or video giving information about that part. It is used in conjunction with a cursor, usually a pointer. The user hovers the pointer over an item, without clicking it, and a callout appears (cf. Figure 3 no. 10). Callouts come in different styles and templates such as infotips, tooltips, hints and popups. Different sorts of metadata can be embedded in a callout to indicate the type of semantic entities, property values and relationships. Another variant of callouts is the status bar, which displays metadata in a bar appended to the text, image or video container. A problem with dynamic callouts is that they cannot be triggered by hovering on mobile devices, since there is no cursor.

– V10: Video subtitles. Subtitles are textual versions of the dialog or commentary in videos. They are usually displayed at the bottom of the screen and are employed for written translation of a dialog in a foreign language. Video subtitles can be used to reflect detailed semantics embedded in a video scene when watching the video. A problem with subtitles is efficiently scaling the text size and relating text to semantic entities when several semantic entities exist in a scene.

3.3. Exploration

To increase the effectiveness of visualizations, users need to be able to dynamically navigate and explore the visual representation of the semantic data. The dynamic exploration of semantic data will result in faster and easier comprehension of the targeted content. Techniques for exploration of semantics encoded in text, images and videos include:

– X1: Zooming. In a zoomable UI, users can change the scale of the viewed area in order to see more detail or less. The zooming elements and techniques vary across applications. Zooming in on a semantic entity can reveal further details such as property value or entity type. Zooming out can be employed to reveal the relations between semantic entities in a text, image or video. Supporting rich dynamics by configuring different visual representations for semantic objects at different sizes is a requirement for a zoomable UI. The iMapping approach [5], which is implemented in the semantic desktop, is an example of the zooming technique.

– X2: Faceting. Faceted browsing is a technique for accessing information organized according to a faceted classification system, allowing users to explore a collection of information by applying multiple filters (cf. Figure 3 no. 11). Defining facets for each component of the predefined semantic models enables users to browse the underlying knowledge space by iteratively narrowing the scope of their quest in a predetermined order. One of the main problems with faceted browsers is the increased number of choices presented to the user at each step of the exploration [4].

– X3: On-demand highlighting. Unlike the highlighting approach discussed in the visualization methods, on-demand highlighting is used to navigate the semantic entities encoded in text in a dynamic manner. One technique to realize on-demand highlighting is the bar layout. In the bar layout, each semantic entity within the text is indicated by a vertical bar in the left or right margin (cf. Figure 3 no. 5). The colour of the bar reflects the type of the entity. The bars are ordered by length and order in the text. Nested bars can be used to show the hierarchies of entities. Semantic entities in the text are highlighted when the mouse hovers over the corresponding bar. This approach is employed in Loomp [21].

– X4: Expanding & Drilling down. Expandable callouts are interactive and dynamic callouts which enable users to explore the semantic data associated with a predefined semantic entity (cf. Figure 3 no. 8). Drilling down in a callout enables users to move from summary information to detailed data by focusing in on entities. This technique is employed in OntosFeeder [16].
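As a rough illustration of faceting (X2), the following Python sketch narrows a set of annotated entities by applying several facet filters at once. The Entity fields, the country facet and the sample data are assumptions chosen for the example; a real faceted browser would derive its facets from the underlying semantic model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str
    entity_type: str  # first facet
    country: str      # second facet (hypothetical, for illustration)


def facet_filter(entities, **selected):
    """Keep the entities that match every selected facet value."""
    return [
        e for e in entities
        if all(getattr(e, facet) == value for facet, value in selected.items())
    ]


def facet_counts(entities, facet):
    """Count how many entities fall under each value of one facet."""
    counts = {}
    for e in entities:
        key = getattr(e, facet)
        counts[key] = counts.get(key, 0) + 1
    return counts


entities = [
    Entity("Leipzig", "Location", "Germany"),
    Entity("Berlin", "Location", "Germany"),
    Entity("University of Leipzig", "Organization", "Germany"),
    Entity("Edinburgh", "Location", "UK"),
]

german_locations = facet_filter(entities, entity_type="Location", country="Germany")
print([e.label for e in german_locations])
print(facet_counts(entities, "entity_type"))
```

Showing the counts per facet value next to each filter is one common way to keep the number of choices at each step manageable.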

3.4. Authoring

Semantic authoring aims to add more meaning to digitally published documents. If users do not only publish the content, but at the same time describe what it is they are publishing, then they have to adopt a structured approach to authoring. A semantic authoring UI is a human accessible interface with capabilities for writing and modifying semantically enriched documents. The following techniques can be used for authoring of semantics encoded in text, images and videos:

– T1: Form editing. In form editing, a user employs existing form elements such as input/check/radio boxes, drop-down menu, slider, spinner, buttons, date/color picker etc. for content authoring.

– T2: Inline edit. Inline editing is the process of editing items directly in the view by performing simple clicks, rather than selecting items and then navigating to an edit form and submitting changes from there.


– T3: Drawing. Drawing, as part of informal user interfaces [19], provides a natural human input to annotate an object by augmenting the object with human-understandable sketches. For instance, users can draw a frame around semantic entities, draw a line between related entities etc. Special shapes can be drawn to indicate different entity types or entity roles in a relation.

– T4: Drag and drop. Drag and drop is a pointing device gesture in which the user selects a virtual object by grabbing it and dragging it to a different location or onto another virtual object. In general, it can be used to invoke many kinds of actions, or create various types of associations between two abstract objects.

– T5: Context menu. A context menu (also called contextual, shortcut, or pop-up menu) is a menu that appears upon user interaction, such as a right button mouse click. A context menu offers a limited set of choices that are available in the current state, or context.

– T6: (Floating) Ribbon editing. A ribbon is a command bar that organizes functions into a series of tabs or toolbars at the top of the editable content. Ribbon tabs/toolbars are composed of groups, which are labeled sets of closely related commands. A floating ribbon is a ribbon that appears when the user rolls the mouse over a target area. A floating ribbon increases usability by bringing edit functions as close as possible to the user’s point of focus. The Aloha WYSIWYG editor4 is an example of floating ribbon based content authoring.

– T7: Voice commands. Voice commands permit the user’s hands and eyes to be busy with another task, which is particularly valuable when users are in motion or outside. Users tend to prefer speech for functions like describing objects, sets and subsets of objects [26]. By adding special signals to input voice, users can author semantic content from scratch.

– T8: (Multi-touch) gestures. A gesture is a form of non-verbal communication in which visible bodily actions communicate particular messages. Technically, different methods can be used for detecting and identifying gestures. Movement-sensor-based and camera-based approaches are two commonly used methods for the recognition of in-air gestures [22]. Multi-touch gestures are another type of gesture, defined to interact with multi-touch devices such as modern smartphones and tablets. Users can use gestures to determine semantic entities, their types and the relationships among them. The main problem with gestures is their high level of abstraction, which makes it hard to assert concrete property values. Special gestures can be defined to author semantic entities in text, images and videos.

4 http://aloha-editor.org

3.5. Bindings

Figure 4 surveys possible bindings between the user interface and semantic representation elements.

The bindings were derived based on the following methodology:

1. We first analyzed existing semantic representation models and extracted the corresponding elements for each semantic model.

2. We performed an extensive literature study regarding existing approaches for visual mapping as well as approaches addressing the binding between data and UI elements. If the approach explicitly mentioned a binding composed of UI elements and semantic model elements, we added the binding to our mapping table.

3. We analyzed existing tools and applications which implicitly address the binding between data and UI elements.

4. Finally, we followed a predictive approach. We investigated additional UI elements which are listed in existing HCI glossaries and carefully analyzed their potential to be connected to a semantic model element.

Although we deem the bindings to be fairly complete, new UI elements might be developed or additional data models (or variations of the ones considered) might appear; in this case the bindings can easily be extended.

Partial binding indicates the situation when a UI technique does not completely cover a semantic model element but can still be used in particular cases. For example, different text colors can be used to highlight predefined item types in text, but since the colors might interfere with the current colors in the text (in the case of an HTML document), we classify this binding as a partial binding. Another example is the line connectors used to represent the relation between items in a tree or graph-based model. In this case, in contrast to arrow connectors, we cannot determine the source and destination of the line and are therefore unable to model directional relations completely; hence a partial binding is assigned.
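One way to picture the three binding levels is a lookup table keyed by UI technique and semantic model element. The sketch below is an assumption-laden illustration: it encodes only the two partial bindings discussed above plus the arrow-connector contrast implied by the text, not the full content of Figure 4, and it treats unlisted pairs as having no binding.

```python
from enum import Enum


class Binding(Enum):
    NONE = 0
    PARTIAL = 1
    FULL = 2


# (UI technique, semantic model element) -> binding level.
# Illustrative entries only; the FULL entry is inferred from the contrast
# with line connectors, not read off Figure 4 cell by cell.
BINDINGS = {
    ("text formatting", "item type"): Binding.PARTIAL,
    ("line connectors", "relationship"): Binding.PARTIAL,
    ("arrow connectors", "relationship"): Binding.FULL,
}


def binding_for(technique, element):
    """Look up a binding; pairs missing from the table count as NONE here."""
    return BINDINGS.get((technique, element), Binding.NONE)


def usable(technique, element):
    """A technique is usable for an element if any binding exists at all."""
    return binding_for(technique, element) is not Binding.NONE


print(binding_for("line connectors", "relationship").name)
print(usable("bleeping", "relationship"))
```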

The asterisks in Figure 4 indicate the cases when the metadata value is explicitly available in the text and the user just needs to provide the connection (e.g. imagine that we have Berlin and Germany mentioned in the text and we want to assign the relation isCapitalOf).

The following binding configurations (extracted from the literature and current tools) are available and referred to from the cells of Figure 4:

– Defining a special border or background style (C1), text style (C2), image color effect (C4), bleep sound (C5), bar style (C6), sketch (C7), draggable or droppable shape (C8), voice command (C9), gesture (C10) or a related icon (C3) for each type.

– Progressive shading (C11) by defining continuous shades within a specific color scheme to distinguish items in different levels of the hierarchy.

– Hierarchical bars (C12) by defining special styles for nested bars.

– Grouping by similar border or background style (C13), text style (C14), icons (C15) or image color effects (C16).

For example, a user can define a set of preferred border colors to distinguish different item types (e.g. Persons, Organizations or Locations) or to group related items (e.g. all the cities in Germany).
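Configuration C1 can be sketched in a few lines of Python. The colour palette, the frame_entities function and the sample sentence are assumptions for illustration and are not taken from any of the tools discussed in this paper. The sketch wraps each annotated span in an HTML span whose border colour depends on the entity type.

```python
import html

# Hypothetical user-defined palette: entity type -> border colour (C1).
BORDER_COLORS = {
    "Person": "#d33",
    "Organization": "#36c",
    "Location": "#2a2",
}


def frame_entities(text, entities):
    """Wrap each (start, end, type) span of `text` in a bordered <span>.

    The spans must not overlap; they are processed from left to right.
    """
    out, cursor = [], 0
    for start, end, entity_type in sorted(entities):
        color = BORDER_COLORS.get(entity_type, "#999")  # fallback colour
        out.append(html.escape(text[cursor:start]))
        out.append(
            f'<span class="entity" style="border: 1px solid {color}">'
            f"{html.escape(text[start:end])}</span>"
        )
        cursor = end
    out.append(html.escape(text[cursor:]))
    return "".join(out)


sentence = "Ali works at University of Leipzig."
org = "University of Leipzig"
org_start = sentence.index(org)
marked = frame_entities(
    sentence,
    [(0, 3, "Person"), (org_start, org_start + len(org), "Organization")],
)
print(marked)
```

Grouping (C13) could reuse the same function by keying the palette on a group name instead of an entity type.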

3.6. Helper Components

In order to facilitate, enhance and customize the WYSIWYM model, we utilize a set of helper components, which implement cross-cutting aspects.

A helper component acts as an extension on top of the core functionality of the WYSIWYM model. The following components can be used to improve the quality of a WYSIWYM UI depending on the requirements defined for a specific application domain:

– H1: Automation means the provision of facilities for automatic annotation of text, images and videos to reduce the need for human work, thereby facilitating the efficient annotation of large item collections. For example, users can employ existing NLP services (e.g. named entity recognition, relationship extraction) for automatic text annotation.

– H2: Real-time tagging is an extension of automation, which allows annotations to be created proactively while the user is authoring a text, image or video. This significantly increases the annotation speed, and users are not distracted since they do not have to interrupt their current authoring task.

– H3: Recommendation means providing users with pre-filled form fields, suggestions (e.g. for URIs, namespaces, properties), default values etc. These facilities simplify the authoring process, as they reduce the number of required user interactions. Moreover, they help prevent incomplete or empty metadata. In order to leverage other users’ annotations as recommendations, approaches like Paragraph Fingerprinting [8] can be implemented.

– H4: Personalization and context-awareness describes the ability of the UI to be configured according to users’ contexts, background knowledge and preferences. Instead of being static, a personalized UI dynamically tailors its visualization, exploration and authoring functionalities based on the user profile and context.

– H5: Collaboration and crowdsourcing enables collaborative semantic authoring, where the authoring process can be shared among different authors at different locations. A vast number of amateur and expert users collaborate and contribute on the Social Web. Crowdsourcing harnesses the power of such crowds to significantly enhance and widen the results of semantic content authoring and annotation. Generic approaches for exploiting single-user Web applications for shared editing [7] can be employed in this context.

– H6: Accessibility means providing people with disabilities and special needs with appropriate UIs. The underlying semantic model in a WYSIWYM UI can allow alternatives or conditional content in different modalities to be selected based on the type of the user’s disability and information need.

– H7: Multilinguality means supporting multiple languages in a WYSIWYM UI when visualizing, exploring or authoring the content.
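To make the automation helper (H1) concrete, the sketch below uses a toy gazetteer lookup as a stand-in for a real named entity recognition service. The dictionary contents and the annotate function are assumptions for illustration; an actual deployment would call an external NLP service instead.

```python
import re

# Toy gazetteer standing in for a real named entity recognition service.
GAZETTEER = {
    "Leipzig": "Location",
    "Berlin": "Location",
    "AKSW": "Organization",
}

# One alternation over all known surface forms, matched on word boundaries.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b")


def annotate(text):
    """Return (start, end, surface form, type) for every gazetteer hit."""
    return [
        (m.start(), m.end(), m.group(1), GAZETTEER[m.group(1)])
        for m in _PATTERN.finditer(text)
    ]


hits = annotate("AKSW is a research group in Leipzig.")
for start, end, surface, entity_type in hits:
    print(f"{surface} [{start}:{end}] -> {entity_type}")
```

The resulting offsets and types are the kind of input a highlighting technique such as framing (V1) would consume, and a user could then accept or correct each suggestion.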


[Figure 4: binding matrix. Columns list the semantic model elements: Tree-based (e.g. Taxonomies): item, item type, item-subitem, item property value, related items; Graph-based (e.g. RDF): instance, class, relationships between entities, literal property values (value, language tag, datatype); Hypergraph-based (e.g. Topic Maps): topic, topic type, topic associations, topic role in association, topic occurrences (value, datatype). Rows list the UI techniques for visualization, exploration and authoring, grouped by the modality the structure is encoded in (text, images, videos). Each cell is marked as no binding, partial binding or full binding and may reference a binding configuration (C1–C16). * If value is available in the text/subtitle.]

Fig. 4. Possible bindings between user interface and semantic representation model elements.


4. Implementation and Evaluation

In order to evaluate the WYSIWYM model, we implemented the three applications RDFaCE, Pharmer and conTEXT, which we present in the sequel.

RDFaCE. RDFaCE (RDFa Content Editor) [13] is a WYSIWYM interface for semantic content authoring. It is implemented on top of the TinyMCE rich text editor. RDFaCE extends the existing WYSIWYG user interfaces to facilitate semantic authoring within popular CMSs, such as blogs, wikis and discussion forums. The RDFaCE implementation (cf. Figure 5, left) is open-source and available for download together with an explanatory video and online demo at http://aksw.org/Projects/RDFaCE. RDFaCE as a WYSIWYM instantiation can be described using the following hextuple:

– D: RDFa, Microdata5.

– V: Framing using borders (C: special border color defined for each type), Callouts using dynamic tooltips.

– E: Faceting based on the type of entities.

– T: Form editing, Context Menu, Ribbon editing.

– H: Automation, Recommendation.

– b: bindings defined in Figure 4.
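The hextuple above can be written down as a small record type. The following Python sketch is only an illustration: the class name and field names are assumptions, while the values are copied from the RDFaCE description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WysiwymModel:
    """The six components of a WYSIWYM instantiation (names are illustrative)."""
    data_models: tuple[str, ...]    # D
    visualization: tuple[str, ...]  # V
    exploration: tuple[str, ...]    # E
    authoring: tuple[str, ...]      # T
    helpers: tuple[str, ...]        # H
    bindings: str                   # b


rdface = WysiwymModel(
    data_models=("RDFa", "Microdata"),
    visualization=("Framing using borders", "Callouts using dynamic tooltips"),
    exploration=("Faceting based on the type of entities",),
    authoring=("Form editing", "Context menu", "Ribbon editing"),
    helpers=("Automation", "Recommendation"),
    bindings="Figure 4",
)
print(rdface.authoring)
```

Describing each tool with the same record makes the instantiations directly comparable, for example by intersecting their authoring techniques.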

RDFaCE comes with a special edition [12] customized for the Schema.org vocabulary. In this version, different color schemes are assigned to different schemas defined in Schema.org. Users are able to create a subset of Schema.org schemas for their intended domain and customize the colors for this subset. In this version, nested forms are dynamically generated from the selected schemas for authoring and editing of the annotations.

In order to evaluate RDFaCE usability, we conducted an experiment with 16 participants of the ISSLOD 2011 summer school6. The user evaluation comprised the following steps: First, some basic information about semantic content authoring along with a demo showcasing different RDFaCE features was presented to the participants as a 3-minute video. Then, participants were asked to use RDFaCE to annotate three text snippets – a wiki article, a blog post and a news article. For each text snippet, a timeslot of five minutes was available to use different features of RDFaCE for annotating occurrences of persons, locations and organizations with suitable entity references. Subsequently, a survey was presented to the participants where they were asked questions about their experience while working with RDFaCE. Questions were targeting six factors of usability [17,25], namely Fit for use, Ease of learning, Task efficiency, Ease of remembering, Subjective satisfaction and Understandability. Results of the survey are shown in Table 1. They indicate on average good to excellent usability for RDFaCE. A majority of the users deem RDFaCE fit for use and its functionality easy to remember. Also, ease of learning and subjective satisfaction were well rated by the participants. There was a slightly lower (but still above average) assessment of task efficiency and understandability, which we attribute to the short time participants had for familiarizing themselves with RDFaCE and the quite comprehensive functionality, which includes automatic annotations, recommendations and various WYSIWYM UI elements.

5 Microdata support is implemented in RDFaCE-Lite, available at http://rdface.aksw.org/lite

6 Summer school on Linked Data: http://lod2.eu/Article/ISSLOD2011

Fig. 6. Usability evaluation results for Pharmer (0: Strongly disagree, 1: Disagree, 2: Neutral, 3: Agree, 4: Strongly agree).

Table 1. Usability evaluation results for RDFaCE.

Usability Factor/Grade | Poor | Fair | Neutral | Good | Excellent
Fit for use | 0% | 12.50% | 31.25% | 43.75% | 12.50%
Ease of learning | 0% | 12.50% | 50% | 31.25% | 6.25%
Task efficiency | 0% | 0% | 56.25% | 37.50% | 6.25%
Ease of remembering | 0% | 0% | 37.50% | 50% | 12.50%
Subjective satisfaction | 0% | 18.75% | 50% | 25% | 6.25%
Understandability | 6.25% | 18.75% | 31.25% | 37.50% | 6.25%
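The percentages in Table 1 can be aggregated with a few lines of Python. The sketch below is an illustrative reading rather than the analysis used in the study. It treats "Good" and "Excellent" as positive answers, which is an assumption, and converts percentages back to head counts using the 16 participants mentioned above.

```python
GRADES = ("Poor", "Fair", "Neutral", "Good", "Excellent")

# Percentages copied from Table 1, in the order given by GRADES.
TABLE_1 = {
    "Fit for use": (0, 12.5, 31.25, 43.75, 12.5),
    "Ease of learning": (0, 12.5, 50, 31.25, 6.25),
    "Task efficiency": (0, 0, 56.25, 37.5, 6.25),
    "Ease of remembering": (0, 0, 37.5, 50, 12.5),
    "Subjective satisfaction": (0, 18.75, 50, 25, 6.25),
    "Understandability": (6.25, 18.75, 31.25, 37.5, 6.25),
}


def positive_share(row):
    """Percentage of participants who answered Good or Excellent."""
    return row[GRADES.index("Good")] + row[GRADES.index("Excellent")]


def head_count(percent, participants=16):
    """Convert a percentage back to a number of participants."""
    return round(percent * participants / 100)


for factor, row in TABLE_1.items():
    print(f"{factor}: {positive_share(row):.2f}% positive "
          f"({head_count(positive_share(row))} of 16)")
```

Under this reading, the two factors with a positive share above 50% are Fit for use and Ease of remembering, which matches the majority statement above.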

Fig. 5. Screenshots of our implemented WYSIWYM interfaces. Left: RDFaCE for semantic text authoring (T6 indicates the RDFaCE menu bar, V1 – the framing of named entities in the text, V9 – a callout showing additional type information, T5 – a context menu for revising annotations). Right: Pharmer for authoring of semantic prescriptions (V1 – highlighting of drugs through framing, V9 – additional information about a drug in a callout, T1/T2 combined form and inline editing of electronic prescriptions).

Pharmer. Pharmer [15] is a WYSIWYM interface for the authoring of semantically enriched electronic prescriptions. It enables physicians to embed drug-related metadata into e-prescriptions, thereby reducing the medical errors occurring in the prescriptions and increasing the awareness of the patients about the prescribed drugs and drug consumption in general. In contrast to database-oriented e-prescriptions, semantic prescriptions are easily exchangeable among other e-health systems without the need to change their related infrastructure. The Pharmer implementation (cf. Figure 5, right) is open-source and available for download together with an explanatory video and online demo7 at http://code.google.com/p/pharmer/. It is based on an HTML5 contenteditable element. Pharmer as a WYSIWYM instantiation is defined using the following hextuple:

– D: RDFa.

– V: Framing using borders and background (C: special background color defined for each type), Callouts using dynamic popups.

– E: Faceting based on the type of entities.

– T: Form editing, Inline edit.

– H: Automation, Real-time tagging, Recommendation.

– b: bindings defined in Figure 4.

7 http://bitili.com/pharmer

In order to evaluate the usability of Pharmer, we performed a user study with 13 subjects. Subjects were 3 physicians, 4 pharmacists, 3 pharmaceutical researchers and 3 students. We first showed them a 3-minute tutorial video on using different features of Pharmer and then asked each one to create a semantic prescription with Pharmer. After finishing the task, we asked the participants to fill out a questionnaire. We used the System Usability Scale (SUS) [18] as a standardized, simple, ten-item Likert scale-based questionnaire to grade the usability of Pharmer. SUS yields a single number in the range of 0 to 100 which represents a composite measure of the overall usability of the system. The results of our survey (cf. Figure 6) showed a mean usability score of 75 for the Pharmer WYSIWYM interface, which indicates a good level of usability. Participants particularly liked the integration of functionality and the ease of learning and use. The confidence in using the system was slightly lower, which we again attribute to the short learning phase and diverse functionality.
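For readers unfamiliar with SUS, the composite score is obtained from the ten item responses with the standard SUS scoring rule. The sketch below assumes the 0 to 4 response coding shown in Figure 6 and uses made-up responses; the function names are chosen for this illustration.

```python
def sus_score(responses):
    """System Usability Scale score for one participant.

    `responses` holds the ten answers coded 0 (Strongly disagree) to
    4 (Strongly agree). Odd-numbered items are positively worded and
    contribute their value; even-numbered items are negatively worded
    and contribute 4 minus their value. The sum is scaled to 0..100.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    total = 0
    for item, answer in enumerate(responses, start=1):
        total += answer if item % 2 == 1 else 4 - answer
    return total * 2.5


def mean_sus(all_responses):
    """Mean SUS score over several participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Made-up answers from two hypothetical participants.
print(mean_sus([[4, 0, 4, 0, 4, 0, 4, 0, 4, 0], [2] * 10]))
```

If the questionnaire is coded 1 to 5 instead, each answer would first need to be shifted down by one before applying the same rule.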

conTEXT. conTEXT [14] is a WYSIWYM interface which allows users to semantically analyze text corpora (such as blogs, RSS/Atom feeds, Facebook, G+, Twitter) and provides novel ways for browsing and visualizing the results. It helps non-programmer Web users to use sophisticated NLP techniques for text analytics and to give feedback to the NLP services for improving their quality for named entity recognition. The conTEXT implementation (cf. Figure 7) together with an explanatory video and online demo is available at http://context.aksw.org.

conTEXT as a WYSIWYM instantiation is defined using the following hextuple:

– D: RDFa.

– V: Framing using borders and background (C: special background color defined for each type), Callouts using dynamic popups, Text margin format for hierarchies, Line connectors for entity relations.

– E: Faceting based on the type of entities.

– T: Form editing, Inline edit.

– H: Recommendation.

– b: bindings defined in Figure 4.


Fig. 7. Screenshots of the conTEXT WYSIWYM interfaces (T2 indicates the inline editing UI, V1 – the framing of named entities in the text, V2 – text margin formatting for visualizing hierarchy, V7 – line connectors to show the relation between entities, V9 – a callout showing additional type information, X2 – faceted browsing, H3 – recommendation for NLP feedback).

Fig. 8. Usability evaluation results for conTEXT (0: Strongly disagree, 1: Disagree, 2: Neutral, 3: Agree, 4: Strongly agree).

The faceted browsing view in conTEXT clearly shows the concept of an integrated unstructured and structured view in a WYSIWYM interface. Users see unstructured text enriched with highlighted entities and detailed descriptions of the entities. In addition, different facets (e.g. an entity type tree) allow users to filter the text according to their preferences. Users can also use inline editing within unstructured text to refine the annotations and to send feedback to NLP services.

In order to evaluate the usability of conTEXT, we performed a user study with 25 subjects (20 PhD students having different backgrounds from computer software to life sciences, 2 MSc students and 3 BSc students with good command of English) on a set of 10 questions pertaining to knowledge discovery in corpora of unstructured data. Similar to the Pharmer evaluation, we used the SUS questionnaire to grade the usability of conTEXT. The results of our survey (cf. Figure 8) showed a mean usability score of 82 for the conTEXT WYSIWYM interface, which indicates a good level of usability. The responses to question 1 suggest that our system is adequate for frequent use by users. While a small fraction of the functionality is deemed unnecessary by some users, the users deem the system easy to use. Only one user suggested that he/she would need a technical person to use the system, while all other users were fine without one. The modules of the system themselves were deemed to be well integrated. Overall, the output of the system seems to be easy to understand, and even users without training consider themselves capable of using the system.

5. Conclusions

Bridging the gap between unstructured and semantic content is a crucial aspect for the ultimate success of semantic technologies. With the WYSIWYM concept we presented in this article an approach for integrated visualization, exploration and authoring of unstructured and semantic content. The WYSIWYM model binds elements of a knowledge representation formalism (or data model) to a set of suitable UI elements for visualization, exploration and authoring. Based on such a declarative binding mechanism, we aim to increase the flexibility, reusability and development efficiency of semantics-rich user interfaces.

We deem this work a first step in a larger research agenda aiming at improving the usability of semantic user interfaces, while retaining semantic richness and expressivity. In future work we envision adopting a model-driven approach to enable automatic implementation of WYSIWYM interfaces from user-defined preferences. This will help to reuse, re-purpose and choreograph WYSIWYM UI elements to accommodate the needs of dynamically evolving information structures and ubiquitous interfaces. We also aim to bootstrap an ecosystem of WYSIWYM instances and UI elements to support structure encoded in different modalities, such as images and videos. Creating live and context-sensitive WYSIWYM interfaces which can be generated on-the-fly based on the ranking of available UI elements is another promising research avenue.

Acknowledgments

We would like to thank our colleagues from the AKSW research group for their helpful comments during the development of the WYSIWYM model. This work was partially supported by a grant from the European Union’s 7th Framework Programme provided for the project LOD2 (GA no. 257943).

References

[1] A. Ankolekar, M. Krötzsch, T. Tran, and D. Vrandecic. The two cultures: Mashing up Web 2.0 and the Semantic Web. Web Semantics: Science, Services and Agents on the World Wide Web, 6(1):70 – 75, 2008. Semantic Web and Web 2.0.

[2] G. Burel, A. E. Cano, and V. Lanfranchi. Ozone Browser: Augmenting the Web with Semantic Overlays. In S. Auer, C. Bizer, and G. A. Grimnes, editors, Proc. of 5th Workshop on Scripting and Development for the Semantic Web at ESWC 2009, volume 449 of CEUR Workshop Proceedings, ISSN 1613-0073, June 2009.

[3] A.-S. Dadzie and M. Rowe. Approaches to visualising Linked Data: A survey. Semantic Web, 2(2):89 – 124, 2011.

[4] L. Deligiannidis, K. J. Kochut, and A. P. Sheth. RDF data exploration and visualization. In P. Mitra, C. L. Giles, and L. Carr, editors, Proceedings of the ACM first workshop on CyberInfrastructure: information management in eScience (CIMS 2007), pages 39 – 46. ACM, 2007.

[5] H. Haller and A. Abecker. iMapping: a zooming user interface approach for personal and semantic knowledge management. ACM SIGWEB Newsletter, pages 4:1 – 4:10, Sept. 2010.

[6] R. Heese, M. Luczak-Rösch, A. Paschke, R. Oldakowski, and O. Streibel. One Click Annotation. In G. T. W. G. A. Grimnes, editor, CEUR Workshop Proceedings, volume 699. CEUR-WS.org, February 2010.

[7] M. Heinrich, F. Lehmann, T. Springer, and M. Gaedke. Exploiting single-user web applications for shared editing: a generic transformation approach. In A. Mille, F. L. Gandon, J. Misselis, M. Rabinovich, and S. Staab, editors, Proceedings of the 21st World Wide Web Conference (WWW 2012), pages 1057 – 1066. ACM, 2012.

[8] L. Hong and E. H. Chi. Annotate once, appear anywhere: collective foraging for snippets of interest using paragraph fingerprinting. In D. R. O. Jr., R. B. Arthur, K. Hinckley, M. R. Morris, S. E. Hudson, and S. Greenberg, editors, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’09, pages 1791 – 1794. ACM, 2009.

[9] J. Johnson. Designing with the Mind in Mind, Second Edition: Simple Guide to Understanding User Interface Design Guidelines. Morgan Kaufmann Publishers Inc., 2014.

[10] D. R. Karger, S. Ostler, and R. Lee. The Web Page As a WYSIWYG End-user Customizable Database-backed Information Management Application. In Proceedings of the 22nd Annual ACM Symposium on User Interface Software and Technology, UIST ’09, pages 257 – 260, New York, NY, USA, 2009. ACM.

[11] A. Khalili and S. Auer. User interfaces for semantic authoring of textual content: A systematic literature review. Web Semantics: Science, Services and Agents on the World Wide Web, 22(0):1 – 18, 2013.

[12] A. Khalili and S. Auer. WYSIWYM Authoring of Structured Content Based on Schema.org. In X. Lin, Y. Manolopoulos, D. Srivastava, and G. Huang, editors, Web Information Systems Engineering – WISE 2013, volume 8181 of Lecture Notes in Computer Science, pages 425 – 438. Springer Berlin Heidelberg, 2013.

[13] A. Khalili, S. Auer, and D. Hladky. The RDFa Content Editor - From WYSIWYG to WYSIWYM. In X. Bai, F. Belli, E. Bertino, C. K. Chang, A. Elçi, C. C. Seceleanu, H. Xie, and M. Zulkernine, editors, IEEE Computer Software and Applications Conference (COMPSAC 2012), pages 531 – 540. IEEE Computer Society, 2012.

[14] A. Khalili, S. Auer, and A.-C. Ngonga Ngomo. conTEXT – Lightweight Text Analytics Using Linked Data. In V. Presutti, C. d’Amato, F. Gandon, M. d’Aquin, S. Staab, and A. Tordai, editors, The Semantic Web: Trends and Challenges, volume 8465 of Lecture Notes in Computer Science, pages 628 – 643. Springer International Publishing, 2014.

[15] A. Khalili and B. Sedaghati. Semantic Medical Prescriptions - Towards Intelligent and Interoperable Medical Prescriptions. In IEEE Seventh International Conference on Semantic Computing (ICSC 2013), pages 347 – 354. IEEE, 2013.

[16] A. Klebeck, S. Hellmann, C. Ehrlich, and S. Auer. OntosFeeder – A Versatile Semantic Context Provider for Web Content Authoring. In G. Antoniou, M. Grobelnik, E. Simperl, B. Parsia, D. Plexousakis, P. De Leenheer, and J. Pan, editors, The Semanic Web: Research and Applications, volume 6644 of Lecture Notes in Computer Science, pages 456 – 460. Springer Berlin Heidelberg, 2011.

[17] S. Lauesen. User Interface Design: A Software Engineering Perspective. Pearson/Addison-Wesley, Harlow, England, New York, 2005.

[18] J. Lewis and J. Sauro. The Factor Structure of the System Usability Scale. In M. Kurosu, editor, Human Centered Design, volume 5619 of Lecture Notes in Computer Science, pages 94 – 103. Springer Berlin Heidelberg, 2009.

[19] J. Lin, M. Thomsen, and J. A. Landay. A visual language for sketching large and complex interactive designs. In D. R. Wixon, editor, Proceedings of the CHI 2002 Conference on Human Factors in Computing Systems: Changing our World, Changing ourselves, CHI ’02, pages 307 – 314. ACM, 2012.

[20] V. Lopez, V. Uren, M. Sabou, and E. Motta. Is Question Answering Fit for the Semantic Web?: A Survey. Semantic Web Journal, 2(2):125 – 155, Apr. 2011.

[21] R. H. M. Luczak-Roesch. Linked Data Authoring for Non-Experts. In C. Bizer, T. Heath, T. Berners-Lee, and K. Idehen, editors, Proceedings of the WWW2009 Workshop on Linked Data on the Web (LDOW 2009), volume 538 of CEUR Workshop Proceedings. CEUR-WS.org, 2009.

[22] A. Löcken, T. Hesselmann, M. Pielot, N. Henze, and S. Boll. User-centred process for the definition of free-hand gestures applied to controlling music playback. Multimedia Systems, 18(1):15 – 31, 2012.

[23] B. A. Myers. A brief history of human-computer interaction technology. interactions, 5(2):44 – 54, 1998.

[24] W. Müller, I. Rojas, A. Eberhart, P. Haase, and M. Schmidt. A-R-E: The Author-Review-Execute Environment. Procedia Computer Science, 4(0):627 – 636, 2011. Proceedings of the International Conference on Computational Science, ICCS 2011.

[25] J. Nielsen. Introduction to Usability. http://www.useit.com/alertbox/20030825.html, 2012.

[26] S. Oviatt, P. Cohen, L. Wu, J. Vergo, L. Duncan, B. Suhm, J. Bers, T. Holzman, T. Winograd, J. Landay, J. Larson, and D. Ferro. Designing the user interface for multimodal speech and pen-based gesture applications: state-of-the-art systems and future research directions. Human-Computer Interaction, 15(4):263 – 322, Dec. 2000.

[27] E. Pietriga, C. Bizer, D. Karger, and R. Lee. Fresnel: A Browser-Independent Presentation Vocabulary for RDF. In I. Cruz, S. Decker, D. Allemang, C. Preist, D. Schwabe, P. Mika, M. Uschold, and L. Aroyo, editors, The Semantic Web - ISWC 2006, volume 4273 of Lecture Notes in Computer Science, pages 158 – 171. Springer Berlin Heidelberg, 2006.

[28] R. Power, D. Scott, and R. Evans. What You See Is What You Meant: direct knowledge editing with natural language feedback. In European Conference on Artificial Intelligence (ECAI), pages 677 – 681, 1998.

[29] M. Sah, W. Hall, N. M. Gibbins, and D. C. D. Roure. SEMPort - A Personalized Semantic Portal. In 18th ACM Conf. on Hypertext and Hypermedia, pages 31 – 32, 2007.

[30] C. Sauer. What you see is Wiki – Questioning WYSIWYG in the Internet Age. In Proceedings of Wikimania 2006, 2006.

[31] B. Shneiderman. Creating creativity: user interfaces for supporting innovation. ACM Transactions on Computer-Human Interaction (TOCHI) - Special issue on human-computer interaction in the new millennium, 7(1):114 – 138, Mar. 2000.

[32] J. Spiesser and L. Kitchen. Optimization of HTML Automatically Generated by WYSIWYG Programs. In S. I. Feldman, M. Uretsky, M. Najork, and C. E. Wills, editors, Proceedings of the 13th International Conference on World Wide Web, WWW ’04, pages 355 – 364, New York, NY, USA, 2004. ACM.

[33] D. Tunkelang. Faceted Search (Synthesis Lectures on Information Concepts, Retrieval, and Services). Morgan and Claypool Publishers, June 2009.