

Logical structure of a hypermedia newspaper

Janne Saarela, Marko Turpeinen, Tuomas Puskala,

Mari Korkea-aho, and Reijo Sulonen

Department of Computer Science
Helsinki University of Technology

02150 Espoo, Finland
Email: Janne.Saarela@hut.fi

Corresponding author:

Janne Saarela
Tel.int: +358-0-451 3246
Fax.int: +358-0-451 3293

Submitted: 11.3.1996

Requests for reprints should be addressed to Janne Saarela


Abstract: The OtaOnline project at the Helsinki University of Technology has been deploying the distribution of ordinary newspapers on the Internet since 1994. The editors produce the electronic counterpart of the ordinary papers by a conversion process from QuarkXpress documents to HyperText Markup Language. The project is about to step into a new era by introducing an information model which describes the logical structure of a future newspaper and thus enables the use of distributed newspaper servers and intelligent producer agents.

This paper describes an object-oriented approach which implements a logical model of a hypermedia newspaper. This model encapsulates the structure of the hypermedia documents as well as their capability of transforming into different presentation formats. It also provides a semantical rating mechanism to be used with intelligent agents. A distribution scheme which enables efficient use of this model is also presented.


1. Introduction

1.1. OtaOnline

OtaOnline is a testbed, implemented at the campus of the Helsinki University of Tech-

nology, for experimenting with the possibilities of the net media and electronic pub-

lishing. The testbed has been built around an existing high speed network, ubiquitous

in the Otaniemi campus area.

OtaOnline consists of a product development project and a research project. These

two sub-projects share many common R&D efforts, but the ultimate goals are differ-

ent. The OtaOnline product development is primarily interested in experimenting

with the net media concept. The usage of the net media is carefully studied. New

product concepts and enhancements are introduced on a regular basis. The OtaOnline

research project concerns the problems of large-scale implementation of net media.

Four main research areas are production systems, scalable distribution systems, struc-

ture modeling of multimedia products, and personalisation of multimedia services.

1.2. Problem domain

This paper discusses the implementation of a logical structure of a hypermedia news server which fulfils several requirements set for servers of this type. These requirements have been identified as follows:

1. Control over the structure — an electronic newspaper should be designed on two

distinct levels: in the document level which combines different media types into

one unit and in the publication level which creates relations between the docu-

ments.

2. Control over presentation — the presentation of the documents should be defined

separately from the structure in order to enable several concrete representations


from one high-level representation.

3. Versioning — an edition of an electronic newspaper can be conceived to consist of

the objects which have most recently been introduced to the system. This enables

the paper to have real-time properties and update its contents at any time.

4. Distribution — if the descriptive part of the components to be presented is sepa-

rated from the actual data, the descriptive elements can be easily distributed and

replicated among several hypermedia servers. In this way the servers can deter-

mine the material they store and serve according to their own attributes.

5. Metadata — the reusability of the hypermedia documents is much better if the doc-

uments have separate information about themselves. This information enables the

personalisation aspect of a newspaper as it can be used as a basis for personal

views inside a newspaper.

Moreover, the functionality of this logical model will focus on the following

aspects:

1. Generation of the presentable hypermedia documents;

2. Distribution of the objects of this framework;

3. Added value features facilitated by the model.

1.3. Concrete representations

Different hypermedia representations are available for our purposes. The World-Wide Web family of formats and protocols enables easy linking and presentation of hypertext documents in the HTML format. The MHEG-5 ISO standard is still a draft document but is expected to become popular on platforms other than the Internet, such as Set Top Boxes (STBs) with normal TV sets.

The term concrete representation will be used to describe the final presentation for-


mat of the client host. HTML, Java and MHEG-5 concrete representations are

described in more detail below.

1.3.1. HTML, Java

The HyperText Markup Language, HTML, was developed as a part of the World-

Wide Web project in the early 90’s. At first it was only an SGML-look-alike application

but was formally defined and accepted as an Internet standard a few years later.

HTML is a specific Document Type Definition (DTD) defined by means of the SGML standard. It describes the logical structure of a document, although it also contains presentation-oriented markup elements.

HTML version 2.0 supports markup for 6 levels of headings, three types of lists,

text emphases, link anchors and inlining images within the documents. A special

FORMS collection of markup elements enables interaction and passing information

from the client to the server.

HTML is currently being standardised as version 3.0, which extends version 2.0 by introducing a table model, better image control and mathematics. Private vendors are also adding their own features to the language in order to attract more customers. Netscape Communications Corp. has recently introduced version 2.0 of its popular Netscape browser, which supports even more markup elements, such as frames, as well as scripting languages.

Java is a language developed by Sun Microsystems, Inc. It is based on C++ and introduces features from Objective C. It eliminates the most common problems in object-oriented programming by removing pointers and multiple inheritance and by providing garbage collection in the run-time environment. On the World-Wide Web the language is interpreted by a browser, the first of which was HotJava, also developed by Sun. Netscape browser version 2.0 brings this interpreter to a wide variety of platforms and offers the possibility of bringing HTML documents alive.

Java can be used to write applets, applications or protocol handlers. Applets are programs which are included within HTML documents. They are identified using the APPLET element. Applications are stand-alone Java programs which need not be within HTML documents. Protocol handlers can be used to write handlers for media types which the browser does not support. The output of these handlers should be one of the known and supported media types.

The following markup instructs the browser to fetch from the server and run the corresponding applet, which has already been compiled into bytecode by a Java compiler. The bytecode consists of machine-independent instructions which run on any Java runtime system.

<APPLET CODE="StockDisplay.class" WIDTH=100 HEIGHT=50>

</APPLET>

Java has a supported API which is a class library consisting of classes for graphical

user interface, IO, network and utilities such as containers (collection classes). Java

also supports the notion of threads which enable concurrent execution of applications.

Concurrency can be controlled using the monitor and condition variable paradigm.

1.3.2. MHEG, PREMO

Two international standards are emerging as the most prominent candidates for mul-

timedia interchange and presentation. MHEG, the Multimedia/Hypermedia Experts

Group, has suggested an object-oriented framework (ISO, MHEG) where the primi-

tives of the standard are classes. The instances of these classes can be set to perform

synchronised actions and user interaction using methods which encapsulate the func-

tionality within the classes. The focus of the standard is in the interchange of multime-

dia presentations and it uses ASN.1 (ISO, ASN.1) for transmitting the description of

the structure of the presentation and the media data itself. MHEG engines for running


MHEG presentations are not yet widely available.

Part 1 of the proposed MHEG standard is a large collection of classes, of which part 5 specifies only a restricted subset. This subset is expected to be widely used in Set Top Boxes (STBs) with future television sets. Part 3 specifies scripting support whose implementation is still left open. Plans to use virtual machines similar to the one Java uses have been proposed.

Part 5 defines a class library which can be used to create scenes, basic units of pres-

entation. The instances of the class library are described along with basic units of

interaction; links and actions. Every object can have several different actions associ-

ated with it and thus provides a way to describe the functionality as well.

PREMO, Presentation Environments for Multimedia Objects (ISO, PREMO), shifts

focus to the interactivity of the final presentations and enables methods for creating

and modifying the application models of the final presentation. PREMO uses object

technology but differs from MHEG in the sense that it controls the presentation

details and could even be used to construct an MHEG engine.

An example scene with MPEG video is presented below.

(:scene Video
  (:items
    (:video
      (:size 320 100)
      (:position 0 0)
      (:hook #MPEG1)
      (:data (:name "queen.mpg")
             (:storage-mode #stream)
      )
    )
  )
)

1.3.3. Organization of this paper

This paper is organised as follows: section 2 describes an ideal situation we would like the users of our servers to experience. Section 3 presents the logical model which enables the ideal use and goes through its major concepts. Section 4 shows how this model will actually be used. Section 5 discusses the implications of this model, and section 6 presents a framework in which this model will work. Section 7 gives ideas on future work and section 8 concludes the paper.

2. An ideal scenario

John, our example user, enters OtaOnline by connecting to an OtaOnline server. Since

John is at his home workstation, he uses a WWW browser to read OtaOnline. At work

he might also have chosen to use an MHEG-5 engine. The server uses public-key

cryptography to authenticate John as a legitimate user for session logging purposes

and billing of services.

The welcoming screen has many different views based on the service selections that

John has made during earlier sessions. The first view brings the news of general inter-

est as an edited issue of latest news items (similar to current OtaOnline contents). The

second view that John has selected is a special edition for classical music enthusiasts,

including concert reviews, interviews, high quality audio and video samples of

recordings. The third view is dynamically constructed according to John’s personal preferences described in his user model.

The status of each view is visible on screen. Static (i.e. general) views should be

accessible after a short transmission delay. The dynamic (i.e. personal) view is bound

at this stage and John is informed when the personalised view is readable. John can

also order the personalised view to be constructed at a specific time of day.

John enters the general news view and selects the first document of interest. The news document contains text, images, two synchronised video streams, and an application for entering and viewing user annotations related to the story. The navigation system includes a visual map of the current view and the position of the current story in the view. John can now move from one article to another inside the view.

The view consists of latest documents as well as links to older documents related to

the current document (news threads). The view may include these older documents,

or there may be predefined queries to fetch relevant articles from the article archive, if

John wants to browse the related stories of interest.

John is able to rate each article, using a predefined scale (0-10). He thinks the cur-

rent article was worth reading and gives the article rating 7. These annotations are

used in specifying how interesting a given article is. They are used in changing John’s

user preferences.

John enters the module where he can view and change his registered user profile.

The profile is based on predefined semantic groups that have an importance value.

John wants to see more motor sports news in his personal profile so he changes the

“Sports/Motor” parameter from 6 to 8.

Information filtering methods are used to suggest articles to read. This ranked list

of interesting material is part of John’s personal view. The selection is based on the

user profile and the annotations other users have made related to current material.

The billing of service is based on the data collected in the server log about the user

session.


3. Going for the goal

3.1. An object model describing the logical structure

The object model which enables the functionality and flexibility we want to achieve is presented in Fig. 1. We motivate each class and its relations with other classes in the following subsections.

This model will be used to provide one high-level representation from which several concrete representations can be derived. This is also known as multi-purpose publishing. By this we mean the capability of supporting different presentation and distribution environments.

Figure 1. The logical structure presented by means of OMT. [Class diagram; the classes, members and association labels recoverable from the figure are listed below.]

• MediaObject: +id, +description, +toHTML3, +toMHEG5, +centralized
• Temporal; NonTemporal
• Text, Image: +toHTML3, +toMHEG5
• Audio, Video, Dynamic: +duration, +toHTML3, +toMHEG5
• Descriptor: +id, +description, +author, +date, +toHTML3, +toMHEG5
• Component
• SingleComponent, GroupComponent, View, Publication: +toHTML3, +toMHEG5
• Relation: +target, +name, +$listNames
• Semantics: +set, +query, +$listCategories, +$listSubCategories
• User: +name, +age, +profession
• IdMapper: +$newId, +$map
• LogBook; Annotation
• Association labels: consists of, has, has background, old version, in relation with, points to, made annotations, traversed items, rating, user profile

This model can be used to structure the contents of an electronic newspaper in two

separate levels much in the same way as (Garzotto et al., 1995) do:

1. Authoring in the Small (AIS) — how a single hypermedia document (here a Single-

Component) is structured. This is accomplished using the Relation class underneath

the SingleComponent class in Fig. 1. The Relation class provides a dynamic way to

name relations to different MediaObjects. This level of structuring is also similar to

the within-component layer of the Dexter Hypertext Reference Model (Halasz et

al., 1994).

2. Authoring in the Large (AIL) — how the documents (here SingleComponents) are

structured to form an electronic newspaper. This is accomplished using the classes

Publication, View and GroupComponent.

3.1.1. Media types

Different media types supported by the model are Audio, Video, Dynamic, Text and

Image. Each of these types is implemented as a class which has the functionality of

converting the actual contents into different formats. The variety of formats is due to

the fact that HTML3 browsers may natively support different formats than MHEG-5

or other concrete representation engines. The media classes share the same base class,

MediaObject, which remains an abstract class but provides a method for identifying

each object uniquely within the framework.

The Dynamic class is a special class holding a small program or a routine to be used in the final presentation. A logical name is assigned to each instance of the Dynamic class in order to identify it. These instances access the ProgramLibrary class to see whether the actual executable code can be found for a given representation. For example, a stock market display might be available for Java but not for MHEG.
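The lookup described above can be sketched roughly as follows. This is only an illustration in Python: the names ProgramLibrary and Routine come from the model, but the `add` and `find` methods, the dictionary layout and the "stock-display" routine are assumptions made for this example and not the project's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Routine:
    """A named routine with optional code per concrete representation."""
    name: str
    code: Dict[str, str] = field(default_factory=dict)  # e.g. {"java": "StockDisplay.class"}


class ProgramLibrary:
    """Hypothetical registry keyed by the logical name of a Dynamic object."""

    def __init__(self) -> None:
        self._routines: Dict[str, Routine] = {}

    def add(self, name: str, representation: str, code: str) -> None:
        # Register executable code for one representation of a routine.
        routine = self._routines.setdefault(name, Routine(name))
        routine.code[representation] = code

    def find(self, name: str, representation: str) -> Optional[str]:
        """Return the executable for a representation, or None if unavailable."""
        routine = self._routines.get(name)
        return routine.code.get(representation) if routine else None


library = ProgramLibrary()
library.add("stock-display", "java", "StockDisplay.class")  # assumed example entry
```

With this sketch, a Dynamic instance named "stock-display" would find code for the Java representation but receive None for MHEG, which mirrors the stock market example in the text.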


3.1.2. Media representations

We decided to encode the different media types in the following formats in the first stage of development. The next stage will be to reuse these formats to produce other formats if necessary.

• JPEG for Image — a format widely used in the news industry;

• MPEG level 1 for Video — a common format for video content producers;

• AIFF for Audio — a format defined primarily for audio interchange;

• SGML for Text (Universal Text Format DTD) — an international standard which

enables high-level abstraction of documents independent of their presentation. It

thus provides an excellent master representation for text documents from which

several other representations can be derived.

3.1.3. Synchronising media

Introducing temporal media within the model poses the problem of synchronisation.

Possible solutions to this problem can be categorised into two mutually exclusive

choices:

1. logical synchronisation: something happens before, after or at the same time as another event;

2. timely synchronisation: something takes place at time $t_1$ with duration $d_1$, something else takes place at $t_2$ with duration $d_2$.

Figure 2. The ProgramLibrary structure. [Class diagram: ProgramLibrary (+$addJava, +$addMHEG, +$list, +$instance) and Routine (+name, +java, +MHEG).]

We adopt the concept of timed streams proposed by (Gibbs, 1994), which follows the second paradigm. A timed stream consists of a sequence of tuples of the form $\langle e_i, s_i, d_i \rangle$, $i = 1, \ldots, n$, where $e_i$ are the media elements (instances of MediaObject), $s_i$ the start times and $d_i$ the durations (in seconds). These tuples can be generated by the editor as a part of the editorial process. A graphical tool that supports visual cues for synchronisation will be necessary. For example, a Gantt chart gives an intuitive idea of the concurrent and serial activation of temporal objects.
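To make the tuple notation concrete, a timed stream can be sketched as a list of (element, start, duration) records. The following Python fragment is only an illustration: the element labels and times are invented, plain strings stand in for MediaObject instances, and the half-open interval convention is an assumption.

```python
from typing import List, NamedTuple


class TimedElement(NamedTuple):
    element: str      # stands in for a MediaObject instance (e_i)
    start: float      # s_i, start time in seconds
    duration: float   # d_i, duration in seconds


def active_at(stream: List[TimedElement], t: float) -> List[str]:
    """Return the elements presented at time t (start inclusive, end exclusive)."""
    return [e.element for e in stream if e.start <= t < e.start + e.duration]


def total_duration(stream: List[TimedElement]) -> float:
    """Time at which the last element of the stream finishes."""
    return max((e.start + e.duration for e in stream), default=0.0)


# Invented example: a video with overlapping commentary, followed by an image.
stream = [
    TimedElement("video: last serve", 0.0, 20.0),
    TimedElement("audio: commentary", 5.0, 10.0),
    TimedElement("image: trophy", 20.0, 8.0),
]
```

Querying `active_at` over successive time points yields the same information a Gantt chart would show graphically: which temporal objects run concurrently and which run serially.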

In the first stage of development we do not enable the editors to set the time points

for synchronisation. This design activity can, however, be later easily added as the

Relation class provides a place for storing the time points.

Figure 3 introduces an example construction of one SingleComponent holding sev-

eral MediaObjects.
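The construction in Figure 3 can be approximated in a few lines of Python. This is a sketch under stated assumptions: the class names follow the model, but the fields, the `relate` and `targets` helpers and the identifiers "t-1", "i-1" and "v-1" are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaObject:
    id: str
    description: str
    kind: str  # "Text", "Image", "Video", ...


@dataclass
class Relation:
    name: str            # relation name, e.g. "abstract" or "main image"
    target: MediaObject


@dataclass
class SingleComponent:
    description: str
    relations: List[Relation] = field(default_factory=list)
    semantics: Dict[str, int] = field(default_factory=dict)

    def relate(self, name: str, target: MediaObject) -> None:
        # Attach a media object under a dynamically chosen relation name.
        self.relations.append(Relation(name, target))

    def targets(self, name: str) -> List[MediaObject]:
        return [r.target for r in self.relations if r.name == name]


article = SingleComponent("Agassi wins", semantics={"Sports": 5, "Sports/ball": 5})
article.relate("abstract", MediaObject("t-1", "Agassi wins 2nd game at the Australian Open", "Text"))
article.relate("main image", MediaObject("i-1", "Agassi holding trophy", "Image"))
article.relate("main video", MediaObject("v-1", "Agassi's last serve", "Video"))
```

Because relation names are plain data rather than fixed attributes, a formatter can ask for "main image" or "abstract" without the SingleComponent class having to know in advance which relations exist.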

3.1.4. Conversion between representations

The philosophy we want to stress with this model is the separation of structure and presentation, similar to that of SGML (ISO, SGML), which is a standard for marking up the structure of a document, and DSSSL (ISO, DSSSL), which is a standard for restructuring and formatting SGML documents.

Figure 3. A SingleComponent has named relations to three MediaObjects. [Diagram: the SingleComponent "Agassi wins", with Semantics Sports=5 and Sports/ball=5, has three Relations named "abstract", "main image" and "main video", pointing to the Text "Agassi wins 2nd game at the Australian Open", the Image "Agassi holding trophy" and the Video "Agassi’s last serve" respectively.]

In our model the formatting of the Components is implemented separately from the

structure using the Formatter class. This class can have several subclasses each of

which does a specific type of formatting. For example, figure 4 shows two different

formatters, one for HTML3 and one for MHEG-5. The base class, Formatter, holds a list

of all available formatters. The editor can apply one of these formatters to any Descrip-

tor subclass and thus produce the wanted concrete representation.
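A minimal sketch of such a formatter registry is given below in Python. The operations loosely mirror those named in Figure 4 (apply, add, list, instance), but the signatures are assumptions, and a plain-text formatter stands in for the MHEG-5 one because no MHEG-5 output is attempted here.

```python
import html
from typing import Dict, List


class Formatter:
    """Base class that keeps a registry of the available formatters."""
    _registry: Dict[str, "Formatter"] = {}

    @classmethod
    def add(cls, name: str, formatter: "Formatter") -> None:
        cls._registry[name] = formatter

    @classmethod
    def list_names(cls) -> List[str]:
        return sorted(cls._registry)

    @classmethod
    def instance(cls, name: str) -> "Formatter":
        return cls._registry[name]

    def apply(self, title: str, body: str) -> str:
        raise NotImplementedError


class HTML3Formatter(Formatter):
    def apply(self, title: str, body: str) -> str:
        # Escape the text so that it is safe inside markup.
        return f"<H1>{html.escape(title)}</H1>\n<P>{html.escape(body)}</P>"


class PlainTextFormatter(Formatter):
    """Stand-in for a second concrete representation (assumed, not MHEG-5)."""

    def apply(self, title: str, body: str) -> str:
        return f"{title.upper()}\n\n{body}"


Formatter.add("html3", HTML3Formatter())
Formatter.add("text", PlainTextFormatter())
```

An editor-side tool could then call `Formatter.instance("html3").apply(...)` on any descriptor's content, and adding a new concrete representation would only require registering one more subclass.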

It should be noted that the media types themselves also participate in this format-

ting process by producing the suitable media formats to be used in the concrete repre-

sentations.

Figure 4. The Formatter class structure. [Class diagram: Formatter (+apply, +$add, +$list, +$instance) with subclasses HTML3_Formatter (+apply) and MHEG5_Formatter (+apply).]

In the first stage of development, the MediaObjects of type Video, Audio and Image shall be encoded in one of the formats already supported by the WWW browsers. The Text objects will be converted from the Universal Text Format DTD to the HTML3 DTD using an event-driven translator built on top of the validating SGML parser nsgmls (http://www.jclark.com/) by James Clark. The translator reads the Element Structure Information Set (ESIS) output of nsgmls, calls a specific function for each start and end element in the original document, and produces a document in the wanted output DTD. The tool we use is an extended version of the sgmlspl conversion package (http://www.uottawa.ca/~dmeggins/Index.html) originally written by David Megginson at the University of Ottawa.
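The event-driven idea can be illustrated with a small Python sketch that walks over ESIS-style lines, where a leading "(" marks the start of an element, ")" its end and "-" character data. This is not the sgmlspl-based tool described above: it uses a lookup table instead of per-element functions, handles only the `\n` escape in data lines, and the element names HEADLINE and PARA are hypothetical stand-ins for elements of the source DTD.

```python
import html
from typing import Dict, Iterable, List

# Hypothetical mapping from source element names to HTML tags.
ELEMENT_MAP: Dict[str, str] = {"HEADLINE": "H1", "PARA": "P"}


def translate(esis_lines: Iterable[str]) -> str:
    """Translate ESIS-style event lines into HTML-like markup."""
    out: List[str] = []
    for line in esis_lines:
        command, rest = line[:1], line[1:]
        if command == "(" and rest in ELEMENT_MAP:      # start of an element
            out.append(f"<{ELEMENT_MAP[rest]}>")
        elif command == ")" and rest in ELEMENT_MAP:    # end of an element
            out.append(f"</{ELEMENT_MAP[rest]}>")
        elif command == "-":                            # character data
            out.append(html.escape(rest).replace("\\n", "\n"))
        # Unmapped elements and other commands are skipped in this sketch.
    return "".join(out)


sample = [
    "(ARTICLE",
    "(HEADLINE", "-Agassi wins", ")HEADLINE",
    "(PARA", "-Second game at the Australian Open", ")PARA",
    ")ARTICLE",
    "C",
]
```

Elements without an entry in the table, such as the wrapper ARTICLE here, simply disappear from the output while their content is kept, which is one simple way of mapping a richer source DTD onto a smaller target DTD.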

In the next stage the conversion of the other MediaObject types can be encapsulated within the classes. The instances can then simply be asked to convert themselves to a wanted encoding format.

As the MHEG-5 formatting cannot be verified due to the lack of available engines, its proper testing will be done later. At this point the formatter for MHEG-5 incorporates a class library generated by the snacc compiler (http://remarque.berkeley.edu/~muir/free-compilers/TOOL/ASN1-1.html) from the ASN.1 descriptions for MHEG-5. This class library can be used to construct the scenes for the MHEG-5 presentation and to produce the universal encoding according to the Basic Encoding Rules (BER, ISO).

3.1.5. Semantics

With a separate Semantics class we address, in a systematic way, the problem of classifying the contents of an article apart from its actual representation. The approach we use is a straightforward implementation of rating categories on two levels.

An instance of the Semantics class is attached to each instance of SingleComponent. This helps the editor when he/she plans, for example, a View with contents of a given type. One instance of the Semantics class is also linked to every user, as the users may have personal profiles which determine the semantic contents they are interested in. The user profile can be used to automatically filter information from all available articles to a personal subset of that material.

The semantic metadata must, however, be provided by someone early in the editorial process. We suggest the following scheme for semantic metadata: a fixed set of semantic categories, each having a value from 0 to 5. The bigger the value, the more relevant the information is to the given semantic category. 0 indicates a total absence of relevance to a given category. These categories can have one level of subcategories. In this way we extend the categories but still remain on a level abstract enough not to descend into single keywords which describe the semantics.

Table 1 gives an idea what a semantic metadata entry might look like for an article

describing the wedding ceremony of a celebrated domestic music artist.

The categories must be defined before they are used as they need to be available on

equal basis for all news material generated with this model. The person who assigns

these categories and their respective values to SingleComponents, should ideally stay

the same as otherwise the subjective classification of these entries might vary too

much.
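The rating scheme can be sketched as a small Python class. The category names and values are those of Table 1; everything else (the predefined CATEGORIES table, the signatures of `set` and `query`, and the default of 0 for unrated entries) is an assumption made for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# Categories and subcategories must be defined before use (values from Table 1).
CATEGORIES: Dict[str, Set[str]] = {
    "social event": {"family-related"},
    "geography": {"domestic"},
    "art": {"music"},
}


class Semantics:
    """Ratings from 0 to 5 for categories with one level of subcategories."""

    def __init__(self) -> None:
        self._ratings: Dict[Tuple[str, Optional[str]], int] = {}

    def set(self, category: str, value: int, subcategory: Optional[str] = None) -> None:
        if category not in CATEGORIES:
            raise KeyError(f"unknown category: {category}")
        if subcategory is not None and subcategory not in CATEGORIES[category]:
            raise KeyError(f"unknown subcategory: {subcategory}")
        if not 0 <= value <= 5:
            raise ValueError("rating must be between 0 and 5")
        self._ratings[(category, subcategory)] = value

    def query(self, category: str, subcategory: Optional[str] = None) -> int:
        # Unrated entries count as 0, i.e. a total absence of relevance.
        return self._ratings.get((category, subcategory), 0)


wedding = Semantics()
wedding.set("social event", 5)
wedding.set("social event", 5, "family-related")
wedding.set("geography", 5, "domestic")
wedding.set("art", 2, "music")
```

Rejecting unknown categories at assignment time is one way to enforce the requirement that categories are defined in advance and are available on an equal basis for all material.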

3.2. Using the semantics to filter information

In addition to these categories, the users can type in a list of inclusive and exclusive keywords. The inclusive keywords are used to select articles regardless of their semantic rating, and the exclusive keywords to discard articles.

Now, it is necessary to stress the two-fold nature of Semantics.

1. An editor assigns values to each category from the point of view of his newspaper, i.e. the product.

2. The user can set a profile based on his/her personal interests. For example, value 5 for the sports category will make sure the user gets articles with rating 5 for sports. An example interface for setting the values using Java is shown in Fig. 5.

Table 1: Semantic categories

Categories      Subcategories     Value
social event                      5
social event    family-related    5
geography       domestic          5
art             music             2

As an example, suppose there are five articles with the ratings 1, 2, 3, 4 and 5 for sports. The user says he wants sports with value 4. He will now get all articles whose rating is greater than or equal to 4, i.e. those rated 4 and 5.
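The threshold rule and the keyword lists can be combined in a short Python sketch. Several details are assumptions not stated in the text: keywords are matched against article titles only, exclusive keywords take precedence over inclusive ones, and a profile value of 0 is treated as "not interested".

```python
from typing import Dict, Iterable, List, NamedTuple


class Article(NamedTuple):
    title: str
    ratings: Dict[str, int]  # semantic category -> rating 0..5


def matches(article: Article, profile: Dict[str, int]) -> bool:
    """True if some rating reaches the level the user asked for in that category."""
    return any(level > 0 and article.ratings.get(category, 0) >= level
               for category, level in profile.items())


def select(articles: Iterable[Article], profile: Dict[str, int],
           include: Iterable[str] = (), exclude: Iterable[str] = ()) -> List[str]:
    include, exclude = list(include), list(exclude)
    chosen = []
    for article in articles:
        title = article.title.lower()
        if any(word.lower() in title for word in exclude):
            continue  # exclusive keywords discard the article
        if any(word.lower() in title for word in include) or matches(article, profile):
            chosen.append(article.title)
    return chosen


# The five articles of the example, rated 1 to 5 for sports.
articles = [Article(f"Sports story {n}", {"Sports": n}) for n in range(1, 6)]
```

Calling `select(articles, {"Sports": 4})` returns the stories rated 4 and 5, matching the example above, while an inclusive keyword can pull in a lower-rated story regardless of its rating.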

Once the values have been assigned to different categories and their subcategories,

the following 5 different cases can be considered when retrieving the articles accord-

ing to the ratings. These five cases, A-E, are also depicted in Fig. 6. This figure shows

how the user has set the rating of Sports to 5 and Economics to 3. We will now analyse

each of the five conceptually different articles and see why and when they are

retrieved.

Figure 5. Configuring the user profile with a Java applet

• A: The article is selected because its Sports rating, 5, is equal to the setting in the user’s profile. It will be retrieved to the Sports section if the user wants to create sections according to semantic categories.

• B: The article’s ratings equal the user’s settings for both Sports and Economics, so it will be retrieved to either of those two sections.

• C: The article’s Economics rating is higher than the setting in the user’s profile, so it will be retrieved to the Economics section.

• D and E: Neither of these two articles will be retrieved on the basis of the user’s profile. The only way they would be retrieved is by a random procedure which is activated when there are no articles with high enough ratings. The random selection procedure should, however, take into account the relative values the articles have, e.g. article E would be more likely to appear in the Sports section than D.
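One possible reading of the random fallback for articles such as D and E is a draw weighted by the ratings. The Python sketch below shows that reading only; the Sports ratings given to D and E are hypothetical, since the paper does not state their exact values, and the uniform draw for all-zero ratings is an added assumption.

```python
import random
from typing import Dict


def fallback_pick(ratings: Dict[str, int], rng: random.Random) -> str:
    """Pick one article title at random, weighted by its rating in a category."""
    titles = list(ratings)
    weights = [ratings[title] for title in titles]
    if sum(weights) == 0:
        return rng.choice(titles)  # no information: fall back to a uniform draw
    return rng.choices(titles, weights=weights, k=1)[0]


rng = random.Random(1996)            # seeded only to make the example repeatable
sports_ratings = {"D": 1, "E": 3}    # hypothetical ratings below the user's threshold
picks = [fallback_pick(sports_ratings, rng) for _ in range(1000)]
```

Over many draws E is chosen about three times as often as D under these assumed ratings, which reflects the requirement that E should be more likely than D to appear in the Sports section.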

4. Using the model

4.1. Generating views

Instantiating the class objects and making the associations between these instances will be left to the editor. His task will consist of creating documents and rating them with the semantic metadata. In addition he makes logical associations between the document instances and generates Views. An instance of the View class is an ordered collection of SingleComponent or GroupComponent instances. The main activity of the editor will be the creation of these Views which form a single Publication. An example of the Views is given in Fig. 7. This task is somewhat similar to the one using the Hypermedia Design Model (Garzotto et al., 1995) implemented by (Kessler, 1995). It is worth noticing that the structure will be a graph instead of a tree due to the fact that different Views may have common children.

Figure 6. Example of ratings of two categories. [Plot: axes Sports and Economics, each running from 0 to 5; the user’s settings, Sports 5 and Economics 3, are marked; the points A to E denote the five articles discussed in section 3.2.]

The concrete representation of these Views will be generated when needed. This we call the late binding of views. Each instance will know whether it has already been converted into a given concrete representation and whether it is even its duty to do the conversion. This concept helps us with the distribution scheme described in more detail later in this paper.
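Late binding can be pictured as conversion on first request followed by reuse of the stored result. The Python sketch below assumes a simple per-representation cache inside each View; the `render` method, the `conversions` counter and the `to_html` converter are illustrative names and not part of the model.

```python
from typing import Callable, Dict, List


class View:
    """Ordered collection of component titles, converted only on demand."""

    def __init__(self, title: str, items: List[str]) -> None:
        self.title = title
        self.items = items
        self._cache: Dict[str, str] = {}
        self.conversions = 0  # counts actual conversions, for illustration

    def render(self, name: str, convert: Callable[[str, List[str]], str]) -> str:
        # Convert only if this representation has not been produced before.
        if name not in self._cache:
            self._cache[name] = convert(self.title, self.items)
            self.conversions += 1
        return self._cache[name]


def to_html(title: str, items: List[str]) -> str:
    rows = "".join(f"<LI>{item}</LI>" for item in items)
    return f"<H1>{title}</H1><UL>{rows}</UL>"


view = View("Domestic news", ["Unemployment -10%", "Parliament on leave"])
first = view.render("html3", to_html)
second = view.render("html3", to_html)  # served from the cache
```

The second request does not trigger a new conversion, so a server pays the conversion cost only for representations its clients actually ask for.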

It should be noted that each object in this model will have a unique identifier consisting, for instance, of a date and a positive integer. These identifiers will have an infinite lifetime which guarantees that a single instance of SingleComponent or an instance of Audio can be retrieved from the system at any time. Having the mapping table from identifiers to concrete representations available at all times helps to reduce the storage requirements at the servers. If the concrete representation is not here, we know where to find it.

Figure 7. Two views sharing the same components. [Diagram of two View instances built from GroupComponent and SingleComponent instances. Labels in the figure: International news, Domestic news, Politics, Domestic, Agassi wins, EU directive for music, Sweden declines EMU, France bombing Atolls, Unemployment -10%, Parliament on leave.]

The concrete representation sets the granularity of the identifiers. An HTML docu-

ment is identified using a Uniform Resource Locator (URL) which corresponds to the

level of hypermedia documents (SingleComponents). If the user saves the URL he/she

can find the same SingleComponent at any later time.

4.2. Logging activities

Once a client requests an instance of the Component class, the request will be resolved through a logical resolver. Its implementation is very simple: given a logical name, it finds the corresponding instance and checks whether this object has its physical representation at this server, in which case it simply returns it. In case the concrete representation is missing, it will be either generated on the fly or fetched from the server that has the MediaObjects associated with the Component.
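The three outcomes of the resolver (return the local copy, generate on the fly, or fetch from the server holding the media) can be sketched as follows. This Python fragment is a simplification under assumptions: the Server class, the `holds_media` flag, the placeholder representation string and the logical name "agassi-wins" are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class Server:
    """Hypothetical news server resolving logical names to representations."""

    def __init__(self, name: str, holds_media: bool = False) -> None:
        self.name = name
        self.holds_media = holds_media          # True if the MediaObjects are stored here
        self.representations: Dict[str, str] = {}

    def resolve(self, logical_name: str, origin: Optional["Server"] = None) -> Tuple[str, str]:
        """Return (representation, how it was obtained)."""
        if logical_name in self.representations:
            return self.representations[logical_name], "local"
        if self.holds_media:
            representation, source = f"<generated {logical_name}>", "generated"
        elif origin is not None:
            representation, _ = origin.resolve(logical_name)
            source = "fetched"
        else:
            raise LookupError(logical_name)
        self.representations[logical_name] = representation  # keep for later requests
        return representation, source


origin = Server("origin", holds_media=True)
edge = Server("edge")
first_request = edge.resolve("agassi-wins", origin)
second_request = edge.resolve("agassi-wins", origin)
```

In this sketch the first request at the edge server is fetched from the origin, which generates the representation, and the second request is answered locally.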

Once a document is retrieved from the server, a new association between a user-specific LogBook instance and the retrieved Component will be created. On a regular basis, these associations can be collected to one central server which can perform a global analysis of the user behaviour. Figure 8 shows a user who has fetched three Components and annotated one of them.

Figure 8. Logging activities and annotations using LogBook and Annotation. [Diagram: a User instance (Janne Saarela) with a LogBook and an Annotation linked to three SingleComponent instances. Labels in the figure: Agassi wins, EU directive for music, EU Unemployment -10%.]

These log entries will work as a basis for the intelligent producer agents which

work at the server end analysing the clients. The agents are described in more detail

in (Turpeinen, 1995).
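As a rough illustration of the logging described in this section, the following Python sketch records retrievals in per-user LogBook objects and merges them centrally. The functions collect and readers_of, and the use of plain strings as Component names, are assumptions for the example; the paper does not specify how the associations are stored.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class LogBook:
    user: str
    retrieved: List[str] = field(default_factory=list)

    def record(self, component_name: str) -> None:
        """Create a new association between this LogBook and a Component."""
        self.retrieved.append(component_name)


def collect(logbooks: Iterable[LogBook]) -> Counter:
    """Central analysis step: count distinct readers per Component."""
    counts: Counter = Counter()
    for book in logbooks:
        counts.update(set(book.retrieved))  # each user counts once per Component
    return counts


def readers_of(logbooks: Iterable[LogBook], component_name: str) -> List[str]:
    """Trace the users who have retrieved a given Component."""
    return sorted(b.user for b in logbooks if component_name in b.retrieved)
```

A producer agent could start from counts such as these when analysing which Components attract the most readers.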

4.3. Annotating components

The user will be given a chance to evaluate the Components by giving, for example, a

value ranging from 0 to 5; 0 indicating no interest at all, 3 some interest and 5 I-want-

more-of-these. These instances of the Annotation class will be linked from the User

instance to several Components.

The annotations can later be processed and used in the analysis of the user profile

to find out what types of articles the user finds interesting. Letting users see

other people’s ratings facilitates social filtering, first introduced in (Malone et al.,

1987).
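One possible reading of social filtering over these annotations is sketched below in Python. The 0 to 5 rating range comes from the text; the averaging, the threshold of 3 and the recommend function are illustrative assumptions and not part of the described model.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class Annotation:
    user: str
    component: str
    rating: int  # 0 = no interest at all, 5 = I-want-more-of-these

    def __post_init__(self) -> None:
        if not 0 <= self.rating <= 5:
            raise ValueError("rating must be between 0 and 5")


def average_ratings(annotations: List[Annotation]) -> Dict[str, float]:
    """Average rating per Component over the given annotations."""
    by_component: Dict[str, List[int]] = {}
    for a in annotations:
        by_component.setdefault(a.component, []).append(a.rating)
    return {name: mean(ratings) for name, ratings in by_component.items()}


def recommend(annotations: List[Annotation], user: str,
              threshold: float = 3.0) -> List[str]:
    """Components rated well by other users that this user has not annotated."""
    seen = {a.component for a in annotations if a.user == user}
    others = [a for a in annotations if a.user != user]
    return sorted(name for name, avg in average_ratings(others).items()
                  if avg >= threshold and name not in seen)
```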

5. Implications of the model

5.1. Lost concept of an edition

The edition of a hypermedia newspaper complying with the model is a collection of

the latest versions of the introduced objects. An editor can generate Views once a day

to give the reader a daily newspaper, but the MediaObjects can be introduced at any

time. A new version of an object replaces the old one. The new version then has a his-

tory link to the previous versions. Figure 9 presents this situation.
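The idea of an edition as the set of latest object versions, each with a history link, can be sketched as follows. This is a simplified in-memory Python illustration with hypothetical names (VersionedObject, Repository, introduce); the actual system relies on the versioning mechanism of an object database.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VersionedObject:
    name: str
    content: str
    version: int = 1
    previous: Optional["VersionedObject"] = None  # history link to the old version

    def history(self) -> List["VersionedObject"]:
        """Return this version followed by all earlier ones, newest first."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.previous
        return chain


class Repository:
    def __init__(self) -> None:
        self._latest: Dict[str, VersionedObject] = {}

    def introduce(self, name: str, content: str) -> VersionedObject:
        """Introduce an object at any time; a new version replaces the old one."""
        old = self._latest.get(name)
        new = VersionedObject(name, content,
                              version=old.version + 1 if old else 1,
                              previous=old)
        self._latest[name] = new
        return new

    def latest_edition(self) -> Dict[str, VersionedObject]:
        """The edition is simply the collection of latest versions."""
        return dict(self._latest)
```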

It is also possible for the editor to create new articles during the day. Once he or she

introduces a new SingleComponent into the system, he or she has to update either a

View or a GroupComponent in order to reflect the new association to this article. It is

once again worth noting that a user who does not want to go through the predesigned

Views can take advantage of the filtering process and thus include the new article in

his or her personal newspaper.

5.2. Object lifetime

As the number of objects grows large, an object-collection process can

be run on the material. This process can move old objects (old in age or superseded

versions) into long-term storage. The object identifiers remain available at each server,

thus providing a way to find old material that has been moved to a specific storage system.

To avoid a large table of identifiers being stored at each server, the identifiers could

be structured, for example, so that a known prefix redirects a request to a specific

server. A request such as /obj/text/archive487 would be interpreted and the

archive part would be enough to redirect this request to a long-term text storage

server.

Figure 9. Latest edition defined in terms of latest objects. (Diagram labels: View, version link, assigned link, Object repository, Latest edition.)
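The prefix-based redirection suggested in Section 5.2 might look like the following Python sketch. The identifier grammar (/obj/, a media type, then a prefix followed by digits), the routing table and the server name text-archive.example are assumptions built around the single example /obj/text/archive487 given in the text.

```python
import re
from typing import Dict, Tuple

# Hypothetical routing table: (media type, identifier prefix) -> server name.
ROUTES: Dict[Tuple[str, str], str] = {
    ("text", "archive"): "text-archive.example",
}

# Assumed identifier shape: /obj/<media>/<prefix><number>
_ID_PATTERN = re.compile(
    r"^/obj/(?P<media>[a-z]+)/(?P<prefix>[a-z]+)(?P<number>\d+)$"
)


def route(identifier: str, local_server: str = "local") -> str:
    """Return the server that should handle the request for this identifier."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    key = (match["media"], match["prefix"])
    # Known prefixes are redirected; everything else is served locally.
    return ROUTES.get(key, local_server)
```

Because only the prefix is inspected, a server needs one table entry per storage system rather than one per archived object.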

6. An architectural framework

All the benefits of this model can be achieved by using a persistent store for storing

the objects. The objects and their associations will remain the same from one invoca-

tion of the server to another which provides us with a solid basis for the framework.

Object database management systems (Cattell, 1994) can provide many different

features we find useful. In addition to persistence, a concept of versioning is necessary.

An electronic newspaper can be taken as a collection of the latest versions of the

objects in the logical model. Every time a new version of an existing object is intro-

duced into the model, the old ones still remain available through the versioning

mechanism.

The object model that the object database uses may not be final at the early stage of

the design process. Changes to the model will cause problems if the ODBMS is not

capable of handling them. Some of the available databases can adapt to modifications

in the attributes, methods, types and inheritance chains. We find this adaptiveness

crucially important.

Among other features, a query language seems appropriate. A user who has read a given

article can, for example, be traced with a query expression. The language should,

however, be compatible with an object-oriented language such as C++. Several object

databases provide this, and they also provide an ad hoc query facility when an ODMG-93

compliant interface is missing.

6.1. A distribution scheme

Distributing the instances of the logical model to all of the servers serving the

newspaper is necessary as we wish to have the same material available at each site.

Transferring only the instances over a network to other servers appears more efficient

than transferring all the concrete representations of media as well. This plan is

shown in Fig. 10.

Due to the huge amount of information contained in electronic newspapers, we

plan to distribute the origin of different types of media across different servers. This

leaves room for optimising the server load and storage for a given situation. A

server that is missing the concrete representation of an object can choose whether to

keep a local copy (cache) of the representations it fetches or delete them as soon as

they have been served. An example situation where a server is missing the Text for a

Component and fetches it from a text archive is presented in Fig. 11.

Figure 10. Replicating instances instead of concrete representations. Steps shown in the diagram:

1. editor introduces a new component into system

2. receiving server stores the MediaObjects locally

3. server replicates the instances of the logical model (labelled 3a, 3b, 3c)

Figure 11. The connected server does the late binding of views. Steps shown in the diagram:

1. request a Component

2. request the Text part of the Component

3. return the Text

4. create the concrete representation and cache it

5. return the concrete representation

Server labels appearing in the diagrams: text archive, live video feed, voting system.

It is also possible to classify servers into several levels of functionality.

A basic server might only hold the concrete representations of the objects and connect

to the network only once a day, when it retrieves the daily newspaper. Other

superservers could communicate and update the news material at any time and thus

approach a real-time electronic newspaper.

The centralised method of the MediaObject class tells whether a given instance can be

distributed to several servers or if it should remain at a single server from where cli-

ents request it every time they need it. Should there be an application, say a voting

system, whose result influences other objects, say a pie chart image, the application

(instance of the Dynamic class) should remain centralised at one server in order to pro-

vide all the clients with the same output.
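The role of the centralised method can be illustrated with a short Python sketch. Only the method name centralised and the class names MediaObject and Dynamic come from the text; treating Dynamic as a subclass of MediaObject, the home_server attribute and the plan_replication function are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class MediaObject:
    name: str
    home_server: str  # hypothetical: the server where the object was introduced

    def centralised(self) -> bool:
        """Ordinary media may be copied to several servers."""
        return False


@dataclass
class Dynamic(MediaObject):
    def centralised(self) -> bool:
        """An application such as a voting system must give every client the same output."""
        return True


def plan_replication(objects: Iterable[MediaObject],
                     servers: Iterable[str]) -> Dict[str, List[str]]:
    """Map each object name to the servers that may hold it."""
    all_servers = set(servers)
    plan: Dict[str, List[str]] = {}
    for obj in objects:
        if obj.centralised():
            plan[obj.name] = [obj.home_server]  # stays at a single server
        else:
            plan[obj.name] = sorted(all_servers | {obj.home_server})
    return plan
```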

Application-level control of the distribution can be implemented in many ways.

These aspects are discussed in (Korkea-aho, 1995).

7. Results

The object model has been implemented on a single Objectivity/DB database which

provides a clean interface for object management. Features such as versioning, clustering

and indexing are used in the implementation. Some tests have been conducted

where articles have been composed of separate media objects and higher level struc-

tures have been designed to provide the GroupComponent and View levels of this

model. Simple formatter subclasses have also been designed which can be used to set

the layout for the presentable units such as SingleComponents. An example of this is

presented in Fig. 12.

The algorithms for generating the contents and the structure of the newspaper have

also been verified and they have proven to work with the simple rating system presented

in this paper.

The ease with which a whole Publication can be created is clearly an advantage

compared to the old-fashioned editing of HTML documents. It remains to be seen,

however, how well the editors at the newspaper companies adapt to editing a

new type of product.

What still remains to be verified is the efficiency of using late binding of views

instead of transferring the concrete representations over the network in a distributed

environment. The current implementation does the late binding of views but only in a

centralised manner.

Figure 12. A SingleComponent formatted by a Formatter class instance in HTML

8. Conclusions

A logical model of an electronic newspaper has been described. It aims for a next-generation

newspaper which is not only the counterpart of the printed version but also brings

added value by introducing several features:

• The documents are described in a structural manner independent of any specific

presentation format such as HTML or MHEG-5. Separate Formatter class instances

are applied to the structures to produce the wanted layout;

• The material is distributed at the level of this model without binding the structure

to any specific presentation format before it is required. The servers are also capable of

determining what level of functionality they provide to their clients, as different

servers have different storage and processing resources available;

• The contents are rated with a simple two-level category classification system which

enables efficient filtering of information and thus personalised views inside

the electronic newspaper;

• The users are allowed to annotate the articles. This enables social filtering, i.e. the

possibility to follow popular articles read and annotated by other users.

We believe this is the most suitable framework for an electronic newspaper. It pro-

vides a solid basis for the whole editorial process starting off from traditional page

layout systems such as QuarkXPress. The process consists of composing single

presentable units from media objects. After this, a higher structure can be designed,

consisting of groups of presentable units, views collecting together groups and articles,

and finally publications, each with their own brand.

Once the structure has been designed, the editor attaches specific formatting

instructions to the structures on how to lay them out in different presentation

environments. The product is then distributed using a scheme which binds the

structures with the formatting instructions at the server from which they are requested.
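The separation between structure and formatting, bound together only at request time, can be sketched in Python as follows. Formatter and SingleComponent are names from the model; the fields headline and body, the two concrete formatter classes and the serve function are hypothetical and chosen only to keep the example small.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from html import escape


@dataclass
class SingleComponent:
    headline: str
    body: str


class Formatter(ABC):
    @abstractmethod
    def format(self, component: SingleComponent) -> str:
        """Produce a concrete representation for one presentation environment."""


class HTMLFormatter(Formatter):
    def format(self, component: SingleComponent) -> str:
        return (f"<h1>{escape(component.headline)}</h1>\n"
                f"<p>{escape(component.body)}</p>")


class PlainTextFormatter(Formatter):
    def format(self, component: SingleComponent) -> str:
        return f"{component.headline.upper()}\n\n{component.body}"


def serve(component: SingleComponent, formatter: Formatter) -> str:
    """Late binding: structure and layout meet only when a client asks."""
    return formatter.format(component)
```

The same SingleComponent can thus be served in several presentation formats without storing any of them in advance.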

Personalisation is enabled through the use of a semantic rating system which

allows for personal generation of the contents and the structure of the product. Other

added-value features such as social filtering are also possible with the help of users’

annotations.

Acknowledgements

This research was supported by the Finnish Technology Development Center

(TEKES), Nokia and the Aamulehti corporation.

Thanks to Mikael Honkala for analysing the OMT model and reviewing the paper.


References

Buford, J.F.K. (1994). Multimedia Systems, Addison-Wesley.

Cattell, R.G.G. (1994). Object Data Management, Addison-Wesley.

Garzotto, F., Mainetti, L., Paolini, P. (1995). Hypermedia Design, Analysis and Evaluation Issues, Communications of the ACM, 38(8).

Gibbs, S.J., Tsichritzis, D.C. (1994). Multimedia Programming, Addison-Wesley.

Halasz, F., Schwartz, M. (1994). The Dexter HyperText Reference Model, Communications of the ACM, 37(2).

International Organization for Standardization. Abstract Syntax Notation One (ASN.1): Specification of basic notation. ISO/IEC 8824-1.

International Organization for Standardization. ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). ISO/IEC 8825-1.

International Organization for Standardization. Document Style Semantics and Specification Language, DSSSL, ISO/IEC DIS 10179.

International Organization for Standardization. MHEG Object Representation, Base notation (ASN.1), ISO/IEC DIS 13522-1.

International Organization for Standardization. Presentation Environments for Multimedia Objects (PREMO), ISO/IEC CD 14478-1.

International Organization for Standardization. Standard Generalized Markup Language (SGML), ISO 8879.

Isakowitz, T., Stohr, E.A., Balasubramanian, P. (1995). RMM: A Methodology for Structured Hypermedia Design, Communications of the ACM, 38(8).

Kessler, M. (1995). A Schema-Based Approach to HTML Authoring. World Wide Web Journal, Fourth International World Wide Web Conference Proceedings, O’Reilly & Associates.

Korkea-aho, M. (1995). Scalability in Distributed Multimedia Systems, Technical report TKO-B128, Helsinki University of Technology.

Malone, T.W., Grant, K.R., Turbak, F.A., Brobst, S.A., Cohen, M.D. (1987). Intelligent Information-sharing Systems, Communications of the ACM, 30(5), pp. 390-402.

Turpeinen, M. (1995). Agent-mediated personalised multimedia services, Technical report TKO-B125, Helsinki University of Technology.