Logical structure of a hypermedia newspaper
Janne Saarela, Marko Turpeinen, Tuomas Puskala,
Mari Korkea-aho, and Reijo Sulonen
Department of Computer Science
Helsinki University of Technology
02150 Espoo, Finland
Email: [email protected]
Corresponding author:
Janne Saarela
Tel.int: +358-0-451 3246
Fax.int: +358-0-451 3293
Submitted: 11.3.1996
Abstract: The OtaOnline project at the Helsinki University of Technology has been distributing ordinary newspapers on the Internet since 1994. The editors produce the electronic counterpart of the ordinary papers by converting QuarkXPress documents to HyperText Markup Language. The project is about to enter a new phase by introducing an information model which describes the logical structure of a future newspaper and thus enables the use of distributed newspaper servers and intelligent producer agents.
This paper describes an object-oriented approach which implements a logical model of a hypermedia newspaper. This model encapsulates the structure of the hypermedia documents as well as their capability of transforming into different presentation formats. It also provides a semantic rating mechanism to be used with intelligent agents. A distribution scheme which enables efficient use of this model is also presented.
1. Introduction
1.1. OtaOnline
OtaOnline is a testbed, implemented at the campus of the Helsinki University of Tech-
nology, for experimenting with the possibilities of the net media and electronic pub-
lishing. The testbed has been built around an existing high speed network, ubiquitous
in the Otaniemi campus area.
OtaOnline consists of a product development project and a research project. These
two sub-projects share many common R&D efforts, but the ultimate goals are differ-
ent. The OtaOnline product development is primarily interested in experimenting
with the net media concept. The usage of the net media is carefully studied. New
product concepts and enhancements are introduced on a regular basis. The OtaOnline
research project concerns the problems of large-scale implementation of net media.
Four main research areas are production systems, scalable distribution systems, struc-
ture modeling of multimedia products, and personalisation of multimedia services.
1.2. Problem domain
This paper discusses the implementation of a logical structure of a hypermedia news
server which fulfills several requirements set for servers of this type. These
requirements have been identified as follows:
1. Control over the structure — an electronic newspaper should be designed on two
distinct levels: at the document level, which combines different media types into
one unit, and at the publication level, which creates relations between the
documents.
2. Control over presentation — the presentation of the documents should be defined
separately from the structure in order to enable several concrete representations
from one high-level representation.
3. Versioning — an edition of an electronic newspaper can be conceived to consist of
the objects which have most recently been introduced to the system. This enables
the paper to have real-time properties and update its contents at any time.
4. Distribution — if the descriptive part of the components to be presented is sepa-
rated from the actual data, the descriptive elements can be easily distributed and
replicated among several hypermedia servers. In this way the servers can deter-
mine the material they store and serve according to their own attributes.
5. Metadata — the reusability of the hypermedia documents is much better if the doc-
uments have separate information about themselves. This information enables the
personalisation aspect of a newspaper as it can be used as a basis for personal
views inside a newspaper.
Moreover, the functionality of this logical model will focus on the following
aspects:
1. Generation of the presentable hypermedia documents;
2. Distribution of the objects of this framework;
3. Added value features facilitated by the model.
1.3. Concrete representations
Different hypermedia representations are available for our purposes. The World-Wide
Web family of formats and protocols enables easy linking and presentation of
hypertext documents in the HTML format. The MHEG-5 ISO standard is still a draft
but is expected to become popular on platforms other than the Internet, such as
Set Top Boxes (STBs) connected to ordinary TV sets.
The term concrete representation will be used to describe the final presentation
format of the client host. HTML, Java and MHEG-5 concrete representations are
described in more detail below.
1.3.1. HTML, Java
The HyperText Markup Language, HTML, was developed as a part of the World-Wide
Web project in the early 1990s. At first it was only an SGML look-alike, but it
was formally defined and accepted as an Internet standard a few years later.
HTML is a specific Document Type Definition (DTD) defined by means of the
SGML standard. It describes the logical structure of a document, although it
also contains presentation-oriented markup elements.
HTML version 2.0 supports markup for 6 levels of headings, three types of lists,
text emphases, link anchors and inlining images within the documents. A special
FORMS collection of markup elements enables interaction and passing information
from the client to the server.
HTML is currently heading towards a standardised version 3.0, which extends version
2.0 by introducing a table model, better image control and mathematics. Private
vendors are also adding their own features to the language in pursuit of more
customers. Netscape Communications Corp. has recently introduced version 2.0 of its
popular Netscape browser, which supports further features such as frames and
scripting languages.
Java is a language developed by Sun Microsystems, Inc. The language is based on
C++ and introduces features from Objective C. It eliminates some of the most common
problems in object-oriented programming by removing pointers and multiple
inheritance and by providing garbage collection in the run-time environment. On the
World-Wide Web the language is interpreted by a browser, the first of which was
HotJava, also developed by Sun. Netscape browser version 2.0 brings this interpreter
to a wide variety of platforms and offers the possibility of bringing HTML
documents alive.
Java can be used to write applets, applications or protocol handlers. Applets are
programs which are included within HTML documents. They are identified using the
APPLET element. Applications are standalone Java programs which need not be
within HTML documents. Protocol handlers can be used to write handlers for media
types which the browser does not support. The output of these handlers should be
one of the known and supported media types.
The following markup tells the browser to fetch the corresponding applet from the
server and run it. The applet has already been compiled into bytecode by a Java
compiler. The bytecode consists of machine-independent instructions which run on
any Java runtime system.
<APPLET CODE="StockDisplay.class" WIDTH=100 HEIGHT=50>
</APPLET>
Java has a supported API which is a class library consisting of classes for graphical
user interface, IO, network and utilities such as containers (collection classes). Java
also supports the notion of threads which enable concurrent execution of applications.
Concurrency can be controlled using the monitor and condition variable paradigm.
1.3.2. MHEG, PREMO
Two international standards are emerging as the most prominent candidates for mul-
timedia interchange and presentation. MHEG, the Multimedia/Hypermedia Experts
Group, has suggested an object-oriented framework (ISO, MHEG) where the primi-
tives of the standard are classes. The instances of these classes can be set to perform
synchronised actions and user interaction using methods which encapsulate the func-
tionality within the classes. The focus of the standard is in the interchange of multime-
dia presentations and it uses ASN.1 (ISO, ASN.1) for transmitting the description of
the structure of the presentation and the media data itself. MHEG engines for running
MHEG presentations are not yet widely available.
Part 1 of the proposed MHEG standard is a large collection of classes, of which
part 5 specifies only a restricted subset. This subset is expected to be widely used
in Set Top Boxes (STBs) with future television sets. Part 3 specifies scripting
support whose implementation is still left open. Plans to use virtual machines
similar to the one Java uses have been proposed.
Part 5 defines a class library which can be used to create scenes, the basic units
of presentation. The instances of the class library are described along with the
basic units of interaction: links and actions. Every object can have several
different actions associated with it, which provides a way to describe the
functionality as well.
PREMO, Presentation Environments for Multimedia Objects (ISO, PREMO), shifts
focus to the interactivity of the final presentations and enables methods for creating
and modifying the application models of the final presentation. PREMO uses object
technology but differs from MHEG in the sense that it controls the presentation
details and could even be used to construct an MHEG engine.
An example scene with MPEG video is presented below.
(:scene Video
  (:items
    (:video
      (:size 320 100)
      (:position 0 0)
      (:hook #MPEG1)
      (:data (:name "queen.mpg")
        (:storage-mode #stream)
      )
    )
  )
)
1.3.3. Organization of this paper
This paper is organised as follows: section 2 describes an ideal situation we would
like the users of our servers to experience. Section 3 presents the logical model
which enables the ideal use and goes through its major concepts. Section 4 shows how
this model will actually be used. Section 5 discusses the implications of this
model, and section 6 presents a framework in which this model will work. Section 7
gives ideas on future work and section 8 concludes the paper.
2. An ideal scenario
John, our example user, enters OtaOnline by connecting to an OtaOnline server. Since
John is at his home workstation, he uses a WWW browser to read OtaOnline. At work
he might also have chosen to use an MHEG-5 engine. The server uses public-key
cryptography to authenticate John as a legitimate user for session logging purposes
and billing of services.
The welcoming screen has many different views based on the service selections that
John has made during earlier sessions. The first view brings the news of general inter-
est as an edited issue of latest news items (similar to current OtaOnline contents). The
second view that John has selected is a special edition for classical music enthusiasts,
including concert reviews, interviews, high quality audio and video samples of
recordings. The third view is dynamically constructed according to John’s personal
preferences described in his user model.
The status of each view is visible on screen. Static (i.e. general) views should be
accessible after a short transmission delay. The dynamic (i.e. personal) view is bound
at this stage and John is informed when the personalised view is readable. John can
also order the personalised view to be constructed at a specific time of day.
John enters the general news view and selects the first document of interest. The
news document contains text, images, two synchronised video streams, and an
application for entering and viewing user annotations related to the story. The
navigation system includes a visual map of the current view and the position of the
current story in the view. John can now move from one article to another inside the
view.
The view consists of latest documents as well as links to older documents related to
the current document (news threads). The view may include these older documents,
or there may be predefined queries to fetch relevant articles from the article archive, if
John wants to browse the related stories of interest.
John is able to rate each article, using a predefined scale (0-10). He thinks the cur-
rent article was worth reading and gives the article rating 7. These annotations are
used in specifying how interesting a given article is. They are used in changing John’s
user preferences.
John enters the module where he can view and change his registered user profile.
The profile is based on predefined semantic groups that have an importance value.
John wants to see more motor sports news in his personal profile so he changes the
“Sports/Motor” parameter from 6 to 8.
Information filtering methods are used to suggest articles to read. This ranked list
of interesting material is part of John’s personal view. The selection is based on the
user profile and the annotations other users have made related to current material.
The billing of service is based on the data collected in the server log about the user
session.
3. Going for the goal
3.1. An object model describing the logical structure
The object model which enables the functionality and flexibility we want to achieve is
presented in Fig. 1. We motivate each class and its relations with other classes in
the following subsections.
This model will be used to provide one high-level representation from which several
concrete representations can be derived. This is also known as multi-purpose
publishing. By this we mean the capability of supporting different presentation and
distribution environments.

Figure 1. The logical structure presented by the means of OMT
[Class diagram not reproduced. Classes and members recoverable from the figure:
Descriptor (+id, +description, +author, +date, +toHTML3, +toMHEG5);
Publication, View, GroupComponent, SingleComponent (each +toHTML3, +toMHEG5); Component;
MediaObject (+id, +description, +toHTML3, +toMHEG5, +centralized); Temporal; NonTemporal;
Text, Image (each +toHTML3, +toMHEG5); Audio, Video, Dynamic (each +duration, +toHTML3, +toMHEG5);
Relation (+target, +name, +$listNames);
Semantics (+set, +query, +$listCategories, +$listSubCategories);
User (+name, +age, +profession); LogBook; Annotation; IdMapper (+$newId, +$map).
Association labels: consists of, has, has background, old version, in relation with,
points to, made annotations, traversed items, rating, user profile.]
This model can be used to structure the contents of an electronic newspaper on two
separate levels, much in the same way as (Garzotto et al., 1995) do:
1. Authoring in the Small (AIS) — how a single hypermedia document (here a Single-
Component) is structured. This is accomplished using the Relation class underneath
the SingleComponent class in Fig. 1. The Relation class provides a dynamic way to
name relations to different MediaObjects. This level of structuring is also similar to
the within-component layer of the Dexter Hypertext Reference Model (Halasz et
al., 1994).
2. Authoring in the Large (AIL) — how the documents (here SingleComponents) are
structured to form an electronic newspaper. This is accomplished using the classes
Publication, View and GroupComponent.
3.1.1. Media types
Different media types supported by the model are Audio, Video, Dynamic, Text and
Image. Each of these types is implemented as a class which has the functionality of
converting the actual contents into different formats. The variety of formats is due to
the fact that HTML3 browsers may natively support different formats than MHEG-5
or other concrete representation engines. The media classes share the same base class,
MediaObject, which remains an abstract class but provides a method for identifying
each object uniquely within the framework.
The Dynamic class is a special class holding a small program or a routine to be used
in the final presentation. A logical name is assigned to each instance of the Dynamic
class in order to identify it. These instances access the ProgramLibrary class to see
whether the actual executable code can be found for a given representation. For
example, a stock market display might be available for Java but not for MHEG.
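
The lookup described above can be sketched as a small registry. This is only an illustration under stated assumptions: the Python names (add_java, add_mheg, find) and the stored strings are hypothetical and are not taken from the actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Routine:
    """A logically named routine with optional code per concrete representation."""
    name: str
    java: Optional[str] = None   # hypothetical: name of a compiled applet class
    mheg: Optional[str] = None   # hypothetical: reference to an MHEG script


class ProgramLibrary:
    """Class-level registry mapping logical names to Routine entries."""
    _routines: Dict[str, Routine] = {}

    @classmethod
    def add_java(cls, name: str, code: str) -> None:
        cls._routines.setdefault(name, Routine(name)).java = code

    @classmethod
    def add_mheg(cls, name: str, code: str) -> None:
        cls._routines.setdefault(name, Routine(name)).mheg = code

    @classmethod
    def find(cls, name: str, representation: str) -> Optional[str]:
        """Return the code for a representation, or None if it is unavailable."""
        routine = cls._routines.get(name)
        if routine is None:
            return None
        return {"java": routine.java, "mheg": routine.mheg}.get(representation)


# The stock market display exists for Java but not for MHEG.
ProgramLibrary.add_java("stock-display", "StockDisplay.class")
java_code = ProgramLibrary.find("stock-display", "java")
mheg_code = ProgramLibrary.find("stock-display", "mheg")
```

A Dynamic instance would only carry the logical name ("stock-display" here) and ask the library at formatting time whether code exists for the requested representation.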
3.1.2. Media representations
We decided to encode the different media types in the following formats in
the first stage of development. The next stage will be to reuse these formats to
produce other formats if necessary.
• JPEG for Image — a format widely used in the news industry;
• MPEG level 1 for Video — a common format for video content producers;
• AIFF for Audio — a format defined primarily for audio interchange;
• SGML for Text (Universal Text Format DTD) — an international standard which
enables high-level abstraction of documents independent of their presentation. It
thus provides an excellent master representation for text documents from which
several other representations can be derived.
3.1.3. Synchronising media
Introducing temporal media within the model poses the problem of synchronisation.
Possible solutions to this problem can be categorised into two mutually exclusive
choices:
1. logical synchronisation — something happens before, after or at the same time with
another event;
2. timely synchronisation — something takes place at time t1 having duration d1,
something else taking place at t2 having duration d2.
We adopt the concept of timed streams proposed by (Gibbs, 1994) which follows the
second paradigm. A timed stream consists of a sequence of tuples of the form
<e_i, s_i, d_i>, i = 1, ..., n, where e_i are the media elements (instances of
MediaObject), s_i the start times and d_i the durations (in seconds). These tuples
can be generated by the editor as a part of the editorial process. A graphical tool
that supports visual cues for synchronisation will be necessary. For example, a
Gantt chart gives an intuitive idea of the concurrent and serial activation of
temporal objects.

Figure 2. The ProgramLibrary structure
[Class diagram not reproduced. Classes and members recoverable from the figure:
ProgramLibrary (+$addJava, +$addMHEG, +$list, +$instance); Routine (+name, +java, +MHEG).]
In the first stage of development we do not enable the editors to set the time points
for synchronisation. This design activity can, however, be later easily added as the
Relation class provides a place for storing the time points.
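
A timed stream of this kind can be sketched as a list of (element, start, duration) tuples. The sketch below is illustrative only: the element labels stand in for MediaObject instances, and the half-open interval convention (start inclusive, end exclusive) is our assumption, not something prescribed by (Gibbs, 1994).

```python
from typing import List, NamedTuple


class TimedElement(NamedTuple):
    element: str      # stands in for a MediaObject instance (e_i)
    start: float      # start time s_i in seconds
    duration: float   # duration d_i in seconds

    @property
    def end(self) -> float:
        return self.start + self.duration


def active_at(stream: List[TimedElement], t: float) -> List[str]:
    """Return the elements playing at time t (start inclusive, end exclusive)."""
    return [e.element for e in stream if e.start <= t < e.end]


# Hypothetical stream: a video, an overlapping audio track, then a still image.
stream = [
    TimedElement("video: last serve", 0.0, 20.0),
    TimedElement("audio: commentary", 5.0, 10.0),
    TimedElement("image: trophy", 20.0, 8.0),
]
concurrent = active_at(stream, 6.0)
later = active_at(stream, 21.0)
```

A Gantt-style editing tool would draw each tuple as a bar from start to end; active_at corresponds to reading the chart along a vertical line at time t.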
Figure 3 introduces an example construction of one SingleComponent holding sev-
eral MediaObjects.
3.1.4. Conversion between representations
The philosophy we want to stress with this model is the separation of structure and
presentation, similar to SGML (ISO, SGML), which is a standard for marking up the
structure of a document, and DSSSL (ISO, DSSSL), which is a standard for
restructuring and formatting SGML documents.
Figure 3. A SingleComponent has named relations to three MediaObjects
[Diagram not reproduced. The SingleComponent "Agassi wins", with Semantics Sports=5 and
Sports/ball=5, has three Relations named "abstract", "main image" and "main video" to a
Text ("Agassi wins 2nd game at the Australian Open"), an Image ("Agassi holding trophy")
and a Video ("Agassi’s last serve").]
In our model the formatting of the Components is implemented separately from the
structure using the Formatter class. This class can have several subclasses each of
which does a specific type of formatting. For example, figure 4 shows two different
formatters, one for HTML3 and one for MHEG-5. The base class, Formatter, holds a list
of all available formatters. The editor can apply one of these formatters to any Descrip-
tor subclass and thus produce the wanted concrete representation.
It should be noted that the media types themselves also participate in this format-
ting process by producing the suitable media formats to be used in the concrete repre-
sentations.
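
The formatter arrangement can be sketched as a base class holding a registry of subclass instances. This is a minimal sketch, not the actual class library: the registry keys, the dictionary used as a descriptor, and the output strings are assumptions, and the MHEG-5 output in particular is a placeholder rather than a real MHEG-5 encoding.

```python
import html
from typing import Dict, List


class Formatter:
    """Base class; keeps a list of all available formatters."""
    _registry: Dict[str, "Formatter"] = {}

    @classmethod
    def add(cls, name: str, formatter: "Formatter") -> None:
        Formatter._registry[name] = formatter

    @classmethod
    def available(cls) -> List[str]:
        return sorted(Formatter._registry)

    @classmethod
    def instance(cls, name: str) -> "Formatter":
        return Formatter._registry[name]

    def apply(self, descriptor: Dict[str, str]) -> str:
        raise NotImplementedError


class HTML3Formatter(Formatter):
    def apply(self, descriptor: Dict[str, str]) -> str:
        return f"<H1>{html.escape(descriptor['description'])}</H1>"


class MHEG5Formatter(Formatter):
    def apply(self, descriptor: Dict[str, str]) -> str:
        # Placeholder text only; not a valid MHEG-5 scene.
        return f"(:scene {descriptor['id']})"


Formatter.add("html3", HTML3Formatter())
Formatter.add("mheg5", MHEG5Formatter())

descriptor = {"id": "19960311-1", "description": "Agassi wins"}
as_html = Formatter.instance("html3").apply(descriptor)
```

Keeping the registry on the base class means a new concrete representation is added by registering one more subclass, without touching the Descriptor classes.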
In the first stage of development, the MediaObjects of type Video, Audio and Image
shall be encoded in one of the formats already supported by the WWW browsers. The
Text objects will be converted from the Universal Text Format DTD to the HTML3 DTD
using an event-driven translator built on top of the validating SGML parser nsgmls¹
by James Clark. The translator reads the Element Structure Information Set (ESIS)
output of nsgmls, calls a specific function for each start and end element in the
original document, and produces a document in the desired output DTD. The tool we
use is an extended version of the sgmlspl² conversion package originally written by
David Megginson at the University of Ottawa.

1. http://www.jclark.com/
2. http://www.uottawa.ca/~dmeggins/Index.html

Figure 4. The Formatter class structure
[Class diagram not reproduced. Classes and members recoverable from the figure:
Formatter (+apply, +$add, +$list, +$instance); HTML3_Formatter (+apply); MHEG5_Formatter (+apply).]
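
The event-driven translation can be illustrated with a small reader for ESIS records, in which a line starting with "(" opens an element, ")" closes it and "-" carries character data. This is a simplified sketch and not the sgmlspl package: the element names HEADLINE and PARA are hypothetical (the Universal Text Format DTD is not shown in this paper), only the two escapes \n and \\ are handled, and all other record types are ignored.

```python
import html
from typing import Iterable, List


def unescape_esis(data: str) -> str:
    """Undo two common ESIS data escapes (simplified): backslash-n and double backslash."""
    return data.replace("\\\\", "\0").replace("\\n", "\n").replace("\0", "\\")


class HtmlTranslator:
    """Calls start_<NAME> / end_<NAME> for each element event, if defined."""

    def __init__(self) -> None:
        self.out: List[str] = []

    # Handlers for hypothetical source elements.
    def start_HEADLINE(self) -> None:
        self.out.append("<H1>")

    def end_HEADLINE(self) -> None:
        self.out.append("</H1>")

    def start_PARA(self) -> None:
        self.out.append("<P>")

    def end_PARA(self) -> None:
        self.out.append("</P>")

    def data(self, text: str) -> None:
        self.out.append(html.escape(text, quote=False))

    def _ignore(self) -> None:
        pass  # elements without a handler produce no markup

    def feed(self, lines: Iterable[str]) -> str:
        for line in lines:
            if not line:
                continue
            code, rest = line[0], line[1:]
            if code == "(":
                getattr(self, "start_" + rest, self._ignore)()
            elif code == ")":
                getattr(self, "end_" + rest, self._ignore)()
            elif code == "-":
                self.data(unescape_esis(rest))
            # other record types (attributes, conformance, ...) are skipped
        return "".join(self.out)


esis = ["(ARTICLE", "(HEADLINE", "-Agassi wins", ")HEADLINE",
        "(PARA", "-Fast & furious", ")PARA", ")ARTICLE", "C"]
translated = HtmlTranslator().feed(esis)
```

Supporting another output DTD then amounts to writing another set of start and end handlers over the same event stream.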
In the next stage the conversion of the other MediaObject types can be encapsulated
within the classes. The instances can then simply be asked to convert themselves
to a desired encoding format.
As the MHEG-5 formatting cannot be verified due to the lack of available engines,
its proper testing will be done later. At this point the formatter for MHEG-5
incorporates a class library generated by the snacc compiler¹ from the ASN.1
descriptions for MHEG-5. This class library can be used to construct the scenes for
the MHEG-5 presentation and to produce the universal encoding according to the Basic
Encoding Rules (BER, ISO).
3.1.5. Semantics
With a separate Semantics class we address the problem of classifying the contents of
an article in a systematic way, apart from its actual representation. The approach we
use is a straightforward implementation of rating categories on two levels.
An instance of the Semantics class is attached to each instance of SingleComponent.
This helps the editor when he/she plans, for example, a View with contents of a given
type. One instance of the Semantics class is also linked to every user, as the users
may have personal profiles which determine the semantic content they are interested
in. The user profile can be used to automatically filter all available articles
down to a personal subset of that material.
The semantic metadata must, however, be provided by someone early in the editorial
process. We suggest the following scheme for semantic metadata: a fixed set of
semantic categories, each having a value from 0 to 5. The larger the value, the more
relevant the information is to the given semantic category; 0 indicates a total
absence of relevance to a given category. These categories can have one level of
subcategories. This is how we extend the categories but still remain at a level
abstract enough not to descend to single keywords which describe the semantics.

1. http://remarque.berkeley.edu/~muir/free-compilers/TOOL/ASN1-1.html
Table 1 gives an idea what a semantic metadata entry might look like for an article
describing the wedding ceremony of a celebrated domestic music artist.
The categories must be defined before they are used, as they need to be available on
an equal basis for all news material generated with this model. The person who
assigns these categories and their respective values to SingleComponents should
ideally stay the same, as otherwise the subjective classification of these entries
might vary too much.
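
The rating scheme can be sketched as a small class that only accepts predefined categories, at most one level of subcategories, and values from 0 to 5. The sketch is an assumption about how such a class might look: the category table below contains just the entries of Table 1, and the set and query signatures are ours.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical fixed category table: category -> allowed subcategories.
CATEGORIES: Dict[str, Set[str]] = {
    "social event": {"family-related"},
    "geography": {"domestic"},
    "art": {"music"},
}


class Semantics:
    """Ratings from 0 to 5 for predefined categories and subcategories."""

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, Optional[str]], int] = {}

    def set(self, category: str, value: int,
            subcategory: Optional[str] = None) -> None:
        if category not in CATEGORIES:
            raise KeyError(f"undefined category: {category}")
        if subcategory is not None and subcategory not in CATEGORIES[category]:
            raise KeyError(f"undefined subcategory: {category}/{subcategory}")
        if not 0 <= value <= 5:
            raise ValueError("rating must be between 0 and 5")
        self._values[(category, subcategory)] = value

    def query(self, category: str, subcategory: Optional[str] = None) -> int:
        # An unrated category counts as 0, i.e. no relevance.
        return self._values.get((category, subcategory), 0)


# The wedding article of Table 1.
wedding = Semantics()
wedding.set("social event", 5)
wedding.set("social event", 5, "family-related")
wedding.set("geography", 5, "domestic")
wedding.set("art", 2, "music")
```

Rejecting undefined categories at assignment time reflects the requirement that the categories are fixed before any material is rated.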
3.2. Using the semantics to filter information
In addition to these categories, the users can type in lists of inclusive and
exclusive keywords. The inclusive keywords are used to select articles regardless of
their semantic rating, and the exclusive keywords to discard articles.
Now, it is necessary to stress the two-fold nature of Semantics.
1. An editor assigns values to each category from the point of view of his
newspaper, i.e. the product.
2. The user can set a profile based on his/her personal interests. For example, value 5
for the sports category will make sure the user gets articles with rating 5 for
sports. An example interface for setting the values using Java is shown in Fig. 5.

Table 1: Semantic categories
Categories      Subcategories     Value
social event                      5
social event    family-related    5
geography       domestic          5
art             music             2
As an example, suppose there are five articles with ratings 1, 2, 3, 4 and 5 for
sports. The user says he wants sports with value 4. He will now get all articles
whose rating is greater than or equal to 4, i.e. the articles rated 4 and 5.
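
The threshold rule and the keyword lists can be combined in one selection function. This is a sketch under assumptions the text does not settle: exclusive keywords are taken to win over inclusive ones, a profile value of 0 is taken to mean "not requested", and the Article fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set


@dataclass
class Article:
    title: str
    ratings: Dict[str, int]
    keywords: Set[str] = field(default_factory=set)


def select(articles: List[Article], profile: Dict[str, int],
           include: FrozenSet[str] = frozenset(),
           exclude: FrozenSet[str] = frozenset()) -> List[str]:
    """Return titles of articles matching the profile or an inclusive keyword."""
    chosen = []
    for article in articles:
        if article.keywords & exclude:
            continue  # assumption: exclusive keywords always discard
        by_keyword = bool(article.keywords & include)
        by_rating = any(
            wanted > 0 and article.ratings.get(category, 0) >= wanted
            for category, wanted in profile.items()
        )
        if by_keyword or by_rating:
            chosen.append(article.title)
    return chosen


# Five articles rated 1 to 5 for sports; the user asks for value 4.
articles = [Article(f"sports-{n}", {"sports": n}) for n in range(1, 6)]
articles[0].keywords.add("rally")
articles[4].keywords.add("doping")

by_profile = select(articles, {"sports": 4})
with_keywords = select(articles, {"sports": 4},
                       include=frozenset({"rally"}),
                       exclude=frozenset({"doping"}))
```

With the profile alone the articles rated 4 and 5 are returned; the keyword lists then add the low-rated "rally" article and drop the "doping" one.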
Once the values have been assigned to different categories and their subcategories,
the following 5 different cases can be considered when retrieving the articles accord-
ing to the ratings. These five cases, A-E, are also depicted in Fig. 6. This figure shows
how the user has set the rating of Sports to 5 and Economics to 3. We will now analyse
each of the five conceptually different articles and see why and when they are
retrieved.
• A: The article is selected because its Sports rating, 5, is equal to the setting
in the user’s profile. It will be retrieved to the Sports section if the user wanted
to create sections according to semantic categories.
• B: The article’s ratings are equal to the user’s settings for both Sports and
Economics, and it will thus be retrieved to either of those two sections.
• C: The article’s Economics rating is higher than the one in the user’s profile,
and it will be retrieved to the Economics section.
• D and E: Neither of these two articles will be retrieved on the basis of the user’s
profile. The only way they would be retrieved is by a random procedure which will be
activated when there are no articles with high enough ratings. The random selection
procedure should, however, take into account the relative values the articles have,
e.g. article E would be more likely to appear in the Sports section than D.

Figure 5. Configuring the user profile with a Java applet [screenshot not reproduced]
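
The five cases can be replayed with a short sketch. The numeric ratings of the articles are hypothetical, chosen only to be consistent with the description of cases A to E; placing article B in both sections and weighting the random fallback by the rating itself are our assumptions.

```python
import random
from typing import Dict, List

PROFILE = {"Sports": 5, "Economics": 3}

# Hypothetical ratings consistent with cases A-E.
ARTICLES: Dict[str, Dict[str, int]] = {
    "A": {"Sports": 5, "Economics": 0},
    "B": {"Sports": 5, "Economics": 3},
    "C": {"Sports": 1, "Economics": 4},
    "D": {"Sports": 1, "Economics": 1},
    "E": {"Sports": 4, "Economics": 2},
}


def build_sections(articles: Dict[str, Dict[str, int]],
                   profile: Dict[str, int]) -> Dict[str, List[str]]:
    """Put each article into every section whose threshold it reaches."""
    sections: Dict[str, List[str]] = {category: [] for category in profile}
    for name, ratings in articles.items():
        for category, wanted in profile.items():
            if ratings.get(category, 0) >= wanted:
                sections[category].append(name)
    return sections


def random_fallback(articles: Dict[str, Dict[str, int]], category: str,
                    rng: random.Random) -> str:
    """Pick one article at random, weighted by its rating in the category."""
    names = [n for n, r in articles.items() if r.get(category, 0) > 0]
    weights = [articles[n][category] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


sections = build_sections(ARTICLES, PROFILE)
leftovers = {n: r for n, r in ARTICLES.items()
             if all(n not in members for members in sections.values())}
pick = random_fallback(leftovers, "Sports", random.Random(0))
```

With these ratings E carries four times the weight of D in the Sports fallback, which matches the remark that E should be the more likely of the two to appear there.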
4. Using the model
4.1. Generating views
Instantiating the class objects and making the associations between these instances
will be left to the editor. His task will consist of creating documents and rating
them with the semantic metadata. In addition he makes logical associations between
the document instances and generates Views. An instance of the View class is an
ordered collection of SingleComponent or GroupComponent instances. The main activity
of the editor will be the creation of these Views which form a single Publication. An
example of the Views is given in Fig. 7. This task is somewhat similar to the one
using the Hypermedia Design Model (Garzotto et al., 1995) implemented by (Kessler,
1995). It is worth noticing that the structure will be a graph instead of a tree,
because different Views may have common children.

Figure 6. Example of ratings of two categories
[Plot not reproduced. Articles A to E are plotted against Sports and Economics axes, each
running from 0 to 5; the user's profile is marked at Sports 5 and Economics 3.]
The concrete representation of these Views will be generated when needed. This we
call the late binding of views. Each instance will know whether it has already been
converted into a given concrete representation and whether it is even its duty to do
the conversion. This concept helps us with the distribution scheme described in more
detail later in this paper.
It should be noted that each object in this model will have a unique identifier
consisting, for instance, of a date and a positive integer. These identifiers will
have an infinite lifetime, which guarantees that a single instance of SingleComponent
or an instance of Audio can be retrieved from the system at any time. Having the
mapping table from identifiers to concrete representations available at all times
helps to reduce the storage requirements at the servers. If the concrete
representation is not stored locally, we know where to find it.

Figure 7. Two views sharing same components
[Diagram not reproduced. Labels in the figure: View (twice), GroupComponent,
SingleComponent; "International news", "Domestic news", "Politics", "Domestic",
"Agassi wins", "EU directive for music", "Sweden declines EMU", "France bombing Atolls",
"Unemployment -10%", "Parlament on leave".]
The concrete representation sets the granularity of the identifiers. An HTML docu-
ment is identified using a Uniform Resource Locator (URL) which corresponds to the
level of hypermedia documents (SingleComponents). If the user saves the URL he/she
can find the same SingleComponent at any later time.
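
An identifier scheme of this kind, together with the mapping table, can be sketched as follows. The identifier layout (date, hyphen, running integer), the method names and the example URL are assumptions made for illustration.

```python
import datetime
import itertools
from typing import Dict, Optional


class IdMapper:
    """Issues identifiers and maps them to concrete representation locations."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)      # positive integers, never reused
        self._locations: Dict[str, str] = {}

    def new_id(self, day: datetime.date) -> str:
        """Build an identifier from a date and the next positive integer."""
        return f"{day:%Y%m%d}-{next(self._counter)}"

    def register(self, object_id: str, location: str) -> None:
        self._locations[object_id] = location

    def map(self, object_id: str) -> Optional[str]:
        """Return where the concrete representation lives, if known."""
        return self._locations.get(object_id)


mapper = IdMapper()
first = mapper.new_id(datetime.date(1996, 3, 11))
second = mapper.new_id(datetime.date(1996, 3, 11))
# Hypothetical URL of the HTML representation of the first object.
mapper.register(first, "http://news.example/19960311/1.html")
```

Because identifiers are never reused, a URL saved by the reader can keep resolving to the same SingleComponent even after the object has moved to another server.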
4.2. Logging activities
Once a client requests an instance of the Component class, the request will be
resolved through a logical resolver. Its implementation is very simple: it finds the
corresponding instance given a logical name and checks whether this object has a
physical representation at this server, in which case it simply returns it. If the
concrete representation is missing, it will be either generated on the fly or
fetched from the server that has the MediaObjects associated with the Component.
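
The resolver logic can be sketched with two in-memory servers. This is an illustration only: the Server and Component classes, the single text field standing in for the MediaObjects, and the trivial rendering are assumptions, and no network transfer is modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Component:
    name: str
    text: Optional[str] = None   # media data, present only where it is stored
    representations: Dict[str, str] = field(default_factory=dict)


class Server:
    def __init__(self, media_server: Optional["Server"] = None) -> None:
        self.components: Dict[str, Component] = {}
        self.media_server = media_server   # server holding the MediaObjects

    def resolve(self, name: str, fmt: str = "html") -> str:
        component = self.components[name]          # find instance by logical name
        if fmt in component.representations:       # representation is at this server
            return component.representations[fmt]
        if component.text is not None:             # media is local: generate on the fly
            rendered = f"<P>{component.text}</P>"
            component.representations[fmt] = rendered
            return rendered
        if self.media_server is None:
            raise LookupError(f"no representation available for {name}")
        return self.media_server.resolve(name, fmt)  # fetch from the media server


central = Server()
central.components["agassi-wins"] = Component("agassi-wins", text="Agassi wins")
edge = Server(media_server=central)
edge.components["agassi-wins"] = Component("agassi-wins")   # descriptor only

page = edge.resolve("agassi-wins")
```

The edge server stores only the descriptive part of the component, so the first request is passed on to the central server, which generates and keeps the representation.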
Once a document is retrieved from the server, a new association between a
user-specific LogBook instance and the retrieved Component will be created. On a
regular basis, these associations can be collected to one central server which can
perform a global analysis of the user behaviour. Figure 8 shows a user who has
fetched three Components and annotated one of them.

Figure 8. Logging activities and annotations using LogBook and Annotation
[Diagram not reproduced. Labels in the figure: User "Janne Saarela", LogBook, Annotation,
and three SingleComponents: "Agassi wins", "EU directive for music", "Unemployment -10%".]
These log entries will work as a basis for the intelligent producer agents which
work at the server end analysing the clients. The agents are described in more detail
in (Turpeinen, 1995).
4.3. Annotating components
The user will be given a chance to evaluate the Components by giving, for example, a
value ranging from 0 to 5; 0 indicating no interest at all, 3 some interest and 5 I-want-
more-of-these. These instances of the Annotation class will be linked from the User
instance to several Components.
The annotations can later be processed and used in the analysis of the user profile
in trying to find out what type of articles the user finds interesting. Letting users see
other people’s ratings facilitates social filtering first introduced in (Malone et al.,
1987).
5. Implications of the model
5.1. Lost concept of an edition
The edition of a hypermedia newspaper complying with the model is a collection of
the latest versions of the introduced objects. An editor can generate Views once a day
to give the reader a daily newspaper, but the MediaObjects can be introduced at any
time. A new version of an object replaces the old one. The new version then has a
history link to the previous versions. Figure 9 presents this situation.
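The idea of an edition as "the latest version of every object" can be sketched with a small in-memory repository. The class and method names (Version, Repository, introduce, latest_edition, history) are hypothetical; a real deployment would rely on the versioning facility of the object database instead.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Version:
    object_id: str
    content: str
    previous: Optional["Version"] = None  # history link to the replaced version


class Repository:
    def __init__(self) -> None:
        self._latest: Dict[str, Version] = {}

    def introduce(self, object_id: str, content: str) -> Version:
        """Store a new version; it replaces the old one and links back to it."""
        version = Version(object_id, content, previous=self._latest.get(object_id))
        self._latest[object_id] = version
        return version

    def latest_edition(self) -> Dict[str, str]:
        """The current edition: the latest content of every object."""
        return {oid: v.content for oid, v in self._latest.items()}

    def history(self, object_id: str) -> List[str]:
        """Contents from newest to oldest, following the history links."""
        contents: List[str] = []
        version = self._latest.get(object_id)
        while version is not None:
            contents.append(version.content)
            version = version.previous
        return contents


repo = Repository()
repo.introduce("text/agassi", "Agassi leads")
repo.introduce("text/agassi", "Agassi wins")
repo.introduce("text/jobs", "EU Unemployment -10%")
```

Here repo.latest_edition() contains only "Agassi wins" for the first object, while repo.history("text/agassi") still reaches the earlier version.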
It is also possible for the editor to create new articles during the day. Once he or she
introduces a new SingleComponent into the system, he or she has to update either a
View or a GroupComponent to reflect the new association to this article. It is
once again worth noting that a user who does not want to go through the predesigned
Views can take advantage of the filtering process and thus include the new article in
his or her personal newspaper.
5.2. Object lifetime
As the number of objects grows large, an object-collection process can be run on the
material. This process can move old objects (old in age, or superseded versions) into
long-term storage. The object identifiers remain available at each server, which provides
a way to locate old material that has been placed in a specific storage system.
To avoid storing a large table of identifiers at each server, the identifiers could
be structured, for example, so that a known prefix redirects a request to a specific
server. In a request such as /obj/text/archive487, the archive part would be enough to
redirect the request to a long-term text storage server.
Figure 9. Latest edition defined in terms of latest objects
(Diagram labels: Object repository, Latest edition, View, version link, assigned link.)
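The prefix-based redirection of Section 5.2 can be sketched as a small routing function. Only the example identifier /obj/text/archive487 comes from the text. The general identifier grammar, the routing table and the host name text-archive.example.org are assumptions made for this illustration.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical routing table: (media type, identifier prefix) -> storage server.
STORAGE_SERVERS: Dict[Tuple[str, str], str] = {
    ("text", "archive"): "text-archive.example.org",
}

# Assumed identifier grammar: /obj/<media type>/<prefix><serial number>
IDENTIFIER = re.compile(r"^/obj/(?P<media>[a-z]+)/(?P<prefix>[a-z]+)(?P<serial>\d+)$")


def route(identifier: str) -> Optional[str]:
    """Return the server a request should be redirected to, or None if the
    object is expected to be available locally."""
    match = IDENTIFIER.match(identifier)
    if match is None:
        raise ValueError(f"malformed object identifier: {identifier!r}")
    return STORAGE_SERVERS.get((match["media"], match["prefix"]))


archive_target = route("/obj/text/archive487")
local_target = route("/obj/text/current12")
```

Because the decision depends only on the prefix, a server needs one table entry per storage system, not one per archived object.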
6. An architectural framework
All the benefits of this model can be achieved by using a persistent store for the
objects. The objects and their associations remain the same from one invocation of the
server to another, which provides us with a solid basis for the framework.
Object database management systems (Cattell, 1994) can provide many different
features we find useful. In addition to persistence, a concept of versioning is necessary.
An electronic newspaper can be taken as a collection of the latest versions of the
objects in the logical model. Every time a new version of an existing object is
introduced into the model, the old ones remain available through the versioning
mechanism.
The object model that the object database uses may not be final at the early stage of
the design process. Changes to the model will cause problems if the ODBMS is not
capable of handling them. Some of the available databases can adjust to modifications
in the attributes, methods, types and inheritance chains. We find this adaptiveness
crucially important.
Among other features, a query language seems appropriate. For example, a user who has
read a given article can be traced with a query expression. The language should, however,
be compatible with an object-oriented language such as C++. Several object databases do
this, and they also provide an ad-hoc query facility when the ODMG-93 compliant
interface is missing.
6.1. A distribution scheme
Distributing the instances of the logical model to all of the servers serving the
newspaper is necessary, as we wish to have the same material available at each site.
Transferring only the instances over the network to other servers appears more efficient
than transferring all the concrete representations of the media as well. This plan is
shown in Fig. 10.
Due to the huge amount of information contained in electronic newspapers, we
plan to distribute the origin of different types of media to different servers. This
leaves room for optimising the server load and storage for a given situation. A single
server that is missing the concrete representation of an object can choose whether it
wants to keep a local copy (cache) of these representations or delete them as soon as
they have been served. An example situation where a server is missing the Text for a
Component and fetches it from a text archive is presented in Fig. 11.
Figure 10. Replicating instances instead of concrete representations
1. editor introduces a new component into system
2. receiving server stores the MediaObjects locally
3. server replicates the instances of the logical model (3a, 3b, 3c)
Figure 11. The connected server does the late binding of views
1. request a Component
2. request the Text part of the Component
3. return the Text
4. create the concrete representation and cache it
5. return the concrete representation
(Server labels in the diagrams: text archive, live video feed, voting system.)
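The five steps of Fig. 11, together with the choice between caching and discarding a representation, can be sketched as below. The class ConnectedServer, its keep_cache flag and the trivial HTML string used as the "concrete representation" are assumptions for illustration. The text archive is simulated by a local function that counts how often it is asked.

```python
from typing import Callable, Dict, List


class ConnectedServer:
    """Server that lacks the Text of a Component and binds it late."""

    def __init__(self, fetch_text: Callable[[str], str], keep_cache: bool) -> None:
        self.fetch_text = fetch_text  # steps 2 and 3: ask the text archive
        self.keep_cache = keep_cache  # local policy: keep a copy or discard it
        self.cache: Dict[str, str] = {}

    def request(self, component: str) -> str:  # step 1
        if component in self.cache:
            return self.cache[component]
        text = self.fetch_text(component)      # steps 2 and 3
        representation = f"<h1>{component}</h1><p>{text}</p>"  # step 4
        if self.keep_cache:
            self.cache[component] = representation
        return representation                  # step 5


archive_calls: List[str] = []


def text_archive(component: str) -> str:
    """Stand-in for the remote text archive; records every request."""
    archive_calls.append(component)
    return f"Body text of {component}"


caching = ConnectedServer(text_archive, keep_cache=True)
caching.request("Agassi wins")
caching.request("Agassi wins")      # served from the local cache
calls_with_cache = len(archive_calls)

discarding = ConnectedServer(text_archive, keep_cache=False)
discarding.request("Agassi wins")
discarding.request("Agassi wins")   # fetched from the archive again
calls_total = len(archive_calls)
```

The two counters show the trade-off: the caching server spends storage to save one archive round trip, while the discarding server keeps nothing and pays for every request.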
It is also possible to classify servers according to several levels of functionality.
A basic server might only hold the concrete representations of the objects and connect
to the network just once a day, when it retrieves the daily newspaper. Other
superservers could communicate and update the news material at any time and thus
approach a real-time electronic newspaper.
The centralised method of the MediaObject class tells whether a given instance can be
distributed to several servers or whether it should remain at a single server, from which
clients request it every time they need it. Should there be an application, say a voting
system, whose result influences other objects, say a pie chart image, the application
(an instance of the Dynamic class) should remain centralised at one server in order to
provide all the clients with the same output.
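A sketch of how a replication step might consult the centralised method follows. The method name and the MediaObject and Dynamic classes come from the text. Treating Dynamic as a direct subclass of MediaObject, the replicate function and the server names are assumptions made for this example.

```python
from typing import Dict, List


class MediaObject:
    def __init__(self, name: str) -> None:
        self.name = name

    def centralised(self) -> bool:
        """False means the instance may be distributed to several servers."""
        return False


class Dynamic(MediaObject):
    """Application whose output must be the same for all clients."""

    def centralised(self) -> bool:
        return True


def replicate(objects: List[MediaObject], home: str,
              servers: List[str]) -> Dict[str, List[str]]:
    """Return, for each object name, the servers that hold an instance."""
    placement: Dict[str, List[str]] = {}
    for obj in objects:
        if obj.centralised():
            placement[obj.name] = [home]  # stays at a single server
        else:
            placement[obj.name] = [home] + [s for s in servers if s != home]
    return placement


placement = replicate(
    [MediaObject("article text"), Dynamic("voting system")],
    home="server-a",
    servers=["server-a", "server-b", "server-c"],
)
```

In this sketch the article text ends up on all three servers, while the voting system remains only on server-a, so every client sees the same tally.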
Application-level control of the distribution can be implemented in many ways.
These aspects are discussed in (Korkea-aho, 1995).
7. Results
The object model has been implemented on a single Objectivity/DB database, which
provides a clean interface for object management. Features such as versioning,
clustering and indexing are used in the implementation. Some tests have been conducted
in which articles have been composed of separate media objects, and higher-level
structures have been designed to provide the GroupComponent and View levels of this
model. Simple formatter subclasses have also been designed which can be used to set
the layout for presentable units such as SingleComponents. An example of this is
presented in Fig. 12.
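To illustrate the separation between structure and layout, the sketch below applies two formatter subclasses to the same SingleComponent. The Formatter and SingleComponent names follow the model, but the fields headline and text, the format method and the two subclasses are assumptions made here, not the formatters used in the tests.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class SingleComponent:
    headline: str  # hypothetical fields; the model composes these
    text: str      # from separate media objects


class Formatter:
    """Base class: turns a logical structure into a concrete layout."""

    def format(self, component: SingleComponent) -> str:
        raise NotImplementedError


class HTMLFormatter(Formatter):
    def format(self, component: SingleComponent) -> str:
        return (f"<h1>{escape(component.headline)}</h1>\n"
                f"<p>{escape(component.text)}</p>")


class PlainTextFormatter(Formatter):
    def format(self, component: SingleComponent) -> str:
        return f"{component.headline.upper()}\n\n{component.text}"


article = SingleComponent("Agassi wins", "Straight sets & a title.")
as_html = HTMLFormatter().format(article)
as_text = PlainTextFormatter().format(article)
```

The same article object yields both outputs, which is the property the model relies on when a server binds a presentation format only at request time.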
The algorithms for generating the contents and the structure of the newspaper have
also been verified, and they have proven to work with the simple rating system
presented in this paper.
The ease with which a whole Publication can be created is clearly an advantage
compared to the old-fashioned editing of HTML documents. It remains to be seen,
however, how well the editors at the newspaper companies adapt to editing a
new type of product.
What still remains to be verified is the efficiency of using late binding of views
instead of transferring the concrete representations over the network in a distributed
environment. The current implementation does the late binding of views but only in a
centralised manner.
8. Conclusions
Figure 12. A SingleComponent formatted by a Formatter class instance in HTML
A logical model of an electronic newspaper has been described. It aims for a
next-generation newspaper which is not only the counterpart of the printed version but
also brings added value by introducing several features:
• The documents are described in a structural manner independent of any specific
presentation format such as HTML or MHEG-5. Separate Formatter class instances
are applied to the structures to produce the desired layout;
• The material is distributed at the level of this model without binding the structure
to any specific presentation format before this is required. The servers are also
capable of determining what level of functionality they provide to their clients, as
different servers have different storage and processing resources available;
• The contents are rated with a simple two-level category classification system which
enables efficient filtering of information and thus personalised views inside
the electronic newspaper;
• The users are allowed to annotate the articles. This enables social filtering, i.e. the
possibility to follow popular articles read and annotated by other users.
We believe this is the most suitable framework for an electronic newspaper. It
provides a solid basis for the whole editorial process, starting from traditional page
layout systems such as QuarkXPress. The process consists of composing single
presentable units from media objects. After this, higher-level structures can be designed:
groups of presentable units, views that collect groups and articles together, and finally
publications, each with its own brand.
Once the structure has been designed, the editor attaches specific formatting
instructions to the structures on how to lay them out in different presentation
environments. The product is then distributed using a scheme which binds the
structures with the formatting instructions at the server from which they are requested.
Personalisation is enabled through the use of a semantic rating system which
allows for personal generation of the contents and the structure of the product. Other
added-value features such as social filtering are also possible with the help of users’
annotations.
Acknowledgements
This research was supported by the Finnish Technology Development Center
(TEKES), Nokia and the Aamulehti corporation.
Thanks to Mikael Honkala for analysing the OMT model and reviewing the paper.
References
Buford, J.F.K. (1994). Multimedia Systems, Addison-Wesley.
Cattell, R.G.G. (1994). Object Data Management, Addison-Wesley.
Garzotto, F., Mainetti, L., Paolini, P. (1995). Hypermedia Design, Analysis and Evaluation Issues, Communications of the ACM, 38(8).
Gibbs, S.J., Tsichritzis, D.C. (1994). Multimedia Programming, Addison-Wesley.
Halasz, F., Schwartz, M. (1994). The Dexter HyperText Reference Model, Communications of the ACM, 37(2).
International Organization for Standardization. Abstract Syntax Notation One (ASN.1): Specification of basic notation. ISO/IEC 8824-1.
International Organization for Standardization. ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). ISO/IEC 8825-1.
International Organization for Standardization. Document Style Semantics and Specification Language, DSSSL, ISO/IEC DIS 10179.
International Organization for Standardization. MHEG Object Representation, Base notation (ASN.1), ISO/IEC DIS 13522-1.
International Organization for Standardization. Presentation Environments for Multimedia Objects (PREMO), ISO/IEC CD 14478-1.
International Organization for Standardization. Standard Generalized Markup Language (SGML), ISO 8879.
Isakowitz, T., Stohr, E.A., Balasubramanian, P. (1995). RMM. A Methodology for Structured Hypermedia Design, Communications of the ACM, 38(8).
Kessler, M. (1995). A Schema-Based Approach to HTML Authoring. World Wide Web Journal, Fourth International World Wide Web Conference Proceedings, O’Reilly & Associates.
Korkea-aho, M. (1995). Scalability in Distributed Multimedia Systems, Technical report TKO-B128, Helsinki University of Technology.
Malone, T.W., Grant, K.R., Turbak, F.A., Brobst, S.A., Cohen, M.D. (1987). Intelligent Information-sharing Systems, Communications of the ACM, 30(5), pp. 390-402.
Turpeinen, M. (1995). Agent-mediated personalised multimedia services, Technical report TKO-B125, Helsinki University of Technology.