xml and modern techniques of content management – 2010/11czarnik/zajecia/xml10/06-en... · 2010....

26
Presentation of XML Documents Patryk Czarnik Institute of Informatics University of Warsaw XML and Modern Techniques of Content Management – 2010/11 Stylesheets Separating content and presentation Associating style with document CSS Idea and applications CSS sheet structure Formatting XSL Presentation by transformation XSLT — Introduction XSL-FO

Upload: others

Post on 15-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Presentation of XML Documents

Patryk Czarnik

Institute of Informatics University of Warsaw

XML and Modern Techniques of Content Management – 2010/11

StylesheetsSeparating content and presentationAssociating style with document

CSSIdea and applicationsCSS sheet structureFormatting

XSLPresentation by transformationXSLT — IntroductionXSL-FO

Page 2: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Separating content and presentationI According to XML best practices documents contain:

I content/dataI markup for structure and meaning (semantical markup)I no formatting

I How to present?I “hard-coded” interpretation of known document typesI external style sheets

Stylesheets idea

<person position="starszy referent"> <name>Dawid</name> <surname>Paszkiewicz</surname> <phone type="office">+48223213203</telefon> <phone type="mobile">+48501502503</telefon> <email>[email protected]</email></pracownik>

* yellow backgorund* blue font and border* italic font for name* typewritter font for email

* font 'Times 10pt'* 12pt for name* abbreviation before phone number

<person position="starszy referent"> <name>Arkadiusz</name> <surname>Gierasimczyk</surname> <phone type="office">+48223213213</phone> <phone type="mobile">+48501502503</phone> <email>[email protected]</email></pracownik>

Page 3: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Advantages of separating content and formattingI General advantages of semantical markup

I better content understanding and easier analysisI Ability to show again

I the same document after modificationI another document of the same structure

I Formatting managed in one placeI easy to change look of whole class of documents

I Ability to define many stylesheets for a document, depending on:

I purposeI medium (screen, paper, voice)I detail level (full report, summary)I reader preferences (font size, colors, . . . )

Standards related to XML presentationI General:

I Associating Style Sheets with XML documentsI Stylesheet languages:

I DSSSL (deprecated, applied to SGML)Document Style Semantics and Specification Language

I CSSCascading Style Sheets

I XSLExtensible Stylesheet Language

Page 4: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Associating Style Sheets with XML documentsI W3C RecommendationI Processing instruction xml-stylesheet

One stylesheet

<?xml-stylesheet type="text/xsl" href="default.xsl"?>

Alternative stylesheets

<?xml-stylesheet title="Blue"type="text/css" href="blue.css" ?>

<?xml-stylesheet title="Yellow" alternate="yes"type="text/css" href="yellow.css" ?>

<?xml-stylesheet title="Big fonts" alternate="yes"type="text/css" href="big.css" ?>

Cascading Style SheetsI Style sheets idea beginnings: 1970s

I motivation — different printer languages and properties

I CSS beginnings: 1994I CSS Level 1 Recommendation: December 1996I CSS Level 2 Recommendation: May 1998

I implemented (more or less completly) in present-day internetbrowsers

I CSS Level 3 — work still in progress:I modularisation,I new features.

I CSS 2.1 — Candidate Recommendation:I CSS 2 corrected and made less ambiguousI also, a bit less ambitious (unsupported features thrown out)

Page 5: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

CSS applicationsI First and main application: style for WWW sitesI Separation of content and formatting for HTMLI Simple stylesheets for XML

Ideologically important

! especially from Level 2

! according to web accessibility

I support for alternative presentation methods(e.g. voice)

I enabling reader to provide her/his own style(e.g. larger font and contrast colors for people with eyesightdisability)

Example XML

<department id="D07"><name>Accountancy</name>

<person position="accountant" id="102103"><name>Dawid</name><surname>Paszkiewicz</surname><phone type="office">+48223213203</phone><phone type="mobile">+48501502503</phone><email>[email protected]</email>

</person>

< person position="manager" id="102104"><name>Monika</name><surname>Domzałowicz</surname><phone type="office">+48223213200</phone><phone type="mobile">+48501502513</phone><email>[email protected]</email>

</person>...</department>

Page 6: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Example stylesheet

person {display: block;margin: 10px auto 10px 30px;padding: 0.75em 1em;width: 200px;border-style: solid;border-width: 2px;border-color: #002288;background-color: #FFFFFF; }

person[position=’manager’] {background-color: #DDFFDD; }

name, surname {display: inline;font-size: larger; }

person[position=’manager’] name {font-weight: bold; }

Result of formatting

Page 7: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

SelectorsI person — element nameI name, surname — two elementsI department name — descendantI department > name — childI A + B — element B directly following AI name:first-child — ‘name’ being first childI person[position] — test of attribute existenceI person[position=’manager’] — attribute value testI person[position~=’manager’] — value is element of listI person#k12 — ID-type attribute valueI ol.staff — equivalent to ol[class~=’staff’]

(HTML specific).

Style depending on media type

Example

@media print {person {background-color: white;font-family: serif;

}}

@media screen {person {background-color: yellow;font-family: sans-serif;

}}

@media all {person {border-style: solid;

}}

Page 8: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

display propertyI Kind of visual object representing an elementI Allowed values: inline, block, list-item, run-ininline-block table, table-cell, . . . , none

I Basic property in case of XML visualisationI Rare exeplicit use for HTML (browsers know default values for

HTML elements)

Boxes and positionsI Blocks nested according to nesting of elements in documentI Blocks layout handled by CSSI Custom positioning possibleI margin, padding — external and internal marginsI border-style, border-color, border-width —

border propertiesI position (static | relative | absolute | fixed) —

positioning policyI left, right, top, bottom — position coordinatesI width, height, min-width, max-height, . . . — size

Page 9: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Boxes, margins, borders

cite: W3C, CSS Level 2 Recommendation

Margins and borders — example

person {display: block;margin: 10px auto 10px 30px;padding: 0.75em 1em;width: 200px;border-style: solid;border-width: 2px;border-color: #002288;background-color: #FFFFFF;

}

Page 10: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Text and font propertiesI color, background-color, background-image —

color and background,I font-family — name or generic family of fontI font-size

I font-style, font-weightI text-decoration

I text-align

Fonts and colors example

company {display: block;background-color: #EEFFFF;color: rgb(0, 0, 33%);font-family: ’Bookman’, ’Times New Roman’, serif;font-size: 14pt;

}

company > name {font-size: 1.5em;font-family: ’Verdana’, ’Arial’, sans-serif;font-weight: bold;text-align: center;

}

department > name {font-size: 1.3em;font-family: ’Verdana’, ’Arial’, sans-serif;font-weight: bold;font-style: italic;

}

Page 11: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Generated contentI Ability to show text not occurring in document text contentI Access to attribute valuesI Automatic numbering

Example

person:before {content: attr(position);

}

phone[type=’office’]:before {content: ’tel. ’;

}

phone[type=’mobile’]:before {content: ’kom. ’;

}

CSS capabilities and advantagesI Great visual formatting capabilities

I especially for single elementsI Elements distinguished basing on

I nameI placement within document treeI existence and value of attributes

I SupportI internet browsersI authoring tools

I Easy to write simple stylesheets :)

Page 12: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

CSS lacks and drawbacksI Only visualisation

I no conversion to other formatsI Some visual formatting issues

I e.g. confusing block size computing (moreover, incorrectlyperformed in the most popular Web browser)

I Not possible (CSS Level 2):I repeating the same content several timesI distinguish elements basing on their contentsI complex logical conditionsI data processing (e.g. summing numeric values)I access to many documents at once

I Some things require more work than they should, e.g.:I to show attribute valuesI to change elements order

Presentation by transformation

I In order to present a document, we can transform it into a format,that we know how to present.

I Examples of “presentable” XML formats:I XHTMLI XSL Formatting Objects

Page 13: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Extensible Stylesheet Language (XSL)I First generation of recommendations (1999 and 2001):

I XSL (general framework, XSL Formatting Objects language)I XSLT (transformation language)I XPath (expression language, in particular paths addressing

document fragments)I XSL stylesheet — transformation from XML to XSL-FO

I usually for a class of documents (XML application)I in practice, transformation to other formats, often (X)HTML

XSL Idea

cite: W3C, XSL 1.0

Page 14: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XSLT — statusI Created within XSLI Applications beyond XML visualisationI Version 1.0:

I October 1999, connected with XPath 1.0I good support in tools and libraries

I Version 2.0:I January 2007, connected with XPath 2.0 and XQuery 1.0,I new featuresI less software support

XSLT — supportI XSLT 2.0 processors:

I SaxonI Java and .NET libraries, command-line applicationsI free (Open Source) basic versionI commercial schema aware version

I XML Spy (commercial windows application)I XSLT 1.0 processors:

I internet browsersI Xalan (Java and C++ libraries)I xsltproc, part of libxml (C, basically for Linux)I XML extensions of database enginesI . . .

I Authoring toolsI raw text editors and programmer environments (e.g. Eclipse)I specialised tools — rather paid (e.g. XML Spy, oXygen).

Page 15: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XSLT — sheet example

1/2

<?xml version="1.0" encoding="utf-8"?><xsl:stylesheet version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"><xsl:output method="html" encoding="utf-8" />

<xsl:template match="/"><html><head><title>Staff list</title>

</head><body><h1>Staff list</h1><ul><xsl:apply-templates

select="//person"/></ul>

</body></html>

</xsl:template>...

XSLT — sheet example

2/2

...<xsl:template match="person"><li><xsl:value-of select="name"/><xsl:value-of select="surname"/>(<xsl:value-of select="phone[type=’mobile’]"/>)

</li></xsl:template>

</xsl:stylesheet>

Page 16: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XSLT — stylesheet structureI Stylesheet (arkusz) consists of templates.I Template (szablon) specifies how to transform a source node

into result tree fragmentI Inside templates:

I text and elements not from XSLT namespace → copied to resultI XSLT instructions → affects processingI XPath in instructions → access to source document, arithmetic,

conditions testing, . . .

I XSLT can be considered a programming language specialised forXML documents transformation.

XSLT — operation overviewI Transformation on tree levelI Template for document node (root) started first

I template exists even if we had not written it

I apply-templates within a template — another templatescalled for other nodes, usually for children

I Templates matched according to node kind and name, locationwithin document tree, etc.

I Additional features: conditional processing, copying values andnodes from source, generating new nodes, grouping, sorting, . . .

Page 17: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XPath in XSLTI XPath expressions used for:

I accessing nodes and values from source documentI computing valuesI checking logical conditions

I XPath constructions:I paths (navigating through document tree)I arithmetic and logical operatorsI functions for numbers. strings, etc.I more in version 2.0

I XSLT 1.0 uses XPath 1.0.I XSLT 2.0 uses XPath 2.0.

XPath in XSLT stylesheet — example

...<xsl:template match="person"><li><xsl:value-of select="name"/><xsl:value-of select="./surname"/>(<xsl:value-of select="phone[type=’mobile’]"/>)

</li></xsl:template>

</xsl:stylesheet>

Page 18: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XSLT — transformation resultI XSL Formatting Objects:

I in line with XSL ideaI usable e.g. for printout

I HTML and XHTML:I most popularI usable for Web publication and on-screen view

I Arbitrary XML:I translating to another (or new) formatI picking up or processing data (alternative to XQuery)I XSLT as result of XSLT

I Plain textI CSV and other text data formatsI scripts or configuration filesI plain representation of text documents

Transformations vs styles

cite: Szymon Zioło, XML i nowoczesne technologie zarzadzania trescia

Page 19: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XSL Formatting ObjectsI XML application designed for visual presentationI Focused on printed media

I page templates (“masters”)I automatic pagination

I Normally, XSL-FO documents are not written by hand butgenerated via XSLT.

XSL-FO document structure

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

<fo:layout-master-set><fo:simple-page-master master-name="my-page"><fo:region-body />

</fo:simple-page-master></fo:layout-master-set>

<fo:page-sequence master-reference="my-page"><fo:flow flow-name="xsl-region-body"><fo:block>Hello World!</fo:block>

</fo:flow></fo:page-sequence>

</fo:root>

Page 20: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Transformation to XSL-FO (1/2)<?xml version="1.0" encoding="utf-8"?><xsl:stylesheet version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"xmlns:fo="http://www.w3.org/1999/XSL/Format"><xsl:output method="xml" encoding="utf-8"/>

<xsl:template match="/"><fo:root>

<fo:layout-master-set><fo:simple-page-master master-name="A4" ...>...</fo:simple-page-master></fo:layout-master-set>

<fo:page-sequence master-reference="A4"><fo:flow flow-name="xsl-region-body"><xsl:apply-templates /></fo:flow>

</fo:page-sequence></fo:root></xsl:template>...</xsl:stylesheet>

Transformation to XSL-FO (2/2)

<xsl:template match="person"><fo:blockborder-width="1.5pt"border-style="solid"border-color="#664400"background-color="#FFFFEE">...<fo:block><xsl:apply-templates select="phone"/>

</fo:block></fo:block>

</xsl:template>

<xsl:template match="phone"><fo:block><xsl:choose><xsl:when test="@type=’mobile’">kom. </xsl:when><xsl:otherwise>tel. </xsl:otherwise></xsl:choose><xsl:apply-templates />

</fo:block></xsl:template>

Page 21: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Example visualisation

Ksi#gowo##

Dawid Paszkiewicztel. +48223213203kom. [email protected]

Monika Dom#a#owicztel. +48223213200kom. [email protected]

page-master — page templateI Single page layoutI A document may be split in many such pages.

Example

<fo:simple-page-master master-name="A4"page-width="297mm" page-height="210mm"margin-top="1cm" margin-bottom="1cm"margin-left="1cm" margin-right="1cm">

<fo:region-body margin="3cm"/><fo:region-before extent="2cm"/><fo:region-after extent="2cm"/><fo:region-start extent="2cm"/><fo:region-end extent="2cm"/>

</fo:simple-page-master>

Page 22: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Content of pagespage-sequence results in a number of pages

flow content split into pages

static-content content repeated on all pages

flow-name page region reference

Stylesheet fragment

<fo:page-sequence master-reference="A4"><fo:static-content flow-name="xsl-region-before">Company <xsl:value-of select="name" /> staff

</fo:static-content>

<fo:flow flow-name="xsl-region-body"><xsl:apply-templates />

</fo:flow></fo:page-sequence>

XSL-FO tree elements

Block levelI block

I list-block,list-item,list-item-label

I table, table-row,table-cell, . . .

Inline level

I inline,character

I

external-graphics

Special features

I basic-link,bookmark

I footnote

I flow

I . . .

Page 23: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Style of visual elementsI CSS roots of visual formatting modelI Attributes for style properties:

I margin, padding, border-style . . .I background-color, background-image . . .I font-family, font-weight, font-style, font-size

. . .I text-align, text-align-last, text-indent,start-indent, end-indent, wrap-option,break-before . . .

Lists — example

<fo:list-block><fo:list-item><fo:list-item-label><fo:block>First name: </fo:block>

</fo:list-item-label><fo:list-item-body><fo:block margin-left="15em">Dawid</fo:block>

</fo:list-item-body></fo:list-item><fo:list-item><fo:list-item-label><fo:block>Surname: </fo:block>

</fo:list-item-label><fo:list-item-body><fo:block margin-left="15em">Paszkiewicz</fo:block>

</fo:list-item-body></fo:list-item>

</fo:list-block>

Page 24: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Example visualisation — lists

Ksi#gowo##

* Stanowisko: starszy referentImi#: DawidNazwisko: Paszkiewicztel.: +48223213203kom.: +48501502503Email: [email protected]

* Stanowisko: kierownikImi#: MonikaNazwisko: Dom#a#owicztel.: +48223213200kom.: +48501502513Email: [email protected]

Tables — example

<fo:table border="solid 2pt black"><fo:table-header><fo:table-row><fo:table-cell><fo:block font-weight="bold">Surname

</fo:block></fo:table-cell><fo:table-cell><fo:block font-weight="bold">First name

</fo:block></fo:table-cell>...

</fo:table-row></fo:table-header><fo:table-body><fo:table-row><fo:table-cell><fo:block>Paszkiewicz</fo:block></fo:table-cell><fo:table-cell><fo:block>Dawid</fo:block></fo:table-cell>...</fo:table-row>

...

Page 25: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

Example visualisation — table

XSL-FO — supportI Commercial software:

I Antenna House XSL FormatterI RenderXI EcrionI Lunasil LTD XincI . . .

I Open software:I Apache FOPI xmlroff

Page 26: XML and Modern Techniques of Content Management – 2010/11czarnik/zajecia/xml10/06-en... · 2010. 11. 22. · XSL-FO. Separating content and presentation I According to XML best

XSL-FO — critique

Main XSL-FO advantages

I “in line” with XSLI easy and direct way to obtain printout (e.g. PDF) from XML dataI general advantages of stylesheets over “hard-coded” formatting

Main XSL-FO disadvantages

I too complex for simple needsI too limited for advanced needs

I lack of pagination feedback, cannot say “if these two elementsoccur on the same page. . . ”

I hard to format particular elements in a very special way (thus thisis a general drawback of using stylesheets)