Post on 19-Dec-2015


Web-site Management System Strudel

Presented by: LAKHLIFI Houda

Instructor: Dr. Haddouti

Outline

Introduction:

Building Web sites

What is Strudel?

Strudel Architecture

Content Management:

Strudel Data Model

Structure Management:

Site Graph

StruQL

Graphical Presentation Management

Conclusion

Building Web Sites

Building web sites involves three tasks:

Managing the information presented at the site (content)

Managing the structure of the web site (pages & links)

Creating the graphical presentation of pages

Before: Existing site-management tools unified these tasks, which:

prevented the site builder from performing each task separately.

prohibited the generation of multiple sites from the same data.

Now: Current tools separate the three tasks.

Strudel manages content and structure declaratively.

Strudel

Features:

Separates the three web site creation tasks.

Integrates content from multiple sources.

Manages semistructured data.

Provides a high-level declarative language (StruQL) for managing the site’s structure.

Advantages:

Derives multiple sites from the same data.

Supports easy restructuring and modification.

Provides a platform for:

- Enforcing integrity constraints

- Designing policies for efficient run-time management of sites

Strudel Architecture

Content Management

The web site’s raw data resides in either:

Tuple-stream sources: relational databases, flat files…

Graph-structured sources: XML documents, bibliographies, graphs that conform to Strudel’s data model…

Tuple-stream and graph-structured sources are mapped into Strudel’s data model.

The last data management step is data integration.

Source-specific wrappers translate an external source into Strudel’s graph model. The integrated view of the data is produced by evaluating a StruQL query (the mediator). The generated file is in Strudel’s .ddl format and is called the data graph.
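The wrapper step can be pictured with a small Python sketch. It is an illustration only, not Strudel's wrapper code: the function name `wrap_rows`, the OID scheme, and the representation of the graph as (source, label, value) triples are all assumptions made for this example.

```python
import csv
import io

def wrap_rows(rows, collection, key):
    """Translate tuple-stream rows into a labeled, directed graph.

    Returns (edges, collections): edges is a list of
    (source OID, label, value) triples, and collections maps a
    collection name to the OIDs of its members.
    """
    edges = []
    members = []
    for row in rows:
        oid = f"{collection}.{row[key]}"  # hypothetical OID scheme
        members.append(oid)
        for column, value in row.items():
            if value != "":  # an empty cell simply yields no attribute
                edges.append((oid, column, value))
    return edges, {collection: members}

# A flat file standing in for a tuple-stream source (made-up layout).
FLAT_FILE = "id,firstname,lastname\npers1,Mary,Fernandez\npers2,Dan,Suciu\n"

edges, collections = wrap_rows(
    csv.DictReader(io.StringIO(FLAT_FILE)), "Person", "id"
)
```

Each row becomes one object and each non-empty column becomes one labeled edge, so rows with missing values produce objects with fewer attributes.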

Content Management -Strudel Data Model-

• In Strudel, a database is modeled as a labeled, directed graph, called a data graph.

• A data graph contains objects and collections.

Objects are connected by directed edges labeled with string-valued attributes.

Objects are either internal nodes, identified by a unique object identifier (OID), or are atomic values, such as integers, strings, and files.

Collections contain groups of objects.

Objects may belong to multiple collections and may have different representations.
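A minimal Python sketch of this data model follows. The class and method names (`Oid`, `DataGraph`, `attributes`, `collections_of`) are hypothetical and chosen for the example. The sketch only mirrors the description above: internal nodes carry an OID, atomic values are plain Python values, and an object may sit in several collections.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Oid:
    """Identifier of an internal node; atomic values are plain ints/strings."""
    name: str

@dataclass
class DataGraph:
    edges: list = field(default_factory=list)        # (source Oid, label, target)
    collections: dict = field(default_factory=dict)  # collection name -> set of Oids

    def add_edge(self, source, label, target):
        # The label is a string-valued attribute name; the target is an
        # Oid (internal node) or an atomic value.
        self.edges.append((source, label, target))

    def add_to_collection(self, name, oid):
        self.collections.setdefault(name, set()).add(oid)

    def attributes(self, oid):
        return [(label, target) for source, label, target in self.edges
                if source == oid]

    def collections_of(self, oid):
        return {name for name, members in self.collections.items()
                if oid in members}

graph = DataGraph()
bib1, mary = Oid("bib1"), Oid("mary")
graph.add_edge(bib1, "year", 1995)                 # edge to an atomic value
graph.add_edge(bib1, "author", mary)               # edge to an internal node
graph.add_edge(mary, "lastname", "Fernandez")
graph.add_to_collection("Bibentry", bib1)
graph.add_to_collection("Person", mary)
graph.add_to_collection("Author", mary)            # one object, two collections
```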


• Strudel’s Data-Definition Language (DDL) is an ASCII format used by Strudel for graph sources.

• It corresponds more closely to Strudel’s data model than does XML.

• A DDL file contains the graph’s name and a sequence of statements that define the graph’s objects and collections.

• Example:

Graph people

Collection Person { }

Object norman in Person {
    lastname "Ramsey"
    firstname "Norman"
}

Object mary in Person {
    lastname "Fernandez"
    firstname "Mary"
    homepage is url http://www.research.att.com
}


• Strudel can export its graph in XML; Strudel has its own DTD for specifying graph sources.

<STRUDEL>
  <collections>
    <collection name="Bibentry">
      <member IDREF="Bibinfo.bib1"/>
      <member IDREF="Bibinfo.bib2"/>
    </collection>
    <collection name="People">
      <member IDREF="Bibinfo.pers1"/>
      <member IDREF="Bibinfo.pers2"/>
    </collection>
  </collections>
  <objects>
    <object ID="Bibinfo.bib1">
      <year type="int">1995</year>
      <month>Jun</month>
      <title>Simple and Effective ...</title>
      <author ID="Bibinfo.pers1">
        <firstname>Mary</firstname>
        <lastname>Fernandez</lastname>
        <homepage type="url">http://www.research.att.com/~mff</homepage>
      </author>
      <bibtexkey>bib1</bibtexkey>
    </object>
    <object ID="Bibinfo.bib2">
      <bibtexkey>bib2</bibtexkey>
      <booktitle>ICDE '98</booktitle>
      <category>Semistructured Data</category>
      <title>Optimizing ...</title>
      <author IDREF="Bibinfo.pers1"/>
      <author ID="Bibinfo.pers2">
        <firstname>Dan</firstname>
        <lastname>Suciu</lastname>
        <homepage type="url">http://www.research.att.com/~suciu</homepage>
      </author>
      <year>1998</year>
    </object>
  </objects>
</STRUDEL>
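As a rough illustration of how such an XML export maps back onto a graph, the Python sketch below reads a shortened version of the document with the standard `xml.etree.ElementTree` module. The function name `load_graph`, the `("ref", ...)` / `("value", ...)` tagging of edge targets, and the reading of `type="int"` as a Python integer are assumptions made for this example; they are not Strudel's loader.

```python
import xml.etree.ElementTree as ET

# Shortened version of the export above.
SAMPLE = """<STRUDEL>
  <collections>
    <collection name="Bibentry">
      <member IDREF="Bibinfo.bib1"/>
      <member IDREF="Bibinfo.bib2"/>
    </collection>
  </collections>
  <objects>
    <object ID="Bibinfo.bib1">
      <year type="int">1995</year>
      <author ID="Bibinfo.pers1">
        <lastname>Fernandez</lastname>
      </author>
    </object>
    <object ID="Bibinfo.bib2">
      <author IDREF="Bibinfo.pers1"/>
      <year>1998</year>
    </object>
  </objects>
</STRUDEL>"""

def load_graph(xml_text):
    """Return (collections, edges) read from a Strudel-style XML export."""
    root = ET.fromstring(xml_text)
    collections = {
        c.get("name"): [m.get("IDREF") for m in c.findall("member")]
        for c in root.find("collections").findall("collection")
    }
    edges = []

    def visit(element):
        oid = element.get("ID")
        for child in element:
            if child.get("IDREF") is not None:    # reference to an existing object
                edges.append((oid, child.tag, ("ref", child.get("IDREF"))))
            elif child.get("ID") is not None:     # nested object definition
                edges.append((oid, child.tag, ("ref", child.get("ID"))))
                visit(child)
            else:                                 # atomic value
                text = (child.text or "").strip()
                value = int(text) if child.get("type") == "int" else text
                edges.append((oid, child.tag, ("value", value)))

    for obj in root.find("objects").findall("object"):
        visit(obj)
    return collections, edges

collections, edges = load_graph(SAMPLE)
```

Element names become edge labels, and an element carrying `ID` or `IDREF` becomes an edge to another object rather than to an atomic value.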

Structure Management

Structure Management -Site Graph-

• After data integration, the site builder declaratively specifies the web site’s structure using a site-definition query in StruQL, Strudel’s query language.

• A StruQL query extracts objects, attributes, and values from the input graphs and constructs a new graph using that data.

• The result of evaluating the site-definition query on the data graph is a site graph.

• A site graph models both the site’s content and structure.

• A site graph can be rendered as a browsable Web site by Strudel’s HTML generator.

• Site graphs are just data graphs and they can be provided as input to other StruQL queries.

Structure Management -StruQL-

• StruQL allows a site builder:

To extract the data that will be available in the site from tuple-stream and/or graph-structured sources.

To create a site graph that specifies both the content and structure of the site.

• StruQL queries are declarative.

• StruQL queries are compositional.


A simple site-definition query that creates a site graph grouping Bibentry objects by their year attribute:

collect WebPage{Root()}, YearPage()
{ where Bibentry{x},
        x -> l -> y
  { where l = "year"
    collect WebPage{YearPage()}, YearEntry(y)
    link Root() -> "YearPage" -> YearPage(),
         YearPage() -> "YearEntry" -> YearEntry(y),
         YearEntry(y) -> "bibentry" -> x,
         x -> "year" -> y }
}
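The grouping this query describes can be approximated in Python as follows. This is a sketch under stated assumptions, not StruQL's evaluator: the data graph is a list of (source, label, target) triples, the helper `skolem` is a hypothetical stand-in for terms such as Root() and YearEntry(y) (the same name and arguments always denote the same node), and the sample data is made up.

```python
def skolem(name, *args):
    """Build a node term: equal name and arguments give the same node."""
    return (name, args)

def build_year_site(bibentries, edges):
    """Return (web_pages, site_edges) grouping entries by their year."""
    root, year_page = skolem("Root"), skolem("YearPage")
    web_pages = {root}
    site_edges = set()
    for x, l, y in edges:                      # where Bibentry{x}, x -> l -> y
        if x in bibentries and l == "year":    # where l = "year"
            web_pages.add(year_page)
            entry = skolem("YearEntry", y)     # one node per distinct year
            site_edges.update({
                (root, "YearPage", year_page),
                (year_page, "YearEntry", entry),
                (entry, "bibentry", x),
                (x, "year", y),
            })
    return web_pages, site_edges

# Made-up sample data.
EDGES = [
    ("bib1", "year", 1995),
    ("bib1", "month", "Jun"),
    ("bib2", "year", 1998),
    ("bib3", "year", 1998),
]
web_pages, site_edges = build_year_site({"bib1", "bib2", "bib3"}, EDGES)
entries_1998 = {t for (s, l, t) in site_edges
                if s == skolem("YearEntry", 1998) and l == "bibentry"}
```

Because the site graph is a set of edges and node terms are compared by value, two entries from the same year share a single YearEntry node instead of creating duplicates.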

The result of applying this query to the previous example graph database is the site graph:

[Figure: site graph produced by the query]

• A StruQL query is a function from a set of input graphs to an output graph.

• A StruQL expression contains two parts:

- A query part: supports querying of the data source; its result is a relation.

- A graph construction part: uses the relation to construct the nodes and arcs in the output graph, or site graph.

• The result of a complete StruQL query is a new site graph.


WHERE clause: selects objects and values of interest

COLLECT clause: creates a new collection and adds the selected objects to the new collection. The new collection is defined in the output (or site) graph.

It takes one of the following forms:

P{Q(x)}: adds the object whose identifier is Q(x) to the collection P.

P{x}: adds the object bound to the node variable x to the collection P.

LINK clause: links new objects to other new objects in the output graph or to old objects in the input graph.

Examples of StruQL queries


-Example: Selection on Attributes

This query selects all objects b in the Bibentry collection that have a booktitle or journal attribute; it puts all such objects in the new collection RefereedPub.

where Bibentry{b}, b -> l -> x, l = "booktitle" or l = "journal"

collect RefereedPub{b}

-Example: Selection on Attribute Values

This query selects objects b in the Bibentry collection that have a booktitle attribute whose value is “SIGMOD” and puts all such objects in the new collection InSIGMOD.

where Bibentry{b}, b -> "booktitle" -> "SIGMOD"

collect InSIGMOD{b}


-Example: Traversing Paths with Regular Path Expressions

Regular path expressions support traversal of arbitrary paths in the input graph. For example, this query selects all Bibentry objects that have an author attribute referring to an object whose lastname attribute is “Fernandez”, and puts them in the new collection ByMe.

where Bibentry{b},
      b -> "author" -> x,
      x -> "lastname" -> "Fernandez"

collect ByMe{b}
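For readers more used to general-purpose languages, the three example queries above can be mimicked with Python set comprehensions. This is a sketch only: the graph is assumed to be a list of (source, label, target) triples, the sample data (including the journal name) is made up, and the code shows what each query selects, not how StruQL evaluates it.

```python
# Made-up data graph as (source, label, target) triples.
EDGES = [
    ("bib1", "booktitle", "SIGMOD"),
    ("bib1", "author", "pers1"),
    ("bib2", "journal", "TODS"),
    ("bib2", "author", "pers2"),
    ("bib3", "year", 1998),
    ("pers1", "lastname", "Fernandez"),
    ("pers2", "lastname", "Suciu"),
]
BIBENTRY = {"bib1", "bib2", "bib3"}

# Selection on attributes: entries with a booktitle or journal attribute.
refereed_pub = {b for (b, l, x) in EDGES
                if b in BIBENTRY and l in ("booktitle", "journal")}

# Selection on attribute values: booktitle is exactly "SIGMOD".
in_sigmod = {b for (b, l, x) in EDGES
             if b in BIBENTRY and l == "booktitle" and x == "SIGMOD"}

# Path traversal: follow "author", then "lastname", and compare the value.
by_me = {b
         for (b, l1, x) in EDGES
         if b in BIBENTRY and l1 == "author"
         for (x2, l2, v) in EDGES
         if x2 == x and l2 == "lastname" and v == "Fernandez"}
```

The third comprehension joins the edge list with itself on the shared variable x, which is the role x plays in the StruQL where clause.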

Structure Management -Changes to StruQL: XML data sources-

• Strudel can read input graphs and emit output graphs in an XML format.

• Data in XML can be:

1- Written by hand or produced by an XML source

2- Produced by wrappers, such as bib2xml, which map external data into an XML format

3- Generated by StruQL queries

Structure Management -StruQL & XML docs-

XML documents conforming to Strudel’s DTD are mapped directly into Strudel’s internal graph data model:

<?xml encoding="US-ASCII"?>
<!ELEMENT STRUDEL (collections,objects)>
<!ELEMENT collections (collection*)>
<!ELEMENT collection (member*)>
<!ATTLIST collection name ID #REQUIRED>
<!ELEMENT member EMPTY>
<!ATTLIST member IDREF IDREF #REQUIRED>
<!ELEMENT objects (object*)>
<!ELEMENT object ANY>
<!ATTLIST object ID ID #REQUIRED>
<!ATTLIST object type CDATA #IMPLICIT>

Arbitrary XML documents do not exactly match Strudel’s graph model, but all of their objects can still be accessed.

Structure Management -More on StruQL-

A bare-bones language for semistructured data: it includes only the essential features.

More expressive than Lorel or UnQL (e.g., can reverse graphs)

Both conceptually and in practice, the separation between the query component and the restructuring component is important.

Graphical Presentation Management

In Strudel the graphical presentation is described separately in HTML templates.

HTML templates are just HTML files extended with a few Strudel tags.

Certain nodes in the site graph have an attribute HTMLtemplate, which associates them with a template file.

The purpose of the template is to instruct Strudel how to generate an HTML file for that node.

The result is the browsable web site.
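The idea of attaching a template to a node can be sketched in Python. The slides do not show Strudel's actual template tags, so this example substitutes Python's `string.Template` placeholders (`$title`, `$year`) for them; the template file name, the `TEMPLATES` table, and the `render` function are likewise hypothetical.

```python
from string import Template

# Stand-in for template files on disk; "$name" placeholders replace
# Strudel's own tags, which are not reproduced here.
TEMPLATES = {
    "entry.html": "<h1>$title</h1><p>Published in $year</p>",
}

def render(node_attrs):
    """Generate HTML for one site-graph node from its HTMLtemplate attribute."""
    template = Template(TEMPLATES[node_attrs["HTMLtemplate"]])
    # Every attribute of the node is available to the template by name.
    return template.substitute(node_attrs)

page = render({
    "HTMLtemplate": "entry.html",
    "title": "Optimizing ...",
    "year": 1998,
})
```

Because the template is chosen by an attribute of the node, the same site graph can be rendered with a different look by pointing that attribute at another template file.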

Conclusion: Many advantages…

Multiple views of the Web site can be defined with minimal effort.

Personalized Web sites can be offered.

The three Web creation tasks are separated.

Maintenance is easier. It is easier to understand and modify a declarative program like StruQL than a CGI-BIN script.

A declarative query language makes it possible to express more complex queries, such as the following StruQL query:

where x -> (_ | _._) -> y,
      x CONTAINS "database", y CONTAINS "warehouse"

collect result{x}
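To see what this query asks for, the Python sketch below hand-codes an equivalent search. It rests on several assumptions: `_` is read as "any edge label", so `(_ | _._)` matches a path of one or two edges; each node has some text attached, held in the hypothetical `TEXT` dictionary; and CONTAINS is treated as a plain substring test. The graph and texts are made up.

```python
# Made-up graph: (source, label, target) triples and per-node text.
EDGES = [
    ("e", "cites", "a"),
    ("a", "cites", "b"),
    ("b", "cites", "c"),
    ("c", "cites", "d"),
]
TEXT = {
    "a": "database systems",
    "b": "query languages",
    "c": "data warehouse design",
    "d": "indexing",
    "e": "database tuning",
}

def reachable_in_one_or_two(edges, x):
    """Nodes reachable from x by a path of exactly one or two edges."""
    one = {t for (s, _label, t) in edges if s == x}
    two = {t for (s, _label, t) in edges if s in one}
    return one | two

def database_near_warehouse(edges, text):
    nodes = {s for (s, _l, _t) in edges} | {t for (_s, _l, t) in edges}
    return {
        x for x in nodes
        if "database" in text.get(x, "")
        and any("warehouse" in text.get(y, "")
                for y in reachable_in_one_or_two(edges, x))
    }

result = database_near_warehouse(EDGES, TEXT)
```

The hand-written version needs explicit traversal code for each path length, whereas the StruQL query states the path pattern in one line.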

Thank You!!!