Internet Applications

13 March, 1006 ISS, 2005


The World Wide Web

By far the best-known distributed application is the World Wide Web (WWW), or the Web for short. Technically, the Web is a distributed system of HTTP servers and clients, more commonly known as web servers and web browsers.

Prior to the emergence of the Web, the user community of the Internet consisted largely of researchers and academics, who used network services such as electronic mail and file transfer to exchange data.

The World Wide Web originated with Tim Berners-Lee in late 1990 at CERN, the European Particle Physics Laboratory in Geneva, Switzerland. A proposal for a "universal hypertext system" was submitted in November 1990 by Tim Berners-Lee and Robert Cailliau.

In April 2004 Tim Berners-Lee received the first-ever Millennium Technology Prize of 1 million euros from the Finnish Technology Award Foundation.

Since the original proposal, the growth of the World Wide Web has been extraordinary (see Figure 1), and it has expanded far beyond the research and academic community into all sectors worldwide, including commerce and private homes. The continued development of Web technology is currently coordinated by the World Wide Web Consortium (W3C).

The genius of the World Wide Web is that it combines three important and well-established computing technologies:

Hypertext documents: documents in which chosen words or phrases, typically highlighted, can be marked as links to other documents, so that a user is able to access the linked documents by clicking with a mouse on the highlighted text.

Network based information retrieval: the File Transfer Protocol (FTP) service was the most widely used service for such information retrieval.

Standard Generalized Markup Language (SGML), an ISO standard which allows documents to be “marked up” with tags so that they can be displayed in a uniform format on any platform, independent of the presentation mechanics.

At its most basic, the World Wide Web is a client-server application based on a protocol named the HyperText Transfer Protocol (HTTP).

A web server is a connection-oriented server that implements HTTP. By default, an HTTP server runs at the well-known port 80.

A user runs a World Wide Web client (usually referred to as a browser) on a local computer. The client interacts with a web server according to HTTP, specifying a document to be fetched. If the document is located by the server in its directory, the document’s contents are returned to the client, which presents them to the user.

The Hypertext Markup Language (HTML)

HTML is a markup language used to create documents that can be retrieved using the World Wide Web.

HTML is based on SGML, with semantics that are appropriate for representing information of a wide range of types.

HTML markup can represent hypertext news, mail, documentation, and hypermedia; menus of options; database query results; simple structured documents with in-lined graphics; and hypertext views of existing bodies of information.

HTML

<HTML>
<HEAD>
<TITLE>A Sample Web Page</TITLE>
</HEAD>
<BODY>
<HR>
<center>
<H1>My Home Page</H1>
<IMG SRC="/images/myPhoto.gif">
<b>Welcome to Kelly's page!</b>
<p>
<!-- A list of hyperlinks follows. -->
<a href="/doc/myResume.html">My resume</a>.
<p>
<a href="http://www.someUniversity.edu/">My university</a>
</center>
<HR>
</BODY>
</HTML>

The Extensible Markup Language (XML)

Whereas HTML allows a document to be marked up for the presentation or display of the information it contains, XML allows a document to be marked up as structured information. Also based on SGML, XML uses tags to describe the information contained in a document.

<message>
  <to>[email protected]</to>
  <from>[email protected]</from>
  <subject>This is a message</subject>
  <text> Hello world! </text>
</message>

Content Type – MIME Protocol

Content Type and the MIME Protocol

One of the header lines returned in a server response is the content type of the requested object. The content type is specified following the scheme established in a protocol known as MIME (Multipurpose Internet Mail Extensions).

Originally used for email, MIME is now widely used for describing the content of a document sent over a network.

It supports a large and evolving set of predefined content types, specified in the format Type/Subtype.

The MIME Protocol

A small subset of the types and subtypes:

Type          Subtypes
text          plain, rich text, html, tab-separated-values, xml
message       email, news
application   octet-stream (can be used for transferring Java .class files, for example), Adobe-postscript, Mac-binhex40, xml
image         jpeg, gif
audio         basic, midi, mp3
video         mpeg, quicktime
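A content type value in the Type/Subtype format can be taken apart with a few string operations. The following is a minimal sketch in Python, not part of the original slides: the function name `split_content_type` is hypothetical, and the optional `; name=value` parameters it accepts (such as a charset) are an assumption that goes beyond the table above.

```python
def split_content_type(header_value):
    """Split a Content-Type value such as 'text/html; charset=utf-8'.

    Returns (type, subtype, parameters). Hypothetical helper for illustration.
    """
    media_type, _, param_text = header_value.partition(";")
    main_type, _, subtype = media_type.strip().lower().partition("/")
    params = {}
    for item in param_text.split(";"):
        key, sep, value = item.partition("=")
        if sep:  # skip empty fragments that carry no name=value pair
            params[key.strip().lower()] = value.strip().strip('"')
    return main_type, subtype, params
```

For example, `split_content_type("text/html")` yields the type `text` and the subtype `html` with no parameters.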

Characteristics of HTTP

HTTP is a Connection-Oriented Protocol

With HTTP/1.0, a connection to a server is automatically closed as soon as the server returns a response.

Thus exactly one round of exchange is allowed between a client and a web server; if a client needs to contact the same server again in the same session, it must reconnect to the server to issue another request.

This scheme is adequate for the original intent of HTTP: retrieving simple network documents. It is inefficient for documents that contain a large number of links to image objects to be fetched from the server, since fetching each of these objects requires a connection to be reestablished. It is also insufficient for sophisticated web applications based on HTTP (such as shopping carts).

HTTP is a Stateless Protocol

HTTP/1.0 (as well as version 1.1) is also a stateless protocol: the server does not maintain any state information on a client’s session. Regardless of whether the connection is kept alive, each request is handled by a server as a new request. As with the non-persistent connections originally used with HTTP, a stateless protocol is adequate for the original intent of the protocol, but not for the more complex applications for which HTTP has been extended, the next topic that we will study.

HTTP is a Connection-Oriented Protocol

HTTP/1.0 was extended to allow a request header line, Connection: Keep-Alive, to be issued by a client that wishes to maintain a persistent connection with the server; a cooperating server will keep the connection open after sending a response.

In HTTP/1.1, connections are persistent by default. Such a connection allows multiple requests to be sent over the same TCP connection.
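The two rules above can be condensed into a small decision function. This is only a sketch under stated assumptions: the name `connection_persists` is hypothetical, the headers are assumed to arrive as a dictionary keyed by the exact name `Connection`, and the `close` token (which HTTP/1.1 uses to opt out of persistence) is not covered in the text above.

```python
def connection_persists(version, headers):
    """Return True if a cooperating server should keep the connection open."""
    raw = headers.get("Connection", "")
    tokens = {token.strip().lower() for token in raw.split(",")}
    if version == "HTTP/1.1":
        # Persistent by default; "Connection: close" opts out (assumption noted above).
        return "close" not in tokens
    # HTTP/1.0: closed by default unless the client asked for Keep-Alive.
    return "keep-alive" in tokens
```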

Dynamically generated web contents

In the beginning, HTTP was employed to transfer static contents, that is, contents that exist in a constant state, such as a plain text file or an image file.

As the web evolved, applications began to use HTTP for a purpose not originally intended: allowing a browser user to retrieve data based on dynamic information entered during an HTTP session.

A typical web application, such as a shopping cart, requires fetching remote data based on data entered by a client at run time.

For example, an enterprise application typically allows a user to key in data, which is then used to formulate a query to retrieve data from a database, and the outcome is displayed to the user.

Applied to the web, it is desirable to allow a client to submit data during a web session to retrieve data from the web server host, to be displayed by the web browser.

[Figure: a web client on the web client host exchanges the data "id=12345" and "income=30000" with a web server on the web server host, which is backed by a database system.]

A generic HTTP server does not possess the application logic for fetching the data from the data source.

Instead, an external process that has the application logic serves as an intermediary.

The external process runs on the server host, accepts input data from the web server, exercises its application logic to obtain data from the data source, and returns the outcome to the web server, which transmits it to the client.

The first widely adopted protocol to augment HTTP in supporting run-time generated web contents is the Common Gateway Interface (CGI) protocol. Although rudimentary by comparison, CGI is the predecessor of more sophisticated protocols and facilities (such as the Java Applet and Servlet) that serve similar purposes. Understanding CGI and some of its supplementary protocols is important because it prepares us for the more advanced protocols and facilities.

The Common Gateway Interface (CGI) Protocol

The Common Gateway Interface (CGI) is a standard for providing an interface, or a gateway, between an information server and an external process (that is, a process external to the server). Using the protocol, a web client may specify a program, known as a CGI script, as the target web object in an HTTP request. The web server fetches the CGI script and activates it as a process, passing to the process the input data transmitted by the web client. The script executes and transmits its output to the web server, which returns the script-generated data as the body of a response to the web client.

CGI - 2

An HTTP request may specify a CGI program, or CGI script.

A CGI program can be written in:

~ Programming languages: C, Ada, C++, Fortran; such a program needs to be compiled to generate an executable.

~ Script languages such as PHP, Perl, Tcl, cobra; such a program, referred to as a CGI script, requires the appropriate language interpreter to be present at the server host.

CGI programs are commonly used for processing user input from HTML forms, and subsequently composing a web page sent as part of the server response.

CGI Program - 3

When a web server receives a request whose URI specifies a web program, the web server initiates the execution of the web program.

The web program formulates its output in HTML, which is sent to the server and forwarded to the web client as the HTTP response.
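The forwarding step can be pictured as the server wrapping whatever the web program printed. The sketch below is an illustration only, not how any particular server is implemented: the function name `build_http_response` is hypothetical, the status is assumed to be always 200, the program is assumed to end its header lines with a bare newline, and the added Content-Length line is an assumption.

```python
def build_http_response(cgi_output):
    """Turn raw CGI output (header lines, blank line, body) into an HTTP/1.0 response."""
    header_text, _, body = cgi_output.partition("\n\n")
    lines = ["HTTP/1.0 200 OK"]              # status line supplied by the server
    lines.extend(header_text.split("\n"))    # header lines printed by the program
    lines.append("Content-Length: " + str(len(body.encode("utf-8"))))
    return "\r\n".join(lines) + "\r\n\r\n" + body


cgi_output = "Content-type: text/html\n\n<H1>Hello there!</H1>"
response = build_http_response(cgi_output)
```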

[Figure: CGI program. An HTTP client on the client host sends a request to the HTTP server on the server host, which runs the CGI program and sends back a response.]

Action field in a web page

A web script can be specified in an action field of a web page. When the web page is submitted, an HTTP request is issued by the browser specifying the web script as the URI:

<HTML>
<HEAD>
<TITLE>A Simple Web Page which illustrates CGI</TITLE>
</HEAD>
<BODY>
<FORM ACTION="Hello.cgi">
<CENTER>
Click on the SUBMIT button to activate the CGI script Hello.cgi:<br>
<INPUT TYPE="Submit" NAME="submit" VALUE="SUBMIT">
</CENTER>
</FORM>
</BODY>
</HTML>

[Figure: Common Gateway Interface (CGI). The web client on the web client host requests hello.html and receives the contents of hello.html from the HTTP server on the web server host. The web client then requests hello.cgi; the HTTP server passes the data, if any, from the client to the CGI script, and the HTTP server response includes the dynamically generated web page.]

A sample web page (hello.html) which invokes a CGI script

<HTML>
<HEAD>
<TITLE>A web page which invokes a web script</TITLE>
</HEAD>
<BODY>
<H1>This web page illustrates the use of a web script</H1>
<P><BR>
The script or program is either a run-script written in a script language such as Perl, or an executable generated from a source program written in a language such as C/C++.
</P>
<HR>
<FORM METHOD="post" ACTION="hello.cgi">
<HR>
Press <input type="submit" value="here"> to submit your query.
</FORM>
<HR>
</BODY>
</HTML>

A sample web script hello.c

/**
 * This C program is for a CGI script which generates
 * the output for a web page. When displayed by a
 * browser, the message "Hello there!" will be shown
 * in blue.
 */
#include <stdio.h>

int main(void) {
    /* The header line is ended by two newline characters (ASCII 10),
       that is, the header line followed by a blank line. */
    printf("Content-type: text/html%c%c", 10, 10);
    printf("<font color = blue>");
    printf("<H1>Hello there!</H1>");
    printf("</font>");
    return 0;
}

A sample web script hello.pl

#!/usr/local/bin/perl
# A simple Perl CGI script
print "Content-type: text/html\n\n";
print "<head>\n";
print "<title>Hello, World</title>\n";
print "</head>\n";
print "<body>\n";
print "<font color = blue>\n";
print "<h1>Hello, World</h1>\n";
print "</font>\n";
print "</body>\n";

Web forms

A Web Form

You may have noticed that the “hello” example presented does not make use of any user input, and that the contents of the dynamically generated web page are predetermined. This is because the example is provided as an overview of the CGI protocol.

In practice, a CGI script is typically invoked by a special kind of web page known as a web form, described next, which accepts input at run time and invokes a CGI script that makes use of such input.

A web form is a special kind of web page which provides a graphical user interface that prompts a user for input data, and which invokes the execution of an external program on the web server host when a submit button on the page is pressed by the user.

The code that generates a web form is enclosed between the HTML tags <FORM> ... </FORM>. Within the <FORM> tag, attributes can be specified to provide additional information related to the CGI protocol, including:

~ ACTION=<a character string containing the absolute or relative URL of the external program which is to be initiated by the web server when the form is submitted>

~ METHOD=<a reserved word, POST or GET, which specifies the manner in which the external program expects to receive from the web server the collection of data submitted by the user, called the query data>

<FORM METHOD="post" ACTION="form.cgi">

In the code for the form, each of the input items (also called input elements) has a NAME attribute. For each of these items, the browser user enters or selects a value.

What is thy NAME: <INPUT NAME="name"><P>
What is thy favorite color: <SELECT NAME="color">

The collection of the data for the input items is a character string, called a query string, of name=value pairs separated by the & character:

name=John%20Chen&color=red

Each name=value pair is encoded using URL-encoding, so that some “unsafe” characters (such as spaces, quotes, %, and &) are mapped to a hexadecimal representation. For example, the value string “The return is >17%” is encoded as “The%20return%20is%20%3E17%25”.

A Web Form Query String

An example of a query string for the example form is:

name=John%20Doe&quest=peace%20on%20earth&color=azure&swallow=continental&text=The%20return%20is%20%3E17%25

The collection of the data into a query string, including the encoding of the values, is performed by the browser.

When the form is submitted by the user, the query string is passed to the server in the HTTP request, in a manner depending on the FORM METHOD specified in the form. The query string is then forwarded by the server to the external program.
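The assembly work that the browser performs can be imitated with Python's standard `urllib.parse` module. This is a sketch for illustration: `build_query_string` and the sample field list are hypothetical. Note that browsers commonly encode a space in form data as `+`; the sketch uses `%20` to match the examples in these slides.

```python
from urllib.parse import quote, urlencode


def build_query_string(fields):
    """Join name=value pairs with '&', URL-encoding each name and value."""
    # quote_via=quote encodes a space as %20; the default (quote_plus) would use '+'.
    return urlencode(fields, quote_via=quote)


form_input = [
    ("name", "John Doe"),
    ("quest", "peace on earth"),
    ("text", "The return is >17%"),
]
query = build_query_string(form_input)
```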

Web Form Query String Processing

Based on the form input, the browser assembles the query string.

The string is transmitted to the web server, which in turn passes it on to the external program (the CGI script named in the form).

The manner in which the string is transmitted depends on the FORM METHOD specified in the web form.

FORM GET Method – browser to server

Intended for requests of information only.

If GET is specified with the FORM METHOD attribute, the query string is transmitted to the server in an HTTP request with a GET method line.

<FORM METHOD="get" ACTION="getForm.cgi">

Recall that an HTTP GET request specifies a URI for the web object requested by the client. To accommodate the query string, the syntax for the URI specification was extended to allow the query string to be attached to the end of the URI (for the CGI script), delimited by the ‘?’ character, as, for example:

GET /cgi/getForm.cgi?name=John%20Doe&quest=peace HTTP/1.0

Since the length of the GET Request-URI line is limited in practice (8K bytes is a common server limit), the length of the query string that can be appended in this manner is also limited. Hence this method is not suitable if the form needs to send a large amount of data, such as data in a text box.
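As a rough sketch of what the browser sends, the request can be assembled as plain text. The function name `build_get_request` and the constant `MAX_REQUEST_LINE` are hypothetical; the limit simply reuses the 8K figure quoted above and is not a value mandated by HTTP.

```python
MAX_REQUEST_LINE = 8 * 1024  # assumed limit, taken from the 8K figure quoted above


def build_get_request(script_uri, query_string, host):
    """Return the text of an HTTP/1.0 GET request that carries a query string."""
    request_line = "GET " + script_uri + "?" + query_string + " HTTP/1.0"
    if len(request_line) > MAX_REQUEST_LINE:
        raise ValueError("query string too long for a GET request; use POST")
    # Each header line ends with CRLF; an empty line ends the request header.
    return request_line + "\r\n" + "Host: " + host + "\r\n\r\n"
```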


Form GET method – server to external program

The server invokes the CGI script and passes on the query string that it received from the browser, as appended to the URI in the HTTP request.

The CGI program, or the external program in general, will receive the encoded form input in an environment variable called QUERY_STRING.

Environment variables are variables maintained by the operating system of the server host.

The CGI program retrieves the query string from the environment variable, decodes the character string to obtain the name-value pairs, and uses the parameters during the execution of the program to generate output phrased in HTML.
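On the program side, the retrieval and decoding steps might look like the following sketch. The function name `read_get_parameters` is hypothetical; a real script would pass `os.environ`, while a plain dictionary is enough to try the function out. Collapsing the pairs into a dictionary keeps only the last value of a repeated name, which is a simplification.

```python
import os
from urllib.parse import parse_qsl


def read_get_parameters(environ):
    """Decode the name-value pairs found in the QUERY_STRING variable."""
    query_string = environ.get("QUERY_STRING", "")
    # parse_qsl splits on '&' and '=' and undoes the URL-encoding.
    return dict(parse_qsl(query_string, keep_blank_values=True))


# In a CGI script the call would be: read_get_parameters(os.environ)
```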

FORM POST Method – browser to server

Intended for actions with a side effect.

If POST is specified with the FORM METHOD attribute, the query string is transmitted to the server in an HTTP request with a POST method line, as previously described.

<FORM METHOD="post" ACTION="postForm.cgi">

Recall that an HTTP POST request is followed by a request body, which holds text contents to be sent to the server. Using the POST method, the URI of the CGI script is specified in the POST request line, followed by the request header, a blank line, and then the query string, as, for example:

POST /cgi/postForm.cgi HTTP/1.0
Accept: */*
Connection: Keep-Alive
Host: myHost.someU.edu
User-Agent: Generic

name=John%20Doe&quest=peace%20on%20earth&color=azure

Since the protocol places no limit on the length of the request body, the query string can be of arbitrary length. Hence the POST method can be used to send any amount of query data to the server.
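A POST request of this shape could be assembled as in the sketch below. The function name `build_post_request` is hypothetical. The Content-Type and Content-Length header lines do not appear in the slide's example; they are added here as an assumption, since the server side needs the length to know how much of the body to read.

```python
def build_post_request(script_uri, query_string, host):
    """Return the bytes of an HTTP/1.0 POST request carrying form data in its body."""
    body = query_string.encode("ascii")
    header_lines = [
        "POST " + script_uri + " HTTP/1.0",
        "Host: " + host,
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: " + str(len(body)),
    ]
    # A blank line separates the request header from the request body.
    return ("\r\n".join(header_lines) + "\r\n\r\n").encode("ascii") + body
```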

Form POST method – server to external program

The server invokes the CGI script and passes on the query string that it received from the browser via the request body.

The CGI program, or the external program in general, will receive the encoded form input on the standard input.

The server will NOT send an EOF at the end of the data; instead, the program should use the environment variable CONTENT_LENGTH to determine how much data to read from the standard input.

The CGI program reads the query string from the standard input, decodes the character string to obtain the name-value pairs, and uses the parameters during the execution of the program to generate output phrased in HTML.
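The reading step can be sketched as follows, with an in-memory stream standing in for the standard input. The function name `read_post_parameters` is hypothetical; in a real script the stream would be `sys.stdin.buffer` and the environment would be `os.environ`.

```python
import io
from urllib.parse import parse_qsl


def read_post_parameters(environ, stdin):
    """Read exactly CONTENT_LENGTH bytes of form data and decode the name-value pairs."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = stdin.read(length)  # do not wait for EOF; the server may never send one
    return dict(parse_qsl(raw.decode("ascii"), keep_blank_values=True))


# Example: bytes beyond CONTENT_LENGTH are left unread.
body = b"name=John%20Doe&quest=peace%20on%20earth&color=azure"
params = read_post_parameters(
    {"CONTENT_LENGTH": str(len(body))}, io.BytesIO(body + b"ignored")
)
```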

Encoding and decoding query strings

Whether a query string is obtained from the QUERY_STRING environment variable or from the standard input, the CGI program must decode the string and extract the name-value pairs from it, so that the parameters may be used for the program’s execution.

Due to the popularity of CGI programs, there are a number of existing libraries or classes that provide routines (functions) and methods for this purpose. For example, Perl has easy-to-use procedures in a library called CGI-lib for decoding the string and extracting the name-value pairs into a data structure called an associative array, and NCSA provides a library of C routines for the same purpose.

Environment Variables used with CGI

An environment variable is a parameter of a user's working environment on a computer system, such as the default directory path for the system to locate a program invoked by the user. Environment variables are used across multiple languages and operating systems to provide information to applications that may be specific to a user. CGI uses environment variables that are set by the HTTP server to pass information about requests from the server to the external program (CGI script).

Some of the key environment variables related to CGI are listed below:

~ REQUEST_METHOD: the method with which the request was made. For CGI, this is "GET" or "POST".

~ QUERY_STRING: if the GET method was specified in the form, this variable contains a character string for the form data.

~ CONTENT_TYPE: the content type of the data, which should be “application/x-www-form-urlencoded” for a query string.

~ CONTENT_LENGTH: the length of the query string sent in the request body.

Web Session State Data

Web Session and session state data

During a session of a web application such as a shopping cart, several HTTP requests are issued, each of which invokes an external program such as a CGI script.

[Figure: a browser and a web server exchange, in order: request for form.html; form.html; form.cgi?id=12345 (the server passes id=12345 to web script form.cgi); form2.html (dynamic); form2.cgi?buy=TV (the server passes buy=TV to web script form2.cgi). Annotation: "customer 12345 has a TV in shopping cart".]

Data that need to be shared among CGI scripts invoked successively during a web session are called session state data.

There is no provision in either HTTP or CGI to allow for such sharing, as both of these protocols are stateless and do not support the notion of a session.

Session Data Sharing Mechanisms

Because of the popularity of Internet applications, a variety of mechanisms have emerged to allow the sharing of session data among CGI scripts (and other external programs).

These mechanisms can be classified as follows:

~ Server-side facilities

~ Client-side facilities

Server-side facilities for session state data

~ Secondary storage (a file or database) on the server host may be used as a repository of session state data.

~ Software objects may be employed as a state data repository: Java beans, session objects, application context state data objects.

Client-side facilities for session state data

An ingenious idea for maintaining session state data is to maintain the data through the web client.

Since each session is associated with a single client, this scheme allows the state data to be maintained in a decentralized fashion.

Specifically, the scheme allows the state data to be passed from a web script to the web client, which passes the data to a subsequent web script. The data passing can be repeated throughout the duration of the web session.

[Figure: on the server host, the web server runs CGI script 1, CGI script 2, and CGI script 3 in turn; the web client on the client host passes the state data back with each request as it grows from id=12345 to id=12345&buy=TV to id=12345&buy=TV&charge=500.35.]

Two schemes make use of client-side facilities to maintain session data:

~ HIDDEN FORM fields: this scheme embeds session state data in dynamically generated web forms.

~ Cookies: this mechanism uses transient or persistent storage on the client host to hold state data, which is passed in the HTTP request header to web scripts that require the data.

Maintaining state data using hidden form fields

Using HIDDEN FORM Fields

A hidden form field, or hidden field, is an INPUT element in a web form specified with TYPE=HIDDEN. Unlike other INPUT elements, a hidden field is not displayed by the browser and requires no input. Rather, the value of the element is the VALUE attribute specified with the field, and the name-value pair of the field is collected by the browser, along with the name-value pairs of other INPUT elements, in the query string when the form is submitted.

The hidden field is a rudimentary scheme for maintaining session data. It has the merit of simplicity, requiring only the introduction of a new form field element and no additional resources on either the server side or the client side. In the scheme, the HTTP client becomes a temporary repository for the state information, and the session data is sent using the normal mechanisms for transmitting query strings.

The simplicity of the scheme comes at the cost of a security risk, in the sense that the state data transmitted using hidden form fields is unprotected.

Although a hidden input element is not displayed by the browser, it is embedded in the source code of the dynamically generated web page, which is plainly viewable by any browser user who exercises the view-source capability provided by the browser. Hence the scheme allows data to become exposed, and therefore poses a security risk.

Hidden fields should not be used to transmit sensitive data such as an identification or account balances.
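To make the scheme concrete, a web script could emit the next form with the state data embedded as hidden fields. The sketch below is illustrative only: `hidden_field`, `render_form`, and the form layout are hypothetical, and the values are HTML-escaped so that quotes in the data cannot break the markup.

```python
from html import escape


def hidden_field(name, value):
    """Return one hidden INPUT element carrying a name-value pair."""
    return '<INPUT TYPE="HIDDEN" NAME="%s" VALUE="%s">' % (
        escape(name, quote=True),
        escape(value, quote=True),
    )


def render_form(action, state):
    """Return a web form that carries the session state in hidden fields."""
    lines = ['<FORM METHOD="post" ACTION="%s">' % escape(action, quote=True)]
    lines.extend(hidden_field(name, value) for name, value in state.items())
    lines.append('<INPUT TYPE="submit" VALUE="Continue">')
    lines.append("</FORM>")
    return "\n".join(lines)
```

When the generated form is submitted, the browser includes the hidden name-value pairs in the query string, so the next script receives them like any other form input.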

Maintaining state data using cookies

Using cookies for state data

A more sophisticated scheme for keeping session state data on the client side is a mechanism known as a cookie (so named “for no compelling reason”). The scheme makes use of an extension of basic HTTP that allows a server’s response to contain a piece of state information, for which the client will provide storage in an object. Included in that state object is a description of the range of URLs for which that state is valid. Any future HTTP request made by the client which falls in that range will include a transmittal of the current value of the state object from the client back to the server.

A CGI script creates a cookie by including a Set-Cookie header line as part of the HTTP response that it outputs. Each cookie contains a URL-encoded name-value pair, similar to a name-value pair in a query string, for a state data item (for example, id=12345). When the response is received by the browser, it creates an object (a cookie) which contains the name-value pair. The cookie is sent as a request header line in each subsequent request sent by the browser to the web server, which passes the name-value pair on to the web script in the environment variable HTTP_COOKIE.

Syntax of the Set-Cookie HTTP Response Header Line

The core syntax of the Set-Cookie header line is a string in the following format (the keywords are Set-Cookie, expires, path, domain, and secure):

Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME; secure

The line starts with the keyword “Set-Cookie” and the delimiter colon (‘:’), followed by a list of attributes separated by semicolons. The domain and path attributes are designed to allow state data to be shared among selected CGI scripts. The attributes are explained as follows:


NAME=VALUE

URL-encoded name-value pair for the state data to be stored in the cookie created. This is the only required attribute on the Set-Cookie header line.

expires=DATE

The expires attribute specifies a date string that defines the valid lifetime of the cookie. Once the expiration date has been reached, the client host is free to deallocate the cookie, and the state data contained in the cookie can no longer be assumed to be sent to the server. The date string is formatted as:

Wdy, DD-Mon-YYYY HH:MM:SS GMT

The time format is based on RFC 822, RFC 850, RFC 1036, and RFC 1123, with the variations that the only legal time zone is GMT and the separators between the elements of the date must be dashes. expires is an optional attribute. If it is not specified, the cookie will expire when the user's session ends.
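A date string in this format can be produced as in the sketch below. The function name `format_cookie_expires` is hypothetical; the day and month abbreviations are spelled out in the code rather than taken from the locale, and the argument is assumed to be a timezone-aware `datetime`.

```python
from datetime import datetime, timezone

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_cookie_expires(when):
    """Format an aware datetime as 'Wdy, DD-Mon-YYYY HH:MM:SS GMT'."""
    utc = when.astimezone(timezone.utc)  # GMT is the only legal time zone
    return "%s, %02d-%s-%04d %02d:%02d:%02d GMT" % (
        WEEKDAYS[utc.weekday()], utc.day, MONTHS[utc.month - 1], utc.year,
        utc.hour, utc.minute, utc.second,
    )
```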

domain=DOMAIN_NAME

This attribute sets the domain for the cookie created.

Among the cookies stored on the client host, a browser is supposed to send only those cookies whose domain attribute matches the Internet domain name of the host specified in the URI of the object in the HTTP request (with which the cookie is sent).

If there is a “tail match”, then the cookie will go through path matching to see if it should be sent. “Tail matching” means that the domain attribute is matched against the tail of the fully qualified domain name in the URI.

For example, a domain attribute of "acme.com" would match the host names "anvil.acme.com" as well as "shipping.crate.acme.com", so that the name-value pair in a cookie tagged with the domain attribute acme.com will be sent with an HTTP request where the requested object has a URI containing the host name

~ anvil.acme.com (such as anvil.acme.com/index.html), or

~ shipping.crate.acme.com (such as shipping.crate.acme.com/sales/shop.htm).
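Tail matching can be sketched in a few lines. The function name `domain_matches` is hypothetical. One labeled assumption: the sketch only accepts a match that falls on a dot boundary, which is slightly stricter than a plain string tail match, so that a host such as notacme.com does not match acme.com.

```python
def domain_matches(cookie_domain, request_host):
    """Return True if the cookie's domain attribute tail-matches the request host."""
    domain = cookie_domain.lower().lstrip(".")
    host = request_host.lower()
    # Either the host is the domain itself, or it ends with ".<domain>".
    return host == domain or host.endswith("." + domain)
```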

The default value of domain is the host name of the server which generated the cookie response.

For example, if the server is www.someU.edu and no domain attribute is set with a cookie, then the cookie’s domain is www.someU.edu.


path=PATH

The path attribute is used to specify the subset of URIs in a domain for which the cookie is valid.

If a cookie has already passed the domain matching, then the pathname component of the URI is compared with the path attribute, and if there is a match, the cookie is considered valid and is sent along with the HTTP request. The path "/foo" would match "/foobar" and "/foo/bar.html". The path "/" is the most general path.

If the path is not specified, it is assumed to be the same path as the document being described by the header which contains the cookie.

The path attribute in set-cookie

Examples:

Cookie domain attribute   Cookie path attribute   Request URI that will cause the name-value pair in the cookie to be sent in the request header
www.someU.edu             none                    www.someU.edu/*
www.someU.edu             /                       www.someU.edu/*
www.someU.edu             /foo                    www.someU.edu/foo*
www.someU.edu             /foo/foo                www.someU.edu/foo/foo*
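The path test described on these slides amounts to a prefix comparison, sketched below with the hypothetical function name `path_matches`. This follows the rule as stated here, under which "/foo" matches "/foobar"; the later cookie standard, RFC 6265, additionally requires the match to end on a "/" boundary, which this sketch does not do.

```python
def path_matches(cookie_path, request_path):
    """Return True if the request path begins with the cookie's path attribute."""
    return request_path.startswith(cookie_path)
```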


The secure attribute in set cookie

secure

If a cookie is marked secure, it will only be transmitted if the communications channel with the host is a secure one. Currently this means that secure cookies will only be sent to HTTPS (HTTP over SSL) servers.

If secure is not specified, a cookie is considered safe to be sent in the clear over unsecured channels.


How cookies are passed from the browser to the server

When requesting a URL from an HTTP server, the browser will match the URI against all cookies stored on the client host.

If any matching cookie is found, then a line containing the name/value pairs of all matching cookies will be included in the HTTP request. The format of the line is:

Cookie: NAME1=VALUE1; NAME2=VALUE2; ...; NAMEn=VALUEn
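The browser-side matching can be sketched as follows, combining the domain, path, and secure checks before building the header line. This is an illustration under stated assumptions, not how any particular browser is implemented: the cookie store is modelled as a list of plain dictionaries, the function name `cookie_header` is hypothetical, and the domain check assumes a dot-boundary tail match.

```python
from urllib.parse import urlsplit


def cookie_header(stored_cookies, url):
    """Build the Cookie request header line for `url`, or return None
    when no stored cookie matches.

    Each stored cookie is a dict with the keys "name", "value",
    "domain", "path" and "secure" (a hypothetical in-memory layout).
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    pairs = []
    for cookie in stored_cookies:
        domain = cookie["domain"].lower()
        # Domain check: exact host or a tail match on a dot boundary.
        if not (host == domain or host.endswith("." + domain)):
            continue
        # Path check: simple prefix comparison.
        if not path.startswith(cookie["path"]):
            continue
        # Secure cookies are only sent over HTTPS.
        if cookie["secure"] and parts.scheme != "https":
            continue
        pairs.append(cookie["name"] + "=" + cookie["value"])
    if not pairs:
        return None
    return "Cookie: " + "; ".join(pairs)
```

Given a store holding a non-secure cookie age=25 for www.someU.edu with path "/", a request for http://www.someU.edu/foo/bar.html would carry the line `Cookie: age=25`.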


Cookie: NAME1=VALUE1; NAME2=VALUE2; ...; NAMEn=VALUEn

When such a line is encountered by the HTTP server in the request header, the server extracts the substring containing the name-value pairs from the line and places it in an environment variable named HTTP_COOKIE.

When the CGI script is executed, it may retrieve the state data, as name-value pairs, from the environment variable HTTP_COOKIE.
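On the script side, retrieving the state data might look like the sketch below. The helper names are hypothetical, and the parser handles only the simple `NAME=VALUE; NAME=VALUE` form shown above, with no quoting or decoding.

```python
import os


def parse_cookie_string(raw):
    """Split a string such as "age=25; name=John" into a dict."""
    cookies = {}
    for item in raw.split(";"):
        name, sep, value = item.strip().partition("=")
        if sep and name:  # skip empty or malformed items
            cookies[name] = value
    return cookies


def cookies_from_environment(environ=os.environ):
    """Read the name-value pairs that the server placed in HTTP_COOKIE."""
    return parse_cookie_string(environ.get("HTTP_COOKIE", ""))
```

A CGI script would call `cookies_from_environment()` with no arguments; passing a plain dictionary instead of `os.environ` makes the function easy to try outside a web server.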


Example:

If the following request is sent to the server:

GET /cgi/hello.cgi?name=John&quest=peace HTTP/1.0

Cookie: age=25

<blank line>

then the server will place the string “name=John&quest=peace” in the environment variable QUERY_STRING and the string “age=25” in HTTP_COOKIE for the invoked CGI script.
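What a script such as hello.cgi might do with these two variables can be sketched as follows. The function name `hello_response` and the wording of the reply are assumptions made for illustration; only the variable names QUERY_STRING and HTTP_COOKIE come from the example above.

```python
from urllib.parse import parse_qs


def hello_response(environ):
    """Build a plain-text CGI response from QUERY_STRING and HTTP_COOKIE."""
    # parse_qs maps each name to a list of values, e.g. {"name": ["John"]}.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["stranger"])[0]
    cookie = environ.get("HTTP_COOKIE", "")
    body = "Hello " + name + "; cookie data: " + (cookie or "none") + "\n"
    # A CGI script writes a header section, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body
```

In a real script the result would be written to standard output, for instance with `sys.stdout.write(hello_response(os.environ))`.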


Example:

If a request sent to a server is:

POST /cgi/hello.cgi HTTP/1.0

Cookie: age=25

<blank line>

name=John&quest=peace

then the string “name=John&quest=peace” will be sent by the server to the standard input of the CGI script, while the string “age=25” will be placed in the environment variable HTTP_COOKIE.
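For the POST case, a script reads the form data from standard input rather than from QUERY_STRING. The sketch below assumes that the request carries a Content-Length header, which the server exposes to the script as the CONTENT_LENGTH environment variable. The simplified request above omits that header. The function name `read_post_form` is hypothetical.

```python
import io
from urllib.parse import parse_qs


def read_post_form(environ, stdin):
    """Read a form-encoded POST body from a binary input stream.

    Assumption: CONTENT_LENGTH gives the number of bytes in the body.
    """
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length).decode("utf-8")
    return parse_qs(body)


if __name__ == "__main__":
    # Stand-in for the server: the body is 21 bytes long.
    demo_input = io.BytesIO(b"name=John&quest=peace")
    print(read_post_form({"CONTENT_LENGTH": "21"}, demo_input))
```

In a real CGI script the call would be `read_post_form(os.environ, sys.stdin.buffer)`, and the cookie would still be read from HTTP_COOKIE.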


Summary - 1

We introduced Internet applications and the key protocols that support them.

The Hypertext Markup Language (HTML) is a markup language used to create documents that can be retrieved using the World Wide Web.

The Extensible Markup Language (XML) allows a document to be marked up as structured information.


Summary - 2

The HTTP (HyperText Transfer Protocol) is the protocol used to transfer content on the web.

It allows the transfer of web content of virtually unlimited types.

It is a connection-oriented, stateless, request-response protocol.

In HTTP/1.0, each connection allows only one round of request-response.

HTTP is text-based: requests and responses are character strings.

Each HTTP request and response is composed of four parts: the request/response line, a header section, a blank line, and the body.


Summary - 3

The Common Gateway Interface (CGI) is a protocol that augments HTTP to support web content generated at run time.

Using the protocol, a web client may specify an external program, known as a CGI script, as the target web object in an HTTP request.

When requested, the web server fetches the CGI script and activates it as a process, passing to the process the input data transmitted by the web client.

The web script executes and transmits its output to the web server, which returns the script-generated data as the body of a response to the web client.


Summary - 4

A web form is a special kind of web page which

i. provides a graphical user interface that prompts input data from a user, and,

ii. when a submit button on the page is pressed by the user, invokes the execution of an external program on the web server host.

The input data is gathered in a query string, which is sent to a web script.


Summary - 5

To allow session data to be shared among the web scripts invoked during a web session, there are a number of mechanisms:

Server-side facilities: files, databases, and others.

Client-side facilities: hidden-form tags and cookies.

The use of hidden-form tags and cookies raises privacy and security concerns.