web services

34
1 Web services web client (browser) web server HTTP request HTTP response Clients and servers communicate using the HyperText Transfer Protocol (HTTP) Current version is HTTP/1.1 RFC 2616, June, 1999.

Upload: jackson-burns

Post on 30-Dec-2015

26 views

Category:

Documents


0 download

DESCRIPTION

Web services. Clients and servers communicate using the HyperText Transfer Protocol (HTTP) Current version is HTTP/1.1 RFC 2616, June, 1999. HTTP request. web client (browser). web server. HTTP response. Static and dynamic content. - PowerPoint PPT Presentation

TRANSCRIPT

1

Web services

webclient

(browser)

webserver

HTTP request

HTTP response

• Clients and servers communicate using the HyperText Transfer Protocol (HTTP)

• Current version is HTTP/1.1– RFC 2616, June, 1999.

2

Static and dynamic content

• The content returned in HTTP responses can be either static or dynamic.

• Static content: – content stored in files and retrieved in response to an

HTTP request

» HTML files

» images

» audio clips

• Dynamic content:– content produced on-the-fly in response to an HTTP

request

» usually by CGI process executed by the server.

3

Backus-Naur Form (BNF)• Developed in late 50’s for FORTRAN

• BNF is a way to write grammars that specify computer languages, protocols, and command line arguments to programs (and many others)– BNF describes the legal strings for that language or

protocol

• The HTTP spec uses the following BNF constructs (among others):– “literal”

» literal text

– rule1 | rule2

» rule 1 or rule 2 is allowed

4

BNF (cont)

• HTTP BNF constructs– (rule1 rule2)

» sequence of elements: rule1 followed by rule2» (elem (foo | bar) elem)

– “elem foo elem” or “elem bar elem” are allowed.– <n>*<m>(element)

» at least n and at most m occurences of element.» default values are n=0 and m=infinity.» *(element) allows zero or more occurences of

element.» 1(element) requires at least one element.» 1*2(element) allows one or two.

– [rule]» rule is optional

5

BNF (cont)

• Basic rules– OCTET = <any 8-bit sequence of data>

– CHAR = <any ASCI I char>

– ALPHA = <A...Z and a...z>

– DIGIT = <0…9>

– SP = <ASCII SPACE character (32)>

– CRLF = <newline \n>

– TEXT = <printable ASCII chars>

6

URIs and URLs

• network resources are identified by Universal Resource Indicators (URIs)

• The most familiar is the absolute URI known as the HTTP URL:– http-url = “http:” “//” host [“:” port] [abs_path]– port defaults to “80”– abs_path defaults to “/”– abs_path ending in / defaults to …/index.html

• examples:– http://www.iiit.net:80/index.html– http://www.iiit.net/index.html– http://www.iiit.net

7

Uniform Resource Locators (URLs)

• The internet name of the site containing the resource

• The type of service (eg. HTTP, FTP)

• The Internet port number of this service.

• The location of the resource in the directory structure of the server.

8

http://www.address.edu:1234/path/subdir/file.ext

| | | | |

|service| | | |

|____ host _____| | |

| | |

|port| |

| file and |

|_ resource details _|

9

HTTP/1.1 messages

An HTTP message is either a Request or a Response:

HTTP-message = Request | Response

Requests and responses have the same basic form:

generic-message = start-line *message-header CRLF [message body]

start-line = Request-line | Status linemessage-header = field-name “:” [field value] CRLFmessage-body = <e.g., HTML file>

10

HTTP/1.1 requests

Request = Method SP Request-URI SP HTTP-VERSION CRLF *(general-header | request-header | entity header) CRLF [ message-body ]

• Method: tells the server what operation to perform– GET: retrieve file

– OPTIONS: retrieve server and access capabilities

• Request-URI: identifies the resource to manipulate– data file (HTML), executable file (CGI)

• headers: parameterize the method– Accept-Language: en-us – User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)

• message-body: text characters

11

HTTP/1.1 responsesResponse = HTTP-Version SP Status-Code SP Reason-Phrase CRLF *(general-header | response-header | entity header) CRLF [ message-body ]

• Status code: 3-digit number

• Reason-Phrase: explanation of status code

• headers: parameterize the response– Date: Thu, 22 Jul 1999 23:42:18 GMT

– Server: Apache/1.2.5 BSDI3.0-PHP/FI-2.0

– Content-Type: text/html

• message-body:– file

12

How servers interpret Request-URIs

• GET / HTTP/1.1– resolves to home/html/index.html– action: retrieves index.html

• GET /index.html ...– home/html/index.html– action: retrieves index.html

• GET /foo.html ...– home/html/foo.html– action: retrieves foo.html

• GET /cgi-bin/test.pl …– home/cgi-bin/test.pl– action: runs test.pl

• GET http://www.iiit.net/index.html– home/html/index.html– action: retrieves index.html

home

cgi-bin html

test.pl index.html foo.html

13

Example HTTP/1.1 conversationgkgarg> telnet www.iiit.net 80

Trying 196.12.32.11...

Connected to www.iiit.net.

Escape character is '^]'.GET /index.htm HTTP/1.1 ;request lineHost: www.iiit.net ;request hdrCRLFHTTP/1.1 200 OK ;status line

Date: Wed, 06 Sep 2000 19:53:49 GMT ;response hdr

Server: Apache/1.3.9 (Unix) (Red Hat/Linux)

Last-Modified: Sat, 02 Sep 2000 06:42:15 GMT

ETag: "3ab62-19ee-39b0a147"

Accept-Ranges: bytes

Content-Length: 6638

Content-Type: text/html

<html> ;beginning of the message body 6638 bytes

<head>

<title>IIIT HOME</title>

Request sent by client

Response sent by server

14

MIME

• WWW uses MIME (Multipurpose Internet Mail Extension) types to define the type of a particular piece of information being send from a web server to a browser which in turn determines how the data should be treated.

15

OPTIONS

• Retrieves information about the server in general or resources on that server, without actually retrieving the resource.

• Request URIs:– if request URI = “*”, then the request is about the server

in general

» Is the server up?

» Is it HTTP/1.1 compliant?

» What brand of server?

» What OS is it running?

– if request URI != “*”, then the request applies to the options that available when accessing that resource:

» what methods can the client use to access the resource?

16

OPTIONS (www.iiit.net)

gkgarg> telnet www.iiit.net 80

Trying 196.12.32.11...

Connected to www.iiit.net.

Escape character is '^]'.

OPTIONS * HTTP/1.1Host: www.iiit.netCRLF

HTTP/1.1 200 OK

Date: Wed, 06 Sep 2000 19:57:24 GMT

Server: Apache/1.3.9 (Unix) (Red Hat/Linux)

Content-Length: 0

Allow: GET, HEAD, OPTIONS, TRACE

Request

Response

Host is a requiredheader in HTTP/1.1 but not in HTTP/1.0

17

OPTIONS (www.iitd.ernet.in)[vishal@mail vishal]$ telnet www.iitd.ernet.in 80

Trying 202.141.68.9...

Connected to www.iitd.ernet.in.

Escape character is '^]'.

OPTIONS /cgi-bin/ HTTP/1.1Host: www.iitd.ernet.inCRLF

HTTP/1.1 200 OK

Date: Wed, 06 Sep 2000 19:11:12 GMT

Server: Apache/1.3.12 (Unix) (Red Hat/Linux)

Content-Length: 0

Allow: GET, HEAD, POST, OPTIONS, TRACE

18

OPTIONS (microsoft.com)telnet microsoft.com 80

Trying 207.46.230.218...

Connected to microsoft.com.

Escape character is '^]'.

OPTIONS * HTTP/1.1Host: microsoft.comCRLF

HTTP/1.1 200 OK

Server: Microsoft-IIS/5.0

Date: Wed, 06 Sep 2000 19:11:57 GMT

Content-Length: 0

Accept-Ranges: bytes

DASL: <DAV:sql>

DAV: 1, 2

Public: OPTIONS, TRACE, GET, HEAD, DELETE, PUT, POST, COPY, MOVE, MKCOL, PROPFIN

D, PROPPATCH, LOCK, UNLOCK, SEARCH

Allow: OPTIONS, TRACE, GET, HEAD, DELETE, PUT, POST, COPY, MOVE, MKCOL, PROPFIND

, PROPPATCH, LOCK, UNLOCK, SEARCH

Cache-Control: private

19

[vishal@mail vishal]$ telnet microsoft.com 80

Trying 207.46.230.219...

Connected to microsoft.com.

Escape character is '^]'.OPTIONS / HTTP/1.1Host: microsoft.comCRLFHTTP/1.1 200 OK

Server: Microsoft-IIS/5.0

Date: Thu, 07 Sep 2000 10:44:48 GMT

MS-Author-Via: DAV

Content-Length: 0

Accept-Ranges: none

DASL: <DAV:sql>

DAV: 1, 2

Public: OPTIONS, TRACE, GET, HEAD, DELETE, PUT, POST, COPY, MOVE, MKCOL, PROPFIN

D, PROPPATCH, LOCK, UNLOCK, SEARCH

Allow: OPTIONS, TRACE, GET, HEAD, COPY, PROPFIND, SEARCH, LOCK, UNLOCK

Cache-Control: private

20

OPTIONS (amazon.com)

[vishal@mail vishal]$ telnet amazon.com 80

Trying 208.216.181.15...

Connected to amazon.com.

Escape character is '^]'.

OPTIONS / HTTP/1.0CRLF

HTTP/1.1 200 OK

Date: Wed, 06 Sep 2000 19:16:41 GMT

Server: Stronghold/2.4.2 Apache/1.3.6 C2NetEU/2412 (Unix)

Content-Length: 0

Allow: GET, HEAD, POST, OPTIONS, TRACE

Connection: close

21

OPTIONS (netscape.com)

[vishal@mail vishal]$ telnet www.netscape.com 80

Trying 64.236.8.10...

Connected to www-jp.netscape.com.

Escape character is '^]'.

OPTIONS * HTTP/1.1

Host: www.netscape.com

CRLF

HTTP/1.1 200 OK

Server: Netscape-Enterprise/3.6 SP1

Date: Thu, 07 Sep 2000 10:53:18 GMT

Content-length: 0

Public: HEAD, GET, PUT, POST

22

OPTIONS (netscape.com)

[vishal@mail vishal]$ telnet www.netscape.com 80

Trying 64.236.8.10...

Connected to www-jp.netscape.com.

Escape character is '^]'.

OPTIONS /index.html HTTP/1.1

Host: www.netscape.com

CRLF

HTTP/1.1 200 OK

Server: Netscape-Enterprise/3.6 SP1

Date: Thu, 07 Sep 2000 10:45:54 GMT

Content-length: 0

Public: HEAD, GET, PUT, POST

23

GET

• Retrieves the information identified by the request URI.– static content (HTML file)

– dynamic content produced by CGI program

» passes arguments to CGI program in URI

• Can also act as a conditional retrieve when certain request headers are present:– If-Modified-Since

– If-Unmodified-Since

– If-Match

– If-None-Match

– If-Range

• Conditional GETs useful for caching

24

GET (www.iiit.net)[vishal@mail vishal]$ telnet www.iiit.net 80Trying 196.12.32.11...Connected to www.iiit.net.Escape character is '^]'.GET /index.htm HTTP/1.1Host: www.iiit.netCRLFHTTP/1.1 200 OKDate: Thu, 07 Sep 2000 12:02:11 GMTServer: Apache/1.3.9 (Unix) (Red Hat/Linux)Last-Modified: Thu, 07 Sep 2000 12:00:36 GMTETag: "3ab62-1a38-39b78364"Accept-Ranges: bytesContent-Length: 6712Content-Type: text/html

<html>

<head><title>IIIT HOME</title>

25

GET request to euro.ecom(Internet Explorer browser)

GET /test.html HTTP/1.1 Accept: */* Accept-Language: en-us Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98) Host: euro.ecom.cmu.edu Connection: Keep-AliveCRLF

26

GET response from euro.ecom

HTTP/1.1 200 OKDate: Thu, 22 Jul 1999 04:02:15 GMTServer: Apache/1.3.3 Ben-SSL/1.28 (Unix)Last-Modified: Thu, 22 Jul 1999 03:33:21 GMTETag: "48bb2-4f-37969101"Accept-Ranges: bytesContent-Length: 79Keep-Alive: timeout=15, max=100Connection: Keep-AliveContent-Type: text/htmlCRLF<html><head><title>Test page</title></head><body><h1>Test page</h1></html>

27

GET request to euro.ecom (Netscape browser)

GET /test.html HTTP/1.0Connection: Keep-AliveUser-Agent: Mozilla/4.06 [en] (Win98; I)Host: euro.ecom.cmu.eduAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*Accept-Encoding: gzipAccept-Language: enAccept-Charset: iso-8859-1,*,utf-8CRLF

28

GET response from euro.ecom

HTTP/1.1 200 OKDate: Thu, 22 Jul 1999 06:34:42 GMTServer: Apache/1.3.3 Ben-SSL/1.28 (Unix)Last-Modified: Thu, 22 Jul 1999 03:33:21 GMTETag: "48bb2-4f-37969101"Accept-Ranges: bytesContent-Length: 79Keep-Alive: timeout=15, max=100Connection: Keep-AliveContent-Type: text/htmlCRLF<html><head><title>Test page</title></head><body><h1>Test page</h1></html>

29

HEAD

• Returns same response header as a GET request would have...

• But doesn’t actually carry out the request.– Some servers don’t implement this properly.

– example: espn.com

• Useful for applications that– check for valid and broken links in Web pages.

– check Web pages for modifications.

30

Head (netscape.com)

[vishal@mail vishal]$ telnet www.netscape.com 80Trying 64.236.8.10...Connected to www-jp.netscape.com.Escape character is '^]'.HEAD / HTTP/1.1Host: www.netscape.com

HTTP/1.1 200 OKServer: Netscape-Enterprise/3.6Date: Thu, 07 Sep 2000 11:14:02 GMTSet-Cookie: UIDC=196.12.44.3:0968325242:189082;domain=.netscape.com;path=/;expires=31-Dec-2010 23:59:59 GMTContent-type: text/htmlConnection: close

31

HEAD (espn.com)

[vishal@mail vishal]$ telnet espn.com 80Trying 204.202.136.31...Connected to espn.com.Escape character is '^]'.HEAD / HTTP/1.1Host: www.espn.comCRFLHTTP/1.1 301 Document MovedServer: Microsoft-IIS/4.0Date: Thu, 07 Sep 2000 11:11:00 GMTLocation: http://espn.go.comContent-Type: text/html

<html>Is now part of the http://espn.go.com service<br></html>

Modern browserstransparently

connect to the newespn.go.com location

32

POST

• Another technique for producing dynamic content.

• Executes program identified in request URI (the CGI program).

• Passes arguments to CGI program in the message body– unlike GET, which passes the arguments in the URI itself.

• Responds with output of the CGI program.

33

TRACE

• returns contents of request header in response message body.

• HTTP’s version of an echo server.

• used for debugging.

34

Assignments

• Write a small web server and see what are the request send by different web browsers

• Telnet to port 80 of 100 internet sites and make a study of different types of web servers they run. Make groups of 6 students by roll numbers. Roll number 1-6 form group A, 7-12 form group B …. Each group will work on 100 sites beginning with the alphabet of their group, eg. group A will go to sites beginning with A eg. Amazon, AOL etc.