web services
DESCRIPTION
Web services. Clients and servers communicate using the HyperText Transfer Protocol (HTTP) Current version is HTTP/1.1 RFC 2616, June, 1999. HTTP request. web client (browser). web server. HTTP response. Static and dynamic content. - PowerPoint PPT PresentationTRANSCRIPT
1
Web services
webclient
(browser)
webserver
HTTP request
HTTP response
• Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
• Current version is HTTP/1.1– RFC 2616, June, 1999.
2
Static and dynamic content
• The content returned in HTTP responses can be either static or dynamic.
• Static content: – content stored in files and retrieved in response to an
HTTP request
» HTML files
» images
» audio clips
• Dynamic content:– content produced on-the-fly in response to an HTTP
request
» usually by CGI process executed by the server.
3
Backus-Naur Form (BNF)• Developed in late 50’s for FORTRAN
• BNF is a way to write grammars that specify computer languages, protocols, and command line arguments to programs (and many others)– BNF describes the legal strings for that language or
protocol
• The HTTP spec uses the following BNF constructs (among others):– “literal”
» literal text
– rule1 | rule2
» rule 1 or rule 2 is allowed
4
BNF (cont)
• HTTP BNF constructs– (rule1 rule2)
» sequence of elements: rule1 followed by rule2» (elem (foo | bar) elem)
– “elem foo elem” or “elem bar elem” are allowed.– <n>*<m>(element)
» at least n and at most m occurences of element.» default values are n=0 and m=infinity.» *(element) allows zero or more occurences of
element.» 1(element) requires at least one element.» 1*2(element) allows one or two.
– [rule]» rule is optional
5
BNF (cont)
• Basic rules– OCTET = <any 8-bit sequence of data>
– CHAR = <any ASCI I char>
– ALPHA = <A...Z and a...z>
– DIGIT = <0…9>
– SP = <ASCII SPACE character (32)>
– CRLF = <newline \n>
– TEXT = <printable ASCII chars>
6
URIs and URLs
• network resources are identified by Universal Resource Indicators (URIs)
• The most familiar is the absolute URI known as the HTTP URL:– http-url = “http:” “//” host [“:” port] [abs_path]– port defaults to “80”– abs_path defaults to “/”– abs_path ending in / defaults to …/index.html
• examples:– http://www.iiit.net:80/index.html– http://www.iiit.net/index.html– http://www.iiit.net
7
Uniform Resource Locators (URLs)
• The internet name of the site containing the resource
• The type of service (eg. HTTP, FTP)
• The Internet port number of this service.
• The location of the resource in the directory structure of the server.
8
http://www.address.edu:1234/path/subdir/file.ext
| | | | |
|service| | | |
|____ host _____| | |
| | |
|port| |
| file and |
|_ resource details _|
9
HTTP/1.1 messages
An HTTP message is either a Request or a Response:
HTTP-message = Request | Response
Requests and responses have the same basic form:
generic-message = start-line *message-header CRLF [message body]
start-line = Request-line | Status linemessage-header = field-name “:” [field value] CRLFmessage-body = <e.g., HTML file>
10
HTTP/1.1 requests
Request = Method SP Request-URI SP HTTP-VERSION CRLF *(general-header | request-header | entity header) CRLF [ message-body ]
• Method: tells the server what operation to perform– GET: retrieve file
– OPTIONS: retrieve server and access capabilities
• Request-URI: identifies the resource to manipulate– data file (HTML), executable file (CGI)
• headers: parameterize the method– Accept-Language: en-us – User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
• message-body: text characters
11
HTTP/1.1 responsesResponse = HTTP-Version SP Status-Code SP Reason-Phrase CRLF *(general-header | response-header | entity header) CRLF [ message-body ]
• Status code: 3-digit number
• Reason-Phrase: explanation of status code
• headers: parameterize the response– Date: Thu, 22 Jul 1999 23:42:18 GMT
– Server: Apache/1.2.5 BSDI3.0-PHP/FI-2.0
– Content-Type: text/html
• message-body:– file
12
How servers interpret Request-URIs
• GET / HTTP/1.1– resolves to home/html/index.html– action: retrieves index.html
• GET /index.html ...– home/html/index.html– action: retrieves index.html
• GET /foo.html ...– home/html/foo.html– action: retrieves foo.html
• GET /cgi-bin/test.pl …– home/cgi-bin/test.pl– action: runs test.pl
• GET http://www.iiit.net/index.html– home/html/index.html– action: retrieves index.html
home
cgi-bin html
test.pl index.html foo.html
13
Example HTTP/1.1 conversationgkgarg> telnet www.iiit.net 80
Trying 196.12.32.11...
Connected to www.iiit.net.
Escape character is '^]'.GET /index.htm HTTP/1.1 ;request lineHost: www.iiit.net ;request hdrCRLFHTTP/1.1 200 OK ;status line
Date: Wed, 06 Sep 2000 19:53:49 GMT ;response hdr
Server: Apache/1.3.9 (Unix) (Red Hat/Linux)
Last-Modified: Sat, 02 Sep 2000 06:42:15 GMT
ETag: "3ab62-19ee-39b0a147"
Accept-Ranges: bytes
Content-Length: 6638
Content-Type: text/html
<html> ;beginning of the message body 6638 bytes
<head>
<title>IIIT HOME</title>
Request sent by client
Response sent by server
14
MIME
• WWW uses MIME (Multipurpose Internet Mail Extension) types to define the type of a particular piece of information being send from a web server to a browser which in turn determines how the data should be treated.
15
OPTIONS
• Retrieves information about the server in general or resources on that server, without actually retrieving the resource.
• Request URIs:– if request URI = “*”, then the request is about the server
in general
» Is the server up?
» Is it HTTP/1.1 compliant?
» What brand of server?
» What OS is it running?
– if request URI != “*”, then the request applies to the options that available when accessing that resource:
» what methods can the client use to access the resource?
16
OPTIONS (www.iiit.net)
gkgarg> telnet www.iiit.net 80
Trying 196.12.32.11...
Connected to www.iiit.net.
Escape character is '^]'.
OPTIONS * HTTP/1.1Host: www.iiit.netCRLF
HTTP/1.1 200 OK
Date: Wed, 06 Sep 2000 19:57:24 GMT
Server: Apache/1.3.9 (Unix) (Red Hat/Linux)
Content-Length: 0
Allow: GET, HEAD, OPTIONS, TRACE
Request
Response
Host is a requiredheader in HTTP/1.1 but not in HTTP/1.0
17
OPTIONS (www.iitd.ernet.in)[vishal@mail vishal]$ telnet www.iitd.ernet.in 80
Trying 202.141.68.9...
Connected to www.iitd.ernet.in.
Escape character is '^]'.
OPTIONS /cgi-bin/ HTTP/1.1Host: www.iitd.ernet.inCRLF
HTTP/1.1 200 OK
Date: Wed, 06 Sep 2000 19:11:12 GMT
Server: Apache/1.3.12 (Unix) (Red Hat/Linux)
Content-Length: 0
Allow: GET, HEAD, POST, OPTIONS, TRACE
18
OPTIONS (microsoft.com)telnet microsoft.com 80
Trying 207.46.230.218...
Connected to microsoft.com.
Escape character is '^]'.
OPTIONS * HTTP/1.1Host: microsoft.comCRLF
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Wed, 06 Sep 2000 19:11:57 GMT
Content-Length: 0
Accept-Ranges: bytes
DASL: <DAV:sql>
DAV: 1, 2
Public: OPTIONS, TRACE, GET, HEAD, DELETE, PUT, POST, COPY, MOVE, MKCOL, PROPFIN
D, PROPPATCH, LOCK, UNLOCK, SEARCH
Allow: OPTIONS, TRACE, GET, HEAD, DELETE, PUT, POST, COPY, MOVE, MKCOL, PROPFIND
, PROPPATCH, LOCK, UNLOCK, SEARCH
Cache-Control: private
19
[vishal@mail vishal]$ telnet microsoft.com 80
Trying 207.46.230.219...
Connected to microsoft.com.
Escape character is '^]'.OPTIONS / HTTP/1.1Host: microsoft.comCRLFHTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Thu, 07 Sep 2000 10:44:48 GMT
MS-Author-Via: DAV
Content-Length: 0
Accept-Ranges: none
DASL: <DAV:sql>
DAV: 1, 2
Public: OPTIONS, TRACE, GET, HEAD, DELETE, PUT, POST, COPY, MOVE, MKCOL, PROPFIN
D, PROPPATCH, LOCK, UNLOCK, SEARCH
Allow: OPTIONS, TRACE, GET, HEAD, COPY, PROPFIND, SEARCH, LOCK, UNLOCK
Cache-Control: private
20
OPTIONS (amazon.com)
[vishal@mail vishal]$ telnet amazon.com 80
Trying 208.216.181.15...
Connected to amazon.com.
Escape character is '^]'.
OPTIONS / HTTP/1.0CRLF
HTTP/1.1 200 OK
Date: Wed, 06 Sep 2000 19:16:41 GMT
Server: Stronghold/2.4.2 Apache/1.3.6 C2NetEU/2412 (Unix)
Content-Length: 0
Allow: GET, HEAD, POST, OPTIONS, TRACE
Connection: close
21
OPTIONS (netscape.com)
[vishal@mail vishal]$ telnet www.netscape.com 80
Trying 64.236.8.10...
Connected to www-jp.netscape.com.
Escape character is '^]'.
OPTIONS * HTTP/1.1
Host: www.netscape.com
CRLF
HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.6 SP1
Date: Thu, 07 Sep 2000 10:53:18 GMT
Content-length: 0
Public: HEAD, GET, PUT, POST
22
OPTIONS (netscape.com)
[vishal@mail vishal]$ telnet www.netscape.com 80
Trying 64.236.8.10...
Connected to www-jp.netscape.com.
Escape character is '^]'.
OPTIONS /index.html HTTP/1.1
Host: www.netscape.com
CRLF
HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.6 SP1
Date: Thu, 07 Sep 2000 10:45:54 GMT
Content-length: 0
Public: HEAD, GET, PUT, POST
23
GET
• Retrieves the information identified by the request URI.– static content (HTML file)
– dynamic content produced by CGI program
» passes arguments to CGI program in URI
• Can also act as a conditional retrieve when certain request headers are present:– If-Modified-Since
– If-Unmodified-Since
– If-Match
– If-None-Match
– If-Range
• Conditional GETs useful for caching
24
GET (www.iiit.net)[vishal@mail vishal]$ telnet www.iiit.net 80Trying 196.12.32.11...Connected to www.iiit.net.Escape character is '^]'.GET /index.htm HTTP/1.1Host: www.iiit.netCRLFHTTP/1.1 200 OKDate: Thu, 07 Sep 2000 12:02:11 GMTServer: Apache/1.3.9 (Unix) (Red Hat/Linux)Last-Modified: Thu, 07 Sep 2000 12:00:36 GMTETag: "3ab62-1a38-39b78364"Accept-Ranges: bytesContent-Length: 6712Content-Type: text/html
<html>
<head><title>IIIT HOME</title>
25
GET request to euro.ecom(Internet Explorer browser)
GET /test.html HTTP/1.1 Accept: */* Accept-Language: en-us Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98) Host: euro.ecom.cmu.edu Connection: Keep-AliveCRLF
26
GET response from euro.ecom
HTTP/1.1 200 OKDate: Thu, 22 Jul 1999 04:02:15 GMTServer: Apache/1.3.3 Ben-SSL/1.28 (Unix)Last-Modified: Thu, 22 Jul 1999 03:33:21 GMTETag: "48bb2-4f-37969101"Accept-Ranges: bytesContent-Length: 79Keep-Alive: timeout=15, max=100Connection: Keep-AliveContent-Type: text/htmlCRLF<html><head><title>Test page</title></head><body><h1>Test page</h1></html>
27
GET request to euro.ecom (Netscape browser)
GET /test.html HTTP/1.0Connection: Keep-AliveUser-Agent: Mozilla/4.06 [en] (Win98; I)Host: euro.ecom.cmu.eduAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*Accept-Encoding: gzipAccept-Language: enAccept-Charset: iso-8859-1,*,utf-8CRLF
28
GET response from euro.ecom
HTTP/1.1 200 OKDate: Thu, 22 Jul 1999 06:34:42 GMTServer: Apache/1.3.3 Ben-SSL/1.28 (Unix)Last-Modified: Thu, 22 Jul 1999 03:33:21 GMTETag: "48bb2-4f-37969101"Accept-Ranges: bytesContent-Length: 79Keep-Alive: timeout=15, max=100Connection: Keep-AliveContent-Type: text/htmlCRLF<html><head><title>Test page</title></head><body><h1>Test page</h1></html>
29
HEAD
• Returns same response header as a GET request would have...
• But doesn’t actually carry out the request.– Some servers don’t implement this properly.
– example: espn.com
• Useful for applications that– check for valid and broken links in Web pages.
– check Web pages for modifications.
30
Head (netscape.com)
[vishal@mail vishal]$ telnet www.netscape.com 80Trying 64.236.8.10...Connected to www-jp.netscape.com.Escape character is '^]'.HEAD / HTTP/1.1Host: www.netscape.com
HTTP/1.1 200 OKServer: Netscape-Enterprise/3.6Date: Thu, 07 Sep 2000 11:14:02 GMTSet-Cookie: UIDC=196.12.44.3:0968325242:189082;domain=.netscape.com;path=/;expires=31-Dec-2010 23:59:59 GMTContent-type: text/htmlConnection: close
31
HEAD (espn.com)
[vishal@mail vishal]$ telnet espn.com 80Trying 204.202.136.31...Connected to espn.com.Escape character is '^]'.HEAD / HTTP/1.1Host: www.espn.comCRFLHTTP/1.1 301 Document MovedServer: Microsoft-IIS/4.0Date: Thu, 07 Sep 2000 11:11:00 GMTLocation: http://espn.go.comContent-Type: text/html
<html>Is now part of the http://espn.go.com service<br></html>
Modern browserstransparently
connect to the newespn.go.com location
32
POST
• Another technique for producing dynamic content.
• Executes program identified in request URI (the CGI program).
• Passes arguments to CGI program in the message body– unlike GET, which passes the arguments in the URI itself.
• Responds with output of the CGI program.
33
TRACE
• returns contents of request header in response message body.
• HTTP’s version of an echo server.
• used for debugging.
34
Assignments
• Write a small web server and see what are the request send by different web browsers
• Telnet to port 80 of 100 internet sites and make a study of different types of web servers they run. Make groups of 6 students by roll numbers. Roll number 1-6 form group A, 7-12 form group B …. Each group will work on 100 sites beginning with the alphabet of their group, eg. group A will go to sites beginning with A eg. Amazon, AOL etc.