1 http messages entities and encoding herng-yow chen

Post on 26-Dec-2015


1

HTTP Messages: Entities and Encoding

Herng-Yow Chen

2

Outline

The format and behavior of HTTP message entities as HTTP containers

How HTTP describes the size of entity bodies, and what HTTP requires in the way of sizing

The entity headers used to describe the format, alphabet, and language of content, so clients can process it properly

3

Reversible content encoding transforms data format to take up less space or be more secure

Transfer encoding modifies how HTTP ships data to enhance the communication of some kinds of data

Chunked encoding chops data into multiple pieces to deliver content of unknown length safely

4

An assortment of tags, labels, times, and checksums helps clients get the latest version of requested content

Ranges are useful for continuing aborted downloads where they left off

Delta encoding extensions allow clients to request just those parts of a web page that have actually changed since a previously viewed revision

5

Checksums of entity bodies are used to detect changes in entity content as it passes through proxies

6

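A message is made up of headers and a body

HTTP/1.0 200 OK
Server: Netscape-Enterprise/3.6
Date: Sun, 17 Sep 2000 00:01:05 GMT
Content-Type: text/plain
Content-Length: 18

Hi! I'm a message!

Content-Type and Content-Length are the entity headers, and "Hi! I'm a message!" is the entity body. Together, the entity headers and the entity body form the entity.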

7

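HTTP/1.1 (RFC 2616) defines 10 entity headers:

Content-Type, Content-Length, Content-Language, Content-Encoding, Content-Location, Content-Range, Content-MD5, Last-Modified, Expires, Allow

Two further headers are closely tied to entities: ETag (a response header) and Cache-Control (a general header).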

8

Entity Bodies

9

Why is Content-Length important?

Detecting truncation: the receiver can tell a complete body from one cut short by a connection that closed early.

An incorrect Content-Length is a problem in its own right, because the receiver misjudges where the body ends.

On a persistent connection, Content-Length tells the receiver where one entity body ends and the next message begins.

Chunked encoding is an alternative: the data is sent as a series of chunks, each with a specified chunk size.

When content encoding is applied, Content-Length refers to the encoded body, not the length of the original, unencoded body.
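To make the persistent-connection point concrete, here is a minimal sketch that splits two responses arriving back to back on one connection and flags a truncated body. The helper name `read_message` and the sample bytes are invented for illustration; the sketch ignores chunked encoding and the other framing cases a real parser has to handle.

```python
def read_message(buf: bytes):
    """Split one HTTP message off the front of buf using Content-Length.

    Returns (headers, body, rest). Raises ValueError if the body is truncated.
    Hypothetical helper: no chunked encoding, no header folding.
    """
    head, sep, after = buf.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header block")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the start line
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("ascii")
    length = int(headers.get("content-length", "0"))
    if len(after) < length:
        raise ValueError(f"truncated body: got {len(after)} of {length} bytes")
    # Everything past Content-Length bytes belongs to the next message.
    return headers, after[:length], after[length:]


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 18\r\n\r\n"
    b"Hi! I'm a message!"
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 2\r\n\r\n"
    b"ok"
)
_, first_body, rest = read_message(stream)
_, second_body, rest = read_message(rest)
```

Without the length, nothing in the byte stream marks where "Hi! I'm a message!" stops and the second status line starts.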

10

Entity Digest

Content-MD5

Is used to check message integrity.

Can also be used as a key into a hash table to quickly locate documents and reduce duplicate storage of content.
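A small sketch of both uses, assuming the RFC 1864 definition of the header value (the base64 encoding of the 128-bit MD5 digest of the body). The function names and the `store` dictionary are illustrative, not part of any HTTP library.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 value: base64 of the 128-bit MD5 digest of the body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_is_intact(body: bytes, header_value: str) -> bool:
    """Recompute the digest on the receiving side and compare it with the header."""
    return content_md5(body) == header_value.strip()


body = b"Hi! I'm a message!"
digest = content_md5(body)

# The same digest can serve as a hash-table key, so identical bodies
# fetched under different URLs are stored only once.
store = {}
store.setdefault(digest, body)
store.setdefault(content_md5(b"Hi! I'm a message!"), b"Hi! I'm a message!")
```

MD5 only guards against accidental modification; it is not a defense against deliberate tampering.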

11

Media type and Charset

Content-Type refers to the original entity body type, before encoding.

It supports optional parameters to further specify the content type.

Character encodings for text media:

Content-Type: text/html; charset=iso-8859-4
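As an illustration of how a client might use the charset parameter, the sketch below parses the header value with Python's standard `email.message.Message` class and decodes the body bytes accordingly. The helper name `parse_content_type` and the sample text are assumptions made for this example.

```python
from email.message import Message


def parse_content_type(value: str):
    """Return (media_type, charset) from a Content-Type header value.

    charset is None when the parameter is absent.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=iso-8859-4")

# The charset tells the receiver how to turn body bytes back into characters.
body = "café".encode(charset)
text = body.decode(charset)
```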

12

Common media types

Media type                      Description
text/html                       Entity body is an HTML document
text/plain                      Entity body is a document in plain text
image/gif                       Entity body is an image of type GIF
image/jpeg                      Entity body is an image of type JPEG
audio/x-wav                     Entity body contains WAV sound data
model/vrml                      Entity body is a three-dimensional VRML model
application/vnd.ms-powerpoint   Entity body is a Microsoft PowerPoint presentation
multipart/byteranges            Entity body has multiple parts, each containing a different range (in bytes) of the full document
message/http                    Entity body contains a complete HTTP message (see TRACE)

13

Multipart Media Types

MIME “multipart” email messages contain multiple messages stuck together and sent as a single, complex message.

Each component is self-contained, with its own headers describing its contents; the different components are concatenated together and delimited by a string.

HTTP also supports multipart bodies; however, they are used in only two cases: fill-in form submissions and range responses carrying pieces of a document.

14

Multipart Form Submissions

<form action="http://xxx/cgi" enctype="multipart/form-data" method="POST">
  <p>Your Name? <input type="text" name="submit-name"><br>
  Your File to send? <input type="file" name="files"><br>
  <input type="submit" value="send"> <input type="reset"></p>
</form>

15

If the user enters “John” and selects the text file “hello.txt”, the request carries:

Content-Type: multipart/form-data; boundary=AaBo3x

--AaBo3x
Content-Disposition: form-data; name="submit-name"

John
--AaBo3x
Content-Disposition: form-data; name="files"; filename="hello.txt"
Content-Type: text/plain

… contents of hello.txt …
--AaBo3x--
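The layout of such a body can be sketched in a few lines: each part starts with "--" plus the boundary, carries its own headers, and the body ends with the boundary followed by a closing "--". The function `build_form_data` is a hypothetical helper, and the file contents `b"hello\n"` are invented because the slide elides them.

```python
def build_form_data(boundary: str, fields, files) -> bytes:
    """Assemble a multipart/form-data body.

    fields: list of (name, value) strings
    files:  list of (name, filename, content_type, data) with data as bytes
    """
    delim = b"--" + boundary.encode("ascii")
    parts = []
    for name, value in fields:
        parts.append(
            delim + b"\r\n"
            + f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
            + value.encode() + b"\r\n"
        )
    for name, filename, ctype, data in files:
        parts.append(
            delim + b"\r\n"
            + f'Content-Disposition: form-data; name="{name}"; '
              f'filename="{filename}"\r\n'.encode()
            + f"Content-Type: {ctype}\r\n\r\n".encode()
            + data + b"\r\n"
        )
    # The closing delimiter is the boundary line with "--" appended.
    return b"".join(parts) + delim + b"--\r\n"


body = build_form_data(
    "AaBo3x",
    fields=[("submit-name", "John")],
    files=[("files", "hello.txt", "text/plain", b"hello\n")],  # made-up contents
)
```

A real client must also pick a boundary string that does not occur inside any part; this sketch does not check for that.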

16

If the user selects the text file “hello.txt” and a second file, the image “image.gif”:

Content-Type: multipart/form-data; boundary=AaBo3x

--AaBo3x
Content-Disposition: form-data; name="submit-name"

John
--AaBo3x
Content-Disposition: form-data; name="files"
Content-Type: multipart/mixed; boundary=BbC04y

--BbC04y
Content-Disposition: file; filename="hello.txt"
Content-Type: text/plain

… contents of hello.txt …
--BbC04y
Content-Disposition: file; filename="image.gif"
Content-Type: image/gif
Content-Transfer-Encoding: binary

… contents of image.gif …
--BbC04y--
--AaBo3x--

17

Multipart Range Response

HTTP/1.0 206 Partial Content
Server: Microsoft-IIS/5.0
Content-Location: http://xxx/hello.txt
Content-Type: multipart/x-byteranges; boundary=--[abcdefghik…z]--

----[abcdefghik…z]--
Content-Type: text/plain
Content-Range: bytes 0-174/1441

…. Part I content
----[abcdefghik…z]--
Content-Type: text/plain
Content-Range: bytes 1344-1440/1441

…. Part II content
----[abcdefghik…z]----

18

Content-Encoding

HTTP applications sometimes want to encode content before sending it, to help lessen the time it takes to transmit the data.

Content-Type is the type of the original format, before encoding.

Content-Length is the length of the encoded body.

19

Content Encoding

Figure: a gzip content encoder turns the original content (Content-Type: text/html, Content-Length: 17571) into content-encoded content (Content-Type: text/html, Content-Length: 5746, Content-Encoding: gzip). A gzip content decoder at the receiver restores the original content.
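The header bookkeeping can be sketched with Python's standard `gzip` module. The function names and the sample page are assumptions; the sizes 17571 and 5746 from the slide are not reproduced, since they depend on the actual document.

```python
import gzip


def content_encode(body: bytes, headers: dict):
    """Gzip the body; Content-Type is unchanged, Content-Length is the encoded size."""
    encoded = gzip.compress(body)
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    new_headers["Content-Length"] = str(len(encoded))
    return encoded, new_headers


def content_decode(body: bytes, headers: dict) -> bytes:
    """Undo the content encoding named in the headers (gzip or identity only)."""
    if headers.get("Content-Encoding", "identity") == "gzip":
        return gzip.decompress(body)
    return body


original = b"<html><body>" + b"<p>hello</p>" * 1000 + b"</body></html>"
headers = {"Content-Type": "text/html", "Content-Length": str(len(original))}
encoded, enc_headers = content_encode(original, headers)
restored = content_decode(encoded, enc_headers)
```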

20

Content-encoding tokens

Content-encoding value   Description
gzip                     Using the GNU zip encoding (RFC 1952)
compress                 Using the UNIX file compression program
deflate                  Using zlib format (RFC 1950) for deflate compression (RFC 1951)
identity                 No encoding has been performed. When a Content-Encoding header is not present, this can be assumed.

21

Accept-Encoding Headers

Request message (client to server):

GET /logo.gif HTTP/1.1
Accept-Encoding: gzip
[…]

Response message (server to client):

HTTP/1.1 200 OK
Content-Type: image/gif
Content-Encoding: gzip
[…]

The server compresses the image with gzip to transport a smaller file over the thin network connection between itself and the client. This saves network bandwidth and reduces the amount of time that the client waits for the transfer, though the client has to spend time decompressing the image once it is served.

22

Clients can indicate preferred encodings by attaching q values:

Accept-Encoding: compress, gzip
Accept-Encoding:
Accept-Encoding: *
Accept-Encoding: compress;q=0.5, gzip;q=1.0
Accept-Encoding: gzip;q=1.0, identity;q=0.5, *;q=0
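One way a server might act on these headers is sketched below. The function names, the `supported` list, and the tie-breaking rule (first supported coding wins) are choices made for this example; the tiny fallback q value for an unlisted identity coding is likewise an assumption standing in for "acceptable, but only as a last resort".

```python
def parse_accept_encoding(value: str) -> dict:
    """Map each coding in an Accept-Encoding value to its q value (default 1.0)."""
    prefs = {}
    for item in value.split(","):
        coding, _, params = item.strip().partition(";")
        coding = coding.strip().lower()
        if not coding:
            continue
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            q = float(params[2:])
        prefs[coding] = q
    return prefs


def choose_encoding(value: str, supported=("gzip", "compress", "identity")):
    """Pick the supported coding with the highest q, or None if none is acceptable."""
    prefs = parse_accept_encoding(value)
    best, best_q = None, 0.0
    for coding in supported:
        if coding in prefs:
            q = prefs[coding]
        elif "*" in prefs:
            q = prefs["*"]
        else:
            # Unlisted codings are not acceptable, except identity, which
            # stays acceptable as a last resort (0.001 is this sketch's choice).
            q = 0.001 if coding == "identity" else 0.0
        if q > best_q:
            best, best_q = coding, q
    return best
```

With the last header line above, `choose_encoding` settles on gzip, and a request that excludes everything with q=0 yields `None`, which a server could answer with an error status.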

23

Transfer Encoding

Content encodings transform the entity content itself, to save space or for security reasons, and are tightly associated with the content format.

In comparison, transfer encodings are applied for architectural reasons and are independent of the content format.

24

Content encoding vs. transfer encoding

Content-encoded response: a normal header block followed by a normal entity that is just encoded.

HTTP/1.0 200 OK
Content-Encoding: gzip
Content-Type: text/html
[…]
[encoded message]

Transfer-encoded response: a basic header followed by encoded blocks.

HTTP/1.1 200 OK
Transfer-Encoding: chunked
[encoded blocks]

A content-encoded message just encodes the entity section of the message. With transfer-encoded messages, the encoding is a function of the entire message, changing the structure of the message itself.

25

Transfer-Encoding Headers

TE: used in the request header to tell the server what extension transfer encodings are okay to use.

Transfer-Encoding: used in the response header to tell the receiver (client) what encoding has been performed on the message.

26

Example

Request:

GET /1.html HTTP/1.1
Host: www.csie.ncnu.edu.tw
User-Agent: Mozilla/4.61
TE: trailers, chunked

Response:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Server: Apache/3.0

27

Chunked Encoding

28

Chunked Encoding (continued)

Chunking and Persistent connection

Trailers in chunked messages

Combining Content and Transfer Encoding
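The chunk framing itself is simple enough to sketch: each chunk is a hexadecimal size line, the chunk data, and a CRLF, and a zero-size chunk ends the body. The function names, the default chunk size, and the sample page are made up for this example, and the decoder skips chunk extensions and does not read trailers. The last lines combine the two encodings: the entity is gzipped first (content encoding) and the result is then chunked (transfer encoding).

```python
import gzip


def encode_chunked(body: bytes, chunk_size: int = 8) -> bytes:
    """Frame body as chunks, then append the zero-size last chunk."""
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, no trailer headers
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body; chunk extensions are skipped, trailers are not read."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


# Content encoding first, transfer encoding second; the receiver undoes
# them in the opposite order.
page = b"<html>hello</html>" * 50
wire = encode_chunked(gzip.compress(page), chunk_size=64)
received = gzip.decompress(decode_chunked(wire))
```

Because the sender never has to know the total size up front, it can start transmitting a generated or compressed body before the whole thing exists, and a persistent connection still knows where the message ends.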

29

Combining Content and Transfer Encodings

Figure: the original content (Content-Type: text/html) is first content-encoded with gzip (Content-Type: text/html, Content-Encoding: gzip). The encoded entity is then transfer-encoded by chunking, and the chunks are sent with Content-Type: text/html, Content-Encoding: gzip, and Transfer-Encoding: chunked.

30

Time-Varying Instance

Web objects usually are not static.

The same URL can, over time, point to different versions of an object.

Consider, for example, the website of any media company, such as CNN or the BBC.

31

Time-Varying Instances

32

Validators and Freshness

In the previous CNN example, the client got the initial resource V1 and can cache this copy, but for how long?

Once the document has “expired” at the client, it must request a fresh copy from the server.

The client uses a “conditional request” to tell the server, by means of a validator, which version it currently has, and asks for a copy to be sent only if its current copy is no longer valid.

33

Cache-Control header directives

Directive        Message type
no-cache         Request
no-store         Request
max-age          Request
min-fresh        Request
no-transform     Request
only-if-cached   Request
public           Response
private          Response

34

Cache-Control header directives

Directive          Message type
no-cache           Response
no-store           Response
no-transform       Response
must-revalidate    Response
proxy-revalidate   Response
max-age            Response
s-maxage           Response

35

Conditional request types

Request type          Validator
If-Modified-Since     Last-Modified
If-Unmodified-Since   Last-Modified
If-Match              ETag
If-None-Match         ETag
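A rough sketch of how a server could evaluate the two validators most often used for cache revalidation. The function name, the ETag values, and the decision to check only If-None-Match and If-Modified-Since are assumptions for this example; weak ETags and the other two request types are left out.

```python
from email.utils import parsedate_to_datetime


def evaluate_conditional(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, else 200.

    etag and last_modified describe the server's current version.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        tags = [t.strip() for t in if_none_match.split(",")]
        return 304 if "*" in tags or etag in tags else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        # Not modified if the current version is no newer than the client's date.
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
            return 304
    return 200


current_etag = '"v2"'  # hypothetical validator for the current version
current_date = "Mon, 18 Sep 2000 00:00:00 GMT"

stale = evaluate_conditional({"If-None-Match": '"v1"'}, current_etag, current_date)
fresh = evaluate_conditional({"If-None-Match": '"v2"'}, current_etag, current_date)
```

A 304 Not Modified response carries no body, so the client keeps using its cached copy and only the headers cross the network.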

36

Range Request

HTTP allows clients to request just a part, or range, of a document.

Applications:

Requesting a RoI (Region of Interest)

Media indexing and access

Streaming applications

37

Range Requests

Request message (client to www.csie.ncnu.edu.tw):

GET /bigfile.html HTTP/1.1
[…]

Response message:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 65537
Accept-Ranges: bytes
[…]

Range request message (client to www.csie.ncnu.edu.tw):

GET /bigfile.html HTTP/1.1
Range: bytes=20224-
[…]

Range response message:

HTTP/1.1 206 Partial Content
Content-Type: text/html
Content-Range: bytes 20224-65536/65537
Accept-Ranges: bytes
[…]

The client's original request was interrupted, but a second request for the part of the message that was not received allows the client to resume from the point of the interruption.

38

Delta Encoding

An extension to the HTTP protocol that optimizes transfer by communicating changes instead of entire objects.

RFC 3229 describes delta encoding.

39

Delta Encoding

40

Delta Encoding

41

Delta-encoding headers

ETag, If-None-Match, A-IM, IM, Delta-Base

42

IANA registered types of instance manipulations

Type       Description
vcdiff     Delta using the vcdiff algorithm
diffe      Delta using the Unix diff -e command
gdiff      Delta using the gdiff algorithm
gzip       Compression using the gzip algorithm
deflate    Compression using the deflate algorithm
range      Used in a server response to indicate that the response is partial content as the result of a range selection
identity   Used in a client request's A-IM header to indicate that the client is willing to accept an identity instance manipulation

43

For More Information

http://www.ietf.org/rfc/rfc2616.txt Hypertext Transfer Protocol -- HTTP/1.1

http://www.ietf.org/rfc/rfc3229.txt Delta Encoding in HTTP

http://www.ietf.org/rfc/rfc1521.txt MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies

http://www.ietf.org/rfc/rfc2045.txt Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies

http://www.ietf.org/rfc/rfc1864.txt The Content-MD5 Header Field

http://www.ietf.org/rfc/rfc3230.txt Instance Digests in HTTP
