SPL/2010SPL/2010
Application Level Protocol Design
● atomic units used by protocol: "messages"
● encoding
● reusable, protocol-independent TCP server
● LinePrinting protocol implementation
Protocol Definition
● set of rules, governing the communication details between two parties (processes)
● different forms and levels:
  ● protocols for exchanging bits across a wire
  ● protocols governing administration of supercomputers
  ● application level protocols - define interaction between computer applications
Protocol Communication Rules
● syntax : how do we phrase the information we exchange.
● semantics : what actions/response for information received.
● synchronization : whose turn it is to speak (given the above defined semantics).
Protocols Skeleton
● all protocols follow a simple skeleton.
● exchange information using messages, which define the syntax.
● difference between protocols: syntax used for messages, and semantics of protocol.
Protocol Initialization (hand-shake)
● communication begins when party sends initiation message to other party.
● synchronization - each party sends one message in a round robin fashion.
TCP 3-Way Handshake
● Establish / tear down TCP socket connections
● computers attempting to communicate can negotiate network TCP socket connection
● both ends can initiate and negotiate separate TCP socket connections at the same time
● A sends a SYNchronize packet to B
● B receives A's SYN
● B sends a SYNchronize-ACKnowledgement
● A receives B's SYN-ACK
● A sends ACKnowledge
● B receives ACK.
● TCP socket connection is ESTABLISHED.
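The exchange above is carried out by the operating system's TCP stack, not by application code. A minimal Java sketch, assuming a loopback connection is available on the machine: the `Socket` constructor returns only once the connection is established. The class and method names (`HandshakeDemo`, `connectOnLoopback`) are made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class HandshakeDemo {
    // Opens a listening socket on a free loopback port, connects to it,
    // and reports whether both ends see an established connection.
    static boolean connectOnLoopback() {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             // Party A: the constructor blocks until the handshake has completed.
             Socket a = new Socket(listener.getInetAddress(), listener.getLocalPort());
             // Party B: accept() hands over the already established connection.
             Socket b = listener.accept()) {
            return a.isConnected() && b.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("established: " + connectOnLoopback());
    }
}
```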
HTTP (Hyper Text Transfer Protocol)
● exchanging special text files over the network.
● brief (not complete) protocol description:
  ● synchronization: client initiates connection, sends single request, receives reply from server.
  ● syntax: text based, see RFC 2616.
  ● semantics: server either sends to the client the page asked for, or returns an error.
What next?
● syntax and semantics aspects of protocols.
● assume: synchronization works in round robin, i.e., each party sends one message at a time.
Message Format
● Protocol syntax: message is the atomic unit of data exchanged throughout the protocol.
● message = letter
● concentrate on the delivery mechanism.
Framing
● streaming protocols - TCP
● separate between different messages
  ● all messages are sent on the same stream, one after the other,
  ● receiver should distinguish between different messages.
● Solution: message framing - taking the content of the message, and encapsulating it in a frame (letter - envelope).
Framing – what is it good for?
● sender and receiver agree on the framing method beforehand
● framing is part of message format/protocol
● enable receiver to discover in a stream of bytes where message starts/ends
Framing – how?
● Simple framing protocol for strings:
  ● special FRAMING character (e.g., a line break).
  ● each message is framed by two FRAMING characters at beginning and end.
  ● message will not contain a FRAMING character
● framing protocol by adding a special tag at start and end.
  ● message can be framed using <begin> / <end> strings.
  ● avoid having <begin> / <end> in message body.
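The single-character scheme can be sketched in Java as follows. This is an illustration only: the class `DelimiterFraming`, its method names, and the choice of a line break as the FRAMING character are assumptions, not part of the course code.

```java
import java.util.ArrayList;
import java.util.List;

class DelimiterFraming {
    // Assumed FRAMING character; it must never occur inside a message.
    static final String FRAMING = "\n";

    // Sender side: put a FRAMING character before and after the content.
    static String frame(String message) {
        if (message.contains(FRAMING)) {
            throw new IllegalArgumentException("message contains the FRAMING character");
        }
        return FRAMING + message + FRAMING;
    }

    // Receiver side: extract every complete frame from the characters received
    // so far. An incomplete trailing frame stays in the buffer for later.
    static List<String> unframe(StringBuilder received) {
        List<String> messages = new ArrayList<>();
        while (true) {
            int start = received.indexOf(FRAMING);
            if (start < 0) break;
            int end = received.indexOf(FRAMING, start + 1);
            if (end < 0) break;
            messages.add(received.substring(start + 1, end));
            received.delete(0, end + 1);
        }
        return messages;
    }
}
```

The receiver keeps appending incoming characters to the same buffer and calls `unframe` again, so a message split across two reads is delivered once its closing FRAMING character arrives.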
Framing – how?
● framing protocol by employing a variable length message format
  ● special tag to mark start of a frame
  ● message contains information on message's length
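A possible Java sketch of such a variable length format, under stated assumptions: the start tag is the byte `0x02`, the length is a 4-byte integer, and the body is UTF-8 text. None of these choices come from the slides; `LengthPrefixedFraming` is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedFraming {
    static final byte START_TAG = 0x02; // assumed start-of-frame marker

    // Frame layout: [START_TAG][4-byte body length][body bytes]
    static byte[] frame(String message) {
        try {
            byte[] body = message.getBytes(StandardCharsets.UTF_8);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(START_TAG);
            out.writeInt(body.length);
            out.write(body);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads exactly one frame; the body may contain any byte, including START_TAG.
    static String readFrame(DataInputStream in) {
        try {
            if (in.readByte() != START_TAG) {
                throw new IOException("missing start tag");
            }
            byte[] body = new byte[in.readInt()];
            in.readFully(body);
            return new String(body, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the receiver knows how many bytes to read, the body needs no forbidden characters, which is the main advantage over delimiter-based framing.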
Textual data
● Many protocols exchange data in textual form
  ● strings of characters, in a character encoding (e.g., UTF-8)
  ● very easy to document/debug - print messages
● Limitation: difficult to send non-textual data.
  – how do we send a picture? video? audio file?
Binary Data
● non-textual data is called binary data.
● all data is eventually encoded in "binary" format, as a sequence of bits
● "binary data" = data that cannot be encoded as a readable string of characters
Binary Data
● Sending binary data in raw binary format in a stream protocol is dangerous.
  ● may contain any byte sequence, may corrupt framing protocol.
● One option: devising a variable length message format.
Base64 Encoding Binary Data
● encode binary data using an encoding algorithm
● Base64 encoding - encodes binary data into a string
  ● Converts every 3-byte sequence from the binary data into 4 ASCII characters.
  ● used by many "standard" protocols (e.g., email, to encode file attachments of any type of data).
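As an illustration, the standard `java.util.Base64` class (available since Java 8) performs this conversion; standard Base64 maps each group of 3 input bytes to 4 ASCII characters. The wrapper class `Base64Demo` is a made-up name for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Demo {
    // Binary data in, printable ASCII text out.
    static String encode(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary);
    }

    // ASCII text in, the original binary data out.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Three input bytes become four output characters.
        byte[] threeBytes = "Man".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encode(threeBytes)); // prints TWFu

        // Bytes that would be unsafe in a text protocol survive the round trip.
        byte[] binary = {0x00, (byte) 0xFF, '\n'};
        byte[] restored = decode(encode(binary));
        System.out.println(java.util.Arrays.equals(binary, restored)); // prints true
    }
}
```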
Encoding using Poco
● In C++, Poco library includes module for encoding/decoding byte arrays into/from Base64 encoded ASCII data.
● functionality is modeled as a stream "filter"
  ● performs encode/decode on all data flowing through the stream
  ● classes Base64Encoder / Base64Decoder.
Encoding in Java
● iharder library.
● modeled as stream filters (wrappers around Input/Output Java streams).
Encoding binary data
● advantage: any stream of bytes can be "framed" as ASCII data regardless of character encoding used by protocol.
● disadvantage: size of the message increases by about 33% (4 output characters for every 3 input bytes).
● (we will use UTF-8 encoding scheme)
Protocol and Server Separation
code reuse is one of our design goals!
● generic implementation of server, which handles all communication details
● generic protocol interface:
  ● handles incoming messages
  ● implements protocol's semantics
  ● generates the reply messages.
Protocol-Server Separation: protocol object
● protocol object is in charge of implementing expected behavior of our server:
  ● What actions should be performed upon the arrival of a request.
  ● requests may be correlated one to another, meaning protocol should save an appropriate state per client.
Example: authenticated session
● protocols require user authentication (login),
● only authorized users can perform certain actions.
● protocol is stateful - serving requests of client can be in at least 2 distinct states:
1. authenticated (user has already logged in)
2. non-authenticated (user has not provided login).
● behavior of the protocol object depends on its current state
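A minimal Java sketch of such a stateful protocol object. The class name `AuthenticatedProtocol`, the `login <name>` command, and the reply strings are all hypothetical, chosen only to show how the same request is answered differently in the two states.

```java
class AuthenticatedProtocol {
    // Per-client state: one protocol object is created for each connection.
    private boolean authenticated = false;

    // Handles one request and returns the reply for the client.
    String processMessage(String msg) {
        if (msg.startsWith("login ")) {
            // Hypothetical login command; a real protocol would check credentials here.
            authenticated = true;
            return "welcome " + msg.substring("login ".length());
        }
        if (!authenticated) {
            // State 2: non-authenticated, every other request is refused.
            return "error: login required";
        }
        // State 1: authenticated, the request is served.
        return "ok: " + msg;
    }
}
```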
Protocol and Server Separation
separate different tasks server must perform.
● Accept new connections from new clients.
● Receive new bytes from connected clients.
● Parse incoming bytes from clients into messages ("de-serialization" / "unframing").
● Dispatch message to right method on server side to execute requested operation.
● Send back an answer to a connected client after an action has been executed.
● The key participants in this architecture are:
  ● Tokenizer - syntax, tokenizing a stream of data into messages.
  ● MessagingProtocol - semantics, handling received messages and generating responses.
● implementations of interfaces:
  ● generic server
  ● MessageTokenizer
  ● LinePrintingProtocol
Interfaces
● implement separation between protocol and server. Define:
1. message (can be encoded in various ways: Base64, XML, text).
● Our messages encoded as plain UTF-8 text.
2. framing of messages - delimiters between messages sent in stream.
3. protocol interface which handles each individual message.
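The three definitions could take roughly the following shape in Java. The interface names `Tokenizer` and `MessagingProtocol` appear in these slides; the method names and the `EchoProtocol` example are assumptions made for this sketch, and messages are plain `String` values.

```java
// Framing: turns the incoming stream into whole messages.
interface Tokenizer {
    // Returns the next complete message, or null when the stream has ended.
    String nextMessage();
}

// Semantics: handles each individual message.
interface MessagingProtocol {
    // Processes one message and returns the reply (null for no reply).
    String processMessage(String msg);

    // True once the protocol has decided the connection should end.
    boolean shouldClose();
}

// Smallest possible protocol: replies with the message it received
// and asks to close the connection after "bye".
class EchoProtocol implements MessagingProtocol {
    private boolean closing = false;

    @Override
    public String processMessage(String msg) {
        closing = msg.equals("bye");
        return msg;
    }

    @Override
    public boolean shouldClose() {
        return closing;
    }
}
```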
ConnectionHandler
● server accepted new connection from client.
● server creates ConnectionHandler - will handle all incoming messages from this client.
● ConnectionHandler - maintains state of connection for specific client
  ● Ex: user performs "login" - ConnectionHandler object remembers this in its state
ConnectionHandler - Socket
● ConnectionHandler has access to Socket connecting server to client process.
● TCP server - Socket connection is viewed as a pair of InputStream and OutputStream.
  ● streams of bytes - client and the server exchange a bunch of bytes.
Tokenizer - in charge of parsing a stream of bytes into a stream of messages
● Tokenizer interface: filter between Socket input stream and protocol
● Protocol accesses the input stream only through the tokenizer.
● instead of "seeing" a stream of bytes, it sees a stream of messages.
● Many libraries model such "filters" on streams as wrappers around a lower-level stream.
● OutputStreamWriter - wraps a byte stream and encodes the characters written to it into bytes, using a given character encoding
● BufferedReader - adds a layer of buffering around a non-buffered input stream.
Tokenizer
● splits incoming bytes from the socket into messages.
● For simplicity, we model the Tokenizer as an iterator…
● protocol will see the input stream from the socket as an iterator over messages (instead of an iterator over bytes).
Messaging Protocol
● protocol interface
● wraps together: socket and Tokenizer
● Pass incoming messages to MessagingProtocol - execute action requested by client.
  ● look at the message and decide on action
  ● decision may depend on the state
● Once the action is performed - answer back from the MessagingProtocol.
● We use a String to pass data from Tokenizer to Protocol, and back from Protocol.
● Serialization/Deserialization (encode/decode parameters to/from Strings) performed by Protocol - and not by the Tokenizer.
  ● Tokenizer is only in charge of deframing (split bytes into messages).
Connection Handler
● active object:
  ● handles one connection to one client for the whole period during which the client is connected
  ● (from the moment the connection is accepted, until one of the sides decides to close it).
● modeled as a Runnable class.
Connection Handler
● holds references to:
  ● TCP socket connected to the client,
  ● Tokenizer
  ● an instance of the MessagingProtocol.
● connection handler is generic, works with any implementation of a messaging protocol.
● assumes data exchanged between client and server is in form of encoded strings
● encoder passed to constructor as an Encoder interface.
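A simplified sketch of such a handler, not the course implementation. To stay self-contained it takes the socket's two streams directly (in a server they would come from `socket.getInputStream()` and `socket.getOutputStream()`), fixes the encoding to UTF-8 instead of taking an Encoder, and does the `'\0'` deframing inline instead of using a separate Tokenizer. The `MessagingProtocol` method names are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

interface MessagingProtocol {
    String processMessage(String msg); // reply for the client, or null
    boolean shouldClose();             // true when the connection should end
}

class ConnectionHandler implements Runnable {
    private static final char FRAMING = '\0';
    private final InputStream in;
    private final OutputStream out;
    private final MessagingProtocol protocol;

    ConnectionHandler(InputStream in, OutputStream out, MessagingProtocol protocol) {
        this.in = in;
        this.out = out;
        this.protocol = protocol;
    }

    // Serves one client until the stream ends or the protocol asks to close.
    @Override
    public void run() {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            StringBuilder message = new StringBuilder();
            int c;
            while (!protocol.shouldClose() && (c = reader.read()) != -1) {
                if (c != FRAMING) {
                    message.append((char) c); // still inside the current message
                    continue;
                }
                String reply = protocol.processMessage(message.toString());
                message.setLength(0);
                if (reply != null) {
                    writer.write(reply);
                    writer.write(FRAMING);
                    writer.flush();
                }
            }
        } catch (IOException e) {
            // The connection broke; there is nothing left to serve.
        }
    }
}
```

The handler never inspects message content: any protocol object implementing the interface can be plugged in unchanged.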
What’s left?
● only need to implement:
  ● specific framing handler (tokenizer)
  ● specific protocol we wish to use.
● continue our line printing example…
Message Tokenizer
● we use a framing method based on a single character delimiter.
● assume stream of messages, delimited by a FRAMING character; we will use the character '\0'
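One way such a tokenizer could look in Java, as a sketch rather than the course's MessageTokenizer: the class name `NullDelimitedTokenizer` and the method `nextMessage` are assumptions, and the bytes are decoded as UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class NullDelimitedTokenizer {
    private static final char FRAMING = '\0';
    private final Reader reader;

    NullDelimitedTokenizer(InputStream in) {
        // Decode the incoming bytes as UTF-8 characters.
        this.reader = new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    // Returns the next complete message, or null if the stream ends
    // before another FRAMING character is seen.
    String nextMessage() {
        StringBuilder message = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (c == FRAMING) {
                    return message.toString();
                }
                message.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null; // connection closed mid-message or with no more data
    }
}
```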
● important part is connection termination and exception handling at any moment
● most of the code in low-level input/output and socket manipulation relates to error handling and connection termination.
Line Printing Protocol
● implement a specific protocol on the server side.
  ● when it receives a message, prints it on the server side screen and adds a line number.
  ● line number is the state of the protocol.
● each client has its own line number. Two clients connected at the same time each see their own line number.
● when protocol processes a message, it sends back a message to client: ": printed" + date-time value when the message was processed (on the server side).
● timestamp acknowledgments.
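A sketch of the behavior described above. The exact layout of the printed line and of the reply is an assumption (the slides only fix the ": printed" text and the timestamp), as are the method names.

```java
import java.time.LocalDateTime;

class LinePrintingProtocol {
    // The protocol state: one counter per protocol object, i.e. per client.
    private int lineNumber = 0;

    // Prints the message on the server console with its line number and
    // returns a timestamped acknowledgment for the client.
    String processMessage(String msg) {
        lineNumber++;
        System.out.println(lineNumber + ". " + msg);
        // Assumed reply layout: "<line number>: printed <server date-time>"
        return lineNumber + ": printed " + LocalDateTime.now();
    }

    int getLineNumber() {
        return lineNumber;
    }
}
```

Creating one `LinePrintingProtocol` instance per connection is what gives every client its own numbering.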
A Client
● before ConnectionHandler, review code of compatible TCP client for protocol we have just described.
● no new idea - it is similar to the TCP client we have reviewed in the previous section.
Concurrency Models of TCP Servers
Server quality criteria:
● Scalability: capability to serve a large number of concurrent clients.
● Low accept latency: acceptance wait time
● Low reply latency: reply wait time after message received.
● High efficiency: use little resources on the server (RAM, number of threads, CPU usage).
● model the concurrency model of the server,
● define an interface which controls how each connection handler is executed concurrently
56
● Given:
  ● Encoder
  ● Tokenizer
  ● Protocol
  ● ServerConcurrencyModel
● defined the MessagingServer
● To obtain good quality, a TCP server will most often use multiple threads.
● 3 simple models of concurrent servers
● 3 implementations of the ServerConcurrencyModel interface
Server Model 1: Single Thread
● 1 thread for:
  ● accepting a new client
  ● handling requests, by applying run method of the passive ConnectionHandler object.
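A sketch of this model, assuming the ServerConcurrencyModel interface has a single method, here called `apply` (the method name and the class name `SingleThread` are assumptions). The handler is treated as a passive object: its `run` method is called directly on the accepting thread.

```java
interface ServerConcurrencyModel {
    // Decides on which thread the given connection handler runs.
    void apply(Runnable connectionHandler);
}

class SingleThread implements ServerConcurrencyModel {
    @Override
    public void apply(Runnable connectionHandler) {
        // No new thread: the caller (the accept loop) is blocked
        // until this client has been served completely.
        connectionHandler.run();
    }
}
```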
Single Thread Model: Quality
● no scalability: at any given moment, it can serve at most one client.
● high accept latency: a second client must wait until first client disconnects
● low reply latency: all resources are concentrated on serving one client.
● Good efficiency: server uses exactly the resources needed to serve one client
When is model appropriate?
● time to process a full connection from one client is guaranteed to remain small.
● Example: server provides date and time value on the server machine.
  ● sends one string to the client then disconnects.
Server Model 2: Thread per Client
● assigns a new thread to each connected client, by wrapping the runnable ConnectionHandler object in a Thread and invoking its 'start' method.
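A sketch of this model, assuming the ServerConcurrencyModel interface has a single method, here called `apply` (the method name and the class name `ThreadPerClient` are assumptions).

```java
interface ServerConcurrencyModel {
    // Decides on which thread the given connection handler runs.
    void apply(Runnable connectionHandler);
}

class ThreadPerClient implements ServerConcurrencyModel {
    @Override
    public void apply(Runnable connectionHandler) {
        // One new thread per client: start() returns immediately,
        // so the accept loop can go back to waiting for the next client.
        new Thread(connectionHandler).start();
    }
}
```

Note that nothing here limits how many threads are created; each call adds one more.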
Model Quality: Scalability
● server can serve several concurrent clients, up to max threads running in the process.
  ● RAM of the host is used - each thread allocates a stack and thus consumes RAM
  ● Approx. 500 - 1000 threads become active in a single process.
  ● process does not defend itself - keeps creating new threads - dangerous for the host.
Model Quality: Latency
● Low accept latency: time from one accept to the next ~ time to create a new thread
  ● short compared to delay in incoming client connections.
● Reply latency: resources of the server are spread among concurrent connections.
  ● reasonable number of active connections (~hundreds), load requested relatively low in CPU and RAM,
Model Quality: Efficiency
● Low efficiency: server creates full thread per connection,
  – connection may be bound to Input/Output operations.
  – ConnectionHandler thread will be blocked waiting for IO, yet still uses the resources of the thread (RAM and Thread).
● Reactor architecture …
Server Model 3: Constant Number of Threads
● constant number of 10 threads (given by the Executor interface of Java)
● adding runnable ConnectionHandler object to task queue of a thread pool executor
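A sketch of this model using `Executors.newFixedThreadPool` from `java.util.concurrent`. The single-method shape of ServerConcurrencyModel, the method name `apply`, the class name `FixedThreadPool`, and the `shutdown` helper are assumptions made for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

interface ServerConcurrencyModel {
    // Decides on which thread the given connection handler runs.
    void apply(Runnable connectionHandler);
}

class FixedThreadPool implements ServerConcurrencyModel {
    private final ExecutorService executor;

    FixedThreadPool(int numberOfThreads) {
        // The pool never grows beyond numberOfThreads worker threads.
        this.executor = Executors.newFixedThreadPool(numberOfThreads);
    }

    @Override
    public void apply(Runnable connectionHandler) {
        // The handler is queued; it starts as soon as a pool thread is free.
        executor.execute(connectionHandler);
    }

    // Lets already queued handlers finish, then stops the worker threads.
    void shutdown() {
        executor.shutdown();
    }
}
```

With `new FixedThreadPool(10)`, the eleventh concurrent client waits in the queue until one of the ten running handlers finishes.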
Model Quality
● avoids server causing host crash when too many clients connect at the same time
● up to N concurrent client connections - server behaves as "thread-per-connection"
● above N, accept latency will grow
● scalability is limited to amount of concurrent connections we believe we can support.