usb protocol spec

8/3/2019 USB Protocol Spec


USB Protocol Specification

Introduction

This chapter deals with how the data flow is organised between a computer equipped with a USB hub and the various USB devices connected to it. USB is a serial bus, meaning that there is only one line transmitting signals, so only one bit can be sent at any instant. Therefore, in order to provide error checking and flow control and to synchronize the devices, information is organised into packets and frames. This, in turn, requires a standard header and tail for each packet, delimiting the portion of data between them. Since all types of information are exchanged using packets, they must be differentiated according to their function and the version of the protocol. Dividing each packet into a number of fields whose meaning is defined by the specification allows easy identification and unambiguous interpretation.

Compatibility with previous standards is also one of the determining features of the protocol. In the 2.0 standard, the maximum transmission speed was increased drastically, but older devices are still able to work with the new hubs because the protocol enables split or isochronous transmission and microframes that are eight times shorter than normal frames. Therefore, one protocol can handle all of the transmission speed classes: low, full and high.

The concept of the polled bus

The features of the USB protocol follow mainly from its central design decision: the polled bus. The initiative to transfer data, configure a device, etc. always comes from the root hub controller and is communicated to the clients in packets. This also makes USB devices easy and cheap to implement, since the root hub is responsible for negotiation and control and very little processing is left to the devices. The data sent via the bus are ordered according to Intel's little-endian convention, i.e. bytes are written and read from the least significant bit (LSB) to the most significant bit (MSB). Therefore, the packet diagrams presented here should be read from the left-hand side, because this is the order in which the bits are sent. All packets are also subjected to two transparent encoding/decoding procedures: NRZI and bit stuffing.

Data Encoding/Decoding

The USB employs NRZI (Non Return to Zero Invert) data encoding when transmitting packets. In NRZI encoding, a 1 is represented by no change in level and a 0 is represented by a change in level. A string of zeros causes the NRZI data to toggle each bit time. A string of ones causes long periods with no transitions in the data.

NRZI Data encoding scheme [2]
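The encoding rule can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the specification: the function names and the `idle_level` parameter (the line level assumed to be present before the first bit) are assumptions made for the example.

```python
def nrzi_encode(bits, idle_level=1):
    """Encode data bits as line levels: a 0 toggles the level, a 1 keeps it."""
    level = idle_level
    levels = []
    for bit in bits:
        if bit == 0:
            level ^= 1  # a 0 is represented by a change in level
        levels.append(level)  # a 1 leaves the level unchanged
    return levels


def nrzi_decode(levels, idle_level=1):
    """Recover data bits: a change in level is a 0, no change is a 1."""
    previous = idle_level
    bits = []
    for level in levels:
        bits.append(1 if level == previous else 0)
        previous = level
    return bits


print(nrzi_encode([0, 0, 0, 0]))  # [0, 1, 0, 1]  (toggles every bit time)
print(nrzi_encode([1, 1, 1, 1]))  # [1, 1, 1, 1]  (no transitions at all)
```

The second call shows why a long run of ones is a problem for the receiver: the line carries no transitions to synchronize on.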

Bit Stuffing

In order to ensure adequate signal transitions, bit stuffing is employed by the transmitting device when sending a packet on USB. A zero is inserted after every six consecutive ones in the data stream before the data is NRZI encoded, to force a transition in the NRZI data stream. This gives the receiver logic a data transition at least once every seven bit times to guarantee the data and clock lock. Bit stuffing is enabled beginning with the Sync Pattern. The data one that ends the Sync Pattern is counted as the first one in a sequence. Bit stuffing by the transmitter is always enforced, except during high-speed EOP. If required by the bit stuffing rules, a zero bit will be inserted even if it is the last bit before the End-of-Packet (EOP) signal. The receiver must decode the NRZI data, recognize the stuffed bits, and discard them.
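A minimal Python sketch of the stuffing rule, applied to the unencoded bit stream, might look as follows. The function names and the `preceding_ones` parameter are assumptions introduced for the example; `preceding_ones=1` would model the final one of the Sync Pattern when the caller passes only the bits that follow it.

```python
def bit_stuff(bits, preceding_ones=0):
    """Insert a 0 after every run of six consecutive ones."""
    stuffed = []
    run = preceding_ones
    for bit in bits:
        stuffed.append(bit)
        run = run + 1 if bit == 1 else 0
        if run == 6:
            stuffed.append(0)  # the stuffed bit forces a transition after NRZI
            run = 0
    return stuffed


def bit_unstuff(bits, preceding_ones=0):
    """Discard stuffed zeros; seven ones in a row is a bit-stuff error."""
    data = []
    run = preceding_ones
    expect_stuffed = False
    for bit in bits:
        if expect_stuffed:
            if bit != 0:
                raise ValueError("bit stuff error: seven consecutive ones")
            expect_stuffed = False
            run = 0
            continue
        data.append(bit)
        run = run + 1 if bit == 1 else 0
        if run == 6:
            expect_stuffed = True
    return data


print(bit_stuff([1, 1, 1, 1, 1, 1, 1]))  # [1, 1, 1, 1, 1, 1, 0, 1]
```

The sketch ignores the high-speed EOP exception mentioned above.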

Addressing structure

The structure of a USB system from the protocol point of view is a bit different from its physical construction. A device connected to a hub (here called a "host") is referred to as a "function", which the specification defines as follows: "Function - a USB device that provides a capability to the host, such as an ISDN connection, a digital microphone, or speakers." [2]. A function can have several "endpoints" ("Device Endpoint - a uniquely addressable portion of a USB device that is the source or sink of information in a communication flow between the host and device.") [2]. A fuller description can be found at the end of this section in the chapter "Stages of transactions".

Packet fields

Some fields, like SYNC and PID, are standard for all packets, whereas the others are specific to a particular group of packets, e.g. FrameNumber in the Start-of-Frame packets. Packet bit definitions are displayed in unencoded format. The effects of NRZI coding and bit stuffing have been removed for the sake of clarity. All packets have distinct Start- and End-of-Packet delimiters.

SYNC Field

All packets begin with a synchronization (SYNC) field, which is a coded sequence designed to provide a maximal transition density. It is used by the input circuitry to align incoming data with the local clock. A SYNC from an initial transmitter is defined to be eight bits in length for full/low-speed and 32 bits for high-speed. SYNC serves only as a synchronization mechanism and is not shown in the following packet diagrams. The last two bits in the SYNC field are a marker that is used to identify the end of the SYNC field and the start of the PID.

Packet Identifier Field

A packet identifier (PID) immediately follows the SYNC field of every USB packet. A PID consists of a four-bit packet type field followed by a four-bit check field as shown below. The PID indicates the type of packet and, by inference, the format of the packet and the type of error detection applied to the packet. The four-bit check field of the PID ensures reliable decoding of the PID so that the remainder of the packet is interpreted correctly. The PID check field is generated by performing a one's complement of the packet type field. A PID error exists if the four PID check bits are not complements of their respective packet identifier bits.

PID Field [2]
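The check-field rule can be illustrated with a short Python sketch. The function names are assumptions made for the example, and the packet type values in `PID_TYPES` are quoted from the PID table of the USB 2.0 specification [2], not from the text above. Because bits are sent LSb first, the packet type is assumed to occupy the low nibble of the PID byte and the check field the high nibble.

```python
# Packet type values as listed in the PID table of the USB 2.0 specification [2].
PID_TYPES = {
    "OUT": 0b0001, "IN": 0b1001, "SOF": 0b0101, "SETUP": 0b1101,
    "DATA0": 0b0011, "DATA1": 0b1011,
    "ACK": 0b0010, "NAK": 0b1010, "STALL": 0b1110,
}


def make_pid_byte(pid_type):
    """Build the 8-bit PID: type in the low nibble, its complement above it."""
    if not 0 <= pid_type <= 0xF:
        raise ValueError("packet type must fit in four bits")
    return pid_type | ((pid_type ^ 0xF) << 4)


def decode_pid_byte(pid_byte):
    """Return the packet type, or None if the check field does not match."""
    pid_type = pid_byte & 0xF
    check = (pid_byte >> 4) & 0xF
    return pid_type if check == (pid_type ^ 0xF) else None


print(hex(make_pid_byte(PID_TYPES["IN"])))  # 0x69
print(decode_pid_byte(0x69))                # 9
print(decode_pid_byte(0x68))                # None (PID error)
```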



PIDs available in the USB 2.0 protocol [2]

Address Fields

Function endpoints are addressed using two fields: the function address field and the endpoint field. A function needs to fully decode both address and endpoint fields. Address or endpoint aliasing is not permitted, and a mismatch on either field must cause the token to be ignored. Accesses to non-initialized endpoints will also cause the token to be ignored.

Address Field. The function address (ADDR) field specifies the function, via its address, that is either the source or destination of a data packet, depending on the value of the token PID. As shown in the figure below, a total of 128 addresses are specified by the seven-bit ADDR field. The ADDR field is specified for IN, SETUP, and OUT tokens and the PING and SPLIT special tokens. By definition, each ADDR value defines a single function. Upon reset and power-up, a function's address defaults to a value of zero and must be programmed by the host during the enumeration process. Function address zero is reserved as the default address and may not be assigned to any other use.

Address field of a packet [2]

Endpoint Field. An additional four-bit endpoint (ENDP) field permits more flexible addressing of functions in which more than one endpoint is required. Except for endpoint address zero, endpoint numbers are function-specific. The endpoint field is defined for IN, SETUP, and OUT tokens and the PING special token. All functions must support a control pipe at endpoint number zero (the Default Control Pipe). Low-speed devices support a maximum of three pipes per function: a control pipe at endpoint number zero plus two additional pipes (either two control pipes, a control pipe and an interrupt endpoint, or two interrupt endpoints). Full-speed and high-speed functions may support up to a maximum of 16 IN and OUT endpoints.

Endpoint address field [2]

Frame Number Field

The frame number field is an 11-bit field that is incremented by the host on a per-frame basis. The frame number field rolls over upon reaching its maximum value of 7FFH and is sent only in Start-of-Frame tokens at the start of each (micro)frame. The framing in the 1.1 and 2.0 (microframes) standards is shown in the picture below.



Comparison of normal- and microframes [2]
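As a rough illustration of the frame numbering, the following Python sketch lists the frame number carried by successive Start-of-Frame tokens. The function names and the `high_speed` flag are assumptions made for the example; the sketch assumes that, at high speed, the same frame number is repeated in each of the eight microframes that make up one frame.

```python
FRAME_NUMBER_MASK = 0x7FF   # 11-bit field, maximum value 7FFH
MICROFRAMES_PER_FRAME = 8   # microframes are eight times shorter than frames


def next_frame_number(frame):
    """Increment the frame number, rolling over after 7FFH."""
    return (frame + 1) & FRAME_NUMBER_MASK


def sof_frame_numbers(start_frame, frames, high_speed=False):
    """Frame number carried by each SOF token over the given number of frames."""
    numbers = []
    frame = start_frame
    repeats = MICROFRAMES_PER_FRAME if high_speed else 1
    for _ in range(frames):
        numbers.extend([frame] * repeats)
        frame = next_frame_number(frame)
    return numbers


print(sof_frame_numbers(0x7FE, 3))                   # [2046, 2047, 0]
print(sof_frame_numbers(5, 1, high_speed=True))      # [5, 5, 5, 5, 5, 5, 5, 5]
```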

Data Field

The data field may range from zero to 1,024 bytes and must be an integral number of bytes. The diagram below shows the format for multiple bytes. Data bits within each byte are shifted out LSb first. Data packet size varies with the transfer type, e.g. interrupt transfer, control transfer or isochronous transfer.

Data field (multiple bytes) [2]

Cyclic Redundancy Checks

Cyclic redundancy checks (CRCs) are used to protect all non-PID fields in token and data packets. In this context, these fields are considered to be protected fields. The PID is not included in the CRC check of a packet containing a CRC. All CRCs are generated over their respective fields in the transmitter before bit stuffing is performed. Similarly, CRCs are decoded in the receiver after stuffed bits have been removed. Token and data packet CRCs provide 100% coverage for all single- and double-bit errors. A failed CRC is considered to indicate that one or more of the protected fields is corrupted and causes the receiver to ignore those fields and, in most cases, the entire packet.

For CRC generation and checking, the shift registers in the generator and checker are seeded with an all-ones pattern. For each data bit sent or received, the high-order bit of the current remainder is XORed with the data bit and then the remainder is shifted left one bit and the low-order bit set to zero. If the result of that XOR is one, then the remainder is XORed with the generator polynomial. When the last bit of the checked field is sent, the CRC in the generator is inverted and sent to the checker MSb first. When the last bit of the CRC is received by the checker and no errors have occurred, the remainder will be equal to the polynomial residual. A CRC error exists if the computed checksum remainder at the end of a packet reception does not match the residual. Bit stuffing requirements must be met for the CRC, and this includes the need to insert a zero at the end of a CRC if the preceding six bits were all ones.

Token CRCs. A five-bit CRC field is provided for tokens and covers the ADDR and ENDP fields of IN, SETUP, and OUT tokens or the time stamp field of an SOF token. The PING and SPLIT special tokens also include a five-bit CRC field. The generator polynomial is:

G(X) = X^5 + X^2 + 1

Data CRCs. The data CRC is a 16-bit polynomial applied over the data field of a data packet. The generating polynomial is:

G(X) = X^16 + X^15 + X^2 + 1
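The shift-register procedure described in this section can be sketched in Python as follows. This is an illustrative, bit-at-a-time version and not an optimized implementation; the function names are assumptions made for the example, and the residual values shown in the comments (01100 for the five-bit CRC, 800DH for the 16-bit CRC) are those given in the USB 2.0 specification [2], not in the text above.

```python
CRC5_POLY = 0b00101   # X^5 + X^2 + 1 (the X^5 term is implied by the width)
CRC16_POLY = 0x8005   # X^16 + X^15 + X^2 + 1 (the X^16 term is implied)


def crc_remainder(bits, poly, width):
    """Run the shift register over bits given in transmission order."""
    mask = (1 << width) - 1
    remainder = mask  # seeded with an all-ones pattern
    for bit in bits:
        feedback = ((remainder >> (width - 1)) & 1) ^ bit
        remainder = (remainder << 1) & mask  # shift left, low-order bit = 0
        if feedback:
            remainder ^= poly
    return remainder


def crc_bits(bits, poly, width):
    """Return the CRC field as sent on the wire: inverted, MSb first."""
    crc = crc_remainder(bits, poly, width) ^ ((1 << width) - 1)
    return [(crc >> i) & 1 for i in reversed(range(width))]


def bytes_to_bits(data):
    """Expand bytes into bits, LSb of each byte first."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]


# Token field for ADDR = 15H and ENDP = EH, each sent LSb first.
token_field = [1, 0, 1, 0, 1, 0, 0] + [0, 1, 1, 1]
token_crc = crc_bits(token_field, CRC5_POLY, 5)
print(token_crc)  # [1, 0, 1, 1, 1]

# A checker that shifts in the field followed by its CRC ends on the residual.
print(bin(crc_remainder(token_field + token_crc, CRC5_POLY, 5)))  # 0b1100

payload = bytes_to_bits(bytes([0x00, 0x01, 0x02, 0x03]))
payload_crc = crc_bits(payload, CRC16_POLY, 16)
print(hex(crc_remainder(payload + payload_crc, CRC16_POLY, 16)))  # 0x800d
```

The same two functions serve both CRCs; only the polynomial and the register width change.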

Types of packets

Token Packets

The figure below shows the field formats for a token packet. A token consists of a PID, specifying either IN, OUT, or SETUP packet type, and ADDR and ENDP fields. The PING special token packet also has the same fields as a token packet. For OUT and SETUP transactions, the address and endpoint fields uniquely identify the endpoint that will receive the subsequent Data packet. For IN transactions, these fields uniquely identify which endpoint should transmit a Data packet. For PING transactions, these fields uniquely identify which endpoint will respond with a handshake packet. Only the host can issue token packets. An IN PID defines a Data transaction from a function to the host. OUT and SETUP PIDs define Data transactions from the host to a function. A PING PID defines a handshake transaction from the function to the host. Token and SOF packets are delimited by an EOP after three bytes of packet field data. If a packet decodes as an otherwise valid token or SOF but does not terminate with an EOP after three bytes, it must be considered invalid and ignored by the receiver.
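To make the three-byte layout concrete, here is a hedged Python sketch that assembles and validates a token packet (PID byte, 7-bit ADDR, 4-bit ENDP, 5-bit CRC). The function names are hypothetical, the SYNC field and EOP are left out, and the packet is represented as bytes in which bit 0 of each byte is the first bit on the wire. The packet type values used in the example calls (1101 for SETUP, 1001 for IN) come from the USB 2.0 specification [2].

```python
def crc5(bits):
    """Five-bit CRC (X^5 + X^2 + 1) over bits in transmission order, inverted."""
    remainder = 0x1F
    for bit in bits:
        feedback = ((remainder >> 4) & 1) ^ bit
        remainder = (remainder << 1) & 0x1F
        if feedback:
            remainder ^= 0x05
    return remainder ^ 0x1F


def lsb_first(value, width):
    """Bits of value in the order they are sent, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def build_token(pid_type, addr, endp):
    """Return the three bytes of a token packet."""
    if not (0 <= pid_type <= 0xF and 0 <= addr <= 0x7F and 0 <= endp <= 0xF):
        raise ValueError("field out of range")
    pid_byte = pid_type | ((pid_type ^ 0xF) << 4)
    protected = lsb_first(addr, 7) + lsb_first(endp, 4)
    crc_field = lsb_first(crc5(protected), 5)[::-1]  # the CRC is sent MSb first
    bits = lsb_first(pid_byte, 8) + protected + crc_field
    return bytes(
        sum(bit << i for i, bit in enumerate(bits[n:n + 8]))
        for n in range(0, 24, 8)
    )


def parse_token(packet):
    """Return (pid_type, addr, endp), or None if the token must be ignored."""
    if len(packet) != 3:  # tokens end after exactly three bytes
        return None
    pid_byte = packet[0]
    if ((pid_byte & 0xF) ^ (pid_byte >> 4)) != 0xF:  # PID check field
        return None
    bits = [bit for byte in packet[1:] for bit in lsb_first(byte, 8)]
    protected, crc_field = bits[:11], bits[11:]
    if lsb_first(crc5(protected), 5)[::-1] != crc_field:
        return None
    addr = sum(bit << i for i, bit in enumerate(protected[:7]))
    endp = sum(bit << i for i, bit in enumerate(protected[7:]))
    return pid_byte & 0xF, addr, endp


print(build_token(0b1101, 0, 0).hex())       # 2d0010
print(parse_token(bytes.fromhex("6915ef")))  # (9, 21, 14)
```

A real function would additionally compare the decoded ADDR and ENDP with its own address and initialized endpoints before responding.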

Handshake Packets



Handshake packets, as shown below, consist of only a PID. Handshake packets are used to report the status of a data transaction and can return values indicating successful reception of data, command acceptance or rejection, flow control, and halt conditions. Only transaction types that support flow control can return handshakes. Handshakes are always returned in the handshake phase of a transaction and may be returned, instead of data, in the data phase. Handshake packets are delimited by an EOP after one byte of packet field. If a packet decodes as an otherwise valid handshake but does not terminate with an EOP after one byte, it must be considered invalid and ignored by the receiver.

Handshake packet [2]

Types of handshake packets

ACK

indicates that the data packet was received without bit stuff or CRC errors over the data field and that the data PID was received correctly. ACK may be issued either when sequence bits match and the receiver can accept data or when sequence bits mismatch and the sender and receiver must resynchronize to each other (refer to Section 8.6 of the USB 2.0 specification [2] for details). An ACK handshake is applicable only in transactions in which data has been transmitted and where a handshake is expected. ACK can be returned by the host for IN transactions and by a function for OUT, SETUP, or PING transactions.

NAK

indicates that a function was unable to accept data from the host (OUT) or that a function has no data to transmit to the host (IN). NAK can only be returned by functions in the data phase of IN transactions or the handshake phase of OUT or PING transactions. The host can never issue NAK. NAK is used for flow control purposes to indicate that a function is temporarily unable to transmit or receive data, but will eventually be able to do so without need of host intervention.

STALL

returned by a function in response to an IN token or after the data phase of an OUT or in response to a PING transaction (see Figure 8-30 and Figure 8-38 of the USB 2.0 specification [2]). STALL indicates that a function is unable to transmit or receive data, or that a control pipe request is not supported. The state of a function after returning a STALL (for any endpoint except the default endpoint) is undefined. The host is not permitted to return a STALL under any condition.

NYET

a high-speed only handshake that is returned in two circumstances. It is returned by a high-speed endpoint as part of the PING protocol described later in this chapter. NYET may also be returned by a hub in response to a split-transaction when the full-/low-speed transaction has not yet been completed or the hub is otherwise not able to handle the split-transaction. See Chapter 11 of the USB 2.0 specification [2] for more details.

ERR

a high-speed only handshake that is returned to allow a high-speed hub to report an error on a full-/low-speed bus. It is only returned by a high-speed hub as part of the split transaction protocol. See Chapter 11 of the USB 2.0 specification [2] for more details.

Start-of-Frame Packets

Start-of-Frame (SOF) packets are issued by the host at a nominal rate of once every 1.00 ms ±0.0005 ms for a full-speed bus and 125 µs ±0.0625 µs for a high-speed bus. SOF packets consist of a PID indicating packet type followed by an 11-bit frame number field as illustrated below.

SOF Packet [2]

The SOF token comprises the token-only transaction that distributes an SOF marker and accompanying frame number at precisely timed intervals corresponding to the start of each frame. All high-speed and full-speed functions, including hubs, receive the SOF packet. The SOF token does not cause any receiving function to generate a return packet; therefore, SOF delivery to any given function cannot be guaranteed.

Data Packets

A data packet consists of a PID, a data field containing zero or more bytes of data, and a CRC as shown below. There are four types of data packets, identified by differing PIDs: DATA0, DATA1, DATA2 and MDATA. Two data packet PIDs (DATA0 and DATA1) are defined to support data toggle synchronization. All four data PIDs are used in data PID sequencing for high-bandwidth high-speed isochronous endpoints. Three data PIDs (MDATA, DATA0, DATA1) are used in split transactions. Data must always be sent in integral numbers of bytes. The data CRC is computed over only the data field in the packet and does not include the PID, which has its own check field. The maximum data payload size allowed for low-speed devices is 8 bytes. The maximum data payload size for full-speed devices is 1023 bytes. The maximum data payload size for high-speed devices is 1024 bytes.



Data Packet [2]
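The payload limits can be illustrated with a small Python sketch that cuts a buffer into data packets. It is deliberately simplified and rests on several assumptions that are not stated in the text: the function name is hypothetical, the toggle is assumed to start at DATA0 and to alternate with every packet, and the speed-wide maximum is used as the packet size, whereas a real endpoint advertises its own (often smaller) maximum packet size.

```python
# Maximum data payload per packet, in bytes, by bus speed.
MAX_DATA_PAYLOAD = {"low": 8, "full": 1023, "high": 1024}


def split_into_data_packets(data, speed):
    """Cut data into (pid_name, payload) pairs that respect the size limit."""
    limit = MAX_DATA_PAYLOAD[speed]
    if not data:
        return [("DATA0", b"")]  # a data field may contain zero bytes
    packets = []
    for index, offset in enumerate(range(0, len(data), limit)):
        pid_name = "DATA0" if index % 2 == 0 else "DATA1"  # assumed toggle order
        packets.append((pid_name, data[offset:offset + limit]))
    return packets


for pid_name, payload in split_into_data_packets(bytes(20), "low"):
    print(pid_name, len(payload))
# DATA0 8
# DATA1 8
# DATA0 4
```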

PING packets

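PING packets are a class of packets used only with high-speed devices. A PING asks an endpoint whether it is ready to receive an OUT data packet, so that the host can avoid sending data that would only be answered with NAK.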

Stages of transactions

Handshake

Handshake procedures differ between connection types: a host can send an IN or OUT query to a function. Depending on the state of the device, the response can either allow the host to write data to the device's buffer or cancel the transaction. The same applies in the opposite direction. All possibilities are described by the following tables:

Data transactions 1 [2]





Error Correction

An error can be detected thanks to the CRC fields of a packet (described in the chapters on packet types), but some fields have their own error-checking methods. The PID field carries a negated duplicate of its bits, and an error can also be detected if the bit-stuffing rule is violated. The following table summarizes the procedures applied if an error is detected:

Error checking responses [2]

Notice, however, that in the case of an isochronous transaction, which takes place unidirectionally, there is no room for sending NAK packets and re-receiving data; therefore this kind of transmission is reserved for streaming devices where error control is not important (cameras, etc.).

Split transaction

Split transactions enhance the performance of a USB 2.0 host working with a compliant hub to which a mix of low-/full-speed and high-speed devices is connected. Using special token packets, the transmission can be split into a high-speed part and a full-/low-speed part, while remaining transparent to the older devices. High-speed split transactions for interrupt and isochronous transfers must be allocated by the host from the 80% periodic portion of a microframe. A high-speed split transaction has two parts: a start-split and a complete-split. Split transactions are only defined to be used between the host controller and a hub. No other high-speed or full-/low-speed devices ever use split transactions. The scheme of a split IN transaction is shown in the picture below:

Split IN transaction scheme [2]

Abstract level of the protocol

The transmission of data between the host and an endpoint provided by a function is represented by "pipes". A pipe is a logical abstraction representing the association between an endpoint on a device and software on the host. A pipe has several attributes; for example, a pipe may transfer data as streams (stream pipe) or messages (message pipe). Pipe 0 is reserved and must be provided in every device's software.

Stream Pipe - a pipe that transfers data as a stream of samples with no defined USB

structure.

Message Pipe - a bi-directional pipe that transfers data using a request/data/status

paradigm. The data has an imposed structure that allows requests to be reliably

identified and communicated.

It can be graphically represented in the following way:

References:

1. USB.org Developers Resources [www.usb.org/developers]

2. USB 2.0 Specification, available at www.usb.org.org/developers/usb_20.zip



USB packets & formats

All USB data is sent serially, of course, and least significant bit (LSB) first. USB data transfer is

essentially in the form of packets of data, sent back and forth between the host and peripheral

devices. Initially, all packets are sent from the host, via the root hub and possibly more hubs, to

devices. Some of those packets direct a device to send some packets in reply.

Each USB data transfer consists of a

1. Token Packet (Header defining what it expects to follow)

2. Optional Data Packet (Containing the payload)

3. Status Packet (Used to acknowledge transactions and to provide a means of error correction)

As we have already discussed, the host initiates all transactions. The first packet, also called a token, is generated by the host to describe what is to follow, whether the data transfer will be a read or a write, and what the device's address and designated endpoint are. The next packet is generally a data packet carrying the content information and is followed by a handshaking packet, reporting if the data or token was received successfully, or if the endpoint is stalled or not available to accept data.

USB packets may consist of the following fields:

1. Sync field: All the packets start with this sync field. The sync field is 8 bits long at low and full speed or 32 bits long for high speed and is used to synchronize the clock of the receiver with that of the transmitter. The last two bits indicate where the PID field starts.

2. PID field: This field (Packet ID) is used to identify the type of packet that is being sent. The PID is actually 4 bits; the byte consists of the 4-bit PID followed by its bit-wise complement, making an 8-bit PID in total. This redundancy helps detect errors.

3. ADDR field: The address field specifies which device the packet is designated for. Being 7 bits in length allows for 127 devices to be supported.

4. ENDP field: This field is made up of 4 bits, allowing 16 possible endpoints. Low-speed devices, however, can only have 2 additional endpoints on top of the default pipe.

5. CRC field: Cyclic Redundancy Checks are performed on the data within the packet payload. All token packets have a 5-bit CRC while data packets have a 16-bit CRC.

6. EOP field: This indicates End of Packet. Signaled by a Single Ended Zero (SE0) for approximately 2 bit times followed by a J for 1 bit time.

The USB packets come in five basic types, each with a different format and CRC field:

1. Handshake packets

2. Token packets

3. Data packets

4. PRE packet

5. Start of Frame Packets

Handshake packets:

Handshake packets consist of a PID byte, and are generally sent in response to data packets. The three basic types of handshake packets are

1. ACK, indicating that data was successfully received,

2. NAK, indicating that the data cannot be received at this time and should be retried,

3. STALL, indicating that the device has an error and will never be able to successfully transfer data until some corrective action is performed.

Fig 4: Handshake packet format

USB 2.0 added two additional handshake packets.

1. NYET, which indicates that a split transaction is not yet complete,

2. ERR handshake, to indicate that a split transaction failed.

The only handshake packet the USB host may generate is ACK; if it is not ready to receive data,

it should not instruct a device to send any.

Token packets:

Token packets consist of a PID byte followed by 11 bits of address and a 5-bit CRC. Tokens are

only sent by the host, not by a device.

There are three types of token packets.

1. In token - Informs the USB device that the host wishes to read information.

2. Out token - Informs the USB device that the host wishes to send information.

3. Setup token - Used to begin control transfers.



IN and OUT tokens contain a 7-bit device number and a 4-bit endpoint number (for devices with more than one endpoint) and command the device to transmit DATA packets, or receive the following DATA packets, respectively.

An IN token expects a response from a device. The response may be a NAK or STALL response, or a DATA frame. In the latter case, the host issues an ACK handshake if appropriate. An OUT token is followed immediately by a DATA frame. The device responds with ACK, NAK, or STALL, as appropriate.
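The two flows just described can be sketched as a small Python model. The function names and the three device states ("ready", "busy", "halted") are hypothetical labels chosen for the illustration; they are not terms from the specification, and the sketch ignores errors, timeouts and the high-speed PING and NYET cases.

```python
def in_transaction(device_state):
    """Packets exchanged after the host has sent an IN token."""
    if device_state == "halted":
        return [("device", "STALL")]
    if device_state == "busy":
        return [("device", "NAK")]  # nothing to send right now, retry later
    # The device has data: it sends a DATA packet and the host acknowledges it.
    return [("device", "DATA"), ("host", "ACK")]


def out_transaction(device_state):
    """Packets exchanged after the host has sent an OUT token."""
    handshake = {"ready": "ACK", "busy": "NAK", "halted": "STALL"}[device_state]
    # The DATA packet follows the OUT token immediately; the device then replies.
    return [("host", "DATA"), ("device", handshake)]


print(in_transaction("ready"))   # [('device', 'DATA'), ('host', 'ACK')]
print(in_transaction("busy"))    # [('device', 'NAK')]
print(out_transaction("busy"))   # [('host', 'DATA'), ('device', 'NAK')]
```

The last call shows the cost that the PING token was introduced to avoid: the host has already sent the whole DATA packet before it learns that the device cannot accept it.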

SETUP operates much like an OUT token, but is used for initial device setup.

Fig 5: Token packet format

USB 2.0 added a PING token, which asks a device if it is ready to receive an OUT/DATA packet pair. The device responds with ACK, NAK, or STALL, as appropriate. This avoids the need to send the DATA packet if the device knows that it will just respond with NAK.

USB 2.0 also added a larger SPLIT token with a 7-bit hub number, 12 bits of control flags, and a 5-bit CRC. This is used to perform split transactions. Rather than tie up the high-speed USB bus sending data to a slower USB device, the nearest high-speed capable hub receives a SPLIT token followed by one or two USB packets at high speed, performs the data transfer at full or low speed, and provides the response at high speed when prompted by a second SPLIT token.

Data packets:

There are two basic data packets, DATA0 and DATA1. Both consist of a DATA PID field, 0-1024 bytes of data payload (the upper limit depends on the bus speed, as listed below) and a 16-bit CRC. They must always be preceded by an address token, and are usually followed by a handshake token from the receiver back to the transmitter.

1. Maximum data payload size for low-speed devices is 8 bytes.

2. Maximum data payload size for full-speed devices is 1023 bytes.

3. Maximum data payload size for high-speed devices is 1024 bytes.

4. Data must be sent in multiples of bytes



Fig 6: Data packet format

USB 2.0 added DATA2 and MDATA packet types as well. They are used only by high-speed devices doing high-bandwidth isochronous transfers, which need to transfer more than 1024 bytes per 125 µs "micro-frame" (8192 kB/s).

PRE packet:

Low-speed devices are supported with a special PID value, PRE. This marks the beginning of a low-speed packet, and is used by hubs, which normally do not send full-speed packets to low-speed devices. Since all PID bytes include four 0 bits, they leave the bus in the full-speed K state, which is the same as the low-speed J state. The PRE packet is followed by a brief pause during which hubs enable their low-speed outputs, already idling in the J state; then a low-speed packet follows, beginning with a sync sequence and PID byte, and ending with a brief period of SE0. Full-speed devices other than hubs can simply ignore the PRE packet and its low-speed contents, until the final SE0 indicates that a new packet follows.

Start of Frame Packets:

Every 1 ms (12000 full-speed bit times), the USB host transmits a special SOF (start of frame) token, containing an 11-bit incrementing frame number in place of a device address. This is used to synchronize isochronous data flows. High-speed USB 2.0 devices receive 7 additional duplicate SOF tokens per frame, each introducing a 125 µs "micro-frame".
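The timing figures quoted in this section are mutually consistent, which a few lines of Python arithmetic can confirm. The constant names are assumptions made for the example; the numbers themselves (1 ms frames, 12000 full-speed bit times per frame, eight SOF tokens per frame at high speed, 1024 bytes per micro-frame) are taken from the text.

```python
FRAMES_PER_SECOND = 1000                 # one frame every 1 ms
FULL_SPEED_BIT_TIMES_PER_FRAME = 12000
MICROFRAMES_PER_FRAME = 8                # one SOF plus 7 duplicates per frame

# 12000 bit times per millisecond corresponds to the full-speed bit rate.
full_speed_bit_rate = FULL_SPEED_BIT_TIMES_PER_FRAME * FRAMES_PER_SECOND

# Each micro-frame lasts one eighth of a 1000 microsecond frame.
microframe_us = 1000 // MICROFRAMES_PER_FRAME

# One 1024-byte payload in every micro-frame gives the quoted 8192 kB/s.
microframes_per_second = FRAMES_PER_SECOND * MICROFRAMES_PER_FRAME
bytes_per_second = 1024 * microframes_per_second

print(full_speed_bit_rate)   # 12000000
print(microframe_us)         # 125
print(bytes_per_second)      # 8192000
```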