
3. Functional Specification - Part 1

This section contains the core technical specification of TCP: header format, terminology, and sequence number mechanisms.


3.1. Header Format

TCP segments are sent as internet datagrams. The Internet Protocol header carries several information fields, including the source and destination host addresses [2]. A TCP header follows the internet header, supplying information specific to the TCP protocol. This division allows for the existence of host level protocols other than TCP.

TCP Header Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Note: One tick mark represents one bit position

Field Descriptions

Source Port: 16 bits

The source port number.

Purpose: Identifies the sending process on the sending host.

Destination Port: 16 bits

The destination port number.

Purpose: Identifies the receiving process on the receiving host.

Sequence Number: 32 bits

The sequence number of the first data octet in this segment (except when SYN is present). If SYN is present, the sequence number is the initial sequence number (ISN) and the first data octet is ISN+1.

Key Points:

  • Each data byte has a unique sequence number
  • SYN segments use ISN, data starts from ISN+1
  • Sequence number space: 0 to 2³² - 1

Acknowledgment Number: 32 bits

If the ACK control bit is set, this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established, this is always sent.

Cumulative Acknowledgment Mechanism:

  • An acknowledgment number of X indicates that all bytes up to but not including X have been received
  • X itself is the next byte the receiver expects

Data Offset: 4 bits

The number of 32-bit words in the TCP Header. This indicates where the data begins. The TCP header (even one including options) is an integral number of 32 bits long.

Calculation Formula:

Header Length (bytes) = Data Offset × 4
Minimum value: 5 (20 bytes)
Maximum value: 15 (60 bytes)
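
As an illustration of the layout and the Data Offset formula, the following Python sketch unpacks the fixed 20-byte portion of a header with the standard struct module. The names TcpHeader, parse_tcp_header, and header_length_bytes are hypothetical, chosen for this example only, and options are not parsed here.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    """Fixed TCP header fields (hypothetical container for this sketch)."""
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int      # header length in 32-bit words
    flags: int            # low 6 bits: URG ACK PSH RST SYN FIN
    window: int
    checksum: int
    urgent_pointer: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Unpack the fixed 20-byte part of a TCP header (network byte order)."""
    if len(segment) < 20:
        raise ValueError("segment shorter than the minimum TCP header")
    (src, dst, seq, ack, off_flags,
     window, checksum, urg) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = off_flags >> 12          # top 4 bits of the 16-bit word
    if data_offset < 5:
        raise ValueError("data offset below the minimum of 5 words")
    return TcpHeader(src, dst, seq, ack, data_offset,
                     off_flags & 0x3F, window, checksum, urg)


def header_length_bytes(header: TcpHeader) -> int:
    """Header Length (bytes) = Data Offset x 4."""
    return header.data_offset * 4


# Example: a SYN from port 1234 to port 80 with no options (Data Offset = 5).
raw = struct.pack("!HHIIHHHH", 1234, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
hdr = parse_tcp_header(raw)
print(hdr.src_port, hdr.dst_port, header_length_bytes(hdr))
```

The data itself begins at offset header_length_bytes(hdr) within the segment.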

Reserved: 6 bits

Reserved for future use. Must be zero.

Control Bits: 6 bits (from left to right)

Flag  Full Name       Meaning
URG   Urgent          Urgent Pointer field valid
ACK   Acknowledgment  Acknowledgment field valid
PSH   Push            Push function
RST   Reset           Reset the connection
SYN   Synchronize     Synchronize sequence numbers
FIN   Finish          No more data from sender

Flag Combinations:

SYN = 1: Connection establishment request
SYN + ACK = 1: Connection establishment response
FIN = 1: Connection termination request
RST = 1: Abnormal connection termination
PSH = 1: Push data immediately to application layer
URG = 1: Urgent data present
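
A small sketch of how the six control bits might be decoded, assuming the flags have already been extracted as a 6-bit integer with URG as the most significant bit and FIN as the least. CONTROL_BITS and flag_names are hypothetical names for this example.

```python
from typing import List

# Bit masks for the 6 control bits, listed left to right as in the header.
CONTROL_BITS = (
    ("URG", 0x20),
    ("ACK", 0x10),
    ("PSH", 0x08),
    ("RST", 0x04),
    ("SYN", 0x02),
    ("FIN", 0x01),
)


def flag_names(flags: int) -> List[str]:
    """Return the names of the control bits set in a 6-bit flags value."""
    return [name for name, mask in CONTROL_BITS if flags & mask]


print(flag_names(0x12))  # the SYN + ACK combination
```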

Window: 16 bits

The number of data octets beginning with the one indicated in the acknowledgment field which the sender of this segment is willing to accept.

Flow Control:

  • Window size = 0: Stop sending data
  • Window size > 0: Can send up to window size bytes
  • Maximum window: 65,535 bytes (can be extended with window scaling option)

Example:

ACK = 1000, Window = 5000
→ Can receive data with sequence numbers 1000-5999 (5000 bytes)
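
The range implied by an acknowledgment number and a window can be computed as below. This is a minimal sketch; advertised_range is a hypothetical helper, and the modulo step is there because the range may wrap past 2³² - 1.

```python
SEQ_MOD = 1 << 32  # sequence numbers live in 0 .. 2**32 - 1


def advertised_range(ack: int, window: int):
    """Return (first, last) sequence numbers the receiver will accept,
    or None when the advertised window is zero."""
    if window == 0:
        return None
    first = ack % SEQ_MOD
    last = (ack + window - 1) % SEQ_MOD
    return first, last


print(advertised_range(1000, 5000))
```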

Checksum: 16 bits

The checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header and text. If a segment contains an odd number of header and text octets to be checksummed, the last octet is padded on the right with zeros to form a 16-bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros.

Pseudo Header:

The checksum also covers a 96-bit pseudo header conceptually prefixed to the TCP header. This pseudo header contains the source address, destination address, protocol, and TCP length. This gives the TCP protection against misrouted segments.

+--------+--------+--------+--------+
|           Source Address          |
+--------+--------+--------+--------+
|         Destination Address       |
+--------+--------+--------+--------+
|  zero  |  PTCL  |    TCP Length   |
+--------+--------+--------+--------+

PTCL = 6 (TCP protocol number)
TCP Length = TCP header length + data length (in octets)
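
One way to assemble the pseudo header for IPv4 is sketched below with the standard socket and struct modules. build_pseudo_header is a hypothetical name, and the addresses in the example come from a documentation range.

```python
import socket
import struct

TCP_PROTOCOL = 6  # PTCL value for TCP


def build_pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """Pack the 96-bit IPv4 pseudo header in network byte order."""
    return struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),   # Source Address (4 octets)
        socket.inet_aton(dst_ip),   # Destination Address (4 octets)
        0,                          # zero
        TCP_PROTOCOL,               # PTCL
        tcp_length,                 # TCP header length + data length, in octets
    )


pseudo = build_pseudo_header("192.0.2.1", "192.0.2.2", 20)
print(len(pseudo) * 8, "bits")
```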

Checksum Calculation Steps:

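def calculate_tcp_checksum(pseudo_header, tcp_header, data):
    # 1. Set checksum field (header octets 16-17) to 0
    tcp_header = tcp_header[:16] + b"\x00\x00" + tcp_header[18:]
    # 2. Combine pseudo header, TCP header, and data; pad odd length with a zero octet
    buf = pseudo_header + tcp_header + data
    if len(buf) % 2:
        buf += b"\x00"
    # 3. Sum as 16-bit big-endian words
    total = sum(int.from_bytes(buf[i:i + 2], "big") for i in range(0, len(buf), 2))
    # 4. Add carry to low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    # 5. Take one's complement
    return ~total & 0xFFFF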
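
On the receiving side, a common way to check the result of the steps above is to sum the pseudo header and the whole segment with the transmitted checksum left in place: an intact segment sums to 0xFFFF. The sketch below assumes that approach; ones_complement_sum and checksum_is_valid are hypothetical names, and the sample segment is made up for the example.

```python
def ones_complement_sum(data: bytes) -> int:
    """Sum data as 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"              # pad an odd trailing octet with zero
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum_is_valid(pseudo_header: bytes, segment: bytes) -> bool:
    """True when the segment, checksum field included, sums to 0xFFFF."""
    return ones_complement_sum(pseudo_header + segment) == 0xFFFF


# Build a sample segment: 20-byte header (Data Offset = 5) plus 3 data octets.
pseudo = bytes([192, 0, 2, 1, 192, 0, 2, 2, 0, 6, 0, 23])  # TCP Length = 23
header = bytearray(20)
header[12] = 5 << 4
payload = b"abc"
checksum = ~ones_complement_sum(pseudo + bytes(header) + payload) & 0xFFFF
header[16:18] = checksum.to_bytes(2, "big")                # fill in Checksum
segment = bytes(header) + payload

print(checksum_is_valid(pseudo, segment))
```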

Urgent Pointer: 16 bits

This field communicates the current value of the urgent pointer as a positive offset from the sequence number in this segment. The urgent pointer points to the sequence number of the octet following the urgent data. This field is only interpreted in segments with the URG control bit set.

Use Cases:

  • Ctrl+C interrupt signals
  • Telnet interrupt commands
  • Control information requiring priority processing

Example:

SEG.SEQ = 1000
URG Pointer = 10
→ The octet following the urgent data has sequence number 1010
→ Sequence numbers 1000-1009 are urgent data

Options: variable

Options may occupy space at the end of the TCP header and are a multiple of 8 bits in length. All options are included in the checksum. An option may begin on any octet boundary. There are two formats for options:

Case 1: A single octet of option-kind
Case 2: An octet of option-kind, an octet of option-length, and the actual option-data octets

The option-length counts the two octets of option-kind and option-length as well as the option-data octets.

Important: TCP must implement all options.

Currently Defined Options

Kind (octal)  Length  Meaning
0             -       End of Option List
1             -       No-Operation
2             4       Maximum Segment Size

Option Details

1. End of Option List

+--------+
|00000000|
+--------+
Kind=0
  • This option code indicates the end of the option list
  • This might not coincide with the end of the TCP header according to the Data Offset field
  • Used at the end of all options, not each option
  • Only needed if the end of the options would not otherwise coincide with the end of the TCP header

2. No-Operation

+--------+
|00000001|
+--------+
Kind=1
  • This option code may be used between options
  • For example, to align the beginning of a subsequent option on a word boundary
  • There is no guarantee that senders will use this option
  • Receivers must be prepared to process options that do not begin on a word boundary

3. Maximum Segment Size

+--------+--------+---------+--------+
|00000010|00000100|   max seg size   |
+--------+--------+---------+--------+
 Kind=2   Length=4

Maximum Segment Size Option Data: 16 bits

  • If this option is present, it communicates the maximum receive segment size at the TCP which sends this segment
  • This field must only be sent in the initial connection request (i.e., in segments with the SYN control bit set)
  • If this option is not used, any segment size is allowed

MSS Notes:

  • Default MSS = 536 bytes (Internet default)
  • Common Ethernet MSS = 1460 bytes (1500 - 20 IP header - 20 TCP header)
  • MSS refers only to data portion, excluding TCP/IP headers

Padding: variable

The TCP header padding is used to ensure that the TCP header ends and data begins on a 32-bit boundary. The padding is composed of zeros.


3.2. Terminology

Before we can discuss the operation of the TCP, we need to introduce some detailed terminology. The maintenance of a TCP connection requires remembering several variables. We conceive of these variables being stored in a connection record called a Transmission Control Block (TCB).

Variables Stored in TCB

Among the variables stored in the TCB are:

  • The local and remote socket numbers
  • The security and precedence of the connection
  • Pointers to the user's send and receive buffers
  • Pointers to the retransmit queue and to the current segment
  • Several variables relating to the send and receive sequence numbers

Send Sequence Variables

Variable  Full Name                      Description
SND.UNA   Send Unacknowledged            Oldest unacknowledged sequence number
SND.NXT   Send Next                      Next sequence number to be sent
SND.WND   Send Window                    Send window
SND.UP    Send Urgent Pointer            Send urgent pointer
SND.WL1   Segment Sequence Number        Segment sequence number used for last window update
SND.WL2   Segment Acknowledgment Number  Segment acknowledgment number used for last window update
ISS       Initial Send Sequence Number   Initial send sequence number

Receive Sequence Variables

Variable  Full Name                        Description
RCV.NXT   Receive Next                     Next sequence number expected
RCV.WND   Receive Window                   Receive window
RCV.UP    Receive Urgent Pointer           Receive urgent pointer
IRS       Initial Receive Sequence Number  Initial receive sequence number

Sequence Space Diagrams

Send Sequence Space

         1         2          3          4
    ----------|----------|----------|----------
           SND.UNA    SND.NXT    SND.UNA
                                +SND.WND

1 - old sequence numbers which have been acknowledged
2 - sequence numbers of unacknowledged data
3 - sequence numbers allowed for new data transmission
4 - future sequence numbers which are not yet allowed

Send Window: The portion of sequence space labeled 3 in the diagram
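
To make the four regions concrete, the sketch below classifies a sequence number relative to SND.UNA, SND.NXT, and SND.WND. send_region is a hypothetical helper. Because the space is circular, "old" and "future" are only distinguishable by convention; this example assumes that anything more than half the space ahead of SND.UNA is old, which is an assumption of the sketch and not something the text specifies.

```python
SEQ_MOD = 1 << 32


def send_region(seq: int, snd_una: int, snd_nxt: int, snd_wnd: int) -> int:
    """Classify seq into regions 1-4 of the send sequence space diagram."""
    offset = (seq - snd_una) % SEQ_MOD           # distance ahead of SND.UNA
    in_flight = (snd_nxt - snd_una) % SEQ_MOD    # octets sent, not yet acknowledged
    if offset < in_flight:
        return 2                                 # unacknowledged data
    if offset < snd_wnd:
        return 3                                 # allowed for new transmission
    # Assumption: the far half of the circular space counts as "old".
    return 4 if offset < SEQ_MOD // 2 else 1


for seq in (999, 1000, 2000, 6000):
    print(seq, send_region(seq, snd_una=1000, snd_nxt=2000, snd_wnd=5000))
```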

Receive Sequence Space

         1          2          3
    ----------|----------|----------
           RCV.NXT    RCV.NXT
                     +RCV.WND

1 - old sequence numbers which have been acknowledged
2 - sequence numbers allowed for new reception
3 - future sequence numbers which are not yet allowed

Receive Window: The portion of sequence space labeled 2 in the diagram

Current Segment Variables

These variables are derived from the fields of the current segment:

Variable  Description
SEG.SEQ   Segment sequence number
SEG.ACK   Segment acknowledgment number
SEG.LEN   Segment length
SEG.WND   Segment window
SEG.UP    Segment urgent pointer
SEG.PRC   Segment precedence value

Connection States

A connection progresses through a series of states during its lifetime. The states are: LISTEN, SYN-SENT, SYN-RECEIVED, ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT, and the fictional state CLOSED.

State Descriptions

State         Description
LISTEN        Represents waiting for a connection request from any remote TCP and port
SYN-SENT      Represents waiting for a matching connection request after having sent a connection request
SYN-RECEIVED  Represents waiting for a confirming connection request acknowledgment after having both received and sent a connection request
ESTABLISHED   Represents an open connection, data received can be delivered to the user. The normal state for the data transfer phase of the connection
FIN-WAIT-1    Represents waiting for a connection termination request from the remote TCP, or an acknowledgment of the connection termination request previously sent
FIN-WAIT-2    Represents waiting for a connection termination request from the remote TCP
CLOSE-WAIT    Represents waiting for a connection termination request from the local user
CLOSING       Represents waiting for a connection termination request acknowledgment from the remote TCP
LAST-ACK      Represents waiting for an acknowledgment of the connection termination request previously sent to the remote TCP (which includes an acknowledgment of its connection termination request)
TIME-WAIT     Represents waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request
CLOSED        Represents no connection state at all (fictional state, as it represents the state when no TCB exists)

TCP Connection State Diagram

                              +---------+ ---------\      active OPEN
                              |  CLOSED |            \    -----------
                              +---------+<---------\   \   create TCB
                                |     ^              \   \  snd SYN
                   passive OPEN |     |   CLOSE        \   \
                   ------------ |     | ----------       \   \
                    create TCB  |     | delete TCB         \   \
                                V     |                      \   \
                              +---------+            CLOSE    |    \
                              |  LISTEN |          ---------- |     |
                              +---------+          delete TCB |     |
                   rcv SYN      |     |     SEND              |     |
                  -----------   |     |    -------            |     V
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------
   |                  x         |     |     snd ACK
   |                            V     V
   |  CLOSE                   +---------+
   | -------                  |  ESTAB  |
   | snd FIN                  +---------+
   |                   CLOSE    |     |    rcv FIN
   V                  -------   |     |    -------
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |
   | --------------   snd ACK   |                           ------- |
   V        x                   V                           snd FIN V
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |
   |  -------              x       V    ------------        x       V
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+


Events and State Transitions

A TCP connection progresses from one state to another in response to events. Events include:

  • User calls: OPEN, SEND, RECEIVE, CLOSE, ABORT, STATUS
  • Arriving segments: Particularly those containing SYN, ACK, RST, and FIN flags
  • Timeouts: Retransmission timeout, TIME-WAIT timeout, etc.

Note: The state diagram is only a summary and cannot stand alone as the complete specification. It only illustrates state changes, with their triggering events and resulting actions, but neither error conditions nor actions which are not associated with state changes are indicated.
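
The transitions drawn in the diagram can be written down as a lookup table. The sketch below transcribes only those arrows; like the diagram, it omits error conditions, RST handling, and actions that do not change state, so it is an illustration and not a complete state machine. TRANSITIONS, next_state, and run are hypothetical names, and the event strings are taken from the diagram labels.

```python
# (state, event) -> (next state, action), transcribed from the state diagram.
TRANSITIONS = {
    ("CLOSED", "passive OPEN"): ("LISTEN", "create TCB"),
    ("CLOSED", "active OPEN"): ("SYN-SENT", "create TCB, snd SYN"),
    ("LISTEN", "rcv SYN"): ("SYN-RECEIVED", "snd SYN,ACK"),
    ("LISTEN", "SEND"): ("SYN-SENT", "snd SYN"),
    ("LISTEN", "CLOSE"): ("CLOSED", "delete TCB"),
    ("SYN-SENT", "CLOSE"): ("CLOSED", "delete TCB"),
    ("SYN-SENT", "rcv SYN"): ("SYN-RECEIVED", "snd ACK"),
    ("SYN-SENT", "rcv SYN,ACK"): ("ESTABLISHED", "snd ACK"),
    ("SYN-RECEIVED", "rcv ACK of SYN"): ("ESTABLISHED", None),
    ("SYN-RECEIVED", "CLOSE"): ("FIN-WAIT-1", "snd FIN"),
    ("ESTABLISHED", "CLOSE"): ("FIN-WAIT-1", "snd FIN"),
    ("ESTABLISHED", "rcv FIN"): ("CLOSE-WAIT", "snd ACK"),
    ("FIN-WAIT-1", "rcv ACK of FIN"): ("FIN-WAIT-2", None),
    ("FIN-WAIT-1", "rcv FIN"): ("CLOSING", "snd ACK"),
    ("FIN-WAIT-2", "rcv FIN"): ("TIME-WAIT", "snd ACK"),
    ("CLOSE-WAIT", "CLOSE"): ("LAST-ACK", "snd FIN"),
    ("CLOSING", "rcv ACK of FIN"): ("TIME-WAIT", None),
    ("LAST-ACK", "rcv ACK of FIN"): ("CLOSED", None),
    ("TIME-WAIT", "Timeout=2MSL"): ("CLOSED", "delete TCB"),
}


def next_state(state: str, event: str):
    """Return (next state, action) for an event, or raise if none is drawn."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} in state {state}") from None


def run(events, state: str = "CLOSED") -> str:
    """Apply a sequence of events and return the final state."""
    for event in events:
        state, _action = next_state(state, event)
    return state


# Active open followed by an active close.
print(run(["active OPEN", "rcv SYN,ACK", "CLOSE",
           "rcv ACK of FIN", "rcv FIN", "Timeout=2MSL"]))
```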


3.3. Sequence Numbers

Basic Concepts

A fundamental notion in the design of TCP is that every octet of data sent over a TCP connection has a sequence number. Since every octet is sequenced, each of them can be acknowledged. The acknowledgment mechanism employed is cumulative so that an acknowledgment of sequence number X indicates that all octets up to but not including X have been received.

This mechanism allows for straightforward duplicate detection in the presence of retransmission. Within a segment, the first data octet immediately following the header has the lowest sequence number, and the following octets are numbered consecutively.

Sequence Number Space

Key Fact: The actual sequence number space is finite, though very large. This space ranges from 0 to 2³² - 1.

Modulo Arithmetic: Since the space is finite, all arithmetic dealing with sequence numbers must be performed modulo 2³². This unsigned arithmetic preserves the relationship of sequence numbers as they cycle from 2³² - 1 to 0 again. There are some subtleties to computer modulo arithmetic, so great care should be taken in programming the comparison of such values.

Notation Convention:

  • The symbol =< means "less than or equal" (modulo 2³²); the formulas in this section write it as ≤
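
A common way to program such comparisons is to look at the difference of the two values modulo 2³² and treat differences smaller than 2³¹ as "ahead". That half-space convention is an assumption of this sketch, not something stated in the text, and seq_lt and seq_leq are hypothetical names.

```python
SEQ_MOD = 1 << 32
HALF = 1 << 31


def seq_leq(a: int, b: int) -> bool:
    """a =< b in the circular sequence space (half-space convention)."""
    return (b - a) % SEQ_MOD < HALF


def seq_lt(a: int, b: int) -> bool:
    """a < b in the circular sequence space (half-space convention)."""
    return a != b and seq_leq(a, b)


# 2**32 - 1 comes just before 0 once the space wraps around.
print(seq_lt(2**32 - 1, 0), seq_lt(0, 2**32 - 1))
```

A plain integer comparison would give the opposite answer for the wrapped pair, which is the subtlety the paragraph above warns about.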

Sequence Number Comparisons

Typical sequence number comparisons which TCP must perform include:

  1. Determining that an acknowledgment refers to some sequence number sent but not yet acknowledged
  2. Determining that all the sequence numbers occupied by a segment have been acknowledged (e.g., to remove the segment from a retransmission queue)
  3. Determining that an incoming segment contains sequence numbers which are expected (i.e., that the segment "overlaps" the receive window)

Send Sequence Number Processing

In response to sending data, the TCP will receive acknowledgments. The following comparisons are needed to process the acknowledgments:

SND.UNA = oldest unacknowledged sequence number
SND.NXT = next sequence number to be sent
SEG.ACK = acknowledgment from the receiving TCP (next sequence number expected by the receiving TCP)
SEG.SEQ = first sequence number of a segment
SEG.LEN = the number of octets occupied by the data in the segment (counting SYN and FIN)
SEG.SEQ+SEG.LEN-1 = last sequence number of a segment

Acceptable ACK:

A new acknowledgment (called an "acceptable ack") is one for which the inequality below holds:

SND.UNA < SEG.ACK ≤ SND.NXT

A segment on the retransmission queue is fully acknowledged if the sum of its sequence number and length is less than or equal to the acknowledgment value in the incoming segment.

Example:

SND.UNA = 1000 (oldest unacknowledged)
SND.NXT = 2000 (next to send)

Receive SEG.ACK = 1500
Check: 1000 < 1500 ≤ 2000 ✓ (acceptable)

Receive SEG.ACK = 2500
Check: 1000 < 2500 ≤ 2000 ✗ (not acceptable, acknowledges unsent data)
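
The acceptable-ACK test and the fully-acknowledged test can be written so that they also hold when the sequence numbers wrap, by measuring every value as a distance from SND.UNA modulo 2³². This is a minimal sketch; is_acceptable_ack and segment_fully_acked are hypothetical names, and the second function assumes the queued segment starts at or after SND.UNA.

```python
SEQ_MOD = 1 << 32


def is_acceptable_ack(seg_ack: int, snd_una: int, snd_nxt: int) -> bool:
    """SND.UNA < SEG.ACK =< SND.NXT, evaluated modulo 2**32."""
    acked = (seg_ack - snd_una) % SEQ_MOD        # octets newly acknowledged
    outstanding = (snd_nxt - snd_una) % SEQ_MOD  # octets sent, not yet acknowledged
    return 0 < acked <= outstanding


def segment_fully_acked(seg_seq: int, seg_len: int,
                        seg_ack: int, snd_una: int) -> bool:
    """True if SEG.SEQ + SEG.LEN =< SEG.ACK, measured from SND.UNA."""
    seg_end = (seg_seq + seg_len - snd_una) % SEQ_MOD
    return seg_end <= (seg_ack - snd_una) % SEQ_MOD


print(is_acceptable_ack(1500, snd_una=1000, snd_nxt=2000))
print(is_acceptable_ack(2500, snd_una=1000, snd_nxt=2000))
```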

Receive Sequence Number Processing

When data is received, the following comparisons are needed:

RCV.NXT = next sequence number expected on an incoming segment,
and is the left or lower edge of the receive window

RCV.NXT+RCV.WND-1 = last sequence number expected on an incoming segment,
and is the right or upper edge of the receive window

SEG.SEQ = first sequence number occupied by the incoming segment
SEG.SEQ+SEG.LEN-1 = last sequence number occupied by the incoming segment

Segment Acceptability Test

A segment is judged to be acceptable only if it lies within the window. The test depends on segment length and window size:

Segment Length  Window Size  Test for Acceptability
0               0            SEG.SEQ = RCV.NXT
0               >0           RCV.NXT ≤ SEG.SEQ < RCV.NXT+RCV.WND
>0              0            not acceptable
>0              >0           RCV.NXT ≤ SEG.SEQ < RCV.NXT+RCV.WND
                             or RCV.NXT ≤ SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND

Explanation:

  • The first test for zero-length segments can be viewed as testing a phantom segment that begins at SEG.SEQ and does not occupy any sequence space
  • If RCV.WND is zero, no data is acceptable, but segments that occupy no space are acceptable

Practical Code Example:

def is_segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """Check if segment is acceptable"""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        else:
            return rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd
    else:  # seg_len > 0
        if rcv_wnd == 0:
            return False
        else:
            # Either start or end of segment is within window
            start_in_window = rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd
            end_in_window = rcv_nxt <= seg_seq + seg_len - 1 < rcv_nxt + rcv_wnd
            return start_in_window or end_in_window
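
Plain integer comparisons like those in the table only work while RCV.NXT+RCV.WND stays below 2³². A wraparound-aware variant, sketched here under the assumption that all inputs are already in the range 0 to 2³² - 1, measures each sequence number as a distance from RCV.NXT modulo 2³². The names in_receive_window and is_segment_acceptable_mod are hypothetical.

```python
SEQ_MOD = 1 << 32


def in_receive_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """RCV.NXT =< seq < RCV.NXT+RCV.WND, evaluated modulo 2**32."""
    return (seq - rcv_nxt) % SEQ_MOD < rcv_wnd


def is_segment_acceptable_mod(seg_seq: int, seg_len: int,
                              rcv_nxt: int, rcv_wnd: int) -> bool:
    """The four-case acceptability test with modulo 2**32 arithmetic."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_receive_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False
    last = (seg_seq + seg_len - 1) % SEQ_MOD     # last octet of the segment
    return (in_receive_window(seg_seq, rcv_nxt, rcv_wnd)
            or in_receive_window(last, rcv_nxt, rcv_wnd))


# The window starts 10 octets before the wrap point and extends past it.
print(is_segment_acceptable_mod(5, 10, rcv_nxt=2**32 - 10, rcv_wnd=100))
```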

Initial Sequence Number Selection (ISN)

The choice of the Initial Sequence Number (ISN) is crucial. The TCP must use a clock-driven ISN generator to avoid old connection segments being mistaken as part of a new connection.

ISN Generation Recommendations:

  • ISN should be incremented by 1 every 4 microseconds
  • The specification gives the ISN cycle as approximately 4.55 hours; 2³² increments at exactly 4 microseconds each work out to about 4.77 hours
  • ISN for new connections should be different from ISN of old connections
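
A clock-driven generator of this kind can be sketched as a counter derived from a microsecond clock. This is an illustration of the recommendation only: clock_isn and isn_period_hours are hypothetical names, the 4 microsecond tick is taken as exact, and the use of time.monotonic_ns as the clock source is an assumption of the example. Such a generator is predictable by design.

```python
import time

SEQ_MOD = 1 << 32
TICK_US = 4  # one ISN increment every 4 microseconds


def clock_isn(now_us: int) -> int:
    """Map a microsecond clock reading onto the 32-bit ISN space."""
    return (now_us // TICK_US) % SEQ_MOD


def isn_period_hours() -> float:
    """Time for the 32-bit counter to wrap, in hours."""
    return SEQ_MOD * TICK_US / 1_000_000 / 3600


isn = clock_isn(time.monotonic_ns() // 1000)
print(isn, round(isn_period_hours(), 2))
```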

Security Considerations:

  • Modern implementations should use more secure ISN generation algorithms (RFC 6528)
  • Prevent sequence number prediction attacks

Key Concepts Summary

TCP Header Structure

  • Fixed 20-byte header: Contains all core fields
  • Variable-length options: Up to 40 bytes
  • Checksum covers pseudo header: Provides additional error detection

Sequence Number Mechanism

  • Per-byte numbering: Each data byte has a unique sequence number
  • Cumulative acknowledgment: Acknowledgment number indicates all bytes below it are received
  • Modulo 2³² arithmetic: Sequence number space is circular

Connection States

  • 11 states: From CLOSED to ESTABLISHED and back to CLOSED
  • Event-driven: User calls, segment arrivals, timeouts trigger state transitions
  • Three-way handshake: SYN → SYN-ACK → ACK
  • Four-way close: FIN → ACK → FIN → ACK

TCB Variables

  • Send variables: SND.UNA, SND.NXT, SND.WND, etc.
  • Receive variables: RCV.NXT, RCV.WND, etc.
  • Window management: Core of flow control
