
5. Data Framing

The WebSocket Protocol uses frames to transmit data. This chapter defines the format and processing rules for WebSocket frames.

5.1 Overview

Once a WebSocket connection is established, the client and server can transmit data bidirectionally. Data is transmitted in the form of a series of frames.

Key Concepts:

  • Frame: The basic unit of transmission, containing a header and payload
  • Message: Application-level data that may consist of one or more frames
  • Fragmentation: Large messages can be sent in multiple frames

5.2 Base Framing Protocol

Frame Structure

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+

Field Descriptions

FIN (1 bit)

  • 0: This is not the last frame of the message (more frames follow)
  • 1: This is the last frame of the message (or the only frame)

Message fragmentation example:
Frame 1: FIN=0, Opcode=0x1 (Text), Data="Hello "
Frame 2: FIN=0, Opcode=0x0 (Continuation), Data="World"
Frame 3: FIN=1, Opcode=0x0 (Continuation), Data="!"

Complete message: "Hello World!"

RSV1, RSV2, RSV3 (1 bit each)

  • Reserved for extensions
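  • MUST be 0 unless an extension that defines a meaning for non-zero values has been negotiated
  • If a non-zero value is received and no negotiated extension defines it, the receiving endpoint MUST fail the WebSocket connection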

Opcode (4 bits)

Defines the frame type:

Opcode     Type          Description
0x0        Continuation  Continuation frame (subsequent frame of a fragmented message)
0x1        Text          Text frame (UTF-8 encoded)
0x2        Binary        Binary frame
0x3-0x7    -             Reserved (data frames)
0x8        Close         Close frame
0x9        Ping          Ping frame
0xA        Pong          Pong frame
0xB-0xF    -             Reserved (control frames)

MASK (1 bit)

  • Client to server: MUST be 1
  • Server to client: MUST be 0

If MASK=1, Payload Data must be masked using Masking-key.

Payload Length (7 bits, 7+16 bits, or 7+64 bits)

Payload length encoding:

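  • 0-125: This is the actual length
  • 126: The following 16 bits (2 bytes) are the actual length (unsigned, network byte order)
  • 127: The following 64 bits (8 bytes) are the actual length (unsigned, network byte order; the most significant bit MUST be 0)

The shortest encoding that can represent the length MUST be used.

Example: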
Payload length = 100 bytes
→ Payload len = 100 (direct encoding)

Payload length = 1000 bytes
→ Payload len = 126
→ Extended payload length = 1000 (16-bit)

Payload length = 100000 bytes
→ Payload len = 127
→ Extended payload length = 100000 (64-bit)
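
The three cases above can be sketched as a small encoder. This is an illustrative Node.js helper, not part of the protocol text: the name encodePayloadLength is hypothetical, and it returns only the length bytes, leaving the MASK bit of the first byte unset.

```javascript
// Hypothetical helper: encode a payload length as the 7-bit field plus
// any extended length bytes. The MASK bit (0x80) is left clear here.
function encodePayloadLength(length) {
  if (!Number.isSafeInteger(length) || length < 0) {
    throw new RangeError('length must be a non-negative safe integer');
  }
  if (length <= 125) {
    // Fits directly in the 7-bit field.
    return Buffer.from([length]);
  }
  if (length <= 0xffff) {
    // Marker 126, then a 16-bit big-endian length.
    const buf = Buffer.alloc(3);
    buf[0] = 126;
    buf.writeUInt16BE(length, 1);
    return buf;
  }
  // Marker 127, then a 64-bit big-endian length.
  const buf = Buffer.alloc(9);
  buf[0] = 127;
  buf.writeBigUInt64BE(BigInt(length), 1);
  return buf;
}

console.log(encodePayloadLength(100).toString('hex'));
console.log(encodePayloadLength(1000).toString('hex'));
console.log(encodePayloadLength(100000).toString('hex'));
```

Choosing the branch by comparing against 125 and 0xffff also gives the shortest encoding for every length.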

Masking-key (0 or 4 bytes)

If MASK=1, contains 32 bits (4 bytes) of masking key.

Payload Data (x+y bytes)

Payload data = Extension Data + Application Data

  • Extension Data: Length x, determined by extension negotiation, default 0
  • Application Data: Length y, actual application data
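
To show how the fields described above sit in the first bytes of a frame, here is a minimal header parser sketch for Node.js. The function name parseFrameHeader and the shape of the returned object are assumptions made for illustration; a real implementation would also enforce the RSV, opcode, and masking rules.

```javascript
// Hypothetical helper: parse one frame header from a Buffer.
// Returns null when the buffer does not yet contain the whole header.
function parseFrameHeader(buf) {
  if (buf.length < 2) return null;

  const fin = (buf[0] & 0x80) !== 0;      // bit 0
  const rsv = (buf[0] & 0x70) >> 4;       // RSV1..RSV3 packed into 3 bits
  const opcode = buf[0] & 0x0f;           // low 4 bits of byte 0
  const masked = (buf[1] & 0x80) !== 0;   // MASK bit
  let payloadLength = buf[1] & 0x7f;      // 7-bit length field
  let offset = 2;

  if (payloadLength === 126) {
    if (buf.length < offset + 2) return null;
    payloadLength = buf.readUInt16BE(offset);
    offset += 2;
  } else if (payloadLength === 127) {
    if (buf.length < offset + 8) return null;
    const big = buf.readBigUInt64BE(offset);
    if (big > BigInt(Number.MAX_SAFE_INTEGER)) {
      throw new RangeError('payload too large for this sketch');
    }
    payloadLength = Number(big);
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    if (buf.length < offset + 4) return null;
    maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
  }

  // headerLength is the offset at which Payload Data begins.
  return { fin, rsv, opcode, masked, payloadLength, maskingKey, headerLength: offset };
}
```

Returning null for an incomplete header lets a caller keep buffering bytes from the socket and retry once more data has arrived.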

5.3 Client-to-Server Masking

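Why Is Masking Required?

Security reason: masking prevents cache poisoning attacks against intermediaries. Without it, a malicious page could choose payload bytes that look like an HTTP request on the wire, and a proxy that does not understand WebSocket might treat those bytes as HTTP and cache an attacker-supplied response. Because every frame is XORed with a key the application cannot predict, the attacker cannot control the bytes that actually appear on the wire.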

Masking Algorithm

Clients MUST use the following algorithm to mask all frames sent to the server:

1. Generate a 32-bit masking key that is unpredictable (derived from a strong source of entropy)
2. Place the masking key in the frame header's Masking-key field
3. Apply the mask to each byte of Payload Data:

transformed-octet-i = original-octet-i XOR masking-key[i MOD 4]

Algorithm Implementation (JavaScript):

function maskData(data, maskingKey) {
  // XOR each payload byte with the key byte at position i mod 4.
  const masked = new Uint8Array(data.length);
  for (let i = 0; i < data.length; i++) {
    masked[i] = data[i] ^ maskingKey[i % 4];
  }
  return masked;
}

// Example
const data = Buffer.from('Hello');
const maskingKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const masked = maskData(data, maskingKey); // bytes 7f 9f 4d 51 58

// Decode (using the same algorithm)
const unmasked = maskData(masked, maskingKey);
Buffer.from(unmasked).toString(); // 'Hello'

Key Points:

  • XOR is self-inverse: (A XOR B) XOR B = A
  • The server uses the same algorithm to unmask
  • A fresh, unpredictable masking key must be chosen for each frame sent
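
As a sketch of how the steps and key points above combine, the following Node.js helper builds a complete masked text frame with a fresh key per call. The name buildMaskedTextFrame is hypothetical, the optional maskingKey parameter exists only so the output can be reproduced in a test, and the sketch is limited to payloads of at most 125 bytes to keep the header at two bytes.

```javascript
const { randomBytes } = require('crypto');

// Hypothetical helper: build one unfragmented, masked text frame.
// Assumption: payloads are at most 125 bytes, so no extended length is needed.
function buildMaskedTextFrame(text, maskingKey = randomBytes(4)) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 125) {
    throw new RangeError('this sketch handles payloads of at most 125 bytes');
  }
  const frame = Buffer.alloc(2 + 4 + payload.length);
  frame[0] = 0x81;                    // FIN=1, opcode 0x1 (Text)
  frame[1] = 0x80 | payload.length;   // MASK=1 plus the 7-bit length
  maskingKey.copy(frame, 2);          // Masking-key field
  for (let i = 0; i < payload.length; i++) {
    frame[6 + i] = payload[i] ^ maskingKey[i % 4];
  }
  return frame;
}

// With the key fixed, the result can be compared against a known frame.
const fixedKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
console.log(buildMaskedTextFrame('Hello', fixedKey).toString('hex'));
```

Because the default parameter is evaluated on every call, each frame built without an explicit key gets its own key from the cryptographic random source.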

5.4 Fragmentation

Large messages can be sent in multiple frames.

Fragmentation Rules

  1. First frame: FIN=0, Opcode=data type (0x1 or 0x2)
  2. Middle frames: FIN=0, Opcode=0x0 (Continuation)
  3. Last frame: FIN=1, Opcode=0x0 (Continuation)
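
A sender following these rules could split a payload as in the sketch below. The function fragmentMessage and the { fin, opcode, payload } frame objects are hypothetical names used for illustration; turning each object into wire bytes is left out.

```javascript
// Hypothetical helper: split a payload into frame descriptors.
// The first frame carries the data opcode, later ones use 0x0, and only
// the last frame has fin set.
function fragmentMessage(payload, opcode, maxFragmentSize) {
  if (!Number.isInteger(maxFragmentSize) || maxFragmentSize <= 0) {
    throw new RangeError('maxFragmentSize must be a positive integer');
  }
  const frames = [];
  let offset = 0;
  do {
    const chunk = payload.subarray(offset, offset + maxFragmentSize);
    offset += chunk.length;
    frames.push({
      fin: offset >= payload.length,
      opcode: frames.length === 0 ? opcode : 0x0,
      payload: chunk,
    });
  } while (offset < payload.length);
  return frames;
}

const frames = fragmentMessage(Buffer.from('Hello World!'), 0x1, 5);
for (const f of frames) {
  console.log(f.fin, f.opcode, JSON.stringify(f.payload.toString()));
}
```

A payload that fits in one fragment, including an empty one, comes back as a single frame with fin set and the original opcode, which is the unfragmented case.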

Fragmentation Example

Sending message "Hello World!" in three frames:

Frame 1:
FIN = 0
Opcode = 0x1 (Text)
Payload = "Hello "

Frame 2:
FIN = 0
Opcode = 0x0 (Continuation)
Payload = "World"

Frame 3:
FIN = 1
Opcode = 0x0 (Continuation)
Payload = "!"

Fragmentation Constraints

  • Control frames (Close, Ping, Pong) MUST NOT be fragmented
  • Control frames can be inserted between fragmented data frames
  • Fragments must be sent and received in order
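
On the receiving side, the rules and constraints above suggest a small state machine. The class below is a minimal sketch under stated assumptions: MessageAssembler is a hypothetical name, frames arrive already parsed and unmasked as { fin, opcode, payload } objects, and control frames are routed elsewhere before this code sees them.

```javascript
// Hypothetical reassembler: push data frames in arrival order.
// push() returns the complete message once a frame with fin=true arrives,
// and null while a fragmented message is still in progress.
class MessageAssembler {
  constructor() {
    this.opcode = null; // opcode of the message in progress, if any
    this.parts = [];
  }

  push({ fin, opcode, payload }) {
    if (opcode >= 0x8) {
      throw new Error('control frames must be handled separately');
    }
    if (opcode === 0x0) {
      if (this.opcode === null) {
        throw new Error('continuation frame without a message in progress');
      }
    } else {
      if (this.opcode !== null) {
        throw new Error('new data frame before the previous message finished');
      }
      this.opcode = opcode;
    }
    this.parts.push(payload);
    if (!fin) return null;

    const message = { opcode: this.opcode, data: Buffer.concat(this.parts) };
    this.opcode = null;
    this.parts = [];
    return message;
  }
}

const assembler = new MessageAssembler();
assembler.push({ fin: false, opcode: 0x1, payload: Buffer.from('Hello ') });
assembler.push({ fin: false, opcode: 0x0, payload: Buffer.from('World') });
const message = assembler.push({ fin: true, opcode: 0x0, payload: Buffer.from('!') });
console.log(message.data.toString());
```

Keeping control frames out of the assembler is what allows a Ping or Close to arrive between two fragments without disturbing the message being collected.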

5.5 Control Frames

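Control frames are used to communicate state about the connection. They are identified by opcodes whose most significant bit is 1 (0x8-0xF); of these, 0x8 (Close), 0x9 (Ping), and 0xA (Pong) are currently defined.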

5.5.1 Close

  • Opcode: 0x8
  • Can include close code and reason
  • See Chapter 7 for details

5.5.2 Ping

  • Opcode: 0x9
  • Purpose: Heartbeat detection, check if connection is alive
  • Can carry application data (max 125 bytes)
  • Receiver MUST respond with a Pong frame as soon as practical, unless it has already received a Close frame

Client → Server: Ping (Opcode=0x9)
Server → Client: Pong (Opcode=0xA, same Payload)

5.5.3 Pong

  • Opcode: 0xA
  • Purpose: Respond to Ping frame
  • MUST contain same Payload as Ping frame
  • Can also be sent proactively (unsolicited heartbeat)

Control Frame Rules

  1. Maximum Payload length: 125 bytes
  2. MUST NOT be fragmented
  3. Can be inserted between fragmented data frames
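
The control frame rules, together with the Ping and Pong behavior described above, can be sketched for a server as follows. Both function names are hypothetical, frames are assumed to be parsed and unmasked { fin, opcode, payload } objects, and the Pong is built unmasked because it travels from server to client.

```javascript
// Hypothetical check for the control frame rules.
function validateControlFrame({ fin, opcode, payload }) {
  if (opcode < 0x8 || opcode > 0xf) {
    throw new Error('not a control frame');
  }
  if (!fin) {
    throw new Error('control frames must not be fragmented');
  }
  if (payload.length > 125) {
    throw new Error('control frame payload exceeds 125 bytes');
  }
}

// Hypothetical server-side helper: build the unmasked Pong for a Ping,
// echoing the Ping's payload unchanged.
function buildPongFor(ping) {
  validateControlFrame(ping);
  if (ping.opcode !== 0x9) {
    throw new Error('expected a Ping frame');
  }
  const header = Buffer.from([0x8a, ping.payload.length]); // FIN=1, opcode 0xA
  return Buffer.concat([header, ping.payload]);
}

const pong = buildPongFor({ fin: true, opcode: 0x9, payload: Buffer.from('Hello') });
console.log(pong.toString('hex'));
```

Since the payload is capped at 125 bytes, the length always fits in the 7-bit field and the header never needs the extended length bytes.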

5.6 Data Frames

Data frames carry application-layer or extension-layer data. They are identified by opcodes whose most significant bit is 0: 0x0-0x2 are defined, and 0x3-0x7 are reserved.

Text Frame

  • Opcode: 0x1
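  • Payload MUST be valid UTF-8 encoded text; for a fragmented message this applies to the reassembled message, so a single fragment may end partway through a multi-byte sequence
  • An endpoint that receives invalid UTF-8 MUST fail the WebSocket connection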

Binary Frame

  • Opcode: 0x2
  • Payload can be arbitrary binary data
  • Application layer responsible for interpretation
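
One way to apply the UTF-8 rule for text messages in Node.js is the standard TextDecoder in fatal mode, run on the complete message. This is only a sketch: decodeTextMessage is a hypothetical name, and returning null stands in for whatever connection-failure handling the implementation uses.

```javascript
// Hypothetical helper: decode a complete text message, or return null
// if the bytes are not valid UTF-8.
function decodeTextMessage(data) {
  // fatal: true makes decode() throw on invalid input instead of
  // substituting U+FFFD; ignoreBOM: true keeps a leading U+FEFF in the result.
  const decoder = new TextDecoder('utf-8', { fatal: true, ignoreBOM: true });
  try {
    return decoder.decode(data);
  } catch (err) {
    return null;
  }
}

console.log(decodeTextMessage(Buffer.from('Hello')));
console.log(decodeTextMessage(Buffer.from([0xc3, 0x28])));
```

Validating the reassembled message rather than each frame avoids rejecting a fragment that merely ends in the middle of a multi-byte character.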

5.7 Examples

Single Unmasked Text Frame

0x81 0x05 0x48 0x65 0x6c 0x6c 0x6f
│    │    └──────────┬───────────┘
│    │               └─ "Hello" (5 bytes)
│    └─ Payload len = 5
└─ FIN=1, Opcode=0x1 (Text)

Single Masked Text Frame

0x81 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58
│    │    └────────┬─────────┘ └──────────┬───────────┘
│    │             │                     └─ Masked "Hello"
│    │             └─ Masking key
│    └─ MASK=1, Payload len = 5
└─ FIN=1, Opcode=0x1 (Text)

Fragmented Message

Frame 1: 0x01 0x03 0x48 0x65 0x6c   // FIN=0, Text, "Hel"
Frame 2: 0x80 0x02 0x6c 0x6f        // FIN=1, Continuation, "lo"

Complete message: "Hello"

Ping Frame

0x89 0x05 0x48 0x65 0x6c 0x6c 0x6f
│    │    └──────────┬───────────┘
│    │               └─ "Hello" (optional data)
│    └─ Payload len = 5
└─ FIN=1, Opcode=0x9 (Ping)

5.8 Extensibility

The protocol can be extended through:

  1. Opcode: 0x3-0x7 and 0xB-0xF reserved for future use
  2. RSV bits: Reserved for extension use
  3. Extension Data: Extensions can place their own data ahead of the Application Data inside Payload Data

Extensions must be negotiated through handshake (Sec-WebSocket-Extensions).