5. Data Framing
The WebSocket Protocol uses frames to transmit data. This chapter defines the format and processing rules for WebSocket frames.
5.1 Overview
Once a WebSocket connection is established, the client and server can each send data at any time. Data is transmitted as a sequence of frames.
Key Concepts:
- Frame: The basic unit of transmission, containing a header and payload
- Message: Application-level data that may consist of one or more frames
- Fragmentation: Large messages can be sent in multiple frames
5.2 Base Framing Protocol
Frame Structure
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
Field Descriptions
FIN (1 bit)
- 0: This is not the last frame of the message (more frames follow)
- 1: This is the last frame of the message (or the only frame)
Message fragmentation example:
Frame 1: FIN=0, Opcode=0x1 (Text), Data="Hello "
Frame 2: FIN=0, Opcode=0x0 (Continuation), Data="World"
Frame 3: FIN=1, Opcode=0x0 (Continuation), Data="!"
Complete message: "Hello World!"
RSV1, RSV2, RSV3 (1 bit each)
- Reserved for extensions
- MUST be 0 unless a negotiated extension defines a meaning for the bit
- If a non-zero value is received and no negotiated extension defines it, the receiver MUST fail the connection
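The RSV rule above can be sketched as a single bit test. This is a minimal illustration, not a library API: the function name `reservedBitsAreValid` and the `allowedRsvMask` parameter (RSV1=0x4, RSV2=0x2, RSV3=0x1) are assumptions made for this example.

```javascript
// Check the RSV bits in a frame's first byte.
// `allowedRsvMask` is a hypothetical 3-bit mask of the RSV bits that
// negotiated extensions have given a meaning (RSV1=0x4, RSV2=0x2, RSV3=0x1).
function reservedBitsAreValid(firstByte, allowedRsvMask = 0) {
  const rsv = (firstByte >> 4) & 0x07; // bits 1-3 of the first byte
  // Any set bit that is not covered by the mask is a protocol violation.
  return (rsv & ~allowedRsvMask) === 0;
}
```

With no extension negotiated, the mask stays 0 and any frame with a reserved bit set is rejected.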
Opcode (4 bits)
Defines the frame type:
| Opcode | Type | Description |
|---|---|---|
| 0x0 | Continuation | Continuation frame (subsequent frames of fragmented message) |
| 0x1 | Text | Text frame (UTF-8 encoded) |
| 0x2 | Binary | Binary frame |
| 0x3-0x7 | - | Reserved (data frames) |
| 0x8 | Close | Close frame |
| 0x9 | Ping | Ping frame |
| 0xA | Pong | Pong frame |
| 0xB-0xF | - | Reserved (control frames) |
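The opcode table maps naturally onto a small lookup. The sketch below is illustrative only; the names `OPCODES` and `classifyOpcode` are hypothetical and do not come from any library.

```javascript
// Opcode values from the table above (constant names are illustrative).
const OPCODES = Object.freeze({
  CONTINUATION: 0x0,
  TEXT: 0x1,
  BINARY: 0x2,
  CLOSE: 0x8,
  PING: 0x9,
  PONG: 0xa,
});

// Classify a 4-bit opcode as a data or control frame and flag reserved values.
function classifyOpcode(opcode) {
  if (!Number.isInteger(opcode) || opcode < 0x0 || opcode > 0xf) {
    throw new RangeError('opcode must be an integer in 0x0-0xF');
  }
  const isControl = (opcode & 0x8) !== 0; // high bit set means 0x8-0xF
  const isDefined = Object.values(OPCODES).includes(opcode);
  return { kind: isControl ? 'control' : 'data', reserved: !isDefined };
}
```

A receiver would typically fail the connection when `reserved` is true, since no meaning has been defined for that opcode.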
MASK (1 bit)
- Client to server: MUST be 1
- Server to client: MUST be 0
If MASK=1, the Masking-key field is present in the header and the Payload Data is masked with it (see 5.3).
Payload Length (7 bits, 7+16 bits, or 7+64 bits)
Payload length encoding:
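- 0-125: The value is the payload length itself
- 126: The following 16 bits (2 bytes) hold the length as an unsigned integer in network byte order
- 127: The following 64 bits (8 bytes) hold the length as an unsigned integer in network byte order; the most significant bit MUST be 0
- The shortest encoding that fits the length MUST be used; a 100-byte payload, for instance, cannot use the 126 form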
Example:
Payload length = 100 bytes
→ Payload len = 100 (direct encoding)
Payload length = 1000 bytes
→ Payload len = 126
→ Extended payload length = 1000 (16-bit)
Payload length = 100000 bytes
→ Payload len = 127
→ Extended payload length = 100000 (64-bit)
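The three cases in the example can be reproduced with a short encoder. This is a sketch assuming Node.js (`Buffer` is used for byte handling); the function name `encodePayloadLength` is hypothetical.

```javascript
// Encode a payload length as the second header byte plus any extended
// length bytes. The MASK bit shares the second byte, so it is passed in.
function encodePayloadLength(length, masked = false) {
  if (!Number.isSafeInteger(length) || length < 0) {
    throw new RangeError('length must be a non-negative safe integer');
  }
  const maskBit = masked ? 0x80 : 0x00;
  if (length <= 125) {
    return Buffer.from([maskBit | length]); // direct 7-bit encoding
  }
  if (length <= 0xffff) {
    const out = Buffer.alloc(3);
    out[0] = maskBit | 126;
    out.writeUInt16BE(length, 1); // 16-bit length, network byte order
    return out;
  }
  const out = Buffer.alloc(9);
  out[0] = maskBit | 127;
  out.writeBigUInt64BE(BigInt(length), 1); // 64-bit length, network byte order
  return out;
}
```

Choosing the branch by the size of `length` also satisfies the requirement to use the shortest encoding.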
Masking-key (0 or 4 bytes)
If MASK=1, contains 32 bits (4 bytes) of masking key.
Payload Data (x+y bytes)
Payload data = Extension Data + Application Data
- Extension Data: Length x, determined by extension negotiation, default 0
- Application Data: Length y, actual application data
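Putting the field descriptions together, a frame header can be decoded in one pass. The sketch below assumes Node.js and an input `Buffer` that starts at a frame boundary; `parseFrameHeader` and the shape of its return value are hypothetical choices for this example.

```javascript
// Parse one frame header from a Buffer, following the layout in 5.2.
// Returns null when the buffer does not yet contain the whole header.
function parseFrameHeader(buf) {
  if (buf.length < 2) return null;
  const fin = (buf[0] & 0x80) !== 0;
  const rsv = (buf[0] >> 4) & 0x07; // RSV1..RSV3 packed into 3 bits
  const opcode = buf[0] & 0x0f;
  const masked = (buf[1] & 0x80) !== 0;
  let payloadLength = buf[1] & 0x7f;
  let offset = 2;

  if (payloadLength === 126) {
    if (buf.length < offset + 2) return null;
    payloadLength = buf.readUInt16BE(offset);
    offset += 2;
  } else if (payloadLength === 127) {
    if (buf.length < offset + 8) return null;
    const big = buf.readBigUInt64BE(offset);
    if (big > BigInt(Number.MAX_SAFE_INTEGER)) {
      throw new RangeError('payload length is too large for this sketch');
    }
    payloadLength = Number(big);
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    if (buf.length < offset + 4) return null;
    maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
  }
  // headerLength is the offset at which Payload Data begins.
  return { fin, rsv, opcode, masked, payloadLength, maskingKey, headerLength: offset };
}
```

The payload then occupies bytes `headerLength` through `headerLength + payloadLength - 1`; a real parser would also wait until that many bytes have arrived.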
5.3 Client-to-Server Masking
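Why Masking Is Required
Security reason: to prevent cache poisoning attacks. An intermediary that does not understand WebSocket may misinterpret frame bytes as HTTP traffic and cache attacker-chosen content. Masking each frame with an unpredictable key means the sending application cannot control the exact bytes that appear on the wire.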
Masking Algorithm
Clients MUST use the following algorithm to mask all frames sent to server:
1. Generate 32-bit random masking key
2. Place masking key in frame header's Masking-key field
3. Apply mask to each byte of Payload Data:
transformed-octet-i = original-octet-i XOR masking-key[i MOD 4]
Algorithm Implementation (JavaScript):
// XOR each payload byte with the masking key byte at index i mod 4.
function maskData(data, maskingKey) {
  const masked = new Uint8Array(data.length);
  for (let i = 0; i < data.length; i++) {
    masked[i] = data[i] ^ maskingKey[i % 4];
  }
  return masked;
}
// Example
const data = Buffer.from('Hello');
const maskingKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const masked = maskData(data, maskingKey); // bytes 7f 9f 4d 51 58
// Decode (using same algorithm)
const unmasked = maskData(masked, maskingKey);
console.log(Buffer.from(unmasked).toString()); // 'Hello'
Key Points:
- XOR is self-inverse: (A XOR B) XOR B = A
- The server uses the same algorithm to unmask
- The client must choose a fresh, unpredictable masking key for every frame it sends
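The three masking steps can be combined into a frame builder. This is a minimal sketch assuming Node.js, with `crypto.randomBytes` as the source of the random key; the function name `buildMaskedTextFrame` is hypothetical, and the sketch only handles payloads of up to 125 bytes so that no extended length field is needed.

```javascript
const crypto = require('node:crypto');

// Build one masked text frame (FIN=1, opcode 0x1) for a short payload.
// Assumption: the UTF-8 payload is at most 125 bytes long.
// The maskingKey parameter exists only so the output can be reproduced;
// by default a new random key is generated for every call.
function buildMaskedTextFrame(text, maskingKey = crypto.randomBytes(4)) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 125) {
    throw new RangeError('this sketch only handles payloads up to 125 bytes');
  }
  const frame = Buffer.alloc(2 + 4 + payload.length);
  frame[0] = 0x80 | 0x01;           // FIN=1, opcode=0x1 (Text)
  frame[1] = 0x80 | payload.length; // MASK=1, 7-bit payload length
  frame.set(maskingKey, 2);         // step 2: key goes into the header
  for (let i = 0; i < payload.length; i++) {
    frame[6 + i] = payload[i] ^ maskingKey[i % 4]; // step 3: XOR each byte
  }
  return frame;
}
```

Calling it with the key `37 fa 21 3d` and the text "Hello" yields the masked frame shown in section 5.7.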
5.4 Fragmentation
Large messages can be sent in multiple frames.
Fragmentation Rules
- First frame: FIN=0, Opcode=data type (0x1 or 0x2)
- Middle frames: FIN=0, Opcode=0x0 (Continuation)
- Last frame: FIN=1, Opcode=0x0 (Continuation)
Fragmentation Example
Sending message "Hello World!" in three frames:
Frame 1:
FIN = 0
Opcode = 0x1 (Text)
Payload = "Hello "
Frame 2:
FIN = 0
Opcode = 0x0 (Continuation)
Payload = "World"
Frame 3:
FIN = 1
Opcode = 0x0 (Continuation)
Payload = "!"
Fragmentation Constraints
- Control frames (Close, Ping, Pong) MUST NOT be fragmented
- Control frames can be inserted between fragmented data frames
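- Fragments of a message MUST be delivered in the order they were sent, and fragments of different messages MUST NOT be interleaved unless a negotiated extension defines how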
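On the receiving side, the rules and constraints above amount to a small state machine. The sketch below assumes frames have already been parsed and unmasked into objects of the form `{ fin, opcode, payload }` with `payload` as a Node.js `Buffer`; the class name `MessageAssembler` and that frame shape are hypothetical.

```javascript
// Reassemble messages from parsed frames: { fin, opcode, payload }.
// Assumption: payloads are Buffers that have already been unmasked.
class MessageAssembler {
  constructor() {
    this.opcode = null; // opcode of the message in progress, if any
    this.chunks = [];
  }

  // Returns a complete { opcode, payload } or null if more frames are needed.
  push(frame) {
    const isControl = (frame.opcode & 0x8) !== 0;
    if (isControl) {
      if (!frame.fin) throw new Error('control frames must not be fragmented');
      // Control frames may arrive mid-message and leave reassembly untouched.
      return { opcode: frame.opcode, payload: frame.payload };
    }
    if (frame.opcode === 0x0) {
      if (this.opcode === null) throw new Error('continuation without a first frame');
    } else {
      if (this.opcode !== null) throw new Error('new message started before the previous one finished');
      this.opcode = frame.opcode; // first frame carries the data type
    }
    this.chunks.push(frame.payload);
    if (!frame.fin) return null;
    const message = { opcode: this.opcode, payload: Buffer.concat(this.chunks) };
    this.opcode = null;
    this.chunks = [];
    return message;
  }
}
```

Feeding it the three frames from the example, with a Ping injected between them, returns the Ping immediately and the text message "Hello World!" once the final fragment arrives.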
5.5 Control Frames
Control frames communicate state about the connection. They are identified by opcodes whose most significant bit is 1 (0x8-0xF); of these, 0xB-0xF are reserved.
5.5.1 Close
- Opcode: 0x8
- Can include close code and reason
- See Chapter 7 for details
5.5.2 Ping
- Opcode: 0x9
- Purpose: Heartbeat detection, check if connection is alive
- Can carry application data (max 125 bytes)
- Receiver MUST respond with a Pong frame as soon as practical, unless it has already received a Close frame
Client → Server: Ping (Opcode=0x9)
Server → Client: Pong (Opcode=0xA, same Payload)
5.5.3 Pong
- Opcode: 0xA
- Purpose: Respond to Ping frame
- A Pong sent in response MUST carry the same Application Data as the Ping it answers
- Can also be sent unsolicited as a one-way heartbeat; no response is expected
Control Frame Rules
- Maximum Payload length: 125 bytes
- MUST NOT be fragmented
- Can be inserted between fragmented data frames
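These rules are simple enough to check mechanically. The sketch below assumes Node.js and a server replying to a client (so the Pong is unmasked); `isValidControlFrame` and `buildPong` are hypothetical names used only for illustration.

```javascript
// Check the control frame rules: control opcode, FIN=1, payload <= 125 bytes.
function isValidControlFrame({ fin, opcode, payloadLength }) {
  const isControl = (opcode & 0x8) !== 0;
  return isControl && fin === true && payloadLength <= 125;
}

// Build an unmasked Pong frame that echoes a Ping's payload.
// Assumption: the sender is a server, so MASK=0.
function buildPong(pingPayload) {
  if (pingPayload.length > 125) {
    throw new RangeError('control frame payload must be at most 125 bytes');
  }
  return Buffer.concat([
    Buffer.from([0x80 | 0x0a, pingPayload.length]), // FIN=1, opcode=0xA, MASK=0
    pingPayload,
  ]);
}
```

A client would build the same frame but set the MASK bit and mask the payload as described in 5.3.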
5.6 Data Frames
Data frames carry application or extension data and are identified by opcodes whose most significant bit is 0. Opcodes 0x0-0x2 are defined; 0x3-0x7 are reserved for future data frames.
Text Frame
- Opcode: 0x1
- Payload MUST be valid UTF-8 encoded text
- MUST close connection if invalid UTF-8 received
Binary Frame
- Opcode: 0x2
- Payload can be arbitrary binary data
- Application layer responsible for interpretation
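The UTF-8 requirement for text frames can be enforced with the standard `TextDecoder` in fatal mode, which throws on malformed input. This is a sketch assuming the payload of a complete (reassembled) text message is available as a `Uint8Array` or `Buffer`; the function name `decodeTextMessage` and its return shape are hypothetical. Close code 1007 is the code RFC 6455 assigns to payload data that is inconsistent with the message type.

```javascript
// Decode a complete text message, rejecting invalid UTF-8.
function decodeTextMessage(payload) {
  // fatal: true makes decode() throw a TypeError on malformed bytes;
  // ignoreBOM: true keeps a leading byte order mark in the output.
  const decoder = new TextDecoder('utf-8', { fatal: true, ignoreBOM: true });
  try {
    return { ok: true, text: decoder.decode(payload) };
  } catch (err) {
    // The caller should close the connection, for example with code 1007.
    return { ok: false, closeCode: 1007 };
  }
}
```

Validation is applied to the whole message rather than to each frame, because a fragment boundary may fall in the middle of a multi-byte character.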
5.7 Examples
Single Unmasked Text Frame
0x81 0x05 0x48 0x65 0x6c 0x6c 0x6f
│    │    └───────────┬──────────┘
│    │                └─ "Hello" (5 bytes)
│    └─ Payload len = 5
└─ FIN=1, Opcode=0x1 (Text)
Single Masked Text Frame
0x81 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58
│    │    └────────┬────────┘ └───────────┬──────────┘
│    │             │                      └─ Masked "Hello"
│    │             └─ Masking key
│    └─ MASK=1, Payload len = 5
└─ FIN=1, Opcode=0x1 (Text)
Fragmented Message
Frame 1: 0x01 0x03 0x48 0x65 0x6c // FIN=0, Text, "Hel"
Frame 2: 0x80 0x02 0x6c 0x6f // FIN=1, Continuation, "lo"
Complete message: "Hello"
Ping Frame
0x89 0x05 0x48 0x65 0x6c 0x6c 0x6f
│    │    └───────────┬──────────┘
│    │                └─ "Hello" (optional data)
│    └─ Payload len = 5
└─ FIN=1, Opcode=0x9 (Ping)
5.8 Extensibility
The protocol can be extended through:
- Opcode: 0x3-0x7 and 0xB-0xF reserved for future use
- RSV bits: Reserved for extension use
- Extension Data: Extensions can place their own data ahead of the Application Data within the Payload Data
Extensions must be negotiated through handshake (Sec-WebSocket-Extensions).