6. Content-Transfer-Encoding Header Field
Many media types that could usefully be transported via email are represented, in their "natural" format, as 8-bit character or binary data. Such data cannot be transmitted over some transport protocols. For example, RFC 821 restricts mail messages to 7-bit US-ASCII data with lines no longer than 1000 characters, including the trailing CRLF. The Content-Transfer-Encoding field specifies which encoding transformation, if any, has been applied to the body so that it can pass through such a transport.
6.1. Content-Transfer-Encoding Syntax
encoding := "Content-Transfer-Encoding" ":" mechanism
mechanism := "7bit" / "8bit" / "binary" /
"quoted-printable" / "base64" /
ietf-token / x-token
These values are not case sensitive -- Base64 and BASE64 and bAsE64 are all equivalent.
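As an illustration of the grammar and the case-insensitivity rule, the following Python sketch extracts and normalizes the mechanism from a header line. The function name `parse_cte` and the set `KNOWN_MECHANISMS` are hypothetical, and the sketch assumes a single unfolded header line without RFC 822 comments.

```python
KNOWN_MECHANISMS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}

def parse_cte(header_line: str) -> str:
    """Return the lower-cased mechanism from a Content-Transfer-Encoding line."""
    name, sep, value = header_line.partition(":")
    if not sep or name.strip().lower() != "content-transfer-encoding":
        raise ValueError("not a Content-Transfer-Encoding field")
    mechanism = value.strip().lower()
    if mechanism in KNOWN_MECHANISMS or mechanism.startswith("x-"):
        return mechanism
    # Anything else would have to be a registered ietf-token; this sketch
    # has no registry to consult, so it rejects the value.
    raise ValueError(f"unrecognized mechanism: {mechanism!r}")
```

Lower-casing once at the parsing boundary means the rest of a program can compare mechanisms with plain string equality.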
6.2. Content-Transfer-Encoding Semantics
7bit
The "7bit" encoding means that the data is represented as relatively short lines of US-ASCII data. Lines must be no longer than 998 octets, not counting the CRLF. No octets with decimal values greater than 127 and no NUL octets (decimal value 0) are allowed. CR and LF occur only as part of CRLF sequences.
8bit
The "8bit" encoding means that the data is represented as relatively short lines with 998 octets or fewer between CRLF sequences, but octets with decimal values greater than 127 may be used. As with "7bit" data, CR and LF occur only as part of CRLF sequences and no NULs are allowed.
binary
The "binary" encoding indicates that any sequence of octets whatsoever is allowed. This encoding is not further defined in this document.
quoted-printable
The "quoted-printable" encoding is intended to represent data that largely consists of octets that correspond to printable characters in the US-ASCII character set.
base64
The "base64" encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable.
6.3. New Content-Transfer-Encodings
New Content-Transfer-Encoding values may be registered with IANA. The requirements for such registration are specified in RFC 2048.
6.4. Interpretation and Use
The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all mean that the identity (i.e., NO) encoding transformation has been performed. As such, they serve simply as indicators of the domain of the body data, and provide useful information about the sort of encoding that might be needed for transmission in a given transport system.
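Because these three values only describe the domain of the data, a sender can determine which one applies by inspecting the octets. The sketch below applies the definitions of "7bit", "8bit", and "binary" given in 6.2; the function name `classify_domain` is hypothetical.

```python
def classify_domain(body: bytes) -> str:
    """Return "7bit", "8bit", or "binary" for the given body octets."""
    if b"\x00" in body:
        return "binary"                      # NUL is not allowed in 7bit/8bit
    # CR and LF may occur only as CRLF: once every CRLF is removed,
    # no CR or LF may remain.
    remainder = body.replace(b"\r\n", b"")
    if b"\r" in remainder or b"\n" in remainder:
        return "binary"
    if any(len(line) > 998 for line in body.split(b"\r\n")):
        return "binary"                      # line longer than 998 octets
    if any(octet > 127 for octet in body):
        return "8bit"
    return "7bit"
```

The result says what the data looks like, not how it must be sent: "8bit" or "binary" data still needs quoted-printable or base64 when the transport is limited to 7-bit data.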
6.5. Translating Encodings
It may be desirable to transmit non-textual body content without encoding it using the Base64 or Quoted-Printable encodings. The "8bit" and "binary" values label such unencoded bodies, which can be sent only over transports that accept 8-bit or binary data.
6.6. Canonical Encoding Model
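The encoding formats defined here produce output that consists entirely of US-ASCII characters. The transfer encoding is independent of the character set of the data: the "charset" parameter in the Content-Type field describes the data before encoding (equivalently, after decoding), not the encoded form. For example, a body labeled "Content-Type: text/plain; charset=ISO-8859-1" with "Content-Transfer-Encoding: base64" is a base64 US-ASCII encoding of data that was originally in ISO-8859-1 and will be in that character set again after decoding.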
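The relationship between the charset and the transfer encoding can be illustrated as two separate layers. In this Python sketch the sample text and the choice of ISO-8859-1 are assumptions made for the example: the transfer encoding is undone first, which yields the original octets, and the charset then says how to read those octets as text.

```python
import base64

original_text = "Gr\u00fc\u00dfe"                    # sample text with non-ASCII characters
octets = original_text.encode("iso-8859-1")          # charset layer: text -> octets
body = base64.b64encode(octets).decode("ascii")      # transfer layer: octets -> ASCII

# The receiver reverses the layers in the opposite order.
recovered = base64.b64decode(body).decode("iso-8859-1")
```

The encoded `body` contains only US-ASCII characters, while `recovered` equals the original text.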
6.7. Quoted-Printable Content-Transfer-Encoding
The Quoted-Printable encoding is intended for data that is largely text. Octets that correspond to printable ASCII characters (values 33 through 126) are mostly left unchanged, so the encoded form remains largely readable, and all other octets are escaped.
Encoding Rules
- Any printable ASCII character other than "=" (decimal values 33 through 60 and 62 through 126) may be represented literally
- Tab and space may be represented literally, unless they appear at the end of a line
- The equals sign "=" is used as an escape character
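- Any other octet is represented as "=" followed by two uppercase hexadecimal digits giving the octet's value
- A line break in the original text (CRLF in canonical form) must be represented as a CRLF line break in the encoded output; CR and LF octets that are not part of a text line break (for example, in binary data) must be encoded as "=0D" and "=0A"
- Encoded lines must not be longer than 76 characters, not counting the CRLF; longer lines are split with soft line breaks, where "=" as the last character of an encoded line tells the decoder to join it with the next line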
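The rules above can be sketched for a single logical line as follows. This is a minimal illustration, not a complete encoder: the function name `qp_encode_line` is hypothetical, the input is assumed to contain no line breaks, and the soft-break placement is deliberately conservative (it always keeps one column free for the trailing "=").

```python
import quopri

def qp_encode_line(data: bytes, max_len: int = 76) -> str:
    """Encode one logical line (no line breaks) as quoted-printable."""
    tokens = []
    last = len(data) - 1
    for i, octet in enumerate(data):
        if 33 <= octet <= 60 or 62 <= octet <= 126:
            tokens.append(chr(octet))            # printable, not "="
        elif octet in (9, 32) and i != last:
            tokens.append(chr(octet))            # tab/space, not at end of line
        else:
            tokens.append(f"={octet:02X}")       # "=" plus two uppercase hex digits

    lines, current = [], ""
    for token in tokens:
        # Keep one column free for the "=" that marks a soft line break.
        if len(current) + len(token) > max_len - 1:
            lines.append(current + "=")
            current = ""
        current += token
    lines.append(current)
    return "\r\n".join(lines)

encoded = qp_encode_line(b"truth=beauty")        # "truth=3Dbeauty"
# Round-trip check with the standard library decoder.
assert quopri.decodestring(encoded.encode("ascii")) == b"truth=beauty"
```

Building a list of tokens first keeps an escape sequence such as "=3D" from being split across a soft line break.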
Example
Original: If you believe that truth=beauty, then surely mathematics is the most beautiful branch of philosophy.
Encoded: If you believe that truth=3Dbeauty, then surely mathematics is the most =
beautiful branch of philosophy.
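The example can be checked with Python's standard `quopri` module. The sketch below assumes the two encoded lines are separated by CRLF, as they would be in a message body.

```python
import quopri

encoded = (
    b"If you believe that truth=3Dbeauty, then surely mathematics is the most =\r\n"
    b"beautiful branch of philosophy."
)
# The decoder turns "=3D" back into "=" and removes the soft line break.
decoded = quopri.decodestring(encoded)
```

After decoding, `decoded` is the original sentence on a single line.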
6.8. Base64 Content-Transfer-Encoding
The Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable.
Encoding Process
- Divide the input data stream into groups of 24 bits (3 octets)
- Divide each 24-bit group into four groups of 6 bits
- Map each 6-bit group to one character in the Base64 alphabet
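- If the last group has fewer than 24 bits, pad it with zero bits to a multiple of 6 bits and complete the output with "=": a final group of one octet yields two characters followed by "==", and a final group of two octets yields three characters followed by "="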
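The steps above can be sketched directly in Python. The function name `base64_encode` is hypothetical and the sketch omits line wrapping; in practice the standard `base64` module does the same work, and it is used here only as a cross-check.

```python
import base64
import string

# The 64-character alphabet: A-Z, a-z, 0-9, "+", "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def base64_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pad a short final chunk with zero octets to form a 24-bit group.
        group = int.from_bytes(chunk.ljust(3, b"\x00"), "big")
        # Split the group into four 6-bit values, most significant first.
        chars = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # n input octets carry n + 1 significant characters; the rest is "=".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

# Cross-check against the standard library.
assert base64_encode(b"Man") == base64.b64encode(b"Man").decode("ascii")
```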
Base64 Alphabet
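| Value | Encoding | Value | Encoding | Value | Encoding | Value | Encoding |
|---|---|---|---|---|---|---|---|
| 0 | A | 17 | R | 34 | i | 51 | z |
| 1 | B | 18 | S | 35 | j | 52 | 0 |
| 2 | C | 19 | T | 36 | k | 53 | 1 |
| 3 | D | 20 | U | 37 | l | 54 | 2 |
| 4 | E | 21 | V | 38 | m | 55 | 3 |
| 5 | F | 22 | W | 39 | n | 56 | 4 |
| 6 | G | 23 | X | 40 | o | 57 | 5 |
| 7 | H | 24 | Y | 41 | p | 58 | 6 |
| 8 | I | 25 | Z | 42 | q | 59 | 7 |
| 9 | J | 26 | a | 43 | r | 60 | 8 |
| 10 | K | 27 | b | 44 | s | 61 | 9 |
| 11 | L | 28 | c | 45 | t | 62 | + |
| 12 | M | 29 | d | 46 | u | 63 | / |
| 13 | N | 30 | e | 47 | v | | |
| 14 | O | 31 | f | 48 | w | (pad) | = |
| 15 | P | 32 | g | 49 | x | | |
| 16 | Q | 33 | h | 50 | y | | |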
Example
Original (ASCII): Man
Binary: 01001101 01100001 01101110
Grouped 6-bit: 010011 010110 000101 101110
Base64: T W F u
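The bit grouping in this example, and the padding behavior for shorter inputs, can be reproduced in Python. The inputs "Ma" and "M" are additions for illustration and do not appear in the example above.

```python
import base64

bits = "".join(f"{octet:08b}" for octet in b"Man")
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
# groups == ["010011", "010110", "000101", "101110"]
values = [int(group, 2) for group in groups]
# values == [19, 22, 5, 46], which map to T, W, F, u in the alphabet

padded = {text: base64.b64encode(text).decode("ascii") for text in (b"Man", b"Ma", b"M")}
# padded == {b"Man": "TWFu", b"Ma": "TWE=", b"M": "TQ=="}
```

Two input octets produce one "=" and a single input octet produces two, so the encoded length is always a multiple of four characters.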
Encoded Output Format
- The encoded output stream must be represented in lines of no more than 76 characters each
- Lines need not all be the same length; only the 76-character maximum is required
- Line breaks in the encoded output are not part of the data: decoding software must ignore CRLF pairs and any other characters outside the Base64 alphabet
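A minimal sketch of the output format, using Python's standard `base64` module: the helper names `encode_body` and `decode_body` are hypothetical. The encoder breaks the output every 76 characters with CRLF, and the decoder relies on the library discarding characters outside the alphabet.

```python
import base64

def encode_body(data: bytes, width: int = 76) -> str:
    """Base64-encode data and break the output into lines of at most `width`."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "\r\n".join(lines)

def decode_body(body: str) -> bytes:
    # With validate=False (the default), b64decode discards characters
    # outside the alphabet, including CR and LF, before decoding.
    return base64.b64decode(body, validate=False)
```

Since 76 is a multiple of four, every line except possibly the last holds a whole number of 4-character groups.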
Encoding Comparison:
| Encoding | Purpose | Line Limit | Character Set | Expansion |
|---|---|---|---|---|
| 7bit | Pure ASCII text | 998 bytes | US-ASCII | None |
| 8bit | Extended text | 998 bytes | 8-bit octets | None |
| binary | Binary data | None | Any | None |
| quoted-printable | Mostly ASCII | 76 chars | ASCII + escape | ~1-3x |
| base64 | Arbitrary binary | 76 chars | 64 chars | ~1.33x |
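The expansion figures in the table can be estimated empirically. The sketch below uses the standard `quopri` and `base64` modules, whose encoders include line breaks in their output; the function name `expansion` is hypothetical, and the input is assumed to be non-empty.

```python
import base64
import quopri

def expansion(data: bytes) -> dict:
    """Return encoded size divided by original size for each encoding."""
    return {
        "quoted-printable": len(quopri.encodestring(data)) / len(data),
        "base64": len(base64.encodebytes(data)) / len(data),
    }

ascii_ratio = expansion(b"hello world")       # quoted-printable leaves this unchanged
binary_ratio = expansion(b"\xe9" * 300)       # every octet must be escaped
```

For pure ASCII input quoted-printable adds nothing, while input made entirely of escaped octets grows to slightly more than three times its size once soft line breaks are counted. Base64 stays close to 4/3 regardless of content.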
Selection Guide:
- Pure ASCII text: 7bit (no encoding needed)
- Text with occasional non-ASCII: quoted-printable (better readability)
- Binary data (images, attachments): base64 (standard method)
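- Transports that accept 8-bit data: 8bit or binary, but only when every hop supports it (for SMTP, through the 8BITMIME and BINARYMIME extensions)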
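The guide can be turned into a simple heuristic for a 7-bit transport. This is one possible sketch, not a prescribed algorithm: the function name `choose_encoding`, the `is_text` flag, and the threshold are all assumptions. The threshold of 1/6 is a rough break-even estimate that ignores line breaks: with n octets of which k need escaping, quoted-printable produces about n + 2k characters and base64 about 4n/3, and the two are equal when k/n = 1/6.

```python
def choose_encoding(body: bytes, is_text: bool, threshold: float = 1 / 6) -> str:
    """Pick a transfer encoding for a transport limited to 7-bit data."""
    if not is_text:
        return "base64"                      # images, attachments, other binary data
    non_ascii = sum(1 for octet in body if octet > 127)
    if non_ascii == 0 and all(len(line) <= 998 for line in body.split(b"\r\n")):
        return "7bit"                        # already fits; no encoding needed
    # Each escaped octet costs three characters in quoted-printable, so
    # beyond a certain density base64 gives the smaller output.
    if non_ascii / len(body) <= threshold:
        return "quoted-printable"
    return "base64"
```

The sketch does not check text for NUL octets or bare CR and LF; a fuller version would treat those cases as well.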