1. Introduction
Since its publication in 1982, RFC 822 has defined the standard format of textual mail messages on the Internet. Its success has been such that the RFC 822 format has been adopted, wholly or partially, well beyond the confines of the Internet and the Internet SMTP transport defined by RFC 821. As this format has been widely adopted, a number of limitations have proven increasingly restrictive for the user community.
RFC 822 was intended to specify a format for text messages. As such, non-text messages, such as multimedia messages that might include audio or images, are simply not mentioned. Even in the case of text, however, RFC 822 is inadequate for the needs of mail users whose languages require the use of character sets richer than US-ASCII. Since RFC 822 does not specify mechanisms for mail containing audio, video, Asian language text, or even text in most European languages, additional specifications are needed.
One of the notable limitations of the basic RFC 821/822 mail system is that it limits the contents of electronic mail messages to relatively short lines (e.g., 1000 characters or less [RFC-821]) of 7bit US-ASCII. This forces users to convert any non-textual data that they may wish to send into seven-bit bytes representable as printable US-ASCII characters before invoking a local mail UA (User Agent, a program with which human users send and receive mail). Examples of such encodings currently used in the Internet include pure hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in RFC 1421, the Andrew Toolkit Representation [ATK], and many others.
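As an illustration of the kind of conversion described above, the following minimal sketch uses Python's standard base64 module to turn arbitrary 8-bit data into short lines of printable US-ASCII. The helper name encode_for_7bit_mail and the sample payload are hypothetical and chosen only for this example; it is not a prescribed implementation.

```python
import base64


def encode_for_7bit_mail(data: bytes) -> str:
    """Encode arbitrary bytes as base64 text in short, newline-terminated lines.

    Hypothetical helper for illustration. base64.encodebytes emits lines of
    at most 76 characters, well under the 1000-character limit noted above.
    """
    return base64.encodebytes(data).decode("ascii")


# Arbitrary binary data, including byte values above 127.
payload = bytes(range(256))

encoded = encode_for_7bit_mail(payload)
lines = encoded.splitlines()

# Every character is 7-bit US-ASCII and every line is short.
is_seven_bit = all(ord(ch) < 128 for ch in encoded)
longest_line = max(len(line) for line in lines)

# Decoding recovers the original bytes exactly.
decoded = base64.decodebytes(encoded.encode("ascii"))
```

The encoded form is roughly one third larger than the input, which is the cost of representing three 8-bit bytes as four printable characters.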
The limitations of the RFC 822 mail format become even more apparent as gateways are designed to allow for the exchange of mail messages between RFC 822 hosts and X.400 hosts. X.400 [X400] specifies mechanisms for the inclusion of non-textual material within electronic mail messages. The current standards for mapping X.400 messages to RFC 822 messages specify either that X.400 non-textual material must be converted to (not encoded in) IA5Text format, or that it must be discarded and the RFC 822 user notified that discarding has occurred. This is clearly undesirable, as information that a user may wish to receive is lost. Even though a user agent may not have the capability to process the non-textual material, the user might have some mechanism external to the UA that can extract useful information from the material. Furthermore, this approach does not allow for the fact that the message may eventually be gatewayed back into an X.400 message handling system (i.e., the X.400 message is "tunneled" through Internet mail), where the non-textual information would definitely become useful again.
This document describes several mechanisms that combine to solve most of these problems without introducing any serious incompatibilities with the existing RFC 822 mail world. In particular, it describes:
- A MIME-Version header field, which uses a version number to declare a message to be conformant with MIME and allows mail processing agents to distinguish between such messages and those generated by older or non-conforming software, which is presumed to lack such a field.
- A Content-Type header field, generalized from RFC 1049, which can be used to specify the media type and subtype of data in the body of a message, and to fully specify the native representation (canonical form) of such data.
- A Content-Transfer-Encoding header field, which can be used to specify both the encoding transformation that was applied to the body and the domain of the result. Encoding transformations other than the identity transformation are usually applied to data in order to allow it to pass through mail transport mechanisms which may have data or character set limitations.
- Two additional header fields, Content-ID and Content-Description, which can be used to further describe the data in a body.
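To make the roles of these header fields concrete, the following sketch builds a single-part message with Python's standard email package and reads the fields back. The media type image/gif, the Content-ID value, the description text, and the sample bytes are all assumptions made for this illustration and do not come from this document.

```python
import base64
from email.message import Message

# Placeholder bytes standing in for real image data (assumption).
raw_body = b"GIF89a\x01\x00\x01\x00\x80\x00\x00"

msg = Message()
# Declares the message to be MIME-conformant.
msg["MIME-Version"] = "1.0"
# Media type "image", subtype "gif".
msg["Content-Type"] = "image/gif"
# Names the encoding transformation applied to the body.
msg["Content-Transfer-Encoding"] = "base64"
# Optional fields that further describe the body (hypothetical values).
msg["Content-ID"] = "<part1@example.invalid>"
msg["Content-Description"] = "A small test image"
# The body itself travels as 7-bit base64 text.
msg.set_payload(base64.encodebytes(raw_body).decode("ascii"))

# A receiving agent can recover the type information and the original data.
media_type = msg.get_content_maintype()
subtype = msg.get_content_subtype()
# decode=True reverses the transformation named in Content-Transfer-Encoding.
recovered = msg.get_payload(decode=True)
wire_text = msg.as_string()
```

Here the Content-Type field tells the receiver what the data is, while Content-Transfer-Encoding tells it only how to undo the transformation applied for transport; the two are independent of each other.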
All of the header fields defined in this document are subject to the general syntactic rules for header fields specified in RFC 822. In particular, all of these header fields except Content-Disposition can include RFC 822 comments, which have no semantic content and should be ignored during MIME processing.
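Because RFC 822 comments carry no semantic content, a MIME processor can discard them before interpreting a header field value. The following is a simplified sketch of that step, not a complete RFC 822 parser: the function name strip_comments is hypothetical, and the sketch assumes the header value has already been unfolded onto one line. It handles nested comments, quoted strings, and backslash quoting.

```python
def strip_comments(value: str) -> str:
    """Remove RFC 822 style parenthesized comments from a header field value.

    Simplified illustration only: assumes an unfolded, single-line value.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string (only possible when depth == 0)
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Quoted-pair: keep it outside comments, drop it inside them.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == '"' and depth == 0:
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out).strip()


# A MIME-Version value with a trailing comment, and a nested comment case.
version = strip_comments("1.0 (produced by MetaSend Vx.x)")
content_type = strip_comments("text/plain (a (nested) comment); charset=us-ascii")
quoted = strip_comments('attachment; name="file (1).txt"')
```

Parentheses inside a quoted string are left alone, since they are part of the value rather than a comment.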
Finally, to specify and promote interoperability, RFC 2049 provides a basic applicability statement for a subset of the above mechanisms that defines a minimal level of "conformance" with this document.
HISTORICAL NOTE: Several of the mechanisms described in this document set may seem somewhat strange or even baroque at first reading. It is important to note that compatibility with existing standards AND robustness across existing practice were two of the highest priorities of the working group that developed this document set. In particular, compatibility was always favored over elegance.
Please refer to the current edition of the "Internet Official Protocol Standards" for the standardization state and status of this protocol. RFC 822 and STD 3, RFC 1123 also provide important background for MIME since no conforming MIME implementation can violate them. In addition, several other informative RFC documents are also valuable to MIME implementors, in particular RFC 1344, RFC 1345, and RFC 1524.
Terminology:
- User Agent (UA): A program with which human users send and receive mail
- MIME: Multipurpose Internet Mail Extensions
- canonical form: The native representation of data
- media type: The general type of data
- subtype: A specific format within a media type
- encoding transformation: A conversion applied to data for transmission
Key Concepts:
- RFC 822 limitations: Only supports 7-bit US-ASCII text
- MIME objectives: Support multimedia, multiple character sets, non-textual content
- Compatibility first: MIME design maintains backward compatibility with RFC 822