3.7 Computing the Message Hashes

3.7. Computing the Message Hashes

Both signing and verifying message signatures start with a step of computing two cryptographic hashes over the message. Signers will choose the parameters of the signature as described in "Signer Actions" (Section 5); Verifiers will use the parameters specified in the DKIM-Signature header field being verified. In the following discussion, the names of the tags in the DKIM-Signature header field that either exists (when verifying) or will be created (when signing) are used. Note that canonicalization (Section 3.4) is only used to prepare the email for signing or verifying; it does not affect the transmitted email in any way.

The Signer/Verifier MUST compute two hashes: one over the body of the message and one over the selected header fields of the message.

Signers MUST compute them in the order shown. Verifiers MAY compute them in any order convenient to the Verifier, provided that the result is semantically identical to the semantics that would be the case had they been computed in this order.

In hash step 1, the Signer/Verifier MUST hash the message body, canonicalized using the body canonicalization algorithm specified in the "c=" tag and then truncated to the length specified in the "l=" tag. That hash value is then converted to base64 form and inserted into (Signers) or compared to (Verifiers) the "bh=" tag of the DKIM-Signature header field.

In hash step 2, the Signer/Verifier MUST pass the following to the hash algorithm in the indicated order.

The header fields specified by the "h=" tag, in the order specified in that tag, and canonicalized using the header canonicalization algorithm specified in the "c=" tag. Each header field MUST be terminated with a single CRLF.
The DKIM-Signature header field that exists (verifying) or will be inserted (signing) in the message, with the value of the "b=" tag (including all surrounding whitespace) deleted (i.e., treated as the empty string), canonicalized using the header canonicalization algorithm specified in the "c=" tag, and without a trailing CRLF.

All tags and their values in the DKIM-Signature header field are included in the cryptographic hash with the sole exception of the value portion of the "b=" (signature) tag, which MUST be treated as the null string. All tags MUST be included even if they might not be understood by the Verifier. The header field MUST be presented to the hash algorithm after the body of the message rather than with the rest of the header fields and MUST be canonicalized as specified in the "c=" (canonicalization) tag. The DKIM-Signature header field MUST NOT be included in its own "h=" tag, although other DKIM-Signature header fields MAY be signed (see Section 4).

When calculating the hash on messages that will be transmitted using base64 or quoted-printable encoding, Signers MUST compute the hash after the encoding. Likewise, the Verifier MUST incorporate the values into the hash before decoding the base64 or quoted-printable text. However, the hash MUST be computed before transport-level encodings such as SMTP "dot-stuffing" (the modification of lines beginning with a "." to avoid confusion with the SMTP end-of-message marker, as specified in [RFC5321]).

With the exception of the canonicalization procedure described in Section 3.4, the DKIM signing process treats the body of messages as simply a string of octets. DKIM messages MAY be either in plain-text or in MIME format; no special treatment is afforded to MIME content. Message attachments in MIME format MUST be included in the content that is signed.

More formally, pseudo-code for the signature algorithm is:

body-hash    =  hash-alg (canon-body, l-param)
data-hash    =  hash-alg (h-headers, D-SIG, body-hash)
signature    =  sig-alg (d-domain, selector, data-hash)

where:

body-hash: is the output from hashing the body, using hash-alg.
hash-alg: is the hashing algorithm specified in the "a" parameter.
canon-body: is a canonicalized representation of the body, produced using the body algorithm specified in the "c" parameter, as defined in Section 3.4 and excluding the DKIM-Signature field.
l-param: is the length-of-body value of the "l" parameter.
data-hash: is the output from using the hash-alg algorithm, to hash the header including the DKIM-Signature header, and the body hash.
h-headers: is the list of headers to be signed, as specified in the "h" parameter.
D-SIG: is the canonicalized DKIM-Signature field itself without the signature value portion of the parameter, that is, an empty parameter value.
signature: is the signature value produced by the signing algorithm.
sig-alg: is the signature algorithm specified by the "a" parameter.
d-domain: is the domain name specified in the "d" parameter.
selector: is the selector value specified in the "s" parameter.

NOTE: Many digital signature APIs provide both hashing and application of the RSA private key using a single "sign()" primitive. When using such an API, the last two steps in the algorithm would probably be combined into a single call that would perform both the "a-hash-alg" and the "sig-alg".

3.7. Computing the Message Hashes​

3.7. Computing the Message Hashes