6. Syslog Message Format

The syslog message has the following ABNF [RFC5234] definition:

SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]

HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                  SP APP-NAME SP PROCID SP MSGID
PRI             = "<" PRIVAL ">"
PRIVAL          = 1*3DIGIT ; range 0 .. 191
VERSION         = NONZERO-DIGIT 0*2DIGIT
HOSTNAME        = NILVALUE / 1*255PRINTUSASCII
APP-NAME        = NILVALUE / 1*48PRINTUSASCII
PROCID          = NILVALUE / 1*128PRINTUSASCII
MSGID           = NILVALUE / 1*32PRINTUSASCII

TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
DATE-FULLYEAR   = 4DIGIT
DATE-MONTH      = 2DIGIT  ; 01-12
DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                          ; month/year
FULL-TIME       = PARTIAL-TIME TIME-OFFSET
PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                  [TIME-SECFRAC]
TIME-HOUR       = 2DIGIT  ; 00-23
TIME-MINUTE     = 2DIGIT  ; 00-59
TIME-SECOND     = 2DIGIT  ; 00-59
TIME-SECFRAC    = "." 1*6DIGIT
TIME-OFFSET     = "Z" / TIME-NUMOFFSET
TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE

STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID           = SD-NAME
PARAM-NAME      = SD-NAME
PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                               ; ']' MUST be escaped.
SD-NAME         = 1*32PRINTUSASCII
                  ; except '=', SP, ']', %d34 (")

MSG             = MSG-ANY / MSG-UTF8
MSG-ANY         = *OCTET ; not starting with BOM
MSG-UTF8        = BOM UTF-8-STRING
BOM             = %xEF.BB.BF

UTF-8-STRING    = *OCTET ; UTF-8 string as specified
                         ; in RFC 3629

OCTET           = %d00-255
SP              = %d32
PRINTUSASCII    = %d33-126
NONZERO-DIGIT   = %d49-57
DIGIT           = %d48 / NONZERO-DIGIT
NILVALUE        = "-"

6.1. Message Length

Syslog message size limits are dictated by the syslog transport mapping in use. There is no upper limit per se. Each transport mapping defines the minimum maximum required message length support, and the minimum maximum MUST be at least 480 octets in length.

Any transport receiver MUST be able to accept messages of up to and including 480 octets in length. All transport receiver implementations SHOULD be able to accept messages of up to and including 2048 octets in length. Transport receivers MAY receive messages larger than 2048 octets in length. If a transport receiver receives a message with a length larger than it supports, the transport receiver SHOULD truncate the payload. Alternatively, it MAY discard the message.

If a transport receiver truncates messages, the truncation MUST occur at the end of the message. After truncation, the message MAY contain invalid UTF-8 encoding or invalid STRUCTURED-DATA. The transport receiver MAY discard the message or MAY try to process as much as possible in this case.
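The truncation rule can be sketched in a few lines of Python. This is only an illustration: the function name `truncate_message` and the 2048-octet default are assumptions chosen for the example, not part of this specification.

```python
def truncate_message(raw: bytes, max_len: int = 2048) -> bytes:
    """Truncate a received syslog message at its end.

    max_len is the largest message this hypothetical receiver supports.
    Receivers must accept at least 480 octets, so smaller limits are rejected.
    """
    if max_len < 480:
        raise ValueError("a receiver must accept messages of at least 480 octets")
    # Cutting at the end may split a multi-byte UTF-8 sequence or leave an
    # SD-ELEMENT unterminated; the caller decides whether to discard the
    # result or process as much of it as possible.
    return raw[:max_len]
```

Because the cut happens at an arbitrary octet boundary, code that later decodes the result as UTF-8 should be prepared for a decoding error on the final character.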

6.2. HEADER

The character set used in the HEADER MUST be seven-bit ASCII in an eight-bit field as described in [RFC5234]. These are the ASCII codes as defined in "USA Standard Code for Information Interchange" [ANSI.X3-4.1968].

The header format is designed to provide some interoperability with older BSD-based syslog. For details on this, see Appendix A.1.

6.2.1. PRI

The PRI part MUST have three, four, or five characters and will be bound with angle brackets as the first and last characters. The PRI part starts with a leading "<" ('less-than' character, %d60), followed by a number, which is followed by a ">" ('greater-than' character, %d62). The number contained within these angle brackets is known as the Priority value (PRIVAL) and represents both the Facility and Severity. The Priority value consists of one, two, or three decimal digits (ABNF DIGIT) using values of %d48 (for "0") through %d57 (for "9").

Facility and Severity values are not normative but often used. They are described in the following tables for purely informational purposes. Facility values MUST be in the range of 0 to 23 inclusive.

Numerical Code	Facility
0	kernel messages
1	user-level messages
2	mail system
3	system daemons
4	security/authorization messages
5	messages generated internally by syslogd
6	line printer subsystem
7	network news subsystem
8	UUCP subsystem
9	clock daemon
10	security/authorization messages
11	FTP daemon
12	NTP subsystem
13	log audit
14	log alert
15	clock daemon (note 2)
16	local use 0 (local0)
17	local use 1 (local1)
18	local use 2 (local2)
19	local use 3 (local3)
20	local use 4 (local4)
21	local use 5 (local5)
22	local use 6 (local6)
23	local use 7 (local7)

Table 1. Syslog Message Facilities

Each message Priority also has a decimal Severity level indicator. These are described in the following table along with their numerical values. Severity values MUST be in the range of 0 to 7 inclusive.

Numerical Code	Severity
0	Emergency: system is unusable
1	Alert: action must be taken immediately
2	Critical: critical conditions
3	Error: error conditions
4	Warning: warning conditions
5	Notice: normal but significant condition
6	Informational: informational messages
7	Debug: debug-level messages

Table 2. Syslog Message Severities

The Priority value is calculated by first multiplying the Facility number by 8 and then adding the numerical value of the Severity. For example, a kernel message (Facility=0) with a Severity of Emergency (Severity=0) would have a Priority value of 0. Also, a "local use 4" message (Facility=20) with a Severity of Notice (Severity=5) would have a Priority value of 165. In the PRI of a syslog message, these values would be placed between the angle brackets as <0> and <165> respectively. The only time a value of "0" follows the "<" is for the Priority value of "0". Otherwise, leading "0"s MUST NOT be used.
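The calculation above, and its inverse, can be sketched as follows. The function names `encode_pri` and `decode_prival` are hypothetical and chosen only for this illustration.

```python
def encode_pri(facility: int, severity: int) -> str:
    """Build the PRI part from a Facility (0..23) and a Severity (0..7)."""
    if not 0 <= facility <= 23:
        raise ValueError("Facility must be in the range 0..23")
    if not 0 <= severity <= 7:
        raise ValueError("Severity must be in the range 0..7")
    # Formatting an int never produces leading zeros, so the only PRI with
    # a "0" directly after "<" is "<0>".
    return f"<{facility * 8 + severity}>"


def decode_prival(prival: int) -> tuple[int, int]:
    """Split a Priority value (0..191) into (facility, severity)."""
    if not 0 <= prival <= 191:
        raise ValueError("PRIVAL must be in the range 0..191")
    return divmod(prival, 8)
```

With these helpers, `encode_pri(20, 5)` yields `<165>` and `decode_prival(165)` yields `(20, 5)`, matching the "local use 4" / Notice example.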

6.2.2. VERSION

The VERSION field denotes the version of the syslog protocol specification. The version number MUST be incremented for any new syslog protocol specification that changes any part of the HEADER format. Changes include the addition or removal of fields, or a change of syntax or semantics of existing fields. This document uses a VERSION value of "1". The VERSION values are IANA-assigned (Section 9.1) via the Standards Action method as described in [RFC5226].

6.2.3. TIMESTAMP

The TIMESTAMP field is a formalized timestamp derived from [RFC3339]. Whereas [RFC3339] makes allowances for multiple syntaxes, this document imposes further restrictions. The TIMESTAMP value MUST follow these restrictions:

The "T" and "Z" characters in this syntax MUST be upper case.
Usage of the "T" character is REQUIRED.
Leap seconds MUST NOT be used.

The originator SHOULD include TIME-SECFRAC if its clock accuracy and performance permit. The "timeQuality" SD-ID described in Section 7.1 allows the originator to specify the accuracy and trustworthiness of the timestamp.

A syslog application MUST use the NILVALUE as TIMESTAMP if the syslog application is incapable of obtaining system time.

6.2.3.1. Examples

Example 1

1985-04-12T23:20:50.52Z

This represents 20 minutes and 50.52 seconds after the 23rd hour of 12 April 1985 in UTC.

Example 2

1985-04-12T19:20:50.52-04:00

This represents the same time as in example 1, but expressed in US Eastern Standard Time (observing daylight savings time).

Example 3

2003-10-11T22:14:15.003Z

This represents 11 October 2003 at 10:14:15pm, 3 milliseconds into the next second. The timestamp is in UTC. The timestamp provides millisecond resolution. The creator may have actually had a better resolution, but providing just three digits for the fractional part of a second does not tell us.

Example 4

2003-08-24T05:14:15.000003-07:00

This represents 24 August 2003 at 05:14:15am, 3 microseconds into the next second. The microsecond resolution is indicated by the additional digits in TIME-SECFRAC. The timestamp indicates that its local time is -7 hours from UTC. This timestamp might be created in the US Pacific time zone during daylight savings time.

Example 5 - An Invalid TIMESTAMP

2003-08-24T05:14:15.000000003-07:00

This example is nearly the same as Example 4, but it is specifying TIME-SECFRAC in nanoseconds. This results in TIME-SECFRAC being longer than the allowed 6 digits, which invalidates it.
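A validator for the TIMESTAMP rules might look like the sketch below. It is an assumption-laden illustration rather than a reference implementation: the name `is_valid_timestamp` is hypothetical, and the calendar check is delegated to Python's `datetime`, which also rejects the year 0000 even though the ABNF permits it.

```python
import re
from datetime import datetime

# FULL-DATE "T" FULL-TIME with an upper-case "T", at most six fractional
# digits, and either an upper-case "Z" or a numeric offset.
_TIMESTAMP = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})"
    r"T([0-9]{2}):([0-9]{2}):([0-9]{2})"
    r"(\.[0-9]{1,6})?"
    r"(Z|[+-][0-9]{2}:[0-9]{2})"
)


def is_valid_timestamp(ts: str) -> bool:
    """Return True if ts is the NILVALUE or a well-formed TIMESTAMP."""
    if ts == "-":
        return True
    m = _TIMESTAMP.fullmatch(ts)
    if m is None:
        return False
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    try:
        # Rejects impossible dates and second == 60 (leap seconds).
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    offset = m.group(8)
    if offset != "Z":
        # TIME-NUMOFFSET reuses TIME-HOUR (00-23) and TIME-MINUTE (00-59).
        if int(offset[1:3]) > 23 or int(offset[4:6]) > 59:
            return False
    return True
```

Applied to the examples above, this sketch accepts Examples 1 through 4 and rejects Example 5 because its TIME-SECFRAC has nine digits.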

6.2.4. HOSTNAME

The HOSTNAME field identifies the machine that originally sent the syslog message.

The HOSTNAME field SHOULD contain the hostname and the domain name of the originator in the format specified in STD 13 [RFC1034]. This format is called a Fully Qualified Domain Name (FQDN) in this document.

In practice, not all syslog applications are able to provide an FQDN. As such, other values MAY also be present in HOSTNAME. This document makes provisions for using other values in such situations. A syslog application SHOULD provide the most specific available value first. The order of preference for the contents of the HOSTNAME field is as follows:

1. FQDN
2. Static IP address
3. hostname
4. Dynamic IP address
5. the NILVALUE

If an IPv4 address is used, it MUST be in the format of the dotted decimal notation as used in STD 13 [RFC1035]. If an IPv6 address is used, a valid textual representation as described in [RFC4291], Section 2.2, MUST be used.

Syslog applications SHOULD consistently use the same value in the HOSTNAME field for as long as possible.

The NILVALUE SHOULD only be used when the syslog application has no way to obtain its real hostname. This situation is considered highly unlikely.
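The order of preference for HOSTNAME can be expressed as a simple fallback chain. In this sketch the function name `choose_hostname` and its keyword arguments are hypothetical; how an application discovers each candidate value is outside the scope of the example.

```python
from typing import Optional


def choose_hostname(
    fqdn: Optional[str] = None,
    static_ip: Optional[str] = None,
    hostname: Optional[str] = None,
    dynamic_ip: Optional[str] = None,
) -> str:
    """Return the most specific available HOSTNAME value.

    Candidates are tried in the order of preference given above; the
    NILVALUE is used only when nothing else is known.
    """
    for candidate in (fqdn, static_ip, hostname, dynamic_ip):
        if candidate:
            return candidate
    return "-"
```

An application that caches the result of such a function for its lifetime also satisfies the recommendation to use the same HOSTNAME value consistently.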

6.2.5. APP-NAME

The APP-NAME field SHOULD identify the device or application that originated the message. It is a string without further semantics. It is intended for filtering messages on a relay or collector.

The NILVALUE MAY be used when the syslog application has no idea of its APP-NAME or cannot provide that information. It may be that a device is unable to provide that information either because of a local policy decision, or because the information is not available, or not applicable, on the device.

This field MAY be operator-assigned.

6.2.6. PROCID

PROCID is a value that is included in the message, having no interoperable meaning, except that a change in the value indicates there has been a discontinuity in syslog reporting. The field does not have any specific syntax or semantics; the value is implementation-dependent and/or operator-assigned. The NILVALUE MAY be used when no value is provided.

The PROCID field is often used to provide the process name or process ID associated with a syslog system. The NILVALUE might be used when a process ID is not available. On an embedded system without any operating system process ID, PROCID might be a reboot ID.

PROCID can enable log analyzers to detect discontinuities in syslog reporting by detecting a change in the syslog process ID. However, PROCID is not a reliable identification of a restarted process since the restarted syslog process might be assigned the same process ID as the previous syslog process.

PROCID can also be used to identify which messages belong to a group of messages. For example, an SMTP mail transfer agent might put its SMTP transaction ID into PROCID, which would allow the collector or relay to group messages based on the SMTP transaction.

6.2.7. MSGID

The MSGID SHOULD identify the type of message. For example, a firewall might use the MSGID "TCPIN" for incoming TCP traffic and the MSGID "TCPOUT" for outgoing TCP traffic. Messages with the same MSGID should reflect events of the same semantics. The MSGID itself is a string without further semantics. It is intended for filtering messages on a relay or collector.

The NILVALUE SHOULD be used when the syslog application does not, or cannot, provide any value.

This field MAY be operator-assigned.
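The ABNF gives HOSTNAME, APP-NAME, PROCID, and MSGID the same shape: the NILVALUE or a bounded run of PRINTUSASCII characters. A syntax check for these four fields could be sketched as below; the names `HEADER_LIMITS` and `is_valid_header_field` are assumptions made for the example.

```python
# Maximum lengths taken from the ABNF in Section 6.
HEADER_LIMITS = {
    "HOSTNAME": 255,
    "APP-NAME": 48,
    "PROCID": 128,
    "MSGID": 32,
}


def is_valid_header_field(name: str, value: str) -> bool:
    """Check one HEADER field against NILVALUE / 1*N PRINTUSASCII."""
    limit = HEADER_LIMITS[name]
    if value == "-":
        # The NILVALUE; handled explicitly for readability, although "-"
        # is itself a single PRINTUSASCII character.
        return True
    if not 1 <= len(value) <= limit:
        return False
    # PRINTUSASCII is %d33-126, which excludes SP and control characters.
    return all(33 <= ord(ch) <= 126 for ch in value)
```

This checks syntax only. Whether a HOSTNAME is a real FQDN, or whether a MSGID is meaningful, is left to the application.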

6.3. STRUCTURED-DATA

STRUCTURED-DATA provides a mechanism to express information in a well defined, easily parseable and interpretable data format. There are multiple usage scenarios. For example, it may express meta-information about the syslog message or application-specific information such as traffic counters or IP addresses.

STRUCTURED-DATA can contain zero, one, or multiple structured data elements, which are referred to as "SD-ELEMENT" in this document.

In case of zero structured data elements, the STRUCTURED-DATA field MUST contain the NILVALUE.

The character set used in STRUCTURED-DATA MUST be seven-bit ASCII in an eight-bit field as described in [RFC5234]. These are the ASCII codes as defined in "USA Standard Code for Information Interchange" [ANSI.X3-4.1968]. An exception is the PARAM-VALUE field (see Section 6.3.3), in which UTF-8 encoding MUST be used.

A collector MAY ignore malformed STRUCTURED-DATA elements. A relay MUST forward malformed STRUCTURED-DATA without any alteration.

6.3.1. SD-ELEMENT

An SD-ELEMENT consists of a name and parameter name-value pairs. The name is referred to as SD-ID. The name-value pairs are referred to as "SD-PARAM".

6.3.2. SD-ID

SD-IDs are case-sensitive and uniquely identify the type and purpose of the SD-ELEMENT. The same SD-ID MUST NOT exist more than once in a message.

There are two formats for SD-ID names:

Names that do not contain an at-sign ("@", ABNF %d64) are reserved to be assigned by IETF Review as described in BCP26 [RFC5226]. Currently, these are the names defined in Section 7. Names of this format are only valid if they are first registered with the IANA. Registered names MUST NOT contain an at-sign ('@', ABNF %d64), an equal-sign ('=', ABNF %d61), a closing brace (']', ABNF %d93), a quote-character ('"', ABNF %d34), whitespace, or control characters (ASCII code 127 and codes 32 or less).
Anyone can define additional SD-IDs using names in the format name@<private enterprise number>, e.g., "ourSDID@32473". The format of the part preceding the at-sign is not specified; however, these names MUST be printable US-ASCII strings, and MUST NOT contain an at-sign ('@', ABNF %d64), an equal-sign ('=', ABNF %d61), a closing brace (']', ABNF %d93), a quote-character ('"', ABNF %d34), whitespace, or control characters. The part following the at-sign MUST be a private enterprise number as specified in Section 7.2.2. Please note that throughout this document the value of 32473 is used for all private enterprise numbers. This value has been reserved by IANA to be used as an example number in documentation. Implementors will need to use their own private enterprise number for the enterpriseId parameter, and when creating locally extensible SD-ID names.
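The two SD-ID formats can be told apart with a small classifier. The sketch below makes several assumptions that should be checked before reuse: the function name `classify_sd_id` is hypothetical, the default set of registered names lists only the SD-IDs defined in Section 7 of RFC 5424 (a real implementation would consult the IANA registry), and the private enterprise number is assumed to be digits optionally followed by dot-separated sub-identifiers.

```python
import re

# Assumption: digits with optional dotted sub-identifiers, e.g. "32473.1".
_ENTERPRISE_NUMBER = re.compile(r"[0-9]+(\.[0-9]+)*")


def classify_sd_id(
    sd_id: str,
    registered: frozenset = frozenset({"timeQuality", "origin", "meta"}),
) -> str:
    """Return "registered", "private", or "invalid" for an SD-ID."""
    # SD-NAME is 1*32PRINTUSASCII, which already excludes SP and controls.
    if not 1 <= len(sd_id) <= 32:
        return "invalid"
    if any(not 33 <= ord(ch) <= 126 for ch in sd_id):
        return "invalid"
    name, at_sign, number = sd_id.partition("@")
    if any(ch in '=]"' for ch in name):
        return "invalid"
    if not at_sign:
        # Names without "@" are valid only if they have been registered.
        return "registered" if name in registered else "invalid"
    if not name or _ENTERPRISE_NUMBER.fullmatch(number) is None:
        return "invalid"
    return "private"
```

For instance, "exampleSDID@32473" is classified as private, while an unregistered name without an at-sign is reported as invalid.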

6.3.3. SD-PARAM

Each SD-PARAM consists of a name, referred to as PARAM-NAME, and a value, referred to as PARAM-VALUE.

PARAM-NAME is case-sensitive. IANA controls all PARAM-NAMEs, with the exception of those in SD-IDs whose names contain an at-sign. The PARAM-NAME scope is within a specific SD-ID. Thus, equally named PARAM-NAME values contained in two different SD-IDs are not the same.

To support international characters, the PARAM-VALUE field MUST be encoded using UTF-8. A syslog application MAY issue any valid UTF-8 sequence. A syslog application MUST accept any valid UTF-8 sequence in the "shortest form". It MUST NOT fail if control characters are present in PARAM-VALUE. The syslog application MAY modify messages containing control characters (e.g., by changing an octet with value 0 (USASCII NUL) to the four characters "#000"). For the reasons outlined in UNICODE TR36 [UNICODE-TR36], section 3.1, an originator MUST encode messages in the "shortest form" and a collector or relay MUST NOT interpret messages in the "non-shortest form".

Inside PARAM-VALUE, the characters '"' (ABNF %d34), '\' (ABNF %d92), and ']' (ABNF %d93) MUST be escaped. This is necessary to avoid parsing errors. Escaping ']' would not strictly be necessary but is REQUIRED by this specification to avoid syslog application implementation errors. Each of these three characters MUST be escaped as '\"', '\\', and '\]' respectively. The backslash is used for control character escaping for consistency with its use for escaping in other parts of the syslog message as well as in traditional syslog.

A backslash ('\') followed by none of the three described characters is considered an invalid escape sequence. In this case, the backslash MUST be treated as a regular backslash and the following character as a regular character. Thus, the invalid sequence MUST NOT be altered.

An SD-PARAM MAY be repeated multiple times inside an SD-ELEMENT.
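The escaping rules for PARAM-VALUE, including the treatment of invalid escape sequences, can be sketched as a pair of functions. The names `escape_param_value` and `unescape_param_value` are hypothetical.

```python
_ESCAPED = '"\\]'  # the three characters that must be escaped


def escape_param_value(value: str) -> str:
    """Prefix each '"', '\\' and ']' in value with a backslash."""
    out = []
    for ch in value:
        if ch in _ESCAPED:
            out.append("\\")
        out.append(ch)
    return "".join(out)


def unescape_param_value(escaped: str) -> str:
    """Reverse escape_param_value.

    A backslash that is not followed by one of the three special
    characters is an invalid escape sequence; it is kept unchanged,
    together with the character that follows it.
    """
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\" and i + 1 < len(escaped) and escaped[i + 1] in _ESCAPED:
            out.append(escaped[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Escaping followed by unescaping returns the original value, and a sequence such as a backslash followed by "n" passes through both directions of the decoder untouched.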

6.3.4. Change Control

Once SD-IDs and PARAM-NAMEs are defined, syntax and semantics of these objects MUST NOT be altered. Should a change to an existing SD-ID or PARAM-NAME be desired, the already-defined SD-ID and PARAM-NAME MUST be left unchanged and a new one MUST be created.

6.3.5. Examples

Example 1 - Valid STRUCTURED-DATA

[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]

In this example, "exampleSDID@32473" is the SD-ID; "iut", "eventSource", and "eventID" are SD-PARAM names.

Example 2 - Two Structured Data Elements

[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]
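A hand-written parser for the STRUCTURED-DATA field might look like the following sketch. The function name `parse_structured_data` is hypothetical, and the parser is deliberately lenient: it does not validate the characters of SD-ID or PARAM-NAME, and it tolerates an unescaped ']' inside a quoted value.

```python
def parse_structured_data(sd: str):
    """Return a list of (SD-ID, [(PARAM-NAME, PARAM-VALUE), ...]) tuples."""
    if sd == "-":
        return []  # NILVALUE: zero structured data elements
    elements = []
    seen = set()
    i = 0
    while i < len(sd):
        if sd[i] != "[":
            raise ValueError(f"expected '[' at offset {i}")
        i += 1
        start = i
        while i < len(sd) and sd[i] not in " ]":
            i += 1
        sd_id = sd[start:i]
        if not sd_id:
            raise ValueError("empty SD-ID")
        if sd_id in seen:
            raise ValueError(f"SD-ID {sd_id!r} occurs more than once")
        seen.add(sd_id)
        params = []
        while i < len(sd) and sd[i] == " ":
            i += 1
            start = i
            while i < len(sd) and sd[i] != "=":
                i += 1
            name = sd[start:i]
            if not name or sd[i:i + 2] != '="':
                raise ValueError(f"malformed SD-PARAM at offset {start}")
            i += 2
            value = []
            while i < len(sd) and sd[i] != '"':
                # A backslash before '"', '\' or ']' is an escape; any
                # other backslash is kept as a regular character.
                if sd[i] == "\\" and sd[i + 1:i + 2] in ('"', "\\", "]"):
                    i += 1
                value.append(sd[i])
                i += 1
            if i >= len(sd):
                raise ValueError("unterminated PARAM-VALUE")
            i += 1  # skip the closing quote
            params.append((name, "".join(value)))
        if i >= len(sd) or sd[i] != "]":
            raise ValueError("expected ']' to close SD-ELEMENT")
        i += 1
        elements.append((sd_id, params))
    if not elements:
        raise ValueError("STRUCTURED-DATA must be '-' or at least one SD-ELEMENT")
    return elements
```

Parameters are returned as a list of pairs rather than a dictionary because an SD-PARAM may be repeated inside one SD-ELEMENT.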

6.4. MSG

The MSG part contains a free-form message that provides information about the event. The character set used in MSG SHOULD be UNICODE, encoded using UTF-8 as specified in [RFC3629]. If the syslog application cannot encode the MSG in Unicode, it MAY use any other encoding.

Syslog applications SHOULD avoid the use of control characters (USASCII codes 127 and codes 32 or less) in MSG. If a syslog application uses control characters, it might not be possible to interpret the message on the collector or relay.

The encoding of the MSG part is determined by its leading octets. If the MSG starts with the three octets %xEF.BB.BF, this is the Unicode byte order mark (BOM) for UTF-8. In this case, the MSG MUST be encoded in UTF-8 and MUST NOT contain a BOM anywhere else. If the MSG does not start with the BOM, the encoding is not specified.
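BOM detection on the receiving side can be sketched as follows. The function name `decode_msg` is hypothetical, and returning the raw octets when no BOM is present is one possible policy among several, chosen here only to keep the example short.

```python
import codecs
from typing import Union


def decode_msg(msg: bytes) -> Union[str, bytes]:
    """Decode the MSG part if it is marked as UTF-8 by a leading BOM.

    Returns the decoded text (without the BOM) for MSG-UTF8, or the
    unchanged octets for MSG-ANY, whose encoding is not specified.
    """
    bom = codecs.BOM_UTF8  # b"\xef\xbb\xbf"
    if not msg.startswith(bom):
        return msg
    body = msg[len(bom):]
    if bom in body:
        raise ValueError("BOM found after the start of MSG")
    # Raises UnicodeDecodeError if the body is not valid UTF-8, which can
    # happen after truncation.
    return body.decode("utf-8")
```

A collector that must always produce text could fall back to a lenient decoding of the raw octets in the no-BOM case, but that choice is local policy.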

Traditional syslog applications only used printable USASCII characters in MSG. This specification permits the use of control characters and other code values in MSG. However, since this might lead to problems with display and general message processing, it is RECOMMENDED that syslog applications use the same encoding strategy as traditional syslog applications.

6.5. Examples

Example 1 - with BOM, multi-byte UTF-8, and STRUCTURED-DATA

<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] BOMAn application event log entry...

In this example, the multi-byte UTF-8 characters are present only in the MSG, and the "BOM" here marks the beginning of the MSG part. The encoding of this syslog message is such that it will be readable if the MSG part is interpreted as ISO 8859-1. However, that would not be appropriate in this case, because this message starts with a BOM. Given this, the receiver SHOULD interpret the MSG as UTF-8 encoded text.

Example 2 - no BOM, no STRUCTURED-DATA

<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - %% It's time to make the do-nuts.

This example shows a message with no STRUCTURED-DATA. The "-" NILVALUE is used in both the MSGID and STRUCTURED-DATA fields. The MSG itself is free-form text.

Example 3 - with STRUCTURED-DATA

<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"][examplePriority@32473 class="high"]

This example shows a message with two SD-ELEMENTs and no MSG part.
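Putting the grammar together, a message can be split into its HEADER fields, STRUCTURED-DATA, and MSG with a sketch like the one below. It assumes the message has already been converted to a Python string, does not enforce field length limits or TIMESTAMP syntax, and uses hypothetical names (`SyslogMessage`, `parse_syslog`).

```python
import re
from typing import NamedTuple, Optional

# PRI, VERSION and the five SP-separated HEADER fields, each followed by SP.
_HEADER = re.compile(
    r"<(0|[1-9][0-9]{0,2})>([1-9][0-9]{0,2})"
    r" ([!-~]+) ([!-~]+) ([!-~]+) ([!-~]+) ([!-~]+) "
)


class SyslogMessage(NamedTuple):
    facility: int
    severity: int
    version: int
    timestamp: str
    hostname: str
    app_name: str
    procid: str
    msgid: str
    structured_data: str
    msg: Optional[str]


def _sd_end(text: str, start: int) -> int:
    """Return the index just past the STRUCTURED-DATA that begins at start."""
    if text.startswith("-", start):
        return start + 1
    i = start
    while text.startswith("[", i):
        in_quotes = False
        i += 1
        while i < len(text):
            ch = text[i]
            if in_quotes and ch == "\\":
                i += 2  # skip the backslash and the character after it
                continue
            if ch == '"':
                in_quotes = not in_quotes
            elif ch == "]" and not in_quotes:
                break
            i += 1
        if i >= len(text):
            raise ValueError("unterminated SD-ELEMENT")
        i += 1  # consume the closing "]"
    if i == start:
        raise ValueError("missing STRUCTURED-DATA")
    return i


def parse_syslog(line: str) -> SyslogMessage:
    m = _HEADER.match(line)
    if m is None:
        raise ValueError("malformed HEADER")
    prival = int(m.group(1))
    if prival > 191:
        raise ValueError("PRIVAL out of range")
    end = _sd_end(line, m.end())
    rest = line[end:]
    if rest == "":
        msg = None  # the optional [SP MSG] part is absent
    elif rest.startswith(" "):
        msg = rest[1:]
    else:
        raise ValueError("expected SP before MSG")
    facility, severity = divmod(prival, 8)
    return SyslogMessage(
        facility, severity, int(m.group(2)),
        *m.group(3, 4, 5, 6, 7),
        line[m.end():end], msg,
    )
```

Applied to Example 2 above, this yields Facility 20, Severity 5, a NILVALUE for both MSGID and STRUCTURED-DATA, and the free-form text as MSG. For Example 3 the MSG is reported as absent.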
