Skip to main content

3. Presentation Language

This document deals with the formatting of data in an external representation. The following very basic and somewhat casually defined presentation syntax will be used.

3.1 Basic Block Size

The representation of all data items is explicitly specified. The basic data block size is one byte (i.e., 8 bits). Multiple-byte data items are concatenations of bytes, from left to right, from top to bottom. From the byte stream, a multi-byte item (a number in the following example) is formed (using C notation) as:

value = (byte[0] << 8*(n-1)) | (byte[1] << 8*(n-2)) |
... | byte[n-1];

This byte ordering for multi-byte values is the commonplace network byte order or big-endian format.

3.2 Miscellaneous

Comments begin with /* and end with */.

Optional components are denoted by enclosing them in [[ ]] (double brackets).

Single-byte entities containing uninterpreted data are of type opaque.

A type alias T' for an existing type T is defined by:

T T';

3.3 Numbers

The basic numeric data type is an unsigned byte (uint8). All larger numeric data types are formed from fixed-length series of bytes concatenated as described in Section 3.1 and are also unsigned. The following numeric types are predefined:

uint8 uint16[2];
uint8 uint24[3];
uint8 uint32[4];
uint8 uint64[8];

All values, here and elsewhere in the specification, are transmitted in network byte (big-endian) order; the uint32 represented by the hex bytes 01 02 03 04 is equivalent to the decimal value 16909060.

3.4 Vectors

A vector (single-dimensioned array) is a stream of homogeneous data elements. The size of the vector may be specified at documentation time or left unspecified until runtime. In either case, the length declares the number of bytes, not the number of elements, in the vector. The syntax for specifying a new type, T', that is a fixed-length vector of type T is:

T T'[n];

Here, T' occupies n bytes in the data stream, where n is a multiple of the size of T. The length of the vector is not included in the encoded stream.

In the following example, Datum is defined to be three consecutive bytes that the protocol does not interpret, while Data is three consecutive Datum, consuming a total of nine bytes.

opaque Datum[3];      /* three uninterpreted bytes */
Datum Data[9]; /* three consecutive 3-byte vectors */

Variable-length vectors are defined by specifying a subrange of legal lengths, inclusively, using the notation <floor..ceiling>. When these are encoded, the actual length precedes the vector's contents in the byte stream. The length will be in the form of a number consuming as many bytes as required to hold the vector's specified maximum (ceiling) length. A variable-length vector with an actual length field of zero is referred to as an empty vector.

T T'<floor..ceiling>;

In the following example, mandatory is a vector that must contain between 300 and 400 bytes of type opaque. It can never be empty. The actual length field consumes two bytes, a uint16, which is sufficient to represent the value 400 (see Section 3.3). Similarly, longer can represent up to 800 bytes of data, or 400 uint16 elements, and it may be empty. Its encoding will include a two-byte actual length field prepended to the vector. The length of an encoded vector must be an even multiple of the length of a single element (for example, a 17-byte vector of uint16 would be illegal).

opaque mandatory<300..400>;
/* length field is two bytes, cannot be empty */
uint16 longer<0..800>;
/* zero to 400 16-bit unsigned integers */

3.5 Enumerateds

An additional sparse data type, called enum, is available. Each definition is a different type. Only enumerateds of the same type may be assigned or compared. Every element of an enumerated must be assigned a value, as demonstrated in the following example. Since the elements of the enumerated are not ordered, they can be assigned any unique value, in any order.

enum { e1(v1), e2(v2), ... , en(vn) [[, (n)]] } Te;

Future extensions or additions to the protocol may define new values. Implementations need to be able to parse and ignore unknown values unless the definition of the field states otherwise.

An enumerated occupies as much space in the byte stream as would its maximal defined ordinal value. The following definition would cause one byte to be used to carry fields of type Color:

enum { red(3), blue(5), white(7) } Color;

One may optionally specify a value without its associated tag to force the width definition without defining a superfluous element.

In the following example, Taste will consume two bytes in the data stream but can only assume the values 1, 2, or 4 in the current version of the protocol.

enum { sweet(1), sour(2), bitter(4), (32000) } Taste;

The names of the elements of an enumeration are scoped within the defined type. In the first example, a fully qualified reference to the second element of the enumeration would be Color.blue. Such qualification is not required if the target of the assignment is well specified.

Color color = Color.blue;     /* overspecified, legal */
Color color = blue; /* correct, type implicit */

The names of the elements are not required to be unique. The numerical value among the elements must be unique, though.

enum { low(1), medium(2), high(2) } Priority;  /* WRONG */

For enumerateds that are never converted to external representation, the numerical information may be omitted.

enum { low, medium, high } Priority;

3.6 Constructed Types

Structure types may be constructed from primitive types for convenience. Each specification declares a new, unique type. The syntax for definition is much like that of C:

struct {
T1 f1;
T2 f2;
...
Tn fn;
} T;

Fixed- and variable-length vector fields are allowed using the standard vector syntax. Structures V1 and V2 in the variants example (Section 3.8) demonstrate this.

The fields within a structure may be qualified using the type's name, with a syntax much like that available for enumerateds. For example, T.f2 refers to the second field of the previous declaration.

3.7 Constants

Fields and variables may be assigned a fixed value using =, as in:

struct {
T1 f1 = 8; /* T.f1 must always be 8 */
T2 f2;
} T;

3.8 Variants

Defined structures may have variants based on some knowledge that is available within the environment. The selector must be an enumerated type that defines the possible variants the structure defines. Each arm of the variant structure specifies the type of that variant's field and an optional field label. The mechanism by which the variant is selected at runtime is not prescribed by the presentation language.

struct {
T1 f1;
T2 f2;
....
Tn fn;
select (E) {
case e1: Te1 [[fe1]];
case e2: Te2 [[fe2]];
....
case en: Ten [[fen]];
};
} Tv;

For example:

enum { apple, orange, banana } VariantTag;

struct {
uint16 number;
opaque string<0..10>; /* variable length */
} V1;

struct {
uint32 number;
opaque string[10]; /* fixed length */
} V2;

struct {
VariantTag type;
select (VariantRecord.type) {
case apple: V1;
case orange:
case banana: V2;
};
} VariantRecord;