7. Resolver Implementation

The top levels of the recommended resolver algorithm are discussed in RFC-1034. This section discusses implementation details assuming the database structure suggested in the name server implementation section of this memo.

7.1. Transforming a User Request into a Query

The first step a resolver takes is to transform the client's request, stated in a format suitable to the local OS, into a search specification for RRs at a specific name which match a specific QTYPE and QCLASS.

Query Specification Guidelines

Single Type and Class Preference: Where possible, the QTYPE and QCLASS should correspond to a single type and a single class, because this makes the use of cached data much simpler.

Reason: The presence of data of one type in a cache doesn't confirm the existence or non-existence of data of other types. Hence the only way to be sure is to consult an authoritative source.

QCLASS=* Limitation: If QCLASS=* is used, then authoritative answers won't be available.
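The point about single-type queries can be illustrated with a minimal sketch. It assumes a hypothetical cache keyed by (owner name, type, class); the names `cache` and `cache_lookup` are illustrative and do not come from this memo.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical cache layout: (owner name, type, class) -> list of RDATA strings.
CacheKey = Tuple[str, str, str]
cache: Dict[CacheKey, List[str]] = {
    ("example.com.", "A", "IN"): ["192.0.2.1"],
}


def cache_lookup(name: str, qtype: str, qclass: str) -> Optional[List[str]]:
    """Return cached RDATA for exactly this (name, type, class), else None.

    None means "unknown", not "does not exist": cached A data says nothing
    about MX data at the same name, so the resolver must ask a name server.
    """
    return cache.get((name.lower(), qtype, qclass))
```

With a single QTYPE and QCLASS, one dictionary probe decides whether the cache can answer. A miss for `("example.com.", "MX", "IN")` still requires a query even though A data for the same name is cached.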

Request State Information

Since a resolver must be able to multiplex multiple requests if it is to perform its function efficiently, each pending request is usually represented in some block of state information.

This state block will typically contain:

1. Timestamp

Purpose: Records the time the request began.

Usage:

The timestamp is used to decide whether RRs in the database can be used or are out of date
This timestamp uses the absolute time format previously discussed for RR storage in zones and caches

TTL Interpretation:

When an RR's TTL indicates a relative time, the RR must be timely, since it is part of a zone
When the RR has an absolute time, it is part of a cache, and the TTL of the RR is compared against the timestamp for the start of the request

Advantage of Timestamps: Using the timestamp is superior to using a current time, since it allows RRs with TTLs of zero to be entered in the cache in the usual manner, but still used by the current request, even after intervals of many seconds due to system load, query retransmission timeouts, etc.
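The timestamp rule above can be sketched as follows. This is only an illustration: the `StoredRR` layout, with `expires_at` left unset for zone data and set to an absolute time for cached data, is an assumption made for the example rather than a structure defined by this memo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredRR:
    name: str
    rdata: str
    ttl: int                            # relative TTL in seconds, as in zone data
    expires_at: Optional[float] = None  # absolute expiry; set only for cached RRs


def cache_rr(name: str, rdata: str, ttl: int, received_at: float) -> StoredRR:
    """Convert a received RR to absolute-time form before caching it."""
    return StoredRR(name, rdata, ttl, expires_at=received_at + ttl)


def usable(rr: StoredRR, request_started_at: float) -> bool:
    """Decide whether an RR may be used by the request that began at the given time."""
    if rr.expires_at is None:
        # Relative TTL: the RR is part of a zone and is always timely.
        return True
    # Absolute time: compare against the start of the request, not "now".
    return rr.expires_at >= request_started_at
```

Because the comparison uses the request's start time, an RR received with a TTL of zero at time 103 is still usable by the request that began at time 100, but not by a request that begins later.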

2. Work Limitation Parameters

Purpose: Some sort of parameters to limit the amount of work which will be performed for this request.

Rationale: The amount of work which a resolver will do in response to a client request must be limited to guard against:

Errors in the database, such as circular CNAME references
Operational problems, such as network partition which prevents the resolver from accessing the name servers it needs

Implementation:

While local limits on the number of times a resolver will retransmit a particular query to a particular name server address are essential, the resolver should also have a global per-request counter to limit work on a single request
The counter should be set to some initial value and decremented whenever the resolver performs any action (retransmission timeout, retransmission, etc.)
If the counter passes zero, the request is terminated with a temporary error

Parallel Requests: Note that if the resolver structure allows one request to start others in parallel, such as when the need to access a name server for one request causes a parallel resolve for the name server's addresses, the spawned request should be started with a lower counter. This prevents circular references in the database from starting a chain reaction of resolver activity.
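A minimal sketch of such a counter is shown below. The initial budget of 20, the unit cost per action, and halving the budget for spawned requests are all arbitrary choices made for illustration; the memo only requires that the counter be decremented on each action and that spawned requests start lower. "Passes zero" is read here as "drops below zero".

```python
class TemporaryError(Exception):
    """Signals that a request was abandoned with a temporary error."""


class RequestState:
    def __init__(self, budget: int = 20) -> None:
        # Global per-request work counter (initial value is an assumption).
        self.budget = budget

    def charge(self, cost: int = 1) -> None:
        """Call once per action (retransmission timeout, retransmission, etc.)."""
        self.budget -= cost
        if self.budget < 0:
            raise TemporaryError("work limit for this request exceeded")

    def spawn(self) -> "RequestState":
        """Start a parallel request with a lower counter than its parent.

        Halving bounds the depth of any chain of spawned requests, and a
        child of an exhausted parent cannot perform any action at all.
        """
        return RequestState(budget=self.budget // 2)
```

With these choices, a chain of spawned requests starting from a budget of 20 receives 10, 5, 2, 1, and then 0 units of work, so a circular reference in the database cannot sustain unbounded activity.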

3. SLIST Data Structure

The request state also includes the SLIST data structure discussed in RFC-1034.

Purpose: This structure keeps track of the state of a request if it must wait for answers from foreign name servers.

7.2. Sending the Queries

As described in RFC-1034, the basic task of the resolver is to formulate a query which will answer the client's request and direct that query to name servers which can provide the information.

Query Formulation Challenges

The resolver will usually only have very strong hints about which servers to ask, in the form of NS RRs, and may have to:

Revise the query, in response to CNAMEs
Revise the set of name servers the resolver is asking, in response to delegation responses which point the resolver to name servers closer to the desired information

In addition to the information requested by the client, the resolver may have to call upon its own services to determine the address of name servers it wishes to contact.

Resolver Model

The model used in this memo assumes that:

The resolver is multiplexing attention between multiple requests, some from the client, and some internally generated
Each request is represented by some state information
The desired behavior is that the resolver transmit queries to name servers in a way that:
- Maximizes the probability that the request is answered
- Minimizes the time that the request takes
- Avoids excessive transmissions

Key Algorithm

The key algorithm uses the state information of the request to:

Select the next name server address to query
Compute a timeout which will cause the next action should a response not arrive

The next action will usually be a transmission to some other server, but may be a temporary error to the client.

SLIST Initialization

Starting Point: The resolver always starts with a list of server names to query (SLIST).

Initial Content:

This list will be all NS RRs which correspond to the nearest ancestor zone that the resolver knows about
To avoid startup problems, the resolver should have a set of default servers which it will ask should it have no current NS RRs which are appropriate

Address Addition: The resolver then adds to SLIST all of the known addresses for the name servers, and may start parallel requests to acquire the addresses of the servers when the resolver has the name, but no addresses, for the name servers.
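The initialization steps above might be sketched as follows. The dictionary shapes (`known_ns` mapping a zone to server names, `known_addrs` mapping a server name to addresses, `defaults` mapping default server names to addresses) are assumptions for the example, and all names are assumed to be absolute with a trailing dot.

```python
from typing import Dict, List


def ancestors(name: str) -> List[str]:
    """List a name and its ancestors, nearest first, ending at the root."""
    stripped = name.rstrip(".")
    labels = stripped.split(".") if stripped else []
    return [".".join(labels[i:]) + "." for i in range(len(labels))] + ["."]


def init_slist(
    qname: str,
    known_ns: Dict[str, List[str]],
    known_addrs: Dict[str, List[str]],
    defaults: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Build SLIST as a mapping of server name -> known addresses."""
    for zone in ancestors(qname.lower()):
        names = known_ns.get(zone)
        if names:
            # Nearest ancestor zone with known NS RRs; attach known addresses.
            return {n: list(known_addrs.get(n, [])) for n in names}
    # No appropriate NS RRs at all: fall back to the default servers.
    return {n: list(a) for n, a in defaults.items()}


def names_needing_addresses(slist: Dict[str, List[str]]) -> List[str]:
    """Servers known by name only; candidates for parallel address requests."""
    return [name for name, addrs in slist.items() if not addrs]
```

Servers returned by `names_needing_addresses` are the ones for which the resolver may start parallel requests, subject to the work limits described in section 7.1.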

History Information

To complete initialization of SLIST, the resolver attaches whatever history information it has to each address in SLIST.

Typical History Data:

Some sort of weighted averages for the response time of the address
The batting average of the address (i.e., how often the address responded at all to the request)

Important Notes:

This information should be kept on a per address basis, rather than on a per name server basis, because the response time and batting average of a particular server may vary considerably from address to address
This information is actually specific to a resolver address / server address pair, so a resolver with multiple addresses may wish to keep separate histories for each of its addresses
For addresses which have no such history, an expected round trip time of 5-10 seconds should be the worst case, with lower estimates for the same local network, etc.
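One possible shape for this history is sketched below. The exponentially weighted average, the smoothing factor of 0.25, and the 5-second starting estimate are assumptions chosen for illustration; the memo asks only for "some sort of weighted averages" and a batting average, kept per address pair.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

DEFAULT_RTT = 5.0  # seconds; assumed starting estimate within the 5-10 s worst case


@dataclass
class AddressHistory:
    srtt: float = DEFAULT_RTT  # weighted average response time, in seconds
    sent: int = 0              # queries sent to this address
    answered: int = 0          # queries that received any response

    def record_sent(self) -> None:
        self.sent += 1

    def record_response(self, rtt: float, alpha: float = 0.25) -> None:
        """Fold one measured round trip time into the weighted average."""
        self.answered += 1
        self.srtt = (1.0 - alpha) * self.srtt + alpha * rtt

    @property
    def batting_average(self) -> float:
        """Fraction of queries answered; 0.0 when nothing has been sent yet."""
        return self.answered / self.sent if self.sent else 0.0


# Keyed by (resolver address, server address), not by server name.
history: Dict[Tuple[str, str], AddressHistory] = {}


def history_for(local_addr: str, server_addr: str) -> AddressHistory:
    return history.setdefault((local_addr, server_addr), AddressHistory())
```

Keying on the address pair lets a multi-homed resolver track each of its own addresses separately, as the notes above suggest.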

Delegation Handling: Note that whenever a delegation is followed, the resolver algorithm reinitializes SLIST.

Server Selection and Timeout

Ranking: The history information establishes a partial ranking of the available name server addresses.

Selection Strategy: Each time an address is chosen, the state should be altered to prevent its selection again until all other addresses have been tried.

Timeout Calculation: The timeout for each transmission should be 50-100% greater than the average predicted value to allow for variance in response.
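The selection and timeout rules might look like the sketch below. Ranking purely by predicted round trip time is a simplifying assumption (a real ranking could also weigh the batting average), and the function names are illustrative.

```python
from typing import Dict, Optional, Set


def choose_address(predicted_rtt: Dict[str, float], tried: Set[str]) -> Optional[str]:
    """Pick the best-ranked address that has not yet been tried in this round."""
    if not predicted_rtt:
        return None
    if all(addr in tried for addr in predicted_rtt):
        # Every address has had its turn; start a new round.
        tried.clear()
    candidates = [addr for addr in predicted_rtt if addr not in tried]
    best = min(candidates, key=lambda addr: predicted_rtt[addr])
    tried.add(best)  # prevents reselection until all others have been tried
    return best


def timeout_for(predicted: float, margin: float = 0.5) -> float:
    """Timeout that is 50-100% greater than the predicted response time."""
    if not 0.5 <= margin <= 1.0:
        raise ValueError("margin must be between 0.5 and 1.0")
    return predicted * (1.0 + margin)
```

The `tried` set is part of the per-request state, so the per-request work counter still bounds how many rounds are attempted.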

Special Cases

Bootstrapping Problem:

The resolver may encounter a situation where no addresses are available for any of the name servers named in SLIST, and where the servers in the list are precisely those which would normally be used to look up their own addresses
This situation typically occurs when the glue address RRs have a smaller TTL than the NS RRs marking delegation, or when the resolver caches the result of an NS search
The resolver should detect this condition and restart the search at the next ancestor zone, or alternatively at the root
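A rough sketch of detecting this condition is given below. It approximates "servers which would normally be used to look up their own addresses" as "every server name lies at or below the zone whose NS RRs populated SLIST", which is an assumption of this example. Names are assumed to be absolute, with a trailing dot.

```python
from typing import Dict, List, Optional


def is_within(name: str, zone: str) -> bool:
    """True if name is the zone itself or a descendant of it."""
    name, zone = name.lower(), zone.lower()
    if zone == ".":
        return True
    return name == zone or name.endswith("." + zone)


def parent_zone(zone: str) -> str:
    """Next ancestor of an absolute name; the root is its own parent here."""
    if zone == ".":
        return "."
    rest = zone.split(".", 1)[1]
    return rest if rest else "."


def restart_zone(slist: Dict[str, List[str]], zone: str) -> Optional[str]:
    """Return the zone to restart the search from if SLIST is stuck, else None."""
    if not slist:
        return None
    no_addresses = all(not addrs for addrs in slist.values())
    self_served = all(is_within(name, zone) for name in slist)
    if no_addresses and self_served:
        return parent_zone(zone)
    return None
```

This sketch restarts at the next ancestor zone; restarting at the root, the alternative mentioned above, would simply return "." instead.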

Server Errors:

If a resolver gets a server error or other bizarre response from a name server, it should remove that server from SLIST
The resolver may wish to schedule an immediate transmission to the next candidate server address

7.3. Processing Responses

The first step in processing arriving response datagrams is to parse the response.

Response Parsing Procedure

This procedure should include:

1. Header Check:

Check the header for reasonableness
Discard datagrams which are queries when responses are expected

2. Section Parsing:

Parse the sections of the message
Ensure that all RRs are correctly formatted

3. TTL Check (Optional):

As an optional step, check the TTLs of arriving data looking for RRs with excessively long TTLs
If an RR has an excessively long TTL, say greater than 1 week:
- Either discard the whole response
- Or limit all TTLs in the response to 1 week
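The optional TTL check can be sketched as below. The function name and the `discard` switch are illustrative; the one-week threshold is the example value from the text.

```python
from typing import List, Optional

MAX_TTL = 7 * 24 * 60 * 60  # one week, in seconds


def limit_ttls(ttls: List[int], discard: bool = False) -> Optional[List[int]]:
    """Apply the optional TTL check to the TTLs of one response.

    Returns the TTLs to use, or None when the whole response is discarded.
    """
    if all(ttl <= MAX_TTL for ttl in ttls):
        return list(ttls)
    if discard:
        return None  # option 1: discard the whole response
    return [min(ttl, MAX_TTL) for ttl in ttls]  # option 2: limit all TTLs
```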

Response Matching

The next step is to match the response to a current resolver request.

Recommended Strategy:

Do a preliminary matching using the ID field in the domain header
Then verify that the question section corresponds to the information currently desired

Implementation Requirement: This requires that the transmission algorithm devote several bits of the domain ID field to a request identifier of some sort.
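One way to meet this requirement is sketched below. The split of the 16-bit ID into 6 bits of request identifier and 10 bits of per-transmission sequence number is an arbitrary assumption, as are the function names and the representation of a question as a (QNAME, QTYPE, QCLASS) tuple.

```python
from typing import Dict, Tuple

REQUEST_BITS = 6               # assumed: up to 64 concurrent requests
SEQ_BITS = 16 - REQUEST_BITS   # remaining bits vary per transmission

Question = Tuple[str, int, int]  # (QNAME, QTYPE, QCLASS)


def make_id(request_slot: int, seq: int) -> int:
    """Pack a request identifier and a sequence number into a 16-bit ID."""
    if not 0 <= request_slot < (1 << REQUEST_BITS):
        raise ValueError("request slot out of range")
    return (request_slot << SEQ_BITS) | (seq & ((1 << SEQ_BITS) - 1))


def request_slot_of(msg_id: int) -> int:
    """Recover the request identifier from a message ID."""
    return (msg_id & 0xFFFF) >> SEQ_BITS


def matches(pending: Dict[int, Question], msg_id: int, question: Question) -> bool:
    """Preliminary match on the ID, then verify the question section."""
    expected = pending.get(request_slot_of(msg_id))
    if expected is None:
        return False
    qname, qtype, qclass = question
    return (qname.lower(), qtype, qclass) == (expected[0].lower(), expected[1], expected[2])
```

The ID lookup is cheap and rejects most stray datagrams; the question comparison then guards against a response whose ID happens to collide with a pending request.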

Special Considerations

Source Address Variation:

Some name servers send their responses from different addresses than the one used to receive the query
That is, a resolver cannot rely on a response coming from the same address to which it sent the corresponding query
This name server bug is typically encountered in UNIX systems

Retransmission Handling:

If the resolver retransmits a particular request to a name server, it should be able to use a response from any of the transmissions
However, if it is using the response to sample the round trip time to access the name server:
- It must be able to determine which transmission matches the response (and keep transmission times for each outgoing message)
- Or only calculate round trip times based on initial transmissions
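The first option, keeping a transmission time for each outgoing message, could be sketched as follows. It assumes that each retransmission is sent with a distinct message ID so that a response identifies the transmission it answers; that convention and the `RttSampler` name are assumptions of this example.

```python
from typing import Dict, Optional


class RttSampler:
    """Keeps one send time per outgoing message ID."""

    def __init__(self) -> None:
        self.sent_at: Dict[int, float] = {}

    def on_send(self, msg_id: int, now: float) -> None:
        self.sent_at[msg_id] = now

    def on_response(self, msg_id: int, now: float) -> Optional[float]:
        """Return the round trip time for this transmission, if it is known.

        A duplicate or unknown response yields None and contributes no sample.
        """
        started = self.sent_at.pop(msg_id, None)
        if started is None:
            return None
        return now - started
```

The response itself can still be used to answer the request regardless of which transmission it matches; only the round trip sample depends on the pairing.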

Missing Zone Data:

A name server will occasionally not have a current copy of a zone which it should have according to some NS RRs
The resolver should simply remove the name server from the current SLIST, and continue

7.4. Using the Cache

In general, we expect a resolver to cache all data which it receives in responses since it may be useful in answering future client requests.

However, there are several types of data which should not be cached:

Data That Should Not Be Cached

1. Incomplete Sets:

When several RRs of the same type are available for a particular owner name, the resolver should either cache them all or none at all
When a response is truncated, and a resolver doesn't know whether it has a complete set, it should not cache a possibly partial set of RRs

2. Non-Authoritative Data Over Authoritative:

Cached data should never be used in preference to authoritative data
If caching would cause this to happen, the data should not be cached

3. Inverse Query Results:

The results of an inverse query should not be cached

4. Wildcard Query Results:

The results of standard queries where the QNAME contains * labels should not be cached if the data might be used to construct wildcards
Reason: The cache does not necessarily contain existing RRs or zone boundary information which is necessary to restrict the application of the wildcard RRs

5. Dubious Reliability Data:

RR data in responses of dubious reliability
When a resolver receives unsolicited responses or RR data other than that requested, it should discard it without caching it
Basic Implication: All sanity checks on a packet should be performed before any of it is cached
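Several of the exclusions above can be gathered into a single check that runs before anything is cached. The `ResponseInfo` fields are hypothetical summaries that a parser would fill in; they are not structures defined by this memo. The rule about authoritative data is not covered here because it depends on what the cache already holds.

```python
from dataclasses import dataclass


@dataclass
class ResponseInfo:
    solicited: bool      # matched a pending request and its question
    truncated: bool      # TC bit was set, so RR sets may be partial
    inverse_query: bool  # the response answers an inverse query
    qname: str           # QNAME from the question section


def may_cache(info: ResponseInfo) -> bool:
    """Return True only if none of the simple exclusions applies."""
    if not info.solicited:
        return False  # unsolicited or otherwise dubious data
    if info.truncated:
        return False  # possibly incomplete RR sets
    if info.inverse_query:
        return False  # inverse query results are never cached
    if "*" in info.qname.rstrip(".").split("."):
        return False  # data might be used to construct wildcards
    return True
```

Running this check after parsing and matching, and before any cache write, follows the rule that all sanity checks on a packet are performed before any of it is cached.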

Cache Update Strategy

When a resolver has a set of RRs for some name in a response, and wants to cache the RRs:

It should check its cache for already existing RRs
Depending on the circumstances, either the data in the response or the cache is preferred
The two should never be combined
If the data in the response is from authoritative data in the answer section, it is always preferred
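A minimal sketch of this update rule follows. The memo leaves most of the preference decision to circumstances, so the policy here for non-authoritative data (keep whatever is already cached) is an assumption made only to complete the example; the two fixed points taken from the text are that the sets are never merged and that authoritative answer-section data always wins.

```python
from typing import Dict, List, Tuple

CacheKey = Tuple[str, str, str]  # (owner name, type, class)


def update_cache(
    cache: Dict[CacheKey, List[str]],
    key: CacheKey,
    rrset: List[str],
    from_authoritative_answer: bool,
) -> List[str]:
    """Store a complete RR set, replacing or keeping the old one, never merging."""
    existing = cache.get(key)
    if existing is None or from_authoritative_answer:
        # Authoritative answer-section data is always preferred, and a
        # replacement is wholesale: old and new RRs are never combined.
        cache[key] = list(rrset)
    # Otherwise keep the cached set unchanged (assumed policy for this sketch).
    return cache[key]
```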

Best Practices Summary

For Efficient Resolver Implementation

Use single type and class queries when possible for better cache utilization
Maintain per-request state including timestamps and work counters
Keep per-address history for smart server selection
Implement proper timeout strategies (50-100% above predicted time)
Handle special cases like bootstrapping problems and server errors
Parse responses carefully with header checks and format validation
Cache selectively - not all data should be cached
Prefer authoritative data over cached data when both are available

Related: See Section 6, Name Server Implementation, for server-side details.
