7. Resolver Implementation
The top levels of the recommended resolver algorithm are discussed in RFC-1034. This section discusses implementation details assuming the database structure suggested in the name server implementation section of this memo.
7.1. Transforming a User Request into a Query
The first step a resolver takes is to transform the client's request, stated in a format suitable to the local OS, into a search specification for RRs at a specific name which match a specific QTYPE and QCLASS.
Query Specification Guidelines
Single Type and Class Preference: Where possible, the QTYPE and QCLASS should correspond to a single type and a single class, because this makes the use of cached data much simpler.
Reason: The presence of data of one type in a cache doesn't confirm the existence or non-existence of data of other types. Hence the only way to be sure is to consult an authoritative source.
QCLASS=* Limitation: If QCLASS=* is used, then authoritative answers won't be available.
Request State Information
Since a resolver must be able to multiplex multiple requests if it is to perform its function efficiently, each pending request is usually represented in some block of state information.
This state block will typically contain:
1. Timestamp
Purpose: Indicating the time the request began.
Usage:
- The timestamp is used to decide whether RRs in the database can be used or are out of date
- This timestamp uses the absolute time format previously discussed for RR storage in zones and caches
TTL Interpretation:
- When an RR's TTL indicates a relative time, the RR must be timely, since it is part of a zone
- When the RR has an absolute time, it is part of a cache, and the TTL of the RR is compared against the timestamp for the start of the request
Advantage of Timestamps: Using the timestamp is superior to using a current time, since it allows RRs with TTLs of zero to be entered in the cache in the usual manner, but still used by the current request, even after intervals of many seconds due to system load, query retransmission timeouts, etc.
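The freshness test described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `CachedRR`, `Request`, `cache_rr`, and `usable` are hypothetical names that do not appear in this memo, and times are assumed to be seconds on a single monotonic scale.

```python
from dataclasses import dataclass


@dataclass
class CachedRR:
    """Hypothetical cache entry; expires_at is an absolute time in seconds."""
    name: str
    rdata: str
    expires_at: float


@dataclass
class Request:
    """Hypothetical request state block holding the request start time."""
    started_at: float


def cache_rr(name: str, rdata: str, ttl: int, received_at: float) -> CachedRR:
    # Convert the relative TTL from the response into an absolute expiry time.
    return CachedRR(name, rdata, received_at + ttl)


def usable(rr: CachedRR, request: Request) -> bool:
    # Compare against the request's start time, not the current clock, so an
    # RR with TTL 0 that was cached during this request can still be used by it.
    return rr.expires_at >= request.started_at
```

With this comparison, an RR with a TTL of zero received a few seconds into a request is still usable by that request, but not by a request that starts later.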
2. Work Limitation Parameters
Purpose: Some sort of parameters to limit the amount of work which will be performed for this request.
Rationale: The amount of work which a resolver will do in response to a client request must be limited to guard against:
- Errors in the database, such as circular CNAME references
- Operational problems, such as network partition which prevents the resolver from accessing the name servers it needs
Implementation:
- While local limits on the number of times a resolver will retransmit a particular query to a particular name server address are essential, the resolver should also have a global per-request counter to limit work on a single request
- The counter should be set to some initial value and decremented whenever the resolver performs any action (retransmission timeout, retransmission, etc.)
- If the counter passes zero, the request is terminated with a temporary error
Parallel Requests: Note that if the resolver structure allows one request to start others in parallel, such as when the need to access a name server for one request causes a parallel resolve for the name server's addresses, the spawned request should be started with a lower counter. This prevents circular references in the database from starting a chain reaction of resolver activity.
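A sketch of such a per-request counter is shown below. The class and method names (`WorkBudget`, `charge`, `spawn`) are hypothetical, and the choices to count spawning as an action and to give a spawned request half of the remaining budget are illustrative assumptions; the memo only requires that the spawned request start with a lower counter.

```python
class WorkLimitExceeded(Exception):
    """Raised when a request exhausts its work budget (a temporary error)."""


class WorkBudget:
    def __init__(self, initial: int) -> None:
        self.remaining = initial

    def charge(self, cost: int = 1) -> None:
        # Called for every action: transmission, retransmission timeout, etc.
        self.remaining -= cost
        if self.remaining < 0:
            # The counter has passed zero: terminate the request.
            raise WorkLimitExceeded("work limit reached; report a temporary error")

    def spawn(self) -> "WorkBudget":
        # Assumption: spawning is itself an action, so charge the parent first.
        self.charge()
        # The child gets half of what is left, which is always lower than the
        # parent's counter before the spawn, so chains of spawns run out.
        return WorkBudget(self.remaining // 2)
```

Because each spawned budget is strictly smaller than its parent's, a circular reference that keeps triggering new sub-requests exhausts the budget after a bounded number of steps.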
3. SLIST Data Structure
The SLIST data structure is discussed in RFC-1034.
Purpose: This structure keeps track of the state of a request if it must wait for answers from foreign name servers.
7.2. Sending the Queries
As described in RFC-1034, the basic task of the resolver is to formulate a query which will answer the client's request and direct that query to name servers which can provide the information.
Query Formulation Challenges
The resolver will usually only have very strong hints about which servers to ask, in the form of NS RRs, and may have to:
- Revise the query, in response to CNAMEs
- Revise the set of name servers the resolver is asking, in response to delegation responses which point the resolver to name servers closer to the desired information
In addition to the information requested by the client, the resolver may have to call upon its own services to determine the address of name servers it wishes to contact.
Resolver Model
The model used in this memo assumes that:
- The resolver is multiplexing attention between multiple requests, some from the client, and some internally generated
- Each request is represented by some state information
- The desired behavior is that the resolver transmit queries to name servers in a way that:
- Maximizes the probability that the request is answered
- Minimizes the time that the request takes
- Avoids excessive transmissions
Key Algorithm
The key algorithm uses the state information of the request to:
- Select the next name server address to query
- Compute a timeout which will cause the next action should a response not arrive
The next action will usually be a transmission to some other server, but may be a temporary error to the client.
SLIST Initialization
Starting Point: The resolver always starts with a list of server names to query (SLIST).
Initial Content:
- This list will be all NS RRs which correspond to the nearest ancestor zone that the resolver knows about
- To avoid startup problems, the resolver should have a set of default servers which it will ask should it have no current NS RRs which are appropriate
Address Addition: The resolver then adds to SLIST all of the known addresses for the name servers, and may start parallel requests to acquire the addresses of the servers when the resolver has the name, but no addresses, for the name servers.
History Information
To complete initialization of SLIST, the resolver attaches whatever history information it has to each address in SLIST.
Typical History Data:
- Some sort of weighted averages for the response time of the address
- The batting average of the address (i.e., how often the address responded at all to the request)
Important Notes:
- This information should be kept on a per address basis, rather than on a per name server basis, because the response time and batting average of a particular server may vary considerably from address to address
- This information is actually specific to a resolver address / server address pair, so a resolver with multiple addresses may wish to keep separate histories for each of its addresses
- For addresses which have no such history: an expected round trip time of 5-10 seconds should be the worst case, with lower estimates for the same local network, etc.
Delegation Handling: Note that whenever a delegation is followed, the resolver algorithm reinitializes SLIST.
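The initialization steps above can be sketched as follows. All names here (`AddressEntry`, `build_slist`, `DEFAULT_RTT`, `DEFAULT_BATTING`) are hypothetical. The 5 second default comes from the 5-10 second range given above; the default batting average of 0.5 is purely an assumption, since the memo does not suggest a value.

```python
from dataclasses import dataclass

DEFAULT_RTT = 5.0       # seconds; assumed estimate for addresses with no history
DEFAULT_BATTING = 0.5   # assumption: neutral value for addresses with no history


@dataclass
class AddressEntry:
    """One SLIST entry; history is kept per address, not per server name."""
    address: str
    server_name: str
    rtt_estimate: float = DEFAULT_RTT
    batting_average: float = DEFAULT_BATTING
    tried: bool = False


def build_slist(ns_names, known_addresses, history):
    """Return (entries, names_without_addresses).

    ns_names: server names from the NS RRs of the nearest known ancestor zone.
    known_addresses: maps server name -> list of address strings.
    history: maps address -> (weighted average RTT, batting average).
    """
    entries, missing = [], []
    for name in ns_names:
        addrs = known_addresses.get(name, [])
        if not addrs:
            # Candidate for a parallel request to acquire the addresses.
            missing.append(name)
            continue
        for addr in addrs:
            rtt, avg = history.get(addr, (DEFAULT_RTT, DEFAULT_BATTING))
            entries.append(AddressEntry(addr, name, rtt, avg))
    return entries, missing
```

A resolver with several local addresses could key `history` on a (local address, server address) pair instead; the sketch uses the server address alone for brevity.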
Server Selection and Timeout
Ranking: The history information attached to each address establishes a partial ranking of the available name server addresses.
Selection Strategy: Each time an address is chosen, the state should be altered to prevent its selection again until all other addresses have been tried.
Timeout Calculation: The timeout for each transmission should be 50-100% greater than the average predicted value to allow for variance in response.
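One possible reading of the selection and timeout rules is sketched below. The function names and the dictionary layout are hypothetical, and ranking by predicted round trip time alone is a simplification: a fuller version might also weigh the batting average.

```python
def pick_next(slist):
    """Pick the untried address with the lowest predicted round trip time.

    slist is a list of dicts with keys 'address', 'rtt' (seconds), 'tried'.
    Returns None when the list is empty.
    """
    if not slist:
        return None
    candidates = [e for e in slist if not e["tried"]]
    if not candidates:
        # Every address has been tried once; start a new round.
        for e in slist:
            e["tried"] = False
        candidates = list(slist)
    chosen = min(candidates, key=lambda e: e["rtt"])
    # Mark the address so it is not selected again until the others are tried.
    chosen["tried"] = True
    return chosen


def timeout_for(entry, margin=0.5):
    # A margin of 0.5 to 1.0 corresponds to 50-100% above the predicted value.
    if not 0.5 <= margin <= 1.0:
        raise ValueError("margin should be between 0.5 and 1.0")
    return entry["rtt"] * (1.0 + margin)
```

For an address with a predicted round trip time of 0.2 seconds, the default margin gives a timeout of about 0.3 seconds.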
Special Cases
Bootstrapping Problem:
- The resolver may encounter a situation where no addresses are available for any of the name servers named in SLIST, and where the servers in the list are precisely those which would normally be used to look up their own addresses
- This situation typically occurs when the glue address RRs have a smaller TTL than the NS RRs marking delegation, or when the resolver caches the result of an NS search
- The resolver should detect this condition and restart the search at the next ancestor zone, or alternatively at the root
Server Errors:
- If a resolver gets a server error or other bizarre response from a name server, it should remove it from SLIST
- The resolver may wish to schedule an immediate transmission to the next candidate server address
7.3. Processing Responses
The first step in processing arriving response datagrams is to parse the response.
Response Parsing Procedure
This procedure should include:
1. Header Check:
- Check the header for reasonableness
- Discard datagrams which are queries when responses are expected
2. Section Parsing:
- Parse the sections of the message
- Ensure that all RRs are correctly formatted
3. TTL Check (Optional):
- As an optional step, check the TTLs of arriving data looking for RRs with excessively long TTLs
- If an RR has an excessively long TTL, say greater than 1 week:
- Either discard the whole response
- Or limit all TTLs in the response to 1 week
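The optional TTL check might look like the sketch below. The RR representation (a dict with a `ttl` key in seconds) and the function names are assumptions made for illustration only.

```python
MAX_TTL = 7 * 24 * 60 * 60  # one week, in seconds


def has_excessive_ttl(rrs, max_ttl=MAX_TTL):
    """True if any RR exceeds max_ttl; the caller may then discard the response."""
    return any(rr["ttl"] > max_ttl for rr in rrs)


def clamp_ttls(rrs, max_ttl=MAX_TTL):
    """Return copies of the RRs with every TTL limited to max_ttl."""
    return [{**rr, "ttl": min(rr["ttl"], max_ttl)} for rr in rrs]
```

The two functions correspond to the two alternatives above: test with `has_excessive_ttl` and drop the whole response, or pass the RRs through `clamp_ttls` before further processing.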
Response Matching
The next step is to match the response to a current resolver request.
Recommended Strategy:
- Do a preliminary matching using the ID field in the domain header
- Then verify that the question section corresponds to the information currently desired
Implementation Requirement: This requires that the transmission algorithm devote several bits of the domain ID field to a request identifier of some sort.
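The two-stage match can be sketched as follows. The split of the 16-bit ID field into an 8-bit request slot and an 8-bit transmission sequence number is an assumption chosen for illustration, as are the function names; the memo only says that several bits should identify the request.

```python
REQUEST_BITS = 8               # assumed number of ID bits naming the request
SEQ_BITS = 16 - REQUEST_BITS   # remaining bits distinguish transmissions
SEQ_MASK = (1 << SEQ_BITS) - 1
SLOT_MASK = (1 << REQUEST_BITS) - 1


def make_id(request_slot, seq):
    """Pack a request slot and a transmission sequence number into a 16-bit ID."""
    return ((request_slot & SLOT_MASK) << SEQ_BITS) | (seq & SEQ_MASK)


def match_response(msg_id, question, pending):
    """Return the matching request slot, or None if the response is unwanted.

    pending maps request slot -> (qname, qtype, qclass) currently desired.
    question is the (qname, qtype, qclass) parsed from the response.
    """
    slot = msg_id >> SEQ_BITS          # preliminary match on the ID field
    wanted = pending.get(slot)
    if wanted is None:
        return None
    qname, qtype, qclass = question
    # Verify the question section; domain names compare case-insensitively.
    if (qname.lower(), qtype, qclass) != (wanted[0].lower(), wanted[1], wanted[2]):
        return None
    return slot
```

Keeping a sequence number in the low bits also gives each retransmission a distinct ID, which helps when the resolver needs to tell transmissions apart.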
Special Considerations
Source Address Variation:
- Some name servers send their responses from different addresses than the one used to receive the query
- That is, a resolver cannot rely on a response coming from the same address to which it sent the corresponding query
- This name server bug is typically encountered in UNIX systems
Retransmission Handling:
- If the resolver retransmits a particular request to a name server, it should be able to use a response from any of the transmissions
- However, if it is using the response to sample the round trip time to access the name server:
- It must be able to determine which transmission matches the response (and keep transmission times for each outgoing message)
- Or only calculate round trip times based on initial transmissions
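The first option, keeping a transmission time for each outgoing message, can be sketched as below. `RttTracker` and its methods are hypothetical names, the sketch assumes every transmission carries a distinct message ID, and the smoothing factor of 0.25 is an arbitrary illustrative choice for the weighted average.

```python
class RttTracker:
    def __init__(self, alpha=0.25):
        self.alpha = alpha   # weight given to the newest sample
        self.sent = {}       # message ID -> (server address, send time)
        self.estimate = {}   # server address -> smoothed RTT in seconds

    def on_send(self, msg_id, address, now):
        # Record the send time of every transmission, including retransmissions.
        self.sent[msg_id] = (address, now)

    def on_response(self, msg_id, now):
        """Update the estimate and return the RTT sample, or None if unknown."""
        record = self.sent.pop(msg_id, None)
        if record is None:
            return None
        address, sent_at = record
        sample = now - sent_at
        old = self.estimate.get(address)
        if old is None:
            self.estimate[address] = sample
        else:
            self.estimate[address] = (1 - self.alpha) * old + self.alpha * sample
        return sample
```

Matching on the message ID rather than on the source address of the response keeps the measurement correct even when a server replies from a different address than the one queried.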
Missing Zone Data:
- A name server will occasionally not have a current copy of a zone which it should have according to some NS RRs
- The resolver should simply remove the name server from the current SLIST, and continue
7.4. Using the Cache
In general, we expect a resolver to cache all data which it receives in responses since it may be useful in answering future client requests.
However, there are several types of data which should not be cached:
Data That Should Not Be Cached
1. Incomplete Sets:
- When several RRs of the same type are available for a particular owner name, the resolver should either cache them all or none at all
- When a response is truncated, and a resolver doesn't know whether it has a complete set, it should not cache a possibly partial set of RRs
2. Non-Authoritative Data Over Authoritative:
- Cached data should never be used in preference to authoritative data
- If caching would cause this to happen, the data should not be cached
3. Inverse Query Results:
- The results of an inverse query should not be cached
4. Wildcard Query Results:
- The results of standard queries where the QNAME contains "*" labels, if the data might be used to construct wildcards
- Reason: The cache does not necessarily contain existing RRs or zone boundary information which is necessary to restrict the application of the wildcard RRs
5. Dubious Reliability Data:
- RR data in responses of dubious reliability
- When a resolver receives unsolicited responses or RR data other than that requested, it should discard it without caching it
- Basic Implication: All sanity checks on a packet should be performed before any of it is cached
Cache Update Strategy
When a resolver has a set of RRs for some name in a response, and wants to cache the RRs:
- It should check its cache for already existing RRs
- Depending on the circumstances, either the data in the response or the cache is preferred
- The two should never be combined
- If the data in the response is from authoritative data in the answer section, it is always preferred
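A minimal sketch of this update rule follows. The function name, the cache layout, and the policy for the non-authoritative case (keep whatever is already cached) are assumptions; the memo leaves that case to "the circumstances" and only fixes two points, that sets are never merged and that authoritative answer-section data always wins.

```python
def update_cache(cache, key, new_rrset, *, authoritative_answer, truncated=False):
    """Store new_rrset under key if appropriate; return True if it was stored.

    cache maps (owner name, type, class) -> list of RRs.
    authoritative_answer: the RRs came from the answer section of an
    authoritative response.
    """
    if truncated:
        # The set may be partial, so it should not be cached at all.
        return False
    if key in cache and not authoritative_answer:
        # Assumed policy: prefer the existing cached set in this case.
        return False
    # Replace the whole set; cached and new RRs are never combined.
    cache[key] = list(new_rrset)
    return True
```

Replacing the list wholesale, rather than appending to it, is what keeps response data and cached data from being combined.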
Best Practices Summary
For Efficient Resolver Implementation
- Use single type and class queries when possible for better cache utilization
- Maintain per-request state including timestamps and work counters
- Keep per-address history for smart server selection
- Implement proper timeout strategies (50-100% above predicted time)
- Handle special cases like bootstrapping problems and server errors
- Parse responses carefully with header checks and format validation
- Cache selectively: not all data should be cached
- Prefer authoritative data over cached data when both are available
Related: See 6. Name Server Implementation for server-side details