RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax

Status: Internet Standard (STD 66)
Updates: RFC 1738
Obsoletes: RFC 2732, 2396, 1808
Authors: T. Berners-Lee (W3C/MIT), R. Fielding (Day Software), L. Masinter (Adobe Systems)
Date: January 2005

Abstract

A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource.

This specification defines:

  • The generic URI syntax
  • A process for resolving URI references that might be in relative form
  • Guidelines and security considerations for the use of URIs on the Internet

Key Features:

  • The URI syntax defines a grammar that is a superset of all valid URIs
  • Allows implementations to parse common components of a URI reference without knowing scheme-specific requirements
  • Does not define a generative grammar for URIs; that task is performed by individual URI scheme specifications

Importance

RFC 3986 is core to Web infrastructure:

  • 🌐 Defines generic syntax for URLs and URNs
  • 🔗 Foundation for all resource location on the Web
  • 📋 Basis for the URI schemes of protocols such as HTTP, HTTPS, and FTP
  • 🎯 Specifies the relative URI resolution algorithm
  • 🔒 Documents URI security considerations

Table of Contents

1. Introduction

  • 1.1 Overview of URIs
    • 1.1.1 Generic Syntax
    • 1.1.2 Examples
    • 1.1.3 URI, URL, and URN
  • 1.2 Design Considerations
    • 1.2.1 Transcription
    • 1.2.2 Separating Identification from Interaction
    • 1.2.3 Hierarchical Identifiers
  • 1.3 Syntax Notation

2. Characters

  • 2.1 Percent-Encoding
  • 2.2 Reserved Characters
  • 2.3 Unreserved Characters
  • 2.4 When to Encode or Decode
  • 2.5 Identifying Data

3. Syntax Components

  • 3.1 Scheme
  • 3.2 Authority
    • 3.2.1 User Information
    • 3.2.2 Host
    • 3.2.3 Port
  • 3.3 Path
  • 3.4 Query
  • 3.5 Fragment

4. Usage

  • 4.1 URI Reference
  • 4.2 Relative Reference
  • 4.3 Absolute URI
  • 4.4 Same-Document Reference
  • 4.5 Suffix Reference

5. Reference Resolution

  • 5.1 Establishing a Base URI
  • 5.2 Relative Resolution
  • 5.3 Component Recomposition
  • 5.4 Reference Resolution Examples

6. Normalization and Comparison

  • 6.1 Equivalence
  • 6.2 Comparison Ladder

7. Security Considerations

  • 7.1 Reliability and Consistency
  • 7.2 Malicious Construction
  • 7.3 Back-End Transcoding
  • 7.4 Rare IP Address Formats
  • 7.5 Sensitive Information
  • 7.6 Semantic Attacks

8. IANA Considerations

9. Acknowledgements

10. References

Appendices

Quick Reference

URI Generic Syntax

URI       = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

hier-part = "//" authority path-abempty
          / path-absolute
          / path-rootless
          / path-empty

URI Component Example

  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
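
As a rough illustration of how the five components above can be separated without any scheme-specific knowledge, the sketch below applies the regular expression given in RFC 3986, Appendix B. The function name `split_uri` and the dictionary layout are assumptions made for this example; they are not part of the specification.

```python
import re

# Regular expression from RFC 3986, Appendix B. It matches any string,
# so it splits a reference into components but does not validate it.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(reference: str) -> dict:
    """Split a URI reference into its five generic components.

    Components that are absent come back as None; the path is always
    a string, possibly empty. (Hypothetical helper for illustration.)
    """
    match = URI_RE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


if __name__ == "__main__":
    for name, value in split_uri(
        "foo://example.com:8042/over/there?name=ferret#nose"
    ).items():
        print(f"{name:10} {value}")
```

The same function handles references without an authority, such as `urn:example:animal:ferret:nose`, where everything after the scheme lands in the path.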

Common URI Schemes

Scheme   Purpose         Example
http     HTTP Protocol   http://www.example.com/
https    Secure HTTP     https://www.example.com/
ftp      File Transfer   ftp://ftp.example.com/file.txt
mailto   Email           mailto:[email protected]
file     Local File      file:///path/to/file
data     Inline Data     data:text/plain;base64,SGVsbG8=
tel      Telephone       tel:+1-800-555-1212

Reserved Characters

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="

Unreserved Characters

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
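
To make the role of the unreserved set concrete, here is a minimal sketch that percent-encodes every octet outside that set, assuming the text is first converted to UTF-8. The helper name `pct_encode` is hypothetical. Encoding everything except unreserved characters suits a single data component; a complete URI keeps its reserved delimiters unencoded.

```python
from urllib.parse import quote, unquote

# The unreserved set: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-._~"
)


def pct_encode(text: str) -> str:
    """Percent-encode every UTF-8 octet that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            out.append(char)
        else:
            # "%" followed by two uppercase hexadecimal digits
            out.append(f"%{byte:02X}")
    return "".join(out)


if __name__ == "__main__":
    print(pct_encode(" "))
    print(pct_encode("你"))
    # The standard library offers the same behaviour when no extra
    # characters are marked as safe.
    print(quote("你", safe=""))
    print(unquote("%E4%BD%A0"))
```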

Percent-Encoding

pct-encoded = "%" HEXDIG HEXDIG

Examples:
Space → %20
"你" (Chinese, encoded as three UTF-8 octets) → %E4%BD%A0

URI vs URL vs URN

Relationship Diagram

                URI (Uniform Resource Identifier)
                        /               \
                 URL                         URN
      (Uniform Resource Locator)   (Uniform Resource Name)
           (How to access)              (What it is)

Comparison

Concept   Focus            Persistence          Example
URI       Identification   Not guaranteed       All URIs
URL       Location         Location-dependent   http://example.com/page
URN       Name             Persistent           urn:isbn:0-486-27557-4

Key: All URLs are URIs, all URNs are URIs, but not all URIs are URLs or URNs.

Implementation Requirements

MUST Implement

  • ✅ Basic URI syntax parsing
  • ✅ Correct percent-encoding handling
  • ✅ Relative URI resolution algorithm
  • ✅ Case-insensitive scheme and host
  • ✅ Dot segment removal in paths
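
Two of the items above, relative resolution and dot segment removal, can be sketched briefly. The function below follows the step-by-step procedure of RFC 3986, Section 5.2.4; the name `remove_dot_segments` matches the RFC's pseudocode, but this Python rendering is only an illustration, not a reference implementation. For full resolution against a base URI, the sketch assumes Python's standard `urllib.parse.urljoin`, which follows RFC 3986 semantics since Python 3.5.

```python
from urllib.parse import urljoin


def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments from a path (RFC 3986, Section 5.2.4)."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]  # replace "/./" with "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]  # replace "/../" with "/"
            if output:
                output.pop()  # drop the last output segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any)
            # from the input to the output.
            cut = path.find("/", 1)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)


if __name__ == "__main__":
    print(remove_dot_segments("/a/b/c/./../../g"))
    print(remove_dot_segments("mid/content=5/../6"))
    # Base URI taken from the examples in RFC 3986, Section 5.4
    base = "http://a/b/c/d;p?q"
    for ref in ("g", "../g", "../../g"):
        print(ref, "->", urljoin(base, ref))
```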

SHOULD Implement

  • ✅ URI normalization
  • ✅ IRI support (Internationalized Resource Identifiers)
  • ✅ IPv6 address support
  • ✅ Secure user information handling
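
As one possible reading of the URI normalization item above, the sketch below applies two syntax-based steps described in RFC 3986, Section 6.2.2: case normalization (lowercase scheme and host, uppercase hex digits in percent-encodings) and decoding of percent-encoded unreserved characters. The name `normalize_uri` is hypothetical, dot segment removal and scheme-specific rules are deliberately left out, and the rebuild step relies on `urllib.parse.urlunsplit`, which may drop an empty query or fragment delimiter.

```python
import re
from urllib.parse import urlsplit, urlunsplit

UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
PCT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_triplet(match: re.Match) -> str:
    """Decode a triplet for an unreserved character, else uppercase its hex."""
    hex_pair = match.group(1).upper()
    char = chr(int(hex_pair, 16))
    return char if char in UNRESERVED else "%" + hex_pair


def normalize_uri(uri: str) -> str:
    """Apply case and percent-encoding normalization (illustrative only)."""
    parts = urlsplit(uri)
    # Only the host is case-insensitive; user information keeps its case.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    rebuilt = urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )
    return PCT_TRIPLET.sub(_normalize_triplet, rebuilt)


if __name__ == "__main__":
    print(normalize_uri("HTTP://User@Example.COM:80/%7euser/%3a"))
```

Two URIs that normalize to the same string under these steps can be treated as equivalent; differing results do not prove that the URIs identify different resources.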

MAY Implement

  • Scheme-specific validation
  • URI equivalence comparison
  • Automatic normalization

Related RFCs

  • RFC 1738: URL Specification (updated)
  • RFC 2396: URI Generic Syntax (obsoleted)
  • RFC 2732: IPv6 Address Format (obsoleted)
  • RFC 3987: IRI (Internationalized Resource Identifiers)
  • RFC 6874: IPv6 Zone Identifiers
  • RFC 7230: HTTP/1.1 Message Syntax
  • RFC 8820: URI Design and Ownership
