
3. Syntax Components

The generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.

URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

hier-part = "//" authority path-abempty
          / path-absolute
          / path-rootless
          / path-empty

Component Structure Example:

    foo://example.com:8042/over/there?name=ferret#nose
    \_/   \______________/\_________/ \_________/ \__/
     |           |            |            |        |
  scheme     authority       path        query   fragment
     |   _____________________|__
    / \ /                        \
    urn:example:animal:ferret:nose
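
The decomposition in the diagram can be reproduced with the regular expression given in Appendix B of RFC 3986. The sketch below is illustrative only: the names `split_uri` and `Components` are hypothetical, and the function splits a string without validating it against the grammar.

```python
import re
from typing import NamedTuple, Optional

# Regular expression from RFC 3986, Appendix B. It splits but does not validate.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


class Components(NamedTuple):
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def split_uri(uri: str) -> Components:
    """Split a URI reference into its five generic components.

    A component that is absent is returned as None; the path is always
    a string, possibly empty.
    """
    m = URI_RE.match(uri)  # every sub-pattern is optional, so this always matches
    return Components(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


print(split_uri("foo://example.com:8042/over/there?name=ferret#nose"))
print(split_uri("urn:example:animal:ferret:nose"))
```

For the `urn:` example the authority, query, and fragment come back as `None`, which distinguishes "absent" from "present but empty".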

The following sections provide detailed descriptions of each component.


3.1. Scheme

Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme. Scheme names consist of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-").

scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

Examples:

  • http
  • https
  • ftp
  • mailto
  • file
  • data
  • tel

Scheme names are case-insensitive, but the canonical form is lowercase. Implementations should accept uppercase letters as equivalent to lowercase (for example, "HTTP" as well as "http"), while producers and normalizers SHOULD emit lowercase scheme names only.
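
As a minimal sketch of the rule above, the following function checks a candidate scheme against the ABNF and returns its lowercase form. The name `normalize_scheme` is an assumption made for illustration.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*")


def normalize_scheme(scheme: str) -> str:
    """Validate a scheme name and return its canonical lowercase form."""
    if not SCHEME_RE.fullmatch(scheme):
        raise ValueError(f"not a valid scheme name: {scheme!r}")
    return scheme.lower()


print(normalize_scheme("HTTP"))     # uppercase is accepted and lowercased
print(normalize_scheme("svn+ssh"))  # "+" is allowed after the first letter
```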


3.2. Authority

The authority component consists of an optional user information subcomponent, a host subcomponent, and an optional port subcomponent, preceded by two slashes ("//").

authority = [ userinfo "@" ] host [ ":" port ]

Complete Example:

user:password@www.example.com:8080
\___________/ \_____________/ \__/
   userinfo        host       port
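
The authority grammar can be applied by hand with a few string operations. This is a sketch under stated assumptions: `split_authority` is a hypothetical helper, the sample authority strings are illustrative, and only the port is checked against its ABNF rule.

```python
from typing import Optional, Tuple


def split_authority(authority: str) -> Tuple[Optional[str], str, Optional[str]]:
    """Split an authority into (userinfo, host, port).

    userinfo and port are None when absent; port may be "" when the
    colon is present but no digits follow it.
    """
    userinfo, at, hostport = authority.rpartition("@")
    if not at:
        userinfo = None

    if hostport.startswith("["):
        # IP-literal: colons inside the brackets belong to the host.
        end = hostport.find("]")
        if end == -1:
            raise ValueError("unterminated IP-literal")
        host, rest = hostport[: end + 1], hostport[end + 1 :]
        if rest and not rest.startswith(":"):
            raise ValueError("unexpected characters after IP-literal")
        port = rest[1:] if rest else None
    else:
        # reg-name and IPv4address cannot contain ":", so the first colon
        # starts the port.
        host, colon, port_text = hostport.partition(":")
        port = port_text if colon else None

    if port is not None and not all(c in "0123456789" for c in port):
        raise ValueError(f"port must be decimal digits: {port!r}")
    return userinfo, host, port


print(split_authority("user:password@www.example.com:8080"))
print(split_authority("[2001:db8::1]:443"))
```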

3.2.1. User Information

The userinfo subcomponent may consist of a user name and, optionally, scheme-specific information about how to gain authorization to access the resource.

userinfo = *( unreserved / pct-encoded / sub-delims / ":" )

Examples:

  • user@host
  • user:password@host (deprecated, security risk)
  • anonymous@host

Security Warning: The "user:password" form of userinfo is deprecated. URIs are routinely recorded in server logs and browser history, so a password embedded in a URI is easily exposed.
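
One way to limit that exposure is to drop the userinfo before a URI is logged or displayed. The sketch below uses Python's standard `urllib.parse`; the function name `strip_userinfo` and the sample URI are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_userinfo(uri: str) -> str:
    """Return the URI with any userinfo subcomponent removed."""
    parts = urlsplit(uri)
    # Keep only what follows the last "@" in the authority.
    hostport = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=hostport))


print(strip_userinfo("https://user:secret@example.com:8080/path?x=1#f"))
```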


3.2.2. Host

The host subcomponent can be a registered name (including but not limited to a hostname) or an IP address. IPv6 addresses MUST be enclosed in square brackets.

host = IP-literal / IPv4address / reg-name

IP-literal = "[" ( IPv6address / IPvFuture ) "]"

IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet

reg-name = *( unreserved / pct-encoded / sub-delims )

Host Type Examples:

1. Registered Name

www.example.com
example.org
localhost

2. IPv4 Address

192.0.2.1
127.0.0.1

3. IPv6 Address

[2001:db8::1]
[::1]
[fe80::1]

4. Future IP Version

[v9.abc:def]

Host Normalization: Hostnames are case-insensitive and SHOULD be normalized to lowercase.
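
The four host forms can be told apart mechanically. The following sketch relies on Python's standard `ipaddress` module to check IP addresses; `classify_host` is a hypothetical name, and the reg-name branch is a fallback that does not validate the characters used.

```python
import ipaddress
import re

# IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
IPVFUTURE_RE = re.compile(r"[vV][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


def classify_host(host: str) -> str:
    """Return "IPv6", "IPvFuture", "IPv4", or "reg-name" for a host string."""
    if host.startswith("[") and host.endswith("]"):
        inner = host[1:-1]
        if IPVFUTURE_RE.fullmatch(inner):
            return "IPvFuture"
        ipaddress.IPv6Address(inner)  # raises ValueError if not a valid IPv6 address
        return "IPv6"
    try:
        ipaddress.IPv4Address(host)
        return "IPv4"
    except ValueError:
        # Anything that is not a dotted-decimal IPv4 address is treated
        # as a registered name.
        return "reg-name"


for h in ["www.example.com", "192.0.2.1", "[2001:db8::1]", "[v9.abc:def]"]:
    print(h, "->", classify_host(h))
```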


3.2.3. Port

The port subcomponent is optional. When present, it follows the host after a colon (":") and is written in decimal digits.

port = *DIGIT

Examples:

  • http://example.com:80/ (HTTP default port)
  • https://example.com:443/ (HTTPS default port)
  • http://example.com:8080/ (custom port)

Default Port: If the port is empty or not given, it is assumed to be the default port for the scheme.
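
The default-port rule can be sketched as a lookup. The table below contains only the two defaults listed in the examples above; `DEFAULT_PORTS` and `effective_port` are hypothetical names, and a real implementation would need an entry for every scheme it supports.

```python
from typing import Optional

# Illustrative subset of scheme defaults, taken from the examples above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(scheme: str, port: Optional[str]) -> Optional[int]:
    """Return the port to connect to.

    port is the raw port subcomponent: None when absent, "" when empty.
    Returns None if no port is given and the scheme's default is unknown.
    """
    if port:  # non-empty digit string
        return int(port)
    return DEFAULT_PORTS.get(scheme.lower())


print(effective_port("http", None))    # port not given
print(effective_port("https", ""))     # port empty
print(effective_port("http", "8080"))  # explicit port
```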


3.3. Path

The path component contains data that, along with data in the query component, serves to identify a resource within the scope of the URI's scheme and naming authority. The path is composed of a sequence of path segments separated by a slash ("/") character.

path = path-abempty    ; begins with "/" or is empty
     / path-absolute   ; begins with "/" but not "//"
     / path-noscheme   ; begins with a non-colon segment
     / path-rootless   ; begins with a segment
     / path-empty      ; zero characters

path-abempty = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty = 0<pchar>

segment = *pchar
segment-nz = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )

pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

Path Type Examples:

1. Absolute Path

/path/to/resource
/index.html
/

2. Relative Path

path/to/resource
../parent/resource
resource

3. Empty Path

http://example.com

Path Segments: A path consists of segments separated by slashes. Special segments include:

  • . (current directory)
  • .. (parent directory)
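
What makes these two segments special is that they are removed when a path is normalized or a relative reference is resolved. The sketch below follows the remove_dot_segments procedure described in RFC 3986, Section 5.2.4; the Python function itself is an illustration, not code taken from the specification.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments from a path (RFC 3986, Section 5.2.4)."""
    output = []  # completed segments, each with its leading "/" if it had one
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]  # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]  # "/../" becomes "/" and the last segment is dropped
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            start = 1 if path.startswith("/") else 0
            cut = path.find("/", start)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))
print(remove_dot_segments("mid/content=5/../6"))
```

The two inputs are the examples given in the RFC, which reduce to `/a/g` and `mid/6` respectively.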

3.4. Query

The query component contains non-hierarchical data that, along with data in the path component, serves to identify a resource. The query component is indicated by a question mark ("?").

query = *( pchar / "/" / "?" )

Query String Examples:

?key=value
?name=John&age=30
?search=hello+world
?filter[]=a&filter[]=b
?q=%E4%BD%A0%E5%A5%BD

Common Format: The URI specification does not define a format for the query string, but key=value pairs separated by "&" have become the de facto standard:

?key1=value1&key2=value2&key3=value3
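
Python's standard `urllib.parse` module implements this de facto convention. The sketch below parses the query examples from this section (without the leading "?") and builds one query string; it assumes the form-style rules that `parse_qsl` applies, where "+" decodes to a space and percent-encoded bytes are read as UTF-8.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode

# Ordered list of (key, value) pairs.
print(parse_qsl("name=John&age=30"))

# "+" is decoded as a space under the form-encoding convention.
print(parse_qsl("search=hello+world"))

# Repeated keys are collected into a list by parse_qs.
print(parse_qs("filter[]=a&filter[]=b"))

# Percent-encoded UTF-8 bytes are decoded to text.
print(parse_qsl("q=%E4%BD%A0%E5%A5%BD"))

# Building a query string from pairs.
print(urlencode([("key1", "value1"), ("key2", "value2")]))
```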

3.5. Fragment

The fragment component allows indirect identification of a secondary resource within a representation of the resource identified by the URI. The fragment identifier is indicated by a hash sign ("#").

fragment = *( pchar / "/" / "?" )

Fragment Examples:

#section1
#top
#chapter-3
#line-42

Important Characteristics:

  1. Client-Side Processing: Fragment identifiers are processed by the client and are not sent to the server
  2. In-Document Navigation: Commonly used for anchors in HTML documents
  3. Secondary Resource: Identifies a secondary resource, such as a portion or view of the primary resource

Example:

http://example.com/page.html#section2
  • Server receives: http://example.com/page.html
  • Client navigates to: #section2
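
A client can separate the two parts before making a request. A minimal sketch using `urldefrag` from Python's standard `urllib.parse`:

```python
from urllib.parse import urldefrag

# Split the URI into the part used for retrieval and the fragment,
# which stays on the client.
url, fragment = urldefrag("http://example.com/page.html#section2")

print("retrieve:", url)
print("then locate:", fragment)
```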

URI Components Summary

Component   Prefix   Required   Example             Description
scheme      -        Yes        http                Protocol/scheme
authority   //       No         user@host:port      Authority information
path        -        Yes*       /path/to/resource   Resource path
query       ?        No         key=value           Query parameters
fragment    #        No         section1            Fragment identifier

*Path may be empty


Complete URI Example Breakdown

Example 1: HTTP URL

https://user:pass@www.example.com:8080/path/to/page?key=value#section

Component   Value
scheme      https
userinfo    user:pass
host        www.example.com
port        8080
path        /path/to/page
query       key=value
fragment    section

Example 2: Mailto URI

mailto:[email protected]?subject=Hello

Component   Value
scheme      mailto
path        [email protected]
query       subject=Hello

Example 3: File URI

file:///home/user/document.txt

Component   Value
scheme      file
authority   (empty)
path        /home/user/document.txt
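
The breakdowns for Examples 1 and 3 can be reproduced with `urlsplit` from Python's standard `urllib.parse`. This is a sketch: `breakdown` is a hypothetical helper, and `urlsplit` reports both a missing and an empty authority as an empty string, so the two cases are not distinguished here.

```python
from urllib.parse import urlsplit


def breakdown(uri: str) -> dict:
    """Return the components of a URI as a dictionary."""
    parts = urlsplit(uri)
    userinfo, at, _ = parts.netloc.rpartition("@")
    return {
        "scheme": parts.scheme,
        "userinfo": userinfo if at else None,
        "host": parts.hostname,  # lowercased by urlsplit; None when empty
        "port": parts.port,      # int, or None when absent
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


print(breakdown("https://user:pass@www.example.com:8080/path/to/page?key=value#section"))
print(breakdown("file:///home/user/document.txt"))
```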

Component Encoding Rules

Different components allow different character sets:

Component   Allowed Special Characters                   Must Be Encoded
scheme      + - .                                        None (percent-encoding is not permitted)
userinfo    : ! $ & ' ( ) * + , ; =                      All others
host        - . _ ~ ! $ & ' ( ) * + , ; = (reg-name)     All others
port        0-9 only                                     None (percent-encoding is not permitted)
path        : @ ! $ & ' ( ) * + , ; =                    All others
query       : @ / ? ! $ & ' ( ) * + , ; =                All others
fragment    : @ / ? ! $ & ' ( ) * + , ; =                All others
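
These per-component rules map onto the `safe` parameter of `quote` in Python's standard `urllib.parse`, which always leaves letters, digits, and `_.-~` unencoded. The sketch below is an assumption-laden illustration: `SAFE` and `encode_component` are hypothetical names, and the "segment" entry covers a single path segment, so "/" is encoded there.

```python
from urllib.parse import quote

SUB_DELIMS = "!$&'()*+,;="

# Characters left unencoded per component, in addition to the unreserved set.
SAFE = {
    "userinfo": SUB_DELIMS + ":",
    "segment": SUB_DELIMS + ":@",
    "query": SUB_DELIMS + ":@/?",
    "fragment": SUB_DELIMS + ":@/?",
}


def encode_component(value: str, component: str) -> str:
    """Percent-encode every character not allowed in the given component."""
    return quote(value, safe=SAFE[component])


print(encode_component("a/b c", "segment"))
print(encode_component("a/b c?", "query"))
print(encode_component("user@name", "userinfo"))
```

Note that "&" and "=" are left alone in the query entry because the grammar allows them; data placed inside a key=value pair would still need them encoded to avoid being read as delimiters.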
