1. Introduction
JSON [RFC8259] is a popular representation format for structured data values. JSONPath defines a string syntax for selecting and extracting JSON values from within a given JSON value.
In relation to JSON Pointer [RFC6901], JSONPath is not intended as a replacement but as a more powerful companion. See Appendix C.
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
The grammatical rules in this document are to be interpreted as ABNF, as described in [RFC5234]. ABNF terminal values in this document define Unicode scalar values rather than their UTF-8 encoding. For example, the Unicode PLACE OF INTEREST SIGN (U+2318) would be defined in ABNF as %x2318.
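The difference between a Unicode scalar value and its UTF-8 encoding can be illustrated with a short sketch. Python is used here only as an illustration language; it is not part of this document's grammar.

```python
# U+2318 PLACE OF INTEREST SIGN as a one-character string.
sign = chr(0x2318)

# ABNF %x2318 denotes this single scalar value ...
scalar_values = [ord(ch) for ch in sign]

# ... not the three bytes that encode it in UTF-8.
utf8_bytes = sign.encode("utf-8")

print([hex(v) for v in scalar_values])  # ['0x2318']
print(utf8_bytes.hex(" "))              # e2 8c 98
```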
Functions are referred to using the function name followed by a pair of parentheses, as in fname().
The terminology of [RFC8259] applies except where clarified below. The terms "primitive" and "structured" are used to group different kinds of values as in Section 1 of [RFC8259]. JSON objects and arrays are structured; all other values are primitive. Definitions for "object", "array", "number", and "string" remain unchanged. Importantly, "object" and "array" in particular do not take on a generic meaning, such as they would in a general programming context.
The terminology of [RFC9485] applies.
Additional terms used in this document are defined below.
Value: As per [RFC8259], a data item conforming to the generic data model of JSON, i.e., primitive data (numbers, text strings, and the special values null, true, and false), or structured data (JSON objects and arrays). [RFC8259] focuses on the textual representation of JSON values and does not fully define the value abstraction assumed here.
Member: A name/value pair in an object. (A member is not itself a value.)
Name: The name (a string) in a name/value pair constituting a member. This is also used in [RFC8259], but that specification does not formally define it. It is included here for completeness.
Element: A value in a JSON array.
Index: An integer that identifies a specific element in an array.
Query: Short name for a JSONPath expression.
Query Argument: Short name for the value a JSONPath expression is applied to.
Location: The position of a value within the query argument. This can be thought of as a sequence of names and indexes navigating to the value through the objects and arrays in the query argument, with the empty sequence indicating the query argument itself. A location can be represented as a Normalized Path (defined below).
Node: The pair of a value along with its location within the query argument.
Root Node: The unique node whose value is the entire query argument.
Root Node Identifier: The expression $, which refers to the root node of the query argument.
Current Node Identifier: The expression @, which refers to the current node in the context of the evaluation of a filter expression (described later).
Children (of a node): If the node is an array, the nodes of its elements; if the node is an object, the nodes of its member values. If the node is neither an array nor an object, it has no children.
Descendants (of a node): The children of the node, together with the children of its children, and so forth recursively. More formally, the "descendants" relation between nodes is the transitive closure of the "children" relation.
Depth (of a descendant node within a value): The number of ancestors of the node within the value. The root node of the value has depth zero, the children of the root node have depth one, their children have depth two, and so forth.
Nodelist: A list of nodes. While a nodelist can be represented in JSON, e.g., as an array, this document does not require or assume any particular representation.
Parameter: Formal parameter (of a function) that can take a function argument (an actual parameter) in a function expression.
Normalized Path: A form of JSONPath expression that identifies a node in a value by providing a query that results in exactly that node. Each node in a query argument is identified by exactly one Normalized Path (we say that the Normalized Path is "unique" for that node), and to be a Normalized Path for a specific query argument, the Normalized Path needs to identify exactly one node. This is similar to, but syntactically different from, a JSON Pointer [RFC6901]. Note: This definition is based on the syntactical definition in Section 2.7; JSONPath expressions that identify a node in a value but do not conform to that syntax are not Normalized Paths.
Unicode Scalar Value: Any Unicode [UNICODE] code point except high-surrogate and low-surrogate code points (in other words, integers in the inclusive base 16 ranges, either 0 to D7FF or E000 to 10FFFF). JSONPath queries are sequences of Unicode scalar values.
Segment: One of the constructs that selects children ([<selectors>]) or descendants (..[<selectors>]) of an input value.
Selector: A single item within a segment that takes the input value and produces a nodelist consisting of child nodes of the input value.
Singular Query: A JSONPath expression built from segments that have been syntactically restricted in a certain way (Section 2.3.5.1) so that, regardless of the input value, the expression produces a nodelist containing at most one node. Note: JSONPath expressions that always produce a singular nodelist but do not conform to the syntax in Section 2.3.5.1 are not singular queries.
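The "Unicode Scalar Value" definition above amounts to a range check on code points. A minimal Python sketch follows; the function name is_unicode_scalar_value is hypothetical and is not defined by this document.

```python
def is_unicode_scalar_value(code_point: int) -> bool:
    """True if code_point is in 0..D7FF or E000..10FFFF (inclusive, base 16)."""
    return 0x0000 <= code_point <= 0xD7FF or 0xE000 <= code_point <= 0x10FFFF


print(is_unicode_scalar_value(0x2318))    # True: PLACE OF INTEREST SIGN
print(is_unicode_scalar_value(0xD800))    # False: high-surrogate code point
print(is_unicode_scalar_value(0x10FFFF))  # True: last code point
```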
1.1.1. JSON Values as Trees of Nodes
This document models the query argument as a tree of JSON values, each with its own node. A node is either the root node or one of its descendants.
This document models the result of applying a query to the query argument as a nodelist (a list of nodes).
Nodes are the selectable parts of the query argument. The only parts of an object that can be selected by a query are the member values. Member names and members (name/value pairs) cannot be selected. Thus, member values have nodes, but members and member names do not. Similarly, member values are children of an object, but members and member names are not.
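The tree model can be sketched in Python under a few assumptions: a node is represented as a (location, value) tuple, a location is a tuple of names and indexes, and JSON objects and arrays are represented as dict and list. The names Location, Node, children, and descendants are illustrative only, and the visiting order shown is one possible choice, not a requirement.

```python
from typing import Any, Iterator, Tuple, Union

Location = Tuple[Union[str, int], ...]  # names and indexes; () is the root
Node = Tuple[Location, Any]             # a value together with its location


def children(node: Node) -> Iterator[Node]:
    """Yield the nodes of member values (object) or elements (array)."""
    location, value = node
    if isinstance(value, dict):
        for name, member_value in value.items():
            yield location + (name,), member_value
    elif isinstance(value, list):
        for index, element in enumerate(value):
            yield location + (index,), element
    # Primitive values have no children, so nothing is yielded.


def descendants(node: Node) -> Iterator[Node]:
    """Yield children, their children, and so on (transitive closure)."""
    for child in children(node):
        yield child
        yield from descendants(child)


query_argument = {"a": [10, {"b": None}]}
root: Node = ((), query_argument)

for location, value in descendants(root):
    # The depth of a node equals the length of its location.
    print(len(location), location, value)
```

Member names appear only inside locations; they never occur as node values, which mirrors the rule that names and members cannot be selected.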
1.2. History
This document is based on Stefan Gössner's popular JSONPath proposal (dated 2007-02-21) [JSONPath-orig], builds on the experience from the widespread deployment of its implementations, and provides a normative specification for it.
Appendix B describes how JSONPath was inspired by XML's XPath [XPath].
JSONPath was intended as a lightweight companion to JSON implementations in programming languages such as PHP and JavaScript, so instead of defining its own expression language, like XPath did, JSONPath delegated parts of a query to the underlying runtime, e.g., JavaScript's eval() function. As JSONPath was implemented in more environments, JSONPath expressions became decreasingly portable. For example, regular expression processing was often delegated to a convenient regular expression engine.
This document aims to remove such implementation-specific dependencies and serve as a common JSONPath specification that can be used across programming languages and environments. This means that backwards compatibility is not always achieved; a design principle of this document is to go with a "consensus" between implementations even if it is rough, as long as that does not jeopardize the objective of obtaining a usable, stable JSON query language.
The term JSONPath was chosen because of the XPath inspiration and also because the outcome of a query consists of paths identifying nodes in the JSON query argument.
1.3. JSON Values
The JSON value a JSONPath query is applied to is, by definition, a valid JSON value. A JSON value is often constructed by parsing a JSON text.
The parsing of a JSON text into a JSON value and what happens if a JSON text does not represent valid JSON are not defined by this document. Sections 4 and 8 of [RFC8259] identify specific situations that may conform to the grammar for JSON texts but are not interoperable uses of JSON, as they may cause unpredictable behavior. This document does not attempt to define predictable behavior for JSONPath queries in these situations.
Specifically, the "Semantics" subsections of Sections 2.3.1, 2.3.2, 2.3.5, and 2.5.2 describe behavior that becomes unpredictable when the JSON value for one of the objects under consideration was constructed out of JSON text that exhibits multiple members for a single object that share the same member name ("duplicate names"; see Section 4 of [RFC8259]). Also, when selecting a child by name (Section 2.3.1) and comparing strings (Section 2.3.5.2.2), it is assumed these strings are sequences of Unicode scalar values; the behavior becomes unpredictable if they are not (Section 8.2 of [RFC8259]).
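Because behavior with duplicate names is left unpredictable, an application may prefer to reject such JSON texts before evaluating any query. The following sketch shows one way to do so with Python's standard json module; the helper name reject_duplicate_names is hypothetical, and this check is a possible application-level precaution rather than something this document requires.

```python
import json


def reject_duplicate_names(pairs):
    """object_pairs_hook that raises if an object repeats a member name."""
    obj = {}
    for name, value in pairs:
        if name in obj:
            raise ValueError(f"duplicate member name: {name!r}")
        obj[name] = value
    return obj


text = '{"a": 1, "a": 2}'

# Python's default parser silently keeps the last value for a repeated name.
print(json.loads(text))  # {'a': 2}

# A stricter parse rejects the text instead.
try:
    json.loads(text, object_pairs_hook=reject_duplicate_names)
except ValueError as err:
    print("rejected:", err)
```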
1.4. Overview of JSONPath Expressions
A JSONPath expression is applied to a JSON value, known as the query argument. The output is a nodelist.
A JSONPath expression consists of an identifier followed by a series of zero or more segments, each of which contains one or more selectors.
1.4.1. Identifiers
The root node identifier $ refers to the root node of the query argument, i.e., to the argument as a whole.
The current node identifier @ refers to the current node in the context of the evaluation of a filter expression (Section 2.3.5).
1.4.2. Segments
Segments select children ([<selectors>]) or descendants (..[<selectors>]) of an input value.
Segments can use bracket notation, for example:
$['store']['book'][0]['title']
or the more compact dot notation, for example:
$.store.book[0].title
Bracket notation contains one or more (comma-separated) selectors of any kind. Selectors are detailed in the next section.
A JSONPath expression may use a combination of bracket and dot notations.
This document treats the bracket notations as canonical and defines the shorthand dot notation in terms of bracket notation. Examples and descriptions use shorthand where convenient.
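As an informal illustration of how the dot shorthand maps onto bracket notation, the following Python sketch rewrites the two child shorthands textually. The function name to_bracket_notation is hypothetical. The sketch accepts only ASCII letters, digits, and underscores in names, leaves the descendant shorthands untouched, and does not understand string literals, so it is not a parser for the full syntax.

```python
import re

# A single dot (not preceded by another dot) followed by a simple name.
_NAME = re.compile(r"(?<!\.)\.([A-Za-z_][A-Za-z0-9_]*)")
# A single dot (not preceded by another dot) followed by the wildcard.
_WILDCARD = re.compile(r"(?<!\.)\.\*")


def to_bracket_notation(query: str) -> str:
    """Rewrite .name as ['name'] and .* as [*] (simplified sketch)."""
    query = _NAME.sub(lambda m: f"['{m.group(1)}']", query)
    return _WILDCARD.sub("[*]", query)


print(to_bracket_notation("$.store.book[0].title"))
# $['store']['book'][0]['title']
print(to_bracket_notation("$.store.*"))
# $['store'][*]
```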
1.4.3. Selectors
A name selector, e.g., 'name', selects a named child of an object.
An index selector, e.g., 3, selects an indexed child of an array.
In the expression [*], a wildcard * (Section 2.3.2) selects all children of a node, and in the expression ..[*], it selects all descendants of a node.
An array slice start:end:step (Section 2.3.4) selects a series of elements from an array, giving a start position, an end position, and an optional step value that moves the position from the start to the end.
A filter expression ?<logical-expr> selects certain children of an object or array, as in:
$.store.book[?@.price < 10].title
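The selectors introduced above can be sketched as small Python functions. Several simplifications are assumed: the functions return values rather than nodes, their names are hypothetical, the filter is modeled as a Python callable instead of a parsed logical expression, and slicing borrows Python's own slice behavior, with a zero step treated as selecting nothing. The normative semantics are those given in Section 2.3.

```python
def name_selector(value, name):
    """Select the named child of an object, if present."""
    return [value[name]] if isinstance(value, dict) and name in value else []


def index_selector(value, index):
    """Select the indexed child of an array (counting from 0)."""
    if isinstance(value, list) and -len(value) <= index < len(value):
        return [value[index]]
    return []


def wildcard_selector(value):
    """Select all children of an object or array."""
    if isinstance(value, dict):
        return list(value.values())
    if isinstance(value, list):
        return list(value)
    return []


def slice_selector(value, start=None, end=None, step=1):
    """Select a series of array elements; assumes step 0 selects nothing."""
    if not isinstance(value, list) or step == 0:
        return []
    return value[start:end:step]


def filter_selector(value, predicate):
    """Select the children for which the (stand-in) predicate holds."""
    return [child for child in wildcard_selector(value) if predicate(child)]


books = [{"title": "A", "price": 8.95}, {"title": "B", "price": 12.99}]

# Rough counterpart of [?@.price < 10] followed by .title
cheap = filter_selector(
    books, lambda b: isinstance(b, dict) and "price" in b and b["price"] < 10
)
titles = [t for book in cheap for t in name_selector(book, "title")]
print(titles)  # ['A']
```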
1.4.4. Summary
Table 1 provides a brief overview of JSONPath syntax.
| Syntax Element | Description |
|---|---|
| $ | root node identifier (Section 2.2) |
| @ | current node identifier (Section 2.3.5) (valid only within filter selectors) |
| [<selectors>] | child segment (Section 2.5.1): selects zero or more children of a node |
| .name | shorthand for ['name'] |
| .* | shorthand for [*] |
| ..[<selectors>] | descendant segment (Section 2.5.2): selects zero or more descendants of a node |
| ..name | shorthand for ..['name'] |
| ..* | shorthand for ..[*] |
| 'name' | name selector (Section 2.3.1): selects a named child of an object |
| * | wildcard selector (Section 2.3.2): selects all children of a node |
| 3 | index selector (Section 2.3.3): selects an indexed child of an array (from 0) |
| 0:100:5 | array slice selector (Section 2.3.4): start:end:step for arrays |
| ?<logical-expr> | filter selector (Section 2.3.5): selects particular children using a logical expression |
| length(@.foo) | function extension (Section 2.4): invokes a function in a filter expression |
Table 1: Overview of JSONPath Syntax
1.5. JSONPath Examples
This section is informative. It provides examples of JSONPath expressions.
The examples are based on the simple JSON value shown in Figure 1, representing a bookstore (which also has a bicycle).
{ "store": {
    "book": [
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 399
    }
  }
}
Figure 1: Example JSON Value
Table 2 shows some JSONPath queries that might be applied to this example and their intended results.
| JSONPath | Intended Result |
|---|---|
| $.store.book[*].author | the authors of all books in the store |
| $..author | all authors |
| $.store.* | all things in the store, which are some books and a red bicycle |
| $.store..price | the prices of everything in the store |
| $..book[2] | the third book |
| $..book[2].author | the third book's author |
| $..book[2].publisher | empty result: the third book does not have a "publisher" member |
| $..book[-1] | the last book in order |
| $..book[0,1] | the first two books |
| $..book[:2] | the first two books |
| $..book[?@.isbn] | all books with an ISBN number |
| $..book[?@.price<10] | all books cheaper than 10 |
| $..* | all member values and array elements contained in the input value |
Table 2: Example JSONPath Expressions and Their Intended Results When Applied to the Example JSON Value
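To make some of the intended results in Table 2 concrete, the following Python sketch computes them by hand against the value in Figure 1. It is not a JSONPath implementation: each query is transcribed manually into ordinary Python, the helper names descendant_values and descendant_members are hypothetical, and the order in which descendants are visited is one possible choice.

```python
doc = {"store": {
    "book": [
        {"category": "reference", "author": "Nigel Rees",
         "title": "Sayings of the Century", "price": 8.95},
        {"category": "fiction", "author": "Evelyn Waugh",
         "title": "Sword of Honour", "price": 12.99},
        {"category": "fiction", "author": "Herman Melville",
         "title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
        {"category": "fiction", "author": "J. R. R. Tolkien",
         "title": "The Lord of the Rings", "isbn": "0-395-19395-8",
         "price": 22.99},
    ],
    "bicycle": {"color": "red", "price": 399},
}}


def descendant_values(value):
    """Yield every member value and array element nested inside value."""
    if isinstance(value, dict):
        kids = list(value.values())
    elif isinstance(value, list):
        kids = value
    else:
        kids = []
    for kid in kids:
        yield kid
        yield from descendant_values(kid)


def descendant_members(value, name):
    """Hand-written counterpart of ..name applied to value."""
    return [v[name] for v in [value, *descendant_values(value)]
            if isinstance(v, dict) and name in v]


# $.store.book[*].author
authors = [book["author"] for book in doc["store"]["book"]]
# $..author
all_authors = descendant_members(doc, "author")
# $.store..price
prices = descendant_members(doc["store"], "price")
# $..book[2]
book_arrays = [b for b in descendant_members(doc, "book")
               if isinstance(b, list)]
third = [arr[2] for arr in book_arrays if len(arr) > 2]
# $..book[?@.isbn]
with_isbn = [b for arr in book_arrays for b in arr
             if isinstance(b, dict) and "isbn" in b]
# $..book[?@.price<10]
cheap = [b for arr in book_arrays for b in arr
         if isinstance(b, dict)
         and isinstance(b.get("price"), (int, float)) and b["price"] < 10]

print(authors)
print(prices)
print([b["title"] for b in third])
print([b["title"] for b in with_isbn])
print([b["title"] for b in cheap])
```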