2.4. Function Extensions
Beyond the filter expression functionality defined in the preceding subsections, JSONPath defines an extension point that can be used to add filter expression functionality: "Function Extensions".
This section defines the extension point and some function extensions that use this extension point. While these mechanisms are designed to use the extension point, they are an integral part of the JSONPath specification and are expected to be implemented like any other integral part of this specification.
A function extension defines a registered name (see Section 3.2) that can be applied to a sequence of zero or more arguments, producing a result. Each registered function name is unique.
A function extension MUST be defined such that its evaluation is free of side effects, i.e., all possible orders of evaluation and choices of short-circuiting or full evaluation of an expression containing it MUST lead to the same result. (Note: Memoization or logging are not side effects in this sense as they are visible at the implementation level only -- they do not influence the result of the evaluation.)
function-name       = function-name-first *function-name-char
function-name-first = LCALPHA
function-name-char  = function-name-first / "_" / DIGIT
LCALPHA             = %x61-7A  ; "a".."z"
function-expr       = function-name "(" S [function-argument
                         *(S "," S function-argument)] S ")"
function-argument   = literal /
                      filter-query / ; (includes singular-query)
                      logical-expr /
                      function-expr
Any function expressions in a query must be well-formed (by conforming to the above ABNF) and well-typed; otherwise, the JSONPath implementation MUST raise an error (see Section 2.1). To define which function expressions are well-typed, a type system is first introduced.
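As an illustration of the function-name rule in the ABNF above, the following minimal Python sketch checks whether a string is a syntactically valid function name. The helper name is_function_name is hypothetical and not part of this specification; the sketch covers only the function-name production, not the full function-expr grammar.

```python
import re

# Mirrors the ABNF: one lowercase ASCII letter (LCALPHA), followed by
# any number of lowercase ASCII letters, "_" or digits.
FUNCTION_NAME = re.compile(r"[a-z][a-z_0-9]*")


def is_function_name(name: str) -> bool:
    """Return True if `name` conforms to the function-name production."""
    # fullmatch requires the whole string to match, not just a prefix.
    return FUNCTION_NAME.fullmatch(name) is not None
```

For example, "length" and "a_1" would be accepted, while "Length", "_x", and "1a" would be rejected.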
2.4.1. Type System for Function Expressions
Each parameter and the result of a function extension must have a declared type.
Declared types enable checking a JSONPath query for well-typedness independent of any query argument the JSONPath query is applied to.
Table 13 defines the available types in terms of the instances they contain.
| Type | Instances |
|---|---|
| ValueType | JSON values or Nothing |
| LogicalType | LogicalTrue or LogicalFalse |
| NodesType | Nodelists |
Table 13: Function Extension Type System
Notes:
- The only instances that can be directly represented in JSONPath syntax are certain JSON values in ValueType expressed as literals (which, in JSONPath, are limited to primitive values).
- The special result Nothing represents the absence of a JSON value and is distinct from any JSON value, including null.
- LogicalTrue and LogicalFalse are unrelated to the JSON values expressed by the literals true and false.
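One possible way to model these types in an implementation is sketched below in Python. All names (NOTHING, Logical, the representation of a nodelist as a list of pairs) are assumptions made for illustration; the specification does not prescribe any particular representation.

```python
from enum import Enum


class _NothingType:
    """Hypothetical sentinel for the special result Nothing."""

    def __repr__(self) -> str:
        return "Nothing"


# A single shared instance; it is distinct from None, which is used
# here to represent the JSON value null.
NOTHING = _NothingType()


class Logical(Enum):
    """LogicalType instances, kept separate from the JSON values true/false."""

    TRUE = "LogicalTrue"
    FALSE = "LogicalFalse"


# ValueType: any JSON value (None, bool, int, float, str, list, dict) or NOTHING.
# NodesType: a nodelist, modeled here as a list of (location, value) pairs.
example_nodelist = [("$['a']", 1), ("$['b']", None)]
```

Using a dedicated sentinel and an enumeration, rather than None and bool, keeps Nothing distinct from null and keeps LogicalTrue and LogicalFalse distinct from the JSON literals true and false.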
2.4.2. Type Conversion
Just as queries can be used in logical expressions by testing for the existence of at least one node (Section 2.3.5.2.1), a function expression of declared type NodesType can be used as a function argument for a parameter of declared type LogicalType, with the equivalent conversion rule:
- If the nodelist contains one or more nodes, the conversion result is LogicalTrue.
- If the nodelist is empty, the conversion result is LogicalFalse.
Notes:
- Extraction of a value from a nodelist can be performed in several ways, so an implicit conversion from NodesType to ValueType may be surprising and has therefore not been defined.
- A function expression with a declared type of NodesType can indirectly be used as an argument for a parameter of declared type ValueType by wrapping the expression in a call to a function extension, such as value() (see Section 2.4.8), that takes a parameter of type NodesType and returns a result of type ValueType.
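The NodesType-to-LogicalType conversion rule above can be sketched as follows. The Logical enumeration and the function name nodes_to_logical are hypothetical, and a nodelist is assumed to be a Python list with one entry per node.

```python
from enum import Enum


class Logical(Enum):
    """Illustrative stand-in for the two LogicalType instances."""

    TRUE = "LogicalTrue"
    FALSE = "LogicalFalse"


def nodes_to_logical(nodelist: list) -> Logical:
    """Convert a nodelist to LogicalType by testing for at least one node."""
    # Only the number of nodes matters; node values are never inspected.
    return Logical.TRUE if len(nodelist) > 0 else Logical.FALSE
```

Note that a nodelist containing a single node whose value is null or false still converts to LogicalTrue, since the conversion is an existence test.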
The well-typedness of function expressions can now be defined in terms of this type system.
2.4.3. Well-Typedness of Function Expressions
For a function expression to be well-typed:
- Its declared type must be well-typed in the context in which it occurs.
  As per the grammar, a function expression can occur in three different immediate contexts, which lead to the following conditions for well-typedness:
  - As a test-expr in a logical expression: The function's declared result type is LogicalType or (giving rise to conversion as per Section 2.4.2) NodesType.
  - As a comparable in a comparison: The function's declared result type is ValueType.
  - As a function-argument in another function expression: The function's declared result type fulfills the following rules for the corresponding parameter of the enclosing function.
- Its arguments must be well-typed for the declared type of the corresponding parameters.
  The arguments of the function expression are well-typed when each argument of the function can be used for the declared type of the corresponding parameter, according to one of the following conditions:
  - When the argument is a function expression with the same declared result type as the declared type of the parameter.
  - When the declared type of the parameter is LogicalType and the argument is one of the following:
    - A function expression with declared result type NodesType. In this case, the argument is converted to LogicalType as per Section 2.4.2.
    - A logical-expr that is not a function expression.
  - When the declared type of the parameter is NodesType and the argument is a query (which includes singular query).
  - When the declared type of the parameter is ValueType and the argument is one of the following:
    - A value expressed as a literal.
    - A singular query. In this case:
      o If the query results in a nodelist consisting of a single node, the argument is the value of the node.
      o If the query results in an empty nodelist, the argument is the special result Nothing.
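The argument rules above could be checked along the following lines. This is a minimal sketch under stated assumptions: the argument kinds ("literal", "singular_query", and so on) are hypothetical labels that a parser would assign, and a query is accepted for a LogicalType parameter on the reading that the grammar derives filter-query from logical-expr.

```python
from typing import Optional

VALUE, LOGICAL, NODES = "ValueType", "LogicalType", "NodesType"

# Hypothetical argument kinds assigned by a parser (illustrative names):
#   "literal", "singular_query", "non_singular_query",
#   "logical_expr" (a logical-expr that is neither a query nor a
#   function expression), and "function" (a function expression, whose
#   declared result type is passed separately as `result_type`).
QUERY_KINDS = {"singular_query", "non_singular_query"}


def argument_is_well_typed(param_type: str, arg_kind: str,
                           result_type: Optional[str] = None) -> bool:
    """Check one argument against the declared type of its parameter."""
    if arg_kind == "function":
        # Same declared type, or NodesType converted to LogicalType
        # as per Section 2.4.2.
        return result_type == param_type or (
            param_type == LOGICAL and result_type == NODES)
    if param_type == LOGICAL:
        # Assumption: a query counts as a logical-expr (existence test).
        return arg_kind == "logical_expr" or arg_kind in QUERY_KINDS
    if param_type == NODES:
        return arg_kind in QUERY_KINDS
    if param_type == VALUE:
        return arg_kind in ("literal", "singular_query")
    return False
```

For instance, a literal argument is accepted for a ValueType parameter but rejected for a NodesType or LogicalType parameter, and a non-singular query is rejected for a ValueType parameter.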
2.4.4. length() Function Extension
Parameters: 1. ValueType
Result: ValueType (unsigned integer or Nothing)
The length() function extension provides a way to compute the length of a value and make that available for further processing in the filter expression:
$[?length(@.authors) >= 5]
Its only argument is an instance of ValueType (possibly taken from a singular query, as in the example above). The result is also an instance of ValueType: an unsigned integer or the special result Nothing.
- If the argument value is a string, the result is the number of Unicode scalar values in the string.
- If the argument value is an array, the result is the number of elements in the array.
- If the argument value is an object, the result is the number of members in the object.
- For any other argument value, the result is the special result Nothing.
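The four cases above can be sketched in Python as follows. The sentinel NOTHING is a hypothetical representation of the special result Nothing, and JSON values are assumed to be represented as in Python's json module (str, list, dict, numbers, bool, None). Python's len() counts code points, which coincides with the number of Unicode scalar values for strings that contain no lone surrogates.

```python
# Hypothetical sentinel standing in for the special result Nothing.
NOTHING = object()


def length(value):
    """Sketch of length(): ValueType in, unsigned integer or Nothing out."""
    if isinstance(value, str):
        # Code points, not UTF-16 code units or UTF-8 bytes.
        return len(value)
    if isinstance(value, list):
        return len(value)  # number of array elements
    if isinstance(value, dict):
        return len(value)  # number of object members
    # Numbers, true, false, null, and Nothing itself all yield Nothing.
    return NOTHING
```

A character outside the Basic Multilingual Plane, such as U+1F600, therefore contributes 1 to the length of a string, not 2.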
2.4.5. count() Function Extension
Parameters: 1. NodesType
Result: ValueType (unsigned integer)
The count() function extension provides a way to obtain the number of nodes in a nodelist and make that available for further processing in the filter expression:
$[?count(@.*.author) >= 5]
Its only argument is a nodelist. The result is a value (an unsigned integer) that gives the number of nodes in the nodelist.
Notes:
- There is no deduplication of the nodelist.
- The number of nodes in the nodelist is counted independent of their values or any children they may have, e.g., the count of a non-empty singular nodelist such as count(@) is always 1.
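A minimal sketch of count() follows, assuming a nodelist is represented as a Python list of (location, value) pairs; that representation is an assumption made for illustration only.

```python
def count(nodelist: list) -> int:
    """Sketch of count(): number of nodes in the nodelist."""
    # Duplicates are not removed, and node values are not inspected.
    return len(nodelist)


# The same node appearing twice is counted twice.
duplicated = [("$['a']", [1, 2, 3]), ("$['a']", [1, 2, 3])]
```

Here count(duplicated) would be 2, even though both entries refer to the same node and that node has three children.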
2.4.6. match() Function Extension
Parameters: 1. ValueType (string)
2. ValueType (string conforming to [RFC9485])
Result: LogicalType
The match() function extension provides a way to check whether (the entirety of; see Section 2.4.7) a given string matches a given regular expression, which is in the form described in [RFC9485].
$[?match(@.date, "1974-05-..")]
Its arguments are instances of ValueType (possibly taken from a singular query, as for the first argument in the example above). If the first argument is not a string or the second argument is not a string conforming to [RFC9485], the result is LogicalFalse. Otherwise, the string that is the first argument is matched against the I-Regexp contained in the string that is the second argument; the result is LogicalTrue if the string matches the I-Regexp and is LogicalFalse otherwise.
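The behavior described above can be approximated in Python as follows. This is only a sketch: Python's re module is used as a stand-in for an I-Regexp engine, and its syntax differs from [RFC9485] (for example, re does not support \p{...} character class escapes), so treating re.error as "not conforming" is an approximation rather than a real conformance check. A Python bool stands in for LogicalTrue and LogicalFalse.

```python
import re


def match(value, pattern) -> bool:
    """Sketch of match(): True only if the entire string matches."""
    if not isinstance(value, str) or not isinstance(pattern, str):
        return False  # non-string arguments yield LogicalFalse
    try:
        # fullmatch anchors the pattern at both ends of the string.
        return re.fullmatch(pattern, value) is not None
    except re.error:
        # Approximation: an uncompilable pattern is treated as not
        # conforming to RFC 9485.
        return False
```

With the example pattern "1974-05-..", the string "1974-05-11" matches, while "1974-05-111" does not, because the whole string must match.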
2.4.7. search() Function Extension
Parameters: 1. ValueType (string)
2. ValueType (string conforming to [RFC9485])
Result: LogicalType
The search() function extension provides a way to check whether a given string contains a substring that matches a given regular expression, which is in the form described in [RFC9485].
$[?search(@.author, "[BR]ob")]
Its arguments are instances of ValueType (possibly taken from a singular query, as for the first argument in the example above). If the first argument is not a string or the second argument is not a string conforming to [RFC9485], the result is LogicalFalse. Otherwise, the string that is the first argument is searched for a substring that matches the I-Regexp contained in the string that is the second argument; the result is LogicalTrue if at least one such substring exists and is LogicalFalse otherwise.
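A corresponding sketch for search() is shown below, under the same assumptions: Python's re module stands in for an I-Regexp engine (its syntax is not identical to [RFC9485]), re.error is treated as an approximation of a nonconforming pattern, and a Python bool stands in for LogicalTrue and LogicalFalse.

```python
import re


def search(value, pattern) -> bool:
    """Sketch of search(): True if any substring matches the pattern."""
    if not isinstance(value, str) or not isinstance(pattern, str):
        return False  # non-string arguments yield LogicalFalse
    try:
        # re.search scans the string for the first matching substring.
        return re.search(pattern, value) is not None
    except re.error:
        # Approximation: an uncompilable pattern is treated as not
        # conforming to RFC 9485.
        return False
```

With the example pattern "[BR]ob", both "Bob Smith" and "Robert" are accepted because each contains a matching substring, whereas "bob" is not, since matching is case-sensitive.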
2.4.8. value() Function Extension
Parameters: 1. NodesType
Result: ValueType
The value() function extension provides a way to convert an instance of NodesType to a value and make that available for further processing in the filter expression:
$[?value(@..color) == "red"]
Its only argument is an instance of NodesType (possibly taken from a filter-query, as in the example above). The result is an instance of ValueType.
- If the argument contains a single node, the result is the value of the node.
- If the argument is the empty nodelist or contains multiple nodes, the result is Nothing.
Note: A singular query may be used anywhere where a ValueType is expected, so there is no need to use the value() function extension with a singular query.
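The two cases of value() can be sketched as follows. The sentinel NOTHING and the representation of a nodelist as a list of (location, value) pairs are assumptions made for illustration.

```python
# Hypothetical sentinel standing in for the special result Nothing.
NOTHING = object()


def value(nodelist: list):
    """Sketch of value(): the single node's value, otherwise Nothing."""
    if len(nodelist) == 1:
        _location, node_value = nodelist[0]
        return node_value
    # Empty nodelist or more than one node.
    return NOTHING
```

A single node whose value is null yields null (None in this sketch), which remains distinct from Nothing.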
2.4.9. Examples
| Query | Comment |
|---|---|
| $[?length(@) < 3] | well-typed |
| $[?length(@.*) < 3] | not well-typed since @.* is a non-singular query |
| $[?count(@.*) == 1] | well-typed |
| $[?count(1) == 1] | not well-typed since 1 is not a query or function expression |
| $[?count(foo(@.*)) == 1] | well-typed, where foo() is a function extension with a parameter of type NodesType and result type NodesType |
| $[?match(@.timezone, 'Europe/.*')] | well-typed |
| $[?match(@.timezone, 'Europe/.*') == true] | not well-typed as LogicalType may not be used in comparisons |
| $[?value(@..color) == "red"] | well-typed |
| $[?value(@..color)] | not well-typed as ValueType may not be used in a test expression |
| $[?bar(@.a)] | well-typed for any function bar() with a parameter of any declared type and result type LogicalType |
| $[?bnl(@.*)] | well-typed for any function bnl() with a parameter of declared type NodesType or LogicalType and result type LogicalType |
| $[?blt(1==1)] | well-typed, where blt() is a function with a parameter of declared type LogicalType and result type LogicalType |
| $[?blt(1)] | not well-typed for the same function blt(), as 1 is not a query, logical-expr, or function expression |
| $[?bal(1)] | well-typed, where bal() is a function with a parameter of declared type ValueType and result type LogicalType |
Table 14: Function Expression Examples