JSON Parser

The JSON parser parses incoming JSON data. To use it, set engine to json and use a JSON path in your schema’s source.

To parse newline-delimited JSON data, set the engine to ndjson. Each line will be treated as a separate JSON document.
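To illustrate what "each line is a separate JSON document" means, here is a minimal Python sketch. It is not Optimus's implementation; the function name `parse_ndjson` is hypothetical, and skipping blank lines is an assumption.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one independent document per non-empty line."""
    documents = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored rather than rejected
        documents.append(json.loads(line))
    return documents


sample = '{"name": "Alice", "age": 29}\n{"name": "Bob", "age": 34}\n'
documents = parse_ndjson(sample)
print(documents)
```

Each parsed document is then matched against the schema on its own, which is why the ndjson output is a list of objects rather than a single object.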

JSON Path Syntax

Code Description
. The dot operator denotes a child element of the current element, e.g. parent.child.
* Wildcard matching zero or more characters, e.g. foo* matches foo as well as foobar.
? Wildcard matching exactly one character, e.g. fo? matches foo but not foobar.
\ Escapes special characters such as ., * and ?.
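The operators above can be sketched as a small resolver in Python. This is only an illustration of the described semantics, not the code used by Optimus or GJSON: the names `split_path`, `segment_to_regex` and `resolve` are hypothetical, only objects (not arrays) are traversed, and returning the first matching key for a wildcard is an assumption.

```python
import re


def split_path(path):
    """Split a path on unescaped dots, keeping escape sequences intact."""
    parts, current, i = [], "", 0
    while i < len(path):
        ch = path[i]
        if ch == "\\" and i + 1 < len(path):
            current += path[i:i + 2]  # keep the backslash for key matching
            i += 2
        elif ch == ".":
            parts.append(current)
            current = ""
            i += 1
        else:
            current += ch
            i += 1
    parts.append(current)
    return parts


def segment_to_regex(segment):
    """Translate one path segment into a regex: * is any run, ? is one character."""
    out, i = "", 0
    while i < len(segment):
        ch = segment[i]
        if ch == "\\" and i + 1 < len(segment):
            out += re.escape(segment[i + 1])  # escaped character is literal
            i += 2
            continue
        if ch == "*":
            out += ".*"
        elif ch == "?":
            out += "."
        else:
            out += re.escape(ch)
        i += 1
    return re.compile(out, re.DOTALL)


def resolve(document, path):
    """Walk the document one segment at a time; return None if nothing matches."""
    node = document
    for segment in split_path(path):
        if not isinstance(node, dict):
            return None
        pattern = segment_to_regex(segment)
        # Assumption: the first key that fully matches the pattern wins.
        key = next((k for k in node if pattern.fullmatch(k)), None)
        if key is None:
            return None
        node = node[key]
    return node


document = {"data": {"person": {"name": "Alice", "age": 29}}, "a.b": 1, "foobar": 2}
print(resolve(document, "data.person.name"))  # Alice
print(resolve(document, "data.per*.age"))     # 29
print(resolve(document, "foo*"))              # 2
print(resolve(document, "fo?"))               # None
print(resolve(document, r"a\.b"))             # 1
```

The last call shows why escaping matters: without the backslash, `a.b` would be read as key `b` inside key `a`.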

Any source starting with a dot (.) is treated as a relative path and uses the parent schema’s source as the root.
If the parent schema has the array type, the path is resolved against each element of that array.
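The relative-path rule can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Optimus's code: `lookup` and `resolve_source` are hypothetical helpers, and `lookup` handles plain dot-separated keys only (no wildcards or escapes).

```python
def lookup(node, path):
    """Follow a plain dot-separated path (no wildcards or escapes in this sketch)."""
    for key in path.split("."):
        node = node[key]
    return node


def resolve_source(document, parent_value, source):
    """Resolve a relative source against the parent's value, else against the root."""
    if source.startswith("."):
        return lookup(parent_value, source[1:])
    return lookup(document, source)


document = {
    "data": {
        "person": {
            "name": "Alice",
            "friends": [{"name": "Bob"}, {"name": "Carol"}],
        }
    }
}

# Object parent: ".name" is resolved from the value at data.person.
person = lookup(document, "data.person")
print(resolve_source(document, person, ".name"))             # Alice
print(resolve_source(document, person, "data.person.name"))  # Alice

# Array parent: the relative path is resolved once per array element.
friends = lookup(document, "data.person.friends")
print([resolve_source(document, item, ".name") for item in friends])  # ['Bob', 'Carol']
```

An absolute source always starts from the document root, regardless of where the property sits in the schema.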

You can test out your JSON paths on the GJSON Playground.

Examples

JSON

Content of https://example.com/data.json:

{
  "data": {
    "person": {
      "name": "Alice",
      "age": 29,
      "friends": [
        {
          "name": "Bob",
          "age": 34
        },
        {
          "name": "Carol",
          "age": 47
        }
      ]
    }
  }
}

Optimus task config:

tasks:
  - task: scrape
    input: https://example.com/data.json
    engine: json
    schema:
      type: object
      source: data.person # could be omitted if we weren't using a relative path for the `name` property below
      properties:
        name:
          type: string
          source: .name # as the path starts with a dot it is a relative path from the parent (i.e. `data.person`)
        age:
          type: integer
          source: data.person.age
        friends:
          type: array
          source: data.person.friends
          items:
            type: object
            properties:
              name:
                type: string
                source: .name # since the parent is an array, paths starting with a dot are resolved against each of the array's items
              age:
                type: integer
                source: .age

Output:

name: Alice
age: 29
friends:
  - name: Bob
    age: 34
  - name: Carol
    age: 47

Newline Delimited JSON

Content of https://example.com/data.json:

{ "name": "Alice", "age": 29 }
{ "name": "Bob", "age": 34 }
{ "name": "Carol", "age": 47 }

Optimus task config:

tasks:
  - task: scrape
    input: https://example.com/data.json
    engine: ndjson
    schema:
      type: object
      properties:
        name:
          type: string
          source: name
        age:
          type: integer
          source: age

Output:

- name: Alice
  age: 29
- name: Bob
  age: 34
- name: Carol
  age: 47