Parsing Data

Parse engines parse incoming data according to a schema. The schema syntax is based on JSON Schema, with a few additions.

String

The string type is used for text.

Syntax

schema:
  type: string
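
Example

A minimal sketch of a string parse task, assuming the same task layout as the integer and boolean examples elsewhere on this page. The input values are illustrative, and the result is not shown because it has not been verified.

```yaml
tasks:
  - task: parse
    input:
      - 'hello'
      - '42'        # illustrative: a numeric-looking value under a string schema
    schema:
      type: string
```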

Integer

The integer type is used for whole numbers (i.e. no fractions) and can be positive, negative or zero.

The fractional component of a number is dropped rather than rounded, e.g. 1.9 becomes 1.

Syntax

schema:
  type: integer

Example

tasks:
  - task: parse
    input:
      - '-42'
      - '1.9'
      - '0'
      - '12'
    schema:
      type: integer

Outputs:

- -42
- 1
- 0
- 12

Number

The number type is used for floating-point numbers (i.e. numbers that can have a fractional component).

Syntax

schema:
  type: number
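
Example

A minimal sketch of a number parse task, assuming the task layout of the integer example can be reused unchanged. The input values are illustrative. Unlike the integer type, a value such as 1.9 would be expected to keep its fractional component, but the exact output formatting has not been verified here.

```yaml
tasks:
  - task: parse
    input:
      - '3.14'
      - '-0.5'
      - '1.9'       # same value as in the integer example, for comparison
    schema:
      type: number
```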

Boolean

The boolean type can be either true or false. Booleans can be parsed from numbers and strings.

Values parsed as true: 1, t, T, true, TRUE, True.
Values parsed as false: 0, f, F, false, FALSE, False.

Syntax

schema:
  type: boolean

Example

tasks:
  - task: parse
    input:
      - '1'
      - '0'
      - 'FALSE'
    schema:
      type: boolean

Outputs:

- true
- false
- false

Array

The array type is used for lists whose elements all share a single type, for example an array of booleans, strings or numbers. An array cannot contain a mix of types.

If the root schema is an array, most exporters treat each element of the array as a separate record. To keep the whole array in a single record, wrap it in an object type.

Syntax

schema:
  type: array
  items: <schema-definition>

Examples

tasks:
  - task: parse
    engine: json
    input: '{ "data": [ "foo", "bar", "baz" ] }'
    schema:
      type: array
      source: 'data'
      items:
        type: string
        source: '.'

Outputs:

- foo
- bar
- baz
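
Wrapping the array in an object type, as described above, might look like the following sketch. It assumes that an object property is resolved by its name (as in the object example on this page), so the property is named data to match the key in the input. The items block is copied from the previous example.

```yaml
tasks:
  - task: parse
    engine: json
    input: '{ "data": [ "foo", "bar", "baz" ] }'
    schema:
      type: object
      properties:
        data:               # assumption: matched against the "data" key by name
          type: array
          items:
            type: string
            source: '.'     # taken from the array example above
```

With this schema the three strings would be expected to stay together in one record instead of being exported as three separate records.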

Object

The object type defines a key-value map. The keys must be strings. Each mapping of a key and a value is referred to as a property.

Adding a property name (i.e. a key) to the required list causes any object that doesn’t contain that property to be skipped.

Syntax

schema:
  type: object
  properties:
    <property-name>: <schema-definition>
  required:
    - <property-name>

Example

The following example parses a few JSON objects. The third input is omitted from the results because name is listed as a required property.

tasks:
  - task: parse
    engine: json
    input:
      - '{ "name": "Alice" }'
      - '{ "name": "Bob", "age": 34 }'
      - '{ "age": 45 }'
    schema:
      type: object
      properties:
        name:
          type: string
        age:
          type: integer
      required:
        - name

Outputs:

- name: Alice
- name: Bob
  age: 34
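
As a variation, the required list can name more than one property. The sketch below reuses the inputs from the example above and assumes that, as in JSON Schema, every listed property must be present for an object to be kept.

```yaml
tasks:
  - task: parse
    engine: json
    input:
      - '{ "name": "Alice" }'
      - '{ "name": "Bob", "age": 34 }'
      - '{ "age": 45 }'
    schema:
      type: object
      properties:
        name:
          type: string
        age:
          type: integer
      required:             # assumption: all listed properties must be present
        - name
        - age
```

Under that reading, only the second input contains both name and age, so it would be the only object expected to remain in the results.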