Structured Data Templates

Structured data templates are a templating engine that takes a simplified set of input parameters and transforms them into a complex structured data output. Both the inputs and outputs can be validated against a schema.

The goals of this project are to:

  1. Provide a simple format: it's just JSON/YAML!
  2. Give enough tools to be useful:
    • Interpolation ${my_value} & ${num / 2 >= 5}
    • Branching (if/then/else)
    • Looping (for/each)
  3. Guarantee structural correctness
    • The structured data template is valid JSON / YAML
    • The input parameters are valid JSON / YAML
    • Rendering the template is guaranteed to produce valid JSON
  4. Provide tools for semantic correctness via schemas
    • The input types and values pass the schema
    • The template will produce output that should pass the schema
    • The output of the template after rendering passes the schema

Structure

A structured data template document is made of two parts: schemas and a template. The schemas define the allowable input/output structure while the template defines the actual rendered output. An example document might look like:

schemas:
  # Dialect selects the default JSON Schema version
  dialect: openapi-3.1
  input:
    # Input schema goes here
    type: object
    properties:
      name:
        type: string
        default: world
  output:
    # Output schema goes here, also supports refs:
    $ref: https://api.example.com/openapi.json#/components/schemas/Greeting
template:
  # Templated output structure goes here
  greeting: Hello, ${name}!

Example

You can run the example like so:

$ go run ./cmd/sdt ./samples/greeting.yaml <./samples/params.yaml
{
  "greeting": "Hello, SDT!"
}

Input params for rendering can be passed via stdin as JSON/YAML and/or via command line arguments as CLI shorthand syntax.

Schemas

JSON Schema is used for all schemas. It defaults to JSON Schema 2020-12 but can be overridden via the $schema key or using dialect in the structured data template document like above. Available dialects:

  • openapi-3.0
  • openapi-3.1
  • https://json-schema.org/draft/2020-12/schema
  • https://json-schema.org/draft/2019-09/schema
  • https://json-schema.org/draft-07/schema
  • https://json-schema.org/draft-06/schema
  • https://json-schema.org/draft-04/schema

The input schema describes the input parameters, and the template will not render unless the passed parameters validate against it. It also lets you set defaults for input parameters; a parameter that is not passed and has no default is nil.
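The default handling described above can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the apply_defaults helper is hypothetical, it handles top-level properties only, and it assumes that a missing parameter without a default becomes None (the Python counterpart of nil).

```python
def apply_defaults(schema, params):
    """Return a copy of params with schema defaults filled in for missing keys.

    Hypothetical helper: only top-level object properties are considered.
    """
    result = dict(params)
    for name, prop in schema.get("properties", {}).items():
        if name not in result:
            # Missing parameters fall back to the schema default, else None.
            result[name] = prop.get("default")
    return result


input_schema = {
    "type": "object",
    "properties": {"name": {"type": "string", "default": "world"}},
}

print(apply_defaults(input_schema, {}))               # {'name': 'world'}
print(apply_defaults(input_schema, {"name": "SDT"}))  # {'name': 'SDT'}
```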

The output schema describes the template's output structure. The validator is capable of understanding branches & loops to ensure that the output is semantically valid regardless of which path is taken during rendering.

Template Language Specification

A template is just JSON/YAML. For example:

hello: world

That is a valid static template. Nothing will change when rendered, which is not very useful. Normally, when a template is rendered, it is passed parameters, and these are used for interpolation, branching, and looping. These features all make use of a basic expression language.

Expressions

String interpolation, branching conditions, and loop variable selection all use an expression language. This allows you to make simple comparisons of the parameter context data. Examples:

  • foo > 50
  • len(item.bars) <= 5 || my_override
  • name contains "sdt"
  • name startsWith "sdt"
  • "foo" in ["foo", "bar"]
  • loop.index + 1

See antonmedv/expr language definition for details.

String Interpolation

String interpolation is the act of replacing the contents of ${...} within strings, where ... corresponds to an expression that makes use of input parameters. For example:

hello: ${name}

If passed {"name": "Alice"} as parameters this would render:

{
  "hello": "Alice"
}

When the entire string is a single ${...} expression, the rendered value keeps whatever type the expression evaluates to, so you are not limited to strings. If the expression result is nil, the property/item is not included in the rendered output.

It's also possible to add static text or multiple interpolation expressions in a single value:

hello: Greetings, ${name}!

Given the same input that would result in:

{
  "hello": "Greetings, Alice!"
}
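The two interpolation modes can be illustrated with a small Python sketch. It is not how the project is implemented: Python's eval stands in for the real expression language (the two only overlap for simple expressions), and the interpolate and evaluate helpers are hypothetical names.

```python
import re

# Matches one ${...} expression; the expression itself may not contain "}".
PATTERN = re.compile(r"\$\{([^}]*)\}")


def evaluate(expression, params):
    # Assumption: Python's eval (with builtins disabled) replaces the real
    # expression language for the purpose of this sketch.
    return eval(expression, {"__builtins__": {}}, dict(params))


def interpolate(value, params):
    whole = PATTERN.fullmatch(value)
    if whole:
        # A lone ${...} keeps the type of the evaluated expression.
        return evaluate(whole.group(1), params)
    # Otherwise each expression is converted to text and spliced in.
    return PATTERN.sub(lambda m: str(evaluate(m.group(1), params)), value)


print(interpolate("${name}", {"name": "Alice"}))              # Alice
print(interpolate("${num / 2 >= 5}", {"num": 12}))            # True
print(interpolate("Greetings, ${name}!", {"name": "Alice"}))  # Greetings, Alice!
```

Because a value with more than one expression always takes the text path, this sketch also reproduces the effect of forcing string output by appending a second, empty expression.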

Tricks

  • Force a string output by using more than one expression: ${my_number}${""}

Branching

Branching allows one of multiple paths to be followed in the template at rendering time based on the result of an expression. The special properties $if, $then, and $else are used for this. For example:

foo:
  $if: ${value > 5}
  $then: I am big
  $else: I am small

If rendered with {"value": 1} the result will be:

{
  "foo": "I am small"
}

Notice that the special properties are completely removed and replaced with the contents of either the $then or $else clauses. So while in the template foo is an object, the end result is that foo is a string and would pass the output schema.

If the expression is false and no $else is given, then the property is removed from the result.
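A rough Python sketch of this replacement behavior follows. It makes several assumptions: Python's eval stands in for the expression language, the render helper and the OMIT sentinel are hypothetical, and a branch whose selected clause is absent is dropped from its parent object.

```python
OMIT = object()  # sentinel meaning "drop this property from the output"


def evaluate(expression, params):
    # Assumption: Python's eval replaces the real expression language.
    return eval(expression, {"__builtins__": {}}, dict(params))


def render(node, params):
    if isinstance(node, dict):
        if "$if" in node:
            condition = node["$if"].strip()[2:-1]  # strip the ${ and } wrapper
            branch = "$then" if evaluate(condition, params) else "$else"
            # The whole $if object is replaced by the selected clause.
            return render(node[branch], params) if branch in node else OMIT
        rendered = {key: render(value, params) for key, value in node.items()}
        return {key: value for key, value in rendered.items() if value is not OMIT}
    return node


template = {"foo": {"$if": "${value > 5}", "$then": "I am big", "$else": "I am small"}}

print(render(template, {"value": 1}))  # {'foo': 'I am small'}
print(render(template, {"value": 9}))  # {'foo': 'I am big'}
```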

Looping

Looping allows an array of inputs to be expanded into the rendered output using a per-item template. The $for, $as, and $each special properties are used for this. For example:

squares:
  $for: ${numbers}
  $each: ${item * item}

If rendered with {"numbers": [1, 2, 3]} the result will be:

{
  "squares": [1, 4, 9]
}

The $as property controls the name of the variable holding the current item, which defaults to item. A local variable loop is also set, which includes an index, and whether the item is the first or last in the array. If using $as then the loop variable is named loop_ + the $as value. This allows nested loops to access both their own and outer scope's loop variables. For example:

things:
  $for: ${things}
  $as: thing
  $each:
    id: ${loop_thing.index}-${thing.name}
    tags:
      $for: ${tags}
      $as: tag
      $each: ${loop_thing.index}-${loop_tag.index}-${tag}

Given:

{
  "things": [{ "name": "Alice" }, { "name": "Bob" }],
  "tags": ["big", "small"]
}

You would get as output:

{
  "things": [
    {
      "id": "0-Alice",
      "tags": ["0-0-big", "0-1-small"]
    },
    {
      "id": "1-Bob",
      "tags": ["1-0-big", "1-1-small"]
    }
  ]
}
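The scoping rules for $as and the loop variables can be sketched in Python as below. This is an illustration under assumptions, not the project's code: Python's eval replaces the expression language, objects are wrapped in SimpleNamespace so that thing.name resolves, and the first and last field names on the loop variable are guesses (the text above only names index).

```python
import re
from types import SimpleNamespace

PATTERN = re.compile(r"\$\{([^}]*)\}")


def wrap(value):
    # Convert nested dicts so attribute access (thing.name) works in eval.
    if isinstance(value, dict):
        return SimpleNamespace(**{k: wrap(v) for k, v in value.items()})
    if isinstance(value, list):
        return [wrap(v) for v in value]
    return value


def evaluate(expression, scope):
    # Assumption: Python's eval replaces the real expression language.
    return eval(expression, {"__builtins__": {}}, scope)


def interpolate(text, scope):
    whole = PATTERN.fullmatch(text)
    if whole:
        return evaluate(whole.group(1), scope)
    return PATTERN.sub(lambda m: str(evaluate(m.group(1), scope)), text)


def render(node, scope):
    if isinstance(node, str):
        return interpolate(node, scope)
    if isinstance(node, dict):
        if "$for" in node:
            items = render(node["$for"], scope)
            name = node.get("$as", "item")
            loop_name = "loop_" + name if "$as" in node else "loop"
            out = []
            for i, item in enumerate(items):
                loop = SimpleNamespace(index=i, first=i == 0, last=i == len(items) - 1)
                # Inner scopes extend the outer one, so nested loops see both.
                out.append(render(node["$each"], {**scope, name: item, loop_name: loop}))
            return out
        return {key: render(value, scope) for key, value in node.items()}
    return node


template = {
    "things": {
        "$for": "${things}",
        "$as": "thing",
        "$each": {
            "id": "${loop_thing.index}-${thing.name}",
            "tags": {
                "$for": "${tags}",
                "$as": "tag",
                "$each": "${loop_thing.index}-${loop_tag.index}-${tag}",
            },
        },
    }
}
params = {"things": [{"name": "Alice"}, {"name": "Bob"}], "tags": ["big", "small"]}
result = render(template, {k: wrap(v) for k, v in params.items()})
print(result["things"][1])  # {'id': '1-Bob', 'tags': ['1-0-big', '1-1-small']}
```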

Multiple Outputs

If the result of the $each template is an array, then each item of that array is individually appended to the overall result. This allows one input to generate multiple output entries in the final array.

things:
  $for: ${things}
  $each:
    - name: ${item.name} 1
    - name: ${item.name} 2

With the same input as above you'd get:

{
  "things": [
    {
      "name": "Alice 1"
    },
    {
      "name": "Alice 2"
    },
    {
      "name": "Bob 1"
    },
    {
      "name": "Bob 2"
    }
  ]
}

If you need to create an array of arrays, wrap the $each result in another array so that only the outer array is expanded.
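The expansion rule can be sketched in Python as follows. The expand helper is hypothetical, and a plain Python callable stands in for the rendered $each template; only one level of nesting is spliced, which is why the extra wrapper array preserves inner arrays.

```python
def expand(items, each):
    """Collect per-item results, splicing in any result that is a list.

    Hypothetical helper: `each` is a callable standing in for the $each template.
    """
    out = []
    for item in items:
        rendered = each(item)
        if isinstance(rendered, list):
            out.extend(rendered)  # each element becomes its own output entry
        else:
            out.append(rendered)
    return out


things = [{"name": "Alice"}, {"name": "Bob"}]

# One input item produces two output entries.
print(expand(things, lambda item: [{"name": item["name"] + " 1"}, {"name": item["name"] + " 2"}]))
# [{'name': 'Alice 1'}, {'name': 'Alice 2'}, {'name': 'Bob 1'}, {'name': 'Bob 2'}]

# Wrapping the per-item array in another array keeps it intact.
print(expand([1, 2], lambda n: [[n, n * n]]))
# [[1, 1], [2, 4]]
```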

Open Questions

  1. Should we support macros? Could be done with $ref in the template, and we could add a top-level macros or definitions for document-local refs. They would be drop-in only, no calling with arguments, but would render based on the current params context.

  2. Should nil results from interpolation be rendered in the final output? Example: name: ${name} and what if name is nil?

  3. Support for constants? Values that should always be present in the params that can contain complex and reusable data for the template?
