Parse data and test fixtures from markdown files, and patch them programmatically, too.

go-testmark

Do you need test fixtures and example data for your project, in a language agnostic way?

Do you want it to be easy to combine with documentation, and easy for others to read?

Do you want those fixtures to be easy to maintain, because they're programmatically parsable, and programmatically patchable?

Do you want to be able to display those fixtures and examples in the middle of the docs you're already writing? (Are those docs already in markdown, and all you need is some glue code to make those code blocks sing?)

You're looking for testmark. And you found it.

This is go-testmark, a library implementing a parser (and patcher!) for the testmark format, which is itself a subset of markdown that you can use anywhere you're already using markdown.

It'll make your code blocks sing.

Read on:



What is the testmark format?

Testmark is a very simple, and language-agnostic, format. It's also a subset of markdown. Your markdown documents can also be testmark documents!

Testmark data is contained in a markdown codeblock -- you know, the things that start and end with triple-backticks.

Testmark data is labeled by a markdown comment.

This means you can easily embed large blocks of data in a markdown file, and give them whatever label is appropriate, and do this seamlessly in a document you're also writing for a human audience.

This makes for great development of test fixtures, specs, and documentation, all in one.

testmark format by example

You see markdown code blocks all the time. They render like this:

{"these things": "you know?",
 "syntax highlighed, typically": "etc, etc"}

That right there? That was also testmark fixture data.

Check out the "raw" form of this page, if you're looking at a rendered form. There's a specially structured "comment" in markdown which is tagging that markdown code block, marking it as testmark data, and giving it a name.

The comment looks like this:

[testmark]:# (the-data-name-goes-here)

... and then the triple-backticks go right after that. Data follows until the next line starting with triple-backticks, as is usual in markdown.

So, in total, it looks like this:

	[testmark]:# (the-data-name-goes-here)
	```text
	this is your data block
	as big as you like
	```

That's it.

Check out the testdata/example.md file (and other files in that directory) for more examples. Be sure to mind the raw form of the file too.

the purpose of testmark

Formats for test fixtures and example data are extremely useful. Some kind of language-agnostic format is critically important any time you're working on a project that involves codebases in more than one language, or any kind of networked interoperability.

Yet, picking a format (and getting people to agree on it) is hard. And then getting it in your documentation is hard.

And then maintaining the fixtures and your documentation is hard, because you're typically stuck between two choices that are both bad: either you can put some very ugly fixture formats in your documentation (and eventually realize that most users won't read them anyway, because of the eyeburn); or you can maintain the fixtures and documentation separately, while manually putting very pretty examples in the middle of your documentation (but then fail to make them load-bearing, so eventually coderot strikes, and now your documentation misleads users, and adoption drops and frustration rises, and oh dear).

Testmark is meant to solve all of these things.

  • Because testmark is "just markdown", you can easily use it together with other things that are already markdown. That means including in documentation, websites, etc.
  • Because you can intersperse markdown and prose with the code blocks, you can make good, readable, living and verifiable documentation, directly intertwined with your test data. It's great for commenting; it's great for docs. People will actually read this! And you have full control over the formatting and presentation. Annotate things however you like.
  • Because testmark is "just markdown", you can probably conclusively leap past a lot of bikeshedding conversations about test fixture data formats. Markdown isn't great, but good heavens is it useful, and ubiquitous. Your colleagues will probably agree.
  • Because testmark is "just markdown", you get all the other tools that work on markdown, for free. For example, that tasty, tasty syntax highlighting. There's no smoother way to get pretty example data on a website and have it used directly in tests at the same time. (No fancy website or publishing tool pipeline needed, either. You can just use github readme files -- just like this one!)
  • Because testmark is "just markdown", you can probably hand people links that jump them directly to headings in your fixtures files. Users who need those references will appreciate this; and you, as the author of the fixtures and specs, will probably take joy from being able to point directly at your latest work.
  • Because more than one code block can be in a file, and you can tag them with names, you can cram many fixtures in one file. (Or not. Up to you.)
  • Because it's machine-parsable, we can have tools and libraries that programmatically update the data blocks, too. And because testmark interacts with the markdown format in a very deterministic and well-bounded way, your markdown prose stays exactly where you put it, too. Easy fixture maintenance and automation, and good human readability? Yes, we can have both.

tl;dr: deduplicate the work of maintaining spec fixtures and docs, saving time and gaining confidence in the results, both at once.

parsing testmark is easy

See README_parsing if you would like to write a testmark parser in another language. It's extremely straightforward.
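To give a feel for how small such a parser can be, here is a minimal sketch in Go using only the standard library. It is not the go-testmark implementation: the `Hunk` type and `parseHunks` function here are hypothetical names made up for this example, and the sketch assumes LF line endings, backtick fences only, and a code fence that opens on the line directly after the testmark comment.

```go
package main

import (
	"fmt"
	"strings"
)

// fence is the triple-backtick marker. It is built with Repeat only so that
// this listing can itself sit inside a markdown code block.
var fence = strings.Repeat("`", 3)

// Hunk is a hypothetical type for this sketch (not the library's own type).
type Hunk struct {
	Name     string // label taken from the testmark comment
	BlockTag string // text after the opening backticks, if any
	Body     string // everything between the opening and closing fence
}

// parseHunks scans markdown line by line and collects every code block
// that is directly preceded by a "[testmark]:# (name)" comment line.
func parseHunks(markdown string) []Hunk {
	const prefix = "[testmark]:# ("
	lines := strings.Split(markdown, "\n")
	var hunks []Hunk
	for i := 0; i < len(lines); i++ {
		line := lines[i]
		if !strings.HasPrefix(line, prefix) || !strings.HasSuffix(line, ")") {
			continue
		}
		name := line[len(prefix) : len(line)-1]
		// Assumption: the fence must open on the very next line.
		if i+1 >= len(lines) || !strings.HasPrefix(lines[i+1], fence) {
			continue
		}
		tag := strings.TrimPrefix(lines[i+1], fence)
		var body []string
		j := i + 2
		for ; j < len(lines) && !strings.HasPrefix(lines[j], fence); j++ {
			body = append(body, lines[j])
		}
		hunks = append(hunks, Hunk{Name: name, BlockTag: tag, Body: strings.Join(body, "\n")})
		i = j // resume scanning after the closing fence
	}
	return hunks
}

func main() {
	doc := "# Some prose\n\n" +
		"[testmark]:# (the-data-name-goes-here)\n" +
		fence + "text\n" +
		"this is your data block\n" +
		"as big as you like\n" +
		fence + "\n\n" +
		"More prose.\n"
	for _, h := range parseHunks(doc) {
		fmt.Printf("%s [%s]: %q\n", h.Name, h.BlockTag, h.Body)
	}
}
```

Everything outside the tagged code blocks is skipped without being interpreted, which is why no real markdown parser is needed for this.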

This is go-testmark

This is a golang library that implements a parser, a patcher, and a writer for the testmark format.

You should be able to look at the godoc for about five seconds and figure it out. There's not much to it.

Features

parsing

go-testmark can parse any markdown file and look for testmark data hunks.

When you've parsed a testmark file, you can iterate over all the data hunks in it, and see their names, or look them up by name.

Parsing works in the simplest way possible. It only looks at the code blocks tagged as testmark, and ignores the rest of the markdown content as completely as possible. Simple is good. (And it turns out it's possible to parse testmark data out, and even support patching the testmark data blocks later, without a complete markdown parser.)

walking and indexing

You can range linearly over the slice of parsed hunks in a Document once you've parsed it.

Each hunk has a name (from the testmark comment), a body (the blob from inside the code block), and optionally may have the code block's tag (if any; usually this is already used by other people, for syntax highlighting indicators).

If you use hunk names that look like filesystem paths (e.g. "foo/bar/baz", with slashes), you can also get an indexed view that lets you easily walk the hunks as if they were in directories. Just call Document.BuildDirIndex. ("Directories" for names with many segments will be created implicitly; it's very low friction.)

Once you've built a directory index, you can either range over a DirEnt's contents as an ordered list, or look things up by path segment like a map.
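The idea of a directory-style index can be sketched with the standard library alone. This is an illustration of the concept, not the library's code: the `DirEnt` struct and `buildDirIndex` function below are hypothetical stand-ins, loosely modeled on the description above, with a map for lookup by path segment and a list for ordered iteration.

```go
package main

import (
	"fmt"
	"strings"
)

// DirEnt is a hypothetical stand-in for this sketch, not the library's type.
type DirEnt struct {
	Name         string
	HasHunk      bool               // true if some hunk is named exactly by this path
	Children     map[string]*DirEnt // lookup by path segment
	ChildrenList []*DirEnt          // the same children, in first-seen order
}

// buildDirIndex turns slash-separated hunk names into a tree.
// Intermediate "directories" are created implicitly as they are encountered.
func buildDirIndex(names []string) *DirEnt {
	root := &DirEnt{}
	for _, name := range names {
		dir := root
		for _, seg := range strings.Split(name, "/") {
			child, ok := dir.Children[seg] // reading a nil map is safe in Go
			if !ok {
				child = &DirEnt{Name: seg}
				if dir.Children == nil {
					dir.Children = make(map[string]*DirEnt)
				}
				dir.Children[seg] = child
				dir.ChildrenList = append(dir.ChildrenList, child)
			}
			dir = child
		}
		dir.HasHunk = true
	}
	return root
}

func main() {
	root := buildDirIndex([]string{"one/two", "one/three", "one/four/bang"})
	one := root.Children["one"]
	// Ordered walk over the children of "one".
	for _, c := range one.ChildrenList {
		fmt.Println(c.Name, len(c.ChildrenList))
	}
	// Map-style lookup by path segment.
	fmt.Println(one.Children["four"].Children["bang"].HasHunk)
}
```

In this sketch the map and the list hold pointers to the same nodes, so the two views always agree about a node's children.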

patching

When using the patch operation, the markdown you wrote will be maintained by the operation; only the testmark data blocks change. (No markdown gets reformatted; nothing tries to normalize anything. Whatever you write is safe. Use whatever other markdown extensions you like; we're not gonna error if there's something fancy we didn't expect. It's chill.)

Patching is really simple. It looks like this:

doc, err := testmark.ReadFile("example.md")
if err != nil {
	panic(err) // handle the read/parse error however suits your program
}
// Hunks whose names already exist get replaced; the new names get appended.
doc = testmark.Patch(doc,
	testmark.Hunk{Name: "more-data", BlockTag: "text", Body: []byte("you have been...\nreplaced.\nand gotten\nrather longer.")},
	testmark.Hunk{Name: "this-is-the-data-name", BlockTag: "", Body: []byte("this one gets shorter.")},
	testmark.Hunk{Name: "this-one-is-new", BlockTag: "json", Body: []byte(`{"hayo": "new data!"}`)},
	testmark.Hunk{Name: "so-is-this", BlockTag: "json", Body: []byte(`{"appending": "is fun"}`)},
)
// Print the patched document.
fmt.Printf("%s", doc.String())

(That's real code from our tests, and it applies on the example.md file in the testdata directory.)
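If you're curious why patching can leave your prose alone, here is a rough standard-library sketch of the idea. It is not how go-testmark does it internally: `patchHunk` is a hypothetical helper written for this example, and it assumes LF line endings and backtick fences. It swaps out only the lines of the named block and copies every other line through untouched; a name that isn't found is appended at the end.

```go
package main

import (
	"fmt"
	"strings"
)

// fence is the triple-backtick marker, built with Repeat so this listing
// can itself sit inside a markdown code block.
var fence = strings.Repeat("`", 3)

// patchHunk returns markdown with the named hunk's code block replaced.
// If no hunk has that name, a new tagged code block is appended instead.
func patchHunk(markdown, name, tag, body string) string {
	marker := "[testmark]:# (" + name + ")"
	lines := strings.Split(markdown, "\n")
	for i := 0; i+1 < len(lines); i++ {
		if lines[i] != marker || !strings.HasPrefix(lines[i+1], fence) {
			continue
		}
		// Find the closing fence of the existing block.
		end := i + 2
		for end < len(lines) && !strings.HasPrefix(lines[end], fence) {
			end++
		}
		// Keep everything up to and including the marker line as-is.
		out := append([]string{}, lines[:i+1]...)
		out = append(out, fence+tag)
		out = append(out, strings.Split(body, "\n")...)
		out = append(out, fence)
		// Keep everything after the old closing fence as-is.
		if end < len(lines) {
			out = append(out, lines[end+1:]...)
		}
		return strings.Join(out, "\n")
	}
	// Not found: append a new hunk after the existing content.
	prefix := strings.TrimRight(markdown, "\n")
	if prefix != "" {
		prefix += "\n\n"
	}
	return prefix + marker + "\n" + fence + tag + "\n" + body + "\n" + fence + "\n"
}

func main() {
	src := "Prose you wrote by hand.\n\n" +
		"[testmark]:# (more-data)\n" + fence + "text\nold body\n" + fence + "\n\n" +
		"More prose, untouched.\n"
	out := patchHunk(src, "more-data", "text", "you have been...\nreplaced.")
	out = patchHunk(out, "this-one-is-new", "json", `{"hayo": "new data!"}`)
	fmt.Print(out)
}
```

Because the function never interprets anything outside the tagged block, there is nothing for it to normalize or reformat.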

writing

go-testmark can write back out a document that it's holding in memory.

You'll produce these by parsing, and by patching.

It's not really encouraged to try to create a new document purely via the go-testmark APIs. We don't offer any APIs for writing and formatting markdown outside of the testmark data blocks; it's better to just write that yourself, in an editor or with other tools fit for the purpose.

(You probably can start with an empty document and just patch hunks into it, and it'll be fine. It's just dubious whether you'll really want to do that in practice.)

Examples

Check out the patch_test.go file for an example of what updating a testmark file looks like with this library.

Examples in the Wild

Check out how the IPLD project uses testmark:

  • This document is both prose documentation for humans, and full of testmark data (using directory naming conventions for the hunk names, too): ipld/specs/selectors/selector-fixtures-1
  • This code reads that document, and in a handful of lines, iterates over the "directories" of hunks, and then plucks data out of them: go-ipld/selector/spec_test.go

License

SPDX-License-Identifier: Apache-2.0 OR MIT

Acknowledgements

This is probably inspired by a lot of things. (Mostly my own failings. But hey. Those do teach you something.)

  • It's probably inspired heavily by rvagg's work on making IPLD Schema DSL parsers, which could parse content out of markdown codeblocks, and did this for similar "being able to embed the real data in the docs is cool" reasons. (That work differs slightly, in that that system just ran with the code block syntax tag hint, and also, has no patching capabilities, and also, aggregated all the data rather than making it accessible as named blocks. But the goals and reasons are very similar!)
  • It's probably inspired a bit by campoy's embedmd tool. (That work uses markdown comments in a similar way. Testmark differs in that it's meant for programmatic patching rather than use as a file-wangling tool, and also programmatic reading; and that it treats the markdown file as the source of truth, rather than a terminal output.)
  • It's influenced by "taffy", another test fixture format I wrote not long before this one. (Taffy didn't get very far. It was special for no reason. Technically, it's "more correct" than testmark, because you can put any data in it. But: describing things attractively within the taffy format was basically impossible. That turns out to kill. This lesson informed the idea for testmark.)
  • It's also influenced by an even older attempt at a test fixture format called wishfix (example). (You can see the "it should be attractive" rule applied more strongly in wishfix than in taffy; and yet, still, a lack of flexibility about formatting. The lesson to learn was again: don't be special; just use a format that's already capable of being decorative.)
  • Probably other things as well. A lot of test fixture formats have passed, however briefly, through my brain over the years. My apologies for any acknowledgements forgotten.
Owner
Eric Myhre
Comments
  • Index fix

    The way the indexing is built, siblings can end up missing from the ChildrenList. I'm not sure exactly how the incorrect children list is created, but switching to pointers makes the problem go away. I think it's because the parent ChildrenList and the ChildrenList of a child in the parent's Children map are referencing different entities. However, looking at the code it appears that this isn't possible and it makes my head hurt a little.

    Below is how the new test shows the failure.

    $ go test ./...
    --- FAIL: TestIndexingTree (0.00s)
        index_test.go:97: 
            error:
              values are not equal
            comment:
              one
            comment:
              list: [two three]
            comment:
              keys: [two three four]
            got:
              int(3)
            want:
              int(2)
            stack:
              /home/cjb/repos/warpforge/go-testmark/index_test.go:97
                qt.Check(t, len(dir.Children), qt.Equals, len(dir.ChildrenList),
                    // If the lengths are equal then the map and list should contain entries with the same names.
                    // We don't know if the dir entries are _actually_ equivalent but the test recurses above so it should be fine.
                    qt.Commentf("%s", dir.Name),
                    qt.Commentf("list: %v", names(dir.ChildrenList)),
                    qt.Commentf("keys: %v", keys(dir.Children)),
                )
              /home/cjb/repos/warpforge/go-testmark/index_test.go:95
                assertChildren(t, child)
              /home/cjb/repos/warpforge/go-testmark/index_test.go:66
                assertChildren(t, *doc.DirEnt)
            
    FAIL
    FAIL	github.com/warpfork/go-testmark	0.003s
    ok  	github.com/warpfork/go-testmark/testexec	(cached)
    FAIL
    

    Here's an absurdly large chunk of json that shows the structure of the failing bits.

    $ jq '{Name: .Name, Children: .Children.one, ChildrenList: .ChildrenList[] | select(.Name == "one")}' log.json

    {
      "Name": "",
      "Children": {
        "Name": "one",
        "Children": {
          "four": {
            "Name": "four",
            "Children": {
              "bang": {
                "Name": "bang",
                "Children": null,
                "ChildrenList": null
              }
            },
            "ChildrenList": [
              {
                "Name": "bang",
                "Children": null,
                "ChildrenList": null
              }
            ]
          },
          "three": {
            "Name": "three",
            "Children": null,
            "ChildrenList": null
          },
          "two": {
            "Name": "two",
            "Children": null,
            "ChildrenList": null
          }
        },
        "ChildrenList": [
          {
            "Name": "two",
            "Children": null,
            "ChildrenList": null
          },
          {
            "Name": "three",
            "Children": null,
            "ChildrenList": null
          },
          {
            "Name": "four",
            "Children": {
              "bang": {
                "Name": "bang",
                "Children": null,
                "ChildrenList": null
              }
            },
            "ChildrenList": [
              {
                "Name": "bang",
                "Children": null,
                "ChildrenList": null
              }
            ]
          }
        ]
      },
      "ChildrenList": {
        "Name": "one",
        "Children": {
          "four": {
            "Name": "four",
            "Children": {
              "bang": {
                "Name": "bang",
                "Children": null,
                "ChildrenList": null
              }
            },
            "ChildrenList": [
              {
                "Name": "bang",
                "Children": null,
                "ChildrenList": null
              }
            ]
          },
          "three": {
            "Name": "three",
            "Children": null,
            "ChildrenList": null
          },
          "two": {
            "Name": "two",
            "Children": null,
            "ChildrenList": null
          }
        },
        "ChildrenList": [
          {
            "Name": "two",
            "Children": null,
            "ChildrenList": null
          },
          {
            "Name": "three",
            "Children": null,
            "ChildrenList": null
          }
        ]
      }
    }
    
  • Strict mode for testexec structure

    Adds a strict mode such that the testexec structure is enforced by default.


    Original:

    • Adds tests to output that will show up as skipped. This should make the behavior of testexec easier to understand and debug using go test -v.
    • Adds a RecursionFn to testexec to give users the ability to control recursion.
    • Adds a Path variable to DirEnt structs which contains its full path.

    Overall default behavior is unchanged.

  • Is there a sample Go app which uses this library?

    After watching the GPN20 recording Testmark: a Markdown Convention for Test Fixtures and Data I wanted to try out go-testmark.

    After having a closer look, I realized that go-testmark is a golang library and not a "ready to run app". So I would need to write a tool myself to use it. Do you know if there is an app which uses this library?

  • A markdown code fence can be `~~~`

    https://spec.commonmark.org/0.30/#fenced-code-blocks

    Worth clarifying that the testmark parser requires code fences to be three backticks or adding support for both valid forms of code fence. Let us not even talk about indented code blocks.

    ```
    yer code
    ```
    

    vs

    ~~~
    yer code
    ~~~
    

    I love testmark.

  • Add a readme for the testexec extension and its conventions.

    This is a bit overdue :) Much of this was described in godocs or in comments in the code, but documentation deserves to be a little more discoverable than that!

  • testexec: subtest ("then-*") feature

    With this feature, naming a hunk in the pattern of {testname}/then-{subtestname}/script (when there's already a {testname}/script hunk, so testexec is already in motion) will cause subtests to run.

    Subtests are more of the same, but have one especially useful feature: they start within a working directory that's a copy of the files and directories that were in the working directory of the parent test after it was finished.

    Subtests can be nested -- {testname}/then-{subtestname}/then-{subsubtestname}/script -- in which case each of them continues getting a copy of the filesystem from the test that came before them....

    ... or, subtests can fan out -- {testname}/then-{subtestname}/script and {testname}/then-{othersubtestname}/script -- in which case each of them gets a copy of the parent, but separate copies, which do not interfere with each other.

    This should help a great deal in writing tests for programs that have filesystem state. It can also be very useful for writing suites of tests for a program that needs some configuration files, and you don't want to repeat those every time: now you can just use the top level test to set up the config files, and then use a bunch of sibling subtests for all the actual testing work.

    (This is itself... not yet tested. Don't rely on this commit hash; this PR may be rebased.)

  • Trailing whitespace error

    Improves error handling around trailing whitespace after hunk name and extra lines before a code block after hunk declaration. Considering that we can't emit warnings of any kind, failing to parse when a hunk is missing a code block makes the most sense. This also aligns with my personal preference that malformed tests should fail.

  • testexec: getting exit codes correctly.

    ExecFn_Exec and ScriptFn_ExecBash (the defaults) now return exit codes correctly.

    I've gone perhaps a bit overboard in also making sure that death-by-signal is reported, too. Maybe this won't stick around (especially because it requires importing the syscall package, which might otherwise be avoidable on recentish versions of golang), because I don't know if it's critically relevant and likely to actually get exercised much in this domain... but I've written it, so let's see if it sticks.

  • fix parsing CRLF files, part 3

    Nothing can be easy or nice, can it.

    Dogpiling on https://github.com/warpfork/go-testmark/pull/4 and https://github.com/warpfork/go-testmark/pull/3 . I think this is addressing the last of the things we've discovered through that journey.

    Most of the diff is just me commenting crankily, as is natural.

    On the plus side, some tests are unskipped now and actually hold up some standards and expectations. That's nice.

  • Hunk.BlockTag should be called InfoString

    Apparently, the text after the characters indicating the start of a code block is called the "info string": https://spec.commonmark.org/0.30/#info-string

    Currently the go-testmark code calls it "BlockTag":

    https://github.com/warpfork/go-testmark/blob/53baa420296c28880c394db5f634b6d2c20208f5/testmark.go#L49-L51

    I like my name better, of course (I'm pretty sure someone would know what you mean faster when you say "the code block tag" than "the info string" until you bang them over the head with the spec link), but, ah well. It was a quick choice made without research.

    Might as well follow the upstream lexicon as long as there is indeed a clear name specified for it.

  • RFC: Describe function

    Here's a general idea for a Describe function which can help people figure out what is going on when they encounter failures in code using testmark.

    Why?

    1. test failure will be most people's first experience with this tool and when that happens they will have no context on what is happening or how testmark works. Pointing them to the document source where the relevant code actually resides can go a long way in reducing initial friction.
    2. Even having experience with testmark, it's still nice to have document markers in test failures so I can more easily find the thing I need to fix.
    go test ./...
    ok  	github.com/warpfork/go-testmark	(cached)
    --- FAIL: Test (0.02s)
        --- FAIL: Test/whee (0.01s)
            testexec_test.go:22: testmark describe
                selfexercise.md:13:whee/script
                selfexercise.md:21:whee/output
                selfexercise.md:31:whee/then-more-files/fs/b
                selfexercise.md:36:whee/then-more-files/script
                selfexercise.md:41:whee/then-more-files/output
                selfexercise.md:51:whee/then-touching-files/script
                selfexercise.md:60:whee/then-touching-files/output
                selfexercise.md:6:whee/fs/a
                selfexercise.md:70:whee/then-touching-files/then-subtesting-again/script
                selfexercise.md:76:whee/then-touching-files/then-subtesting-again/output
            testexec_test.go:27: forced failure
        --- FAIL: Test/using-stdin (0.00s)
            testexec_test.go:22: testmark describe
                selfexercise.md:87:using-stdin/input
                selfexercise.md:93:using-stdin/script
                selfexercise.md:98:using-stdin/output
            testexec_test.go:27: forced failure
    FAIL
    FAIL	github.com/warpfork/go-testmark/testexec	0.018s
    FAIL
    

    Potential Improvements:

    1. testexec doesn't have the information it needs to produce this sort of output at a more granular level. We could pass that information down and instrument throughout.
    2. Add options to Describe for sorting, setting/excluding path, min/max depth, filters, etc.
    3. Figure out what to do with empty paths. Not everything will be loaded via ReadFile.
    4. Absolute paths by default? Or at least dereferencing instances of ./../*?

    Other use case:

    1. In warpforge we just test formulas i.e. test.md:xx:formula and test.md:xx:formula/runrecord. The test could ask to describe the formula dir entirely, just the parent dir, or filter for runrecord.
  • started python-testmark

    Hi there. I was inspired by a talk at GPN, so I took this as an exercise. I'm relatively new to (collaborative) software writing, so I didn't know how to do a pull request or how that works. Also, there are two important things missing in python-testmark: some testing (ironically) and building it into a package. I will do this in the next few days (still learning), but for now I am happy to have you people take a quick look in case I overlooked something. I'm somewhat baffled that this only took 40 lines of code.

    Greetings https://github.com/iameru/python-testmark
