Dud

Website | Install | Getting Started | Source Code

Dud is a lightweight tool for versioning data alongside source code and building data pipelines. In practice, Dud extends many of the benefits of source control to large binary data.

With Dud, you can commit, checkout, fetch, and push large files and directories with a simple command line interface. Dud stores recipes (a.k.a. stages) for retrieving your data in small YAML files. These stages can be stored in source control to link your data to your code. On top of that, stages can run the commands to generate the data, sort of like Make. Stages can be chained together to create data pipelines. See the Getting Started guide for a hands-on overview.

Dud is pronounced "duhd", not "dood". Dud is not an acronym.

Motivation

Dud is heavily inspired by DVC. DVC addresses the need for data versioning and reproducibility, but its implementation is not without problems. My criticisms of DVC boil down to two things: speed and simplicity. By speed, I mean throughput and responsiveness. By simplicity, I mean doing less--both in project scope and amount of abstraction.

In terms of speed, Dud is generally much faster than DVC. In terms of simplicity, Dud has a smaller, more focused scope, and it is distributed as a single executable.

To summarize with an analogy: Dud is to DVC what Flask is to Django. Both Dud and DVC have their strengths. If you want a "batteries included" suite of tools for managing machine learning projects, DVC may be a good fit for you. If data management is your main area of need and you want something lightweight and fast, Dud may be a better fit.

To get down to brass tacks, read on.

Concrete differences with DVC

Dud does not manage experiments and/or metrics.

Dud is solely focused on versioning and reproducing data alongside source code. DVC's scope has grown to encompass a large portion of a traditional machine learning workflow. While an integrated suite of tools has its benefits, if UNIX is any guide, the composition of smaller, more focused tools generally yields more productivity than their monolithic counterparts. For example, there's no reason you couldn't use MLflow or Aim alongside Dud to track your experiments. Dud does not prescribe any solution for experiment tracking, and it doesn't try to enter the new, yet already crowded, marketplace for such tools.

Secondly, versioning data alongside source code is an incredibly useful concept in its own right. Domains beyond machine learning and data science (e.g. game development and digital design) may greatly benefit from this approach to data management without being burdened by extra baggage carried by a specific domain.

Dud commits must always be explicitly invoked; they are never side effects.

For both Dud and DVC, committing data to the cache is one of the most expensive operations that each tool undertakes (in terms of both run-time and I/O). Because of this, Dud puts the user in absolute control of when to commit data. In Dud, commits happen only when you run dud commit.

In contrast, DVC often commits automatically on your behalf as a side effect of other commands (for example, during dvc add and dvc repro). While DVC is trying to be helpful, these implicit commits are often accidental commits. For example, if you're rapidly iterating on a pipeline, you're likely running dvc repro or dvc run repeatedly as you develop. However, DVC will automatically commit the results each time you run dvc repro or dvc run--even if you are just debugging something or tweaking your code. Such accidental commits have a high cost; they turn "rapid development" into "development", and they bloat your cache. (You can disable DVC's implicit commits using the --no-commit flag, but you have to remember to type it each time, and DVC does not support enabling this flag by default, e.g. via configuration file.)

Dud checks out files as symbolic links by default.

When Dud checks out cached files into the workspace, it uses symbolic links (a.k.a. symlinks) by default. Symlinks have a number of benefits that make them an excellent choice for checkouts. First, symlinks require very little I/O to create, so dud checkout usually completes almost instantaneously. Second, symlinks transparently redirect to the cached files themselves, so data isn't duplicated between the workspace and the cache, and your storage space is used efficiently. Last but not least, symlinks make it trivial to check if a file is up-to-date (by checking the link target), so dud status can also be extremely fast.
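The three benefits above can be illustrated with a minimal sketch. It is not Dud's implementation: the function names, the SHA-256 hash, and the two-level cache layout are all assumptions chosen for illustration. It shows a file being moved into a content-addressed cache, checked out as a symlink, and verified by comparing the link target rather than re-reading the data.

```python
import hashlib
import os
import tempfile


def commit(path, cache_dir):
    """Move a file into a content-addressed cache and return its cache path.

    The hash algorithm and directory layout are hypothetical.
    """
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    cached = os.path.join(cache_dir, digest[:2], digest[2:])
    os.makedirs(os.path.dirname(cached), exist_ok=True)
    if os.path.exists(cached):
        os.remove(path)  # identical content is already in the cache
    else:
        os.replace(path, cached)  # a rename, so no data is copied
    return cached


def checkout(cached, path):
    """Place a symlink in the workspace that points at the cached file."""
    if os.path.lexists(path):
        os.remove(path)
    os.symlink(cached, path)


def is_up_to_date(path, cached):
    """Check the link target instead of hashing the file contents again."""
    return os.path.islink(path) and os.readlink(path) == cached


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        cache = os.path.join(root, "cache")
        data = os.path.join(root, "data.bin")
        with open(data, "wb") as f:
            f.write(b"example payload")
        cached = commit(data, cache)
        checkout(cached, data)
        print(is_up_to_date(data, cached))
```

In this sketch the status check costs one readlink call regardless of file size, which is the property that makes symlink-based status checks cheap.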

By default, DVC checks out files as hard copies. (Technically, DVC tries to use reflinks before copies, but very few filesystems support reflinks, so copies are far more likely to be the default.) With hard copies, the efficiencies listed above are not possible, so checkouts and status checks are inefficient by default. To its credit, DVC's cache can be configured to use symlinks, but arguably DVC's default cache configuration is not sensible for projects of any significant size.

Running a Dud pipeline never implicitly alters a stage's artifacts.

When you run a pipeline in DVC, DVC will remove all pipeline outputs before running the pipeline's command(s). While this can help ensure reproducible pipelines, it is another implicit behavior the user must consider, and it prevents the user from deciding when stage outputs can safely be reused.

If you don't want DVC to automatically remove outputs for you, you need to explicitly tell it each output you'd like to persist. However, by telling DVC to persist an output, DVC may perform a new and different automatic behavior. If you're using symbolic links (or hard links) for checkouts (which is generally a good idea; see above), DVC will "unprotect" all output links by replacing them with hard copies from the cache. Not only is this behavior surprising, it's also very costly in both runtime and storage.

Together, these two behaviors mean that, in a sensible DVC configuration, stages simply cannot reuse outputs efficiently; the user has little choice but to accept DVC's limitations.

When you run a pipeline in Dud, Dud doesn't do any implicit modification of existing files. Dud defers all modification of workspace files to the user. If you want a specific behavior, you should code it into your stage's command. For example, if you want to clear all outputs of a stage prior to it running, you can delete any outputs at the beginning of your command's script. If you want to reuse outputs, you can check for preexisting outputs in your script and choose not to recreate them. Dud's minimalist approach results in a stage's command entirely owning its own reproducibility; the responsibility is not awkwardly shared between the stage and the tool.
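As a rough illustration of a command that owns its output handling, here is a sketch of a script a stage might call. Everything in it is hypothetical (the run_stage helper, the reuse switch, the build callable); it only shows the two options described above: clear the outputs before running, or detect existing outputs and skip the work.

```python
import os
import shutil
import tempfile


def run_stage(output_dir, build, reuse=False):
    """Run a stage's work while managing the stage's own outputs.

    build is a callable that writes its results into output_dir.
    Returns "reused" if existing outputs were kept, else "rebuilt".
    """
    if reuse and os.path.isdir(output_dir) and os.listdir(output_dir):
        # Outputs already exist and the script chooses to keep them.
        return "reused"
    # Otherwise start from a clean slate so stale files cannot leak in.
    if os.path.isdir(output_dir):
        shutil.rmtree(output_dir)
    os.makedirs(output_dir)
    build(output_dir)
    return "rebuilt"


if __name__ == "__main__":
    def build(directory):
        with open(os.path.join(directory, "result.txt"), "w") as f:
            f.write("ok")

    with tempfile.TemporaryDirectory() as root:
        out = os.path.join(root, "out")
        print(run_stage(out, build))
        print(run_stage(out, build, reuse=True))
```

Because the policy lives in the script, switching a stage from "always rebuild" to "reuse when present" is a one-argument change in the stage's own code rather than a tool setting.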

Dud delegates remote cache management to Rclone.

Rclone is a very popular command-line tool which describes itself as "The Swiss army knife of cloud storage." At the time of writing, Rclone has more than 28,000 stars on GitHub. Rclone supports just about any cloud storage provider you've possibly heard of (S3, GCS, Dropbox, and Backblaze, to name a few). This is all to say: Rclone is a top-tier choice for moving data around the internet.

Dud internally calls Rclone for all of its remote cache functionality, such as dud fetch and dud push. But Dud doesn't hide the Rclone abstraction entirely. Dud exposes its Rclone configuration file, and it's expected and encouraged that users will use Rclone directly to configure remote storage or interact with their remote data. By using Rclone, Dud's remote cache interface immediately gains the benefit of years of open-source development and a rich, well-documented CLI. This is an example of how Dud embraces the UNIX philosophy and the composition of single-focus tools, as stated above.
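To make the delegation idea concrete, here is a small sketch of how a tool could assemble an rclone copy invocation for pushing a local cache to a remote. It is not Dud's actual invocation. The helper name, the cache path, the remote name, and the file list are placeholders; only the rclone copy subcommand and its --files-from flag are real Rclone features. The sketch builds and prints the command without running it.

```python
import shlex


def rclone_copy_command(source, destination, files_from=None):
    """Build an `rclone copy` argument list (hypothetical helper).

    files_from, if given, names a text file listing the paths to transfer.
    """
    cmd = ["rclone", "copy"]
    if files_from is not None:
        cmd += ["--files-from", files_from]
    cmd += [source, destination]
    return cmd


if __name__ == "__main__":
    # All three arguments below are placeholder values.
    cmd = rclone_copy_command(
        "path/to/cache", "myremote:bucket/cache", files_from="to_push.txt"
    )
    print(shlex.join(cmd))
```

Passing the argument list to subprocess.run would execute the transfer; the same command can also be typed directly in a shell, which is the kind of direct Rclone use encouraged above.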

In contrast, DVC stitches together various Python packages to support a modest assortment of cloud storage options. At the time of writing, DVC 2.6 supports eleven cloud storage providers, and Rclone 1.56 supports more than fifty. But the number of cloud storage options isn't the critical disadvantage of DVC's approach. (Both Dud and DVC support the biggest players, such as S3 and GCS.) DVC's critical disadvantage is that its developers must build and maintain most of their remote data management stack themselves. If Rclone is any indication, cloud data transfer is a very hard problem, and DVC's developers have their work cut out for them.

In summary, Dud leverages the deep knowledge and effort of the Rclone developers to provide a robust and familiar remote cache experience. DVC plots their own course, and in doing so incurs a steep development cost.

Dud does not use analytics. (And it never will.)

By default, DVC enables embedded analytics. I strongly disagree with this practice, especially in free and open-source software. I will never embed analytics in Dud.

Contributing

See CONTRIBUTING.md.

License

BSD-3-Clause. See LICENSE.

Comments
  • deps: bump github.com/spf13/viper from 1.13.0 to 1.14.0

    deps: bump github.com/spf13/viper from 1.13.0 to 1.14.0

    Bumps github.com/spf13/viper from 1.13.0 to 1.14.0.

    Release notes

    Sourced from github.com/spf13/viper's releases.

    v1.14.0

    What's Changed

    Enhancements 🚀

    Breaking Changes 🛠

    Dependency Updates ⬆️

    Full Changelog: https://github.com/spf13/viper/compare/v1.13.0...v1.14.0

    Commits
    • b89e554 chore: update crypt
    • db9f89a chore: disable watch on appengine
    • 4b8d148 refactor: use new Has fsnotify method for event matching
    • 2e99a57 refactor: rename watch file to unsupported
    • dcb7f30 feat: fix compilation for all platforms unsupported by fsnotify
    • 2e04739 ci: drop dedicated wasm build
    • b2234f2 ci: add build for aix
    • 52009d3 feat: disable watcher on aix
    • b274f63 build(deps): bump github.com/fsnotify/fsnotify from 1.5.4 to 1.6.0
    • 7c62cfd build(deps): bump github.com/stretchr/testify from 1.8.0 to 1.8.1
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump hugo/themes/book from `317ccae` to `3d2bfec`

    deps: bump hugo/themes/book from `317ccae` to `3d2bfec`

    Bumps hugo/themes/book from 317ccae to 3d2bfec.

    Commits

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/spf13/cobra from 1.5.0 to 1.6.1

    deps: bump github.com/spf13/cobra from 1.5.0 to 1.6.1

    Bumps github.com/spf13/cobra from 1.5.0 to 1.6.1.

    Release notes

    Sourced from github.com/spf13/cobra's releases.

    v1.6.1

    Bug fixes 🐛

    • Fixes a panic when AddGroup isn't called before AddCommand(my-sub-command) is executed. This can happen within more complex cobra file structures that have many different inits to be executed. Now, the check for groups has been moved to ExecuteC and provides more flexibility when working with grouped commands - @​marckhouzam (and shout out to @​aawsome, @​andig and @​KINGSABRI for a deep investigation into this! 👏🏼)

    v1.6.0

    Summer 2022 Release

    Some exciting changes make their way to Cobra! Command completions continue to get better and better (including adding --help and --version automatic flags to the completions list). Grouping is now possible in your help output as well! And you can now use the OnFinalize method to cleanup things when all "work" is done. Checkout the full changelog below:


    Features 🌠

    Deprecation 👎🏼

    • ExactValidArgs is deprecated (but not being removed entirely). This is abit nuanced, so checkout #1643 for further information and the updated user_guide.md on how this may affect you (and how you can take advantage of the correct behavior in the validators): @​umarcor #1643

    Bug fixes 🐛

    Dependencies 🗳️

    Testing 🤔

    Docs ✏️

    Misc 💭

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/stretchr/testify from 1.8.0 to 1.8.1

    deps: bump github.com/stretchr/testify from 1.8.0 to 1.8.1

    Bumps github.com/stretchr/testify from 1.8.0 to 1.8.1.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/spf13/cobra from 1.5.0 to 1.6.0

    deps: bump github.com/spf13/cobra from 1.5.0 to 1.6.0

    Bumps github.com/spf13/cobra from 1.5.0 to 1.6.0.

    Release notes

    Sourced from github.com/spf13/cobra's releases.

    v1.6.0

    Summer 2022 Release

    Some exciting changes make their way to Cobra! Command completions continue to get better and better (including adding --help and --version automatic flags to the completions list). Grouping is now possible in your help output as well! And you can now use the OnFinalize method to cleanup things when all "work" is done. Checkout the full changelog below:


    Features 🌠

    Deprecation 👎🏼

    • ExactValidArgs is deprecated (but not being removed entirely). This is abit nuanced, so checkout #1643 for further information and the updated user_guide.md on how this may affect you (and how you can take advantage of the correct behavior in the validators): @​umarcor #1643

    Bug fixes 🐛

    Dependencies 🗳️

    Testing 🤔

    Docs ✏️

    Misc 💭

    Note: Per #1804, we will be moving away from "seasonal" releases and doing more generic point release targets. Continue to track the milestones and issues in the spf13/cobra GitHub repository for more information!

    Great work everyone! Cobra would never be possible without your contributions! 🐍

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump hugo/themes/book from `317ccae` to `6090fdd`

    deps: bump hugo/themes/book from `317ccae` to `6090fdd`

    Bumps hugo/themes/book from 317ccae to 6090fdd.

    Commits

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump go.uber.org/goleak from 1.1.12 to 1.2.0

    deps: bump go.uber.org/goleak from 1.1.12 to 1.2.0

    Bumps go.uber.org/goleak from 1.1.12 to 1.2.0.

    Release notes

    Sourced from go.uber.org/goleak's releases.

    v1.2.0

    Added

    • Add Cleanup option that can be used for registering cleanup callbacks. (#78)

    Changed

    • Mark VerifyNone as a test helper. (#75)

    Thanks to @​tallclair for their contribution to this release.

    Changelog

    Sourced from go.uber.org/goleak's changelog.

    1.2.0

    Added

    • Add Cleanup option that can be used for registering cleanup callbacks. (#78)

    Changed

    • Mark VerifyNone as a test helper. (#75)

    Thanks to @​tallclair for their contribution to this release.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/spf13/viper from 1.12.0 to 1.13.0

    deps: bump github.com/spf13/viper from 1.12.0 to 1.13.0

    Bumps github.com/spf13/viper from 1.12.0 to 1.13.0.

    Release notes

    Sourced from github.com/spf13/viper's releases.

    v1.13.0

    Important: This is the last release supporting Go 1.15.

    What's Changed

    Exciting New Features 🎉

    Enhancements 🚀

    Bug Fixes 🐛

    Dependency Updates ⬆️

    New Contributors

    Full Changelog: https://github.com/spf13/viper/compare/v1.12.0...v1.13.0

    Commits
    • 57cc9a0 test: fix ini tests
    • 8030d5b build(deps): bump gopkg.in/ini.v1 from 1.66.4 to 1.67.0
    • 312417a Add a DebugTo convenience funtion
    • 202060b Adds support for uint16 with GetUint16
    • 97591f0 build: fix lint violations
    • 9af8dae ci: upgrade golangci-lint
    • 7b4f2b2 ci: add Go 1.19 to CI
    • 601ec81 test: fix toml tests
    • d7f4832 build(deps): bump github.com/pelletier/go-toml/v2 from 2.0.2 to 2.0.5
    • c2f42f3 build(deps): bump github.com/subosito/gotenv from 1.4.0 to 1.4.1
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/google/go-cmp from 0.5.8 to 0.5.9

    deps: bump github.com/google/go-cmp from 0.5.8 to 0.5.9

    Bumps github.com/google/go-cmp from 0.5.8 to 0.5.9.

    Release notes

    Sourced from github.com/google/go-cmp's releases.

    v0.5.9

    Reporter changes:

    • (#299) Adjust heuristic for line-based versus byte-based diffing
    • (#306) Use value.TypeString in PathStep.String

    Code cleanup changes:

    • (#297) Use reflect.Value.IsZero
    • (#304) Format with Go 1.19 formatter
    • (#300 )Fix typo in Result documentation
    • (#302) Pre-declare global type variables
    • (#309) Run tests on Go 1.19
    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/felixge/fgprof from 0.9.2 to 0.9.3

    deps: bump github.com/felixge/fgprof from 0.9.2 to 0.9.3

    Bumps github.com/felixge/fgprof from 0.9.2 to 0.9.3.

    Release notes

    Sourced from github.com/felixge/fgprof's releases.

    v0.9.3

    • f1f92dd feat: Support line numbers in pprof (#22)
    • 3cad799 Set time and duration of profile (#18)
    • 1fa9aa6 Populate PeriodType appropriately (#17)
    • b0f80df README: Update for Go 1.19
    • 3eac545 Set Period to the number of nanoseconds between samples
    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • deps: bump github.com/mattn/go-isatty from 0.0.14 to 0.0.16

    deps: bump github.com/mattn/go-isatty from 0.0.14 to 0.0.16

    Bumps github.com/mattn/go-isatty from 0.0.14 to 0.0.16.

    Commits
    • 13e91bf bump
    • 47c6087 update GitHub Workflow
    • f001b72 Merge pull request #75 from tjni/update-x-sys
    • 89699b9 Update golang.org/x/sys for Go 1.18 on M1 Macs.
    • cdb00f1 Merge pull request #68 from tklauser/fix-go.sum
    • 689cfc2 Update go.sum after golang.org/x/sys update
    • d423e9c Merge pull request #67 from tklauser/fix-go-version-gh-action
    • 65c33a1 Use correct field to specify Go version in GitHub action
  • deps: bump hugo/themes/book from `317ccae` to `d5b75f4`

    Bumps hugo/themes/book from 317ccae to d5b75f4.

  • status: add flag to skip byte-level equality checks

    When you have checked out hard copies of files (or generated new files, e.g., as part of a stage run), dud status can be extremely slow because it checks each hard copy byte-for-byte against the cache. The user should be able to skip this check to get a more responsive (albeit incomplete) status update.

  • add diff command to summarize the differences between two cached artifacts

    Usage:

    dud diff <checksum_a> <checksum_b>
    

    or

    dud diff <path_to_cached_artifact_a> <path_to_cached_artifact_b>
    

    If the cached artifacts are directory manifests, the directory artifacts are recursively loaded and a diff of the full structure is displayed. (cmp.Diff could be used to accomplish this.)

    If the cached artifacts are NOT directory manifests, the location of the first difference is displayed (e.g., "first difference detected at byte X").

    If one cached artifact is a directory manifest and the other is a binary file, report that mismatch instead of a diff.

  • add flag to dry-run impactful commands

    At a minimum, remote cache commands (e.g., fetch, push, pull) should support some form of dry-run. Commit and checkout could also benefit from a dry-run flag, but they are not as critical.

CUE is an open source data constraint language which aims to simplify tasks involving defining and using data.

Jan 1, 2023
Open source framework for processing, monitoring, and alerting on time series data

Kapacitor Open source framework for processing, monitoring, and alerting on time series data Installation Kapacitor has two binaries: kapacitor – a CL

Dec 24, 2022
Prometheus Common Data Exporter can parse JSON, XML, YAML or other format data from various sources (such as HTTP response message, local file, TCP response message and UDP response message) into Prometheus metric data.

Prometheus Common Data Exporter parses JSON, XML, YAML, or other formats of data from multiple sources (such as HTTP response messages, local files, TCP response messages, and UDP response messages) into Prometheus metric data.

May 18, 2022
Baker is a high performance, composable and extendable data-processing pipeline for the big data era

Baker is a high performance, composable and extendable data-processing pipeline for the big data era. It shines at converting, processing, extracting or storing records (structured data), applying whatever transformation between input and output through easy-to-write filters.

Dec 14, 2022
sq is a command line tool that provides jq-style access to structured data sources such as SQL databases, or document formats like CSV or Excel.

sq: swiss-army knife for data sq is a command line tool that provides jq-style access to structured data sources such as SQL databases, or document fo

Jan 1, 2023
This project is meant to make you code a digital version of an ant farm

This project is meant to make you code a digital version of an ant farm. Create a program lem-in that will read from a file (describing the ants and the colony) given in the arguments. Upon successfully finding the quickest path, lem-in will display the content of the file passed as argument and each move the ants make from room to room. How does it work? You make an ant farm with tunnels and rooms. You place the ants on one side and look at how they find the exit.

Dec 24, 2021
DEPRECATED: Data collection and processing made easy.

This project is deprecated. Please see this email for more details. Heka Data Acquisition and Processing Made Easy Heka is a tool for collecting and c

Nov 30, 2022
Kanzi is a modern, modular, expandable and efficient lossless data compressor implemented in Go.

kanzi Kanzi is a modern, modular, expandable and efficient lossless data compressor implemented in Go. modern: state-of-the-art algorithms are impleme

Dec 22, 2022
churro is a cloud-native Extract-Transform-Load (ETL) application designed to build, scale, and manage data pipeline applications.

Churro - ETL for Kubernetes churro is a cloud-native Extract-Transform-Load (ETL) application designed to build, scale, and manage data pipeline appli

Mar 10, 2022
Dev Lake is the one-stop solution that integrates, analyzes, and visualizes software development data

Dev Lake is the one-stop solution that integrates, analyzes, and visualizes software development data throughout the software development life cycle (SDLC) for engineering teams.

Dec 30, 2022
A library for performing data pipeline / ETL tasks in Go.

Ratchet A library for performing data pipeline / ETL tasks in Go. The Go programming language's simplicity, execution speed, and concurrency support m

Jan 19, 2022
A distributed, fault-tolerant pipeline for observability data

Table of Contents What Is Veneur? Use Case See Also Status Features Vendor And Backend Agnostic Modern Metrics Format (Or Others!) Global Aggregation

Dec 25, 2022
Data syncing in golang for ClickHouse.

ClickHouse Data Synchromesh Data syncing in golang for ClickHouse. based on go-zero ARCH A typical data warehouse architecture design of data sync Aut

Jan 1, 2023
Machine is a library for creating data workflows.

Machine is a library for creating data workflows. These workflows can be either very concise or quite complex, even allowing for cycles for flows that need retry or self healing mechanisms.

Dec 26, 2022
Stream data into Google BigQuery concurrently using InsertAll() or BQ Storage.

bqwriter A Go package to write data into Google BigQuery concurrently with a high throughput. By default the InsertAll() API is used (REST API under t

Dec 16, 2022
Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or distributedly.

Gleam Gleam is a high performance and efficient distributed execution system, and also simple, generic, flexible and easy to customize. Gleam is built

Jan 5, 2023
Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more

Gonum Installation The core packages of the Gonum suite are written in pure Go with some assembly. Installation is done using go get. go get -u gonum.

Dec 29, 2022
Graphik is a Backend as a Service implemented as an identity-aware document & graph database with support for gRPC and graphQL

Graphik is a Backend as a Service implemented as an identity-aware, permissioned, persistent document/graph database & pubsub server written in Go.

Dec 30, 2022