On-line Machine Learning in Go (and so much more)

goml

Golang Machine Learning, On The Wire

goml is a machine learning library written entirely in Golang that lets the average developer build machine learning into their applications. The name is pronounced like the data format 'toml'.

While the models include traditional batch learning interfaces, many goml models also let you learn in an online, reactive manner by passing data to streams held on channels.
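
As a rough illustration of what learning from a stream held on a channel can look like, here is a minimal, self-contained sketch. It does not use goml's API: the `Datapoint` and `Perceptron` types, the `OnlineLearn` method, and the toy dataset are all assumptions made up for this example. The model updates its weights one datapoint at a time as values arrive on the channel, and stops when the channel is closed.

```go
package main

import "fmt"

// Datapoint is a hypothetical training example: features X and a label Y of -1 or +1.
type Datapoint struct {
	X []float64
	Y float64
}

// Perceptron is a toy linear classifier; Weights[0] is the bias term.
type Perceptron struct {
	Alpha   float64
	Weights []float64
}

// Predict returns +1 when the weighted sum is non-negative and -1 otherwise.
func (p *Perceptron) Predict(x []float64) float64 {
	sum := p.Weights[0]
	for i, v := range x {
		sum += p.Weights[i+1] * v
	}
	if sum >= 0 {
		return 1
	}
	return -1
}

// OnlineLearn applies the perceptron update to each misclassified
// datapoint as it arrives, and returns when the stream is closed.
func (p *Perceptron) OnlineLearn(stream <-chan Datapoint) {
	for d := range stream {
		if p.Predict(d.X) == d.Y {
			continue
		}
		p.Weights[0] += p.Alpha * d.Y
		for i, v := range d.X {
			p.Weights[i+1] += p.Alpha * d.Y * v
		}
	}
}

// trainOnStream pushes a small separable dataset through a channel:
// x <= 2 is labelled -1 and x >= 8 is labelled +1.
func trainOnStream() *Perceptron {
	model := &Perceptron{Alpha: 1, Weights: make([]float64, 2)}
	stream := make(chan Datapoint)
	done := make(chan struct{})
	go func() {
		defer close(done)
		model.OnlineLearn(stream)
	}()
	data := []Datapoint{
		{X: []float64{0}, Y: -1}, {X: []float64{1}, Y: -1}, {X: []float64{2}, Y: -1},
		{X: []float64{8}, Y: 1}, {X: []float64{9}, Y: 1}, {X: []float64{10}, Y: 1},
	}
	for epoch := 0; epoch < 500; epoch++ {
		for _, d := range data {
			stream <- d
		}
	}
	close(stream)
	<-done // wait until the learner has drained the stream
	return model
}

// describe formats the model's predictions for two inputs.
func describe(m *Perceptron) string {
	return fmt.Sprintf("x=1 -> %v, x=9 -> %v", m.Predict([]float64{1}), m.Predict([]float64{9}))
}

func main() {
	fmt.Println(describe(trainOnStream()))
}
```

The producer here is a simple loop, but it could just as well be a goroutine reading from a socket or a queue; the learner only sees the channel.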

The library includes comprehensive tests, extensive documentation, and clean, expressive, modular source code. Community contribution is heavily encouraged.

Each package (mentioned below) includes its own README describing the function and purpose of its models. Above all, if you want to learn about the models, read the GoDoc reference for the package. All models are, as mentioned above, heavily documented.

Installation

go get github.com/cdipaolo/goml/base

# This could be any other model package if you want
#
# Also, the base package is imported already
# by many of the packages so you might not even
# need to `go get` the package explicitly
go get github.com/cdipaolo/goml/perceptron

Documentation

All the code is well documented, and the source should be readable if you'd like to make sense of it all. Look at each package on GitHub and you will see a link to GoDoc as well as an explanation of the package and an example usage. You can click on the main bullets below to go to those packages, or use the GoDoc link at the top of this README and navigate to the package you'd like to learn more about.

Sub-bullets below will take you directly to the source code of the model.

Currently Implemented Models

Contributing!

See CONTRIBUTING.

I'd love help with any of this: implementing a model that isn't here, improving the current models, or helping with documentation (this would be greatly appreciated; believe me, writing great documentation takes time! 👍)

LICENSE - MIT

See LICENSE.

Owner
Conner DiPaolo
Business Manager at Citadel GQS
Comments
  • Bayes.go tokenizer breaks the sentiment restoration

    #12 breaks restorations that don't call NewNaiveBayes()

    Sentiment is broken because it doesn't set the tokenizer. Should the tokenizer somehow be set when it is un-marshaled?
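
One common way to handle a field that cannot be serialized, such as an interface-typed tokenizer, is to restore a default when the model is unmarshaled. The sketch below is an assumption-laden illustration, not goml's code: `Tokenizer`, `whitespaceTokenizer`, `Model`, and `restore` are hypothetical names invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Tokenizer is a hypothetical interface for splitting text into tokens.
type Tokenizer interface {
	Tokenize(s string) []string
}

// whitespaceTokenizer is a hypothetical default implementation.
type whitespaceTokenizer struct{}

func (whitespaceTokenizer) Tokenize(s string) []string { return strings.Fields(s) }

// Model skips Tokenizer during JSON encoding, because an interface
// value cannot be rebuilt from the serialized data alone.
type Model struct {
	Classes   int       `json:"classes"`
	Tokenizer Tokenizer `json:"-"`
}

// UnmarshalJSON decodes the plain fields, then fills in the default
// tokenizer so that a restored model is usable without a constructor.
func (m *Model) UnmarshalJSON(data []byte) error {
	type plain Model // plain has no UnmarshalJSON method, so this does not recurse
	var p plain
	if err := json.Unmarshal(data, &p); err != nil {
		return err
	}
	*m = Model(p)
	if m.Tokenizer == nil {
		m.Tokenizer = whitespaceTokenizer{}
	}
	return nil
}

// restore rebuilds a model from its JSON form.
func restore(data []byte) (*Model, error) {
	m := &Model{}
	if err := json.Unmarshal(data, m); err != nil {
		return nil, fmt.Errorf("restore: %w", err)
	}
	return m, nil
}

func main() {
	m, err := restore([]byte(`{"classes":2}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(m.Classes, m.Tokenizer.Tokenize("so much fun"))
}
```

The trade-off is that a custom tokenizer still has to be reattached by the caller after restoring, since only the default can be recreated automatically.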

  • Remove `fmt.Printf`s?

    Hello!

    Great library. I noticed during tests that the code decides to just fmt.Printf. I don't want the ML lib in my app to be outputting to the console without me knowing. Can we disable that? Or provide a way to provide an alternate io.Writer?

    Thanks!
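
The alternate `io.Writer` idea raised above could look roughly like the following. This is a hypothetical sketch, not goml's API: the `Model` type, its `Output` field, and `Learn` are made up for illustration. Progress messages go to whatever writer the caller configures, with standard output as the default.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// Model is a hypothetical learner whose progress messages go to Output.
type Model struct {
	Output io.Writer
}

// NewModel returns a model that logs to standard output by default.
func NewModel() *Model {
	return &Model{Output: os.Stdout}
}

// Learn pretends to train, writing one progress line per iteration.
func (m *Model) Learn(iterations int) {
	for i := 1; i <= iterations; i++ {
		fmt.Fprintf(m.Output, "iteration %d complete\n", i)
	}
}

// captureLearn runs Learn with output redirected into a buffer.
func captureLearn(iterations int) string {
	var buf bytes.Buffer
	m := NewModel()
	m.Output = &buf
	m.Learn(iterations)
	return buf.String()
}

func main() {
	quiet := NewModel()
	quiet.Output = io.Discard // silence the model entirely
	quiet.Learn(3)

	fmt.Print(captureLearn(2)) // the caller decides where the text ends up
}
```

Setting the writer to `io.Discard` silences the model, while a `bytes.Buffer` or a logger-backed writer lets the application route the messages wherever it wants.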

  • fmt.Errorf format %v reads arg #2, but call has 1 arg

    This line in kmeans triggers the error "fmt.Errorf format %v reads arg #2, but call has 1 arg" while running tests.

    A simple fix would be to replace the line in question

    errors <- fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", point)
    

    with this

    errors <- fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", centroids, point)
    

    Follow-up question: is this project in active development?
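
To see what the mismatch reported above does at run time, the following standalone sketch formats the same message with one argument and with two. The helper names and the `dims` parameter are hypothetical; the format string is held in a variable only so that the deliberately broken call is not rejected by vet's printf check when the example is built.

```go
package main

import (
	"fmt"
	"strings"
)

// format is a variable rather than a literal so that the deliberately
// broken call below is not rejected by vet's printf check.
var format = "ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v"

// brokenMessage supplies one argument for two %v verbs.
func brokenMessage(point []float64) error {
	return fmt.Errorf(format, point)
}

// fixedMessage supplies one argument per verb.
func fixedMessage(dims int, point []float64) error {
	return fmt.Errorf(format, dims, point)
}

// hasMissingArg reports whether fmt had to fill in a missing argument.
func hasMissingArg(err error) bool {
	return strings.Contains(err.Error(), "%!v(MISSING)")
}

func main() {
	point := []float64{1, 2}
	fmt.Println(brokenMessage(point)) // the point lands in the "len" slot
	fmt.Println(fixedMessage(3, point))
}
```

With a single argument, the point is consumed by the first verb and the second verb is rendered as `%!v(MISSING)`, so the message is misleading rather than fatal.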

  • add concurrency-friendly map access to fix #8

    Created a new type histogram that couples a sync.RWMutex and the existing map. The type itself isn't exported (no one should be instantiating these, right?), but its Get and Set methods are. This is a significant change for consumers that create their own NaiveBayes struct without calling NewNaiveBayes.
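
A minimal sketch of the pattern described above, an unexported type that couples a `sync.RWMutex` with a map and exposes `Get` and `Set`, might look like this. The field names, the key and value types, and the extra `Increment` and `countConcurrently` helpers are assumptions for illustration, not the pull request's code.

```go
package main

import (
	"fmt"
	"sync"
)

// histogram is a hypothetical unexported type pairing a map with a lock.
type histogram struct {
	mu     sync.RWMutex
	counts map[string]uint64
}

func newHistogram() *histogram {
	return &histogram{counts: make(map[string]uint64)}
}

// Get reads a count under the shared (read) lock.
func (h *histogram) Get(word string) (uint64, bool) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	c, ok := h.counts[word]
	return c, ok
}

// Set writes a count under the exclusive (write) lock.
func (h *histogram) Set(word string, count uint64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.counts[word] = count
}

// Increment does its read-modify-write under a single lock; calling
// Get and then Set separately would let two writers lose an update.
func (h *histogram) Increment(word string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.counts[word]++
}

// countConcurrently has several goroutines count the same words at once.
func countConcurrently(words []string, workers int) *histogram {
	h := newHistogram()
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, word := range words {
				h.Increment(word)
			}
		}()
	}
	wg.Wait()
	return h
}

// report formats the count for one word.
func report(h *histogram, word string) string {
	c, _ := h.Get(word)
	return fmt.Sprintf("%s: %d", word, c)
}

func main() {
	h := countConcurrently([]string{"good", "bad", "good"}, 8)
	fmt.Println(report(h, "good"))
}
```

Because the map is only reachable through the constructor, a struct literal created by hand would hold a nil map and panic on the first write, which is the breaking change for consumers that the comment points out.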

  • Examples

    I'd like to learn more about machine learning and this library looks like a good place to start building something with. Are there any examples you could post to demonstrate some simple use cases?

  • Add go.mod

    Thought I'd add this to make the project compatible with Go modules. This fixes a build issue where go get fails. Now the build should (in theory) only fail if tests don't pass.

  • Why do Predict and Probability functions use different operators?

    I understand why the Naive Bayes "Predict" function uses a math.Log() to avoid an underflow. I don't understand why on lines 288 and 293 the operator is += instead of *=... Could you provide an explanation? Maybe an update to the docs?

  • handle unicode in sanitization functions

    I renamed the old functions so that everything now 'defaults' to the unicode-friendly versions. I also changed the range of digits accepted by OnlyAsciiWordsAndNumbers and refactored the tests to make what each one was testing more obvious.
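
The difference between an ASCII-only filter and a Unicode-aware one can be sketched as follows. The function names loosely echo the ones mentioned above, but their signatures and the `sanitize` helper are assumptions for illustration and may not match the library.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// onlyASCIIWordsAndNumbers (hypothetical) keeps a-z, A-Z, 0-9 and spaces.
func onlyASCIIWordsAndNumbers(r rune) bool {
	switch {
	case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == ' ':
		return true
	}
	return false
}

// onlyWordsAndNumbers (hypothetical) keeps any Unicode letter or digit, plus spaces.
func onlyWordsAndNumbers(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == ' '
}

// sanitize drops every rune for which keep returns false.
func sanitize(s string, keep func(rune) bool) string {
	return strings.Map(func(r rune) rune {
		if keep(r) {
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

// compare shows both filters applied to the same input.
func compare(s string) string {
	return fmt.Sprintf("ascii: %q, unicode: %q",
		sanitize(s, onlyASCIIWordsAndNumbers), sanitize(s, onlyWordsAndNumbers))
}

func main() {
	fmt.Println(compare("caf\u00e9 42!"))
}
```

The ASCII filter silently removes the accented letter, which changes the token, while the Unicode-aware filter keeps it and only strips the punctuation.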

  • Fix persistence & restoration of Naive Bayes models

    It seems that #9 has caused projects using github.com/cdipaolo/sentiment to break. I've found out the reason is that the new concurrentMap is no longer a simple map that Go knows how to un/marshal out of the box.

    This pull request implements the necessary functions for concurrentMap so that the Restore() function works again.

  • Text models, uint8 for number of classes?

    I don't know that much at the moment about ML so pardon me if this is ignorant. Is there a reason that the number of classes for text classification is limited to 255 via uint8? Would it be possible to increase this?

  • Any interest in XGBoost?

    Hello, I have an experimental high performance XGBoost (tree_method=exact only) implementation here:

    • https://github.com/Statfactory/cortado (python + llvm)
    • https://github.com/Statfactory/cortado-fs (F#)
    • https://github.com/Statfactory/JuML.jl (Julia)

    I could port it to golang with some help if there is interest:)

    Adam

  • TFIDF doesn't work

    TFIDF doesn't work unless we actually save the DocsSeen value in the Bayes model.

    Currently the struct for Word doesn't do this.

    type Word struct {
        Count    []uint64
        Seen     uint64
        DocsSeen uint64 `json:"-"`
    }

    Should be:

    type Word struct {
        Count    []uint64
        Seen     uint64
        DocsSeen uint64
    }
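
The effect of the `json:"-"` tag can be checked with a small round trip through `encoding/json`. The two struct types below mirror the definitions quoted above, under hypothetical names, and `roundTrip` is a helper made up for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wordIgnored mirrors the first definition: DocsSeen is excluded from JSON.
type wordIgnored struct {
	Count    []uint64
	Seen     uint64
	DocsSeen uint64 `json:"-"`
}

// wordKept mirrors the proposed definition: DocsSeen is serialized.
type wordKept struct {
	Count    []uint64
	Seen     uint64
	DocsSeen uint64
}

// roundTrip marshals in to JSON and unmarshals the result into out.
func roundTrip(in, out any) error {
	data, err := json.Marshal(in)
	if err != nil {
		return fmt.Errorf("marshal: %w", err)
	}
	if err := json.Unmarshal(data, out); err != nil {
		return fmt.Errorf("unmarshal: %w", err)
	}
	return nil
}

func main() {
	var ignored wordIgnored
	if err := roundTrip(wordIgnored{Count: []uint64{1, 2}, Seen: 3, DocsSeen: 2}, &ignored); err != nil {
		fmt.Println(err)
		return
	}
	var kept wordKept
	if err := roundTrip(wordKept{Count: []uint64{1, 2}, Seen: 3, DocsSeen: 2}, &kept); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("with json:\"-\":", ignored.DocsSeen, "without:", kept.DocsSeen)
}
```

With the tag in place the field comes back as zero after a save and restore, so any statistic that depends on the document count is computed from a reset value.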

  • Roadmap / Comparison to other Go ML libraries

    How does goml compare to some of the other Go libraries in terms of product vision / roadmap?

    • https://github.com/sjwhitworth/golearn
    • https://github.com/alonsovidales/go_ml

    There's a decent amount of overlap in terms of the implemented algorithms / models. Is your goal to eventually include all of the other types (neural networks, collaborative filtering, etc)? It seems like the stated goal of being more stream oriented than batch oriented differentiates this library too.

    At the end of the day, this seems like the most active repo with an exciting direction. I'm very curious to know where you plan on taking things.

  • Comparison with Weka, others?

    It would be very useful to compare performance (run time, memory used) with other commonly used machine learning libraries and frameworks, such as Weka and Apache Mahout.
