A well-tested and comprehensive Golang statistics package with no dependencies.

If you have any suggestions, problems or bug reports, please create an issue and I'll do my best to accommodate you. In addition, simply starring the repo would show your support for the project and be very much appreciated!


go get github.com/montanaflynn/stats

Example Usage

All the functions can be seen in examples/main.go but here's a little taste:

// start with some source data to use
data := []float64{1.0, 2.1, 3.2, 4.823, 4.1, 5.8}

// you could also use different types like this
// data := stats.LoadRawData([]int{1, 2, 3, 4, 5})
// data := stats.LoadRawData([]interface{}{1.1, "2", 3})
// etc...

median, _ := stats.Median(data)
fmt.Println(median) // 3.65

roundedMedian, _ := stats.Round(median, 0)
fmt.Println(roundedMedian) // 4


The entire API documentation is available on GoDoc.org or pkg.go.dev.

You can also view docs offline with the following commands:

# Command line
godoc .              # show all exported apis
godoc . Median       # show a single function
godoc -ex . Round    # show function with example
godoc . Float64Data  # show the type and methods

# Local website
godoc -http=:4444    # start the godoc server on port 4444
open http://localhost:4444/pkg/github.com/montanaflynn/stats/

The exported API is as follows:

var (
    ErrEmptyInput = statsError{"Input must not be empty."}
    ErrNaN        = statsError{"Not a number."}
    ErrNegative   = statsError{"Must not contain negative values."}
    ErrZero       = statsError{"Must not contain zero values."}
    ErrBounds     = statsError{"Input is outside of range."}
    ErrSize       = statsError{"Must be the same length."}
    ErrInfValue   = statsError{"Value is infinite."}
    ErrYCoord     = statsError{"Y Value must be greater than zero."}
)

func Round(input float64, places int) (rounded float64, err error) {}

type Float64Data []float64

func LoadRawData(raw interface{}) (f Float64Data) {}

func AutoCorrelation(data Float64Data, lags int) (float64, error) {}
func ChebyshevDistance(dataPointX, dataPointY Float64Data) (distance float64, err error) {}
func Correlation(data1, data2 Float64Data) (float64, error) {}
func Covariance(data1, data2 Float64Data) (float64, error) {}
func CovariancePopulation(data1, data2 Float64Data) (float64, error) {}
func CumulativeSum(input Float64Data) ([]float64, error) {}
func Entropy(input Float64Data) (float64, error) {}
func EuclideanDistance(dataPointX, dataPointY Float64Data) (distance float64, err error) {}
func GeometricMean(input Float64Data) (float64, error) {}
func HarmonicMean(input Float64Data) (float64, error) {}
func InterQuartileRange(input Float64Data) (float64, error) {}
func ManhattanDistance(dataPointX, dataPointY Float64Data) (distance float64, err error) {}
func Max(input Float64Data) (max float64, err error) {}
func Mean(input Float64Data) (float64, error) {}
func Median(input Float64Data) (median float64, err error) {}
func MedianAbsoluteDeviation(input Float64Data) (mad float64, err error) {}
func MedianAbsoluteDeviationPopulation(input Float64Data) (mad float64, err error) {}
func Midhinge(input Float64Data) (float64, error) {}
func Min(input Float64Data) (min float64, err error) {}
func MinkowskiDistance(dataPointX, dataPointY Float64Data, lambda float64) (distance float64, err error) {}
func Mode(input Float64Data) (mode []float64, err error) {}
func NormBoxMullerRvs(loc float64, scale float64, size int) []float64 {}
func NormCdf(x float64, loc float64, scale float64) float64 {}
func NormEntropy(loc float64, scale float64) float64 {}
func NormFit(data []float64) [2]float64 {}
func NormInterval(alpha float64, loc float64, scale float64) [2]float64 {}
func NormIsf(p float64, loc float64, scale float64) (x float64) {}
func NormLogCdf(x float64, loc float64, scale float64) float64 {}
func NormLogPdf(x float64, loc float64, scale float64) float64 {}
func NormLogSf(x float64, loc float64, scale float64) float64 {}
func NormMean(loc float64, scale float64) float64 {}
func NormMedian(loc float64, scale float64) float64 {}
func NormMoment(n int, loc float64, scale float64) float64 {}
func NormPdf(x float64, loc float64, scale float64) float64 {}
func NormPpf(p float64, loc float64, scale float64) (x float64) {}
func NormPpfRvs(loc float64, scale float64, size int) []float64 {}
func NormSf(x float64, loc float64, scale float64) float64 {}
func NormStats(loc float64, scale float64, moments string) []float64 {}
func NormStd(loc float64, scale float64) float64 {}
func NormVar(loc float64, scale float64) float64 {}
func Pearson(data1, data2 Float64Data) (float64, error) {}
func Percentile(input Float64Data, percent float64) (percentile float64, err error) {}
func PercentileNearestRank(input Float64Data, percent float64) (percentile float64, err error) {}
func PopulationVariance(input Float64Data) (pvar float64, err error) {}
func Sample(input Float64Data, takenum int, replacement bool) ([]float64, error) {}
func SampleVariance(input Float64Data) (svar float64, err error) {}
func Sigmoid(input Float64Data) ([]float64, error) {}
func SoftMax(input Float64Data) ([]float64, error) {}
func StableSample(input Float64Data, takenum int) ([]float64, error) {}
func StandardDeviation(input Float64Data) (sdev float64, err error) {}
func StandardDeviationPopulation(input Float64Data) (sdev float64, err error) {}
func StandardDeviationSample(input Float64Data) (sdev float64, err error) {}
func StdDevP(input Float64Data) (sdev float64, err error) {}
func StdDevS(input Float64Data) (sdev float64, err error) {}
func Sum(input Float64Data) (sum float64, err error) {}
func Trimean(input Float64Data) (float64, error) {}
func VarP(input Float64Data) (sdev float64, err error) {}
func VarS(input Float64Data) (sdev float64, err error) {}
func Variance(input Float64Data) (sdev float64, err error) {}

type Coordinate struct {
    X, Y float64
}

type Series []Coordinate

func ExponentialRegression(s Series) (regressions Series, err error) {}
func LinearRegression(s Series) (regressions Series, err error) {}
func LogarithmicRegression(s Series) (regressions Series, err error) {}

type Outliers struct {
    Mild    Float64Data
    Extreme Float64Data
}

type Quartiles struct {
    Q1 float64
    Q2 float64
    Q3 float64
}

func Quartile(input Float64Data) (Quartiles, error) {}
func QuartileOutliers(input Float64Data) (Outliers, error) {}


Pull requests are always welcome, no matter how big or small. I've included a Makefile that has a lot of helper targets for common actions such as linting, testing, code coverage reporting and more.

  1. Fork the repo and clone your fork
  2. Create new branch (git checkout -b some-thing)
  3. Make the desired changes
  4. Ensure tests pass (go test -cover or make test)
  5. Run lint and fix problems (go vet . or make lint)
  6. Commit changes (git commit -am 'Did something')
  7. Push branch (git push origin some-thing)
  8. Submit pull request

To make things as seamless as possible please also consider the following steps:

  • Update examples/main.go with a simple example of the new feature
  • Update README.md documentation section with any new exported API
  • Keep 100% code coverage (you can check with make coverage)
  • Squash commits into single units of work with git rebase -i new-feature


To release a new version we should update the CHANGELOG.md and DOCUMENTATION.md.

First install the tools used to generate the markdown files:

go get github.com/davecheney/godoc2md
go get github.com/golangci/golangci-lint/cmd/golangci-lint

Then you can run this make directive:

make docs

Then we can create a CHANGELOG.md, a new git tag and a GitHub release:

make release TAG=v0.x.x

MIT License

Copyright (c) 2014-2021 Montana Flynn (https://montanaflynn.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.


Montana Flynn
Distributed systems engineer
  • Using an interface to support []float64 and []int

    I have a feeling it might be possible to use an interface to support both []float64 and []int data. However, I've not designed public interfaces or worked around the lack of generics myself, so I'll either have to do some research and hacking, or have the excellent community of Gophers help in this area or tell me my attempts will be futile. Either way, any feedback is appreciated!
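One way this could look, as a minimal sketch: a type switch over an empty interface that accepts either slice type and returns a []float64. The helper name toFloat64s is hypothetical and not part of the package; the README's LoadRawData plays a similar role for more input types.

```go
package main

import "fmt"

// toFloat64s is a hypothetical helper (not part of the stats package).
// It accepts either []float64 or []int and returns a new []float64.
func toFloat64s(raw interface{}) ([]float64, error) {
	switch v := raw.(type) {
	case []float64:
		// Copy so the caller's slice is never modified through the result.
		out := make([]float64, len(v))
		copy(out, v)
		return out, nil
	case []int:
		out := make([]float64, len(v))
		for i, n := range v {
			out[i] = float64(n)
		}
		return out, nil
	default:
		return nil, fmt.Errorf("unsupported input type %T", raw)
	}
}

func main() {
	ints, _ := toFloat64s([]int{1, 2, 3})
	floats, _ := toFloat64s([]float64{1.5, 2.5})
	fmt.Println(ints, floats) // [1 2 3] [1.5 2.5]
}
```

The cost of this approach is that type errors surface at run time rather than compile time, which is why the sketch returns an error for unsupported inputs.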

  • undefined: stats.MedianAbsoluteDeviationPopulation

    I am not able to use the MedianAbsoluteDeviationPopulation function.

    If I use "go doc", I do not see all functions:

    $ go doc stats

    package stats // import "github.com/zizmos/ego/vendor/github.com/montanaflynn/stats"

    func Correlation(data1, data2 Float64Data) (float64, error)
    func Covariance(data1, data2 Float64Data) (float64, error)
    func GeometricMean(input Float64Data) (float64, error)
    func HarmonicMean(input Float64Data) (float64, error)
    func InterQuartileRange(input Float64Data) (float64, error)
    func Max(input Float64Data) (max float64, err error)
    func Mean(input Float64Data) (float64, error)
    func Median(input Float64Data) (median float64, err error)
    func Midhinge(input Float64Data) (float64, error)
    func Min(input Float64Data) (min float64, err error)
    func Mode(input Float64Data) (mode []float64, err error)
    func Percentile(input Float64Data, percent float64) (percentile float64, err error)
    func PercentileNearestRank(input Float64Data, percent float64) (percentile float64, err error)
    func PopulationVariance(input Float64Data) (pvar float64, err error)
    func Round(input float64, places int) (rounded float64, err error)
    func Sample(input Float64Data, takenum int, replacement bool) ([]float64, error)
    func SampleVariance(input Float64Data) (svar float64, err error)
    func StandardDeviation(input Float64Data) (sdev float64, err error)
    func StandardDeviationPopulation(input Float64Data) (sdev float64, err error)
    func StandardDeviationSample(input Float64Data) (sdev float64, err error)
    func StdDevP(input Float64Data) (sdev float64, err error)
    func StdDevS(input Float64Data) (sdev float64, err error)
    func Sum(input Float64Data) (sum float64, err error)
    func Trimean(input Float64Data) (float64, error)
    func VarP(input Float64Data) (sdev float64, err error)
    func VarS(input Float64Data) (sdev float64, err error)
    func Variance(input Float64Data) (sdev float64, err error)
    type Coordinate struct{ ... }
        func ExpReg(s []Coordinate) (regressions []Coordinate, err error)
        func LinReg(s []Coordinate) (regressions []Coordinate, err error)
        func LogReg(s []Coordinate) (regressions []Coordinate, err error)
    type Float64Data []float64
    type Outliers struct{ ... }
        func QuartileOutliers(input Float64Data) (Outliers, error)
    type Quartiles struct{ ... }
        func Quartile(input Float64Data) (Quartiles, error)
    type Series []Coordinate
        func ExponentialRegression(s Series) (regressions Series, err error)
        func LinearRegression(s Series) (regressions Series, err error)
        func LogarithmicRegression(s Series) (regressions Series, err error)

    However, the MedianAbsoluteDeviationPopulation function is a public function in the implementation.

    $ go version
    go version go1.10.2 darwin/amd64

    $ dep status
    ....
    ....
    github.com/montanaflynn/stats ^0.2.0 0.2.0 eeaced0 0.2.0 1
    ...
    ...

  • Edge cases with Percentiles

    I believe there are some errors with the Percentiles() edge cases.

    Passing 0 as the percent will cause an error, as will a small set of data combined with a small percentage (such that c[i-1] is out of bounds because i = 0).

    I'm not sure the best approach to fix this, as picking which index is quite critical for the correct result, but I think this might work?

    index := (percent / 100) * float64(len(c) - 1)

    And then use c[i] and c[i+1] later on. Using c[i+1] would be dangerous with input.Len() == 1 and maybe in the case of 99.9 percent and few values?
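The indexing idea above can be sketched as follows. This is an illustration under stated assumptions, not the library's Percentile: the name percentileInterp is hypothetical, 0 is treated as a valid percent, and the upper neighbour is taken as the ceiling of the index so that it can never run past the end of the slice, even for a single value.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// percentileInterp is a hypothetical helper, not the library's Percentile.
// It places the percentile at index (percent/100)*(n-1) in the sorted data
// and interpolates linearly between the two neighbouring values.
func percentileInterp(input []float64, percent float64) (float64, error) {
	if len(input) == 0 {
		return math.NaN(), errors.New("input must not be empty")
	}
	if percent < 0 || percent > 100 {
		return math.NaN(), errors.New("percent must be in [0, 100]")
	}

	// Sort a copy so the caller's data is left untouched.
	c := make([]float64, len(input))
	copy(c, input)
	sort.Float64s(c)

	index := (percent / 100) * float64(len(c)-1)
	lo := int(math.Floor(index))
	hi := int(math.Ceil(index)) // hi <= len(c)-1 because index <= len(c)-1
	frac := index - float64(lo)

	return c[lo] + frac*(c[hi]-c[lo]), nil
}

func main() {
	data := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	p50, _ := percentileInterp(data, 50)
	fmt.Println(p50) // 5.5
}
```

With this indexing the 50th percentile of 1 through 10 matches the median, and both percent = 0 and a one-element input stay in bounds.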

  • Trouble with trying to add Changelog and Documentation

    Hello, I wanted to practice the Go programming language while contributing to an open source package. I was able to write some functions and test them using the Makefile provided. However, I had trouble getting the packages needed for the changelog and documentation .md file updates:

    go get github.com/davecheney/godoc2md
    go get github.com/golangci/golangci-lint/cmd/golangci-lint

    I ran the first command and got some errors, which I believe in turn caused errors when running the second command. I checked the first repository and it looks like it is no longer being developed.

    I've never contributed in this way and don't have experience in making the edits to the markdown files, but am trying to learn more of all things git and software.

  • Code changes for Manhattan, Euclidean and Minkowski distance calculation

    These changes would introduce functions to calculate Manhattan, Euclidean and Minkowski distance. Plus, I have done a bit of code reformatting in the old code. Please take a look. -Shivendra

  • NaN when running exponential regression

    [{0 2.5} {1 5} {4 25} {6 5} {8 5} {11 15} {12 2.5} {14 25} {15 0} {16 1.6666666666666667} {17 40} {20 5} {21 15} {22 20} {23 16.666666666666668} {24 13.333333333333334} {25 50} {26 18} {27 75} {28 21} {29 0} {30 5} {31 37.5} {32 5} {34 40} {36 5} {37 39}]

    When running exponential regression with the series above, the result is:

    [{0 NaN} {1 NaN} {4 NaN} {6 NaN} {8 NaN} {11 NaN} {12 NaN} {14 NaN} {15 NaN} {16 NaN} {17 NaN} {20 NaN} {21 NaN} {22 NaN} {23 NaN} {24 NaN} {25 NaN} {26 NaN} {27 NaN} {28 NaN} {29 NaN} {30 NaN} {31 NaN} {32 NaN} {34 NaN} {36 NaN} {37 NaN}]

    Is this desired? If so, could you maybe elaborate on why this happens?
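A likely explanation, assuming the regression is fitted by taking the natural logarithm of each Y value: the series contains two points with Y = 0, the logarithm of 0 is negative infinity, and once an infinity enters the running sums the arithmetic produces NaN for every output. The sketch below shows that effect and a guard against it; expFit and point are hypothetical names and this is not the library's ExponentialRegression.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

type point struct{ X, Y float64 }

// expFit is a hypothetical least-squares fit of y = a*exp(b*x), done by
// fitting a straight line to (x, ln y). Non-positive Y values are rejected
// because ln y is undefined or infinite for them.
func expFit(s []point) (a, b float64, err error) {
	if len(s) == 0 {
		return 0, 0, errors.New("input must not be empty")
	}
	var sx, sy, sxx, sxy float64
	for _, p := range s {
		if p.Y <= 0 {
			return 0, 0, errors.New("y values must be greater than zero")
		}
		ly := math.Log(p.Y)
		sx += p.X
		sy += ly
		sxx += p.X * p.X
		sxy += p.X * ly
	}
	n := float64(len(s))
	denom := n*sxx - sx*sx
	if denom == 0 {
		return 0, 0, errors.New("x values must not all be equal")
	}
	b = (n*sxy - sx*sy) / denom
	a = math.Exp((sy - b*sx) / n)
	return a, b, nil
}

func main() {
	// Why an unguarded fit turns into NaN:
	negInf := math.Log(0)
	fmt.Println(negInf)          // -Inf
	fmt.Println(negInf - negInf) // NaN

	// The guarded fit reports the problem instead.
	_, _, err := expFit([]point{{14, 25}, {15, 0}, {16, 1.5}})
	fmt.Println(err)
}
```

Rejecting such input up front is in line with the ErrYCoord error ("Y Value must be greater than zero.") listed in the exported API.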

  • Adding ability to load quickly convert mixed-type data to Float64Data

    Simple example based off one provided on repository page.

    // start with some source data to use
    raw := []interface{}{1.0, "2", 3.0, 4, "4.0", 5}
    // clean up raw data by converting to floats
    var data = stats.LoadRawData(raw)
    median, _ := stats.Median(data)
    fmt.Println(median) // 3.5
    roundedMedian, _ := stats.Round(median, 0)
    fmt.Println(roundedMedian) // 4
  • panic: sync: negative WaitGroup counter in dbscan.go

    We use your library for our open-source photo app, in particular the DBSCAN implementation. Thanks for providing it!

    While it works great for me, a developer reported issues with panics in dbscan.go, line 251. Seems w.Done() may be called too often, probably depending on input data. Couldn't reproduce it with my local samples.

    Our related code and GitHub issue:

    • https://github.com/photoprism/photoprism/blob/develop/internal/photoprism/faces.go#L225
    • https://github.com/photoprism/photoprism/issues/1478


    panic: sync: negative WaitGroup counter
    goroutine 3326 [running]:
    sync.(*WaitGroup).Add(0xc004761070, 0xffffffffffffffff)
             /usr/local/go/src/sync/waitgroup.go:74 +0x147
            /go/pkg/mod/github.com/mpraski/[email protected]/dbscan.go:251 +0x230
     created by github.com/mpraski/clusters.(*dbscanClusterer).startNearestWorkers
             /go/pkg/mod/github.com/mpraski/[email protected]/dbscan.go:228
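For reference, this panic is raised by sync.WaitGroup when Done is called more times than Add has accounted for. The sketch below shows the balanced pattern only; it is a hypothetical example, it is not taken from dbscan.go, and the actual cause in that file is not established here.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical worker fan-out that keeps the WaitGroup
// counter balanced: one Add before each goroutine, one deferred Done in it.
func sumSquares(inputs []int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for _, n := range inputs {
		wg.Add(1) // counted before the goroutine starts
		go func(n int) {
			defer wg.Done() // runs exactly once, on every return path
			mu.Lock()
			total += n * n
			mu.Unlock()
		}(n)
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3})) // 14
}
```

Tying each Done to a single Add with defer makes it hard for an early return or a data-dependent branch to call Done twice.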
  • support string in LoadRawData()

    Thank you for this great package. I added support for a string and an io.Reader in LoadRawData() so that it supports whitespace-separated strings.


    stats.LoadRawData("1.1 2 3.0 4 5")
    // or

    Is this something you would consider implementing in your package? If so, I can create a pull request.
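A minimal sketch of the string case, using only the standard library. The helper name parseFloats is hypothetical; it is not the proposed change itself and says nothing about how LoadRawData would report bad tokens.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFloats is a hypothetical helper that splits a string on any run of
// whitespace and parses every field as a float64.
func parseFloats(s string) ([]float64, error) {
	fields := strings.Fields(s)
	out := make([]float64, 0, len(fields))
	for _, f := range fields {
		v, err := strconv.ParseFloat(f, 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", f, err)
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	data, err := parseFloats("1.1 2 3.0 4 5")
	fmt.Println(data, err) // [1.1 2 3 4 5] <nil>
}
```

An io.Reader could be handled the same way by reading tokens with a bufio.Scanner whose split function is bufio.ScanWords.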


  • [Suggestion] Calculate Quartile from the instance of Float64Data

    Nice package, I am using it right now. I found an inconsistency when calculating the quartiles: is there any reason why we must pass the data/input to calculate Quartile instead of using the instance? If there is no specific reason, I suggest adding a Quartiles method on Float64Data that takes no input and uses the current instance, as we do with Mean(), Max(), etc.


    func (f Float64Data) Quartiles() (Quartiles, error) {
    	return Quartile(f)
    }

    If this is possible, I will make the MR.

  • A few tests fail in different architectures due to precision errors

    When running the test suite on s390x and ppc64le architectures, I get the following output:

    go test -compiler gc -ldflags '' github.com/montanaflynn/stats
    --- FAIL: TestCorrelation (0.00s)
    	correlation_test.go:33: Correlation 0.9912407071619304 != 0.9912407071619302
    	correlation_test.go:47: Correlation 0.9912407071619304 != 0.9912407071619302
    --- FAIL: TestOtherDataMethods (0.00s)
    	data_test.go:22: github.com/montanaflynn/stats.(Float64Data).Correlation-fm() => 0.2087547359760545 != 0.20875473597605448
    	data_test.go:22: github.com/montanaflynn/stats.(Float64Data).Pearson-fm() => 0.2087547359760545 != 0.20875473597605448
    	data_test.go:22: github.com/montanaflynn/stats.(Float64Data).Covariance-fm() => 7.381421553571428 != 7.3814215535714265
    	data_test.go:22: github.com/montanaflynn/stats.(Float64Data).CovariancePopulation-fm() => 6.458743859374999 != 6.458743859374998
    --- FAIL: TestLinearRegression (0.00s)
    	regression_test.go:19: [{1 2.380000000000002} {2 3.080000000000001} {3 3.7800000000000002} {4 4.4799999999999995} {5 5.179999999999999}] != 2.3800000000000026
    	regression_test.go:23: [{1 2.380000000000002} {2 3.080000000000001} {3 3.7800000000000002} {4 4.4799999999999995} {5 5.179999999999999}] != 3.0800000000000014
    	regression_test.go:31: [{1 2.380000000000002} {2 3.080000000000001} {3 3.7800000000000002} {4 4.4799999999999995} {5 5.179999999999999}] != 4.479999999999999
    	regression_test.go:35: [{1 2.380000000000002} {2 3.080000000000001} {3 3.7800000000000002} {4 4.4799999999999995} {5 5.179999999999999}] != 5.179999999999998
    --- FAIL: TestLogarithmicRegression (0.00s)
    	regression_test.go:94: [{1 2.152082236381168} {2 3.330555922249221} {3 4.019918836568675} {4 4.509029608117274} {5 4.8884133966836645}] != 2.1520822363811702
    	regression_test.go:98: [{1 2.152082236381168} {2 3.330555922249221} {3 4.019918836568675} {4 4.509029608117274} {5 4.8884133966836645}] != 3.3305559222492214
    	regression_test.go:102: [{1 2.152082236381168} {2 3.330555922249221} {3 4.019918836568675} {4 4.509029608117274} {5 4.8884133966836645}] != 4.019918836568674
    	regression_test.go:106: [{1 2.152082236381168} {2 3.330555922249221} {3 4.019918836568675} {4 4.509029608117274} {5 4.8884133966836645}] != 4.509029608117273
    	regression_test.go:110: [{1 2.152082236381168} {2 3.330555922249221} {3 4.019918836568675} {4 4.509029608117274} {5 4.8884133966836645}] != 4.888413396683663
    FAIL	github.com/montanaflynn/stats	0.003s

    I also opened a similar bug report for x/image, with a bug that seems to be related to this one: https://github.com/golang/go/issues/21460

    In addition, the tests also fail for i686 architectures, with a different output:

    go test -compiler gc -ldflags '' github.com/montanaflynn/stats
    --- FAIL: TestLogarithmicRegression (0.00s)
    	regression_test.go:94: [{1 2.1520822363811654} {2 3.3305559222492205} {3 4.019918836568676} {4 4.509029608117276} {5 4.888413396683665}] != 2.1520822363811702
    	regression_test.go:98: [{1 2.1520822363811654} {2 3.3305559222492205} {3 4.019918836568676} {4 4.509029608117276} {5 4.888413396683665}] != 3.3305559222492214
    	regression_test.go:102: [{1 2.1520822363811654} {2 3.3305559222492205} {3 4.019918836568676} {4 4.509029608117276} {5 4.888413396683665}] != 4.019918836568674
    	regression_test.go:106: [{1 2.1520822363811654} {2 3.3305559222492205} {3 4.019918836568676} {4 4.509029608117276} {5 4.888413396683665}] != 4.509029608117273
    	regression_test.go:110: [{1 2.1520822363811654} {2 3.3305559222492205} {3 4.019918836568676} {4 4.509029608117276} {5 4.888413396683665}] != 4.888413396683663
    FAIL	github.com/montanaflynn/stats	0.005s

    Note that this does not seem to be related to the issue mentioned above for x/image.
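Whatever the underlying cause on each architecture, the failing values differ only in the last few digits, so one common remedy is to compare floats within a tolerance instead of with !=. A minimal sketch, where almostEqual is a hypothetical test helper and the tolerance of 1e-12 is an assumption chosen for illustration:

```go
package main

import (
	"fmt"
	"math"
)

// almostEqual is a hypothetical test helper. It reports whether a and b
// differ by no more than tol relative to the larger of their magnitudes.
func almostEqual(a, b, tol float64) bool {
	if a == b {
		return true // also covers equal infinities and exact zeros
	}
	diff := math.Abs(a - b)
	scale := math.Max(math.Abs(a), math.Abs(b))
	return diff <= tol*scale
}

func main() {
	// The two Correlation values reported in the failing test output.
	fmt.Println(almostEqual(0.9912407071619304, 0.9912407071619302, 1e-12)) // true
	fmt.Println(almostEqual(1.0, 1.1, 1e-12))                               // false
}
```

A relative tolerance keeps the check meaningful for both small and large results; values near zero may need an additional absolute tolerance.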

  • Fix percentile computation

    Unless I'm missing something, the percentile computation was incorrect. For example, 50th percentile was not matching the median in the distribution [1, 2, ..., 10] which is 5.5.

    This pull request modifies the implementation of Percentile and adds more tests.

  • Avoid GeometricMean overflow by using exp and log

    This PR complicates a formula a bit to avoid overflows.

    Taking the geometric mean of 100 numbers around 1 billion overflows, because multiplying them yields infinity, and taking the root of infinity still returns infinity.

    We can take the logarithm of each number instead, which avoids the most prominent overflows, the ones that show up with just a few numbers.

    Added tests: the test with 0 in the input failed in the previous implementation, and the test with more, larger values shows the overflow issue.
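The idea can be sketched as follows, contrasting a naive product-then-root version with the exp-of-mean-log form. Both function names are hypothetical and this is not the code from the pull request; rejecting non-positive values in the log version is an assumption made to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// geoMeanNaive multiplies everything and then takes the n-th root.
// The running product overflows to +Inf for long or large inputs.
func geoMeanNaive(input []float64) float64 {
	p := 1.0
	for _, v := range input {
		p *= v
	}
	return math.Pow(p, 1/float64(len(input)))
}

// geoMeanLog computes exp(mean(ln x)), which equals the geometric mean
// for positive inputs and never forms the full product.
func geoMeanLog(input []float64) (float64, error) {
	if len(input) == 0 {
		return math.NaN(), errors.New("input must not be empty")
	}
	var logSum float64
	for _, v := range input {
		if v <= 0 {
			// Assumption for this sketch: zero and negative values are rejected.
			return math.NaN(), errors.New("values must be greater than zero")
		}
		logSum += math.Log(v)
	}
	return math.Exp(logSum / float64(len(input))), nil
}

func main() {
	data := make([]float64, 100)
	for i := range data {
		data[i] = 1e9
	}
	fmt.Println(geoMeanNaive(data)) // +Inf
	gm, _ := geoMeanLog(data)
	fmt.Println(gm) // close to 1e9
}
```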

  • How to calculate the weighted percentile?

    See: https://en.wikipedia.org/wiki/Percentile#The_weighted_percentile_method

    My current code is as follows, but I don't know how to support weighted percentiles.

    func TestPercentile(t *testing.T) {
    	values := []float64{4, 5, 3, 1, 2}
    	percentiles := []float64{}
    	for i := 1; i <= 100; i++ {
    		percentile, err := stats.PercentileNearestRank(values, float64(i))
    		if err != nil {
    			t.Fatal(err) // assumed: the original error-handling body was lost
    		}
    		percentiles = append(percentiles, percentile)
    		fmt.Printf("%d%%: %f, ", i, percentile)
    	}
    	for f := 0.0; f <= 5; f += 0.1 {
    		index := sort.SearchFloat64s(percentiles, f+0.00000001)
    		fmt.Printf("%f: %d%%, ", f, index)
    	}
    }
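One possible approach, sketched here as a weighted variant of the nearest-rank idea rather than the interpolating method on the linked page: sort the values, accumulate their weights, and return the first value whose cumulative weight reaches the requested share of the total. The function weightedPercentile is hypothetical and not part of the stats package.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// weightedPercentile is a hypothetical helper. It returns the smallest value
// whose cumulative weight is at least percent% of the total weight.
func weightedPercentile(values, weights []float64, percent float64) (float64, error) {
	if len(values) == 0 || len(values) != len(weights) {
		return 0, errors.New("values and weights must be non-empty and the same length")
	}
	if percent <= 0 || percent > 100 {
		return 0, errors.New("percent must be in (0, 100]")
	}

	// Sort indices by value so values and weights stay paired.
	idx := make([]int, len(values))
	var total float64
	for i := range values {
		if weights[i] < 0 {
			return 0, errors.New("weights must not be negative")
		}
		idx[i] = i
		total += weights[i]
	}
	if total == 0 {
		return 0, errors.New("total weight must be greater than zero")
	}
	sort.Slice(idx, func(a, b int) bool { return values[idx[a]] < values[idx[b]] })

	target := percent / 100 * total
	var cum float64
	for _, i := range idx {
		cum += weights[i]
		if cum >= target {
			return values[i], nil
		}
	}
	// Only reachable through floating-point round-off in the sums.
	return values[idx[len(idx)-1]], nil
}

func main() {
	values := []float64{4, 5, 3, 1, 2}
	weights := []float64{1, 1, 1, 1, 1}
	fmt.Println(weightedPercentile(values, weights, 50)) // 3 <nil>
}
```

With equal weights this reduces to the ordinary nearest-rank percentile, which gives a simple way to check it against PercentileNearestRank.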
  • Adding single-pass descriptive stats for large data sets

    Hi, wondered if I could contribute by adding single-pass descriptive stats for people working with large datasets? This will simply return mean, sdev, var, min, max, correlation. All the things you have, but for situations where Float64Data would be too big.
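As an illustration of what single-pass statistics could look like, here is a minimal sketch based on Welford's online algorithm for the mean and variance, with min and max tracked alongside. The type runningStats and its methods are hypothetical, and correlation is left out to keep the example short.

```go
package main

import (
	"fmt"
	"math"
)

// runningStats is a hypothetical accumulator that sees each value once and
// keeps only a few numbers, so the data set never has to fit in memory.
type runningStats struct {
	n        int
	mean, m2 float64 // m2 is the running sum of squared deviations from the mean
	min, max float64
}

// add folds one observation into the running totals (Welford's update).
func (r *runningStats) add(x float64) {
	r.n++
	if r.n == 1 {
		r.min, r.max = x, x
	} else {
		r.min = math.Min(r.min, x)
		r.max = math.Max(r.max, x)
	}
	delta := x - r.mean
	r.mean += delta / float64(r.n)
	r.m2 += delta * (x - r.mean)
}

// populationVariance divides by n; it is NaN when nothing has been added.
func (r *runningStats) populationVariance() float64 {
	if r.n == 0 {
		return math.NaN()
	}
	return r.m2 / float64(r.n)
}

// sampleVariance divides by n-1; it is NaN for fewer than two values.
func (r *runningStats) sampleVariance() float64 {
	if r.n < 2 {
		return math.NaN()
	}
	return r.m2 / float64(r.n-1)
}

func main() {
	var rs runningStats
	for _, x := range []float64{2, 4, 4, 4, 5, 5, 7, 9} {
		rs.add(x)
	}
	fmt.Println(rs.mean, rs.populationVariance(), rs.min, rs.max)
}
```

The standard deviation is the square root of either variance, so it needs no extra state.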

