Go bindings for the snowball libstemmer library including porter 2

Go (golang) bindings for libstemmer

This simple library provides Go (golang) bindings for the Snowball libstemmer library, including the popular Porter and Porter2 algorithms.

Requirements

You'll need the development package of libstemmer. On Debian or Ubuntu this is usually simply a matter of:

sudo apt-get install libstemmer-dev

... or you might need to install it from source.

Installation

First, ensure you have your GOPATH env variable set to the root of your Go project:

export GOPATH=`pwd`
export PATH=$PATH:$GOPATH/bin

Then this cute statement should do the trick:

go get github.com/rjohnsondev/golibstemmer

Usage

Basic usage:

package main

import (
    "fmt"
    "os"

    // The package is named "stemmer", so alias it explicitly.
    stemmer "github.com/rjohnsondev/golibstemmer"
)

func main() {
    s, err := stemmer.NewStemmer("english")
    if err != nil {
        fmt.Println("Error creating stemmer: " + err.Error())
        os.Exit(1)
    }
    // Defer Close only after the error check, once s is known to be valid.
    defer s.Close()
    word := s.StemWord("happy")
    fmt.Println(word)
}

To get a list of supported stemming algorithms:

list := stemmer.GetSupportedLanguages()
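
One way to use that list is to check a language name before creating a stemmer. The sketch below assumes GetSupportedLanguages returns a []string, which this README does not confirm; the helper isSupported and the sample list are hypothetical and only there so the example runs without libstemmer installed.

```go
package main

import "fmt"

// isSupported reports whether lang appears in list.
// Hypothetical helper, not part of golibstemmer.
func isSupported(list []string, lang string) bool {
	for _, l := range list {
		if l == lang {
			return true
		}
	}
	return false
}

func main() {
	// Stand-in sample data. In real code this would be (assuming a []string result):
	//   list := stemmer.GetSupportedLanguages()
	list := []string{"english", "french", "porter"}

	for _, lang := range []string{"english", "klingon"} {
		fmt.Printf("%s supported: %v\n", lang, isSupported(list, lang))
	}
}
```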

Testing

You can execute the basic included tests with:

go test

If you have issues, double check you've installed the libstemmer development library.

If you still have issues, let me know!

Owner
Richard Johnson
Comments
  • Fatal error attempting to go get

    Thank you for working on these bindings for the Snowball C library. I would really like to get this working in Go but am having problems installing.

    I don't understand why it is referencing C files that aren't included in the package. But it is, and Go is not happy. See the error below I got when trying to go get.

    I see libstemmer.h comes with the C Snowball tarball, but not stdio.h, etc. I compiled the C source into the stemwords executable and installed that, which I supposed was the right thing to do. If I'm supposed to copy some of the C source files somewhere then I have no idea what to do. This was the error on go get:

    go get github.com/rjohnsondev/golibstemmer

    /home/go/src/github.com/rjohnsondev/golibstemmer/stemmer.go:17:24: fatal error: libstemmer.h: No such file or directory
     #include <libstemmer.h>
                            ^
    compilation terminated.

  • Locks

    According to the docs, the Snowball stemmer is not thread-safe. It would be good if the Go wrapper used a lock around it, since the natural assumption in Go is that the code is thread-safe.
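
The locking suggested in the comment above could be sketched on the caller's side roughly as follows. This is not how golibstemmer is implemented: wordStemmer is a hypothetical interface that assumes StemWord takes and returns a string, SafeStemmer is an illustrative wrapper, and fakeStemmer only exists so the example runs without libstemmer installed.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// wordStemmer is a hypothetical interface standing in for the value returned
// by stemmer.NewStemmer; the StemWord signature is an assumption.
type wordStemmer interface {
	StemWord(word string) string
}

// SafeStemmer serializes calls to an underlying stemmer with a mutex.
type SafeStemmer struct {
	mu    sync.Mutex
	inner wordStemmer
}

// NewSafeStemmer wraps inner so it can be shared between goroutines.
func NewSafeStemmer(inner wordStemmer) *SafeStemmer {
	return &SafeStemmer{inner: inner}
}

// StemWord holds the lock for the duration of the underlying call.
func (s *SafeStemmer) StemWord(word string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.inner.StemWord(word)
}

// fakeStemmer is a stand-in for a real stemmer; it only strips a trailing "s".
type fakeStemmer struct{}

func (fakeStemmer) StemWord(word string) string {
	return strings.TrimSuffix(word, "s")
}

func main() {
	safe := NewSafeStemmer(fakeStemmer{})
	words := []string{"cats", "dogs", "birds"}
	stems := make([]string, len(words))

	var wg sync.WaitGroup
	for i, w := range words {
		wg.Add(1)
		go func(i int, w string) {
			defer wg.Done()
			// Each goroutine writes to its own slice element.
			stems[i] = safe.StemWord(w)
		}(i, w)
	}
	wg.Wait()
	fmt.Println(stems)
}
```

An alternative that avoids lock contention is to create one stemmer per goroutine, at the cost of extra allocations.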
