A Go package for n-gram based text categorization, with support for utf-8 and raw text

A Go package for n-gram based text categorization, with support for utf-8 and raw text.

To do:

  • write documentation
  • make it faster

Keywords: text categorization, language detector

Install

go get github.com/pebbe/textcat
go get github.com/pebbe/textcat/textcat
go get github.com/pebbe/textcat/textpat

Docs

Owner
Similar Resources

[UNMANTEINED] Extract values from strings and fill your structs with nlp.

nlp nlp is a general purpose any-lang Natural Language Processor that parses the data inside a text and returns a filled model Supported types int in

Nov 24, 2022

Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang

Natural Language Processing Implementations of selected machine learning algorithms for natural language processing in golang. The primary focus for t

Dec 25, 2022

Self-contained Machine Learning and Natural Language Processing library in Go

If you like the project, please ★ star this repository to show your support! 🤩 A Machine Learning library written in pure Go designed to support rele

Dec 30, 2022

Stemmer packages for Go programming language. Includes English, German and Dutch stemmers.

Stemmer package for Go Stemmer package provides an interface for stemmers and includes English, German and Dutch stemmers as sub-packages: porter2 sub

Dec 14, 2022

A go library for reading and creating ISO9660 images

iso9660 A package for reading and creating ISO9660 Joliet and Rock Ridge extensions are not supported. Examples Extracting an ISO package main import

Jan 2, 2023

Gopher-translator - A HTTP API that accepts english word or sentences and translates them to Gopher language

Gopher Translator Service An interview assignment project. To see the full assig

Jan 25, 2022

A Go package for n-gram based text categorization, with support for utf-8 and raw text

A Go package for n-gram based text categorization, with support for utf-8 and raw text. To do: write documentation make it faster Keywords: text categ

Nov 28, 2022

A UTF-8 and internationalisation testing utility for text rendering.

ɱéťàł "English, but metal" Metal is a tool that converts English text into a legible, Zalgo-like character swap for the purposes of testing localisati

Jan 1, 2023

Go wrapper for gram-tgcalls.

Go wrapper for gram-tgcalls.

Go wrapper for gram-tgcalls. Features Doesn't let you worry about running Telegram clients, it starts an unlimited number of lightweight Gra

Dec 8, 2021

Package raw enables reading and writing data at the device driver level for a network interface. MIT Licensed.

raw Package raw enables reading and writing data at the device driver level for a network interface. MIT Licensed. For more information about using ra

Dec 28, 2022

Go package for sharding databases ( Supports every ORM or raw SQL )

Go package for sharding databases ( Supports every ORM or raw SQL )

Octillery Octillery is a Go package for sharding databases. It can use with every OR Mapping library ( xorm , gorp , gorm , dbr ...) implementing data

Dec 16, 2022

Upgit - Upgit helps you simply upload any file to your Github repository and then get a raw URL for it

Upgit - Upgit helps you simply upload any file to your Github repository and then get a raw URL for it

Upgit - Upgit helps you simply upload any file to your Github repository and then get a raw URL for it

Dec 27, 2022

Write your SQL queries in raw files with all benefits of modern IDEs, use them in an easy way inside your application with all the profit of compile time constants

About qry is a general purpose library for storing your raw database queries in .sql files with all benefits of modern IDEs, instead of strings and co

Dec 25, 2022

Cgo bindings to PulseAudio's Simple API, for easily playing or capturing raw audio.

pulse-simple Cgo bindings to PulseAudio's Simple API, for easily playing or capturing raw audio. The full Simple API is supported, including channel m

Dec 17, 2022

Go fearless SQL. Sqlvet performs static analysis on raw SQL queries in your Go code base.

Sqlvet Sqlvet performs static analysis on raw SQL queries in your Go code base to surface potential runtime errors at build time. Feature highlights:

Dec 19, 2022

a benchmarking&stressing tool that can send raw HTTP requests

reqstress reqstress is a benchmarking&stressing tool that can send raw HTTP requests. It's written in Go and uses fasthttp library instead of Go's def

Dec 18, 2022

Another Go shellcode loader designed to work with Cobalt Strike raw binary payload.

Another Go shellcode loader designed to work with Cobalt Strike raw binary payload.

Bankai Another Go shellcode loader designed to work with Cobalt Strike raw binary payload. I created this project to mainly educate myself learning Go

Dec 29, 2022

Raw ANSI sequence helpers

Raw ANSI sequence helpers

Oct 23, 2022

Vtterm - An raw-mode vt100 screen reader

#VT100 TERMINAL This is a vt100 screen reader ( clone of jaguilar/v100 ) and inc

Feb 26, 2022
A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.

prose is a natural language processing library (English only, at the moment) in pure Go. It supports tokenization, segmentation, part-of-speech tagging, and named-entity extraction.

Jan 4, 2023
ASCII transliterations of Unicode text.

go-unidecode ASCII transliterations of Unicode text. Inspired by python-unidecode. Installation go get -u github.com/mozillazg/go-unidecode Install C

Dec 2, 2022
A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29

segment A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29 Features Currently only segmentation at Word

Dec 19, 2022
A tool to find all duplicates in large sets of text documents.

⊧ dupi Dupi is an engine for identifying and exploring duplicative text in sets of documents. Status Dupi is in alpha/early beta development stage. Pl

Dec 23, 2022
Package i18n provides internationalization and localization for your Go applications.

i18n Package i18n provides internationalization and localization for your Go applications. Installation The minimum requirement of Go is 1.16. go get

Nov 9, 2022
Natural language detection package in pure Go

getlang getlang provides fast natural language detection in Go. Features Offline -- no internet connection required Supports 29 languages Provides ISO

Dec 26, 2022
The shamoji (杓文字) is a word filtering package

shamoji About The shamoji (杓文字) is word filtering package. Install $ go get -u github.com/osamingo/shamoji Usage package main import ( "fmt" "sync

Sep 27, 2022
i18n (Internationalization and localization) engine written in Go, used for translating locale strings.

go-localize Simple and easy to use i18n (Internationalization and localization) engine written in Go, used for translating locale strings. Use with go

Nov 29, 2022
Utilities for working with discrete probability distributions and other tools useful for doing NLP work

GNLP A few structures for doing NLP analysis / experiments. Basics counter.Counter A map-like data structure for representing discrete probability dis

Nov 28, 2022
Read and use word2vec vectors in Go

Introduction This is a package for reading word2vec vectors in Go and finding similar words and analogies. Installation This package can be installed

Nov 28, 2022