ASCII transliterations of Unicode text.

go-unidecode

ASCII transliterations of Unicode text. Inspired by python-unidecode.

Installation

go get -u github.com/mozillazg/go-unidecode

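Install CLI tool:

$ go get -u github.com/mozillazg/go-unidecode/unidecode

On Go 1.18 and later, go get no longer builds or installs binaries; use go install instead:

$ go install github.com/mozillazg/go-unidecode/unidecode@latest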

$ unidecode 北京kožušček
Bei Jing kozuscek

Documentation

API documentation can be found here: https://godoc.org/github.com/mozillazg/go-unidecode

Usage

package main

import (
	"fmt"
	"github.com/mozillazg/go-unidecode"
)

func main() {
	s := "abc"
	fmt.Println(unidecode.Unidecode(s))
	// Output: abc

	s = "北京"
	fmt.Println(unidecode.Unidecode(s))
	// Output: Bei Jing

	s = "kožušček"
	fmt.Println(unidecode.Unidecode(s))
	// Output: kozuscek
}
Similar Resources

Unicode transliterator for #golang

Unicode transliterator (also known as unidecode) for Go Use the following command to install gounidecode go get -u github.com/fiam/gounidecode/unideco

Sep 27, 2022

Package mafsa implements Minimal Acyclic Finite State Automata in Go, essentially a high-speed, memory-efficient, Unicode-friendly set of strings.

MA-FSA for Go Package mafsa implements Minimal Acyclic Finite State Automata (MA-FSA) with Minimal Perfect Hashing (MPH). Basically, it's a set of str

Oct 27, 2022

Go linter which checks for dangerous unicode character sequences

bidichk - checks for dangerous unicode character sequences bidichk finds dangerous unicode character sequences in Go source files. Considered dangerou

Oct 5, 2022

A detector for the Trojan Source and other unicode-based vulnerabilities.

Trojan Source Detector This application detects Trojan Source attacks in source code. It can be used as part of the CI system to make sure there are n

Jan 6, 2022

utf8 - provide unicode information on runes

utf8 utf8 provides unicode code point values for input runes and the unicode rune (if printable) for a given unicode code point. With no arguments, pr

Jan 8, 2022

Bitprint - Figlet for bitmap fonts, using unicode block elements

bitprint Figlet for bitmap fonts, using unicode block elements. Usage bitprint

Feb 12, 2022

A Go package for n-gram based text categorization, with support for utf-8 and raw text

A Go package for n-gram based text categorization, with support for utf-8 and raw text. To do: write documentation make it faster Keywords: text categ

Nov 28, 2022

Chalk is a Go Package which can be used for making terminal output more vibrant with text colors, text styles and background colors.

Chalk Chalk is a Go Package which can be used for making terminal output more vibrant with text colors, text styles and background colors. Documentati

Oct 29, 2022

Project-1 - Create a service that accepts input as text and provides Json Output as Top ten most used words and times of occurrence in the text

Project Assignment Steps to run the project: download or clone repo in your loca

Jan 27, 2022

Generates random text based on trigrams generated from input text

Trigrams Generates random text based on trigrams generated from input text Contents Building Running Using Implementation notes NGram size Maximum wor

Feb 9, 2022

Read the text of memes, then inject that text into the image as searchable metadata.

Make Meme Text Searchable I have an extensive set of memes I've been collecting since the early days of Flickr. #icanhascheeseburger It's a pain in th

May 2, 2022

Go package to make lightweight ASCII line graph ╭┈╯ in command line apps with no other dependencies.

asciigraph Go package to make lightweight ASCII line graphs ╭┈╯. Installation go get github.com/guptarohit/asciigraph Usage Basic graph package main

Jan 1, 2023

Tabular simplifies printing ASCII tables from command line utilities

tabular Tabular simplifies printing ASCII tables from command line utilities without the need to pass large sets of data to it's API. Simply define th

Oct 28, 2022

:foggy: Convert image to ASCII

🌁 Image2ascii Image2ASCII is a library that converts images into ASCII images and provides command-line tools for easy use. Installation go get githu

Jan 8, 2023

Go package to make lightweight ASCII line graph ╭┈╯ in command line apps with no other dependencies.

asciigraph Go package to make lightweight ASCII line graphs ╭┈╯. Installation go get github.com/guptarohit/asciigraph Usage Basic graph package main

Jan 8, 2023

ASCII table in golang

ASCII Table Writer Generate ASCII table on the fly ... Installation is simple as go get github.com/olekukonko/tablewriter Features Automatic Padding

Jan 1, 2023

A cross-platform tool to convert images into ascii art and print them on the console

Dec 30, 2022

Generate ANSI-/Ascii-art version images/Gifs in your terminal.

ANSI-Art NOTE: This toy project is not yet finished. ANSI-version Logo Block ANSI-version Logo ASCII-version Logo Support Platform You are kindly remi

Jan 6, 2023
Comments
  • Upgrade to GitHub-native Dependabot

    Dependabot Preview will be shut down on August 3rd, 2021. In order to keep getting Dependabot updates, please merge this PR and migrate to GitHub-native Dependabot before then.

    Dependabot has been fully integrated into GitHub, so you no longer have to install and manage a separate app. This pull request migrates your configuration from Dependabot.com to a config file, using the new syntax. When merged, we'll swap out dependabot-preview (me) for a new dependabot app, and you'll be all set!

    With this change, you'll now use the Dependabot page in GitHub, rather than the Dependabot dashboard, to monitor your version updates, and you'll configure Dependabot through the new config file rather than a UI.

    If you've got any questions or feedback for us, please let us know by creating an issue in the dependabot/dependabot-core repository.

    Learn more about migrating to GitHub-native Dependabot

    Please note that regular @dependabot commands do not work on this pull request.

  • Use string builder to reduce allocations

    Hello, this PR uses strings.Builder instead of append to reduce the number of memory allocations.

    Unidecode on string "üñíCöDÈ" repeated n times, with append:

    cpu: AMD Ryzen 5 5600X 6-Core Processor             
    BenchmarkUnidecode/go-unidecode_1-12             3358656               328.9 ns/op           256 B/op          7 allocs/op
    BenchmarkUnidecode/go-unidecode_10-12             491061              2354 ns/op            4240 B/op         29 allocs/op
    BenchmarkUnidecode/go-unidecode_100-12             64015             18734 ns/op           31440 B/op        212 allocs/op
    BenchmarkUnidecode/go-unidecode_1000-12             4491            242756 ns/op          521266 B/op       2019 allocs/op
    BenchmarkUnidecode/go-unidecode_10000-12             337           3411895 ns/op         5713300 B/op      20028 allocs/op
    

    with strings.Builder:

    cpu: AMD Ryzen 5 5600X 6-Core Processor             
    BenchmarkUnidecode/go-unidecode_1-12            17364760                70.24 ns/op            8 B/op          1 allocs/op
    BenchmarkUnidecode/go-unidecode_10-12            1610168               757.2 ns/op           248 B/op          5 allocs/op
    BenchmarkUnidecode/go-unidecode_100-12            170352              6377 ns/op            1912 B/op          8 allocs/op
    BenchmarkUnidecode/go-unidecode_1000-12            17743             68894 ns/op           34296 B/op         15 allocs/op
    BenchmarkUnidecode/go-unidecode_10000-12            1668            760811 ns/op          285432 B/op         22 allocs/op
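
As a rough illustration of the kind of change this pull request describes, the sketch below contrasts collecting per-rune replacements with append and strings.Join against writing them into a single strings.Builder. The names table, lookup, withAppend, and withBuilder are hypothetical, the toy table covers only the characters in the benchmark string, and the append-based variant is an assumption about what such code might look like; none of this is the library's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// table is a hypothetical stand-in for real transliteration data.
var table = map[rune]string{
	'\u00fc': "u", // ü
	'\u00f1': "n", // ñ
	'\u00ed': "i", // í
	'\u00f6': "o", // ö
	'\u00c8': "E", // È
}

// lookup passes ASCII through unchanged and maps other runes via the toy
// table; runes missing from the table become the empty string.
func lookup(r rune) string {
	if r < 0x80 {
		return string(r)
	}
	return table[r]
}

// withAppend collects one string per rune in a slice and joins them at the end.
func withAppend(s string) string {
	var parts []string
	for _, r := range s {
		parts = append(parts, lookup(r))
	}
	return strings.Join(parts, "")
}

// withBuilder writes each replacement into one growing buffer.
func withBuilder(s string) string {
	var b strings.Builder
	b.Grow(len(s)) // reserve roughly enough space up front
	for _, r := range s {
		b.WriteString(lookup(r))
	}
	return b.String()
}

func main() {
	s := "\u00fc\u00f1\u00edC\u00f6D\u00c8" // üñíCöDÈ
	fmt.Println(withAppend(s))
	fmt.Println(withBuilder(s))
}
```

Both functions return the same string; the difference is only in how the intermediate storage is allocated, which is what the allocs/op column in the benchmarks measures.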
    
  • Transliteration error

    Thank you very much for the wonderful library, I am glad it is here!

    I did come across one transliteration issue - I believe してください should be shitekudasai instead of shitekutasai. I tried a number of Hiragana-Romaji converters, which used a number of different methods, and all chose "da" instead of "ta" for the fourth syllable.
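
To make the report above concrete, here is a minimal sketch showing that the fourth kana in してください is the voiced だ (U+3060), which is a different code point from the unvoiced た (U+305F). The romaji table and the toRomaji helper are hypothetical and hand-written for this one word; they are not the library's data and do not diagnose where the reported output comes from.

```go
package main

import (
	"fmt"
	"strings"
)

// romaji is a hypothetical table covering only the kana needed here.
var romaji = map[rune]string{
	'\u3057': "shi", // し
	'\u3066': "te",  // て
	'\u304f': "ku",  // く
	'\u305f': "ta",  // た (unvoiced)
	'\u3060': "da",  // だ (voiced)
	'\u3055': "sa",  // さ
	'\u3044': "i",   // い
}

// toRomaji maps each rune through the toy table; unknown runes are dropped.
func toRomaji(s string) string {
	var b strings.Builder
	for _, r := range s {
		b.WriteString(romaji[r])
	}
	return b.String()
}

func main() {
	word := "\u3057\u3066\u304f\u3060\u3055\u3044" // してください
	for _, r := range word {
		fmt.Printf("%c U+%04X\n", r, r) // the fourth line shows U+3060
	}
	fmt.Println(toRomaji(word)) // shitekudasai
}
```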

Go efficient text segmentation and NLP; support english, chinese, japanese and other. Go 语言高性能分词

gse Go efficient text segmentation; support english, chinese, japanese and other. 简体中文 Dictionary with double array trie (Double-Array Trie) to achiev

Jan 8, 2023
A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.

prose is a natural language processing library (English only, at the moment) in pure Go. It supports tokenization, segmentation, part-of-speech tagging, and named-entity extraction.

Jan 4, 2023
A tool to find all duplicates in large sets of text documents.

⊧ dupi Dupi is an engine for identifying and exploring duplicative text in sets of documents. Status Dupi is in alpha/early beta development stage. Pl

Dec 23, 2022
A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29

segment A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29 Features Currently only segmentation at Word

Dec 19, 2022
asciigrid is a Go package that implements decoder and encoder for the Esri ASCII grid format, also known as ARC/INFO ASCII GRID.

asciigrid asciigrid is a Go package that implements decoder and encoder for the Esri ASCII grid format, also known as ARC/INFO ASCII GRID. Install go

Jul 3, 2022
Wrap unicode text not to exceed a certain width.

wwrap Wrap unicode text not to exceed a specified column width. There is a fold utility in the GNU Coreutils package, but unfortunately it works on by

Dec 1, 2021
Simple utilities for creating ascii text in Go

Oct 30, 2021
Simple text-to-ascii-art generator

Nov 17, 2021