Go library that provides fuzzy string matching optimized for filenames and code symbols in the style of Sublime Text, VSCode, IntelliJ IDEA et al.

gopher looking for stuff gopher found stuff

fuzzy

Build Status Documentation

Go library that provides fuzzy string matching optimized for filenames and code symbols in the style of Sublime Text, VSCode, IntelliJ IDEA et al. This library is external dependency-free. It only depends on the Go standard library.

Features

  • Intuitive matching. Results are returned in descending order of match quality. Quality is determined by:

    • The first character in the pattern matches the first character in the match string.
    • The matched character is camel cased.
    • The matched character follows a separator such as an underscore character.
    • The matched character is adjacent to a previous match.
  • Speed. Matches are returned in milliseconds. It's perfect for interactive search boxes.

  • The positions of matches are returned. Allows you to highlight matching characters.

  • Unicode aware.

Demo

Here is a demo of matching various patterns against ~16K files from the Unreal Engine 4 codebase.

demo

You can run the demo yourself like so:

cd _example/
go get github.com/jroimartin/gocui
go run main.go

Usage

The following example prints out matches with the matched chars in bold.

package main

import (
	"fmt"

	"github.com/sahilm/fuzzy"
)

func main() {
	const bold = "\033[1m%s\033[0m"
	pattern := "mnr"
	data := []string{"game.cpp", "moduleNameResolver.ts", "my name is_Ramsey"}

	matches := fuzzy.Find(pattern, data)

	for _, match := range matches {
		for i := 0; i < len(match.Str); i++ {
			if contains(i, match.MatchedIndexes) {
				fmt.Print(fmt.Sprintf(bold, string(match.Str[i])))
			} else {
				fmt.Print(string(match.Str[i]))
			}
		}
		fmt.Println()
	}
}

func contains(needle int, haystack []int) bool {
	for _, i := range haystack {
		if needle == i {
			return true
		}
	}
	return false
}

If the data you want to match isn't a slice of strings, you can use FindFrom by implementing the provided Source interface. Here's an example:

package main

import (
	"fmt"

	"github.com/sahilm/fuzzy"
)

type employee struct {
	name string
	age  int
}

type employees []employee

func (e employees) String(i int) string {
	return e[i].name
}

func (e employees) Len() int {
	return len(e)
}

func main() {
	emps := employees{
		{
			name: "Alice",
			age:  45,
		},
		{
			name: "Bob",
			age:  35,
		},
		{
			name: "Allie",
			age:  35,
		},
	}
	results := fuzzy.FindFrom("al", emps)
	for _, r := range results {
		fmt.Println(emps[r.Index])
	}
}

Check out the godoc for detailed documentation.

Installation

go get github.com/sahilm/fuzzy or use your favorite dependency management tool.

Speed

Here are a few benchmark results on a normal laptop.

BenchmarkFind/with_unreal_4_(~16K_files)-4         	     100	  12915315 ns/op
BenchmarkFind/with_linux_kernel_(~60K_files)-4     	      50	  30885038 ns/op

Matching a pattern against ~60K files from the Linux kernel takes about 30ms.

Contributing

Everyone is welcome to contribute. Please send me a pull request or file an issue. I promise to respond promptly.

Credits

  • @ericpauley & @lunixbochs contributed Unicode awareness and various performance optimisations.

  • The algorithm is based of the awesome work of forrestthewoods. See this blog post for details of the algorithm.

  • The artwork is by my lovely wife Sanah. It's based on the Go Gopher.

  • The Go gopher was designed by Renee French (http://reneefrench.blogspot.com/). The design is licensed under the Creative Commons 3.0 Attributions license.

License

The MIT License (MIT)

Copyright (c) 2017 Sahil Muthoo

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Owner
Sahil Muthoo
I write software for the joy of creation. I love making the sand castles in my head come alive inside a computer.
Sahil Muthoo
Comments
  • Using an interface instead of a list of strings for data source

    Using an interface instead of a list of strings for data source

    I'd like to offer a possible improvement to the API of Find. In the style of the sort.Sort interface, I've put an interface between the data source and Find(). The motivation is to allow using Find() for lists of things that might not be strings. In my particular case, I had an array of structs that included a Name field. I was able to use Find (or the renamed FindFrom) like this:

    type MyStruct struct {
        SomeField string
        SomeID int64
        Name string
    }
    
    type MyStructList []*MyStruct
    
    func (a MyStructList) String(i int) string { return a[i].Name }
    func (a MyStructList) Len() int { return len(a[i]) }
    
    var thingList []*MyString
    
    // Do something to fill thingList
    
    matches := fuzzy.FindFrom(pattern, MyStructList(thingList))
    

    It strikes me as a much more general way of passing data to a function, rather than limiting the package to only work for lists of strings.

  • FindFrom return another struct value?

    FindFrom return another struct value?

    Fairly new to go, just have a question about the FindFrom function.

    1. Using the README example, is it possible to return the "age" field with the result?

    Edit: After playing with it a bit more...

    1. I am trying to use this as an auto complete in a react app. I noticed when imputing a quotation mark, on my computer it is U+0022 (Standard straight quotation mark) but when entering on my phone the value is U+201D ” (Right double quotation mark) and the search on my phone will not match my data. Would if be worth it to make U+0022,U+201C,U+201D equal when searching?

    2. Is there a more elegant way to limit the search results than this? maybe the FindFrom function could have a functional option to limit the result?

    	r := results.Len()
    	if r > 10 {
    		r = 10
    	}
    	results = results[:r]
    
  • Fixed for non-ascii characters and optimized

    Fixed for non-ascii characters and optimized

    This PR removes index access to input strings and instead uses range iteration and runeLen offsets.

    It also switches from string representations to rune representations, and replaces toLower conversions with equalFold checks, which are significantly faster.

    Performance numbers:

    benchmark                                          old ns/op     new ns/op     delta
    BenchmarkFind/with_unreal_4_(~16K_files)-8         18821066      4926121       -73.83%
    BenchmarkFind/with_linux_kernel_(~60K_files)-8     45350582      11973487      -73.60%
    
    benchmark                                          old allocs     new allocs     delta
    BenchmarkFind/with_unreal_4_(~16K_files)-8         207238         897            -99.57%
    BenchmarkFind/with_linux_kernel_(~60K_files)-8     190717         204            -99.89%
    
    benchmark                                          old bytes     new bytes     delta
    BenchmarkFind/with_unreal_4_(~16K_files)-8         5254408       158912        -96.98%
    BenchmarkFind/with_linux_kernel_(~60K_files)-8     12121526      38496         -99.68%
    
  • Feature interfaces

    Feature interfaces

    Hi, I added a new function that takes []*interface{} as input and extracts the string from it with fmt.Sprint() function.

    I added val *interface{} to match struct, it returns nil if you are searching for a string, but on the other hand, if you are searching for *interface{} it will return the actual value.

  • benchmarks comparison with similar fuzzy search engines in Go

    benchmarks comparison with similar fuzzy search engines in Go

    It might be fun/educational to compare/contrast the benchmark results posted here against other existing fuzzy search engines such as https://github.com/renstrom/fuzzysearch and see how they all stack up. :)

  • AddingPowerSupport_For_golang-github-sahilm-fuzzy

    AddingPowerSupport_For_golang-github-sahilm-fuzzy

    Adding Power Support.

    Adding power support ppc64le This is part of the Ubuntu distribution for ppc64le. This helps us simplify testing later when distributions are re-building and re-releasing. For more info tag @gerrith3.

    The build is successful on both arch:Amd64/ppc64le, Please find the Travis Link: https://travis-ci.com/github/santosh653/fuzzy

    Please Let me know if you need any other details or If I am missing any.. Thank You !!

  • Strings with spaces

    Strings with spaces

    Been using fuzzy for my react autocomplete component and its working very good. Is it possible to tweak fuzzy to ignore the order of words when there are spaces? If I'm searching for "duct tape" I can't find the result when I search "tape duct".

  • Errors in the demo example

    Errors in the demo example

    The initialization of g is g, err = gocui.NewGui(gocui.OutputNormal) instead of g, err := gocui.NewGui(gocui.OutputNormal)

    The usage of g in the finder function gives an error saying g is undefined as g is declared in the main function.

  • [Question] Multithreading

    [Question] Multithreading

    Hello @sahilm

    Do you consider improving performance with multi-threaded implementation? If so, I did this -> https://github.com/isacikgoz/fuzzy/commit/fc2d185b8ab5c12fcc68a446f904cae18fba066d if you approve it I can send a PR.

    Thanks in advance.

  • Scoring - implement score based on length of gaps between matched letter?

    Scoring - implement score based on length of gaps between matched letter?

    So this is a suggestion/question. The readme indicates that

    The matched character is adjacent to a previous match.

    This works in the following case - search string of blog image

    However, if I now type rpy so that the search string is blogrpy then image

    In this case, the first match has all the characters - but its obvious that it's of lesser quality.

    I'm not sure how the adjacency scoring is implemented but could do something with sum of number of letters between matches? Something which has a higher number of letters between matched chars gets a lower score.

  • feat: fuzzy find without sorting

    feat: fuzzy find without sorting

    alternative FindNoSort method to do only fuzzy matching, without sorting.

    This might be useful when the list is already sorted by some other priority and you want to keep that order...

  • Community fixes + Speed up the search

    Community fixes + Speed up the search

    Hi @sahilm

    For our personal use, we forked your project and merged interesting contributions from the community. The contributions in this merge request do not really change your original source code (minor fixes).

    I don't know what you are planning for your Git repo.... Are you still motivated to maintain it?

    I saw that @isacikgoz is also maintaining the fork https://github.com/isacikgoz/fuzzy with other changes…

    Please tell me if you want to merge my pull request: I will swap teal-finance -> sahilm back before you merge it.

  • would this make a good spell checker?

    would this make a good spell checker?

    Wondering if this lib could be used for spell checking? I know the algorithms for spell checkers can be much different than your classic auto complete fuzzy search jam, so just curious.

  • Add option to ignore spaces?

    Add option to ignore spaces?

    Would it be difficult to add an option to have fuzzy.Find() ignore spaces?

    This would be nice to normalize the behavior with fzf.

    For example, if I am searching through a bunch of filepaths, I will sometimes type part of one directory name, , and then a part of some other directory name or the filename.

    In this sense, you are sort of passing in fragments to be matched, and the spaces are just being used to separate the fragments.

    I saw that there was another similar issue requesting support to allow the order to be ignored, which is indeed how fzf operates, but even if the method still expects search phrases to be provided order, this would still be useful, imo.

    Thanks for taking the time to put this library together and share it with the community! It works really well overall and was quite simple to pick up and use.

  • Fuzzing Fuzzy project

    Fuzzing Fuzzy project

    This adds a fuzz target and quickly fixes a first bug found by it

    Reproducer for the bug is https://play.golang.org/p/ahd0SUU1J76

    package main
    
    import (
    	"fmt"
    	"github.com/sahilm/fuzzy"
    )
    
    func main() {
    	pattern := "\n\x13b\x00b\x98ш\x88\x95"
    	datas := []string{"\bL\x05\x00\x00\x00\xef\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x98ш\x88D\x95\x95\xbc\x15\xff\x03\x95\xd6\b%%\nB\n\x00\x00\xef\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x98ш\b\b"}
    	matches := fuzzy.Find(pattern, datas)
    	fmt.Printf("Hello, playground %#+v", matches)
    }
    

    outputs

    panic: runtime error: index out of range [9] with length 9
    
    goroutine 1 [running]:
    github.com/sahilm/fuzzy.FindFrom(0x116e83, 0xa, 0x15e960, 0x40a0e0, 0x40a0e0, 0x1793, 0x430070, 0x1)
    	/tmp/gopath192064974/pkg/mod/github.com/sahilm/[email protected]/fuzzy.go:116 +0xbe0
    
TUI grep tool respect for IntelliJ
TUI grep tool respect for IntelliJ

ilse TUI grep tool respect for IntelliJ Requirements ripgrep for fast grep bat for beautiful preview Features support HeadMatch(FirstMatch), WordMatch

Sep 27, 2022
Golang code-generators used to implement Kubernetes-style API types.

code-generator Golang code-generators used to implement Kubernetes-style API types. Purpose These code-generators can be used in the context of Custom

Dec 30, 2022
Generates random text based on trigrams generated from input text
Generates random text based on trigrams generated from input text

Trigrams Generates random text based on trigrams generated from input text Contents Building Running Using Implementation notes NGram size Maximum wor

Feb 9, 2022
Leftright - A concurrent map that is optimized for scenarios where reads are more frequent than writes

leftright A concurrent map that is optimized for scenarios where reads are more

Jan 30, 2022
rxscan provides functionality to scan text to variables using regular expression capture group.

rxscan rxscan provides functionality to scan text to variables using regular expression capture group. This library is still experimental, use at your

Dec 21, 2020
A go1.18+ package to (maybe) simplify performing operations on slices in a fluent-like style.

sop ✨ W.I.P. ✨ sop (slices operation) is a go1.18+ package to (maybe) simplify performing operations on slices in a fluent-like style with common oper

Oct 1, 2022
💪 Helper Utils For The Go: string, array/slice, map, format, cli, env, filesystem, test and more.
💪 Helper Utils For The Go: string, array/slice, map, format, cli, env, filesystem, test and more.

?? Helper Utils For The Go: string, array/slice, map, format, cli, env, filesystem, test and more. Go 的一些工具函数,格式化,特殊处理,常用信息获取等等

Jan 6, 2023
encLib is a simple golang package for quickly encoding and decoding string data in hex

encLib is a simple golang package for quickly encoding and decoding string data in hex

Nov 1, 2021
Deduplicated and GC-friendly string store

This library helps to store big number of strings in structure with small number of pointers to make it friendly to Go garbage collector.

Nov 10, 2021
Hotswap provides a solution for reloading your go code without restarting your server, interrupting or blocking any ongoing procedure.
Hotswap provides a solution for reloading your go code without restarting your server, interrupting or blocking any ongoing procedure.

Hotswap provides a solution for reloading your go code without restarting your server, interrupting or blocking any ongoing procedure. Hotswap is built upon the plugin mechanism.

Jan 5, 2023
Package strnaming is used to Convert string to camelCase, snake_case, kebab-case.

strnaming Package strnaming is used to Convert string to camelCase, snake_case, kebab-case. Contents strnaming Contents API Examples Install Quick sta

Oct 24, 2021
Cell is a Go package that creates new instances by string in running time.

Cell Cell is a Go package that creates new instances by string in running time. Getting Started Installing To start using CELL, install Go and run go

Dec 20, 2021
ms - 'my story' creates a secure password string which can be memorized with a technique shared by Max.

On 23.12.21 20:22, Stefan Claas wrote: [...] > > Yes, I am aware of that, but how can one memorize a key when traveling > and not taking any devices

Dec 24, 2021
Q2entities - Parse the entities string from a Quake 2 .bsp map file. Written in Go

Q2Entities A simple command-line utility to extract the entities string from a Quake 2 map file. Entities? Binary Space Partitioning maps (.bsp) conta

Apr 9, 2022
Generates a random alphanumeric string of a given length.

randstring randstring.Create () is fast and has minimal memory allocation. It returns a random alphanumeric string of a given length. Install go get g

Jan 7, 2022
Clean-Swift source and test code auto-generator. It can save you time typing 500-600 lines of code.
Clean-Swift source and test code auto-generator. It can save you time typing 500-600 lines of code.

Clean-Swift source & test code auto generator Overview Run Output Basic Usage make config.yaml target_project_name: Miro // target project name copyri

Apr 13, 2022
Code generator that generates boilerplate code for a go http server

http-bootstrapper This is a code generator that uses go templates to generate a bootstrap code for a go http server. Usage Generate go http server cod

Nov 20, 2021
📋 cross-platform clipboard package that supports accessing text and image in Go (macOS/Linux/Windows/Android/iOS)

clipboard Cross platform (macOS/Linux/Windows/Android/iOS) clipboard package in Go import "golang.design/x/clipboard" Features Cross platform supports

Dec 24, 2022
tail a text file and trigger an action if a new line occurs [wip]

Tailpipe Synopsis Config Help Synopsis Tail a file and trigger an action if a new line occurs. Currently only mailing the new line somewhere is suppor

Dec 23, 2021