The readme says nothing about how files are checked for uniqueness. I've looked through the source code and could identify neither the use of a cryptographic hash nor a byte-for-byte comparison of the actual contents (although I might have missed something).
The only thing I found was the use of CRC32 and the following comment in file_hash.go:
// GetDigest generates entity.FileDigest of the file provided, in an extremely fast manner
// without compromising the quality of file's uniqueness.
//
// When this function was called on approximately 172k files (mix of photos, videos, audio files, PDFs etc.), the
// uniqueness identified by this matched uniqueness identified by SHA-256 for *all* files
To me it seems that, for any file, uniqueness is determined by a CRC32 computed over at most 8 KiB of the file's contents (for larger files, taken from the beginning, middle and end).
If this is the case, I personally find it very concerning... It might work for high-entropy data formats (like the audio, video and other compressed formats you've tested against), but imagine using it on text files, say several copies of the same source tree in different folders: I think it would then be trivial to produce "duplicates" which aren't actually duplicates. See the sketch below.
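To illustrate the risk, here is a minimal sketch of such a sampled digest (the 8 KiB split across beginning/middle/end and the exact offsets are my assumptions, not taken from file_hash.go), together with two same-size files that it treats as duplicates even though a full hash shows they differ:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"hash/crc32"
)

// chunk size per sampled region; the real tool's numbers may differ (assumption).
const chunk = 8 * 1024 / 3

// sampledDigest mimics a partial-content digest: CRC32 over the first, middle
// and last chunk of the data plus its length. This is my guess at the scheme,
// not the project's actual code.
func sampledDigest(data []byte) uint32 {
	h := crc32.NewIEEE()
	if len(data) <= 3*chunk {
		h.Write(data)
	} else {
		mid := len(data) / 2
		h.Write(data[:chunk])
		h.Write(data[mid : mid+chunk])
		h.Write(data[len(data)-chunk:])
	}
	fmt.Fprintf(h, "%d", len(data)) // fold in the size, as such schemes often do
	return h.Sum32()
}

func main() {
	// Two same-size files: identical except for one change that falls outside
	// the sampled regions.
	a := bytes.Repeat([]byte("package main // filler line to pad the file\n"), 2000)
	b := append([]byte(nil), a...)
	copy(b[20000:], "// this line only exists in the second copy")

	fmt.Println("sampled digests equal:", sampledDigest(a) == sampledDigest(b)) // true
	fmt.Println("sha256 equal:         ", sha256.Sum256(a) == sha256.Sum256(b)) // false
}
```

Running this prints that the sampled digests match while SHA-256 disagrees, which is exactly the false-duplicate case I'm worried about.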
I would dare to say that the statement in the readme, "blazingly-fast simple-to-use tool to find duplicate files", borders on false advertising... Yes, it is blazingly fast, but only because you don't actually read the whole file, nor compute any sort of cryptographic hash, and as a consequence you don't actually verify uniqueness...
However, to be constructive: I do think that using CRC32 is a good start for finding duplicate candidates (two files with different CRC32 values are certainly different), so after you cluster files with the same CRC32 (and, I might add, the same size) you could then compute a proper cryptographic hash over the full contents of each candidate (I recommend benchmarking a few, e.g. SHA-1, SHA-512/256 and BLAKE3, to see which is fastest on your architecture; I went with BLAKE3 in my own tests).
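Something along these lines (a rough sketch, not drop-in code for this project; the first, cheap pass could equally be your existing sampled digest, and I use SHA-256 from the standard library here only because BLAKE3 would pull in a third-party package):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash/crc32"
	"io"
	"os"
	"path/filepath"
)

// fileCRC32 reads the file once and returns its CRC32 (cheap first pass).
// In this tool, the existing fast sampled digest could play this role instead.
func fileCRC32(path string) (uint32, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	h := crc32.NewIEEE()
	if _, err := io.Copy(h, f); err != nil {
		return 0, err
	}
	return h.Sum32(), nil
}

// fileSHA256 is the expensive confirmation pass, run only on candidates.
func fileSHA256(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", h.Sum(nil)), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: dedup <dir>")
		os.Exit(1)
	}
	root := os.Args[1]

	// Stage 1: cluster by (size, CRC32); files that differ here cannot be duplicates.
	type key struct {
		size int64
		crc  uint32
	}
	candidates := map[key][]string{}

	filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil || !info.Mode().IsRegular() {
			return nil
		}
		crc, err := fileCRC32(path)
		if err != nil {
			return nil
		}
		k := key{info.Size(), crc}
		candidates[k] = append(candidates[k], path)
		return nil
	})

	// Stage 2: within each cluster, confirm with a cryptographic hash.
	for _, paths := range candidates {
		if len(paths) < 2 {
			continue
		}
		confirmed := map[string][]string{}
		for _, p := range paths {
			if sum, err := fileSHA256(p); err == nil {
				confirmed[sum] = append(confirmed[sum], p)
			}
		}
		for sum, dup := range confirmed {
			if len(dup) > 1 {
				fmt.Printf("duplicates (sha256 %s...):\n  %v\n", sum[:12], dup)
			}
		}
	}
}
```

The second pass only touches files that already collide on size and CRC32, so the common case stays fast while every reported duplicate is backed by a full cryptographic hash.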