- I have improved fastpfor by converting the bytes on the fly. It is still not as fast as it should be, but it is about 10% faster. What we really need is a simple cast between slice types, and it should be possible:
http://stackoverflow.com/questions/11924196/convert-between-slices-of-different-types
but I couldn't make it work. In Java, the cast happens under the hood, which is why I use a ByteBuffer to speed things up. Your implementation of a ByteBuffer in Go is nice and elegant, but it still does a lot of unnecessary copying.
- You test with very, very large arrays, which takes several seconds, at least on my machine. This seems unnecessary. If you want to test on very large arrays, you could construct a large slice and assign some simple values into it instead of generating fancy random data.
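
To make both points concrete, here is a rough sketch, not a drop-in fix. It assumes Go 1.17 or later (for `unsafe.Slice`), and the names `int32sAsBytes` and `fillSimple` are made up for illustration; they are not part of the existing code. The first helper views an `[]int32` as a `[]byte` without copying, which is the kind of cast discussed in the Stack Overflow link. The second fills a large slice with cheap deterministic values instead of random data. I only show the `[]int32` to `[]byte` direction because going the other way also requires the byte slice to be suitably aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

// int32sAsBytes returns a []byte view over the memory backing src.
// Nothing is copied: both slices share the same storage, so a write
// through one is visible through the other. Hypothetical helper.
func int32sAsBytes(src []int32) []byte {
	if len(src) == 0 {
		return nil
	}
	// An int32 is always 4 bytes, so the view is 4x as long.
	return unsafe.Slice((*byte)(unsafe.Pointer(&src[0])), len(src)*4)
}

// fillSimple returns a slice of n small, predictable values.
// Hypothetical helper standing in for the random data generator.
func fillSimple(n int) []int32 {
	data := make([]int32, n)
	for i := range data {
		data[i] = int32(i & 0xFF)
	}
	return data
}

func main() {
	data := fillSimple(1 << 20) // about one million values
	raw := int32sAsBytes(data)
	fmt.Println(len(data), len(raw))
}
```

The byte order you see through the view is the machine's native one, so this only helps when the on-disk or on-wire format does not need a fixed endianness. It also relies on `unsafe`, so it is worth checking against a benchmark before committing to it.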