I want to share a similar exploration I made in another project regarding the impact of `defer`. I avoided using `defer` in the `Add` method, with some interesting results. Since there wasn't a benchmark for `Add`, I created one and used it to test both the current implementation and the new one. The end results are:
```
benchmark               old ns/op   new ns/op   delta
BenchmarkAdd/128B-8     55666075    49689240    -10.74%
BenchmarkAdd/256B-8     54631261    50188483    -8.13%
BenchmarkAdd/512B-8     57324361    50005511    -12.77%
BenchmarkAdd/1K-8       55605380    51304633    -7.73%
BenchmarkAdd/2K-8       55761200    50059447    -10.23%
BenchmarkAdd/4K-8       56268353    50719990    -9.86%
BenchmarkAdd/8K-8       55517450    50338050    -9.33%
BenchmarkAdd/16K-8      57047413    50159699    -12.07%
BenchmarkAdd/32K-8      56182766    50119649    -10.79%

benchmark               old allocs  new allocs  delta
BenchmarkAdd/128B-8     306915      306915      +0.00%
BenchmarkAdd/256B-8     306915      306915      +0.00%
BenchmarkAdd/512B-8     306915      306915      +0.00%
BenchmarkAdd/1K-8       306915      306915      +0.00%
BenchmarkAdd/2K-8       306915      306915      +0.00%
BenchmarkAdd/4K-8       306915      306915      +0.00%
BenchmarkAdd/8K-8       306915      306915      +0.00%
BenchmarkAdd/16K-8      306915      306915      +0.00%
BenchmarkAdd/32K-8      306915      306915      +0.00%

benchmark               old bytes   new bytes   delta
BenchmarkAdd/128B-8     14731932    14731929    -0.00%
BenchmarkAdd/256B-8     14731921    14731923    +0.00%
BenchmarkAdd/512B-8     14731922    14731920    -0.00%
BenchmarkAdd/1K-8       14731920    14731920    +0.00%
BenchmarkAdd/2K-8       14731921    14731920    -0.00%
BenchmarkAdd/4K-8       14731921    14731920    -0.00%
BenchmarkAdd/8K-8       14731921    14731920    -0.00%
BenchmarkAdd/16K-8      14731921    14731920    -0.00%
BenchmarkAdd/32K-8      14731920    14731920    +0.00%
```
This means that not using `defer` seems to have a considerable impact on the ns/op of `Add`.
What are those sub-benchmarks? They just apply the same idea as in bitcask, to test whether the meta size has any influence on the results. Looking at the implementation, it shouldn't, and the results seem to reflect that (so we can ignore the sub-benchmarks if you like; I kept them just in case).
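For readers unfamiliar with how such size-parameterized sub-benchmarks are usually structured in Go, here is a rough, self-contained sketch. The `addSized` helper and the size table are placeholders for illustration only, not the fork's real benchmark code; the actual benchmark exercises the trie's `Add`:

```go
package main

import (
	"flag"
	"fmt"
	"testing"
)

// sizes mirrors the sub-benchmark labels seen in the results (128B..32K).
var sizes = []struct {
	name string
	n    int
}{
	{"128B", 128}, {"256B", 256}, {"512B", 512},
	{"1K", 1 << 10}, {"2K", 2 << 10}, {"4K", 4 << 10},
	{"8K", 8 << 10}, {"16K", 16 << 10}, {"32K", 32 << 10},
}

// addSized is a hypothetical stand-in for the real Add call; it only
// touches the meta slice so the loop has something to measure.
func addSized(meta []byte) int { return len(meta) }

func main() {
	// Register the testing flags so testing.Benchmark can run outside
	// "go test", and keep each run short.
	testing.Init()
	flag.Set("test.benchtime", "10ms")

	for _, s := range sizes {
		meta := make([]byte, s.n)
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				addSized(meta)
			}
		})
		fmt.Printf("BenchmarkAdd/%s\t%s\n", s.name, res)
	}
}
```

In a real `_test.go` file the same table would drive `b.Run(s.name, ...)` sub-benchmarks, which is what produces the `BenchmarkAdd/128B-8`-style names above.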
How to reproduce these results? (Not completely elegant, but enough to show the experiment.)

```sh
git clone --branch master_avoiddefer https://github.com/jsign/trie.git
rm *.txt ; go test -benchmem -benchtime=5s -bench="BenchmarkAdd\b" > before.txt && go test -benchmem -benchtime=5s -bench=BenchmarkAddWithoutDefer > after.txt && sed -i 's/WithoutDefer//g' after.txt && benchcmp before.txt after.txt
```
Small explanation:
- Clone my fork, which has the new benchmark plus both the original and the new implementation of `Add`.
- Run both benchmarks, then fix the output so it can be compared with `benchcmp`.
This idea may apply to other places where `defer sync.Mutex.Unlock()` is used.
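To make the pattern concrete, here is a minimal sketch of the two styles on a hypothetical mutex-protected type (not the trie's actual `Add`). The trade-off: `defer` guarantees the unlock runs even on panic or early return, while the explicit unlock skips the `defer` machinery but must cover every exit path by hand:

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type used only to illustrate the pattern.
type counter struct {
	mu sync.Mutex
	n  int
}

// AddWithDefer uses the common idiom: the unlock always runs, even if
// the body panics, at the cost of the defer overhead.
func (c *counter) AddWithDefer(v int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n += v
	return c.n
}

// AddWithoutDefer unlocks explicitly. This is only safe when every
// return path unlocks and the critical section cannot panic.
func (c *counter) AddWithoutDefer(v int) int {
	c.mu.Lock()
	c.n += v
	n := c.n
	c.mu.Unlock()
	return n
}

func main() {
	c := &counter{}
	c.AddWithDefer(1)
	c.AddWithoutDefer(2)
	fmt.Println(c.n) // 3
}
```

Note that the result is read into a local before unlocking, so the return value is not touched outside the critical section.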
In the case of bitcask the impact was close to 4%; here it seems to be greater. Unless I'm missing something, it looks like the `Add` function is quite fast, so the `defer` overhead is relatively large.