Comment by withinboredom

3 days ago

Looks good! Two important things are missing from the benchmarks, though:

- CPU usage under concurrency: many of these use spin-locks or atomics, which can consume up to 100% CPU time just spinning.

- latency under concurrency: atomics cause cache-line bouncing, which kills latency, especially p99 latency.
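
For illustration, here is a rough sketch of how both metrics could be collected, written in Python for brevity. Everything in it is hypothetical and not taken from the benchmark under discussion: `LockedMap`, `run_benchmark`, and the thread and operation counts are made up for the example. Python's GIL also hides real cache-line effects, so this only shows the shape of the measurement: per-operation latencies for p50/p99, and CPU time divided by wall time to expose spinning.

```python
import math
import threading
import time


class LockedMap:
    """Hypothetical stand-in for the map under test: a dict behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:
            self._data[key] = value


def percentile(sorted_samples, q):
    """Nearest-rank percentile of an ascending list, with q in (0, 100]."""
    rank = max(1, math.ceil(len(sorted_samples) * q / 100))
    return sorted_samples[rank - 1]


def run_benchmark(num_threads=4, ops_per_thread=1000):
    m = LockedMap()
    # One sample list per thread, so recording a latency needs no extra lock.
    per_thread = [[] for _ in range(num_threads)]

    def worker(tid):
        samples = per_thread[tid]
        for i in range(ops_per_thread):
            start = time.perf_counter_ns()
            m.put((tid, i), i)
            samples.append(time.perf_counter_ns() - start)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]

    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of the whole process, all threads
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start

    latencies = sorted(s for samples in per_thread for s in samples)
    return {
        "ops": len(latencies),
        "p50_ns": percentile(latencies, 50),
        "p99_ns": percentile(latencies, 99),
        # CPU seconds burned per wall-clock second; this rising while
        # throughput stays flat would hint at time lost to spinning.
        "cpu_per_wall": cpu / wall if wall > 0 else 0.0,
    }


if __name__ == "__main__":
    print(run_benchmark())
```

Reporting `p99_ns` next to the mean matters because contention tends to show up in the tail long before it moves the average.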

Yup, that's a valid point. I'll consider adding these metrics.

  • Am I reading the benchmark code correctly, in that it uses the same prefix for all string keys? This would be pathological for any trie-based implementation.
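
One way to check whether the shared prefix is what skews the results would be to run the same benchmark on two key sets, one with a common prefix and one without. The sketch below is in Python and is only an assumption about how that could look: `shared_prefix_keys`, `varied_prefix_keys`, and the `"key_"` prefix are hypothetical names, not taken from the benchmark code.

```python
import os
import random


def shared_prefix_keys(n, prefix="key_"):
    """Keys that all start with the same prefix (the suspected pathological case)."""
    return [f"{prefix}{i}" for i in range(n)]


def varied_prefix_keys(n, seed=0, prefix_len=4):
    """Keys whose leading characters differ, so a trie branches near the root."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    keys = []
    for i in range(n):
        prefix = "".join(rng.choice(alphabet) for _ in range(prefix_len))
        keys.append(f"{prefix}_{i}")  # index suffix keeps keys unique
    return keys


if __name__ == "__main__":
    shared = shared_prefix_keys(1000)
    varied = varied_prefix_keys(1000)
    # os.path.commonprefix compares character by character, so it works on
    # arbitrary strings, not just filesystem paths.
    print(repr(os.path.commonprefix(shared)))
    print(repr(os.path.commonprefix(varied)))
```

If a trie-based implementation looks markedly different between the two key sets, the shared prefix is probably influencing the published numbers.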