Comment by withinboredom
3 days ago
Looks good! There's an important thing missing from the benchmarks though:
- CPU usage under concurrency: many of these spin-lock or use atomics, which can burn up to 100% CPU time just spinning.
- latency under concurrency: atomics cause cache-line bouncing, which kills latency, especially p99 latency.
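To make those two metrics concrete, here is a rough sketch of how they could be collected. It is in Python purely for illustration (I'm not assuming that is the benchmark's language), and `measure`, `op`, and the result keys are hypothetical names, not anything from the benchmark. It compares process CPU time against wall time and records per-operation latency on every thread:

```python
import statistics
import threading
import time


def measure(op, threads=4, ops_per_thread=10_000):
    """Run op() concurrently; return CPU utilization and latency percentiles."""
    samples = [[] for _ in range(threads)]  # one list per thread, nothing shared

    def worker(bucket):
        for _ in range(ops_per_thread):
            start = time.perf_counter_ns()
            op()
            bucket.append(time.perf_counter_ns() - start)

    workers = [threading.Thread(target=worker, args=(b,)) for b in samples]
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of the whole process
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start

    latencies = [ns for bucket in samples for ns in bucket]
    cuts = statistics.quantiles(latencies, n=100)  # 99 cut points
    return {
        "cpu_cores_used": cpu / wall,  # CPU seconds burned per wall-clock second
        "p50_ns": cuts[49],
        "p99_ns": cuts[98],
    }
```

Something like `measure(lambda: m.get("some-key"))` for each map `m` under test would then show whether a fast median hides a bad tail, or whether throughput is being bought with cores that are only spinning. In a language with real parallel threads the same structure applies: one latency buffer per worker, merged only after the run.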
Yup, that's a valid point. I'll consider adding these metrics.
Am I reading the benchmark code correctly, in that it uses the same prefix for all string keys? That would be pathological for any trie-based implementation.
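For illustration, here is a small sketch of the difference between the two key shapes. It is in Python only to show the idea; both generator names are hypothetical and I'm not claiming this is how the benchmark builds its keys. The first generator puts a fixed prefix on every key, the second puts the varying part first:

```python
import os


def shared_prefix_keys(n, prefix="key-"):
    # Every key starts with the same bytes, so a trie descends
    # through the same path for all of them before keys diverge.
    return [f"{prefix}{i}" for i in range(n)]


def varied_prefix_keys(n):
    # Reverse the decimal digits so the fastest-changing digit comes first;
    # keys stay unique but no longer share a long common prefix.
    return [f"{str(i)[::-1]}-key" for i in range(n)]


if __name__ == "__main__":
    print(repr(os.path.commonprefix(shared_prefix_keys(100))))
    print(repr(os.path.commonprefix(varied_prefix_keys(100))))
```

Running the benchmark with both shapes would show how much of a trie-based map's result comes from the key distribution rather than from the data structure itself.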