Comment by theredsix
10 hours ago
Are you guys going to follow up with a paper showing EDEN results match or beat turboquant for needle in a haystack benchmarks?
The note includes extensive experiments and reproduces many of the figures from the TurboQuant paper in our Section 5. Honestly, I think our case is pretty clear-cut as is. I am not sure what the overhead for those specific benchmarks would be, but we will look into it.
(In any case, I want to emphasize that the TurboQuant quantizer is a special case of EDEN.)
With the amount of traction this has gotten, coming out with a clear set of experiments, even in an arXiv paper, would be of great help to showcase your improvements. And if they're easily reproducible, they could be integrated into mainstream inference engines as well, since the main point here is compression with little degradation.
When you use TurboQuant, you are essentially using the EDEN quantizer under a different name, applied to the KV cache.
Both EDEN and its 1-bit variant have been implemented in PyTorch, JAX, and TensorFlow across numerous open-source libraries and are used in various applications. I am currently writing a blog post that will document these in detail.
EDEN defines a scale parameter S, for which we suggest optimal values for both the biased and unbiased versions. As shown in the note I shared, these values yield clear empirical improvements. Consequently, users who rely on the suboptimal S value and the unbiasing method popularized by TurboQuant will generally see worse results than those using EDEN with the optimal scale values suggested in our original papers.
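To make the discussion concrete, here is a minimal sketch of a rotate-then-sign 1-bit quantizer in the EDEN style. Everything here is illustrative, not the authors' reference implementation: `one_bit_quantize` and `dequantize` are hypothetical helper names, a dense random orthogonal matrix stands in for the structured (Hadamard-based) rotation used in practice, and the unbiased scale `S = ||x||^2 / <Rx, sign(Rx)>` is one way to realize the unbiased-scaling idea mentioned in the thread.

```python
import numpy as np

def one_bit_quantize(x, rng):
    """Rotate x, keep only the signs (1 bit/coordinate), and compute a scale.

    Illustrative sketch: real implementations use a fast structured rotation
    (e.g. randomized Hadamard) instead of a dense QR-based orthogonal matrix.
    """
    d = x.shape[0]
    # Random orthogonal rotation (dense, for simplicity of the sketch).
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))
    z = R @ x
    bits = np.sign(z)  # 1 bit per coordinate
    # Unbiased scale choice (assumption for this sketch):
    # S = ||x||^2 / <z, sign(z)>, so that the decoded vector's inner
    # product with x matches ||x||^2.
    S = float(x @ x) / float(z @ bits)
    return bits, S, R

def dequantize(bits, S, R):
    # Undo the rotation and apply the scale.
    return S * (R.T @ bits)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
bits, S, R = one_bit_quantize(x, rng)
x_hat = dequantize(bits, S, R)
```

With this scale choice the per-vector error does not vanish (that is inherent to 1 bit per coordinate), but averaging many independently rotated encodings drives the aggregate error down, which is what matters in mean-estimation and cache-compression settings.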