← Back to context

Comment by pleshkov

3 days ago

Author here — questions and pushback both welcome.

In the article, you mention this approach requires no search over hyper-parameter, because the method comprises a closed-form solution with "simple" linear algebra. I agree with this, but do you not in think need to tune the L2-regularization strength? That would for me be a hyper-parameter you would need to do a CV over (or similarly).

You should benchmark the retrieval speed of each method in terms of queries per second. I suspect that the gain in bandwidth you get from slightly better compression will be defeated by decompression being much more expensive.

Cool idea. But it only works when the data never changes. could you make a streaming/incremental version? One that updates the math cheaply when new data arrives, instead of recomputing everything, or does the math fundamentally prevent it?