Comment by pu_pe

4 months ago

What about accuracy? Maybe I'm missing something, but the crucial piece of information I don't see is whether the labels produced by the two methods actually agree with each other. The fact that OP ended up with >6000 categories using LLMs makes me wonder whether there was any validation at all, or whether you just let the LLMs freestyle.