Comment by macNchz

15 hours ago

I've used embeddings to define clusters, then passed sampled documents from each cluster to an LLM to create labels for each grouping. I had pretty impressive results from this approach when creating a category/subcategory labels for a collection of texts I worked on recently.

That's interesting, it sounds a bit like those cluster graph visualisation techniques. Unfortunately, my texts seem to fall into clusters that really don't match the ones that I had hoped to get out of these methods. I guess it's just a matter of fine-tuning now.