Comment by meander_water
4 months ago
Why wouldn't you use OP's approach to build up the representative embeddings, and then train the MLP on those?
That way you can effectively handle open sets and train a more accurate MLP model.
With your approach I don't think you can get a representative list of N tweets that covers all possible categories. Even if you did, the LLM would be subject to context rot and token limits.
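Roughly what I mean, as a minimal sketch: embed a labeled seed set once, train a small MLP on the vectors, and keep the LLM out of the inference loop entirely. The embedding model, labels, and example tweets below are illustrative placeholders, not anything from OP's setup:

```python
# Sketch: representative embeddings -> MLP classifier.
# Assumes sentence-transformers and scikit-learn; model name is an example.
from sentence_transformers import SentenceTransformer
from sklearn.neural_network import MLPClassifier

# Hypothetical seed set, labeled up front (e.g. by an LLM, per OP's approach).
seed_tweets = ["new GPU benchmarks just dropped", "our seed round closed today"]
seed_labels = ["tech", "business"]

# Embed once into fixed-size vectors; the MLP only ever sees these,
# so classification later isn't subject to context rot or token limits.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(seed_tweets)

clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
clf.fit(X, seed_labels)

# New tweets are classified by embedding them the same way.
print(clf.predict(encoder.encode(["Series B announced this morning"])))
```

New categories then only require adding labeled examples and retraining the cheap MLP, rather than re-prompting with an ever-growing list of tweets.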