Comment by ebonnafoux
9 days ago
You can easily build an RLAIF loop.
- Take a list of n animals * m vehicles
- Ask an LLM to generate an SVG for each of these n*m combinations
- Render a PNG from each SVG
- Ask a vision-capable model to grade the result
- Update the LLM's weights according to the grades
No need for humans to draw the dataset, and no need for humans to evaluate the results.
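
A rough sketch of the data-collection half of that loop, to make the steps concrete. Everything here is hypothetical: `build_prompts`, `collect_rewards`, and the `Sample` type are names made up for illustration, the prompt wording is an assumption, and the three callables stand in for whatever LLM, SVG renderer, and vision grader you actually use. The weight update itself is left out, since it depends on the training setup.

```python
from itertools import product
from typing import Callable, List, NamedTuple


class Sample(NamedTuple):
    """One graded generation: the prompt, the SVG text, and the grader's score."""
    prompt: str
    svg: str
    reward: float


def build_prompts(animals: List[str], vehicles: List[str]) -> List[str]:
    """Return one prompt per (animal, vehicle) pair, n * m prompts in total."""
    # The prompt wording is an assumption, not something specified in the comment.
    return [
        f"Generate an SVG of a {animal} riding a {vehicle}"
        for animal, vehicle in product(animals, vehicles)
    ]


def collect_rewards(
    prompts: List[str],
    generate_svg: Callable[[str], str],       # hypothetical LLM call: prompt -> SVG text
    render_png: Callable[[str], bytes],       # hypothetical renderer: SVG text -> PNG bytes
    grade_png: Callable[[str, bytes], float], # hypothetical vision grader: (prompt, PNG) -> score
) -> List[Sample]:
    """Run generate -> render -> grade once per prompt and keep the scores."""
    samples = []
    for prompt in prompts:
        svg = generate_svg(prompt)
        png = render_png(svg)
        reward = grade_png(prompt, png)
        samples.append(Sample(prompt, svg, reward))
    return samples


if __name__ == "__main__":
    # Stub callables so the sketch runs without any model or renderer installed.
    prompts = build_prompts(["pelican", "otter"], ["bicycle", "kayak"])
    samples = collect_rewards(
        prompts,
        generate_svg=lambda prompt: "<svg xmlns='http://www.w3.org/2000/svg'/>",
        render_png=lambda svg: svg.encode("utf-8"),  # placeholder, not a real PNG
        grade_png=lambda prompt, png: 0.5,           # placeholder constant score
    )
    for sample in samples:
        print(sample.prompt, sample.reward)
```

The resulting `(prompt, svg, reward)` records are what you would feed to whichever RL trainer does the actual weight update.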