Comment by recursive
8 days ago
No, not every combination. The question is about the specific combination of a pelican on a bicycle. It might be easy to come up with another test, but we're looking at the results from a particular one here.
More likely you would just train the model to emit SVG from a description of a scene, and create the training data from raster images.
None of this works if the testers are collaborating with the trainers. The tests ostensibly need to be at arm's length from the training. If the trainers ever start overfitting to the test, the testers would secretly come up with a new one.
You can easily make an RLAIF loop.
- Take a list of n animals * m vehicles
- Ask an LLM to generate SVG for each of these n*m combinations
- Generate a PNG from each SVG
- Ask a model with vision to grade the result
- Update your weights accordingly
No need for humans to draw the dataset, no need for humans to evaluate.
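Roughly, the loop above could be sketched like this in Python. This is only an illustration under assumptions: the animal and vehicle lists, the function names, and the three `fake_*` stand-ins (generator, rasterizer, grader) are all hypothetical placeholders, not any real model or library API. The per-prompt mean baseline is one common choice for turning grades into a training signal, not necessarily what a lab would use.

```python
import itertools
import random
from collections import defaultdict

ANIMALS = ["pelican", "otter", "giraffe"]    # n animals (illustrative)
VEHICLES = ["bicycle", "scooter", "canoe"]   # m vehicles (illustrative)


def build_prompts(animals, vehicles):
    """One prompt per animal/vehicle pair: n * m in total."""
    return [f"Generate an SVG of a {a} riding a {v}"
            for a, v in itertools.product(animals, vehicles)]


def collect_rewards(prompts, generate_svg, rasterize, grade, samples_per_prompt=2):
    """Sample SVGs, render them, and have the grader score each render."""
    records = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            svg = generate_svg(prompt)
            png = rasterize(svg)
            records.append((prompt, svg, grade(prompt, png)))
    return records


def to_advantages(records):
    """Subtract each prompt's mean reward (a simple baseline)."""
    rewards = defaultdict(list)
    for prompt, _svg, reward in records:
        rewards[prompt].append(reward)
    baseline = {p: sum(r) / len(r) for p, r in rewards.items()}
    return [(p, svg, reward - baseline[p]) for p, svg, reward in records]


# --- Hypothetical stand-ins so the sketch runs without any model ---
_rng = random.Random(0)


def fake_generate_svg(prompt):
    # A real policy model would produce the drawing here.
    radius = _rng.randint(1, 50)
    return f'<svg xmlns="http://www.w3.org/2000/svg"><circle r="{radius}"/></svg>'


def fake_rasterize(svg):
    # Placeholder: a real pipeline would render the SVG to PNG bytes.
    return svg.encode("utf-8")


def fake_grade(prompt, png):
    # Placeholder: a real vision model would return a score for the image.
    return _rng.random()
```

The (prompt, svg, advantage) tuples from `to_advantages` are what a trainer would consume. The weight update itself depends on the training stack, so it is left out here.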