Comment by dgacmu
1 day ago
Oh man, that's funny to see one of my grad school class projects in that list. Takes me back. :-)
From that experience: The LLM is likely to do drastically better. Most of the prior work, mine included, took a genetic algorithm approach, but an LLM is more likely to make coherent multi-instruction modifications.
It's a shame they didn't compare against some of the standard Core War benchmarks as a way to facilitate comparisons to prior work, though. Makes it hard to say for sure that they're better. https://corewar.co.uk/bench.htm
I'm not sure if that will hold up. The LLM is not going to do anything random, and randomness is actually a powerful component that makes original output possible.
I wonder if a combination would be useful. Use an actual GA to do the mutation, and then let an LLM "fix" each mutated child.
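Roughly what I have in mind, as a minimal sketch and not anything from the paper or from prior work: a random mutation step followed by a "repair" pass on each child. Everything here is hypothetical: `mutate`, `evolve`, the tiny opcode list, and especially `repair`, which stands in for an LLM call (I'm passing it in as a plain function so the sketch runs without any model). `fitness` is also supplied by the caller; in practice it would be a score from running battles. For brevity this keeps only the single best program each generation, so it is closer to hill climbing than a full GA with a population and crossover.

```python
import random
from typing import Callable, List

# A program is a list of Redcode-like instruction strings (illustrative only).
Program = List[str]

# Small illustrative subset of opcodes, not a complete instruction set.
OPCODES = ["MOV", "ADD", "SUB", "JMP", "DAT", "SPL"]


def mutate(program: Program, rng: random.Random) -> Program:
    """Random step: swap the opcode of one randomly chosen instruction."""
    child = list(program)  # copy so the parent is left untouched
    i = rng.randrange(len(child))
    _, _, operands = child[i].partition(" ")
    child[i] = f"{rng.choice(OPCODES)} {operands}".strip()
    return child


def evolve(
    seed: Program,
    fitness: Callable[[Program], float],
    repair: Callable[[Program], Program],  # hypothetical stand-in for an LLM "fix" pass
    generations: int = 20,
    children: int = 8,
    rng_seed: int = 0,
) -> Program:
    """Mutate randomly, repair each child, keep a child only if it scores higher."""
    rng = random.Random(rng_seed)
    best, best_score = seed, fitness(seed)
    for _ in range(generations):
        for _ in range(children):
            child = repair(mutate(best, rng))
            score = fitness(child)
            if score > best_score:
                best, best_score = child, score
    return best
```

With `repair=lambda p: p` this degenerates to plain random mutation, which gives a baseline to compare the repaired version against.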
Could be. But the interesting thing is that all you can do here is optimize. Random chance is - like attention ;) - all you need.