Comment by dgacmu

1 month ago

Oh man, that's funny to see one of my grad school class projects in that list. Takes me back. :-)

From that experience: The LLM is likely to do drastically better. Most of the prior work, mine included, took a genetic algorithm approach, but an LLM is more likely to make coherent multi-instruction modifications.

It's a shame they didn't compare against some of the standard Core War benchmarks to facilitate comparisons with prior work, though. That makes it hard to say for sure that they're better. https://corewar.co.uk/bench.htm

For anybody who stumbles over this thread and is curious:

Ring Warrior Enhanced v9 has a Wilkies score of 34, and

Spiral Bomber Optimized v22 has a Wilkies score of 85.

At least that's what my quick and dirty check with exMARS says :-)

34 is not that great. 85 is better, but I think some Core War evolvers can match it. For instance, the MEVO example at https://newton.freehostia.com/net/corewar/evol/ describes an evolved warrior with a score of 93.

I'm not sure if that will hold up. The LLM is not going to do anything random, and randomness is actually a powerful component that makes original output possible.

  • I wonder if a combination would be useful. Use an actual GA to do the mutation, and then let an LLM "fix" each mutated child.

    • Could be. But the interesting thing is that all you can do here is optimize. Random chance is - like attention ;) - all you need.
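For anyone who wants to play with the hybrid idea, here is a rough sketch of what one generation might look like: a plain GA does the random mutation, then a repair step gets a chance to clean up each child before scoring. Everything here is hypothetical: `POOL`, `mutate`, and `step` are names made up for illustration, `repair` is just a callable where an LLM call could go, and `score` stands in for running the warrior against a benchmark in a MARS simulator. It is not how the linked work or any existing evolver does it.

```python
import random
from typing import Callable, List

Warrior = List[str]  # one Redcode instruction per entry

# Tiny illustrative instruction pool (an assumption); a real evolver
# would generate opcodes, modifiers, and operands itself.
POOL = ["mov 0, 1", "add #4, 3", "jmp -2", "dat #0, #0", "spl 0"]


def mutate(w: Warrior, rng: random.Random) -> Warrior:
    """Return a copy of a non-empty warrior with one random edit applied."""
    child = list(w)
    op = rng.choice(["replace", "insert", "delete"])
    if op == "delete" and len(child) > 1:
        del child[rng.randrange(len(child))]
    elif op == "insert":
        child.insert(rng.randrange(len(child) + 1), rng.choice(POOL))
    else:
        # "replace", or a "delete" that would have emptied the warrior
        child[rng.randrange(len(child))] = rng.choice(POOL)
    return child


def step(
    pop: List[Warrior],
    score: Callable[[Warrior], float],
    repair: Callable[[Warrior], Warrior],
    rng: random.Random,
    n_children: int = 4,
) -> List[Warrior]:
    """One generation: mutate each parent, repair every child,
    then keep the best len(pop) of parents plus children."""
    children = [repair(mutate(p, rng)) for p in pop for _ in range(n_children)]
    return sorted(pop + children, key=score, reverse=True)[: len(pop)]


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [["mov 0, 1"]]
    # Toy stand-ins: identity repair, and length as a fake fitness.
    for _ in range(5):
        pop = step(pop, score=len, repair=lambda w: w, rng=rng)
    print(pop[0])
```

The randomness stays entirely in `mutate`, so the repair step only ever sees children it did not choose itself. Whether an LLM in that slot preserves the useful surprises or just irons them out is exactly the open question in this subthread.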