Comment by johnfn
10 hours ago
The author says that he runs both the reference implementation and the new Rust implementation through 2 million (!) randomly generated battles and flags every battle where the results don't line up.
This is the key to the whole thing, in my opinion.
If you ask a coding agent to port code from one language to another and don't have a robust mechanism to test that the results are equivalent, you're inevitably going to waste a lot of time and money on junk code that doesn't work.
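For anyone who hasn't seen this kind of differential testing before, here is a minimal sketch of the idea in Python. To be clear, this is not the author's harness: `reference_battle`, `ported_battle`, and the toy battle rules are hypothetical stand-ins I made up, and the "port" has a deliberate bug so the harness has something to catch.

```python
import random


def reference_battle(attack: int, defense: int, hp: int) -> int:
    """Hypothetical reference implementation: count turns until HP runs out."""
    damage = max(1, attack - defense)  # every hit does at least 1 damage
    turns = 0
    while hp > 0:
        hp -= damage
        turns += 1
    return turns


def ported_battle(attack: int, defense: int, hp: int) -> int:
    """Hypothetical port: closed-form version with a deliberate bug."""
    damage = max(0, attack - defense)  # bug: drops the 1-damage floor
    if damage == 0:
        return -1  # the port reports "battle never ends"
    return -(-hp // damage)  # ceiling division


def differential_test(n_cases: int, seed: int = 0):
    """Run both implementations on random inputs and collect disagreements."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    mismatches = []
    for _ in range(n_cases):
        case = (rng.randint(1, 50), rng.randint(0, 50), rng.randint(1, 200))
        expected = reference_battle(*case)
        actual = ported_battle(*case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


if __name__ == "__main__":
    n = 10_000
    bad = differential_test(n)
    print(f"pass rate: {100 * (n - len(bad)) / n:.2f}%")
    for case, expected, actual in bad[:5]:
        print("mismatch:", case, "reference:", expected, "port:", actual)
```

The useful part is that each flagged case is a concrete, reproducible input you can hand back to the agent, instead of a vague "it doesn't work".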
Yeah, and he claims a pass rate of 99.96%. At that point you might be running into bugs in the original implementation.
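For scale, here is the back-of-the-envelope arithmetic, assuming the 99.96% figure applies to the same 2 million battles (the comments above don't say so explicitly):

```python
total_battles = 2_000_000
pass_rate_bp = 9_996  # 99.96% expressed in basis points, to avoid float rounding

# Battles where the two implementations disagree, under the assumption above.
mismatches = total_battles * (10_000 - pass_rate_bp) // 10_000
print(mismatches)  # 800
```

So that would be about 800 battles to triage, which is few enough to look through by hand for reference-side bugs.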