Comment by jama211

1 month ago

The difference in model performance between 2024 and 2025 has been stark, and that graph really shows it. There are still many people on these forums who seem to think AIs produce terrible code unless heavily supervised, and I can't help but suspect some of them tried it a while ago and just don't realize how different it is now compared to even quite recently.

Madmallard 1 month ago

I used Gemini Pro and Claude Pro a couple of dozen times yesterday, and have been using them basically daily.

I have a project to convert my multiplayer XNA game from C# to JavaScript and to add networking to the gameplay using LLMs.

They are far worse at it now than they were a year ago. A year ago they actually implemented the requirements (though inaccurately) to the best of their ability. Especially Gemini.

Now they don't even come remotely close to implementing just the basic requirements.

The thing is, I'm giving them the entirety of the C# source code and spelling out what they should do.

simonw 1 month ago
Weird. I would expect Gemini 3 Pro and Claude Opus 4.5 to run rings around Gemini 1.5 Pro and Claude Sonnet 3.5.
How are you running them - regular chat interface, or do you have them set up with Claude Code or Gemini CLI?
- Madmallard 1 month ago
  
  Using the chat interface primarily with various prompting strategies.
  I am considering making a thread where I compel others to attempt to get what I'm trying to get out of it and show me their work.
  The game is only around 25,000-30,000 LOC in C#.
  
jennyholzer3 1 month ago

"They are far worse at it now than they were a year ago."
This is the part they REALLY don't want you to say.
They can no longer train these models effectively and their performance is slipping. Late 2023 was the golden age.