Comment by chrisfrantz

2 days ago

This is where I believe we are headed as well. Frontier models "curate" and provide guardrails, while very fast and competent agents do the work at incredibly high throughput. Once frontier models crack the "taste" barrier and context is wide enough, even this level of delivery + intelligence will be sufficient to implement the work.

Taste is why I switched from GLM-4.6 to Sonnet. I found myself constantly asking Sonnet to make the code more elegant, and after the fourth time I laughed at the absurdity and just switched models.

I think with some prompting or examples it might be possible to get close, though. At any rate, 1k TPS is hard to beat!

  • I think you meant from Sonnet to GLM-4.6?

    • Did you have the opposite experience?

      It was a little while ago, but GLM's code was generally about twice as long, and about 30% less readable than Sonnet's even at the same length.

      I was able to improve this with prompting and examples, but at some point I realized I would prefer the simplicity of using the real thing.

      I had been using GLM in Claude Code with Claude Code Router, because while you can just change the API endpoint, the web search function doesn't work that way, and neither does image recognition.

      Maybe that's different now, or maybe that's because I was on the light plan, but that was my experience.

      Claude Code Router allowed me to Frankenstein this, so that it was using Gemini for search and vision instead of GLM. Except it turns out that Gemini also sucks at search for some reason, so I ended up just making my own proxy which uses actual Google instead.

      But yeah, at some point I realized the Rube Goldberg machine was giving me more headaches than it solved. (It was also way slower than the real thing.) So I paid the additional $18 or whatever to just get rid of it.

      That being said, I did just buy the GLM year for $25 because $2/month is hard to beat. But I keep getting rate limited, so I'm not sure what to actually use it for!
