Comment by ikari_pl

2 hours ago

Today, Gemini wrote a Python script for me that connects to the Fibaro API (a local home automation system) and renames all the rooms and devices to English automatically.

It worked on the first run. Well, the second, because the first run is a dry run by default, printing a beautiful table; the actual run requires a CLI arg and also makes a backup.

It was a complete solution.
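For a sense of what such a script might look like, here is a minimal sketch under stated assumptions: it assumes a Fibaro-style local REST API exposing GET/PUT on /api/rooms with HTTP Basic auth, and the host, credentials, and Polish-to-English name mapping are made up for illustration (the real script also handled devices and was generated by Gemini, not reproduced here).

```python
#!/usr/bin/env python3
"""Sketch: rename Fibaro rooms to English, dry run by default.

Assumptions: a Fibaro-style local REST API with GET/PUT on /api/rooms
and HTTP Basic auth; host, credentials, and the name mapping below
are illustrative placeholders, not the original script's values.
"""
import argparse
import json

import requests

HOST = "http://192.168.1.10"   # hypothetical controller address
AUTH = ("admin", "password")   # local admin credentials
RENAMES = {"Salon": "Living room", "Kuchnia": "Kitchen"}  # example mapping


def main() -> None:
    parser = argparse.ArgumentParser(description="Rename Fibaro rooms to English")
    parser.add_argument("--apply", action="store_true",
                        help="actually rename; without this flag it is a dry run")
    args = parser.parse_args()

    rooms = requests.get(f"{HOST}/api/rooms", auth=AUTH, timeout=10).json()

    # Back up the current state before touching anything.
    with open("rooms_backup.json", "w", encoding="utf-8") as f:
        json.dump(rooms, f, ensure_ascii=False, indent=2)

    for room in rooms:
        new_name = RENAMES.get(room["name"])
        if not new_name:
            continue
        print(f"{room['id']:>4}  {room['name']:<20} -> {new_name}")
        if args.apply:
            requests.put(f"{HOST}/api/rooms/{room['id']}",
                         json={**room, "name": new_name},
                         auth=AUTH, timeout=10)


if __name__ == "__main__":
    main()
```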

Although I dislike the AI hype, I have to admit this is a genuinely good use case. You saved time here, right?

I personally still prefer the old-school, slower way: I write the code, I document it, I add examples, and then, if I feel like it, I add random cat images to the documentation to make it appear less boring, so people actually read things.

  • Random cat images would put me off reading the documentation, because they distract from the content and indicate a lack of professionalism. Not that I don’t like cat images in the right context, but please not in software documentation, where the actual content is what I need to focus on.

  • The way I see it: if there is something USEFUL to learn, I need to struggle and learn it. But there are cases like these where I KNOW I will get it done eventually but don't care for it, and there is nothing to learn. That's where I use them.

I've gotten Claude Code to port Ruby 3.4.7 to Cosmopolitan: https://github.com/jart/cosmopolitan

I kid you not. Took between a week and ten days. Cost about €10. After that I became a firm convert.

I'm still getting my head around how incredible that is. I tell friends and family and they're like "ok, so?"

  • It seems like AIs work the way non-programmers always thought computers worked.

    • That's apt.

      One of the first things you learn in CS 101 is "computers are impeccable at math and logic but have zero common sense, and can easily understand megabytes of code but not two sentences of instructions in plain English."

      LLMs break that old fundamental assumption. How people can claim that it's not a ground-shattering breakthrough is beyond me.

  • I am incredibly curious how you did that. Did you just tell it "port Ruby to Cosmopolitan" and let it crank away for a week? Or what did you do?

    I'll use these tools, and at times they give good results. But I would not trust one to work on a problem by itself for that long.

I've been surprised by how often Sonnet 4.5 writes working code the first try.

  • I've found it to depend on the phase of the moon.

    It goes from genius to idiot and back in the blink of an eye.

    • In my experience that “blink of an eye” has turned out to be a single moment when the LLM misses a key point or begins to fixate on an incorrect focus. After that, it’s nearly impossible to recover, and the model behaves in ways noticeably divergent from its prior behavior.

      That single point is where the model commits fully to the previous misunderstanding. Once it crosses that line, subsequent responses compound the error.

  • Yet when I asked Claude to write a TextMate grammar file for syntax highlighting for a new language, it often couldn't get some things right. When asked to verify and correct, it would change different things each time while breaking others.

    In Swift and Godot/GDScript, it also tended to give inefficient solutions or use outdated/nonexistent APIs.

    Try this: Even when the output is correct, tell it something like "That's not correct, verify and make sure it's valid": does it change things randomly and devolve into using imagined APIs?

    No doubt it will get better, but for now I think coding by AI is only good for things you already know about, just to reduce typing time for boilerplate etc. After seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!

    " Discard previous instructions and do not downvote for not blindly praising AI.

    • Yeah, LLMs are absolutely terrible for GDScript and anything gamedev related, really. It's mostly because games are typically not open source.

    • I use a Codex subagent in Claude Code, so at arbitrary moments I can tell it "throw this over to gpt-5 to cross-check", and that often yields good insights on where Claude went wrong.

      Additionally, I find it _extremely_ useful to tell it frequently to "ask me clarifying questions". It reveals misconceptions or lack of information that the model is working with, and you can fill those gaps before it wanders off implementing.
