Comment by HPsquared

1 year ago

Using the tool in this way is a bit like mining: repeatedly hacking away with a blunt instrument (simple prompt) looking for diamonds (100x speedup out of nowhere). Probably a lot of work will be done in this semi-skilled brute-force sort of way.

It looks to me exactly like a typical coding interview: the first attempt is correct and works, and then the interviewer keeps asking if you can spot any ways to make it better/faster/more efficient.

If I were a CS student cramming for interviews, I might be dismayed to see that my entire value proposition has been completely automated before I even enter the market.

There would need to be a feedback-request mechanism asking "Is this better?" That is doable with RLHF or DPO.
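One way that "Is this better?" signal could feed DPO is by converting benchmark comparisons into preference pairs, with the faster correct solution marked as preferred. A minimal sketch (the record layout and the example runtimes are hypothetical, not from any particular training library):

```python
def make_preference_pair(prompt, solution_a, time_a, solution_b, time_b):
    """Turn two benchmarked solutions into a DPO-style preference record.

    Both solutions are assumed to have already passed correctness tests;
    the only preference signal here is measured runtime.
    """
    if time_a < time_b:
        chosen, rejected = solution_a, solution_b
    else:
        chosen, rejected = solution_b, solution_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

# Illustrative runtimes (not real measurements): the closed-form version
# of "sum of squares below n" beats the naive loop, so it becomes "chosen".
pair = make_preference_pair(
    "Sum the squares below n.",
    "sum(i*i for i in range(n))", 0.120,
    "(n-1)*n*(2*n-1)//6", 0.001,
)
```

A dataset of such pairs is exactly the input shape DPO-style trainers expect, so the benchmark harness itself becomes the labeler.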

Once you can basically have it run and benchmark the code, and then iterate that overnight, it’s going to be interesting.

Automating the feedback loop is key.
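The automated loop described above can be sketched in a few lines. Here the LLM is stubbed out with a fixed list of candidate implementations (a real loop would prompt the model for a new attempt each iteration); each candidate is gated on correctness first, then benchmarked, and only the fastest survivor is kept:

```python
import timeit

# Hypothetical stand-ins for LLM-generated attempts at "sum of squares below n".
CANDIDATES = [
    ("naive", lambda n: sum(i * i for i in range(n))),
    ("closed_form", lambda n: (n - 1) * n * (2 * n - 1) // 6),
]

def is_correct(fn, reference=CANDIDATES[0][1], n=1000):
    """Gate on correctness before comparing speed."""
    return fn(n) == reference(n)

def best_candidate(n=10_000, repeats=5):
    """Benchmark every correct candidate and keep the fastest."""
    best_name, best_time = None, float("inf")
    for name, fn in CANDIDATES:
        if not is_correct(fn):
            continue  # an incorrect speedup is worthless
        t = min(timeit.repeat(lambda: fn(n), number=100, repeat=repeats))
        if t < best_time:
            best_name, best_time = name, t
    return best_name, best_time

if __name__ == "__main__":
    name, t = best_candidate()
    print(f"kept {name!r}")
```

Running this overnight against model-generated candidates is the "mining" loop from the top comment, just with the hacking automated.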

  • Wouldn't there be some safety concerns about letting the AI run overnight with access to run any command?

    Maybe it could run sandboxed, with no internet access (though if the LLM is not local, it does require internet access).
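A partial mitigation for the sandboxing concern is to run each generated program in a subprocess with hard resource caps. This sketch (POSIX-only, since it uses `resource` and `preexec_fn`) bounds CPU time only; real isolation would also need filesystem and network restrictions, e.g. containers or seccomp:

```python
import resource
import subprocess
import sys

def run_limited(code, cpu_seconds=2):
    """Run untrusted Python code in a subprocess with a hard CPU-time cap.

    This is NOT a full sandbox: it limits CPU only, not filesystem or
    network access. POSIX-only (resource limits via preexec_fn).
    """
    def limit():
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        preexec_fn=limit,
        timeout=cpu_seconds + 1,  # wall-clock backstop on top of the CPU cap
    )

result = run_limited("print(sum(range(10)))")
```

An infinite loop in a generated candidate then kills only that subprocess instead of hanging the overnight run.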

Well, in this case it's kind of similar to how people write code: a loop of writing something, reviewing/testing, and improving until we're happy enough.

Sure, you'll get better results with an LLM when you're more specific, but what's the point then? I don't need AI when I already know what changes to make.