Comment by 59nadir

3 days ago

It's possible kiitos has (or had?) a higher standard in mind for what should constitute a senior/"lead engineer" at Cloudflare and how much they should be constrained by typing as part of implementation.

Out of interest: How much did the entire process take and how much would you estimate it to take without the LLM in the loop?

kentonv 3 days ago

> It's possible kiitos has (or had?) a higher standard in mind for what should constitute a senior/"lead engineer" at Cloudflare and how much they should be constrained by typing as part of implementation.

See again here, you're implying that I or my code is disappointing somehow, but with no explanation for how except that it was LLM-assisted. I assert that the code is basically as good as if I'd written it by hand, and if you think I'm just not a competent engineer, like, feel free to Google me.

It's not the typing itself that constrains, it's the detailed but non-essential decision-making. Every line of code requires making several decisions, like naming variables, deciding basic structure, etc. Many of these fine-grained decisions are obvious or don't matter, but it's still mentally taxing, which is why nobody can write code as fast as they can type even when the code is straightforward. LLMs can basically fill in a bunch of those details for you, and reviewing the decisions -- especially the fine-grained ones that don't matter -- is a lot faster than making them.

> How much did the entire process take and how much would you estimate it to take without the LLM in the loop?

I spent about five days mostly focused on prompting the LLM (although I always have many things interrupting me throughout the day, so I wasn't 100% focused). I estimate it would have taken me 2x-5x as long to do by hand, but it's of course hard to say for sure.

59nadir 2 days ago

> See again here, you're implying that I or my code is disappointing somehow, but with no explanation for how except that it was LLM-assisted. I assert that the code is basically as good as if I'd written it by hand, and if you think I'm just not a competent engineer, like, feel free to Google me.

I think you're reading a bit too deeply into what I wrote; I explained what I interpreted kiitos' posts as essentially saying. I realize that you've probably had to deal with a lot of people being skeptical to the point of "throwing shade", as it were, so I understand the defensive posture. I am skeptical, but the reason I'm asking questions (alongside the previous bit) is because I'm actually curious about your experiment.

> It's not the typing itself that constrains, it's the detailed but non-essential decision-making. Every line of code requires making several decisions, like naming variables, deciding basic structure, etc. Many of these fine-grained decisions are obvious or don't matter, but it's still mentally taxing, which is why nobody can write code as fast as they can type even when the code is straightforward. LLMs can basically fill in a bunch of those details for you, and reviewing the decisions -- especially the fine-grained ones that don't matter -- is a lot faster than making them.

In your estimation, what is your mental code coverage of the code you ended up with? Do you feel like you have a complete mapping of it, i.e. you could get an external request for change and map it quickly to where it needs to be made and why exactly there?

- kentonv 2 days ago

  > In your estimation, what is your mental code coverage of the code you ended up with? Do you feel like you have a complete mapping of it, i.e. you could get an external request for change and map it quickly to where it needs to be made and why exactly there?

  I know the code structure about as well as if I had written it.

  Honestly the code structure is not very complicated. It flows pretty naturally from the interface spec in the readme, and I'd expect anyone who knows OAuth could find their way around pretty easily.

  But yes, as part of prompting improvements to the code, I had to fully understand the implementation. My prompts are entirely based on reading the code and deciding what needed to be changed -- not based on any sort of black-box testing of the code (which would be "vibe coding").