
Comment by visarga

10 hours ago

The study measures whether participants learn the library, but what it should measure is whether they learn effective coding-agent patterns for using the library well. Learning the library itself is not going to be what we need in the future.

> "We collect self-reported familiarity with AI coding tools, but we do not actually measure differences in prompting techniques."

Many people drive cars without being able to explain how cars work. Or use devices like that. Or interact with people whose thinking they can't explain. Society works like that: it is functional, but it does not run on full understanding. We need to develop the functional part, not the full-understanding part. We can write C without knowing the machine code.

You can often recognize a wrong note without being able to play the piece, spot a logical fallacy without being able to construct the valid argument yourself, and catch a translation error with far less fluency than producing the translation would require. We need discriminative competence, not generative competence.

For years I maintained a library for formatting dates and numbers (prices, ints, IDs, phones). It was a pile of regexes, but I maintained hundreds of test cases for each type of parsing. As new edge cases appeared, I added them to my tests and iterated to keep the score high. I don't fully understand my own library; it emerged by scar accumulation. Yes, I can explain any line, but why these regexes in this order is a data-dependent explanation I no longer have. All my edits run in a loop with the tests, and my PRs go out only when the score is good.

Correctness was never grounded in understanding the implementation. Correctness was grounded in the test suite.
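
Something in the spirit of that loop, as an invented sketch (the regexes, their ordering, and the cases below are made up, not the actual library): the patterns are only "correct" in the sense that the accumulated case table says so, and an edit ships when the score stays high.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Invented example: a regex-based price normalizer whose only ground truth
// is the table of accumulated edge cases below.
public class PriceFormatterHarness {

    // Patterns accreted over time; their shape reflects past failures, not a design.
    private static final Pattern LOOSE_PRICE =
            Pattern.compile("\\$?\\s*(\\d[\\d,]*)(?:\\.(\\d{1,2}))?");
    private static final Pattern THOUSANDS =
            Pattern.compile("(\\d)(?=(\\d{3})+(?!\\d))");

    static String normalizePrice(String raw) {
        Matcher m = LOOSE_PRICE.matcher(raw.trim());
        if (!m.matches()) return raw;                      // give up, leave input untouched
        String whole = m.group(1).replace(",", "");
        String cents = m.group(2) == null ? "00" : (m.group(2) + "0").substring(0, 2);
        return "$" + THOUSANDS.matcher(whole).replaceAll("$1,") + "." + cents;
    }

    public static void main(String[] args) {
        // The real safety net: every past bug report becomes a row here.
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("1200", "$1,200.00");
        cases.put("$1,200.5", "$1,200.50");
        cases.put(" 99.99 ", "$99.99");
        cases.put("12345678", "$12,345,678.00");

        int passed = 0;
        for (Map.Entry<String, String> c : cases.entrySet()) {
            String got = normalizePrice(c.getKey());
            if (c.getValue().equals(got)) passed++;
            else System.out.println("FAIL " + c.getKey() + " -> " + got + " (want " + c.getValue() + ")");
        }
        System.out.println(passed + "/" + cases.size() + " cases pass");
    }
}
```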

> Many people drive cars without being able to explain how cars work.

But on the fundamentals, all cars behave the same way all the time. Imagine running a courier company where the vehicles sometimes take a random left turn.

> Or interact with people whose thinking they can't explain

Sure, but they trust those service providers because they are reliable. And the reason they are reliable is that the service providers can explain their own thinking to themselves. Otherwise their business would be chaos and nobody would trust them.

How you approached your library was practical given the use case. But can you imagine writing a compiler like this? Or an industrial automation system? Not only would it be unreliable, it would also be extremely slow to build. It's much faster to deal with something that has a consistent model that attempts to distill the essence of the problem, rather than patching hack onto hack in response to failed test after failed test.

You can, most certainly, drive a car without understanding how it works. A pilot of an aircraft, on the other hand, needs a fairly detailed understanding of its subsystems in order to fly it effectively.

I think being a programmer is closer to being an aircraft pilot than a car driver.

  • Sure, if you are a pilot, that makes sense. But what if you are a company that uses planes to deliver goods? That's when the focus shifts from the thing itself to its output.

Interesting argument.

But isn't it the correction of those errors that is valuable to society and gets us a job?

People can tell when they have found a bug, or describe what they want from a piece of software, yet it takes skill to fix the bugs and to build the software. Though LLMs can speed up the process, expert human judgment is still required.

  • I think there are different levels to look at this.

    If you know that you need O(n) "contains" checks and O(1) retrieval for items, for a given order of magnitude, it feels like you have all the pieces of the puzzle needed to keep the LLM on the straight and narrow, even if you didn't know off the top of your head that you should choose ArrayList.

    Or if you know that string manipulation might be memory-intensive, so you write automated tests around it for your order of magnitude, it probably doesn't matter much that you didn't know to choose StringBuilder (the sketch below shows what such a test might look like).

    That feels different to e.g. not knowing the difference between an array list and linked list (or the concept of time/space complexity) in the first place.
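
    A hedged sketch of such a guardrail test (buildIndex, renderReport, the sizes, and the time budgets are all invented for illustration): it pins the order of magnitude and a budget rather than naming a data structure, and a LinkedList or String-concatenation choice at this size would blow it.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ThreadLocalRandom;

    // Hypothetical guardrail: asserts the behaviour we care about at our order of
    // magnitude without dictating ArrayList or StringBuilder in the test itself.
    public class OrderOfMagnitudeGuardrail {

        // Stand-ins for "whatever the agent wrote"; these bodies are one plausible choice.
        static List<String> buildIndex(int n) {
            List<String> items = new ArrayList<>();        // a LinkedList here would fail below
            for (int i = 0; i < n; i++) items.add("item-" + i);
            return items;
        }

        static String renderReport(List<String> items) {
            StringBuilder sb = new StringBuilder();        // "s += item" here would fail below
            for (String item : items) sb.append(item).append('\n');
            return sb.toString();
        }

        public static void main(String[] args) {
            int n = 100_000;                               // the order of magnitude we care about
            List<String> items = buildIndex(n);

            // Requirement 1: cheap random retrieval at this size.
            long t0 = System.nanoTime();
            long checksum = 0;
            for (int i = 0; i < n; i++) {
                checksum += items.get(ThreadLocalRandom.current().nextInt(n)).length();
            }
            long retrievalMs = (System.nanoTime() - t0) / 1_000_000;

            // Requirement 2: report assembly should stay roughly linear.
            long t1 = System.nanoTime();
            int reportLength = renderReport(items).length();
            long renderMs = (System.nanoTime() - t1) / 1_000_000;

            // Crude, machine-dependent budgets; a quadratic choice misses them by orders of magnitude.
            if (retrievalMs > 200 || renderMs > 500) {
                throw new AssertionError("too slow: get=" + retrievalMs + "ms, render=" + renderMs + "ms");
            }
            System.out.println("ok: " + retrievalMs + "ms / " + renderMs + "ms, "
                    + reportLength + " chars, checksum " + checksum);
        }
    }
    ```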

    • My gut feeling is that, without wrestling with data structures at least once (e.g. during a course), that knowledge about complexity will be cargo cult.

      When it comes to fundamentals, I think it's still worth the investment.

      To paraphrase, "months of prompting can save weeks of learning".

  • I think the kind of judgement required here is to design ways to test the code without inspecting it manually line by line; that would be walking a motorcycle, and you would only be vibe-testing. That is why we have seen the FastRender browser and the JustHTML parser: the testing part was solved up front, so the AI could go nuts implementing.
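
    As an invented illustration of "the testing part solved up front" (the Parser interface and every expectation below are made up for this sketch, not taken from JustHTML or any real project): the spec is written as data before any implementation exists, and whatever the agent produces is judged only against it.

    ```java
    import java.util.Map;

    // Hypothetical conformance harness written before the implementation.
    public class ConformanceHarness {

        /** Whatever the agent implements must satisfy this invented contract. */
        interface Parser {
            String visibleText(String html);
        }

        // The spec, expressed as data: input -> expected visible text.
        static final Map<String, String> CASES = Map.of(
                "<p>hello</p>", "hello",
                "<p>a<b>b</b>c</p>", "abc",
                "<p>unclosed <i>italic</p>", "unclosed italic",   // error recovery
                "<script>ignored()</script>hi", "hi",              // script text is not visible
                "&amp;&lt;&gt;", "&<>"                             // entity decoding
        );

        static int run(Parser parser) {
            int failures = 0;
            for (Map.Entry<String, String> c : CASES.entrySet()) {
                String got = parser.visibleText(c.getKey());
                if (!c.getValue().equals(got)) {
                    failures++;
                    System.out.println("FAIL " + c.getKey() + " -> " + got + " (want " + c.getValue() + ")");
                }
            }
            return failures;
        }

        public static void main(String[] args) {
            // Naive tag-stripping baseline, only to show the harness running;
            // producing a Parser that passes every case is the agent's job.
            Parser naive = html -> html.replaceAll("<[^>]*>", "");
            System.out.println(run(naive) + " failing cases for the naive baseline");
        }
    }
    ```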

    • I partially agree, but I don’t think “design ways to test the code without inspecting it manually line by line” is a good strategy.

      Tests only cover cases you already know to look for. In my experience, many important edge cases are discovered by reading the implementation and noticing hidden assumptions or unintended interactions.

      When something goes wrong, understanding why almost always requires looking at the code, and that understanding is what informs better tests.
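
      A small example of the kind of hidden assumption meant here (the class is hypothetical, but the lenient-by-default behaviour of java.text.SimpleDateFormat is real): you would probably only think to test an impossible date after reading the code and noticing which parser it relies on.

      ```java
      import java.text.ParseException;
      import java.text.SimpleDateFormat;

      // Hypothetical example: SimpleDateFormat is lenient by default, so an
      // impossible date silently rolls over instead of failing. A black-box test
      // suite written without reading the code might never include this case.
      public class HiddenAssumptionExample {
          public static void main(String[] args) throws ParseException {
              SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");

              // Looks validated, but leniency turns Feb 31 into Mar 2.
              System.out.println(fmt.format(fmt.parse("2024-02-31")));   // 2024-03-02

              // The better test (and the fix) follow from noticing the assumption:
              fmt.setLenient(false);
              try {
                  fmt.parse("2024-02-31");
              } catch (ParseException expected) {
                  System.out.println("strict parser rejects impossible dates");
              }
          }
      }
      ```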
