Comment by solarkraft

19 hours ago

I somewhat disagree that this is due diligence. Claude Code abstracts the API, so it should abstract this behavior as well, or educate the user about it.

8 comments

mpyne 18 hours ago

> Claude Code abstracts the API, so it should abstract this behavior as well, or educate the user about it.

Does mmap(2) educate the developer on how disk I/O works?

At some point you have to know something about the technology you're using, or accept that you're a consumer of the ever-shifting general best practice, shifting with it as the best practice shifts.

websap 15 hours ago
Does using print() in Python mean I need to understand the kernel? This is an absurd thought.
- Nevermark 8 hours ago
  
  That might be an absurd comparison, but we can fix that.
  If you were being charged per character, or running down character limits, and printing on printers that were shared and had economic costs for stalled and started print runs, then:
  You wouldn’t “need” to understand. The prints would complete regardless. But you might want to. Personal preference.
  Which is true of this issue too.
  
- redsocksfan45 6 hours ago
  
  [dead]
zem 17 hours ago
mmap(2) and all its underlying machinery are open source and well documented besides.
- mpyne 17 hours ago
  
  There are open-source and even open-weight models that operate in exactly this way (as it's based on years of public research), and even if there weren't, the way that LLMs generate responses to inputs is superbly documented.
  Seems like every month someone writes up a brilliant article on how to build an LLM from scratch or similar that hits the HN page, usually with fancy animated blocks and everything.
  It's not at all hard to find documentation on this topic. It could be made more prominent in the UI, but that's true of lots of things. Hammering on "AI 101" topics would clutter the UI and crowd out the actual decision points the user may want to act on, which you can't assume the user already knows about in the way you should be able to assume they know how LLMs eat up tokens in the first place.

computably 12 hours ago

I would say this is abstracting the behavior.