Comment by 827a
1 day ago
I wonder if the next phase for leveraging LLMs against large sets of contextual, proprietary data (code repositories and knowledge bases come to mind) is going to look more like smaller models heavily (and regularly) trained/fine-tuned against that proprietary data, which are maybe delegated tasks by the ultra-sized, internet-scale omni-brain models.
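(For concreteness, a minimal sketch of what "regularly fine-tuned against that proprietary data" could look like, assuming a Hugging Face PEFT/LoRA workflow; the base model name, corpus path, and hyperparameters are illustrative placeholders, not anything the comment specifies.)

```python
# Illustrative sketch only: periodically re-fitting low-rank adapters on a
# company's own corpus (code, docs, tickets), so the run is cheap enough to
# repeat on every snapshot. Model name and data path are placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "Qwen/Qwen2.5-Coder-1.5B"  # small open-weights code model (placeholder)
tok = AutoTokenizer.from_pretrained(base)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA adapters: only a small fraction of weights is trained per refresh.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

# Proprietary corpus: one text record per file / commit / ticket (placeholder path).
ds = load_dataset("json", data_files="company_corpus.jsonl", split="train")
ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=2048),
            remove_columns=ds.column_names)

Trainer(model=model,
        args=TrainingArguments("adapter-out",
                               per_device_train_batch_size=1,
                               num_train_epochs=1,
                               learning_rate=2e-4),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False)).train()
```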
If I'm asking Sonnet to agentically make this sign-in button green, does it really matter that it can also write haikus about the Japanese landscape? That links back to your point: we don't have a grip, nearly at all, on how much this crosstalk between problem domains matters. Maybe it actually does matter? But certainly most of it doesn't.
We're so far from the endgame on these technologies. A part of me really feels like we're wasting too much effort and money on training ASI-grade, ultra-internet-scale models. I'm never going to pay $200+/mo for even a much smarter Claude; what I need is a system that knows my company's code like the back of its hand, knows my company's patterns, technologies, and even business (Jira boards, Google Docs, etc.), and extrapolates from that. That would be worth thousands a month; but what I'm describing isn't going to be solved by a 195-IQ gigabrain, and it also doesn't feel like we're going to get there with context engineering.