
Comment by gojomo

7 days ago

I'd expect a noticeable delay with current local LLMs - especially when visiting a site for the first time. But they could potentially memoize their heuristics for certain designs, including recognizing when a server-side redesign newly requires some "deeper thought".

But of course local GPU processing power, and optimizations for LLM-like tools, are all advancing rapidly. And these local agents could potentially even outsource tough decisions to heavier-weight remote services. Essentially, they'd maintain/reauthor your "custom extension" themselves, using other models as necessary.

And forward-thinking sites might try to make that process easier, with special APIs/docs/recipe-interchanges for all users' agents to share their progress on popular needs.

Yeah, we found even the delay of non-local LLMs to be prohibitive. We started using Claude for the "smartest" recs and for profile generation from preferences, and it was slow: on the order of a minute for a first visit, and still 20-30s on repeat visits even after storing a "profile" (essentially your notion of memoized heuristics) in local storage to come back to.
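The "profile in local storage" idea is essentially a cache keyed by site, where the slow model call runs only on a miss. A minimal sketch (names like `derive_profile_with_llm` and the JSON-file store are hypothetical stand-ins for the actual implementation and for browser local storage):

```python
import json
import time
from pathlib import Path
from typing import Optional

# Stand-in for browser local storage.
CACHE_PATH = Path("profile_cache.json")
TTL_SECONDS = 7 * 24 * 3600  # re-derive weekly, e.g. to catch site redesigns


def derive_profile_with_llm(site: str) -> dict:
    # Hypothetical slow path: in the real flow this is the ~1 minute
    # frontier-model call that builds a preference profile for `site`.
    return {"site": site, "prefs": []}


def load_profile(site: str) -> Optional[dict]:
    """Return the memoized profile for `site`, or None if absent or stale."""
    if not CACHE_PATH.exists():
        return None
    entry = json.loads(CACHE_PATH.read_text()).get(site)
    if entry is None or time.time() - entry["saved_at"] > TTL_SECONDS:
        return None
    return entry["profile"]


def save_profile(site: str, profile: dict) -> None:
    cache = json.loads(CACHE_PATH.read_text()) if CACHE_PATH.exists() else {}
    cache[site] = {"profile": profile, "saved_at": time.time()}
    CACHE_PATH.write_text(json.dumps(cache))


def get_recommendations(site: str) -> dict:
    """Fast path on repeat visits; falls back to the slow model call."""
    profile = load_profile(site)
    if profile is None:
        profile = derive_profile_with_llm(site)  # slow first-visit path
        save_profile(site, profile)
    return profile
```

Even with this, the repeat-visit path was still 20-30s for us because the model is re-consulted with the cached profile rather than skipped entirely.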

We ended up finding a middle ground between that and simonw's no-AI-but-fast approach: using Flash for fast semantic understanding of preferences and recs, at degraded quality compared with a frontier model.

> And forward-thinking sites might try to make that process easier, with special APIs/docs/recipe-interchanges for all users' agents to share their progress on popular needs.

HN is that! Our exploration was made 1000% easier because they have an API which is "good enough" for most information.
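For anyone curious, the API in question is HN's public Firebase-backed one; a minimal sketch of pulling front-page items (the endpoint paths are the documented ones, the helper names are my own):

```python
import json
from urllib.request import urlopen

# Official Hacker News API base (Firebase-hosted, no auth required).
API_BASE = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id: int) -> str:
    """URL for a single story/comment/poll by numeric id."""
    return f"{API_BASE}/item/{item_id}.json"


def fetch_json(url: str):
    with urlopen(url) as resp:
        return json.load(resp)


def top_story_titles(n: int = 5) -> list:
    """Titles of the current top-n front-page stories."""
    ids = fetch_json(f"{API_BASE}/topstories.json")[:n]
    return [fetch_json(item_url(i)).get("title") for i in ids]
```

It exposes items, users, and the various story lists, which covered most of what our agent needed without any scraping.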