Comment by nerdponx
2 years ago
Even though this might be entirely hallucinated, I remain consistently amazed that stacking transformers and training on a massive text corpus is sufficient to get something resembling a "language model" that can both understand and generate instructions like this:
You have the tool 'browser' with these functions:
'search(query: str, recency_days: int)' Issues a query to a search engine and displays the results.
'click(id: str)' Opens the webpage with the given id, displaying it. The ID within the displayed results maps to a URL.
'back()' Returns to the previous page and displays it.
'scroll(amt: int)' Scrolls up or down in the open webpage by the given amount.
'open_url(url: str)' Opens the given URL and displays it.
'quote_lines(start: int, end: int)' Stores a text span from an open webpage. Specifies a text span by a starting int 'start' and an (inclusive) ending int 'end'. To quote a single line, use 'start' = 'end'.
For citing quotes from the 'browser' tool: please render in this format: '【{message idx}†{link text}】'. For long citations: please render in this format: '[link text](message idx)'. Otherwise do not render links.
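For concreteness, here's a rough sketch of what that tool surface looks like if you write it out as plain Python. The function names and signatures come straight from the quoted prompt; the class, the state handling, and the placeholder return values are my own invention, not how the real tool is implemented:

    # Hypothetical sketch of the quoted 'browser' tool as plain Python.
    # Function names and signatures are from the prompt above; the class,
    # state handling, and placeholder return values are invented here.
    from dataclasses import dataclass, field

    @dataclass
    class Browser:
        history: list = field(default_factory=list)
        quotes: list = field(default_factory=list)

        def search(self, query: str, recency_days: int = 0):
            """Issue a query to a search engine and display the results."""
            results = f"[results for {query!r}, last {recency_days} days]"
            self.history.append(results)
            return results

        def click(self, id: str):
            """Open the page whose result id maps to a URL."""
            page = f"[page for result {id}]"
            self.history.append(page)
            return page

        def back(self):
            """Return to the previous page and display it."""
            if len(self.history) > 1:
                self.history.pop()
            return self.history[-1] if self.history else None

        def scroll(self, amt: int):
            """Scroll up or down in the open page by the given amount."""
            return f"[scrolled by {amt}]"

        def open_url(self, url: str):
            """Open the given URL and display it."""
            page = f"[contents of {url}]"
            self.history.append(page)
            return page

        def quote_lines(self, start: int, end: int):
            """Store an inclusive line span from the open page."""
            self.quotes.append((start, end))
            return self.quotes[-1]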
Who needs an actual model of cognition when you have billions of parameters and brute force? Maybe Asimov's "positronic brain" was just a very efficient inference engine for an LLM all along.
Consider by contrast the complexity of a system like AlphaZero.
> Who needs an actual model of cognition when you have billions of parameters and brute force?
Well, that is a hypothetical model of cognition. It's not a biological model, but transformers certainly mimic observed properties of brain structures, e.g., grid cells.
They mimic by estimating the patterns of knowledge and restructuring them into something that looks right.
They don't understand anything, and calling generative models "intelligence" is disingenuous. It's pattern replication. It's like fractal artwork with a happy interface.
Not many of us touch this stuff from bare metal to final product; it's complicated, too complicated for most individuals to work on solo. I wish I cared enough to learn the math to prove out my logical understanding.
Big tech cares too much about profit to actually document what these models are or how they draw their conclusions. Until independent researchers document mathematically the probability of the next segment being part of the generated pattern, and the logical connections across known patterns that attach to the current/previous pattern, we will continue to have this generative farce propped up on a pedestal and gilded with pyrite, with profits poured into a half-baked pattern brute-forcer.
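To be fair, the "probability of the next segment" part is not a mystery; next-token probability is literally the published training objective. A toy sketch of how raw model scores become next-token probabilities (made-up numbers, not a real model):

    # Toy illustration of next-token probability: the model emits one raw
    # score (logit) per vocabulary item, and softmax turns those scores
    # into a probability distribution over the next token. Numbers are
    # made up; a real model has tens of thousands of vocabulary items.
    import math

    vocab = ["cat", "dog", "the", "ran"]
    logits = [2.0, 1.0, 0.1, -1.0]            # raw scores from the model

    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]         # softmax

    for token, p in zip(vocab, probs):
        print(f"P(next = {token!r}) = {p:.3f}")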
I beat on the tech and badmouth it, but I'm still using it to make money. :) I'll ride the bandwagon until something better comes along. It's a means to an end.
> I wish I cared enough to learn the math to prove out my logical understanding.
There's not much math to it. If you know how matrix multiplication works and a little bit about differential calculus, you're very nearly there.
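As a rough illustration of how little machinery is involved, here's a single self-attention head (the core operation of a transformer) in plain NumPy. The sizes and random weights are toys, not a trained model, but the operations are the real ones: a few matrix multiplications plus one softmax.

    # A single self-attention head in plain NumPy: three projections,
    # a scaled dot-product, a softmax, and a weighted sum, all of it
    # matrix multiplications. Weights are random toys, not a trained model.
    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_model, d_head = 5, 16, 8

    x = rng.normal(size=(seq_len, d_model))    # token embeddings
    W_q = rng.normal(size=(d_model, d_head))   # query projection
    W_k = rng.normal(size=(d_model, d_head))   # key projection
    W_v = rng.normal(size=(d_model, d_head))   # value projection

    Q, K, V = x @ W_q, x @ W_k, x @ W_v        # three matmuls
    scores = Q @ K.T / np.sqrt(d_head)         # scaled dot-product
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax
    out = weights @ V                          # weighted sum of values

    print(out.shape)                           # (5, 8): one vector per position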
> Who needs an actual model of cognition when you have billions of parameters and brute force?
I mean, did evolution do anything different?
It seems like evolution started with really efficient brains and gradually increased their complexity. We seem to be starting with really powerful brains and gradually increasing their efficiency. (I'm not an evolutionary biologist, so I could be totally wrong about this.)