Comment by whatevertrevor

2 days ago

I agree with that.

I think part of the issue with the lack of "real" quantification in LLM results is that the output and problem domain are so ill-defined. With standard neural nets (and other kinds of ML), classifiers, regression models, and reinforcement learning models all solved very narrow, domain-specific problems. It was a no-brainer to measure directly how your vision classifier performed against a radiologist at determining whether an image showed lung cancer.
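
To make the contrast concrete, here's a toy sketch of the kind of one-number scoring that narrow setting allows (the labels and predictions are invented for illustration); there's no analogous single number for "wrote a good essay":

```python
# Toy example: scoring a binary lung-cancer classifier against
# radiologist labels. All data below is made up for illustration.

# 1 = radiologist says "lung cancer", 0 = "no lung cancer"
radiologist_labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
# The model's predictions on the same scans
model_predictions  = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

pairs = list(zip(radiologist_labels, model_predictions))
tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # cancers caught
tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # healthy scans cleared
fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false alarms
fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # cancers missed

accuracy    = (tp + tn) / len(pairs)
sensitivity = tp / (tp + fn)  # fraction of true cancers the model catches
specificity = tn / (tn + fp)  # fraction of healthy scans it correctly clears

print(f"accuracy={accuracy:.2f} "
      f"sensitivity={sensitivity:.2f} "
      f"specificity={specificity:.2f}")
```

The whole evaluation reduces to counting agreements with a ground-truth label, which is exactly what open-ended generation lacks.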

Now we've opened the output up to a wide variety of open-ended domains: natural language, programming languages, images, and video. Since the output domain is inherently subjective, it's hard to get a good handle on these models' usefulness, let alone to get people to agree on it. Hence the never-ending discourse around them.