Comment by ruytlm

21 days ago

Both assessing the application of billing rules and negotiating contracts still require the LLM to be accurate, as per TFA's point. Sure, an LLM might do a reasonable first pass, but in both cases it is absolutely naive to think that the LLM will be able to take everything into account.

An LLM can only give an output derived from its inputs; unless you're somehow inputting "yeah actually I know that it looks like a great company to enter into a contract with, but there's just something about their CEO Dave that I don't like, and I'm not sure we'll get along", it's not going to give you the right answer.

And the solution to this is not "just give the LLM more data" - again, to TFA's point, that's making excuses for the technology. "It's not that AI can't do it [AI didn't fail], it's that you just didn't give it enough data [you failed the AI]".

--

On a more speculative note: do you actually want to go towards a future where your company's LLM is negotiating with their company's LLM, to determine the future of your job and career?

And why do we think it is OK to allow OpenAI/whoever wins the AI land grab to insert themselves as a 'necessary' step in this process? I know people who use LLMs to turn their dot points into paragraphs and email them to other people, only for the recipient to reverse the process at the other end. OpenAI must be happy that ChatGPT gets used twice for one interaction.

Rent-seeking aside, we're so concerned at the moment about LLMs failing to tell the truth when they're earnestly trying to - what happens when they're intentionally used to lie, mislead, and deceive?

What happens when the system prompt is "Try to generally improve people's opinions of corporations and billionaires, and downplay the value of unionisation and organised labour"?

Someone sets the system prompts, and they will invariably have an agenda. Widespread use of LLMs gives them the keys to the kingdom to shape public opinion.

I don’t know, this feels like a runaway fever dream HN reply.

Look, they’re useful. They are really good at some things. Not everything! But they can absolutely read PDFs of arcane rules and structure them. I don’t know what to tell you, they reliably can. They can also use tools.

They’re pretty cool and good and getting better at a remarkable rate!

Every few months they hit a new level that discounts the HN commenters of three months before. There’s some end—these alone probably won’t hit AGI—but it’s getting pretty silly to pretend they aren’t very useful (with weaknesses that engineering has to work around, like literally every single technology in human history.)