Comment by vlade11115
9 hours ago
I love the site design.
> There's an obvious question looming here — if the models got so confused, how did they consistently pass the reconciliation checks we described above? It may seem like the ability to make forward progress is a good proxy for task understanding and skill, but this isn't necessarily the case. There are ways to hack the validation check – inventing false transactions or pulling in unrelated ones to make the numbers add up.
This is hilarious. I wonder if someone is unintentionally committing fraud by blindly trusting LLMs with accounting. Or even worse, I bet that some governments are already trying to use LLMs to make accounting validators. My government sure wants to shove LLMs into digital government services.
Lawyers have used it to write briefs; I would be very surprised if someone, somewhere wasn't slowly running a company into the ground by using ChatGPT or another LLM for accounting.
Imagine the fallout from books cooked by an LLM hallucinating revenue.
[about the website design] As a bonus for my fellow privacy schizos, the page works fine with 3rd-party frames and 3rd-party scripts disabled in uBlock, and still looks very good with no remote fonts and no large media. Quite an accomplishment for such a cool-looking page.
I'm sure that any accounting trick that an LLM can think of is something that is also used by some shady human accountants. The proper response should not be to avoid/prohibit AI but to improve the validation mechanisms.
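To make "improve the validation mechanisms" concrete, here is a rough sketch of why a totals-only reconciliation check is easy to game, and what a stricter check might look like. Everything in it is made up for illustration: the `Txn` type, the function names `totals_match` and `unsupported_entries`, and the sample data are hypothetical and not taken from the article or any real accounting system.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction record: an identifier and an amount."""
    txn_id: str
    amount: Decimal


def totals_match(ledger, expected_total):
    """Weak check: only verifies that the ledger sums to the expected total."""
    return sum((t.amount for t in ledger), Decimal("0")) == expected_total


def unsupported_entries(ledger, source_records):
    """Stricter check: return ledger entries with no matching source record."""
    known = set(source_records)
    return [t for t in ledger if t not in known]


# Assumed sample data: the bank only knows about two transactions.
bank_records = [
    Txn("A-1", Decimal("100.00")),
    Txn("B-1", Decimal("250.00")),
]

# The ledger includes an invented entry that makes the numbers add up.
ledger = bank_records + [Txn("X-1", Decimal("50.00"))]

print(totals_match(ledger, Decimal("400.00")))    # weak check passes
print(unsupported_entries(ledger, bank_records))  # invented entry is flagged
```

The point is only that a check on the final number says nothing about where each entry came from; tying every ledger line back to a source record is one possible way to close that gap, not necessarily how any real validator works.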
Counterpoint: if you detect a human accountant doing this, you can take action against the human. A computer will never meaningfully take the blame, and unfortunately blaming the computer usually means no human gets blamed either.
> you can take action against the human
I think that will depend on the case. I don't have any recent examples, but I recall someone trying to sue one of those strip-mall tax preparation franchises over incorrect filings. My understanding is that the documents you sign when you enroll in those services are written heavily in the company's favor. I doubt you could ever go after the specific "human" who made the error, even if it was done maliciously.
In the same way, if you pay for a tax service that uses AI agents, what you can and cannot "take action" for will probably be outlined in the terms of service that you accept when you sign up.
I would guess millions of people already use software-based tax filing services (e.g. TurboTax) where no human at all is in the loop. I don't understand how swapping in an LLM significantly changes the liability in those cases. The contract will be between you and the entity (probably a corporation), not you and "computers".
Worth stating I am NOT a lawyer.
But still - if there's a way to detect accountants doing it - let's focus on making that detection even easier.
On a related note, could we use something like a GAN here, with auditor AIs trained against accountant AIs?
The person using the tool is the accountant, regardless of whether the tool is a calculator and sheet of paper, QuickBooks, or an LLM.
No, I think in this particular case the proper response is for honest companies to avoid any systems which invent nonexistent transactions to reconcile books.
Most businesses don’t want to misrepresent their books, irrespective of the existence of shady accountants.