Comment by mdp2021

1 day ago

It is suggested that this is, again, an effect of training towards "sounding good" as opposed to alethics (truthfulness).

The results in an image: https://dl.acm.org/cms/10.1145/3706598.3713470/asset/95dbaf8...

--

From the actual paper ( https://dl.acm.org/doi/10.1145/3706598.3713470 ):

> We used ChatGPT-4o to generate the LLM-generated prompts, while UK-based lawyers generated the lawyer-generated advice

It would have been nice to also have lawyers assess the LLM output...