Comment by theamk

6 months ago

My answer is a clear "yes" to most of those.

Yes, machine translations are AI-generated content. I read foreign-language news sites that sometimes have machine-translated articles, and the quality stands out, not in a good way.

"Maybe" for "writing on paper and using LLM for OCR". It's like automatic meeting transcript - if the speaker has perfect pronunciation, it works well. If they don't, then the meeting notes still look coherent but have little relationship to what speaker said and/or will miss critical parts. Sadly there is no way for reader to know that from reading the transcript, so I'd recommend labeling "AI edited" just in case.

Yes, even if "they give the AI a very detailed outline, constantly ask for rewrites, etc.", it's still AI generated. I am not sure how you can argue otherwise - it's not their words. Also, it's really easy to convince yourself that you are "ruthless in removing any facts they're not 100% sure" of while actually being anything but.

"What if they only use AI to fix the grammar and rewrite bad English into a proper scientific tone?" - I'd label it "AI-edited" if the rewrites are minor or "AI-generated" if the rewrites are major. This one is especially insidious as people may not expect rewrites to change meaning, so they won't inspect them too much, so it will be easier for hallucinations to slip in.

> they give the AI a very detailed outline […]

Honestly, I think that's a tough one.

(a) it "feels" like you are doing work. Without you the LLM would not even start. (b) it is very close to how texts are generated without LLMs. Be it in academia, with the PI guiding the process of grad students, or in industry, with managers asking for documentation. In both cases the superior takes (some) credit for the work that is in large parts by others.

  • I don't see anything "tough" here.

    At least in academia, if a PI takes credit for a student's work and does not list them as a co-author, it's widely considered unethical. The rule there is simple: if someone contributed to the text, they get onto the author list.

    If we had the same rule for blogs - "this post is authored by fho and ChatGPT" - then I'd be completely satisfied, as that would be sufficient AI disclosure.

    As for industry, I think the rules vary widely from place to place. In some places authorship does not even come up: the slide deck or document can contain copies from random internet sites, or from some previous version of the doc, and a reference will only be present if there is a need for it (say, to lend authority).