Comment by timschmidt

3 days ago

It seems that there is tremendous incentive for people like yourself (I see you're very active in these comments) to claim that. But I see you've presented no quantitative evidence. Given the politicization of the systems and individuals involved, without evidence, it all reads like partisan mud slinging.

Any LLM can be convinced to say just about anything. Pliny has shown that time and time again.

6 comments

timschmidt

arp242 3 days ago

Does ChatGPT start ranting about Jews and "White Genocide" unprompted? How can I even quantify that it doesn't do that?

This is a classic "anything that can't be empirically measured is invalid and can be dismissed" mistake. It would be nice if we could easily empirically measure everything, but that's not how the world works.

The ChatGPT article is of a rather different nature where ChatGPT went off the rails after a long conversation with a troubled person. That's not good, but just no the same as "start spewing racism on unrelated questions".

timschmidt 3 days ago
Friend, if you can't empirically measure the outputs of LLMs which provide lovely APIs for doing so, what are you doing?
20 lines of code and some data would really bolster your case, but I don't see them.
- amrocha 2 days ago
  
  idk friend, it seems kind of presumptuous to demand other people’s time like this.
  It’s pretty evident that the people building grok are injecting their ideology into it.
  I don’t need more evidence, and I don’t need you to agree with me. Go ahead and write those 20 lines if you so desire. I’m happy to be proven wrong.
  
  1 reply →
- arp242 2 days ago
  
  You can't just run a few queries and base conclusion off that, you need to run tens of thousands of different ones and then somehow evaluate the responses. It's a huge amount of work.
  Demanding empirical data and then coming up with shoddy half-arsed methodology is unserious.
engineer_22 2 days ago

> Does ChatGPT start ranting about Jews and "White Genocide" unprompted? How can I even quantify that it doesn't do that?
Grok doesnt do that.