Comment by timschmidt
4 days ago
Friend, if you can't empirically measure the outputs of LLMs which provide lovely APIs for doing so, what are you doing?
20 lines of code and some data would really bolster your case, but I don't see them.
You can't just run a few queries and base a conclusion on that. You need to run tens of thousands of different ones and then somehow evaluate the responses. It's a huge amount of work.
Demanding empirical data and then coming up with a shoddy, half-arsed methodology is unserious.
idk friend, it seems kind of presumptuous to demand other people’s time like this.
It’s pretty evident that the people building Grok are injecting their ideology into it.
I don’t need more evidence, and I don’t need you to agree with me. Go ahead and write those 20 lines if you so desire. I’m happy to be proven wrong.
I don't think I'm the one being presumptuous or demanding. I've actually tried to help you make a stronger argument. Shooting a hundred or even a thousand queries to 3 or 4 LLMs and shoving the results through established sentiment analysis algorithms is something ChatGPT can one-shot in just about any language. You demand people agree with your opinion and refuse to spend 20 minutes supporting it with facts. Not my problem, I tried to help. You may not see it that way. That's fine.
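For what it's worth, a rough sketch of the kind of comparison I mean could look like the following. Everything named here is an assumption for illustration: `fake_model_a` and `fake_model_b` are hypothetical stand-ins for real API clients, and the tiny word-list scorer is a placeholder for an established sentiment analysis library, not a substitute for one.

```python
import re
from statistics import mean

# Tiny illustrative lexicon (assumption). A real run would swap this
# for an established sentiment analysis library.
POSITIVE = {"good", "great", "beneficial", "fair", "helpful"}
NEGATIVE = {"bad", "harmful", "unfair", "dangerous", "terrible"}


def sentiment(text):
    """Return (positive hits - negative hits) / total hits, or 0.0 if no hits."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def compare_models(models, prompts):
    """Send every prompt to every model and average the sentiment per model.

    models: dict mapping a model name to a callable(prompt) -> response text.
    """
    return {
        name: mean(sentiment(ask(prompt)) for prompt in prompts)
        for name, ask in models.items()
    }


# Hypothetical stand-ins for real LLM API calls (assumption).
def fake_model_a(prompt):
    return f"{prompt} is good and helpful."


def fake_model_b(prompt):
    return f"{prompt} is harmful and bad, though fair."
```

Calling `compare_models({"model_a": fake_model_a, "model_b": fake_model_b}, prompts)` with a list of prompts returns one average score per model. Replacing the fake functions with real API calls and the word lists with a proper sentiment library is where the actual work would go.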