Comment by NewsaHackO

2 years ago

The article seems to report some data points that at least make it look comparable to GPT-4. To me, that makes it more objective than fluff.

There are some 7B-weight models that look competitive with GPT-4 on benchmarks because they were trained on the benchmark data. Presumably Google would know better than to train on the benchmark data, but you never know. The benchmarks also fail to capture things such as Bard refusing to tell you how to kill a process on Linux because it's unethical.

  • >Bard refusing to tell you how to kill a process on Linux because it's unethical.

    Gives me what, on a quick scan, looks like a pretty good answer.

  • >The benchmarks also fail to capture things such as Bard refusing to tell you how to kill a process on Linux because it's unethical.

    When I used Bard, I had to negotiate with it over what is ethical and what is not[0]. For example, when I was researching WW2 (Stalin and Hitler), I asked: "When did Hitler go to sleep?" and Bard thought that this information could be used to promote violence and hatred. Then I told it that this information cannot be used to promote violence in any way, and it gave in! I laughed at that.

    [0] https://i.imgur.com/hIpnII8.png
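For reference, the question Bard balked at is one-liner territory with standard tools. A minimal sketch (the `sleep` process is just a stand-in for whatever you want to terminate):

```shell
# Start a dummy background process so the example is self-contained.
sleep 300 &
pid=$!                        # PID of the process we just launched

kill -TERM "$pid"             # SIGTERM: polite request, process may clean up
# kill -KILL "$pid"           # SIGKILL: last resort, cannot be caught or ignored

wait "$pid" 2>/dev/null || true   # reap the child; exit status reflects the signal
echo "terminated $pid"
```

In practice you'd find the PID with `pgrep <name>` or `ps` first; `pkill <name>` combines the lookup and the signal in one step.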