That's about what I'd expect at the current level of GPT-2. It is very good in some dimensions, but terrible in others, e.g., right there at the beginning: "Blacks make up 1/8 of the world's population, but they account for only a 3/10 of global economic power." GPT is very bad at math [1]: it gets the structure of arguments involving math or numbers, but it doesn't understand the numbers. So right off there's a massive unforced error in this controversial statement, which complains that 12.5% of the world's population has "only" 30% of the power, undercutting the whole thing from the get-go and making what was a good start to a "controversial" statement collapse into farce over a simple error. It does this sort of thing pervasively. It's clear that GPT-2 is touching controversial topics, but it's failing to put together controversial sentences.
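(For the record, the arithmetic that trips it up is trivial; the fractions below are taken straight from the quoted sentence, and the snippet is just a sanity check, not anything GPT-2 produced:)

    # The generated claim: 1/8 of the world's population holds "only" 3/10 of
    # global economic power. The "only" is backwards: 3/10 is larger than 1/8,
    # so the sentence ends up complaining that the group is *over*-represented.
    population_share = 1 / 8    # 12.5%
    power_share = 3 / 10        # 30%
    print(f"{population_share:.1%} of population vs {power_share:.1%} of power")
    print(power_share > population_share)   # True -- the complaint undercuts itself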
I suspect that the sort of thing Scott discusses in this story isn't quite possible at the full power described... but certainly as AI gets better than GPT-2, it'll get better at this too.
[1]: Read some of https://www.reddit.com/r/SubSimulatorGPT2/search/?q=math&res...
Funny thing though - I could see that statement being particularly enraging once a large enough and random enough group got behind debating it. These statements don't have to make logical sense, they only have to trigger people into instinctive camps.
"The math doesn't even make sense, and is making my point, that there's no problem here"
"You're saying there's not a problem? Believe me, there's a problem..."
etc.
True enough, but I think a statement with accurate math is more likely to focus people on the genuinely enraging substance rather than on the mere fact of a math error. It may be enraging to some people, but I don't think it's anywhere near optimally enraging.
AI is already better (just not evenly distributed). The small GPT-2s are bad at math, but they're not trained for that in the first place; we know Transformers are capable of doing excellent things with math because they do in other papers which tackle more specialized problems like theorem proving. The shallowness of the small GPT-2s is definitely part of it (the model gets only a few sequential steps of computation to 'think'), as are lousy sampling procedures and a general lack of parameters: 'coherency' in general seems to improve drastically as you scale up to Megatron levels. If you combined all of the SOTA pieces, polished it for a while, and plugged it into social media for RL, you'd get something much better than this...
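As a rough illustration of the "sampling procedures" point, here's a minimal sketch of temperature plus top-k sampling over next-token logits; the function name, the toy logits, and the hyperparameter values are made up for the example, not taken from any particular GPT-2 codebase:

    import numpy as np

    def sample_next_token(logits, temperature=0.8, top_k=40):
        # Temperature + top-k sampling over next-token logits. Greedy/argmax
        # decoding makes small LMs loop and go flat; keeping only the top-k
        # tokens and rescaling by temperature is one of the standard fixes.
        logits = np.asarray(logits, dtype=np.float64) / temperature
        top_indices = np.argsort(logits)[-top_k:]          # keep k best tokens
        top_logits = logits[top_indices]
        probs = np.exp(top_logits - top_logits.max())      # stable softmax
        probs /= probs.sum()
        return int(np.random.choice(top_indices, p=probs))

    # toy example with a 10-token "vocabulary"
    print(sample_next_token(np.random.randn(10), temperature=0.8, top_k=5))

In a real setup you'd pull the logits from the model itself, and there are fancier schemes than this, but the point stands: decoding choices, not just parameter count, account for a lot of the perceived incoherence.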