The evaluation tools used are helpful for encoder development, but at best they're imperfect proxies for human perception, and their predictions are often inconsistent with the human experience. I assume that statements like "apparently the best AAC encoder" aren't meant to be taken too seriously, since everybody who does this stuff knows that ABX/MUSHRA tests with real humans are what tell the tale.
On Opus vs. AAC specifically, there's a long history of studies like https://www.researchgate.net/publication/301428302_Perceived... to help answer that question. (There are interesting charts at the top of page 1175.)