Comment by dostick
10 hours ago
The tests should have negative weights based on how often that issue encountered and impact. The 2. SPI should have like 8 negative points out of 10 as most common blocker. And whole test inverse score.
10 hours ago
The tests should have negative weights based on how often that issue encountered and impact. The 2. SPI should have like 8 negative points out of 10 as most common blocker. And whole test inverse score.
Yeah, good call, we're on the same page about that. I designed this tool (agentreadingtest.com) to raise awareness of these issues in a more general way, so people can point agents at it and see how it performs for them. Separately, I maintain a related tool that can actually assess these issues in documentation sites: https://afdocs.dev/
My weighting system there scores the number of pages affected by SPA and caps the possible score at a "D" or "F" depending on the proportion of pages affected: https://afdocs.dev/interaction-diagnostics.html#spa-shells-i...
I've tried to weight things appropriately in assessing actual sites, but for the test here, I more wanted to just let people see for themselves what types of failures can occur.