Comment by PaulDavisThe1st

2 months ago

No, that's not an answer to that specific question.

Performance on the AT-SAT is not job performance.

If you have a qualification test that feels useful but also turns out to be highly non-predictive of job performance (as, for example, most college entrance exams turn out to be for college performance), you could change the qualification threshold for the test without any particular expectation of losing job performance.

In fact, it is precisely this logic that led many universities to stop using admissions tests: they simply failed to predict actual performance well.

> Performance on the AT-SAT is not job performance.

No, but it was the best available predictor of job performance and academy pass rate.

https://apps.dtic.mil/sti/pdfs/ADA566825.pdf

https://www.faa.gov/sites/faa.gov/files/data_research/resear... (page 41)

There are a fixed number of seats at the ATC academy in OKC, so it's critical to get the highest quality applicants possible to ensure that the pass rate is as high as possible, especially given that the ATC system has been understaffed for decades.

  • That is NOT what the first study you've cited says at all:

    > "The empirically-keyed, response-option scored biodata scale demonstrated incremental validity over the computerized aptitude test battery in predicting scores representing the core technical skills of en route controllers."

    I.e., the aptitude test battery is WORSE than the biodata scale.

    The second citation you offered merely notes that the AT-SAT battery is a better predictor than the older OPM battery, not that it is the best.

    I'd also say, at a higher level, that both of those papers absolutely reek of the non-reproducibility and low-N problems that plague social and psychological research. I'm not saying they're wrong; they are just not obviously definitive.

    • > The second citation you offered merely notes that the AT-SAT battery is a better predictor than the older OPM battery, not that it is the best.

      How is that a criticism? It is always possible that someone could invent a better test.

      In any case, the second citation directly refutes your point in another sub-thread with AnthonyMouse, the assertion that higher-performing applicants above the cutoff do not perform better on the job:

      "If all applicants scoring 70 or above on the AT-SAT are selected, slightly over one-third would be expected to be high performers. With slightly greater selectivity, taking only applicants scoring 75.1 or above, the proportion of high performers could be increased to nearly half."

      Also:

      "The primary point is that applicants who score very high (at 90) on the AT-SAT are expected to perform near the top of the distribution of current controllers (at the 86th percentile)."
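The selection logic quoted above can be sketched with a small simulation: if test scores and later job performance are positively correlated, raising the score cutoff raises the share of selectees who turn out to be high performers. The correlation of 0.5 and the "top 20% = high performer" definition below are assumptions for illustration, not figures from the FAA studies.

```python
import numpy as np

# Simulated applicants: job performance is partly predicted by the test
# score (assumed correlation r = 0.5), partly by everything the test
# cannot see. Purely synthetic data, not FAA data.
rng = np.random.default_rng(1)
r = 0.5
n = 200_000
test = rng.normal(size=n)
perf = r * test + np.sqrt(1 - r**2) * rng.normal(size=n)

# Call the top 20% of the performance distribution "high performers".
high = perf > np.quantile(perf, 0.8)

# Raising the standardized cutoff increases the proportion of high
# performers among those selected, exactly as the quoted passage claims.
for cut in (0.5, 1.0, 1.5):
    sel = test > cut
    print(f"cutoff {cut}: {high[sel].mean():.0%} of selectees are high performers")
```

The printed shares increase monotonically with the cutoff, which is the whole argument for greater selectivity when academy seats are fixed.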

    • > I.e the aptitude test battery is WORSE than the biodata scale.

      You're mistaken; it's the opposite. The first study found that AT-SAT performance was the best measure, with the biodata providing only a small enhancement:

      > AT-SAT scores accounted for 27% of variance in the criterion measure (β = 0.520, adjusted R² = .271, p < .001). Biodata accounted for an additional 2% of the variance in CBPM (β = 0.134; adjusted ΔR² = 0.016, ΔF = 5.040, p < .05).

      > In other words, after taking AT-SAT into account, CBAS accounted for just a bit more of the variance in the criterion measure

      Hence, "incremental validity."
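"Incremental validity" here is a hierarchical-regression idea: fit the criterion on the aptitude score alone, then refit with the biodata predictor added, and the gain in R² (ΔR²) is the biodata's incremental contribution. A minimal sketch on simulated data, with effect sizes chosen to roughly mimic the quoted β values (the variable names and numbers are illustrative assumptions, not the study's data):

```python
import numpy as np

# Simulated data: criterion (CBPM-like score) driven mostly by the
# aptitude score, with a small biodata effect plus noise.
rng = np.random.default_rng(0)
n = 500
atsat = rng.normal(size=n)      # simulated aptitude test scores (standardized)
biodata = rng.normal(size=n)    # simulated biodata scale scores
cbpm = 0.52 * atsat + 0.13 * biodata + rng.normal(scale=0.8, size=n)

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

# Step 1: baseline model with the aptitude score only.
r2_base = r_squared(atsat.reshape(-1, 1), cbpm)
# Step 2: add biodata; the increase in R^2 is its incremental validity.
r2_full = r_squared(np.column_stack([atsat, biodata]), cbpm)

print(f"R^2 (aptitude only): {r2_base:.3f}")
print(f"delta R^2 from adding biodata: {r2_full - r2_base:.3f}")
```

The point mirrors the quoted result: the second predictor can have real incremental validity (ΔR² > 0) while the first predictor still does almost all of the work.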

      > The second citation you offered merely notes that the AT-SAT battery is a better predictor than the older OPM battery, not that it is the best.

      You're right, and I can't remember which study it was that explicitly said it was the best measure; I'll post it here if I find it. However, given that each failed applicant costs the FAA hundreds of thousands of dollars, we can safely assume that there was no better measure readily available at the time, or it would have been used instead of the AT-SAT. They currently use the ATSA instead of the AT-SAT, which is supposed to be a better predictor, and they're planning to replace it in a year or two; it's an ongoing problem with ongoing research.

      > I'd also say at a higher level that both of those papers absolutely reek of non-reproduceability and low N problems that plague social and psychological research. I'm not saying they're wrong. They are just not obviously definitive.

      Given the limited number of controllers, this is going to be an issue in any study you find on the topic. You can only pull so many people off the boards to take these tests, so you're never going to have an enormous sample size.