Comment by Sharlin

5 days ago

Yeah, I later found that quote on Wikipedia too. Though I'm not sure the cited source is super reliable either; it may just be folklore ("Oh, 'code page' refers to actual deadtree pages"). All the IBM documentation I could find showed big gaps in the sequence of code pages.

But I just now found the list at [1], I don't know why I didn't notice it before. It's certainly comprehensive! Some real detective work must have gone into compiling that list. The gaps are much smaller, though they still exist, e.g. from 40 to 251. The 300s are rather sparse, there are only a few 4xx codes, and then there's a jump from 500 to 8xx (with some 7xx assigned later, I think).

In any case, I agree that the LLMs seem to have hallucinated the "more general sequence" part. The code page IDs, or more formally CCSIDs, were always a specific set of 16-bit ID numbers. Why exactly the various gaps exist is probably lost to history by now, if there ever were any particular reasons.

[1] https://en.wikipedia.org/wiki/Code_page