Comment by mjevans
2 days ago
There might be legal / compliance reasons. It can be incredibly difficult to replace a validated system that is known (or at least accepted, even if it's technically incorrect) to implement lawmaker-dictated behavior.
Otherwise, I think a new approach might be to ignore the specifics of the old system and implement a new system plus a separate translation layer that can run on an export of the old system (or on the old system brought back online, but read-only, after the overnight maintenance), then cut over completely during an otherwise idle holiday weekend.
It doesn't work like that. When you're revamping large, important, fingers-in-everything-and-everybody's-fingers-in-it systems you can't ignore anything. A (presumably) hypothetical example is sorting names. Simple, right? You just plop an ORDER BY in the SQL, or call a library function. Except for a few niggling details:
1. This is an old IBM COBOL system. That means EBCDIC, not UTF or even ASCII.
1.A Fine, we'll mass-convert all the old data from EBCDIC to UTF. Done.
1.A.1 Which EBCDIC character set? There are multiple variants, often based on nationality. Which ones are in use? Can you depend on all records in a dataset using the same one? (hint: no.) Can you depend on all fields in a particular record using the same one? (hint: no.) Can you depend on all records using the same one for a particular field? (hint...) Can you depend on any sane method for figuring out what a particular field in a particular record in a particular dataset is using? Nope nope nope.
1.A.2 Looking at program A, you find it reads data from source B and merges it with source C. Source B, once upon a time, was from a region with lots of French names, and used code page 279 ('94 French). Except for those using 274 (old Belgium). And one really ancient set of data with what appears to be a custom code set only used by two parishes. Program A muddles through well enough to match up names with C, at least well enough for programs D, E, and F.
1.A.3 But it's not good enough for program G (when handling the Wednesday set of batches). G has to cross-reference the broken output from A with H to figure out what's what.
1.B You have now changed the output. It works for D and F, but now E is broken, and all the ad hoc, painstakingly hand-crafted workarounds in G are completely clueless.
1.C Oh, and there's consumer J, which wasn't properly documented, which you don't know exists, and which handles renewals for 60-to-70-year-old pensioners who will be very vocal when their licenses are bungled.
2. Speaking of birth years, here's a mishmash of 2-, 4-, and even 3-digit years....
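To make point 1.A.1 concrete, here is a minimal Python sketch of why "just mass-convert it" is underspecified. It assumes the candidate code pages are `cp037`, `cp500`, and `cp273`, chosen only because CPython ships codecs for them; they are stand-ins, not the pages named in the example above. The helper names are made up for illustration.

```python
# Sketch: the same stored bytes can read differently under different EBCDIC pages.
# cp037, cp500 and cp273 are codecs bundled with CPython; they are stand-ins for
# whichever pages a real dataset actually mixes (an assumption of this sketch).
CANDIDATE_PAGES = ("cp037", "cp500", "cp273")


def decode_candidates(raw: bytes, pages=CANDIDATE_PAGES) -> dict:
    """Decode one field under every candidate code page."""
    return {page: raw.decode(page, errors="replace") for page in pages}


def disputed_bytes(page_a: str, page_b: str) -> dict:
    """Map each byte value to its two readings wherever the pages disagree."""
    every_byte = bytes(range(256))
    text_a = every_byte.decode(page_a, errors="replace")
    text_b = every_byte.decode(page_b, errors="replace")
    return {
        value: (a, b)
        for value, (a, b) in enumerate(zip(text_a, text_b))
        if a != b
    }


def is_ambiguous(raw: bytes, pages=CANDIDATE_PAGES) -> bool:
    """True when the candidate pages do not all agree on this field."""
    return len(set(decode_candidates(raw, pages).values())) > 1


if __name__ == "__main__":
    plain = "JONES".encode("cp037")
    print(decode_candidates(plain))  # unaccented capitals agree on all three pages
    differing = disputed_bytes("cp037", "cp500")
    print(len(differing), "byte values differ between cp037 and cp500")
```

A field made only of unaccented capitals decodes identically everywhere, which is exactly why the problem hides: the disagreements sit in punctuation and accented letters, so a converter can look correct on most records and still mangle the ones that matter. Nothing in the bytes themselves says which page is the right one.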
Yes, that's why the new system has to be a complete replacement. Part of its spec COULD be to provide backwards-compatible interfaces too, in case things can't all be cut over at once, but that would increase the project scope and also tie things down to the old system.
Part of a full replacement system would be the option to use a _different_ set of rules, which better reflect current desires and are, hopefully, easier to implement.
Yes, the old data would need to be _transcribed_ during its restoration to the new system, and human bureaucratic layers can likely handle issues. Heck, they could do a deferred implementation of the new system, where one long weekend the new system is brought up and any issues that are noticed get worked out as kinks. Once there aren't any _noticed_ kinks in those tests, have the results sent out to the stakeholders and solicit feedback on whether there are any inaccuracies. That might take a year or two of renewals, updates, and the annual business as they see whether the new notices are correct or not.
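As a rough illustration of the transcription step with a human fallback, here is a minimal Python sketch. It assumes the export is a list of dicts with a string `birth_year` field and uses the mixed 2-, 3-, and 4-digit years from earlier in the thread as the troublesome case; the record layout and all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptionResult:
    migrated: list = field(default_factory=list)      # clean records for the new system
    needs_review: list = field(default_factory=list)  # (record, reason) pairs for a human


def normalize_birth_year(raw: str):
    """Return a 4-digit year as int, or None when the stored value is ambiguous."""
    digits = raw.strip()
    if len(digits) == 4 and digits.isascii() and digits.isdigit():
        return int(digits)
    # 2- and 3-digit years need a century rule that only the business can supply,
    # so this sketch refuses to guess.
    return None


def transcribe(old_records):
    """Split an export of the old system into migrated and human-review piles."""
    result = TranscriptionResult()
    for record in old_records:
        year = normalize_birth_year(record.get("birth_year", ""))
        if year is None:
            result.needs_review.append((record, "ambiguous birth year"))
        else:
            result.migrated.append({**record, "birth_year": year})
    return result


if __name__ == "__main__":
    export = [
        {"name": "A", "birth_year": "1954"},
        {"name": "B", "birth_year": "54"},
        {"name": "C", "birth_year": "954"},
    ]
    outcome = transcribe(export)
    print(len(outcome.migrated), "migrated;", len(outcome.needs_review), "for review")
```

The point of the design is that the translation layer never guesses: anything it cannot transcribe unambiguously goes to a review pile, and the size of that pile is a rough measure of how much human effort the cutover would need.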