
Comment by NVHacker

2 days ago

[flagged]

Yes, I’d imagine the reason it still hasn’t been fixed after nearly a decade is management, politics, etc. But the reason it takes more than just 6 months will be technical. As a result, it’s the kind of job that gets canned for taking too long, even though no one said it would be quick.

  • There might be legal / compliance reasons. It can be incredibly difficult to replace a validated system that is known (or already accepted, even if it's technically incorrect) to implement lawmaker-dictated behavior.

    Otherwise, I think a new approach might be to ignore the specifics of the old system and implement a new system plus a separate translation layer that can run on an export of the old system (or on the old system brought back online, read-only, after the overnight maintenance), then cut over completely during an otherwise quiet holiday weekend.

    • > I think a new approach might be to ignore the specifics of the old system, implement a new system

      It doesn't work like that. When you're revamping large, important, fingers-in-everything-and-everybody's-fingers-in-it systems, you can't ignore anything. A (presumably) hypothetical example is sorting names. Simple, right? You just plop an ORDER BY in the SQL, or call a library function. Except for a few niggling details:

      1. This is an old IBM COBOL system. That means EBCDIC, not UTF or even ASCII.

      1.A Fine, we'll mass-convert all the old data from EBCDIC to UTF. Done.

      1.A.1 Which EBCDIC character set? There are multiple variants. Often based on nationality. Which ones are in use? Can you depend on all records in a dataset using the same one (hint: no.) Can you depend on all fields in a particular record using the same one? (hint: no.) Can you depend on all records using the same one for a particular field? (hint...) Can you depend on any sane method for figuring out what a particular field in a particular record in a particular dataset is using? Nope nope nope.

      1.A.2 Looking at program A, you find it reads data from source B and merges it with source C. Source B, once upon a time, was from a region with lots of French names, and used code page 279 ('94 French). Except for those using 274 (old Belgium). And one really ancient set of data with what appears to be a custom code set only used by two parishes. Program A muddles through well enough to match up names with C, at least well enough for programs D, E, and F.

      1.A.3 But it's not good enough for program G (when handling the Wednesday set of batches). G has to cross-reference the broken output from A with H to figure out what's what.

      1.B You have now changed the output. It works for D and F, but now E is broken, and all the ad hoc, painstakingly hand-crafted workarounds in G are completely clueless.

      1.C Oh, and there's consumer J, which wasn't properly documented, which you don't know exists, and which handles renewals for 60-to-70-year-old pensioners who will be very vocal when their licenses are bungled.

      2. Speaking of birth years, here's a mishmash of 2-, 4-, and even 3-digit years....
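To make points 1.A.1 and the "just sort it" assumption concrete, here is a minimal Python sketch. The name and the list are made-up sample data, and cp037 (US/Canada) and cp273 (Germany/Austria) are simply two EBCDIC variants that happen to ship with Python's standard codecs; they stand in for whatever code pages a real dataset might mix. It shows that guessing the wrong code page raises no error and just yields a different string, and that native EBCDIC byte order sorts names differently from Unicode code point order.

```python
# Hypothetical sample data, for illustration only.
name = "MÜLLER"

# Bytes as a system using the German EBCDIC variant would store them.
raw = name.encode("cp273")

right = raw.decode("cp273")  # correct guess: the original name comes back
wrong = raw.decode("cp037")  # wrong guess: no exception, just different text

# Sorting: EBCDIC places lowercase letters before uppercase letters before
# digits, whereas ASCII/Unicode places digits before uppercase before lowercase.
names = ["adams", "Baker", "3M"]
ebcdic_order = sorted(names, key=lambda s: s.encode("cp037"))
unicode_order = sorted(names)

print(repr(right), repr(wrong))
print(ebcdic_order)   # ['adams', 'Baker', '3M']
print(unicode_order)  # ['3M', 'Baker', 'adams']
```

Nothing in the bytes themselves says which decode was the right one, which is why a mass conversion cannot be verified without knowing where each field came from. And any downstream program that relied on the old byte order sees its input reordered after the conversion.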


> Having legacy data and systems for a few years is a challenge. Still having them after decades is government.

FTFY

> Having legacy data... after decades is incompetence

Harsh

How long does a person hold a driver's license?

  • That's just data. "Legacy data" was used here to suggest a legacy database/storage system. The reality is that the situation is not due to an insurmountable technical problem but due to a combination of lack of funds / prioritization / motivation / knowledge.