Comment by skissane
2 days ago
I think the difference is that I know that getting data out of mainframe COBOL systems is a long-known and long-solved problem, and I can list lots of different ways to do it (I mentioned a few, there's several more I didn't mention). Without knowing the details of the exact system, I'm not sure which one would be the best one to use, but the odds that you'd have a system for which none of these existing solutions is suitable is rather low – and indeed, likely most of these systems are already using one or another – there are whole teams of sales people who have spent the last 20-30 years convincing government agencies (inter alia) to buy these solutions.
Whereas, you don't seem to know anything about that topic, and are speculating based on parallels with completely different disciplines (such as statistical mechanics).
We both are speculating due to lack of details about the specific systems under discussion, but wouldn't you expect the person whose speculations are based on greater relevant knowledge to be more likely to be correct?
I'm sorry, but just because I didn't pepper my post with shibboleths like z/OS or VSAM or the vagaries of the ACCEPT and DISPLAY keywords doesn't mean I don't know what I'm talking about. I worked specifically on connecting a COBOL system to a DB/2 database, and one thing was for certain: understanding the data format was the hardest part of the problem. Those definitions, in our system, were tightly coupled to the user interface code AND the batch processing code.
No, it's not my specialty and I didn't work with this system for long, but my overall impression was that COBOL programmers get (understandably) low-level abstractions, and therefore have to build higher-level abstractions themselves. This is not like modern software development, where you have an embarrassment of riches at any level of abstraction you want, and a large system where every part of the stack is a custom solution is generally going to be more chaotic. To put some numbers on it, to add a column of data to the system I worked on required on average about 20k hours of coding work. No doubt some of this was sandbagging, but I'd say 80% of it was legitimate.
> I worked specifically on connecting a COBOL system to a DB/2 database, and one thing was for certain: understanding the data format was the hardest part of the problem.
But now you are shifting the goalposts: from getting read-only access to the data, to understanding what it actually means. Yes, I totally agree: in a lot of legacy COBOL systems, it can be very hard to work out what the data actually means. Even though you probably have a COBOL copybook telling you what the columns/fields are, they can be full of things like single-letter codes where the documentation telling you what the codes mean is incorrect. And likewise, you are right that seemingly simple tasks like adding a field can be monumental work given the number of different transaction screens, reports, batch jobs, etc., that need to be updated, and the fact that many mainframe programmers don’t know what “DRY” stands for.
But simply getting read-only access to data? Most mainframe COBOL systems would already support that. Could there be some really badly maintained ones in which it was never configured properly and they just give DOGE read-write access because DOGE refuses to wait for it to be done properly? I doubt that’s the norm but it might be a rare exception. Such a system would likely violate security standards for federal IT systems, but agencies can get exemptions.
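To make the copybook point concrete, here is a rough Python sketch of reading one fixed-width record against a copybook-style layout. Everything in it is made up for illustration: the `CUSTOMER-REC` copybook, the field names, the `LAYOUT` table, and the `cp037` EBCDIC code page are assumptions, not details of any system discussed here. It shows why the mechanical decode is the easy part: you get `status_code == "A"` back without trouble, and the code still tells you nothing about what "A" means.

```python
from decimal import Decimal

# Hypothetical copybook, invented for this example:
#   01 CUSTOMER-REC.
#      05 CUST-ID      PIC 9(6).
#      05 CUST-NAME    PIC X(10).
#      05 STATUS-CODE  PIC X.
#      05 BALANCE      PIC S9(5)V99 COMP-3.
LAYOUT = [
    ("cust_id", 6, "zoned"),      # unsigned zoned decimal, one digit per byte
    ("cust_name", 10, "text"),
    ("status_code", 1, "text"),
    ("balance", 4, "packed"),     # 7 digits + sign nibble = 4 bytes
]


def unpack_comp3(raw: bytes, scale: int) -> Decimal:
    """Decode a packed-decimal (COMP-3) field with `scale` implied decimals."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)            # last byte: one digit + sign nibble
    sign = -1 if (raw[-1] & 0x0F) == 0x0D else 1
    value = int("".join(str(d) for d in digits))
    return Decimal(sign * value).scaleb(-scale)


def decode_record(record: bytes) -> dict:
    """Slice a record by the layout and decode each field."""
    out, offset = {}, 0
    for name, length, kind in LAYOUT:
        raw = record[offset:offset + length]
        offset += length
        if kind == "text":
            out[name] = raw.decode("cp037").rstrip()
        elif kind == "zoned":
            out[name] = int(raw.decode("cp037"))
        else:
            out[name] = unpack_comp3(raw, scale=2)
    return out


# Sample record built by hand; 0x6C ends in sign nibble C (positive).
sample = "001234JANE DOE  A".encode("cp037") + bytes([0x01, 0x23, 0x45, 0x6C])
row = decode_record(sample)
print(row)
```

Real copybooks add REDEFINES, OCCURS, and signed zoned fields, which this sketch deliberately ignores.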
> To put some numbers on it, to add a column of data to the system I worked on required on average about 20k hours of coding work.
20,000 hours is 10 years of full-time work for a single person. If you "didn't work with this system for long," it is quite simply statistically impossible that you could have witnessed enough projects to have anything resembling an accurate "average".
> 20,000 hours is 10 years of full-time work for a single person.
Or, while we're mythical man-monthing it, 6 months of work for 20 people? Or merely a single sprint for 240 people!
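For anyone checking the figures in this exchange, the arithmetic can be sketched in a few lines of Python. The 2,000 hours per full-time year (40 hours a week for 50 weeks) is an assumption the numbers above seem to rely on, and the naive division ignores coordination overhead entirely, which is rather the point of the man-month joke.

```python
HOURS_PER_WEEK = 40                    # assumption: standard full-time week
WEEKS_PER_YEAR = 50                    # assumption: two weeks off per year
HOURS_PER_YEAR = HOURS_PER_WEEK * WEEKS_PER_YEAR  # 2,000 hours

TOTAL_HOURS = 20_000                   # the "20k hours" figure from the thread


def years_needed(total_hours: float, people: int) -> float:
    """Naive calendar years if work divided perfectly across people."""
    return total_hours / (people * HOURS_PER_YEAR)


def weeks_needed(total_hours: float, people: int) -> float:
    """Naive calendar weeks if work divided perfectly across people."""
    return total_hours / (people * HOURS_PER_WEEK)


for team in (1, 20, 240):
    print(team, "people:", years_needed(TOTAL_HOURS, team), "years,",
          round(weeks_needed(TOTAL_HOURS, team), 1), "weeks")
```

Under those assumptions, one person needs 10 years, 20 people need half a year, and 240 people need roughly two weeks.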