Comment by jclay
15 hours ago
Exciting work! I’ve often wondered if an LLM with the right harness could restore and optimize an aging C/C++ codebase. It would be quite compelling to get an old game engine running again on a modern system.
I would expect most of these systems come with very carefully guarded access controls. It also strikes me as a uniquely difficult challenge to track down the decision maker who is willing to take the risk on revamping these systems (AI or not). Curious to hear more about what you’ve learned here.
Also curious to hear how LLMs perform on a language like COBOL that likely doesn’t have many quality samples in the training data.
Thank you!
The decision makers we work with are typically modernization leaders and mainframe owners — usually director or VP level and above. There are a few major tailwinds helping us get into these enterprises:
1. The SMEs who understand these systems are retiring, so every year that passes makes the systems more opaque.
2. There’s intense top-down pressure across Fortune 500s to adopt AI initiatives.
3. Many of these companies are paying IBM 7–9 figures annually just to keep their mainframes running.
Modernization has always been a priority, but the perceived risk was enormous. With today’s LLMs, we’re finally able to reduce that risk in a meaningful way and make modernization feasible at scale.
You’re absolutely right about COBOL’s limited presence in training data compared to languages like Java or Python. Because COBOL is highly structured and readable, current reasoning models get us to an acceptable level of performance, where it’s now valuable to use them for these tasks. For near-perfect accuracy (95%+), that is where we see a large opportunity to build domain-specific frontier models purpose-built for these legacy systems.
In terms of training your own models, is there enough COBOL available for training, or will you have to convince your customers to let you train on their data? (Do you think banks would push back against that?)
There isn’t enough COBOL data available to reach human-level performance yet.
That’s exactly the opportunity we have in front of us: making it possible through our own frontier models and infra.
> It also strikes me as a uniquely difficult challenge to track down the decision maker who is willing to take the risk on revamping these systems (AI or not).
Here, that person is a manager who was demoted from ~500 reports to ~40 and then convinced his new boss that it’s a good idea to reuse his team for his personal AI strategy, which will make him great again.