Comment by a3w

4 days ago

Can LLMs do Cobol?

> Can LLMs do Cobol?

I imagine it's the one place where LLMs would absolutely shine. COBOL jobs are usually very verbose, with lots of boilerplate, but what they do is mostly straightforward batch processing. It's ripe for automation with LLMs.

The flip side is that banks are usually very conservative about technology (for good reason).

  • I don't think LLMs are specific to human language, so would they really shine here? I.e., COBOL and SQL may be great for humans who otherwise aren't used to programming languages, but LLMs have seen everything, and are thus able to pick up any (programming) language, not just the English-like ones.

    • I think it would shine because COBOL is very verbose and the programs written in it (batch jobs) are very straightforward but also boilerplate-heavy and boring to write by hand: a situation with little risk and high reward to automate.

  • IMO the ideal path to working on COBOL without having decades of experience would be to spend a few days grokking the syntax and writing toy programs for practice, then collaborate with a large LLM to understand the current code and gradually make changes.

    • Have you worked with COBOL? My understanding is that the language itself isn't the real problem. Mainframes, job control languages: everything around COBOL is very different from anything most Windows/UNIX people have experienced.

Yep. I have recently prompted for COBOL code solutions so I can compare them to other languages I also know and check the quality of the answers. So far, no mistakes.

I'm sure they can do brainfuck if you have a good training set.

  • LLMs are terrible at brainfuck. I spent a solid week attempting to use generative models to iteratively refine BF program tapes with nothing to show for it. I've written genetic programming routines that can produce better brainfuck programs than ChatGPT can.

    For example, if I prompt ChatGPT: "Write me a BF program that produces the alphabet, but inverts the position of J & K" it will deterministically fail. I've never even seen one that produces the alphabet the normal way. I can run a GP algorithm over an example of the altered alphabet string and use simple MSE to get it to evolve a BF program that actually emits the expected output.
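    For context, the unaltered task has a short hand-written solution. A minimal BF interpreter (a hypothetical helper, not code from this thread; input `,` is omitted) makes it easy to check candidate programs:

```python
# Minimal Brainfuck interpreter sketch. Hypothetical helper, not from the
# thread; handles > < + - . [ ] and ignores input (',') for simplicity.
def run_bf(code: str, max_steps: int = 1_000_000) -> str:
    # Pre-match brackets so loop jumps are O(1).
    stack, match = [], {}
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            match[i], match[j] = j, i
    tape, ptr, pc, out = [0] * 30_000, 0, 0, []
    while pc < len(code) and max_steps > 0:
        c = code[pc]
        if c == '>':
            ptr += 1
        elif c == '<':
            ptr -= 1
        elif c == '+':
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-':
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '.':
            out.append(chr(tape[ptr]))
        elif c == '[' and tape[ptr] == 0:
            pc = match[pc]
        elif c == ']' and tape[ptr] != 0:
            pc = match[pc]
        pc += 1
        max_steps -= 1
    return ''.join(out)

# Cell layout: cell1 is set to 65 ('A'), cell2 counts down from 26,
# and the loop prints-then-increments cell1 each pass.
ALPHABET = "++++++++[>++++++++<-]>+>++++++++++++++++++++++++++[<.+>-]"
print(run_bf(ALPHABET))  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
```

    The program also illustrates the fragility: dropping or flipping a single byte anywhere in that string changes or destroys the output.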

    The BPE tokenizer seems like a big part of the problem when considering the byte-per-instruction model, but fundamentally I don't think there is a happy path even if we didn't need to tokenize the corpus. The expressiveness of the language is virtually non-existent. Namespaces, type names, member names, attributes, etc., are a huge part of what allows an LLM to lock on to the desired outcome. Getting even one byte wrong is catastrophic for the program's meaning. You can get a lot of bytes wrong in C/C++/C#/Java/Go/etc. (e.g. member names) and still have the function do exactly the same thing.
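    The "simple MSE" scoring described above is easy to state as a sketch (hypothetical names; byte-level comparison and NUL-padding are my assumptions, not details from the comment):

```python
# Hypothetical MSE fitness for evolved BF programs: compare a program's
# output to the target string byte by byte. Lower is better; 0.0 is exact.
def mse_fitness(output: str, target: str) -> float:
    n = max(len(output), len(target), 1)
    # Pad the shorter string with NUL bytes so missing or extra output
    # is penalized instead of silently ignored.
    o = output.ljust(n, "\0")
    t = target.ljust(n, "\0")
    return sum((ord(a) - ord(b)) ** 2 for a, b in zip(o, t)) / n

TARGET = "ABCDEFGHIKJLMNOPQRSTUVWXYZ"  # the alphabet with J and K swapped
# A GP loop would mutate/crossover candidate BF tapes and keep the
# candidates whose interpreter output minimizes mse_fitness(output, TARGET).
```

    Because the score degrades smoothly per byte, it gives the evolutionary search a gradient to climb that a generate-whole-program LLM never sees.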

    • That was a very serious response to an off-the-cuff joke.

      BUT: Please, oh please, write up a blog entry! I bet that would be fun to read.