Comment by mschuster91
10 hours ago
> Wonder if it's feasible to reverse the old version using LLMs, vibecode it to run on modern platforms and then shoehorn in support for modern XLS format.
Oh no it won't. Photoshop PSD and the legacy Office file formats have one thing in common... they are raw dumps of the C in-memory structs representing the contents. That's how they save and load so fast [1], in contrast to the modern formats which are a bunch of XMLs in a ZIP in a trenchcoat. Unfortunately, that makes not just reverse engineering them a challenge, but reimplementing them too, because you have to reimplement Microsoft's original engines piece by piece, quirk by quirk.
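To make the "raw dump" point concrete, here is a minimal sketch in Python using ctypes. The `Cell` record and the `save`/`load` helpers are hypothetical, made up for illustration, and not taken from any real Office or PSD layout. Saving is just writing the struct's bytes, and loading is just copying them back:

```python
import ctypes
import io


class Cell(ctypes.Structure):
    # Hypothetical record layout, invented for this example only.
    _fields_ = [
        ("row", ctypes.c_uint16),
        ("col", ctypes.c_uint16),
        ("value", ctypes.c_double),
    ]


def save(cells, fh):
    """Write each struct's in-memory bytes (padding included) straight out."""
    for cell in cells:
        fh.write(bytes(cell))


def load(fh):
    """Read fixed-size chunks and reinterpret them as structs, no parsing."""
    size = ctypes.sizeof(Cell)
    cells = []
    while True:
        chunk = fh.read(size)
        if len(chunk) < size:
            break
        cells.append(Cell.from_buffer_copy(chunk))
    return cells


if __name__ == "__main__":
    buf = io.BytesIO()
    save([Cell(1, 2, 3.5), Cell(4, 5, -1.0)], buf)
    buf.seek(0)
    for c in load(buf):
        print(c.row, c.col, c.value)
```

There is no parsing step at all, which is why this is fast. But the file now bakes in the compiler's padding, the field order and the machine's byte order, so anyone rereading it has to match that layout exactly.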
And that's before wading into the mess that is OLE or, yes, the older people will shudder, ActiveX. Or the wonders that VBA macros could achieve, including just running stuff directly from kernel32.dll. I'm reasonably sure you could import the DirectX DLLs into an Office VBA macro and implement a full-blown 3D shooter engine with DirectX instead of Excel.
And that's also why conversion in either direction almost always carries loss potential: simply put, not every quirk of the legacy format has been carried over to the "new" XML storage format, and certainly not into OpenOffice XML.
[1] https://www.joelonsoftware.com/2008/02/19/why-are-the-micros...
I mean, if people are reverse engineering entire N64 games back into source code that builds with the original SGI compilers, then it is possible to reverse this code too. I don't think there is a drive to do so, though. That's where I hope some future LLM could help lower the barrier for people already well experienced in reversing.
> And that's also why conversion in either direction almost always carries loss potential: simply put, not every quirk of the legacy format has been carried over to the "new" XML storage format, and certainly not into OpenOffice XML.
Can modern Office reliably open the old formats? If so, they must have implemented the parsers correctly, no?