Comment by coulix

3 months ago

This.

Any complex parent table span cell relationship still has low accuracy.

Try the reverse, take a complex picture table and ask Chatgpt5, claude Opus 3.1, Gemini Pro 2.5 to produce a HTML table.

They will fail.

Maybe my imagination is limited or our documents aren't complex enough, but are we talking about realistic written documents? I'm sure you can take a screenshot of a very complex spreadsheet and it fails, but in that case you already have the data in structured form anyway, no?

  • Now if someone mails or faxes you that spreadsheet? You're screwed.

    Spreadsheets are not the biggest problem though, as they have a reliable 2-dimensional grid - at worst some cells will be combined. The form layouts and n-dimensional table structures you can find on medical and insurance documents are truly unhinged. I've seen documents that I struggled to interpret.

    • To be fair, this is problematic for humans too. My old insurer outright rejected things like that stating it's not legible.

      (I imagine it also had the benefit of reducing fraud/errors).

      In this day and age, it's probably easier/better to change the process around that as there's little excuse for such shit quality input. I understand this isn't always possible though.