Comment by magicalhippo
13 hours ago
> csv files. MS Excel can read some malformed csv files.
At work we have to parse CSV files which often have mixed encoding (Latin-1 with UTF-8 in random fields on random rows), occasionally have partial lines (remainder of line just missing) and other interesting errors.
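For illustration, here is a minimal Python sketch of one way to cope with the mixed-encoding case. It is not what we actually run. It assumes the delimiters and quotes are plain ASCII, and that any field which is not valid UTF-8 is Latin-1. The names decode_field and parse_mixed_csv and the expected_fields parameter are made up for the example. Short rows are flagged rather than repaired, and a Latin-1 field whose bytes happen to form valid UTF-8 would be misread.

```python
import csv
import io


def decode_field(raw: bytes) -> str:
    """Decode one field: try strict UTF-8 first, fall back to Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Latin-1.
        return raw.decode("latin-1")


def parse_mixed_csv(data: bytes, expected_fields: int):
    """Yield (fields, ok) per row; ok is False when the field count is off."""
    # Latin-1 maps every byte to exactly one character, so this decode never
    # fails and can be reversed per field after the CSV structure is parsed.
    text = data.decode("latin-1")
    for row in csv.reader(io.StringIO(text, newline="")):
        fields = [decode_field(f.encode("latin-1")) for f in row]
        yield fields, len(fields) == expected_fields


if __name__ == "__main__":
    sample = b"name,city\r\nJos\xe9,K\xc3\xb8benhavn\r\nbroken\r\n"
    for fields, ok in parse_mixed_csv(sample, expected_fields=2):
        print(ok, fields)
```

Decoding per field rather than per file or per line is the point here, since the encoding can change from one field to the next.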
We also have to parse fixed-width flat files where fields occasionally aren't fixed-width after all, with no discernible pattern. Customer can't fix the broken proprietary system that spits this out so we have to deal with it.
And of course, XML files with an encoding mismatch (because that header is just a fixed string that has no bearing on the rest of the content, right?) or even mixed encoding. That's just par for the course.
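A rough Python sketch of handling the mismatch case, under some loud assumptions: the declared encoding is simply not trusted, the real content is either UTF-8 or Latin-1, and there is no byte order mark. The function name parse_xml_ignoring_declared_encoding is invented for the example. It decodes the whole document in one go, so it does not help with mixed encoding inside a single file.

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading XML declaration such as <?xml version="1.0" encoding="..."?>
XML_DECL = re.compile(rb"^\s*<\?xml[^>]*\?>")


def parse_xml_ignoring_declared_encoding(data: bytes) -> ET.Element:
    """Parse XML bytes while ignoring whatever encoding the header claims."""
    # Drop the declaration so the parser never sees the untrusted encoding.
    body = XML_DECL.sub(b"", data, count=1)
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: content that is not valid UTF-8 is Latin-1.
        text = body.decode("latin-1")
    return ET.fromstring(text)


if __name__ == "__main__":
    # Header claims ISO-8859-1, but the content is actually UTF-8.
    doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><r><n>K\xc3\xb8benhavn</n></r>'
    print(parse_xml_ignoring_declared_encoding(doc).find("n").text)
```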
Just some examples of how fun parsing can be.