Comment by 3eb7988a1663

10 months ago

Maybe the headline should note that this is a parser vulnerability, not a flaw in the format itself. I suppose that is obvious, but my first knee-jerk thought was, "Am I going to have to re-encode XXX piles of data?"

What would it mean for the vulnerability to be in the format and not the parser?

  • I don't know. Something like a Python pickle file, where running code during parsing is unavoidable.

    On a second read, I realized a format problem was unlikely, but the headline just said, "Apache Parquet". My mind might have jumped to the same conclusion if it had said "safetensors" or "PNG".

  • That the data had to be encoded in a certain way that would lead to unavoidable exploitation in every conforming implementation. For example, PDF permits embedded JavaScript and… that has not gone well.
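
To make the pickle case mentioned above concrete, here is a minimal, harmless sketch of what a format-level problem looks like. The `Payload` class is a hypothetical name used only for illustration; the point is that the pickle stream itself names a callable to invoke, so any conforming loader will call it.

```python
import pickle


class Payload:
    """Hypothetical class, used only to build a demonstration pickle."""

    def __reduce__(self):
        # The pickle stream records "call eval with this argument".
        # A real attack would name something far less benign.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it runs the callable named in the stream.
result = pickle.loads(blob)
print(result)
```

No parser bug is involved here: the loader is doing exactly what the format asks, which is why the usual advice is to never unpickle untrusted data rather than to patch the parser.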