Comment by g-mork
12 days ago
When did vulnerability reports get so vague? Looks like a classic serialization bug
https://github.com/apache/parquet-java/compare/apache-parque...
12 days ago
When did vulnerability reports get so vague? Looks like a classic serialization bug
https://github.com/apache/parquet-java/compare/apache-parque...
Better link: https://github.com/apache/parquet-java/pull/3169
If by “classic” you mean “using a language-dependent deserialization mechanism that is wildly unsafe”, I suppose. The surprising part is that Parquet is a fairly modern format with a real schema that is nominally language-independent. How on Earth did Java class names end up in the file format? Why is the parser willing to parse them at all? At most (at least by default), the parser should treat them as predefined strings that have semantics completely independent of any actual Java class.
This seems to come from parquet-avro, which looks to attempt to embed Avro in Parquet files and in the course of doing so, does silly Java reflection gymnastics. I don’t think “normal” parquet is affected.
Last time I tried to use the official Apache Parquet Java library, parsing "normal" Parquet files depended on parquet-avro because the library used Avro's GenericRecord class to represent rows from Parquet files with arbitrary schemas. So this problem would presumably affect any kind of Parquet parsing, even if there is absolutely no Avro actually involved.
(Yes, this doesn't make sense; the official Parquet Java library had some of the worst code design I've had the misfortune to depend on.)
3 replies →
The documentation for all of this is atrocious.
But if avro-in-parquet is a weird optional feature, it should be off by default! Parquet’s metadata is primarily in Thrift, not Avro, and it seems to me that no Avro should be involved in decoding Parquet files unless explicitly requested.
2 replies →
Tangential, but there was a recent sandbox escape vulnerability in both Chrome and Firefox.
The bug threads are still private, almost two weeks since it was disclosed and fixed. Very strange.
https://bugzilla.mozilla.org/show_bug.cgi?id=1956398
https://issues.chromium.org/issues/405143032
https://www.cve.org/CVERecord?id=CVE-2025-2783
Standard operating procedure for both the Chrome [https://chromium.googlesource.com/chromium/src/+/HEAD/docs/s...] and Firefox [https://www.mozilla.org/en-US/about/governance/policies/secu...] bug tracking systems.
But the fix itself is public in both the Chrome [https://chromium.googlesource.com/chromium/src.git/+/36dbbf3...] and Firefox [https://github.com/mozilla/gecko-dev/commit/ac605820636c3b96...] source repos, and it makes pretty clear what the bug is.
Looks like this one only applied to windows. Here’s a link to the diff: https://chromium.googlesource.com/chromium/src.git/+/36dbbf3...
[dead]