Comment by kstenerud
6 hours ago
"a\u0000b" ("a" followed by a vertical tabulation control code) is also a perfectly valid and in-bounds BONJSON string. What BONJSON rejects is any invalid UTF-8 sequences, which shouldn't even be present in the data to begin with.
My example was a three-character string whose second character is \u0000, i.e. a NUL character in the middle of the string.
The spec on GitHub bans NUL on security grounds: after parsing, someone might call strlen on the result in C and accidentally truncate it to a shorter string.
I think that concern has some merit, but NUL is valid string content in JSON (and in UTF-8), so the spec is deliberately breaking 1:1 parity with JSON in the name of a security hypothetical.
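To make the truncation concern concrete, here is a minimal sketch using Python's standard json and ctypes modules. It says nothing about how any BONJSON implementation behaves; it only shows that a standard JSON parser accepts the escape, and that a C-style, NUL-terminated read of the same bytes stops early. The variable names are made up for illustration.

```python
import ctypes
import json

# A standard JSON parser accepts the \u0000 escape and yields three characters.
decoded = json.loads(r'"a\u0000b"')
encoded = decoded.encode("utf-8")  # three bytes: 0x61 0x00 0x62

# Copy the bytes into a C char buffer. Reading .value mimics what strlen-based
# C code sees: everything up to (not including) the first NUL byte.
buf = ctypes.create_string_buffer(encoded)
c_view = buf.value

print(len(decoded))  # 3
print(len(c_view))   # 1
```

Code that carries an explicit length alongside the bytes does not have this problem; only code that treats the decoded value as a NUL-terminated C string does.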
You're thinking of "a\u000b". "a\u0000b" is the three-character string also written "a\x00b".
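The difference between the two escapes is easy to check with any JSON parser. A quick sketch with Python's standard json module (variable names are illustrative only):

```python
import json

nul = json.loads(r'"a\u0000b"')  # 'a', NUL (U+0000), 'b'
vt = json.loads(r'"a\u000b"')    # 'a', vertical tab (U+000B)

print([hex(ord(c)) for c in nul])  # ['0x61', '0x0', '0x62']
print([hex(ord(c)) for c in vt])   # ['0x61', '0xb']
```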
Did you read "Parsing JSON is a minefield"?