Comment by ghoul2
12 years ago
There is one significant problem I see:
the length field for compound types (arrays and maps) specifies the length in "the number of items", not in bytes. This means that if I need to skip a compound type while processing, I actually have to process it in its entirety. Not very "small device" friendly.
In practice, I have found far more utility in knowing the byte-length of a compound field in advance than the number of items it contains. If I am interested in the field, I am going to find out the number of items anyway, because I am going to process it. If I am not interested in the field, the number of items is useless to me, but the byte-length would have come in handy.
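To illustrate the point, here is a minimal decoder sketch in Python (the names and structure are mine, not from the spec or any particular library; indefinite-length items are left out): skipping a string is a constant-time jump, while skipping an array or map means recursively decoding everything inside it.

```python
def skip_item(buf, pos):
    """Return the offset just past the CBOR data item starting at pos.

    Byte and text strings can be skipped in constant time because their
    length argument is a byte count, but arrays and maps only carry an
    item count, so every nested element has to be walked recursively.
    """
    initial = buf[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1

    # Decode the argument that follows the initial byte (a value, a byte
    # length, or an item count, depending on the major type).
    if info < 24:
        arg = info
    elif 24 <= info <= 27:
        width = 1 << (info - 24)
        arg = int.from_bytes(buf[pos:pos + width], "big")
        pos += width
    else:
        raise NotImplementedError("indefinite length / reserved values")

    if major in (0, 1):                 # integers: arg is the value itself
        return pos
    if major in (2, 3):                 # byte/text string: arg is a byte count
        return pos + arg                # constant-time skip
    if major == 4:                      # array: arg is an item count
        for _ in range(arg):
            pos = skip_item(buf, pos)   # must recurse into every element
        return pos
    if major == 5:                      # map: arg is a pair count
        for _ in range(2 * arg):
            pos = skip_item(buf, pos)   # keys and values alike
        return pos
    if major == 6:                      # tag: exactly one enclosed item
        return skip_item(buf, pos)
    return pos                          # major type 7: simple values / floats
```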
I think the thinking here is that the sender may not be able to compute the byte size of the object a priori. Think HTTP chunked encoding.
I understand that is a concern in many situations. The problem here, though, is that you don't get the "streaming" benefits anyway: you still have to include the length-in-number-of-items of the compound type and the byte lengths of each individual member item.
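To make that concrete, a small encoder sketch in Python (helper names are mine) for a definite-length array of text strings: the array header already needs the item count, and every element already carries its own byte length, so the only thing the sender is spared is the total byte size of the array.

```python
def encode_header(major, length):
    """Encode a CBOR initial byte plus length argument (lengths < 2**32)."""
    if length < 24:
        return bytes([(major << 5) | length])
    if length < 0x100:
        return bytes([(major << 5) | 24, length])
    if length < 0x10000:
        return bytes([(major << 5) | 25]) + length.to_bytes(2, "big")
    return bytes([(major << 5) | 26]) + length.to_bytes(4, "big")

def encode_string_array(strings):
    out = encode_header(4, len(strings))        # item count needed up front
    for s in strings:
        data = s.encode("utf-8")
        out += encode_header(3, len(data))      # byte length of each element
        out += data
    return out

print(encode_string_array(["hi", "there"]).hex())  # 82626869657468657265
```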
I'm a bit concerned about the "indefinite length" stuff for arrays, buffers, and strings.
That seems like something that's going to come back and byte us.
Because we can store enough things to fill memory?
Name me a particular time where a file format has ever had increased robustness, speed, security, or implementation simplicity because it did not specify record size ahead of time.
Please convince me of the reliability of sentinel values and null-terminated strings.
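For reference, this is what the objection is about: an indefinite-length array carries no count or byte length at all, and the decoder just reads until it hits the 0xFF "break" byte, i.e. a sentinel. A toy Python sketch (it only handles the tiny integers 0-23) of both encodings of [1, 2, 3]:

```python
# The same array [1, 2, 3] in definite- and indefinite-length form.
definite   = bytes([0x83, 0x01, 0x02, 0x03])        # 0x83 = array of 3 items
indefinite = bytes([0x9F, 0x01, 0x02, 0x03, 0xFF])  # 0x9F = array, no length; 0xFF = break

def read_indefinite_array(buf, pos):
    """Collect items until the 0xFF break byte (toy: ints 0-23 only).

    The decoder cannot preallocate or bound the result: it keeps reading
    until the sentinel shows up, or until it runs out of input or memory.
    """
    assert buf[pos] == 0x9F
    pos += 1
    items = []
    while buf[pos] != 0xFF:          # scan for the break sentinel
        items.append(buf[pos])       # values 0-23 encode as a single byte
        pos += 1
    return items, pos + 1

print(read_indefinite_array(indefinite, 0))   # ([1, 2, 3], 5)
```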
I agree. I think CBOR trades off a bit too much efficiency of in-place data access for compactness of representation.
Isn't the whole point of binary serialization formats efficiency and ease of parsing? Otherwise you might as well use .json.gz and probably end up with smaller files anyway.
You are very right. I was actually not arguing in favor of more complicated compression, but instead for more efficient access to data. Certainly, iterating through array and map items to get to the next element is not efficient.
Well, just write the file type to deduce the size, then.
I'm not sure I understand the problem you describe, really.
Even if there are strings, just encode their lengths; or, if you store a compound type, write the size when the size can vary.