Comment by axblount

2 days ago

This is usually called NaN-boxing and is often used to implement dynamic languages.

https://piotrduperas.com/posts/nan-boxing

I wonder if the IEEE-754 designers anticipated this use case during development. Great article - this kind of "outside-the-normal-use-case" technique requires a very careful reading of the specifications/guarantees of the language.

  • IEEE-754 seems to say that NaN payloads are designed to contain "retrospective diagnostic information inherited from invalid or unavailable data and results". While NaN boxing in particular probably wasn't the intention, untouched payloads in general absolutely were.
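The payload trick described above can be sketched in a few lines. This is a minimal, hypothetical layout (the names `box`/`unbox` and the 48-bit payload width are illustrative, not from any particular engine): set the exponent bits and the quiet bit to form a quiet NaN, then stash arbitrary data in the low mantissa bits.

```python
import struct

# Hypothetical NaN-boxing sketch: a quiet NaN carries a 48-bit payload.
QNAN = 0x7FF8_0000_0000_0000    # exponent all ones + quiet bit (bit 51)
PAYLOAD_MASK = (1 << 48) - 1    # low 48 bits are free for boxed data

def box(value: int) -> float:
    """Pack a 48-bit integer into the mantissa of a quiet NaN."""
    assert 0 <= value <= PAYLOAD_MASK
    bits = QNAN | value
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def unbox(f: float) -> int:
    """Recover the payload by reinterpreting the double's raw bits."""
    bits = struct.unpack("<Q", struct.pack("<d", f))[0]
    return bits & PAYLOAD_MASK

boxed = box(0xDEAD_BEEF)
assert boxed != boxed            # still a genuine NaN: NaN != NaN
assert unbox(boxed) == 0xDEAD_BEEF
```

Real engines layer a type tag on top of this (e.g. distinguishing pointers from integers via the upper payload bits), but the bit-level mechanism is the same.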

Are there any security implications of NaN-Boxing?

If I encode data into the payload of these NaN values, propagate them around the internet, and decode them elsewhere... is that a security risk? Or does this just fall into the category of "weird encryption"?

  • Define security risk. Obviously this data could come from untrusted sources. Another consideration is that this data may not serialize correctly: most text serialization protocols will not distinguish different NaN values. It's possible for there to be some data confusion in your pipeline as well, if you don't control all the code that touches those bytes.

  • I'd be surprised. It's surprisingly difficult to serialize NaN values. Can't do it in JSON, for example.

Like JavaScript. Which should immediately raise the question of how it could possibly work on engines that employ it themselves. Turns out... it won't.
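The serialization problem mentioned above is easy to demonstrate. A quick sketch (the `0x...1234` payload is an arbitrary example value): Python's `json` module will emit the non-standard token `NaN` for any NaN, and parsing it back yields a generic NaN, so whatever payload was boxed in is lost.

```python
import json
import math
import struct

def bits(f: float) -> int:
    """Raw IEEE-754 bit pattern of a double."""
    return struct.unpack("<Q", struct.pack("<d", f))[0]

# A NaN carrying an arbitrary example payload in its mantissa.
payload_nan = struct.unpack("<d", struct.pack("<Q", 0x7FF8_0000_0000_1234))[0]

# json.dumps emits the non-standard token "NaN" (strict JSON,
# via allow_nan=False, would refuse to serialize it at all).
text = json.dumps(payload_nan)
assert text == "NaN"

# Parsing it back gives *a* NaN, but the payload bits are gone.
round_tripped = json.loads(text)
assert math.isnan(round_tripped)
assert bits(round_tripped) != 0x7FF8_0000_0000_1234
```

Any text format that canonicalizes NaN to a single token has the same problem: the distinction between NaN bit patterns only exists at the binary level.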

I appreciate this article a lot more because it contains an iota of benchmarking with an explanation about why this might be more performant. Especially since my first thought was 'wouldn't this require more instructions to be executed?'

The original post seems really weird to me. I would have dismissed it as someone's hobby project, but... that doesn't seem like what it's trying to be.

  • "More instructions to execute" is not synonymous with "slower".

    NaN-boxing lets you use less memory in certain use cases. Because memory access is slow and caches are a fixed size, this is usually a performance win, even if you have to do a few extra ops on every access.

    • I mean, a modern computer is operating in the gigahertz range. Adding a few extra bitwise instructions might cost something like a nanosecond. Which is absolutely fleeting compared to memory operations.
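The memory argument in this thread can be made concrete. A naive tagged representation (a one-byte tag next to an eight-byte payload) gets padded out by alignment to 16 bytes on typical 64-bit platforms, while a NaN-boxed value fits in a single 8-byte double. A sketch, assuming standard C alignment rules (the `TaggedValue` layout is illustrative):

```python
import ctypes

class TaggedValue(ctypes.Structure):
    """Naive tagged union: the 1-byte tag is padded to the
    8-byte alignment of the payload, doubling the size."""
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("payload", ctypes.c_uint64),
    ]

# On typical 64-bit platforms: 16 bytes tagged vs 8 bytes NaN-boxed.
assert ctypes.sizeof(TaggedValue) == 16
assert ctypes.sizeof(ctypes.c_double) == 8
```

Halving the size of every dynamic value means twice as many fit in each cache line, which is where the win usually comes from.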