Comment by geokon
12 hours ago
a few questions for the pros
> "The defining trait: no identity"
I get that this makes objects behave like primitive types. Maybe thats reason enough. But is it necessary for the performance boost and de-fluffing the objects? Seems like an orthogonal objective
> There’s a catch worth knowing about here, though: flattened data has to be readable and writable atomically (otherwise it risks “tearing” under concurrent access).
Isn't this a race condition and "undefined bahvior"..? Having to limit yourself to atomic sizes seems like a huge limitation, to accomodate what is most likely buggy code. Is all the effort only gunna help lil toy ColorRGB examples?
> The points array is a million pointers. Each pointer leads to a separate Point object lying somewhere on the heap.
Does this happen in actuality? One would assume the allocator tries to put stuff sequentially on the heap? Its not a guarantee as with these Value Types, but I'd think you could get similar-ish perf with prefetching in cache. I dunno whats happening under the hood.. But when writing Clojure apps the JVM always reserves absurd amounts of heapspace on my machine (to my annoyance). Id assume it can find some place to do contiguous allocations..
Which i guess gets me to my last question... where are the benchmarks broski? It all sounds great, but does it actually yield the insane speedups promised?
Great article, well written. But a benchmark would have been a nice "punchline"
> is it necessary for the performance boost and de-fluffing the objects?
Yes. The one part of the JVM GC that can't run concurrently is heap compaction; objects that can be moved by copying and then deleting would be a huge help for that. And it would be awkward to say the object has an identity but can't be wait/notify'd, at which point you need somewhere for the monitor to go.
> Does this happen in actuality? One would assume the allocator tries to put stuff sequentially on the heap?
Yes. Of course it tries, but semantically the pointers are just pointers and the prefetcher can guess but the system still has to chase them.
> I get that this makes objects behave like primitive types. Maybe thats reason enough. But is it necessary for the performance boost and de-fluffing the objects? Seems like an orthogonal objective
It feels like an orthogonal objective and honestly arbitrary distinction, yes.
> Isn't this a race condition and "undefined bahvior"..? Having to limit yourself to atomic sizes seems like a huge limitation, to accomodate what is most likely buggy code.
I think they meant it like the appearance of atomic behavior from a java multithreading view.
> Does this happen in actuality?
Yes, it does happen. Having guarantees on this front leads to better performance.
> But when writing Clojure apps the JVM always reserves absurd amounts of heapspace on my machine (to my annoyance)
Might be a configuration problem?
> Is all the effort only gunna help lil toy ColorRGB examples
Arguably flattening mostly makes sense for these only.
And yeah, you are right that allocations happen on something called a thread local allocation buffer, which is basically just a pointer bump in cost and objects allocated one after the other should be physically close in memory for the most part (though an object's creation may require a bunch of other object's creation that would sit in-between). But these have headers, so not as dense as they could be (though due to GCs being generational, they may end up actually closer in the next gen? The in-between temporary objects wouldn't survive for the most part)
There are plenty of cases where flattening an object that takes 64bits would make sense.
The current code will help with `Integer[]`, `Char []`, etc, as well as combinations of `byte`, `char`, and `int`. Past that it doesn't really help much.
It would be fantastic if we could also flatten something like `Pair` or `Tuple`. However, even with compressed pointers, that is 64 bits, so that, plus the `null` bit, means it can't be flattened, which is a real shame. For various reasons, I have `List<Long>` in numerous places in my code, It would be great if that could also be flattened. However, since a Long is 64 bits, it _also_ can't be flattened. https://openjdk.org/jeps/8316779 would go a long way to to helping here, since then at least the null bit could be thrown away, which would allow more things to be flattened.
And then, if you want to go Wishlist land, something that would allow SSO (Small String Optimisation) would also be awesome, but that would require something akin to unions in Java, which we can _kind_ of do with sealed classes, but, since String is a final class, can't be retrofitted back into the language.
Does anyone know if Valhalla will flatten "simple" sealed classes, where every sealed class is small enough to be flattened? Since that would also be a powerful example to share.
Is there some reason there isn't simply a write-lock/semaphore on Value Types that are over 64bits? The overhead should beat pointer-chasing. I mean maybe someone wants to concurrently write to values from different threads with no coordination, but that's not super common. As you illustrate, having "fat" Value Types would open up a lot of potential.
In the current setup will a Pair Value Type be a compiler error, or will it silently just have bad perf?