Comment by akoboldfrying

17 hours ago

Thanks, I think I get it now. The hash value would be a pure function of the method's signature (argument types and return type) and its name, so that two interfaces with a same-name, same-signature method would hash to the same value and thus invoke the same underlying method; the constraints would be that, after modulo, different methods must map to different indices; and the objective function to minimise would be the vtable size (which I think would be common across all classes).

But maybe I don't get it, since this would require knowledge of all interfaces, and as soon as you require that, it's straightforward to build a minimal-size mapping from method name+signature to integer index: e.g., just form the union of all method declarations appearing in any interface, sort them lexicographically, and use a method's position in this sorted list as its index. Lookups in this map are only ever done at compile time so there's no runtime inefficiency to worry about.

1 comment

akoboldfrying

hinkley 3 hours ago

At the time that article was published in SIGPLAN, the dominant language was Java, which is officially statically, strongly typed.

Although really it isn’t because the JVM is strongly typed but evaluates some things at load or first invocation so it allows some languages that run in the JVM to be a bit tricky. They first generics implementation on the JVM, called Pizza, leveraged this load time concretization to do its thing.

But if you have a language that can resolve the type system at link time then you can do this trick. Alternatively you could switch to cuckoo hashing and if you next module load starts causing collisions, then so be it.