Comment by mattip
3 years ago
> it's probably all about string index into dict and dict management
Cool. Is the performance here something you would like to pursue? If so, could you open an issue [0] with some kind of reproducer?
3 years ago
I'm thinking about how to demonstrate the problem. I have a large pickle, but pickle load/dump times across gc.disable()/gc.enable() really don't say much.
I need to find out how to instrument the seek/add cost of threads against the shared dict under a lock.
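One way to get at that is to wrap the shared dict so every add records how long it waited for the lock and how long it held it. This is only a rough sketch: `TimedCounter`, `worker`, `run` and the `prefix/N` keys are made-up names for illustration, not the real code, and the timer calls add overhead of their own.

```python
import threading
import time


class TimedCounter:
    """Shared dict of integer counts guarded by one lock (hypothetical helper).

    Records total time spent waiting for the lock and total time spent
    doing the dict lookup/update while holding it.
    """

    def __init__(self):
        self.counts = {}
        self.lock = threading.Lock()
        self.wait_ns = 0   # time threads spent blocked on the lock
        self.hold_ns = 0   # time spent in the dict seek/add itself
        self.ops = 0

    def add(self, key, n=1):
        t0 = time.perf_counter_ns()
        with self.lock:
            t1 = time.perf_counter_ns()
            self.counts[key] = self.counts.get(key, 0) + n
            t2 = time.perf_counter_ns()
            # Stats are updated under the lock so they stay consistent.
            self.wait_ns += t1 - t0
            self.hold_ns += t2 - t1
            self.ops += 1


def worker(counter, keys):
    for key in keys:
        counter.add(key)


def run(num_threads=4, keys_per_thread=10_000):
    """Hammer one TimedCounter from several threads and return it."""
    counter = TimedCounter()
    # Assumed workload: 500 distinct string keys, repeated.
    keys = [f"prefix/{i % 500}" for i in range(keys_per_thread)]
    threads = [
        threading.Thread(target=worker, args=(counter, keys))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    c = run()
    print("ops:", c.ops)
    print("avg wait ns:", c.wait_ns / c.ops)
    print("avg hold ns:", c.hold_ns / c.ops)
```

Comparing average wait against average hold should give a first hint whether the cost is lock contention or the dict work itself.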
My gut feel is that if I inlined things instead of calling out to functions I'd probably shave a bit more off too. So saying "slower than expected" may be unfair, because there are limits to how much you can speed this kind of thing up. That's why I wondered if alternate data structures were a better fit.
It's variable-length string indexes into lists/dicts of integer counts. The advantage of a radix trie would be finding the record in time roughly proportional to the length in bits of the string rather than to the number of keys, and the keys do form prefix sets.
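To make the idea concrete, here is a minimal sketch of a trie holding integer counts. It is a plain per-character trie rather than a path-compressed radix trie working on bits, and `CountTrie` and its methods are hypothetical names. In pure Python this is likely slower than a dict for exact lookups; what it buys is cheap queries over a shared prefix.

```python
class TrieNode:
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.count = 0      # count stored for the key ending here


class CountTrie:
    """Per-character trie mapping string keys to integer counts."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, key, n=1):
        """Add n to the count for key, creating nodes as needed."""
        node = self.root
        for ch in key:
            nxt = node.children.get(ch)
            if nxt is None:
                nxt = node.children[ch] = TrieNode()
            node = nxt
        node.count += n

    def _find(self, prefix):
        """Return the node reached by prefix, or None if absent."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def get(self, key):
        """Count for exactly this key (0 if never added)."""
        node = self._find(key)
        return node.count if node is not None else 0

    def prefix_total(self, prefix):
        """Sum of counts over every key that starts with prefix."""
        node = self._find(prefix)
        if node is None:
            return 0
        total = 0
        stack = [node]
        while stack:
            cur = stack.pop()
            total += cur.count
            stack.extend(cur.children.values())
        return total
```

Usage would look like `t.add("10.0.0.1")` followed by `t.prefix_total("10.0.0.")`, where the lookup cost depends on the key length and not on how many keys are stored.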
Would love to hear more. You can reach us with any of these methods https://www.pypy.org/contact.html