Comment by ggm
3 years ago
I'm thinking about how to demonstrate the problem. I have a large pickle but pickle load/dump times across gc.disable()/gc.enable() really doesn't say much.
I need to find out how to instrument the seek/add cost of threads against the shared dict under a lock.
My gut feel is that probably if I inlined things instead of calling out to functions I'd shave a bit more too. So saying "slower than expected" may be unfair because there's limits to how much you can speed this kind of thing up. Thats why I wondered if alternate datastructures were a better fit.
its variable length string indexes into lists/dicts of integer counts. The advantage of a radix trie would be finding the record in semi constant time to the length in bits of the strings, and they do form prefix sets.
Would love to hear more. You can reach us with any of these methods https://www.pypy.org/contact.html