Comment by tyingq
5 years ago
The JSON patch removed more of the elapsed time. Granted, it was a terrible parser. But I still think JSON is a poor choice here. 63k x X checks for colons, balanced quotes/braces and so on just aren't needed.
Time with only duplication check patch: 4m 30s
Time with only JSON parser patch: 2m 50s
> But I still think JSON is a poor choice here.
It’s an irrelevant one. The JSON parser from the Python stdlib parses a 10 MB JSON document patterned after the sample in a few dozen ms. And it’s hardly a fast parser.
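For anyone who wants to check that kind of figure themselves, here is a rough sketch of how one might time the stdlib parser on a payload of about 10 MB. The record shape is made up for illustration (the sample from the thread isn't reproduced here), and `build_payload` and `time_parse` are hypothetical helper names, so treat the result as a ballpark for your own machine rather than a reproduction of the numbers above.

```python
import json
import time


def build_payload(target_bytes: int = 10 * 1024 * 1024) -> str:
    """Build a JSON array string of roughly target_bytes.

    The record layout is an assumption, not the sample from the thread.
    """
    record = {"name": "example-package", "version": "1.0.0", "files": ["a.py", "b.py"]}
    # json.dumps separates list items with ", ", hence the +2 per record.
    record_size = len(json.dumps(record)) + 2
    count = target_bytes // record_size
    return json.dumps([record] * count)


def time_parse(payload: str) -> tuple[float, int]:
    """Return (seconds spent in json.loads, number of records parsed)."""
    start = time.perf_counter()
    parsed = json.loads(payload)
    elapsed = time.perf_counter() - start
    return elapsed, len(parsed)


if __name__ == "__main__":
    payload = build_payload()
    elapsed, n = time_parse(payload)
    print(f"{len(payload) / 1e6:.1f} MB, {n} records, parsed in {elapsed * 1000:.0f} ms")
```

A single run is noisy; repeating the measurement a few times and taking the minimum gives a steadier number.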