Show HN: FlashTokenizer – 10x faster C++ tokenizer for Python
2 days ago (github.com)
I built a tokenizer in C++ with a Python binding that outperforms HuggingFace tokenizers by 10x on large inputs. It's optimized for minimal memory usage and latency.
Benchmarks and a comparison are included in the README. Would love feedback or contributions!