Comment by mwsherman
4 days ago
Shameless plug, but you may wish to do Lucene-style tokenizing based on the Unicode standard (UAX #29): https://github.com/clipperhouse/uax29/tree/master/words
Got to admit, on first impression this is pretty neat. I'll spend some time with it. Thanks for the link :)