Comment by hcarvalhoalves
13 years ago
Sounds interesting, but theoretical. Any practical applications? I'm guessing it has some purpose on dealing with long string sequences, like genome sequencing?
13 years ago
Sounds interesting, but theoretical. Any practical applications? I'm guessing it has some purpose on dealing with long string sequences, like genome sequencing?
From what I understand, it is useful for storing compressed data and querying it without decompression.
Range queries allow for full-text search, for example.
Something like that + data compression. If you have some data type mapped to a sequence of symbols, WT can speed up search in it. Please also note it is not a complete full text search index (like ones used in genome sequencing). It is a rank/select dictionary.
I believe he's doing full text search on this by running Burrows-Wheeler on the data first. See here for more details:
I started reimplementing all of this in Go about a year ago, but moved onto other stuff before I finished.
Sure, that is what I meant. Some additional structures (BWT, ...) are required for a complete FTS. WT allows only 1-symbol searches that might be enough in practical applications (databases?).
I also have an implementation of dynamic LOUDS-based multiary WT (it is much faster on large alphabets), but it haven't been fully finished yet.
http://proceedings.spiedigitallibrary.org/mobile/proceeding....