Comment by saintfiends
5 days ago
In my experience so far, Meilisearch is really good for a corpus that rarely changes. If the documents change frequently and you need those changes to show up in search results fairly quickly, it ends up with tasks pending for hours.
I don't have a good solution for this use case other than maybe just the good old RDBMS. I'm open to suggestions, or any way to tweak Meilisearch for documents that get updated every few seconds. We have about 7 million documents of about 5 KB each. What kind of instance do I need to handle this?
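For sizing, the raw payload implied by those numbers can be worked out quickly. This is only back-of-the-envelope arithmetic: it assumes "5kb" means 5,000 bytes per document, and it counts the documents alone, not whatever the search engine builds on top of them. The names below are made up for illustration.

```python
# Rough size of the raw corpus described above.
DOC_COUNT = 7_000_000
DOC_SIZE_BYTES = 5 * 1000  # assumption: "5kb" read as 5,000 bytes


def raw_corpus_bytes(doc_count: int, doc_size_bytes: int) -> int:
    """Total bytes of raw documents, excluding any index structures."""
    return doc_count * doc_size_bytes


total = raw_corpus_bytes(DOC_COUNT, DOC_SIZE_BYTES)
print(f"{total / 1e9:.1f} GB, {total / 2**30:.1f} GiB")  # 35.0 GB, 32.6 GiB
```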
The best you could do is put Meilisearch on a very good NVMe. I am indexing large streams of content (Bsky posts + likes), and I have tested Meilisearch on both a not-so-good NVMe and a slow HDD, and oh boy, the SSD is so much faster.
I am sending hundreds of thousands of messages and changes (of the likes count) into Meilisearch, and so far, so good. It's been a month, and everything is working fine. We also shipped the new batches/ stats showing a lot of internal information about indexing step timings [1] to help us prioritize.
[1]: https://github.com/meilisearch/meilisearch/pull/5356#issue-2...
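For anyone who wants to try the "stream of small changes" workload described above, here is a minimal sketch of one possible client-side approach: collapse repeated updates to the same document and send fewer, larger payloads. It is not the setup used in the comment above. The helper names (`coalesce`, `chunks`, `send_batch`), the `id` primary key, and the `likes` field are all hypothetical. The only Meilisearch-specific part is the `PUT /indexes/{index_uid}/documents` route, which adds or partially updates documents. Whether this actually shortens the pending-task backlog on a given instance would have to be measured.

```python
import json
import urllib.request


def coalesce(updates):
    """Merge partial updates per document id; later fields win."""
    latest = {}
    for doc in updates:
        latest.setdefault(doc["id"], {}).update(doc)  # assumes primary key "id"
    return list(latest.values())


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_batch(base_url, index_uid, api_key, docs):
    """Send one add-or-update payload; returns the enqueued task summary."""
    req = urllib.request.Request(
        f"{base_url}/indexes/{index_uid}/documents",
        data=json.dumps(docs).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="PUT",  # PUT = add or update, POST = add or replace
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example: three incoming changes collapse into two documents.
pending = [
    {"id": 1, "likes": 3},
    {"id": 2, "likes": 1},
    {"id": 1, "likes": 5},
]
merged = coalesce(pending)
batches = list(chunks(merged, 1000))
```

Each batch would then go out with something like `send_batch("http://localhost:7700", "posts", "MASTER_KEY", batch)`, where the URL, index name, and key are placeholders.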
You have 35 GiB of data. Put it in memory and forget about NVMes and HDDs.
35 GiB is probably a third of the data I index into Meilisearch just for experimenting, and don't forget about the inverted indexes: you wouldn't use an O(n) algorithm to search your documents.
Also, every time you need to reboot the engine you would have to reindex everything from scratch. Not a good strategy, believe me.
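To illustrate the inverted-index point, here is a toy sketch of the idea: instead of scanning every document per query, you keep a map from each token to the set of documents containing it. It is a generic illustration and says nothing about how Meilisearch lays out its indexes internally. The function names and sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())  # intersect posting sets
    return result


docs = {
    1: "fast nvme disk",
    2: "slow hdd disk",
    3: "fast search engine",
}
index = build_inverted_index(docs)
print(search(index, "fast disk"))  # {1}
```

The lookup only touches the posting sets for the query tokens rather than all documents, but the index itself is extra data that has to live somewhere alongside the raw documents.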