Comment by citrin_ru
2 years ago
SQLite doesn't look like a good fit for large logs - nothing beats a linear write, at least on HDD, and with plain-text logs that's exactly what you get (a linear write of compressed data is even better, but few tools support that out of the box). With SQLite I would expect more write requests for the same stream of logs (maybe not much more). Reads and analysis will be faster than grep over a plain-text log only if you create indices, which add write cost (and space overhead).
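A rough sketch of the write-cost point in Python (the file paths and schema are made up, and absolute timings will vary a lot with the disk, page size and PRAGMA settings):

    import os, sqlite3, time

    rows = [(f"2024-01-01T00:00:{i % 60:02d}", "web1", f"GET /item/{i}")
            for i in range(100_000)]

    # Plain text: a pure linear append.
    t0 = time.time()
    with open("/tmp/app.log", "w") as f:
        for ts, host, msg in rows:
            f.write(f"{ts} {host} {msg}\n")
    print("plain text append:", time.time() - t0)

    # SQLite: B-tree page writes plus index maintenance on every insert.
    if os.path.exists("/tmp/app.db"):
        os.remove("/tmp/app.db")
    db = sqlite3.connect("/tmp/app.db")
    db.execute("CREATE TABLE logs (ts TEXT, host TEXT, msg TEXT)")
    db.execute("CREATE INDEX idx_ts ON logs (ts)")
    db.execute("CREATE INDEX idx_host ON logs (host)")
    t0 = time.time()
    db.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
    db.commit()
    print("sqlite with indexes:", time.time() - t0)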
ClickHouse works really well when you need to store and analyze large logs, but compared to SQLite it requires maintaining a server (or several). There is DuckDB, which is embedded like SQLite and could be a better fit for logs than SQLite, but I have no experience with DuckDB.
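Purely as an illustration of why an embedded engine is attractive here, a minimal DuckDB sketch (again, I haven't used it, so treat the details as an assumption; the file path and column names are invented):

    import duckdb  # embedded like SQLite, no server to maintain

    con = duckdb.connect()  # in-memory; pass a path for a persistent DB
    # Query the plain-text log from the previous sketch in place,
    # without a separate load step.
    print(con.sql("""
        SELECT host, count(*) AS hits
        FROM read_csv_auto('/tmp/app.log', delim=' ',
                           names=['ts', 'host', 'msg'])
        GROUP BY host
        ORDER BY hits DESC
    """).fetchall())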
SQLite is meant for transactional data, DuckDB for analytical data.
I am not sure which one would be better for logs; I would need to play around with them. But I am not sure SQLite wouldn't be a better fit.
Logs can differ, but most of the time they are closer to analytics than to transaction processing. Also, the way to get a fast SELECT in SQLite (and most traditional relational databases) is to use indexes, and indexes IMHO don't work well with logs - you may need to use any column in a WHERE condition, and indexing every column requires a lot of space and reduces write speed. Additionally, many columns in logs have low cardinality, and an index on them doesn't significantly reduce query time compared to a full scan.

So with logs one often has to run full-scan queries, and full scans are where columnar storage works much better than row-oriented storage. On top of that, columnar storage gives a better compression ratio, which saves space (often important for logs) and increases read speed.
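To illustrate the low-cardinality point with a made-up schema: an index on a column like a log level, where a large share of rows match, buys little over a full scan (the planner may even ignore it), while still costing space and insert speed:

    import sqlite3, time

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE logs (ts TEXT, level TEXT, msg TEXT)")
    db.executemany("INSERT INTO logs VALUES (?, ?, ?)",
                   ((f"t{i}", "ERROR" if i % 4 == 0 else "INFO", f"msg {i}")
                    for i in range(500_000)))

    # Touch a non-indexed column so the index can't answer the query alone.
    q = "SELECT sum(length(msg)) FROM logs WHERE level = 'ERROR'"

    t0 = time.time()
    db.execute(q).fetchone()  # full table scan
    print("full scan:", time.time() - t0)

    db.execute("CREATE INDEX idx_level ON logs (level)")
    t0 = time.time()
    # ~25% of rows match: the index yields rowids, but each one still
    # needs a lookup back into the main table.
    db.execute(q).fetchone()
    print("with index:", time.time() - t0)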