Comment by _xnmw
2 years ago
This is precisely why I'm considering appending to a sqlite DB in WAL2 mode instead of plain text log files. Almost no performance penalty for writes but huge advantages for reading/analysis. No more Grafana needed.
2 years ago
This is precisely why I'm considering appending to a sqlite DB in WAL2 mode instead of plain text log files. Almost no performance penalty for writes but huge advantages for reading/analysis. No more Grafana needed.
Careful, some people will be along any second pointing out your approach limits your ability to use "grep" and "cat" on your log after recovering it to your pdp-11 running in your basement. Also something about the "Unix philosophy" :p
Seriously though, I think this is a great idea, and would be interested in how easy it is to write sqlite output adaptors for the various logging libraries out there.
> some people will be along any second pointing out your approach limits your ability to use "grep" and "cat" on your log
And they won’t be wrong.
Unix philosophy still applies
sqlite3 logs.db "select log from logs" | grep whatever
3 replies →
How would you tail or watch a sqlite log? (on a pdp-11 if necessary :)
Write a program using
https://www.sqlite.org/c3ref/update_hook.html
On a PDP-11, run this program via telnet, rsh, or rexec.
If you're more ambitious, porting SQLite to 2.11BSD would be a fun exercise.
3 replies →
`watch -n 5 sqlite3 logs.db "select log from logs" | grep whatever`
1 reply →
> Careful, some people will be along any second pointing out your approach limits your ability to use "grep" and "cat" on your log after recovering it
I wish Splunk and friends would have an interface like that. Sure it does basic grep, and it is a much more powerful language, but sometimes you just needed some command line magic to find what you wanted.
I've learned so much about Splunk this month. I hate it. The UX is hot garbage. Why are settings scattered everywhere? Why does a simple word search not return any results? Why is there no obvious way to confirm data is being forwarded; like actual packets, not just what connections are configured.
Is there a way to mount the sqlite tables as a filesystem?
I've been doing this for years. I keep SQLite log databases at varying grains depending on the solution. One for global logs, one per user/session/workflow, etc. I've also done things like in-memory SQLite trace databases that only get written to disk if an exception occurs.
I didn’t know what WAL/WAL2 mode was, so I looked it up. For anyone else interested: https://www.sqlite.org/wal.html
Some people have thought of this before. Here is one implementation
https://git.sr.ht/~martijnbraam/logbookd
Although I'm not sure it uses WAL2 mode, but that should be a trivial change.
There’s a tool called lnav that will parse logfiles into a temporary SQLite database and allows to analyse them using SQL features:
https://lnav.org/
It could probably work. For a peculiar application I even used sqlite to record key frame-only video. (There was a reason)
One could flip it around and store logs in a multimedia container, but then you won't have nice indices like with sqlite, just the one big time index
Are you using the wal2 branch of SQLite?
SQLite doesn't look like a good fit for large logs - nothing can beat liner write at least on HDD and with plain text logs you will have it (though linear write of compressed data even better but rare software supports it out of the box). With SQLite I would expect more write requests for the same stream of logs (may be not much more). Reading analysis will be faster than using grep over plain text log only if you'll create indices which add write cost (and space overhead).
ClikcHouse works really well when you need to store and analyze large logs but compare to SQLite it would require to maintain a server(s). There is DuckDB which is embedded like SQLite and it could be a better than SQLite fit for logs but I have no experience with DuckDB.
SQLite is meant for transactional data, DuckDB for analytical data.
I am not sure which one would be better for logs, I would need to play around with it. But i am not sure if SQLite wouldn’t be a better fit.
Logs can be different but most of the time it is close to analytics than to transaction processing. Also the way to get a fast SELECT in SQLite (and most traditional relational databases) is to use indexes and indexes IMHO don't work well with logs - you may need to use any column in a WHERE condition and having all columns indexed requires a lot of space and reduces write speed. Additionally many columns in logs have low cardinality and an index doesn't significantly reduces query time compare to a full scan. So with logs one often have to use full-scan queries and full-scan is where columnar storage works much better than a row-oriented storage. Additionally with a columnar storage you are getting a better compression ratio which allows to save space (often important for logs) and increase read speed.