← Back to context

Comment by jaysh

15 hours ago

Still via Grafana. I ran it side-by-side with Loki and despite trying to optimise Loki and using ClickHouse out of the box - it really was shocking how much faster ClickHouse was for every single query (e.g. in the last 12 hours give my the frequency of logs with a particular JSON event or even "find this log entry, then join back and find the number of times a different entry appears within the same correlation_id)

What does the layout in click house look like? Do the input logs need to have a very defined structure?

  • Not really, ClickHouse is super forgiving so you can do something like:

        CREATE TABLE default.events (
          `timestamp` DateTime
          `event` String -- e.g. 'product.updated' or empty/null
          `message` -- human readable message
          `raw` -- the raw message - this is very useful when pushing logs that aren't JSON - you just let the `event` be null and dump the entire message here
        )
        ENGINE = MergeTree
        PARTITION BY toDate(timestamp)
        ORDER BY (timestamp, event)
        TTL timestamp + toIntervalMonth(6)
    

    ClickHouse is extremely performant even in the cases of e.g.: SELECT count(*) FROM `events` WHERE `raw` LIKE '%hello world%'

    Of course, the more columns you splat out (e.g. like correlation_id, user_id, order_id, etc) the better you can index and expect those queries to perform but in general I don't bother outside the obvious core domain ones (exampled above), the performance is so good that unindexed queries are significantly faster than indexed queries in Loki. I have reached the point where I JSON extract on-the-fly for the WHERE clause with very large queries with no meaningful performance issues.