Comment by sondr3

7 years ago

I've moved away from embedding any kind of tracking script in my webpages and instead just use GoAccess (https://goaccess.io/) to analyze my logs. There are obvious caveats: you need to install it, configure the server logging to match it, and so on. But personally I think the benefits outweigh the cons: it all runs on the server, you are the sole owner of all the data, and this tracking doesn't require any kind of JS on the webpage.

Wow, it's been a long time since I've seen one of these. It's like the olden days with Urchin (what eventually became Google Analytics). They analyzed log files prior to the Google acquisition. IIRC you could buy whatever the current version was (e.g., Urchin 2) for a flat fee and use it forever. There were free alternatives, but I liked Urchin's UI and features the best at the time.

Anyone remember what the price was? I want to say it was something like $60-$100, but my memory could be conflating it with something else.

Isn't there a problem with GDPR compliance if you want to serve European pages? You are allowed to log IP addresses for security reasons. However, as far as I understand the situation, you need the agreement of the users if you use their personal data, which includes IP addresses, for anything else.

Has somebody figured out how to resolve this situation with log files?

  • You can use goaccess to write a daily report to JSON, excluding IPs while retaining geolocation stats.

    For this you can rotate the logs daily with logrotate and run goaccess before each rotation. I believe you can keep the server logs for a week for debugging while respecting GDPR.

    For today's "realtime" data you can use goaccess on today's log on demand and use a cache.

    You can write your custom stat viewers or use goaccess to view time range data from multiple json files.
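
    If you write your own pre-processing step, one way to drop full IPs while keeping coarse location is to truncate them before analysis. Below is a minimal Python sketch, not what goaccess itself does: it assumes common/combined log format with the client IP as the first field, and the function names and the /24 and /48 prefix lengths are my own choices. Whether truncation is enough for GDPR is a legal question the code doesn't settle.

```python
import ipaddress


def anonymize_ip(ip: str) -> str:
    """Zero the host part: last octet for IPv4, last 80 bits for IPv6."""
    addr = ipaddress.ip_address(ip)  # raises ValueError if not an IP
    prefix = 24 if addr.version == 4 else 48  # assumed prefix lengths
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def anonymize_line(line: str) -> str:
    """Replace the leading client IP of a common/combined log line."""
    ip, sep, rest = line.partition(" ")
    try:
        return anonymize_ip(ip) + sep + rest
    except ValueError:
        return line  # first field is not an IP; leave the line unchanged
```

    Truncated addresses still resolve to roughly the same region in most geolocation databases, which is why the geo stats mostly survive.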

Goaccess is amazing and, in a world where seemingly every technology touts itself as "lightweight" (whether it really is or not), it truly is very lightweight.

This is how we did web analytics in the old days. The original WebTrends was just a log analyzer for Apache.

Not to mention GoAccess is often more accurate since many visitors use extensions which block 3rd party trackers.

  • I don't understand why Google Analytics works at all nowadays: a large percentage of visitors use an adblocker, and don't those block tracking and analytics by default?

    Users like me must be complete ghosts unless one looks in their real server logs!

    • Honestly, what Google Analytics tells me is that far fewer people use ad blockers than various discussions suggest.

      I've had blogs do quite well on occasion and when that happens, GA seems to see > 75% of what the server logs do. And that's with a tech audience.

  • On the other side, server side logs show bots, and for some verticals that's massively bigger traffic than real people.
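
    For a rough idea of how much of a log is bot traffic, you can check the user agent field. This is only a sketch under assumptions: it expects combined log format (user agent as the last quoted field), and the marker list is an illustrative guess, not an authoritative bot list. Bots that fake a browser user agent will slip through.

```python
import re

# Illustrative substrings only; real bot lists are much longer.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

# In combined log format the user agent is the last quoted field on the line.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')


def is_probable_bot(line: str) -> bool:
    """Guess from the user agent whether a log line came from a bot."""
    match = USER_AGENT.search(line)
    if match is None:
        return False  # no trailing quoted field, e.g. common log format
    agent = match.group(1).lower()
    return any(marker in agent for marker in BOT_MARKERS)


def bot_share(lines) -> float:
    """Fraction of lines flagged as probable bots (0.0 for empty input)."""
    lines = list(lines)
    if not lines:
        return 0.0
    return sum(is_probable_bot(line) for line in lines) / len(lines)
```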

I LOVE Goaccess and highly recommend it as well. My single complaint is the lack of ability to filter/define a date and time range. I know there is an issue for it but last time I checked it had been open for quite some time :(.

  • There's a kind of a workaround. I rotate my logs with logrotate weekly, so the current week's logs are in access.log (and access.log.1) and past logs are in access.log.x.gz files. Then I run goaccess twice (once for .log and once .gz) to get both "all" and "latest" stats. It's not as flexible as a real filter, but it works for me.

    • Just curious, are you using the web-based UI to look at your data or the CLI? I use the web UI so I'm wondering how something like this might work with that. I'll have to poke around. Thanks for sharing!
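
    Another workaround for the missing date range filter is to pre-filter the log yourself and hand the result to goaccess as an ordinary log file. A minimal Python sketch, assuming Apache/nginx common or combined log timestamps like [10/Oct/2020:13:55:36 +0000]; the function names are made up for illustration, and start/end must be timezone-aware datetimes.

```python
import re
from datetime import datetime, timezone

# Timestamp as it appears in common/combined log format.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def in_range(line: str, start: datetime, end: datetime) -> bool:
    """True if the line's timestamp falls in the half-open range [start, end)."""
    match = TIMESTAMP.search(line)
    if match is None:
        return False  # lines without a recognizable timestamp are dropped
    # %b parses English month names under the default C locale.
    stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
    return start <= stamp < end


def filter_lines(lines, start: datetime, end: datetime):
    """Keep only the log lines whose timestamp is inside the range."""
    return [line for line in lines if in_range(line, start, end)]


if __name__ == "__main__":
    sample = ['203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET / HTTP/1.1" 200 512']
    day = datetime(2020, 10, 10, tzinfo=timezone.utc)
    print(filter_lines(sample, day, datetime(2020, 10, 11, tzinfo=timezone.utc)))
```

    Writing the filtered lines to a temporary file keeps the rest of the goaccess setup unchanged, so it should work the same for the CLI and the HTML report.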


This looks awesome - I'm curious if anyone has found a good way to use this with Kubernetes. You can choose where to ship your cluster logs, so it should be possible.

From what Simple Analytics says they collect on their website, it sounds like the only information missing from GoAccess (or server logs in general) is screen width.

  • I'm able to get the screen size with goaccess. I placed a bogus <img> in the document. e.g.,

    <script>document.write('<img src="/images/1px.png?' + screen.width + 'x' + screen.height + 'x' + screen.colorDepth + '" />');</script>
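
    If you want to tally those values outside goaccess, the query string can be pulled straight from the request field of the log. A rough Python sketch, assuming the pixel is requested as /images/1px.png?WIDTHxHEIGHTxDEPTH exactly as in the snippet above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the tracking pixel request and captures width, height, color depth.
SCREEN = re.compile(r'"GET /images/1px\.png\?(\d+)x(\d+)x(\d+) ')


def screen_sizes(lines) -> Counter:
    """Count (width, height) pairs seen in requests for the tracking pixel."""
    counts = Counter()
    for line in lines:
        match = SCREEN.search(line)
        if match:
            width, height, _depth = map(int, match.groups())
            counts[(width, height)] += 1
    return counts
```

    Requests from clients with JS disabled never hit the pixel, so the counts only cover visitors who ran the script.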