More and more users are running adblockers or browsing in private mode with tracking protection. This also affects your web analytics, because the blockers block analytics tracking too: not only third-party services like Google Analytics, but also self-hosted solutions like Piwik.
I can totally see why users are using adblockers. Myself, I don’t mind seeing the ads, but I too value my privacy and do not want all my activities to be tracked.
The aggressive tracking done by parties like Facebook, Google, and other ad networks is privacy invading and should be blocked. However, anonymized website usage analytics is what I consider fair. As a content publisher I want to know what content is actually being read and what isn’t, which helps me produce better content. I also want to know what distribution channels I should spend my limited time on. Knowing who did what and when is none of my business. But how I spend my time is.
Proxy the tracker scripts and tracking endpoints
The standard tracking script from Google Analytics is the ga.js script loaded from Google’s servers at ssl.google-analytics.com, and it will send data back to the same servers. This is easy enough to detect and block.
What is much harder for blocker applications is to detect that an arbitrary script from your web server, making an arbitrary AJAX request to an arbitrary endpoint, is doing web analytics tracking.
So the solution is to either roll your own analytics solution (not recommended, unless that’s something you’ll find fun to do) or proxy the requests through your web server.
Hosting the analytics script locally is not a problem at all. I wrote about it back in 2013 in “Host ga.js locally with a WordPress plugin”. You don’t even have to use the same name for the script: name it scitylana-elgoog.js if you want to. But you don’t even have to physically host it on your server; just set up your web server as a reverse proxy to Google. Both Nginx and Apache are great at doing that.
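In practice Nginx or Apache would do this job. Purely to illustrate the idea, here is a minimal Python sketch of a reverse proxy that serves the tracker script under a local name. The route table, the upstream URL path /ga.js, the port, and the function names are assumptions made for this example, not a tested production setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import URLError
from urllib.request import urlopen

# Assumed mapping from a local, innocuous-looking path to the upstream script.
UPSTREAM_ROUTES = {
    "/scitylana-elgoog.js": "https://ssl.google-analytics.com/ga.js",
}


def resolve_upstream(path):
    """Return the upstream URL for a local request path, or None if unknown."""
    # Ignore any query string when matching the route.
    local_path = path.split("?", 1)[0]
    return UPSTREAM_ROUTES.get(local_path)


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        upstream = resolve_upstream(self.path)
        if upstream is None:
            self.send_error(404)
            return
        try:
            # Fetch the script from the upstream server on the visitor's behalf.
            with urlopen(upstream, timeout=10) as response:
                body = response.read()
                content_type = response.headers.get(
                    "Content-Type", "application/javascript"
                )
        except URLError:
            self.send_error(502)
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the sketch proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

Calling serve() starts the proxy. A real setup would also want caching, so that every page view does not trigger a fresh upstream fetch.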
While you’re at it, set up a proxy for the tracking requests as well. Some analytics solutions, like Piwik, let you set the collection endpoint in the script you insert on your page, while others, like Google Analytics, require some monkeypatching. But if you look at the tracker script’s source code, you’ll see that a few simple search-and-replace operations will do it.
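The search-and-replace step could look roughly like the sketch below. The sample source string and the replacement URL are made up for illustration. The strings you actually need to replace depend on the tracker script you are patching, so read its source first.

```python
def patch_tracker_source(source, replacements):
    """Apply plain string replacements to a tracker script's source code."""
    patched = source
    for old, new in replacements.items():
        patched = patched.replace(old, new)
    return patched


# Hypothetical excerpt standing in for a real tracker script.
SAMPLE_SOURCE = 'var host = "https://ssl.google-analytics.com"; send(host + "/hit");'

# Hypothetical target: an endpoint on your own domain that proxies to the tracker.
REPLACEMENTS = {
    "https://ssl.google-analytics.com": "https://example.com/stats",
}

patched = patch_tracker_source(SAMPLE_SOURCE, REPLACEMENTS)
```

Plain string replacement is enough here because the goal is only to swap host names and paths, not to change the script’s logic.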
If you’re using a self-hosted solution like Piwik on your own domain, you might get away with simply creating two symlinks: one to piwik.js and one to piwik.php. I called my symlinks graph.js and graph.php.
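You would normally create the two symlinks by hand with ln -s. As a small sketch, the same thing in Python could look like this. The function name and the directory argument are assumptions, and the directory should be wherever piwik.js and piwik.php live on your server.

```python
import os

# Alias name -> real Piwik file, using the names mentioned above.
ALIASES = {
    "graph.js": "piwik.js",
    "graph.php": "piwik.php",
}


def create_alias_symlinks(piwik_dir, aliases=ALIASES):
    """Create relative symlinks inside piwik_dir and return their paths."""
    links = []
    for link_name, target_name in aliases.items():
        link_path = os.path.join(piwik_dir, link_name)
        # Skip aliases that already exist so the function can be re-run safely.
        if not os.path.lexists(link_path):
            os.symlink(target_name, link_path)
        links.append(link_path)
    return links
```

The symlinks are relative, so they keep working if the whole directory is moved. Your web server must be configured to follow symlinks for this to work.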
Why aren’t the ad networks doing this?
The ad networks will never trust data from your server. You see, they don’t trust you. If the data came from you, it would be too easy to forge. They want to be in control. They want to track you reliably across the internet.
The cat and mouse game
This is a cat and mouse game, where web developers find new ways to avoid the blockers and the blocker developers get better at detecting and blocking trackers. I don’t mind. Over time this should lead to better privacy and less identifying info for the spying ad networks and other privacy-invading parties, who will be left with useless cross-site info.
Respect the do-not-track header
When you implement a blocker circumvention like the one explained here, please respect people’s right to privacy. You as a publisher have the right to get numbers, but when a visitor has set the do-not-track header, drop all analytics cookies and collect no info at all beyond counting the page view. You can do so via an alternative AJAX request for those visitors. Count the page view if you like, but let the visitor be completely anonymous.
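On the server side, that decision could be sketched as follows. Browsers that have the setting enabled send the header DNT: 1. The function name and the shape of the returned policy are assumptions made for this example.

```python
def tracking_policy(headers):
    """Decide what may be collected for a request, based on its headers."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    do_not_track = normalized.get("dnt", "").strip() == "1"
    return {
        "count_pageview": True,               # an anonymous counter is always fine
        "set_cookies": not do_not_track,      # no analytics cookies under DNT
        "collect_details": not do_not_track,  # no referrer, user agent, etc.
    }


dnt_visitor = tracking_policy({"DNT": "1"})
regular_visitor = tracking_policy({"User-Agent": "ExampleBrowser"})
```

The page view is counted either way, while everything that could identify the visitor is switched off as soon as the header is present.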
Could you please write up your experiences with Piwik? I used it for 10 months, but abandoned it. I found the tracking script to have bad client-side performance, and the dashboard was frustrating to use. I especially disliked how it limited the ways I could combine the data (made worse by the fact that the UI indicated that I could indeed combine the two datapoints, e.g. most popular page and operating system). I could generate the reports I wanted from the data in the database, but I found Piwik so lacking that I returned to Google Analytics.
My needs for analytics are very basic. I just want to know:
1. Roughly how many are reading what articles (because I have an ego).
2. How articles perform over time, especially when doing an SEO experiment.
3. What countries visitors are from, so I know where to add more nodes to my DIY CDN.
4. What time is the best/worst time to do server maintenance?
I like some parts of Piwik better than GA, but overall GA has a much better UI. I’ve only used Piwik for a few months, but it was a relief to throw out Google.
From the little research I’ve done, it looks like it is possible to get just as good, or even better data in Piwik than in GA, but it will require some invested time and $$s.
If I were to set up Piwik for a client, it would be without any warranties and delivered “as-is”. Mostly because of my lack of experience in the area.
Interesting article. How do you track now on this page? Can’t find any tracking code…
I used Piwik up until a couple of months ago. Now I don’t track anything anymore. It might change in the future, but right now I’m OK with it.