Exclude certain requests from the Nginx access log

Logs are nice and all that, but sometimes certain entries just clutter them without adding any value. Here are a few ways to exclude requests (by URL, user agent or visitor IP) from the Nginx access log.

Exclude specific URLs

In my Nginx configs I usually have a location block like this for static resources. It makes sure correct caching headers are sent, but also turns off logging for static resources – both the access log and the error log when a 404 is returned:

location ~* \.(png|jpg|jpeg|gif|ico|woff|otf|ttf|eot|svg|txt|pdf|docx?|xlsx?)$ {
    access_log off;
    log_not_found off;
    expires max;
    add_header Pragma public;
    add_header Cache-Control "public";
    add_header Vary "Accept-Encoding";
}

(I handle JS and CSS files in a separate block. Also note that you could use a map instead of the awkwardly long regexp.)
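As a rough sketch of the map alternative mentioned above, the extension list could be split into shorter patterns. The variable name $log_static and the grouping into images, fonts and documents are my own choices for illustration. Note that this only covers the access log; the caching headers and log_not_found would still need a location block.

```nginx
# Sketch only. Goes in the http context, outside any server block.
# $log_static is a hypothetical variable name.
map $uri $log_static {
    ~*\.(png|jpe?g|gif|ico|svg)$    0;   # images
    ~*\.(woff|otf|ttf|eot)$         0;   # fonts
    ~*\.(txt|pdf|docx?|xlsx?)$      0;   # documents
    default                         1;   # log everything else
}

server {
    # ... rest of the server config ...

    # The request is skipped when $log_static is 0.
    access_log /var/log/nginx/access.log main if=$log_static;
}
```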

Exclude specific user agents

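This is useful if you’re not interested in logging requests from certain bots/crawlers, or if you have a monitoring service, like Pingdom, hitting your site at regular intervals. The if parameter of access_log requires Nginx 1.7.0 or later; a request is not logged when the condition evaluates to 0 or an empty string.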

map $http_user_agent $log_ua {

    ~Pingdom 0;
    ~Googlebot 0;
    ~Baiduspider 0;

    default 1;
}

server {
       
    […]

    access_log /var/log/nginx/access.log main if=$log_ua;

}

Note that map blocks must be defined in the http context, outside of any server block.
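To make the placement concrete, here is a minimal sketch of how the pieces could nest. It assumes a log format called main exists; some distributions ship one in nginx.conf, others do not, so one is defined here. The server_name is a placeholder, and ~* is used for a case-insensitive match.

```nginx
http {
    # Assumption: define "main" yourself if your nginx.conf lacks it.
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent"';

    # The map lives in the http context, next to the server blocks.
    map $http_user_agent $log_ua {
        ~*pingdom    0;   # ~* matches case-insensitively
        default      1;
    }

    server {
        listen 80;
        server_name example.com;   # placeholder

        # Only referenced here; it cannot be defined here.
        access_log /var/log/nginx/access.log main if=$log_ua;
    }
}
```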

Exclude specific IP addresses

This is useful if you run a cronjob with curl/wget on your localhost polling your website at certain intervals (e.g. you run WordPress but use a real cronjob to poll wp-cron.php), or if you have health checks from downstream.

map $remote_addr $log_ip {
    
    "127.0.0.1" 0;
    "10.0.0.2" 0;
    "10.0.0.3" 0;

    default 1;

}

server {
       
    […]

    access_log /var/log/nginx/access.log main if=$log_ip;

}
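A map only matches exact strings or regular expressions, so it is awkward for whole subnets. If you need CIDR ranges, the geo module is one option. The following is a sketch under the assumption that the 10.0.0.0/24 range is internal; adjust it to your own network. When geo is given a single variable, it matches against $remote_addr.

```nginx
# Sketch only. Goes in the http context, like map.
geo $log_ip {
    default        1;   # log everyone else
    127.0.0.1      0;   # local cronjobs
    10.0.0.0/24    0;   # hypothetical internal range
}

server {
    # ... rest of the server config ...

    access_log /var/log/nginx/access.log main if=$log_ip;
}
```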

Combining tests

If you want to exclude only specific user agents coming from specific IP addresses, it is possible to combine the tests. However, the Nginx if directive does not support a logical AND operator, so we have to resort to a “clever hack”. Here’s a config that turns off logging only when both the user agent and the IP match:

map $http_user_agent $log_ua {

    ~Pingdom 0;
    ~Googlebot 0;
    ~Baiduspider 0;

    default 1;
}

map $remote_addr $log_ip {
 
    "127.0.0.1" 0;
    "10.0.0.2" 0;
    "10.0.0.3" 0;

    default 1;

}

server {
 
    […]

    set $logging 1;
    set $logtest '';

    if ( $log_ua = 0 ) {
        set $logtest "${logtest}A";
    }
    if ( $log_ip = 0 ) {
        set $logtest "${logtest}B";
    }
    if ( $logtest = "AB" ) {
        set $logging 0;
    }

    access_log /var/log/nginx/access.log main if=$logging;

}

You can of course add a “test C” checking for the URL too.
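Such a third test could look roughly like this. It is a sketch that assumes the $log_ua and $log_ip maps from above are already defined; the variable name $log_url and the choice of wp-cron.php as the URL to match are only for illustration.

```nginx
# Sketch only. Goes in the http context, next to the other two maps.
map $request_uri $log_url {
    ~^/wp-cron\.php    0;   # example URL to exclude
    default            1;
}

server {
    # ... rest of the server config ...

    set $logging 1;
    set $logtest '';

    if ( $log_ua = 0 ) {
        set $logtest "${logtest}A";
    }
    if ( $log_ip = 0 ) {
        set $logtest "${logtest}B";
    }
    if ( $log_url = 0 ) {
        set $logtest "${logtest}C";
    }
    # Logging is turned off only when all three tests matched.
    if ( $logtest = "ABC" ) {
        set $logging 0;
    }

    access_log /var/log/nginx/access.log main if=$logging;
}
```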

8 Comments

  1. Very nice. I’m enjoying your recent posts on securing and optimizing nginx for use with WP.

    I’m assuming the separate block for css and js files is so that you can enable gzip for them?

    BTW can’t you replace your clever hack with the following single if block:

    set $logging 1;
    if ( $log_ua = $log_ip ) {
    set $logging = $log_ua;
    }

  2. Hi,

    Great post!

    I changed it to meet my needs. I needed to stop logging for either select bots or IPs. That way our scans and bots don’t fill up the logs.

    Here is mine:

    ——————————————————————————————————-

    # Rule to only block bots or user agents set to ‘0’ from logging in access logs
    map $http_user_agent $bot_in_log {

    ~*googlebot 0;
    ~*bingbot 0;

    default 1;

    }

    # Rule to block specific IPs set to ‘0’ from logging in access logs
    # Map module doesn’t allow CIDR ranges
    geo $remote_addr $ip_in_log {

    # CIDR Range IPs:
    208.71.208.0/22 0;
    204.110.218.0/23 0;
    185.54.124.0/22 0;

    default 1;

    }

    server {

    [..]

    set $logging 1;

    if ( $bot_in_log = 0 ) {
    set $logging 0;
    }
    if ( $ip_in_log = 0 ) {
    set $logging 0;
    }

    access_log /path/to/log/file/nginx.access.log timed_combined if=$logging;

    }

    ——————————————————————————————————-

    FYI, the set directive in nginx isn’t supposed to use an ‘=’.
    http://nginx.org/en/docs/http/ngx_http_rewrite_module.html#set

    Hope this helps someone else.

  3. Great post!

    One thing to note:
    map defined variable ($log_ip) has global scope. I.e. if one uses several .conf files for his/her sites, $log_ip is going to be defined just once, and other definitions are going to be ignored (or it is probably the last alphabetically .conf file overriding all the previous, I’m not sure).

    So, one needs to use different variable names if different IP sets are required to be excluded from logs for the respective websites, e. g. http://www.site1.com > site1.conf > $site1_log_ip, http://www.site2.com > site2.conf > $site2_log_ip etc.

  4. Nginx has a nice feature: if a variable is not empty or zero, it is treated as true. So,

    there is no need for the complex
    if ( $log_ua = 0 ) {
    set $logtest "${logtest}A";
    }
    if ( $log_ip = 0 ) {
    set $logtest "${logtest}B";
    }
    if ( $logtest = "AB" ) {
    set $logging 0;
    }
    ———————-
    This is enough:

    set $logtest "$log_ua$log_ip";

  5. Thanks Bjorn, I have a problem though which I cannot seem to solve. How can one disable logging the wordpress heartbeat (/wp-admin/admin-ajax.php) ? I’ve tried so many things but just can’t seem to get it to stop logging this. I’ve reduced heartbeat down to 60 seconds but just really don’t want it filling up my logs.

  6. Very nice post, thank you!

    A minor improvement could be to simplify the access_log statement, if the objective is to turn off logging in specific cases. One could invert the assigned values, and use access_log off so that the log path wouldn’t have to be specified in each case.

    E.g. consider:

    map $http_user_agent $log_ua {
    ~Pingdom 1;
    ~Googlebot 1;
    ~Baiduspider 1;
    default 0;
    }
    server {
    […]
    access_log off if=$log_ua;
    }

  7. map $http_user_agent $log_ua {

    ~Pingdom 0;
    ~Googlebot 0;
    ~Baiduspider 0;

    default 1;
    }

    map $remote_addr $log_ip {

    "127.0.0.1" 0;
    "10.0.0.2" 0;
    "10.0.0.3" 0;

    default 1;

    }

    map "$log_ip:$log_ua" $logging {
    "0:0" 0;

    default 1;
    }

    server {

    […]

    access_log /var/log/nginx/access.log main if=$logging;
    }

Comments are closed.