Nginx compresses my what?

2016-08-29(Mon)

tags: Nginx

I build virtual machines based on Debian stable and I use nginx as a web server. So the question came up: "are we gzipping stuff?" My respect for curl has been growing by leaps and bounds over the last six months as I've come to realize that if the question involves HTTP or HTTPS, the answer is curl <something>.

The general consensus is that images are already compressed enough that trying to compress them further (particularly given that they're relatively large) sucks up a lot of compute power for almost zero return. Keep in mind that SVG is an image format, but it's XML-based, so SVG files are sometimes compressible. So we've got our rule of thumb: compress text, leave most images alone. How do we test it? curl's --head option (short form -I) sends a HEAD request, so the server returns only the response headers, not the body:

$ curl --head https://test.gilesorr.com/
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 20:48:15 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
ETag: W/"ffcb593f2b0f3b2ac7fc8bcd89e689a0"
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: <...>
X-Request-Id: d3a39ee3-ea70-48ea-9514-c796c7fa4c4c
X-Runtime: 0.147656
Strict-Transport-Security: max-age=2592000

This doesn't tell us anything yet, but it sets a baseline. Let's look a bit more closely. -H and --header are equivalent: both add a header to the request. That's different from --head (short form -I), which asks the server for only the response headers ... confusing.

$ curl -H "Accept-Encoding: gzip" --head https://test.gilesorr.com/
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 20:48:26 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: <...>
X-Request-Id: fc9847ed-0103-4270-ac7a-61d4cc7514a7
X-Runtime: 0.149750
Strict-Transport-Security: max-age=2592000
Content-Encoding: gzip

Notice we have a new response header: "Content-Encoding: gzip". So our default homepage is gzip-compressed. Great! (gzip is turned on by default on Debian, and I think on a lot of other distros.)
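Eyeballing the headers works, but the check is also easy to script. What follows is a minimal sketch rather than anything from my actual setup: is_gzipped is a made-up helper name, and all it does is read response headers on standard input and succeed if one of them is "Content-Encoding: gzip".

```shell
#!/bin/sh
# is_gzipped: hypothetical helper, not part of curl or nginx.
# Reads HTTP response headers on stdin; exits 0 if a
# "Content-Encoding: gzip" header is present, non-zero otherwise.
is_gzipped() {
    # curl prints header lines with a trailing carriage return, and
    # header names are case-insensitive, so strip \r and use grep -i.
    tr -d '\r' | grep -qiE '^content-encoding:[[:space:]]*gzip[[:space:]]*$'
}

# Example (needs network access, so it is left commented out):
# curl -s --head -H "Accept-Encoding: gzip" https://test.gilesorr.com/ | is_gzipped && echo "gzipped"
```

Because the helper only looks at text on stdin, it works equally well on headers saved to a file.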

$ curl -H "Accept-Encoding: gzip" --head https://test.gilesorr.com/test.svg
HTTP/1.1 405 Method Not Allowed
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 20:49:28 GMT
Content-Type: text/plain
Content-Length: 18
Connection: keep-alive
Cache-Control: no-cache
X-Request-Id: b1021679-3e15-4ac2-83fa-dec4a9a4d327
X-Runtime: 0.003722

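Note the 405 error, which I got for every image I tried to fetch this way. The curl man page says that GET is the default method, but --head switches the request to HEAD, and the 405 suggests that whatever is serving these files doesn't allow HEAD for them. I tried again without the "-H", but got the same error, which fits: the extra header wasn't the problem. So I tried another tack and forced the method back to GET: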

$ curl --head --request GET https://test.gilesorr.com/test.png
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 20:53:38 GMT
Content-Type: image/png
Content-Length: 693395
Connection: keep-alive
Cache-Control: public, max-age=31536000
ETag: "72770059cd2e27581b22744a3da371cfb18e8751d43d815c51c372a16ea600b4"
X-Request-Id: 0c012393-7027-4ef3-81a5-31d32d7ed4e8
X-Runtime: 0.009581
Strict-Transport-Security: max-age=2592000

We've got a 200, so let's try asking for gzip:

$ curl --header "Accept-Encoding: gzip" --head --request GET https://test.gilesorr.com/test.png
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 21:11:22 GMT
Content-Type: image/png
Content-Length: 693395
Connection: keep-alive
Cache-Control: public, max-age=31536000
ETag: "72770059cd2e27581b22744a3da371cfb18e8751d43d815c51c372a16ea600b4"
X-Request-Id: 7d8cc75d-a90b-4b3f-a7cb-41956ce57924
X-Runtime: 0.016960
Strict-Transport-Security: max-age=2592000

So a 200 OK, but no compression. This is good too: we seem to have sane defaults!
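A side note on the "--head --request GET" trick used above: the curl man page points out that --request only changes the method word in the request and not how curl otherwise behaves, so the combination is a bit of a hack. A possible alternative, sketched here under the assumption that you just want the headers of an ordinary GET, is to let curl fetch the body, throw it away with -o /dev/null, and dump the headers to stdout with -D -. The wrapper name fetch_headers is hypothetical.

```shell
#!/bin/sh
# fetch_headers: hypothetical wrapper around curl.
# Does a normal GET, discards the body, prints only the response headers.
# Any arguments after the URL are passed straight through to curl.
# Usage: fetch_headers URL [CURL_OPTIONS...]
fetch_headers() {
    url=$1
    shift
    # -s           : no progress meter
    # -o /dev/null : discard the response body
    # -D -         : write the response headers to stdout
    curl -s -o /dev/null -D - "$@" "$url"
}

# Example (needs network access, so it is left commented out):
# fetch_headers https://test.gilesorr.com/test.png -H "Accept-Encoding: gzip"
```

The trade-off is that the whole body is downloaded, which is wasteful for large files but avoids depending on how the server treats HEAD.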

My next question was, where did those defaults come from? I mostly muck with the site-specific nginx config files in /etc/nginx/sites-available/ and /etc/nginx/sites-enabled/, but those files are Debian-specific and only included at the behest of the parent file, /etc/nginx/nginx.conf. Closer examination shows this default section:

http {
    ...
    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
    ...
}

I initially read this to mean that the gzip_types setting shown was the default: it would compress text, CSS, JSON, JavaScript, and XML, but not binaries like images or PDFs or anything else. But some further checking showed:

$ curl --header "Accept-Encoding: gzip" --head --request GET https://test.gilesorr.com/test.css
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 21:41:52 GMT
Content-Type: text/css; charset=utf-8
Content-Length: 231293
Connection: keep-alive
Cache-Control: public, max-age=31536000
ETag: "d709cc29ef713d007cfb8304164985aceb52db6229dc58f67c8f91a95bf39112"
X-Request-Id: c9400a7d-c563-4d60-9d14-64d02e0f1bdc
X-Runtime: 0.047712
Strict-Transport-Security: max-age=2592000

CSS doesn't seem to be compressed. Likewise, JavaScript:

$ curl --header "Accept-Encoding: gzip" --head --request GET https://test.gilesorr.com/test.js
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Fri, 26 Aug 2016 21:44:25 GMT
Content-Type: application/javascript
Content-Length: 19225
Connection: keep-alive
Cache-Control: public, max-age=31536000
ETag: "0300c0cf163642ecc7cfe13acc5d804220d7ff167ce44476387158ef00f8b81d"
X-Request-Id: bcee704f-0a49-42e5-a81a-5b556bb6d06e
X-Runtime: 0.019879
Strict-Transport-Security: max-age=2592000

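Neither is compressed, although both ETag and Cache-Control (which I need to investigate as well) are set. The reason for the missing compression is that the gzip_types line in nginx.conf is commented out, so nginx falls back to its built-in default, which compresses only text/html. That's why the homepage was gzipped and the CSS and JavaScript weren't.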
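Since a commented-out directive has no effect, it helps to list only the gzip directives that nginx will actually read. This is a rough sketch: active_gzip_directives is a made-up helper name, and the /etc/nginx default is an assumption based on the Debian layout described above.

```shell
#!/bin/sh
# active_gzip_directives: hypothetical helper.
# Lists gzip-related directives under a config directory that are NOT
# commented out.
# Usage: active_gzip_directives [DIR]   (DIR defaults to /etc/nginx)
active_gzip_directives() {
    dir=${1:-/etc/nginx}
    # -r: recurse into DIR, -n: show line numbers, -E: extended regex.
    # The pattern matches only lines whose first non-blank text is
    # "gzip", so lines beginning with "#" are skipped.
    grep -rnE '^[[:space:]]*gzip' "$dir"
}

# Example:
# active_gzip_directives /etc/nginx
```

This only shows what is written in the files, not whether a given file is actually included by nginx.conf, so treat the output as a starting point.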

On DigitalOcean's recommendation, I added the gzip_types setting. This turns out to be trickier than you'd expect. Types are matched by MIME type, and their list included "application/x-javascript" and "text/javascript", but for whatever reason ours comes across the wire as "application/javascript" (not an exact match for either of theirs), so when I used their list, our JS wasn't compressed. You can check the MIME type by looking at the "Content-Type" header listed by curl. So for the moment - until I find another MIME type this doesn't match - I've set the gzip_types like this:

# gzip is on by default on most distros; uncomment this line only if it isn't already enabled:
#gzip on;
gzip_min_length 256;
gzip_types text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript application/vnd.ms-fontobject application/x-font-ttf font/opentype image/svg+xml image/x-icon;
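The JavaScript mismatch described above is easy to reproduce offline. As a rough sketch (mime_in_list is a hypothetical helper, and it only models the exact-match behaviour I ran into, not everything nginx does), you can take a Content-Type value as reported by curl, strip any "; charset=..." parameter, and check whether what's left appears in a gzip_types list:

```shell
#!/bin/sh
# mime_in_list: hypothetical helper.
# Succeeds if the MIME type of a Content-Type value appears, as an exact
# match, in a whitespace-separated list such as a gzip_types value.
# Usage: mime_in_list CONTENT_TYPE "TYPE1 TYPE2 ..."
mime_in_list() {
    # Keep only the part before the first ";" and drop stray whitespace.
    mime=$(printf '%s' "$1" | cut -d';' -f1 | tr -d '[:space:]')
    # Collapse runs of whitespace so the list is single-space separated.
    list=$(printf '%s' "$2" | tr -s '[:space:]' ' ')
    # Pad with spaces so that only whole entries can match.
    case " $list " in
        *" $mime "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# mime_in_list "application/javascript" "application/x-javascript text/javascript" \
#     || echo "not in the list, so it would not be compressed"
```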

The gzip_min_length directive sets the minimum size, in bytes, a response has to be before compression is attempted. And I didn't say where to put this: you can put it inside the server { ... } block in your site config file (or several other places - look at the Nginx documentation on gzip, below). But better (if possible) is to put it in /etc/nginx/conf.d/gzip.conf ... This assumes that somewhere in the http { ... } block of /etc/nginx/nginx.conf there's a line like this:

include /etc/nginx/conf.d/*.conf;

I think this is the default on at least all Debian and Ubuntu systems, and very likely many others. But it doesn't hurt to check. And you should also run nginx -t to test your new config before you restart nginx.
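To make the test-before-restart habit harder to skip, the two steps can be chained so that nginx is only touched when the config test passes. This is a sketch under assumptions: reload_nginx_if_valid is a made-up name, it has to run as root, and "service nginx reload" is a guess at your init setup (a reload is enough for nginx to pick up config changes, but substitute a restart or systemctl if you prefer).

```shell
#!/bin/sh
# reload_nginx_if_valid: hypothetical helper; run it as root.
# Runs the config test first and only reloads nginx if the test passes.
reload_nginx_if_valid() {
    if nginx -t; then
        # Assumption: the "service" wrapper is available on this system.
        # Use e.g. "systemctl reload nginx" instead if that fits better.
        service nginx reload
    else
        echo "nginx -t failed; not reloading" >&2
        return 1
    fi
}

# Example:
# reload_nginx_if_valid
```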