News Flash: Usual Technique Actually Works

TL;DR: Someone else did this test before I did.

For some time now I've been skeptical of the practice of minifying code before it gets sent from a webserver to a browser. Minification is the practice of (usually automatically) removing everything the interpreter will find extraneous, like carriage returns and spaces, and even replacing human-readable variable and function names with shorter ones.

For example,

function myFancyHumanReadableFunctionName(happyVariable, angryVariable) {
    this.happiness   = happyVariable;
    this.anger       = angryVariable;
    this.nothingness = null;
}

var happyVariable = "happy";
var angryVariable = "angry";
uselessResult = myFancyHumanReadableFunctionName(happyVariable, angryVariable);

would be reduced to:

function a(b,c){this.d=b;this.e=c;this.f=null;}var g="happy";var h="angry";i=a(g,h);

The computer doesn't care about people being able to read and understand the code, so there's not much point in sending all that formatting or those descriptive variable names. The smaller the file, the faster it will go from one computer to another. Everyone wants web pages to load faster, so this sounds like a good thing to do.

The first code snippet above is 320 bytes; the second is 85 bytes. In practice, files that get downloaded to your computer are much larger and the difference between those file sizes is significant. The very popular library jQuery is 288,580 bytes. Its minified version is 89,501 bytes.

The other way of shrinking

Web browsers and webservers (at least any made in the last 18 years) are able to communicate with compressed data. They use a technique called gzip which squashes file sizes down by analyzing patterns in text and using symbols to replace those patterns. Joshua Davies did a great job of explaining how gzip compression works if you're interested. Though he does get into the technical nitty-gritty, the first few paragraphs make it fairly easy to understand the general idea.

Commonly both minification and compression are done. The compression is usually done automatically by the webserver and the minification is done (automatically) by the developer before the file gets put on the webserver.
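To make the two steps concrete, here is a minimal sketch that measures the raw and gzipped size of the two snippets from earlier in this post. It assumes Node.js and its built-in zlib module; the helper name sizes is mine, and the numbers it prints depend on the zlib version and settings, so treat them as illustrative rather than as what any particular webserver would send.

```javascript
// Sketch: raw vs. gzipped byte counts for the readable and minified snippets.
// Assumes Node.js; `sizes` is a hypothetical helper, not part of any library.
const zlib = require("zlib");

const original = [
  "function myFancyHumanReadableFunctionName(happyVariable, angryVariable) {",
  "    this.happiness   = happyVariable;",
  "    this.anger       = angryVariable;",
  "    this.nothingness = null;",
  "}",
  "",
  'var happyVariable = "happy";',
  'var angryVariable = "angry";',
  "uselessResult = myFancyHumanReadableFunctionName(happyVariable, angryVariable);",
  "",
].join("\n");

const minified =
  'function a(b,c){this.d=b;this.e=c;this.f=null;}var g="happy";var h="angry";i=a(g,h);';

// Returns the size in bytes before and after gzip compression.
function sizes(source) {
  const raw = Buffer.from(source, "utf8");
  const gzipped = zlib.gzipSync(raw);
  return { raw: raw.length, gzipped: gzipped.length };
}

console.log("original:", sizes(original));
console.log("minified:", sizes(minified));
```

With snippets this tiny the gzipped figures say little on their own; the same helper becomes more interesting when pointed at a real library file.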

Intuitions aren't facts

My intuition about compressing minified files was that it would be redundant, because all of the things you're removing would be in patterns that are easy for the gzip algorithms to find. You have to store all those semicolons, so what's the difference between storing the semicolon and storing the semicolon next to a carriage return? That's an oversimplified example, but then so was my thinking on the subject.

To be clear, I didn't think it would be worse to do both. I guessed it would make the file a little bit smaller for the transfer from webserver to web browser. But I thought the difference would be relatively small.

I missed (or at least misestimated) two important factors. First, all that whitespace does add complexity to the file. It's not like adding the same bunch of spaces to the end of every line of code; indentation and alignment vary from line to line, so the compressor has more distinct patterns to encode. That is to say, minification doesn't just reduce the size of the file, it reduces the complexity of the file. The file becomes easier to compress and the compression does a better job of crunching it down.

Second, it's not just whitespace that's being removed in minification. Minification removes comments and reduces the size of variable and function names. There's so much less to compress that minification would still be well worthwhile even if the compression didn't work as well.

Empiricism!

I tested my initial hypothesis, that a minified file wouldn't compress as well as the original code. I took two pairs of files, each an original and its minified version, and compressed all four with gzip, just like a webserver would. Then it was easy to compare the file size differences.

The first file pair was an exaggerated simplification; an example of the kind of file that would have most of the shrinking done in the minification, leaving not much work for the compressor. It was the word "hello" followed by over a thousand carriage returns, then followed by the word "world". The minified version was simply the phrase "hello world" with only a single space between the two words.
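The exaggerated pair is easy to rebuild. The sketch below assumes Node.js with its built-in zlib module, and it assumes 1,606 carriage returns so that the padded file comes to 1,616 bytes; the post only says "over a thousand". The names padded, compact and gzipSize are mine. Compressed sizes will probably not match a command-line gzip exactly, since the tool can store extra metadata such as the file name.

```javascript
// Sketch: rebuild the "hello ... world" test pair and gzip both versions.
// Assumes Node.js; the count of 1606 carriage returns is an assumption.
const zlib = require("zlib");

const padded = "hello" + "\r".repeat(1606) + "world";
const compact = "hello world";

// Size in bytes of the gzip-compressed form of `text`.
function gzipSize(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).length;
}

for (const [label, text] of [["padded", padded], ["compact", compact]]) {
  const raw = Buffer.byteLength(text, "utf8");
  console.log(`${label}: ${raw}B raw, ${gzipSize(text)}B gzipped`);
}
```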

The second pair of files was a real-world example. I used the regular and minified versions of the jQuery library I mentioned above. They are freely available from jQuery's website.

file          original    compressed   bytes saved   size change
hello world   1,616B      145B         1,471B        91% smaller
hw minified   11B         43B          -32B          291% larger
jQuery        288,580B    84,568B      204,012B      71% smaller
jQuery min.   89,501B     30,766B      58,735B       66% smaller

You can see I wasn't entirely wrong. The minified version of the jQuery file didn't compress quite as efficiently as the original jQuery file. But the difference in efficiency was very small, and the reduction in data to be transferred is very significant. It used to be that if an entire webpage was 30KB it was time for some optimization. Now one single library needs to be minified and compressed to get there.
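The figure that matters for my original question is how much minifying saves once compression is already in place. A small sketch of that arithmetic, using the jQuery byte counts from the table (the helper percentSmaller and the variable names are mine):

```javascript
// Byte counts for jQuery taken from the table in this post.
const jquery = { raw: 288580, gzipped: 84568 };
const jqueryMin = { raw: 89501, gzipped: 30766 };

// Percentage by which `after` is smaller than `before`, rounded to a whole percent.
function percentSmaller(before, after) {
  return Math.round((1 - after / before) * 100);
}

// Minified and gzipped, compared with gzip alone.
const versusGzipOnly = percentSmaller(jquery.gzipped, jqueryMin.gzipped);
// Minified and gzipped, compared with the untouched file.
const versusOriginal = percentSmaller(jquery.raw, jqueryMin.gzipped);

console.log(`minify + gzip vs gzip alone: ${versusGzipOnly}% smaller`); // 64% smaller
console.log(`minify + gzip vs original file: ${versusOriginal}% smaller`); // 89% smaller
```

So even with compression already switched on, minifying first cuts the transfer by roughly another two thirds.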

One might find the case of the minified hello world surprising. At first glance it looks as if I was originally correct about the nature of compression, but the explanation is more mundane. There's a certain amount of fixed overhead in the compression format, so a file that small can't be compressed at all. It will actually grow.
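That overhead can be seen directly. The gzip format wraps the compressed data in a 10-byte header and an 8-byte trailer (a checksum and the original length), so even empty input produces output. A rough sketch, again assuming Node.js and its built-in zlib module, with helper and variable names of my own choosing:

```javascript
// Sketch: show gzip's fixed overhead and where compression starts to pay off.
// Assumes Node.js; names here are illustrative.
const zlib = require("zlib");

// Size in bytes of the gzip-compressed form of `text`.
function gzipSize(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).length;
}

// Gzipping nothing at all still costs the header and trailer.
const emptySize = gzipSize("");
console.log(`empty input: 0B raw, ${emptySize}B gzipped`);

// Repeat a short phrase to see the point where the output becomes smaller than the input.
for (const copies of [1, 2, 5, 10]) {
  const text = "hello world".repeat(copies);
  console.log(`${copies} copies: ${text.length}B raw, ${gzipSize(text)}B gzipped`);
}
```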

Even that does not make my original hypothesis valid. It may be an argument against compressing certain files, but it's a long way from saying that one shouldn't minify if one knows that one's files are being automatically transferred compressed. Indeed, even the larger "compressed" minified file is significantly smaller than the compressed original. Files this small are edge cases anyway and are not worth worrying about. Also: I was never suggesting not to compress minified files. I just didn't think it was worthwhile to minify files that would be compressed.

So I have settled this issue to my satisfaction. Minifying files makes a significant difference to file size even with server compression. Website developers, authors of CMSes and downloadable libraries, and even amateur bloggers ought to use minification to improve the performance of their websites.