Optimising Computer Programs for Performance

Posted in Java, Software Development by Dan on July 23rd, 2008

I’ve recently been working on a small Java simulation program that is going to take a long time to execute each time it runs. Basically it does the same thing around a billion times with different random inputs for each iteration. I calculated that for my first working version of the program it would take 22 and a half hours to complete (based on it completing one million iterations in 81 seconds).

This got me thinking about how to optimise the code for performance, which meant revisiting the various rules of optimisation that I’ve learned from my previous programming experiences. So that’s what this post is about: rules of thumb for optimising computer programs for performance (some of this is Java-specific but most of it is generally applicable).

After optimisations, my program will complete in 3 hours and 5 minutes on the same machine (I still have a few ideas left to try that may reduce this further).

    1. “Premature optimisation is the root of all evil”
      No discussion of optimisation is complete without somebody inevitably quoting Donald Knuth, so let’s get it out of the way up front. Knuth, as usual, is right. Optimisation ahead of time is at best speculative. Furthermore, optimisation is invariably a process of sacrificing readability, portability and general maintainability for performance. It’s better to refrain from making these compromises until it proves to be necessary. More often than not, your simple, unoptimised application will be fast enough anyway. Spending time converting your application into a heap of dung in exchange for an unnecessary, and potentially negligible (or even negative), speed boost is not a winning proposition.
    2. “There’s a difference between ‘Premature Optimisation’ and ‘Doing things right in the first place'”
      So stated a former colleague of mine in one of his less profane moments. If you’re planning to sort a million records, you wouldn’t choose to implement a Bubble Sort. Some things are just wrong from the start. Theo Schlossnagle argues that the ability to tell premature optimisation apart from plain common sense is what separates senior developers from their less experienced colleagues.
    3. “You can guess, or you can know”
      If you really understood why your program performs so unacceptably slowly, you wouldn’t have written it that way in the first place. So don’t put too much faith in your intuition. If you want to fumble around in the dark in the hope that you’ll eventually guess what you did wrong, go ahead. But if you want to know where you suck at programming, ask the computer. A profiler is an essential tool for any optimisation effort. If you’re coding Java, JProfiler is an excellent choice. If you want something for nothing, the NetBeans Profiler is pretty good too, though not quite as slick. A profiler will quickly identify bottlenecks in your program and the best places to start looking for potential optimisations. Just remember to measure the performance before and after any changes that you make so that you can evaluate their impact (there’s a rough timing sketch at the end of this list).
    4. Hardware solutions to software problems
      Your application uses too much memory. You can either lead a crack team of four developers for five months and optimise the code until it fits in the available RAM… or you can buy more RAM for less than £50. Ask yourself, what would Wilhelm do? And then do the opposite. In the world of paid-for software development, performance problems that would go away with better hardware are usually best solved by buying better hardware. Even replacing entire servers outright can be more cost-effective than non-trivial code changes.
      As well as buying better hardware, you should make sure that you are taking full advantage of what is already available to you. My 81-second trial simulation completed in 51 seconds after I split the work between two threads to take advantage of my dual-core CPU (there’s a sketch of this kind of change after this list).
    5. Optimisations at lower levels are often easier and can have a bigger impact
      The lower the level of the optimisation, the more opportunity it provides for improved performance, since everything built on top of that layer can take advantage of it. For example, switching to a faster JVM potentially makes all of your classes faster without having to change any of them. In my case I switched from Apple’s Java 5 to the SoyLatte version of Java 6 to take advantage of Sun’s ongoing performance work, and I got a 20% speed boost without modifying my application. Other improvements in this vein include upgrading your Linux kernel or replacing a library with a faster implementation (for example, switching from Castor XML to JiBX rather than trying to squeeze better performance out of Castor by shrinking the XML at a higher level).
    6. Optimise algorithms not code
      This is where that Computer Science education comes in useful. A basic understanding of complexity theory and big O notation will help you select the best algorithm for the job. A common mistake of inexperienced programmers is to fixate on micro-optimisations. “Maybe if I use direct field access instead of a getter, it will be quicker?” It doesn’t matter. It especially doesn’t matter if your code is slow because you chose an O(n²) algorithm instead of the O(n log n) alternative (there’s a contrived illustration of this after the list).
    7. Avoid superstition
      This is related to the previous advice. Don’t do something just because someone told you it might be faster or you read it on the Internet. There are dozens of pages of Java performance tips (micro-optimisations mostly) on the web. Most of these tips are well past their sell-by date: they are irrelevant on modern JVMs (the JIT compiler generally does a better job than misguided hand-optimisation), and some of them were never sensible in the first place. “Make all methods final for performance,” they say, or “iterate over arrays backwards because the comparison with zero is cheaper.” Yet these superstitious idioms are still religiously applied by some developers incapable of critical thinking. Critical thinking means taking measurements and evaluating for yourself what the impact is.
    8. Don’t waste your time
      Suppose the profiler tells you that two parts of your application consume 95% and 5% of the CPU time respectively. You know that the 5% is far too high and that it should be possible to complete that work in less than 1% of the total time. The problem is, even if you achieve this impressive five-fold performance boost in that part of the code, nobody is going to notice, since overall application performance will have improved by just 4%. Unless that 4% improvement represents the difference between success and failure, it’s not worth the effort. Instead, focus on the other 95% of the application, since that is the only place where you might achieve a significant improvement, even if it is more work to do so. My rule of thumb is that for anything less than a 20% improvement it’s generally not worth making my code more dung-like.
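To make rule 3 (and rule 7) concrete, here’s roughly the kind of before-and-after measurement I mean. The doWork() method and the iteration count are made-up placeholders rather than my actual simulation code, and a hand-rolled stopwatch like this is no substitute for a profiler, but it’s enough to tell you whether a change helped, hurt, or did nothing:

```java
import java.util.Random;

// A crude before/after timing sketch (not a rigorous benchmark).
// doWork() is a hypothetical stand-in for one iteration of whatever you are measuring.
public class TimingSketch
{
    private static final int ITERATIONS = 1000000;

    public static void main(String[] args)
    {
        runAll(); // Warm-up pass so the JIT has compiled the hot code before we measure.

        long start = System.nanoTime();
        double result = runAll();
        long elapsedMillis = (System.nanoTime() - start) / 1000000;
        System.out.println(ITERATIONS + " iterations took " + elapsedMillis
                           + "ms (result " + result + ")");
    }

    private static double runAll()
    {
        Random rng = new Random();
        double total = 0;
        for (int i = 0; i < ITERATIONS; i++)
        {
            total += doWork(rng);
        }
        // Returning (and printing) a result stops the JIT discarding the loop as dead code.
        return total;
    }

    // Placeholder for a single iteration of the real work.
    private static double doWork(Random rng)
    {
        return Math.sqrt(rng.nextDouble());
    }
}
```

Record the number, make your change, record it again. If the two readings are within noise of each other, the “optimisation” did nothing.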
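The change behind rule 4’s second paragraph was essentially the one sketched below: divide the iterations between a fixed pool of threads, one per core, and combine the partial results at the end. Again, doWork() is a hypothetical placeholder, not my simulation, but the shape of the change is the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A sketch of spreading an embarrassingly parallel simulation across all available cores.
// doWork() is a hypothetical stand-in for one iteration of the real simulation.
public class ParallelSimulation
{
    private static final long TOTAL_ITERATIONS = 1000000000L;

    public static void main(String[] args) throws Exception
    {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(threads);

        // Divide the iterations between the workers (any remainder is ignored here for brevity).
        final long iterationsPerThread = TOTAL_ITERATIONS / threads;
        List<Future<Double>> results = new ArrayList<Future<Double>>();
        for (int i = 0; i < threads; i++)
        {
            results.add(executor.submit(new Callable<Double>()
            {
                public Double call()
                {
                    // One RNG per thread: java.util.Random is thread-safe,
                    // but a shared instance becomes a point of contention.
                    Random rng = new Random();
                    double total = 0;
                    for (long j = 0; j < iterationsPerThread; j++)
                    {
                        total += doWork(rng);
                    }
                    return total;
                }
            }));
        }

        // Combine the partial results from each thread.
        double grandTotal = 0;
        for (Future<Double> result : results)
        {
            grandTotal += result.get();
        }
        executor.shutdown();
        System.out.println("Total: " + grandTotal);
    }

    // Placeholder for one iteration of the real simulation.
    private static double doWork(Random rng)
    {
        return rng.nextDouble();
    }
}
```

This only works so neatly because each iteration is independent; if the iterations shared mutable state, the threading would be the start of your problems rather than the end of them.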
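Finally, a contrived illustration of rule 6 (neither method comes from the original program): no amount of micro-optimisation will rescue the nested-loop version from its quadratic growth, while the sort-based version scales as O(n log n) without any cleverness at all.

```java
import java.util.Arrays;

// A contrived example of rule 6: the choice of algorithm dwarfs any micro-optimisation.
public class AlgorithmChoice
{
    // O(n^2): compares every pair. Avoiding getters won't save this for large n.
    public static boolean hasDuplicateSlow(int[] values)
    {
        for (int i = 0; i < values.length; i++)
        {
            for (int j = i + 1; j < values.length; j++)
            {
                if (values[i] == values[j])
                {
                    return true;
                }
            }
        }
        return false;
    }

    // O(n log n): sort a copy, then any duplicates must be adjacent.
    public static boolean hasDuplicateFast(int[] values)
    {
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        for (int i = 1; i < sorted.length; i++)
        {
            if (sorted[i] == sorted[i - 1])
            {
                return true;
            }
        }
        return false;
    }
}
```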

Hopefully this has been useful. If you remember only one sentence from this article, make sure it’s this one: “You can guess, or you can know”. Measure everything. Optimisation is science, not witchcraft.