ReportNG 0.9.8 – HTML and XML reports for TestNG

Posted in Java by Dan on October 21st, 2008

Version 0.9.8 of ReportNG is now available for download. This version addresses a couple of issues with the XML output from the JUnitXMLReporter:

  • The XML output now includes failed and skipped configuration methods.  Previously these were included in HTML reports but omitted from the XML.
  • You can now control the dialect of the XML that is generated.  The default is to use the version that TestNG’s own reporter generates.  This includes the ability to mark tests as skipped and works well with Hudson.  Not all tools recognise the <skipped> element though, so you can now set the org.uncommons.reportng.xml-dialect property to "junit" (as opposed to "testng") and it will mark skips as failures. This works better with Ant’s junitreport task.

In addition, there have been a couple of enhancements to the HTML reporter:

  • There is now a separate page that collates all of the reporter log statements.
  • You can now specify your own stylesheet to override the default appearance of the generated report.  Just set the org.uncommons.reportng.stylesheet property to the path of your CSS file. For example, the sample report looks like this when using a custom Hudson-inspired stylesheet.
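Both settings are ordinary JVM system properties, so they would normally be passed with -D on the command line that launches TestNG. As a rough sketch, and assuming the reporters read the properties after this code has run (an assumption, not something stated above), they could also be set programmatically. The class name, method name and the custom.css path are made up for illustration; only the two property names come from the release notes.

```java
// Hypothetical helper: only the two property names are taken from the notes above.
final class ReportNGSettings {
    static void configure(String dialect, String stylesheetPath) {
        // "junit" marks skips as failures; "testng" keeps the <skipped> element.
        System.setProperty("org.uncommons.reportng.xml-dialect", dialect);
        // Path to a custom CSS file for the HTML report.
        System.setProperty("org.uncommons.reportng.stylesheet", stylesheetPath);
    }

    public static void main(String[] args) {
        configure("junit", "custom.css");
        System.out.println(System.getProperty("org.uncommons.reportng.xml-dialect"));
    }
}
```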

Thanks to Ron Saito and Mike Feinberg for the feedback and suggestions that were incorporated into this release. If you have any problems, please use the issue tracker. And if you come up with a good custom CSS file for the HTML reports, please consider submitting it so that it can be included in the distribution.

Java Power Tools

Posted in Books, Java by Dan on October 13th, 2008

I’ve been keen to take a look at John Ferguson Smart’s Java Power Tools since I first found out about it. Fortunately, it has just been added to the ACM’s online books programme so, as an ACM member, I’ve been able to read it online.

The book consists of 30 chapters, each dedicated to a different development tool. Build tools, version control, continuous integration, testing, profiling, static analysis and issue-tracking are among the topics covered. For most tasks, more than one option is presented. For example, the book covers both Ant and Maven, and JUnit and TestNG. All of the tools covered are open source and freely available.

Some of the chapters will only be of interest to beginning Java developers. I imagine that most Java professionals already know how to use Ant and some kind of version control system. On the other hand, the book also introduces some tools which are not so well-known, so you are sure to find something useful here.

CVS and Subversion are the version control options demonstrated. I can’t help thinking that Git (or even Mercurial) would have been a better choice for inclusion than CVS.  Usage of distributed version control systems is growing whereas CVS has effectively been supplanted by Subversion.

Elsewhere there are no such omissions. The author covers four different continuous integration servers: CruiseControl, Continuum, LuntBuild and Hudson. This is probably overkill. I haven’t used LuntBuild, but I would quickly dismiss CruiseControl and Continuum in favour of Hudson. It would have been sufficient to cover Hudson and one other.

The coverage of testing tools is particularly thorough, and is probably the most useful part for experienced developers. Not only does it cover JUnit 4 and TestNG, but it also goes into some detail on a variety of related tools, such as DbUnit, FEST and Selenium, and performance testing tools including JMeter and JUnitPerf.

I found the chapter on the JDK’s profiling tools to be useful and there is also a chapter on profiling from Eclipse, but nothing on the NetBeans profiler. This is my only real gripe with the book. Three of the chapters are Eclipse-only with no alternatives offered for users of other IDEs. One of these is the chapter on the Jupiter code review plug-in. ReviewBoard might have been a better choice.

All-in-all though, this is a substantially useful book. At 910 pages it covers a broad range of topics without skimping on the necessary detail. There are dozens of ideas for improving and automating your software development processes.

If you want more information, Meera Subbarao at JavaLobby has also reviewed Java Power Tools.

Distributed Evolutionary Algorithms with Watchmaker and Hadoop

Posted in Evolutionary Computation, Java by Dan on October 1st, 2008

One feature that has been on the TODO list of the Watchmaker Framework for Evolutionary Computation for some time is the ability to distribute the evolution across several machines.  Some time last year I started on an RMI-based solution, but I wasn’t happy with it so I deleted it and put the idea on the back burner while I concentrated on other things.  At some point I wanted to investigate using Terracotta, or possibly Hadoop, to distribute the computations.

However, it’s often the case with Open Source software that somebody smarter comes along and does the hard work for you.  I was delighted to find out today that Abdel Hakim Deneche has been busy integrating Watchmaker with the Apache Mahout project as part of Google’s Summer of Code programme.

I’d never heard of Mahout before.  According to Wikipedia, a Mahout is somebody who drives an elephant.  Apache Mahout is a sub-project of Lucene, the Java text search and indexing engine.  The Mahout project is focused on building scalable machine-learning libraries using Hadoop (presumably where the elephant connection comes in).

I haven’t yet tried using the Mahout software, but it looks like it provides a pretty straightforward way to distribute the fitness evaluations for just about any evolutionary algorithm implemented using Watchmaker.

Avoid NIO, Get Better Throughput

Posted in Java by Dan on September 3rd, 2008

The Java NIO (new/non-blocking I/O) API introduced in Java 1.4 is arguably the most arcane part of the standard library. With channels, selectors, byte buffers and all the associated flipping, marking, compacting, event-handling and registering/de-registering of read/write interest, it’s an entirely different level of complexity to the old-fashioned, straightforward blocking I/O. And if you want to use SSL with NIO then it’s a whole new world of pain.

Few have mastered NIO. For most it provides an opportunity to really get to know your debugger. “Should this buffer be flipped before I pass it to this method, or should the method flip it?”. BufferOverflowExceptions and memory leaks abound.
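The flipping question is easier to see in a small example. This is only a sketch (the class and method names are invented here): it writes into a ByteBuffer and flips it exactly once before reading back. Leave out the flip and the read would return the unwritten remainder of the buffer instead; write more than the 64-byte capacity and you get the BufferOverflowException mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class FlipDemo {
    static String roundTrip(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        // Writing advances the position; the limit stays at the capacity.
        buffer.put(text.getBytes(StandardCharsets.UTF_8));
        // flip() sets limit = position and position = 0, ready for reading.
        buffer.flip();
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "hello"
    }
}
```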

So, in the spirit of doing the simplest thing that could possibly work, writing your own NIO code is usually best avoided unless you have a compelling reason. Fortunately, some masochistic individuals have done a lot of the hard work so that we don’t have to. Projects such as Grizzly and QuickServer provide proven, reusable non-blocking server components.

However, in most instances, maybe non-blocking I/O is not necessary at all? In fact, maybe it is detrimental to performance?

That’s the point that Paul Tyma makes. He attacks some of the received wisdom about the relative merits of blocking and non-blocking servers in Java. The characteristics of JVMs and threading libraries change as new advances are made. Good advice often becomes bad advice over time, demonstrating the importance of making your own measurements rather than falling back on superstitions.

Paul’s experiments show that higher throughput is achieved with blocking I/O, that thread-per-socket designs actually scale well, and that the costs of context-switching and synchronisation aren’t always significant. Paul’s slides from his talk “Thousands of Threads and Blocking I/O: The Old Way to Write Java Servers Is New Again (and Way Better)” are well worth a look.
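For comparison with the NIO machinery, a thread-per-socket server in plain blocking I/O can be sketched in a few dozen lines. This is a minimal illustration, not the code from Paul’s talk: the echo behaviour, the class name and port 9000 are all arbitrary choices, and a real server would want a bounded thread pool and proper logging.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

final class BlockingEchoServer {
    // Accept loop: blocks in accept(), then hands each connection to its own thread.
    static void serve(ServerSocket server) throws IOException {
        while (!server.isClosed()) {
            Socket client = server.accept();
            new Thread(() -> handle(client)).start();
        }
    }

    // Runs on the per-connection thread; closes the socket when the client is done.
    static void handle(Socket client) {
        try (Socket socket = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            echo(in, out);
        } catch (IOException e) {
            // Connection dropped; try-with-resources has already cleaned up.
        }
    }

    // Plain blocking reads: no buffers to flip, no selectors to register with.
    static void echo(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.println(line);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(9000)) { // arbitrary port
            serve(server);
        }
    }
}
```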

If you are writing your own multi-threaded servers in Java, Esmond Pitt’s Fundamental Java Networking and Java Concurrency in Practice by Brian Goetz et al. are essential reading.

More Stupid Java Tricks

Posted in Java by Dan on August 26th, 2008

My previous post was reasonably popular so I decided to follow up with some more stupid Java tricks. It should go without saying that you shouldn’t use these techniques in any serious code, unless of course your objective is to write unmaintainable code.

It was pointed out to me by a couple of people that the puzzle I posed previously is also in Josh Bloch and Neal Gafter’s excellent Java Puzzlers book. If you are interested in all of the ugly corners of the Java platform, this book is well worth buying.

Unchecked Checked Exceptions

Throw checked exceptions without the hassle of dealing with them

There’s a lot of debate about the relative merits of checked and unchecked exceptions in Java. You can have the best of both worlds by throwing checked exceptions without either catching them or declaring them in a throws clause. The deprecated Thread.stop(Throwable) method is one way:

private void throwSomething() // Look, no throws clause!
{
    Thread.currentThread().stop(new IOException("This won't be caught."));
}

The slightly more socially-acceptable Class.newInstance() method is another way. It’s not deprecated but it does have one well-documented flaw:

Note that this method propagates any exception thrown by the nullary constructor, including a checked exception. Use of this method effectively bypasses the compile-time exception checking that would otherwise be performed by the compiler.
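A minimal sketch of the Class.newInstance() route, assuming a helper class (called Thrower here, a name invented for the example) whose no-argument constructor throws the checked exception. Note that newInstance() was itself deprecated in later Java versions, hence the suppressed warning.

```java
import java.io.IOException;

final class SneakyInstantiation {
    // A class whose nullary constructor throws a checked exception.
    static class Thrower {
        Thrower() throws IOException {
            throw new IOException("This won't be caught.");
        }
    }

    // Look, no throws clause for IOException.
    @SuppressWarnings("deprecation")
    static void throwSomething() {
        try {
            // The constructor's IOException propagates straight through newInstance().
            Thrower.class.newInstance();
        } catch (InstantiationException | IllegalAccessException e) {
            // These two are the only checked exceptions the compiler knows about.
            throw new IllegalStateException(e);
        }
    }
}
```

The compiler will not let a caller write catch (IOException e) around throwSomething(), because as far as it knows that exception can never be thrown there.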

Invasion of Privacy

Strings aren’t really immutable, even literals can’t be trusted

This is a bit of a golden oldie but it still has great potential for messing with your co-workers’ sanity.

All high-level programming languages have rules. These rules are there to protect you, to stop you from really messing things up. Fortunately, Java allows you to bypass some of these rules, via the dark art of reflection, and have some fun by changing things that you really shouldn’t be allowed to change.  Your confused colleagues might never trust a machine again. It’s all about the String constants cached by the JVM.

All it takes is a few lines (suitably hidden in some seemingly unrelated class)

java.lang.reflect.Field valueField = String.class.getDeclaredField("value");
valueField.setAccessible(true);
valueField.set("Hello World!", "Goodbye     ".toCharArray());

then this

System.out.println("Hello World!");

prints this

Goodbye

Escape to Victory

A language feature that appears to exist primarily to break syntax-highlighters

Of course, if your co-workers are made of sterner stuff you may need to be a little more devious to send them insane. They won’t be able to track down the offending code above by searching for “Hello World!”, “Goodbye” or even “setAccessible” if these Strings don’t appear anywhere in the source.

Unicode escape sequences (such as '\u0020') can be used not only to construct character and String literals but also anywhere in the source that their equivalent characters can be used. So entire statements, even entire classes, can be written solely with these escape sequences.
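As a small illustration (the class name is invented for the example), here the keywords int and return are spelled entirely with escapes. The compiler translates the escapes before it does anything else, so this is an ordinary method as far as it is concerned.

```java
final class EscapeDemo {
    static int answer() {
        // The next line declares an int; the keyword is spelled with escapes.
        \u0069\u006e\u0074 value = 42;
        // And this one is a return statement.
        \u0072\u0065\u0074\u0075\u0072\u006e value;
    }

    public static void main(String[] args) {
        System.out.println(answer()); // prints 42
    }
}
```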

Questionable Parentage

Instantiating inner class objects outside of the parent class

Inner classes (that is, non-static nested classes) have an implicit reference to an instance of their containing class, but they don’t necessarily have to be instantiated within the scope of the parent class. Daniel Pitts shows another way. File this one under “ugly syntax” and “surely that’s not supposed to work?”.
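The syntax in question is the qualified form of new, where the outer instance goes in front of the keyword. The sketch below uses invented class names and may not match the linked example exactly.

```java
final class Outer {
    private final String name;

    Outer(String name) {
        this.name = name;
    }

    // Non-static, so every Inner carries a hidden reference to an Outer.
    class Inner {
        String describe() {
            return "Inner of " + name;
        }
    }
}

final class QualifiedNewDemo {
    static String demo() {
        Outer outer = new Outer("A");
        // Created outside Outer: the enclosing instance is supplied explicitly.
        Outer.Inner inner = outer.new Inner();
        return inner.describe();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Inner of A"
    }
}
```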

Boxing Clever

From stupid Java tricks to stupid Java pitfalls

This one is not a trick as such but it is a good example of Java code that behaves completely unintuitively. The introduction of auto-boxing and auto-unboxing in Java 5 has created a whole new set of opportunities for bugs.

This is my personal favourite:

Integer n = 128;
Integer m = 128;
assert n <= m; // This works.
assert n >= m; // This works.
assert n == m; // This fails.

It gets better. Try changing n and m to both be 127 and it works, which, to be fair, is exactly how it’s supposed to work according to the JLS:

If the value p being boxed is true, false, a byte, a char in the range \u0000 to \u007f, or an int or short number between -128 and 127, then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.

But it’s not exactly what you would expect if you hadn’t read the spec.
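The way to stay out of trouble is to compare boxed values with equals() (or unbox them first) and keep == for cases where reference identity really is what you mean. A minimal sketch, with invented names:

```java
final class BoxingDemo {
    // Compares the numeric values, whatever objects happen to hold them.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Compares references: only guaranteed true for values in -128 to 127.
    static boolean sameObject(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameValue(128, 128));  // true
        System.out.println(sameObject(127, 127)); // true, as the JLS requires
        System.out.println(sameObject(128, 128)); // typically false; depends on the JVM's cache
    }
}
```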

Any more?

That’s all the Java stupidity I can think of right now. Feel free to add your own stupid Java tricks in the comments.

A Java Syntax Quirk

Posted in Java by Dan on August 24th, 2008

This little trick is shamelessly stolen from Daniele Futtorovic’s post on comp.lang.java.programmer.

This is legal, compilable Java:

public class Oddity
{
    public static void main(String[] args)
    {
        http://blog.uncommons.org
        System.out.println("Why is the URL allowed above?");
    }
}

Why doesn’t the URL being in there upset the compiler?  The http: is parsed as a statement label, and everything from the // to the end of the line is an ordinary single-line comment.
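One way to convince yourself that http: really is a label is to use it. In this sketch (class and method names invented for the example) the same "URL" labels a loop, and break http; jumps out of it.

```java
final class OddityExplained {
    static int countUntil(int limit) {
        int count = 0;
        http://blog.uncommons.org (a label called "http", then a line comment)
        for (int i = 0; i < 100; i++) {
            if (i == limit) {
                break http; // leaves the labelled loop
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUntil(3)); // prints 3
    }
}
```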

Uncommons Maths 1.1: Java Random Number Generators and Mathematical Utility Classes

Posted in Java by Dan on August 22nd, 2008

Uncommons Maths is a set of Java classes for working with random numbers, combinatorics and basic statistics.  The random number package is its most compelling feature, providing three advanced random number generators and support for several probability distributions (see A Java Programmer’s Guide to Random Numbers for more info, or have a play with the demo application).

It’s been a while since the initial release.  During this time I’ve made several minor tweaks and enhancements that I probably ought to share, so I’ve just uploaded a version 1.1 to Java.net.

If you are an existing user, please consult the changelog.  There is one change since version 1.0.2 that may cause backwards compatibility issues.  The combinatorics and number classes have been moved to their own packages to avoid cluttering the org.uncommons.maths base package.  So you may need to update your import statements accordingly (see the API documentation).

Optimising Computer Programs for Performance

Posted in Java, Software Development by Dan on July 23rd, 2008

I’ve recently been working on a small Java simulation program that is going to take a long time to execute each time it runs. Basically it does the same thing around a billion times with different random inputs for each iteration. I calculated that for my first working version of the program it would take 22 and a half hours to complete (based on it completing one million iterations in 81 seconds).

This got me thinking about how to optimise the code for performance, which meant revisiting the various rules of optimisation that I’ve learned from my previous programming experiences. So that’s what this post is about: rules of thumb for optimising computer programs for performance (some of this is Java-specific but most of it is generally applicable).

After optimisations, my program will complete in 3 hours and 5 minutes on the same machine (I still have a few ideas left to try that may reduce this further).

    1. “Premature optimisation is the root of all evil”
      No discussion of optimisation is complete without somebody inevitably quoting Donald Knuth so let’s get it out of the way up front.  Knuth, as usual, is right. Optimisation ahead of time is at best speculative. Furthermore, optimisation is invariably a process of sacrificing readability, portability and general maintainability for performance. It’s better to refrain from making these compromises until it proves to be necessary. More often than not your simple, unoptimised application will be fast enough anyway. Spending time converting your application into a heap of dung in exchange for an unnecessary, and potentially negligible (or even negative), speed boost is not a winning proposition.
    2. “There’s a difference between ‘Premature Optimisation’ and ‘Doing things right in the first place’”
      So stated a former colleague of mine in one of his less profane moments. If you’re planning to sort a million records you wouldn’t choose to implement a Bubble Sort. Some things are just wrong from the start. Theo Schlossnagle argues that this ability to effectively determine what constitutes premature optimisation and what is merely common sense is what separates the senior developers from their less experienced colleagues.
    3. “You can guess, or you can know”
      If you really understood why your program performs so unacceptably slowly you wouldn’t have written it that way in the first place. So don’t put too much faith in your intuition. If you want to fumble around in the dark in the hope that you’ll eventually guess what you did wrong, go ahead. But if you want to know where you suck at programming, ask the computer. A profiler is an essential tool for any optimisation effort. If you’re coding Java, JProfiler is an excellent choice. If you want something for nothing, the NetBeans Profiler is pretty good too, though not quite as slick. A profiler will quickly identify bottlenecks in your program and the best places to start looking for potential optimisations. Just remember to measure the performance before and after any changes that you make so that you can evaluate their impact.
    4. Hardware solutions to software problems
      Your application uses too much memory. You can either lead a crack team of four developers for 5 months and optimise the code until it fits in the available RAM… or you can buy more RAM for less than £50. Ask yourself, what would Wilhelm do? And then do the opposite. In the world of paid-for software development those performance problems that would go away with better hardware are usually best solved by buying better hardware. Even to the extent of replacing entire servers, it can be more cost-effective than non-trivial code changes.
      As well as buying better hardware you should make sure that you are taking full advantage of what is already available to you. My 81-second trial simulation completed in 51 seconds after I split the work between two threads in order to take advantage of my dual core CPU.
    5. Optimisations at lower levels are often easier and can have a bigger impact
      The lower the level of the optimisation the more opportunity it provides for improved performance since everything built on top of that layer can take advantage of it. For example, switching to a faster JVM potentially makes all of your classes faster without having to change any of them. In my case I switched from Apple’s Java 5 to the SoyLatte version of Java 6 to take advantage of Sun’s on-going performance work and I got a 20% speed boost without modifying my application. Other improvements in this vein would include upgrading your Linux kernel or replacing a library with a faster implementation (such as switching from Castor XML to JiBX rather than addressing the problem at a higher level by trying to reduce the size of the XML in order to squeeze better performance from Castor).
    6. Optimise algorithms not code
      This is where that Computer Science education comes in useful. A basic understanding of complexity theory and big O notation will help you select the best algorithm for the job. A common mistake of inexperienced programmers is to fixate on micro-optimisations. “Maybe if I use direct field access instead of a getter, it will be quicker?” It doesn’t matter. It especially doesn’t matter if your code is slow because you chose an O(n²) algorithm instead of the O(n log n) alternative.
    7. Avoid superstition
      This is related to the previous advice. Don’t do something just because someone told you it might be faster or you read it on the Internet. There are dozens of pages of Java performance tips (micro-optimisations mostly) on the web. Most of these tips are well past their sell-by-date. They are irrelevant with modern JVMs (the JIT compiler generally does a better job than misguided hand-optimised code). Some of them were never sensible in the first place. “Make all methods final for performance”, “iterate over arrays backwards because the comparison with zero is cheaper” they say. Yet these superstitious idioms are still religiously applied by some developers incapable of critical thinking. Critical thinking means taking measurements and evaluating for yourself what the impact is.
    8. Don’t waste your time.
      The profiler tells you that the two parts of your application consume 95% and 5% of CPU resources respectively. You know that the 5% is far too high and that it should be possible to complete that work in less than 1% of the total time. The problem is, even if you achieve this impressive five-fold performance boost in this part of the code, nobody is going to notice since overall application performance has improved by just 4%. Unless that 4% improvement represents the difference between success and failure it’s not worth the effort. Instead you should be focusing on the other 95% of the application since that is the only place where you might be able to achieve a significant improvement, even if it is more work to do so. My rule of thumb is that for anything less than a 20% improvement it’s generally not worth making my code more dung-like.
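The arithmetic behind rule 8 is worth having to hand (it is essentially Amdahl’s law). A minimal sketch, with invented names: given the fraction of run time spent in the part you plan to optimise and the speed-up you expect there, it returns the new total run time as a fraction of the old one.

```java
final class SpeedupEstimate {
    /**
     * @param fraction share of the total run time spent in the optimised part (0 to 1)
     * @param factor   how many times faster that part becomes
     * @return new total run time relative to the old total (1.0 means no change)
     */
    static double newTotalTime(double fraction, double factor) {
        // The untouched part costs what it always did; the rest shrinks by the factor.
        return (1.0 - fraction) + fraction / factor;
    }

    public static void main(String[] args) {
        // The 5% part made five times faster: about 0.96, a 4% saving overall.
        System.out.println(newTotalTime(0.05, 5.0));
        // The 95% part made a mere 25% faster: about 0.81, a 19% saving.
        System.out.println(newTotalTime(0.95, 1.25));
    }
}
```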

Hopefully this has been useful. If you remember only one sentence from this article, make sure it’s this one: “You can guess, or you can know”. Measure everything. Optimisation is science not witchcraft.
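Rule 4’s point about using both cores can be sketched as follows, assuming the iterations are independent of each other. The simulate method is a stand-in (it just counts how often a die roll comes up zero), not the simulation described above, and the class and method names are invented. Each task gets its own seeded Random so that threads do not contend for a shared generator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SplitWork {
    // Stand-in workload: the same cheap step repeated with different random inputs.
    static long simulate(long iterations, long seed) {
        Random rng = new Random(seed);
        long hits = 0;
        for (long i = 0; i < iterations; i++) {
            if (rng.nextInt(6) == 0) {
                hits++;
            }
        }
        return hits;
    }

    // Divides the iterations between the given number of threads and sums the results.
    static long runInParallel(long totalIterations, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long perThread = totalIterations / threads;
            for (int t = 0; t < threads; t++) {
                // The last thread picks up any remainder from the division.
                long share = (t == threads - 1)
                        ? totalIterations - perThread * (threads - 1)
                        : perThread;
                long seed = t;
                Callable<Long> task = () -> simulate(share, seed);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that share has finished
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this helps, and by how much, is something to measure rather than assume.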

ReportNG 0.9.6 – HTML and XML Reports for TestNG

Posted in Java by Dan on June 24th, 2008

ReportNG version 0.9.6 is now available for download. ReportNG is a plug-in for TestNG that provides improved HTML reports and improved JUnit-format XML reporting.

This is a bug fix release. Issues addressed include:

  • ISSUE 23 – Inaccurate aggregate timings for tests.
  • ISSUE 25 – NullPointerException when a test failure is due to a Throwable that has a null message.
  • ISSUE 26 – Optionally disable escaping of output logs so that HTML can be inserted into reports.

The fix for ISSUE 26 is a system property that will turn off escaping of Strings embedded in the HTML reports:

org.uncommons.reportng.escape-output=false

This means that, with escaping disabled, FEST will work with ReportNG the way it does with the default TestNG reporter (i.e. hyperlinks will be inserted in the HTML report). The default is still for all log output to be escaped. I don’t recommend that you turn the escaping off unless you really need to because it will mess up your reports if you happen to log any strings that contain characters such as ‘<’, ‘>’ and ‘&’.
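To see why those three characters matter, here is a rough sketch of the kind of escaping involved. It is not ReportNG’s actual code, and the class and method names are invented: it simply shows what has to happen to a log string before it can be dropped safely into an HTML page.

```java
final class HtmlEscape {
    // Replaces the characters that HTML would otherwise interpret as markup.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            switch (ch) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                default: out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b && c > d")); // prints "a &lt; b &amp;&amp; c &gt; d"
    }
}
```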

Thank you to the ReportNG users who submitted bug reports and patches for these issues.

Java Archaeology: Revisiting 20th Century Code

Posted in Java by Dan on June 20th, 2008

Java 1.1 was released in 1997. How does code from this era compare to today’s Java code?

Following my decision to resurrect an old project, I was faced with the prospect of converting the codebase from Java 1.1 source code to something a little more civilised. The oldest parts of this software date back nearly 9 years. Even then Java 1.1 was a little old, with 1.2.2 being the latest and greatest. Though I re-wrote much of the code 4 or 5 years ago, when Java 1.4.2 was state-of-the-art, it still targeted Java 1.1 VMs for compatibility reasons (i.e. Microsoft’s VM that used to be bundled with IE).

Trawling through this code was an interesting look back to a time when Java developers could choose to use any type of collection as long as it was Vector, and pretty much the only GUI toolkit available was AWT. It was also interesting to revisit the implementation decisions made back when I didn’t know nearly as much as I thought I did (though, to be honest, that’s probably still the case).

Pre-historic Collections

The first thing I noticed was the clunky use of collections and arrays in the code. Not only did this software pre-date generics, being tied to Java 1.1 meant that even the Collections Framework was off-limits. Arrays, Vectors and Hashtables were all there were.

My original code used arrays wherever a fixed length was acceptable and Vectors everywhere else. Preoccupied by the difficulty of keeping track of the type of the Vector’s contents, I had written many of the methods so that they converted their internal Vectors into arrays before returning. In modernising the code, I had similar problems remembering what types went in many of the collections that I had declared several years before. Don’t let anyone tell you that generics were not worth it. They’re invaluable for communication of intent alone.

On a similar note, the richer variety of collections that arrived with Java 2 allows for code to be written that more accurately conveys the thinking of the developer. The same method that returned an array or Vector in the legacy code could be rewritten to return List, Set, SortedSet, or just Collection. Each of these options communicates a different notion of what to expect from the method.
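A small sketch of the difference, using invented method names and data rather than anything from the project itself. The raw Vector tells the caller nothing; each of the generic return types makes a different promise.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.Vector;

final class ReturnTypes {
    // Java 1.1 style: the caller has to remember (or guess) what is inside.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Vector legacyPlayers() {
        Vector players = new Vector();
        players.addElement("Carol");
        players.addElement("Alice");
        return players;
    }

    // Says: strings, no duplicates, iterated in sorted order.
    static SortedSet<String> players() {
        return new TreeSet<String>(Arrays.asList("Carol", "Alice"));
    }

    // Says: strings, in a meaningful order, duplicates allowed.
    static List<String> finishingOrder() {
        return Arrays.asList("Carol", "Alice", "Carol");
    }
}
```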

Sorting

The purists wept as one of the pillars of Computer Science education crumbled before them and a thousand copies of The Art of Computer Programming were consigned to the shelf indefinitely. The Collections Framework in Java 2 introduced a general purpose sort method that was good enough for the vast majority of your sorting needs. Rather than worrying about remembering how to implement QuickSort and why it was better than an Insertion Sort or a Bubble Sort, just use the modified Merge Sort that Josh Bloch and his colleagues had helpfully provided. The Comparator interface meant that this one tried-and-tested implementation was applicable to just about any problem.

The first things to go as I brought my code up-to-date were my plagiarised sort method and my custom version of the Comparator interface.
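What replaced them looks something like this. The example is a sketch with invented names, sorting strings by length; because the library sort is stable, strings of equal length keep their original relative order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortDemo {
    // Returns a sorted copy, shortest strings first, leaving the input untouched.
    static List<String> byLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, new Comparator<String>() {
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(byLength(Arrays.asList("ccc", "a", "bb", "d"))); // prints [a, d, bb, ccc]
    }
}
```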

Enumerated Types

The lack of enumerated types in the early versions of Java was a curious oversight. I found my old code littered with integer constants. Effective Java remained unwritten in those days so I was ignorant of the type-safe enum pattern. I compounded my ignorance by writing methods that took three of these integer constants as arguments, each representing a conceptually different type. It was very easy to get the arguments the wrong way round and introduce all manner of bugs.
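The problem, and the fix that Java 5 enums eventually provided, can be sketched like this. The colour, shape and size types are invented for the example and have nothing to do with the original project.

```java
final class ShapeFactory {
    // Old style: three conceptually different types, all just ints.
    static final int COLOUR_RED = 0;
    static final int SHAPE_CIRCLE = 0;
    static final int SIZE_LARGE = 1;

    // Nothing stops a caller passing these in the wrong order.
    static String describe(int colour, int shape, int size) {
        return "colour=" + colour + ", shape=" + shape + ", size=" + size;
    }

    enum Colour { RED, GREEN, BLUE }
    enum Shape { CIRCLE, SQUARE }
    enum Size { SMALL, LARGE }

    // With enums, swapping two arguments is a compile-time error.
    static String describe(Colour colour, Shape shape, Size size) {
        return size + " " + colour + " " + shape;
    }

    public static void main(String[] args) {
        // Compiles happily despite the arguments being in the wrong order.
        System.out.println(describe(SIZE_LARGE, COLOUR_RED, SHAPE_CIRCLE));
        System.out.println(describe(Colour.RED, Shape.CIRCLE, Size.LARGE)); // prints "LARGE RED CIRCLE"
    }
}
```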

User Interface

It amazes me that people ever achieved anything worthwhile with just AWT. While I would use my mastery of the GridBagLayout to earn the respect of my peers, the lack of a rich set of GUI controls was very limiting. No tables, trees, tab sets or sliders for a start.

Swing has its critics, but I’m not one of them. There were some performance and look-and-feel issues in the early years, but I think that on the whole they got a lot more right than they got wrong. Now that I am not tied to Java 1.1, converting my project’s UI from AWT to Swing is the next item on the agenda. It will be with great pleasure that I rip out the sluggish and horrific table-simulating code that I built from a mess of GridLayouts and Labels.

[Swing was available as an external library for use with Java 1.1, but I had originally steered clear of this option because of the effect this would have had on the software’s download size]

Tools

When I began this project in 1999, Apache Ant was still a little-known part of the Tomcat project. I think JUnit was probably around then but it was not so well-known (certainly not to me). I wrote the original source code in the venerable PFE, a capable editor but it did not even have syntax-highlighting. I built the code using a Windows 98 batch file. The idea of writing automated unit tests did not even occur to me. These days IntelliJ IDEA, Ant and TestNG are essential to my development efforts.

I have no doubt that people will still be writing Java code in another decade, though how significant it will be, and how far removed from today’s code, remains to be seen.  Maybe the rest of us will have all switched to JavaNG?
