New Adventures in Software

In Praise of JiBX

Posted in Java by Dan on July 16th, 2007

JiBX is a library for XML data-binding. I’ve used it on a few projects over the last year or so. I used to use Castor XML. Castor had its quirks, but it worked. I decided to explore JiBX as an alternative when I saw the eye-opening BindMark performance benchmarks. JiBX’s elegant mapping mechanism made the conversion of the project I was working on pretty straightforward (without any of the awkward corner cases I came across with Castor). One quick performance test was enough to confirm that I wouldn’t ever be going back to Castor. The conversion to JiBX made the XML <-> POJO translations nine times faster.

The simpler mapping and astonishing performance boost were not the only plus-points. JiBX achieves its performance by using post-compilation bytecode manipulation rather than the reflection approach adopted by Castor and other similarly invasive XML data-binding solutions. This has the advantage that it places fewer demands on the way that you write your mapped classes. There is no need for getters, setters and no-arg constructors just to satisfy the tools. Most of the time you can write the class without considering mapping issues and then map it without modifications.

Denis Sosnoski shows what can be achieved by a single developer project. The code is clean, the documentation is good and the end result is a piece of software that I wouldn’t hesistate to recommend.

BigDecimal Gotchas and the need for Overloaded Operators

Posted in Java by Dan on June 25th, 2007

…or why Java sucks for arbitrary-precision arithmetic.

For many applications floating-point binary is bad. But BigDecimal isn’t much fun either…

I came across this discussion yesterday on the evilness of BigDecimal‘s double constructor. It’s an important point to be aware of when using the BigDecimal class. In choosing to use BigDecimal instead of double, you probably wanted an exact value but, if you use that constructor, you’re unlikely to get it. I’d quite like an IDEA inspection to warn me if I inadvertently invoke that constructor with a floating-point literal.

Unfortunately, that issue is not the only thing that you have to keep in mind when using BigDecimal. Unless you’ve read the Javadocs diligently, you maybe surprised that the equals method does not consider 2.0 to be equivalent to 2.00. This is because it takes into account the scale as well as the value. If you want to compare values irrespective of scale, you need to use the compareTo method:

if (firstValue.compareTo(secondValue) == 0)
{
     // Do something...
}

Not particulary readable. An idle glance may wrongly interpret this as a check for zero.

Of course, this means that the implementation of Comparable is inconsistent with equals. This has unpleasant consequences:

It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave “strangely” when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.

On top of all this, the BigDecimal class is not very user-friendly. Consider the equation a = b – c * d. Using doubles this would look something like this:

double a = b - c * d

That’s pretty similar to the mathematical representation. The Java programming language doesn’t provide operators for BigDecimals so we have to use instance methods. Converting the above to use BigDecimals might look something like this:

BigDecimal a = b.subtract(c).multiply(d)

Not only is this verbose, it’s also wrong. The rules of precedence have changed. With chained method calls like this, evaluation is strictly left-to-right. Instead of subtracting the product of c and d from b, we are multiplying the difference between b and c by d. We would have to rewrite it to be equivalent to the double example:

BigDecimal a = b.subtract(c.multiply(d))

This does not present an insurmountable mental challenge, but it does increase the cognitive load slightly by virtue of being different to what we are used to. We could split the computation into separate stages with intermediate variables. This might make the behaviour clearer but it’s not exactly concise.

The final point to make about the BigDecimal class is that it is immutable and, as such, each of the “operator” methods returns a new instance. Compare the add method of a Calendar or a List with the add method of a BigDecimal. One modifies its target, the other returns an entirely new object. This ought not be a problem (immutability has many advantages) but a common error is to forget to use the result of the method. IDEA has an inspection for this. Alternatively, you can use FindBugs. In the absence of this type of tool, these are the kind of issues that will be picked up by your extensive unit test suite.

Conclusions

There is not much we can do about the first two problems. The double constructor won’t be deprecated since it does serve a purpose when performing type conversions from primitive variables. Also, it seems reasonable that BigDecimal should provide a method to check equality with respect to both value and scale. Whether that method should have been the equals method is a judgment call that is not going to be changed now.

The other issues, those of readability, verbosity and bug-prone patterns of usage, can all be improved by overloading the common arithmetic operators (+, -, *, /, %, +=, -=, *=, /=, %=, — and ++).

Over-loaded operators for BigDecimals are apparently one of the language enhancements being considered for Java 7. Whether this would cover the comparison operators is not clear (I cannot find any definitive information on the proposal). The less-than, greater-than, less-than-or-equals and greater-than-or-equals operators would be straightforward. Overloading the equality operator (==) would be more problematic. Existing code may break. Also, would it be consistent with the equals method or the compareTo method?

What’s happening in Java 7?

Posted in Java by Dan on June 16th, 2007

Alex Miller has put together this very useful page that aggregates the relevant information about changes planned or proposed for Java 7. The page includes dozens of links to the various JSRs and to articles discussing the new features.

Biomorphs and other Evolutionary Algorithms in Java – Watchmaker 0.4.0

Posted in Evolutionary Computation, Java by Dan on May 26th, 2007

Version 0.4.0 of the Watchmaker Framework for Evolutionary Computation is now available for download. This release adds support for interactive evolutionary algorithms. An example applet, based on Richard Dawkin’s famous Biomorph program, demonstrates how the framework can be used with an interactive selection strategy. Using the applet you can direct the evolution of recursive pictures that resemble biological entities such as plants and insects.

The new release also adds a blazingly fast random number generator (a Java port of Tony Pasqualoni’s cellular automaton RNG). This RNG out-performs even the Mersenne Twister. By offering this RNG as an option, the framework provides potential for improved evolutionary performance.

Double Dispatch

Posted in Java by Dan on May 19th, 2007

Double dispatch is a neat trick for getting around the problem highlighted by Anders Bengtsson in his blog yesterday. The idea is similar to the visitor pattern and is particularly useful for implementing the command pattern. The Java Performance Tuning site has an article describing its use in Java.

Applying the concept to Anders’ example, we would introduce an intermediate call to the Person object that merely calls back to the invoking object to provide the necessary type information that would otherwise be unknown. This will probably be clearer with some example code.

Anders’ example had a Person base class with two sub-classes, Worker and Capitalist. There is also another class that defines overloaded methods called doStuff that operate on each of the concrete types. The problem is how do you invoke the correct doStuff method if you do not know the concrete type of a Person object at compile time?

The solution is to add an abstract method to the Person class. We’ll call this method dispatch, for want of a better name. Assuming that the class that contains the doStuff methods is called Executor, this method takes a single parameter – an instance of Executor:

public abstract void dispatch(Executor executor);

Both Worker and Capitalist would implement this method identically and simply invoke the doStuff method on the Executor, like so:

public void dispatch(Executor executor)
{
    executor.doStuff(this);
}

By implementing this in each of the concrete sub-classes rather than the abstract base class, we are providing the exact runtime type of the object to the executor and the appropriate overloaded method is invoked. This is arguably cleaner than a long chain of instanceof checks.

For the command pattern this approach has the advantage of decoupling the command objects from the behaviour that they invoke. This makes it possible to provide different “Executor” implementations to the same command objects.

Drowning in a sea of JARs

Posted in Java by Dan on May 15th, 2007

You know the situation, you’ve got a NoClassDefFoundError but you’re not sure which of 20+ JAR files you need to add to fix it. This is a question that has cropped up a few times on the Java newsgroups. There are a few solutions to this problem. A colleague of mine has a 60-line Java utility that he’s pleased with, but then terseness is not Java’s strong point. Here’s the equivalent shell script:

#!/bin/sh
find "$1" -name "*.jar" -exec sh -c 'jar -tf {}|grep -H --label {} '$2'' ;

(NOTE: The command may be wrapped when it is displayed on this page, but everything from “find…” should be on a single line).

Save this as findclass.sh (or whatever), put it on your path and make it executable. Problem solved.

The first parameter is the directory to search recursively and the second parameter is a regular expression (typically just a simple class name) to search for. The script relies on the -t option to the jar command (which lists the contents) and greps each table of contents, labelling any matches with the path of the JAR file in which it was found.

For shell-impaired Windows users, Cygwin (possibly with Poderosa) is the answer.

Watchmaker Framework for Evolutionary Computation – Version 0.3.0

Posted in Evolutionary Computation, Java by Dan on December 24th, 2006

Version 0.3.0 of the Watchmaker Framework for Evolutionary Computation is now available for download. The release includes several refinements to the API, one major bug fix (for Roulette Wheel Selection), and a new example program based on the first exercise in An Introduction to Genetic Algorithms.

This release can be considered the first stable release of the framework. There ought to be no major bugs and it is unlikely that any changes will be made to the API that will break backwards compatibility prior to the 1.0 release.

There are more features to be added (including support for GP and distributed fitness evaluations) before the framework is ready for its 1.0 release.

As always, I’m keen to receive suggestions for improvements and new features.

Beware using enum constants as Map keys in performance-sensitive code.

Posted in Java by Dan on December 17th, 2006

I discovered this by accident when replacing some String keys with an enum (in an attempt to reduce the potential for bugs in some code I had written). The change resulted in a big performance hit in a routine that repeatedly queries the HashMap.

In Java 5.0, the implementation of java.lang.Enum.hashCode() is much slower than it needs to be. Furthermore, hashCode() is final, so it can’t even be replaced by something trivial, such as simply returning the ordinal.

The performance problem should be fixed in Java 6.0 but, if you are stuck on Java 5.0, it’s worth keeping in mind when using enum constants to key a map in performance-sensitive code.

The Travelling Salesman – Darwin-style

Posted in Evolutionary Computation, Java by Dan on October 16th, 2006

The Travelling Salesman problem is a good demonstration of where evolutionary algorithms (EAs) can be useful for finding good solutions to difficult problems. This applet allows you to compare evolutionary and brute force approaches to the Travelling Salesman problem for up to 15 cities. You can tweak the parameters of the evolution to see how it affects results. The applet uses the Watchmaker Framework to implement the EA.

In a quarter of a second we can find a solution to a problem that would take three days to brute force (WARNING: Don’t try to brute force a result for 15 cities unless you have a seriously powerful machine). The EA solution is not guaranteed to be optimal (from observations it seems to be optimal about 90% of the time for this example and is “near”-optimal in the other 10% of cases).

Watchmaker Framework 0.2 Release

Posted in Evolutionary Computation, Java by Dan on October 16th, 2006

Version 0.2 of the Watchmaker Framework for Evolutionary Algorithms is now available for download from the project website (Document & Files section).

This version includes several minor tweaks plus support for concurrent fitness evaluations to take advantage of multi-core and multi-processor machines. It also includes improved example programs (more on that later).

API Documentation

« Older Posts

New Adventures in Software by Dan Dyer