New Adventures in Software

Evolving Sudoku (Watchmaker 0.4.1)

Posted in Evolutionary Computation, Java by Dan on July 21st, 2007

Everybody loves Sudoku (probably), and evolutionary computation is kind of neat, so what could possibly be better than an animated evolutionary Sudoku solver?

The applet’s animation gives a good feel for how the randomly directed search eventually converges on the right solution through the power of cumulative selection. Have a play with the population size setting to trade off performance against reliability (harder puzzles will typically require a larger population).

I’ve got lots of ideas for improvements to the Watchmaker Framework for Evolutionary Computation, but before I could get started, I first had to finish off the stuff I had been playing with. So that’s what I’ve been up to this evening and the result is version 0.4.1.

Ohloh – Forgettable name, interesting site

Posted in Evolutionary Computation, Software Development by Dan on July 16th, 2007

I discovered the Open Source project directory Ohloh yesterday (thanks to Andres Almiray’s post that I noticed on JavaBlogs). Oddly the name Ohloh is instantly forgettable and I’m having to look up the URL each time I go there.

By querying CVS and Subversion repositories, Ohloh generates information about the state of a project and ties individual contributions to developer profiles. It also allows individuals to list which Open Source projects they are users of. The site uses Google Maps to show how both contributors and users are distributed geographically.

I added my Watchmaker evolutionary computation project (the Ohloh entry is here). Unfortunately, java.net’s strategy of hosting web content in the Subversion repository does skew the statistics somewhat (my contributions appear to be mostly HTML content, but this is mainly generated Javadoc output).

Of the information presented, most pleasing for me is the nice green tick the project gets for being well commented:

Across all Java projects on Ohloh, 35% of all source code lines are comments. For Watchmaker Evolution Framework, this figure is 45%.

This high number of comments puts Watchmaker Evolution Framework among the highest one-third of all Java projects on Ohloh.

A high number of comments might indicate that the code is well-documented and organized, and could be a sign of a helpful and disciplined development team.

That’s a good reference for my CV: “Dan is a helpful and disciplined development team.” 🙂

In Praise of JiBX

Posted in Java by Dan on July 16th, 2007

JiBX is a library for XML data-binding. I’ve used it on a few projects over the last year or so. I used to use Castor XML. Castor had its quirks, but it worked. I decided to explore JiBX as an alternative when I saw the eye-opening BindMark performance benchmarks. JiBX’s elegant mapping mechanism made the conversion of the project I was working on pretty straightforward (without any of the awkward corner cases I came across with Castor). One quick performance test was enough to confirm that I wouldn’t ever be going back to Castor. The conversion to JiBX made the XML <-> POJO translations nine times faster.

The simpler mapping and astonishing performance boost were not the only plus-points. JiBX achieves its performance by using post-compilation bytecode manipulation rather than the reflection approach adopted by Castor and other similarly invasive XML data-binding solutions. This has the advantage that it places fewer demands on the way that you write your mapped classes. There is no need for getters, setters and no-arg constructors just to satisfy the tools. Most of the time you can write the class without considering mapping issues and then map it without modifications.

Denis Sosnoski shows what can be achieved by a single-developer project. The code is clean, the documentation is good and the end result is a piece of software that I wouldn’t hesitate to recommend.

BigDecimal Gotchas and the need for Overloaded Operators

Posted in Java by Dan on June 25th, 2007

…or why Java sucks for arbitrary-precision arithmetic.

For many applications floating-point binary is bad. But BigDecimal isn’t much fun either…

I came across this discussion yesterday on the evilness of BigDecimal’s double constructor. It’s an important point to be aware of when using the BigDecimal class. In choosing to use BigDecimal instead of double, you probably wanted an exact value but, if you use that constructor, you’re unlikely to get it. I’d quite like an IDEA inspection to warn me if I inadvertently invoke that constructor with a floating-point literal.
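To make the problem concrete, here is a minimal sketch comparing the double constructor with the String constructor and the valueOf factory method. The class and method names are mine, purely for illustration.

```java
import java.math.BigDecimal;

final class BigDecimalConstructors
{
    // Captures the exact binary value of the double 0.1, which is not exactly one tenth.
    static BigDecimal fromDouble()
    {
        return new BigDecimal(0.1);
    }

    // Parses the decimal text, so the result is exactly one tenth.
    static BigDecimal fromString()
    {
        return new BigDecimal("0.1");
    }

    // Goes via Double.toString, so it also yields exactly 0.1.
    static BigDecimal fromValueOf()
    {
        return BigDecimal.valueOf(0.1);
    }

    public static void main(String[] args)
    {
        System.out.println(fromDouble());  // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(fromString());  // 0.1
        System.out.println(fromValueOf()); // 0.1
    }
}
```

If you are starting from a literal, the String constructor is usually what you meant. If you are starting from a double variable, valueOf is probably the least surprising option.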

Unfortunately, that issue is not the only thing that you have to keep in mind when using BigDecimal. Unless you’ve read the Javadocs diligently, you may be surprised that the equals method does not consider 2.0 to be equivalent to 2.00. This is because it takes into account the scale as well as the value. If you want to compare values irrespective of scale, you need to use the compareTo method:

if (firstValue.compareTo(secondValue) == 0)
{
     // Do something...
}

Not particularly readable. An idle glance may wrongly interpret this as a check for zero.
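A small sketch of the difference between the two comparisons. The sameValue helper is hypothetical (it is not part of the JDK), but wrapping the compareTo idiom in a named method is one way to make the intent harder to misread.

```java
import java.math.BigDecimal;

final class ScaleComparison
{
    // Hypothetical helper: true if the two numbers are numerically equal, ignoring scale.
    static boolean sameValue(BigDecimal first, BigDecimal second)
    {
        return first.compareTo(second) == 0;
    }

    public static void main(String[] args)
    {
        BigDecimal a = new BigDecimal("2.0");
        BigDecimal b = new BigDecimal("2.00");
        System.out.println(a.equals(b));     // false (scales differ: 1 versus 2)
        System.out.println(sameValue(a, b)); // true (same numeric value)
    }
}
```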

Of course, this means that the implementation of Comparable is inconsistent with equals. This has unpleasant consequences:

It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave “strangely” when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.
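The “strange” behaviour is easy to demonstrate. In this sketch (the class and method names are mine), a HashSet relies on equals and keeps both values, while a TreeSet relies on compareTo and treats them as duplicates.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class InconsistentOrdering
{
    // Adds 2.0 and 2.00 to the given set and reports how many elements it ends up with.
    static int sizeAfterAdding(Set<BigDecimal> set)
    {
        set.add(new BigDecimal("2.0"));
        set.add(new BigDecimal("2.00"));
        return set.size();
    }

    public static void main(String[] args)
    {
        System.out.println(sizeAfterAdding(new HashSet<BigDecimal>())); // 2
        System.out.println(sizeAfterAdding(new TreeSet<BigDecimal>())); // 1
    }
}
```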

On top of all this, the BigDecimal class is not very user-friendly. Consider the equation a = b – c * d. Using doubles this would look something like this:

double a = b - c * d;

That’s pretty similar to the mathematical representation. The Java programming language doesn’t provide operators for BigDecimals so we have to use instance methods. Converting the above to use BigDecimals might look something like this:

BigDecimal a = b.subtract(c).multiply(d);

Not only is this verbose, it’s also wrong. The rules of precedence have changed. With chained method calls like this, evaluation is strictly left-to-right. Instead of subtracting the product of c and d from b, we are multiplying the difference between b and c by d. We would have to rewrite it to be equivalent to the double example:

BigDecimal a = b.subtract(c.multiply(d));

This does not present an insurmountable mental challenge, but it does increase the cognitive load slightly by virtue of being different to what we are used to. We could split the computation into separate stages with intermediate variables. This might make the behaviour clearer but it’s not exactly concise.
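For the sceptical, here is a sketch that evaluates all three forms with b = 10, c = 2 and d = 3 (values picked arbitrarily for illustration). The method names are mine.

```java
import java.math.BigDecimal;

final class Precedence
{
    // Chained calls evaluate left-to-right: (b - c) * d.
    static BigDecimal chained(BigDecimal b, BigDecimal c, BigDecimal d)
    {
        return b.subtract(c).multiply(d);
    }

    // Nesting the multiplication restores the usual precedence: b - (c * d).
    static BigDecimal nested(BigDecimal b, BigDecimal c, BigDecimal d)
    {
        return b.subtract(c.multiply(d));
    }

    // The same computation split into stages with an intermediate variable.
    static BigDecimal staged(BigDecimal b, BigDecimal c, BigDecimal d)
    {
        BigDecimal product = c.multiply(d);
        return b.subtract(product);
    }

    public static void main(String[] args)
    {
        BigDecimal b = new BigDecimal("10");
        BigDecimal c = new BigDecimal("2");
        BigDecimal d = new BigDecimal("3");
        System.out.println(chained(b, c, d)); // 24
        System.out.println(nested(b, c, d));  // 4
        System.out.println(staged(b, c, d));  // 4
    }
}
```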

The final point to make about the BigDecimal class is that it is immutable and, as such, each of the “operator” methods returns a new instance. Compare the add method of a Calendar or a List with the add method of a BigDecimal. One modifies its target, the other returns an entirely new object. This ought not to be a problem (immutability has many advantages) but a common error is to forget to use the result of the method. IDEA has an inspection for this. Alternatively, you can use FindBugs. In the absence of this type of tool, these are the kinds of issues that will be picked up by your extensive unit test suite.
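The mistake looks innocent enough in code. This sketch (with method names of my own invention) shows the buggy and corrected versions side by side.

```java
import java.math.BigDecimal;

final class IgnoredResult
{
    // BUG: add returns a new BigDecimal, which is silently discarded here.
    static BigDecimal buggyTotal(BigDecimal start, BigDecimal increment)
    {
        BigDecimal total = start;
        total.add(increment);
        return total;
    }

    // Correct: the result of add is assigned back to the variable.
    static BigDecimal correctTotal(BigDecimal start, BigDecimal increment)
    {
        BigDecimal total = start;
        total = total.add(increment);
        return total;
    }

    public static void main(String[] args)
    {
        BigDecimal ten = new BigDecimal("10");
        BigDecimal five = new BigDecimal("5");
        System.out.println(buggyTotal(ten, five));   // 10
        System.out.println(correctTotal(ten, five)); // 15
    }
}
```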

Conclusions

There is not much we can do about the first two problems. The double constructor won’t be deprecated since it does serve a purpose when performing type conversions from primitive variables. Also, it seems reasonable that BigDecimal should provide a method to check equality with respect to both value and scale. Whether that method should have been the equals method is a judgment call that is not going to be changed now.

The other issues, those of readability, verbosity and bug-prone patterns of usage, can all be improved by overloading the common arithmetic operators (+, -, *, /, %, +=, -=, *=, /=, %=, -- and ++).

Overloaded operators for BigDecimals are apparently one of the language enhancements being considered for Java 7. Whether this would cover the comparison operators is not clear (I cannot find any definitive information on the proposal). The less-than, greater-than, less-than-or-equals and greater-than-or-equals operators would be straightforward. Overloading the equality operator (==) would be more problematic. Existing code may break. Also, would it be consistent with the equals method or the compareTo method?

What’s happening in Java 7?

Posted in Java by Dan on June 16th, 2007

Alex Miller has put together this very useful page that aggregates the relevant information about changes planned or proposed for Java 7. The page includes dozens of links to the various JSRs and to articles discussing the new features.

Scholarpedia: Wikipedia with better standards?

Posted in Evolutionary Computation, Software Development by Dan on June 16th, 2007

I’ve just stumbled upon Scholarpedia, a MediaWiki-based encyclopedia. The key difference between it and Wikipedia is its focus on peer-review of content. Of course, this immediately means that it has substantially less content than Wikipedia but less is more, right? Contributors are nominated and voted in by the public based on their reputation in their area of expertise, most being notable academics.

At present Scholarpedia seems to have quite a narrow focus (most current articles are about various kinds of adaptive systems in computer science). It will be interesting to see how the project progresses. Perhaps the most promising aspect is the quality of authors who have apparently signed up to write various sections. For example, if you could ask anybody to explain Genetic Algorithms, it would probably be John Holland. And who better to write about Hopfield Networks than John Hopfield himself?

Biomorphs and other Evolutionary Algorithms in Java – Watchmaker 0.4.0

Posted in Evolutionary Computation, Java by Dan on May 26th, 2007

Version 0.4.0 of the Watchmaker Framework for Evolutionary Computation is now available for download. This release adds support for interactive evolutionary algorithms. An example applet, based on Richard Dawkins’s famous Biomorph program, demonstrates how the framework can be used with an interactive selection strategy. Using the applet you can direct the evolution of recursive pictures that resemble biological entities such as plants and insects.

The new release also adds a blazingly fast random number generator (a Java port of Tony Pasqualoni’s cellular automaton RNG). This RNG out-performs even the Mersenne Twister. By offering this RNG as an option, the framework provides potential for improved evolutionary performance.

Haskell – Ready for the mainstream?

Posted in Haskell, Software Development by Dan on May 25th, 2007

Haskell is a programming language that I like a lot, having been exposed to pure functional programming in Miranda at university. I’ve dabbled with it on and off using GHC but I’ve never written a really substantial program in it. Certainly nothing to rival the Java behemoths that I develop day-to-day. Programs written (well) in Haskell are concise, expressive and have an elegance that is hard to match in more general-purpose languages such as C and Java.

Despite the appeals of strongly-typed, side-effect-free programming, Haskell (and non-strict, pure functional languages in general) has long been regarded as primarily the domain of academics. Haskell’s design owes much to the proprietary Miranda system but, unencumbered by commercial constraints, it has rapidly supplanted its predecessor. Even the University of Kent, previously the principal champion of Miranda, has long since switched to Haskell for teaching purposes. Haskell is now a part of undergraduate computer science courses around the world but, despite this, has little visibility in the world of commercial software development.

However, all that may be about to change. This week saw the announcement of the book “Real World Haskell”, to be published by O’Reilly. If the authors deliver on the proposed outline, it promises to provide a real boost for Haskell as a viable language for mainstream software development. The book will cover topics that are often ignored by existing Haskell guides but that are essential for solving real world problems. I/O, databases, GUI development, concurrency and unit testing are just some of the items to be addressed.

At present there is nothing much to see, but I’ll be monitoring their progress closely over the coming months in eager anticipation. Hopefully, by the time the book is out, GHC will have broken free from its LGPL shackles to remove another barrier to widespread Haskell adoption.

Double Dispatch

Posted in Java by Dan on May 19th, 2007

Double dispatch is a neat trick for getting around the problem highlighted by Anders Bengtsson in his blog yesterday. The idea is similar to the visitor pattern and is particularly useful for implementing the command pattern. The Java Performance Tuning site has an article describing its use in Java.

Applying the concept to Anders’ example, we would introduce an intermediate call to the Person object that merely calls back to the invoking object to provide the necessary type information that would otherwise be unknown. This will probably be clearer with some example code.

Anders’ example had a Person base class with two sub-classes, Worker and Capitalist. There is also another class that defines overloaded methods called doStuff that operate on each of the concrete types. The problem is how to invoke the correct doStuff method when you do not know the concrete type of a Person object at compile time.

The solution is to add an abstract method to the Person class. We’ll call this method dispatch, for want of a better name. Assuming that the class that contains the doStuff methods is called Executor, this method takes a single parameter – an instance of Executor:

public abstract void dispatch(Executor executor);

Both Worker and Capitalist would implement this method identically and simply invoke the doStuff method on the Executor, like so:

public void dispatch(Executor executor)
{
    executor.doStuff(this);
}

By implementing this in each of the concrete sub-classes rather than the abstract base class, we are providing the exact runtime type of the object to the executor and the appropriate overloaded method is invoked. This is arguably cleaner than a long chain of instanceof checks.

For the command pattern this approach has the advantage of decoupling the command objects from the behaviour that they invoke. This makes it possible to provide different “Executor” implementations to the same command objects.
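Putting the pieces together, a complete sketch might look like the following. Person, Worker, Capitalist, Executor, dispatch and doStuff come from the description above. Everything else is an assumption for the sake of a runnable example: I have made Executor an interface so that implementations can be swapped, and RecordingExecutor and DoubleDispatchDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class DoubleDispatchDemo
{
    // Dispatches every person to a recording executor and returns the overloads that ran.
    static List<String> run(List<Person> people)
    {
        RecordingExecutor executor = new RecordingExecutor();
        for (Person person : people)
        {
            // The static type here is only Person; dispatch supplies the concrete type.
            person.dispatch(executor);
        }
        return executor.log;
    }

    public static void main(String[] args)
    {
        List<Person> people = new ArrayList<Person>();
        people.add(new Worker());
        people.add(new Capitalist());
        System.out.println(run(people)); // [worker, capitalist]
    }
}

abstract class Person
{
    public abstract void dispatch(Executor executor);
}

class Worker extends Person
{
    @Override
    public void dispatch(Executor executor)
    {
        executor.doStuff(this); // 'this' is known to be a Worker at compile time
    }
}

class Capitalist extends Person
{
    @Override
    public void dispatch(Executor executor)
    {
        executor.doStuff(this); // 'this' is known to be a Capitalist at compile time
    }
}

// Assumption: an interface rather than a class, so different executors can be supplied.
interface Executor
{
    void doStuff(Worker worker);

    void doStuff(Capitalist capitalist);
}

// Hypothetical executor that simply records which overload was selected.
class RecordingExecutor implements Executor
{
    final List<String> log = new ArrayList<String>();

    @Override
    public void doStuff(Worker worker)
    {
        log.add("worker");
    }

    @Override
    public void doStuff(Capitalist capitalist)
    {
        log.add("capitalist");
    }
}
```

Swapping in a different Executor implementation changes what happens for each type without touching Worker or Capitalist, which is the decoupling that makes this useful for commands.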

Drowning in a sea of JARs

Posted in Java by Dan on May 15th, 2007

You know the situation: you’ve got a NoClassDefFoundError but you’re not sure which of 20+ JAR files you need to add to fix it. This is a question that has cropped up a few times on the Java newsgroups. There are a few solutions to this problem. A colleague of mine has a 60-line Java utility that he’s pleased with, but then terseness is not Java’s strong point. Here’s the equivalent shell script:

#!/bin/sh
find "$1" -name "*.jar" -exec sh -c 'jar -tf {}|grep -H --label {} '$2'' \;

(NOTE: The command may be wrapped when it is displayed on this page, but everything from “find…” should be on a single line).

Save this as findclass.sh (or whatever), put it on your path and make it executable. Problem solved.

The first parameter is the directory to search recursively and the second parameter is a regular expression (typically just a simple class name) to search for. The script relies on the -t option to the jar command (which lists the contents) and greps each table of contents, labelling any matches with the path of the JAR file in which it was found.
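For comparison, here is a rough sketch of how the same search could be done in Java. This is not my colleague’s utility. The FindClass name and the output format are my own choices, it uses java.util.regex rather than grep’s regular expression syntax, and it reads each JAR’s table of contents with java.util.jar.JarFile instead of shelling out to the jar command.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FindClass
{
    // Returns "jarPath:entryName" for each JAR entry whose name contains a match for the regex.
    static List<String> search(Path root, String regex) throws IOException
    {
        Pattern pattern = Pattern.compile(regex);
        List<Path> jars;
        // Recursively collect every regular file under root whose name ends in .jar.
        try (Stream<Path> paths = Files.walk(root))
        {
            jars = paths.filter(Files::isRegularFile)
                        .filter(p -> p.toString().endsWith(".jar"))
                        .collect(Collectors.toList());
        }
        List<String> matches = new ArrayList<String>();
        for (Path jar : jars)
        {
            try (JarFile jarFile = new JarFile(jar.toFile()))
            {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements())
                {
                    String name = entries.nextElement().getName();
                    if (pattern.matcher(name).find())
                    {
                        matches.add(jar + ":" + name);
                    }
                }
            }
        }
        return matches;
    }

    // Usage: java FindClass <directory> <regex>
    public static void main(String[] args) throws IOException
    {
        for (String match : search(Paths.get(args[0]), args[1]))
        {
            System.out.println(match);
        }
    }
}
```

It comes in under 60 lines, but the shell script still wins on brevity.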

For shell-impaired Windows users, Cygwin (possibly with Poderosa) is the answer.


New Adventures in Software by Dan Dyer