IntelliJ IDEA Goes Open Source

Posted in Java by Dan on October 15th, 2009

Big news from JetBrains.  Their IntelliJ IDEA Java IDE will be offered in a free Open Source edition from version 9.0.  The free version will include all the JavaSE features, version control integrations for CVS, Git and Subversion, support for unit testing with JUnit and TestNG, and language support for Groovy and Scala.  The full paid-for version will add the enterprise Java tools, UML diagrams, more version control integrations, and language support for SQL, PHP, JavaScript and more.

IntelliJ IDEA is, in my opinion, faster and more powerful than NetBeans and slicker and more intuitive than Eclipse.

As well as opening up the Java IDE, JetBrains are open-sourcing the IntelliJ platform, which is the foundation of all of their IDEs.  As a bonus, they’ve chosen to use the permissive Apache Licence rather than the less commercially friendly GPL.

At the time of writing, the JetBrains web server appears to have crumbled under the weight of the traffic that this announcement has prompted.

Evolutionary Computation in Java – ECJ, JGAP and Watchmaker Compared

Posted in Evolutionary Computation, Java by Dan on September 17th, 2009

In the days before the Watchmaker Framework the two most popular Java evolutionary computation libraries were probably ECJ (Evolutionary Computation [in/for] Java) and JGAP (Java Genetic Algorithms Package). Since the advent of Watchmaker the two most popular Java evolutionary computation libraries are probably ECJ and JGAP. So that worked out well then, but at least the Watchmaker Framework is now being mentioned alongside these established projects.

A recent series of posts over at the Hidden Clause blog compares the pros and cons of implementing the same genetic programming (GP) examples using each of the three frameworks. The article about the Watchmaker Framework is here. In the final analysis Watchmaker is ranked second, ahead of JGAP but behind ECJ, with mostly positive comments (particularly in respect to Watchmaker being the only one of the three frameworks that takes advantage of Java 5’s generics).

Ultimately it seems that the lack of specialist GP support is the main black mark against Watchmaker in this review. Watchmaker is, in its current incarnation, a more general-purpose evolutionary computation framework rather than a specialist system for genetic programming. There is proof-of-concept GP code included but the classes are not part of the core framework. Eventually I would like to include proper GP support, either as part of the core library or as an official add-on library.

I understand that there will be further articles on GP in Java (using ECJ, JGAP and Watchmaker) at Hidden Clause, so it might be worth subscribing if you are interested in this kind of stuff.

Watchmaker Framework for Evolutionary Computation – Version 0.6.2

Posted in Evolutionary Computation, Java by Dan on September 13th, 2009

This is a bug fix release that addresses a couple of issues with thread management. In version 0.6.1, if you were creating and discarding multiple ConcurrentEvolutionEngines, the threads from the discarded engines would not be cleared up properly.  This could eventually lead to OutOfMemoryErrors if you created a large number of evolution engines.

In version 0.6.2, all ConcurrentEvolutionEngines share a common thread pool so there is no need to create and destroy additional threads.

0.6.2 also reverts the switch to non-daemon threads in version 0.6.1. If you were having problems with the JVM not exiting when your program completed, this should fix it.
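
As a rough illustration of these two changes (this is not the framework's actual code, and the class and method names are hypothetical), a single pool shared by every engine and built from daemon threads might look something like this:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not Watchmaker's source: one worker pool shared by all engines.
final class SharedWorkerPool
{
    // Created once and reused, so discarding an engine leaves no threads behind.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors(),
        SharedWorkerPool::newDaemonThread);

    private SharedWorkerPool() {}

    // Daemon threads do not prevent the JVM from exiting when the program completes.
    static Thread newDaemonThread(Runnable task)
    {
        Thread thread = new Thread(task, "shared-worker");
        thread.setDaemon(true);
        return thread;
    }

    static ExecutorService get()
    {
        return POOL;
    }
}
```

Every caller of SharedWorkerPool.get() receives the same executor, which is the property that removes the need to create and destroy threads per engine.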

JDK7 Tackles Java Verbosity

Posted in Java by Dan on August 29th, 2009

The Java Language changes accepted for inclusion in JDK7 have been announced by Joseph Darcy. We already knew that closures were off the menu.  So too, unfortunately, is language support for arbitrary-precision arithmetic. The final list is pretty non-controversial and includes a number of changes that will reduce the verbosity of Java programs (one of the main criticisms of Java from proponents of other languages). Java will never be as terse as Perl or Haskell, but that’s no bad thing. One of the strengths of Java is its readability. There are however some areas where the language is needlessly verbose and that’s what these changes are addressing.

Simplified Generics

The last major revision of the Java language was Java 5.0, which introduced generics, auto-boxing, enums, varargs and annotations. Despite the compromises of type erasure, generics have been a major improvement to the language. They have also contributed to the verbosity of Java code. The necessity to specify, in full, both the reference type and value type of a field or variable has led to some very long declarations:

Map<String, List<BigDecimal>> numberMap = new TreeMap<String, List<BigDecimal>>();

JDK7’s proposed diamond notation allows the programmer to omit the generic parameters on the right-hand side if they are the same as those on the left:

Map<String, List<BigDecimal>> numberMap = new TreeMap<>();

Collection Literals

The long overdue addition of collection literals will help to reduce the size of Java code and make it more readable. Lists, sets and maps can be created and populated without the need for cumbersome instance initialisers:

List<Integer> powersOf2 = {1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024};
Map<String, Integer> ages = {"John" : 35, "Mary" : 28, "Steve" : 42};
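
For comparison, here is a sketch of the instance-initialiser idiom that the proposed literals are meant to replace. It is ordinary Java 5 code; the class and method names are just for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InitialiserIdiom
{
    // Anonymous subclass of ArrayList whose instance initialiser populates the list.
    static List<Integer> powersOf2()
    {
        return new ArrayList<Integer>()
        {{
            for (int i = 0; i <= 10; i++)
            {
                add(1 << i);
            }
        }};
    }

    // The same trick for a map: each entry needs its own put() call.
    static Map<String, Integer> ages()
    {
        return new HashMap<String, Integer>()
        {{
            put("John", 35);
            put("Mary", 28);
            put("Steve", 42);
        }};
    }
}
```

Each use of this idiom creates an extra anonymous class, which is part of what makes it cumbersome.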

Automatic Resource Management

Josh Bloch’s proposal for automatic resource management gives Java an alternative to C++’s RAII and C#’s using. It eliminates much of the boiler-plate exception handling that surrounds the proper creation and disposal of resources, such as IO streams, in Java code. The proposal introduces a new interface, Disposable, that resources will implement. The syntax of try/catch/finally is extended to allow resources to be specified at the start. These resources are then automatically disposed upon completion. Here’s an example of the new syntax in action (taken from the proposal):

static String readFirstLineFromFile2(String path) throws IOException
{
    try (BufferedReader reader = new BufferedReader(new FileReader(path)))
    {
        return reader.readLine();
    }
}
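
To show what is being saved, this is a sketch of the same method written with the explicit try/finally that current Java requires. The class name is hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class ManualCleanup
{
    static String readFirstLineFromFile(String path) throws IOException
    {
        BufferedReader reader = new BufferedReader(new FileReader(path));
        try
        {
            return reader.readLine();
        }
        finally
        {
            // Runs whether readLine() returns normally or throws.
            reader.close();
        }
    }
}
```

With two or more resources the nesting gets deeper for each one, which is where the new syntax pays off most.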

Other Changes

As well as the above changes to tackle verbosity, JDK7 adds binary integer literals and the ability to use String literals in switch statements.  JDK7 will also fix the problem of mixing varargs parameters with generic types.
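
A short sketch of the first two of these additions, using the syntax from the proposals. The class, constant and method names are made up for the example.

```java
class Jdk7SmallChanges
{
    // Binary literal: the bit pattern is visible directly in the source.
    static final int MIDDLE_BITS = 0b0110;

    // Switch on a String instead of a chain of if/else equals() calls.
    static int daysIn(String month)
    {
        switch (month)
        {
            case "February":
                return 28; // Leap years ignored to keep the sketch short.
            case "April":
            case "June":
            case "September":
            case "November":
                return 30;
            default:
                return 31; // Also returned for unrecognised names.
        }
    }
}
```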

Java 6 for 32-bit Macs…finally?

Posted in Java, Mac by Dan on August 25th, 2009

Apple’s OS X 10.6, code-named Snow Leopard, is released on Friday.  There is some suggestion that this will finally deliver Java 6 for 32-bit Intel Macs (more than two-and-a-half years after it debuted on other platforms). The news reaches me via James at DZone, who cites Axel’s blog, which in turn links to this 2-month-old post as evidence. There’s no primary source identified and, given Apple’s legendary pre-release silence, this is unlikely to be confirmed until some Java developer with a 32-bit Mac actually tries the Snow Leopard upgrade.

At present there are two not-entirely-satisfactory options for Java 6 development on 32-bit Mac hardware. The first is to use SoyLatte, which is fine for non-GUI work but only supports Swing under X11. The other option is to run the JVM under another OS via the magic of Parallels or VirtualBox.

Assuming that this rumour is true (and I remain sceptical), the key question is will this update be made available to Tiger and Leopard users via Software Update, or is an OS upgrade necessary? The Leopard-to-Snow-Leopard upgrade is reasonably priced but Apple’s site implies that if you are upgrading from an earlier version your only option is the more expensive Mac Box Set (which also includes the latest versions of iLife and iWork).

UPDATE (28th August): It seems that the Snow Leopard “upgrade” is actually a full version of the Operating System and can be used to upgrade machines running Tiger. However, to do so might be a breach of the End User Licence Agreement.

UPDATE (29th August): I asked on Stack Overflow whether anybody could confirm the presence of Java 6 on 32-bit Macs.  The question got bounced to the new Super User site, but I did get a couple of positive responses.  So it seems that yes, Java 6 is finally available to owners of 32-bit Macs, but only if you upgrade to Snow Leopard.

Watchmaker Framework for Evolutionary Computation – Version 0.6.1: Terracotta Clustering and more…

Posted in Evolutionary Computation, Java by Dan on August 3rd, 2009

I’ve just uploaded version 0.6.1 of the Watchmaker Framework for Evolutionary Computation.  If you’re not already familiar with the project, it is a library for implementing evolutionary/genetic algorithms in Java.  It’s multi-threaded, cross-platform, fast and has a modern, unobtrusive and flexible API.

API Improvements

One user-requested addition to the API in this release is the getSatisfiedTerminationConditions method.  This makes it easy to determine which termination condition (elapsed time, generation count, stagnation, etc.) caused the evolution to terminate when you are using multiple termination conditions.

The API documentation has also been improved in a few places to make things clearer. Firstly, the framework does not support negative fitness scores.  In previous releases it may have worked under some circumstances, but it was undefined behaviour.  In this release you will get an IllegalArgumentException if you try it.

Secondly, if you are using an EvolutionObserver to update a Swing GUI, be careful not to overwhelm the AWT thread with updates (this can happen if you are processing dozens or hundreds of generations per second). It appears that the framework is so fast that AWT can’t keep up.  If this happens (it’s more likely with small population sizes), the GUI will become sluggish and unresponsive.  This problem is mitigated by minimising the work that your EvolutionObserver does on the AWT thread or by only updating the GUI every nth generation.
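
The "every nth generation" approach can be sketched as below. ThrottledUpdater is a hypothetical helper, not part of the Watchmaker API, and it assumes that your observer can obtain the current generation number and pass it in.

```java
import java.util.concurrent.Executor;

// Hypothetical helper, not part of Watchmaker: forwards only every nth generation.
class ThrottledUpdater
{
    private final int interval;
    private final Executor guiThread;
    private final Runnable guiUpdate;

    ThrottledUpdater(int interval, Executor guiThread, Runnable guiUpdate)
    {
        if (interval < 1)
        {
            throw new IllegalArgumentException("interval must be at least 1");
        }
        this.interval = interval;
        this.guiThread = guiThread;
        this.guiUpdate = guiUpdate;
    }

    // Call this from the observer callback with the current generation number.
    void generationCompleted(int generationNumber)
    {
        if (generationNumber % interval == 0)
        {
            // Only one in every 'interval' generations reaches the GUI thread.
            guiThread.execute(guiUpdate);
        }
    }
}
```

In a Swing program the executor could be SwingUtilities::invokeLater, so that the update runs on the AWT thread while the skipped generations cost it nothing.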

Distributed Fitness Evaluations with Terracotta

There have also been some internal modifications to make the framework more amenable to clustering with Terracotta.  Terracotta can now be used to distribute the workload across multiple machines.  It’s only at the proof-of-concept stage at the moment – there is no support for handling node failures.  It’s also only really worthwhile for evolutionary programs that have expensive fitness functions.  The fitness function has to be expensive enough to justify the cost of transferring the candidate across the network for evaluation on another machine, otherwise clustering just makes things slower.

I will likely provide more detail on how to use Watchmaker with Terracotta in a future article, but for now here’s what to do if you want to try it out.  The field that you need to configure Terracotta to share is the private workQueue field in the org.uncommons.watchmaker.framework.FitnessEvaluationWorker class. Run your unmodified program using Terracotta, then run extra instances of the FitnessEvaluationWorker on other nodes.

Remember, you also have the option of using Hadoop with Watchmaker (via the Apache Mahout project).

See the changelog for full details of changes in this release.

10 Tips for Publishing Open Source Java Libraries

Posted in Java, Software Development by Dan on July 29th, 2009

One of the strengths of the Java ecosystem is the huge number of open source libraries that are available.  There are often several alternatives when you need a library that provides some specific functionality.  Some library authors make it easy to evaluate and use their libraries while others don’t.  Open source developers may not care whether their libraries are widely used but I suspect that many are at least partially motivated by the desire to see their projects succeed.  With that in mind, here’s a checklist of things to consider to give your open source Java library the best chance of widespread adoption.

1. Make the download link prominent.

If other people can’t figure out how to download your project, it’s not going to be very successful. I’m bemused by the number of open source projects that hide their download links some place obscure. Put it in a prominent location on the front page. Use the word “download” and use large, bold text so that it can’t be missed.

2. Be explicit about the licence.

Potential users will want to know whether your licensing is compatible with their project. Don’t make users have to download and unzip your software in order to find out which licence you use. Display this information prominently on the project’s home page (don’t leave it hidden away in some dark corner of SourceForge’s project pages).

3. Prefer Apache, BSD or LGPL rather than GPL.

Obviously you are free to release your library under any terms that you choose. It’s your work and you get to decide who uses it and how. That said, while the GPL may be a fine choice for end user applications, it doesn’t make much sense for libraries. If you pick a copyleft licence, such as the GPL, your library will be doomed to irrelevance.  Even the Free Software Foundation acknowledges this (albeit grudgingly), hence the existence of the LGPL.

The viral nature of the GPL effectively prevents commercial exploitation of your work.  This may be exactly what you want, but it also prevents your library from being used by open source projects that use a more permissive licence.  This is because they would have to abandon the non-copyleft licence and switch to your chosen licence. That isn’t going to happen.

4. Be conservative about adding dependencies.

Every third-party library that your library depends on is a potential source of pain for your users. They may already depend on a different version of the same library, which can lead to JAR Hell (such problems can be mitigated by using a tool such as Jar Jar Links to isolate dependencies). Injudicious dependencies can also greatly increase the size of your project and every project that uses it.  Don’t introduce a dependency unless it adds real value to your library.

5. Document dependencies.

Ideally you should bundle all dependent JARs with your distribution. This makes it much easier for users to get started. Regardless, you should document exactly which versions of which libraries your library requires. NoClassDefFoundError is not the most friendly way to communicate this information.

6. Avoid depending on a logging framework.

Depending on a particular logging framework will cause a world of pain for half of your users. Some people like to use Sun’s JDK logging classes to avoid an external dependency, while others prefer Log4J because Sun’s JDK logging isn’t very good. SimpleLog is another alternative.

If you pick the “wrong” logging framework you force your users to make a difficult choice.  Either they maintain two separate logging mechanisms in their application, or they replace their preferred framework with the one you insisted that they use, or (more likely) they replace your library with something else.

For most small to medium sized libraries logging is not a necessity. Problems can be reported to the application code via exceptions and can be logged there.  Incidental informational logging can usually be omitted (unless you’ve written something like Hibernate, which really does need trace logging so that you can figure out what is going on).

7. If you really need logging, use an indirect dependency.

OK, so not all libraries can realistically avoid logging.  The solution is to use a logging adapter such as SLF4J.  This allows you to write log messages and your users to have the final say over which logging back-end gets used.

8. Make the Javadocs available online.

Some libraries only include API docs in the download or, worse still, don’t generate it at all.  If you’re going to have API documentation (and it’s not exactly much effort with Javadoc), put it on the website. Potential users can get a feel for an API by browsing its classes and methods.

9. Provide a minimal example.

In an ideal world your library will be accompanied by a beautiful user manual complete with step-by-step examples for all scenarios. In the real world all we want is a code snippet that shows how to get started with the library. Your online Javadocs can be intimidating if we don’t know which classes to start with.

10. Make the JAR files available in a Maven repository.

This is one that I haven’t really followed through on properly for all of my projects yet, though I intend to. That’s because I don’t use Maven, but some people like to. These people will be more likely to use your library if you make the JAR file(s) available in a public Maven repository (such as Java.net’s). You don’t have to use Maven yourself to do this as there is a set of Ant tasks that you can use to publish artifacts.

Escape Analysis in Java 6 Update 14 – Some Informal Benchmarks

Posted in Java by Dan on May 31st, 2009

Sun recently released update 14 of the Java 6 JDK and JRE.  As well as the usual collection of bug fixes, this release includes some experimental new features designed to improve the performance of the JVM (see the release notes).  One of these is Escape Analysis.

To see what kind of impact escape analysis might have on my applications, I decided to try it on a couple of my more CPU-intensive Java programs.  Escape analysis is turned off by default since it is still experimental.  It is enabled using the following command-line option:

-XX:+DoEscapeAnalysis
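
For context, this is the kind of code that escape analysis targets: a short-lived object that never leaves the method that creates it. The class below is a made-up sketch, not one of the benchmarked programs, and whether the JVM actually avoids the allocation depends on its version and its inlining decisions.

```java
class EscapeCandidate
{
    // Small value object that is only ever used inside sumOfSquaredLengths().
    static final class Point
    {
        final double x;
        final double y;

        Point(double x, double y)
        {
            this.x = x;
            this.y = y;
        }

        double lengthSquared()
        {
            return x * x + y * y;
        }
    }

    static double sumOfSquaredLengths(int count)
    {
        double total = 0;
        for (int i = 0; i < count; i++)
        {
            // The Point is never stored or returned, so it does not escape this method.
            Point p = new Point(i, i + 1);
            total += p.lengthSquared();
        }
        return total;
    }
}
```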

Benchmark 1

The first program I tested is a statistical simulation.  Basically it generates millions of random numbers (using Uncommons Maths naturally) and does a few calculations.

VM Switches: -server
95 seconds

VM Switches: -server -XX:+DoEscapeAnalysis
73 seconds

Performance improvement using Escape Analysis: 23%

Benchmark 2

The second program I tested is an implementation of non-negative matrix factorisation.

VM Switches: -server
22.6 seconds

VM Switches: -server -XX:+DoEscapeAnalysis
20.8 seconds

Performance improvement using Escape Analysis: 8%

Conclusions

These benchmarks are neither representative nor comprehensive.  Nevertheless, for certain types of program the addition of escape analysis appears to be another significant step forward in JVM performance.

Watchmaker 0.6.0 – Evolutionary Computation for Java

Posted in Evolutionary Computation, Java by Dan on April 26th, 2009

Version 0.6.0 of the Watchmaker Framework for Evolutionary Computation is now available for download. This release incorporates several minor changes that I’ve been making over the last few months.  Consult the changelog for full details, but here are the highlights:

Numerous Improvements to the Evolution Monitor and other Swing Components

The Watchmaker Swing library provides a collection of GUI components that simplify the process of building user interfaces for evolutionary programs. These components have received many improvements for version 0.6.0. As well as controls for manipulating evolution parameters while the program is running, the library also provides an Evolution Monitor component. This provides real-time information about the state of the program, including a view of the fittest candidate so far and a graph showing changes in population fitness over time.

Upgraded to Uncommons Maths 1.2

This means even faster RNGs are available for you to use. It also means that we now use the Uncommons Maths Probability class rather than duplicating it in the framework (this means you may have to change some imports in your code when upgrading from Watchmaker 0.5.x).

Caching Fitness Evaluator

Version 0.6.0 introduces the CachingFitnessEvaluator class. This is a wrapper that provides caching for existing FitnessEvaluator implementations. The results of fitness evaluations are cached so that if the same candidate is evaluated twice, the expense of the fitness calculation can be avoided the second time. The cache uses weak references in order to avoid memory leakage.

Caching of fitness values can be a useful optimisation in situations where the fitness evaluation is expensive and there is a possibility that some candidates will survive from generation to generation unmodified. Programs that use elitism are one example of candidates surviving unmodified. Another scenario is when the configured evolutionary operator does not always modify every candidate in the population for every generation.

Caching of fitness scores is provided as an option rather than as the default Watchmaker Framework behaviour because caching is only valid when fitness evaluations are isolated and repeatable. An isolated fitness evaluation is one where the result depends only upon the candidate being evaluated. This is not the case when candidates are evaluated against the other members of the population.
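
The general idea of a caching wrapper with weak references can be sketched as follows. This is not the source of CachingFitnessEvaluator; SimpleFitness and CachingFitness are hypothetical names, and the simplified interface stands in for the framework's own.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for a fitness function; not the Watchmaker interface.
interface SimpleFitness<T>
{
    double fitnessOf(T candidate);
}

// Wraps another fitness function and remembers scores for candidates already seen.
class CachingFitness<T> implements SimpleFitness<T>
{
    private final SimpleFitness<T> delegate;

    // Weak keys: an entry can be collected once nothing else references the candidate.
    private final Map<T, Double> cache =
        Collections.synchronizedMap(new WeakHashMap<T, Double>());

    CachingFitness(SimpleFitness<T> delegate)
    {
        this.delegate = delegate;
    }

    public double fitnessOf(T candidate)
    {
        Double cached = cache.get(candidate);
        if (cached == null)
        {
            // Cache miss: pay for the real evaluation once and remember the result.
            cached = delegate.fitnessOf(candidate);
            cache.put(candidate, cached);
        }
        return cached;
    }
}
```

Two threads that miss on the same candidate at the same moment will both evaluate it, which is harmless as long as the evaluation is repeatable, the same condition that makes caching valid in the first place.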

Mona Lisa Example

After seeing Roger Alsing’s evolution of the Mona Lisa, I was inspired to try to reproduce it using the Watchmaker Framework. I didn’t follow Roger’s methodology but I have come up with something similar. My results aren’t as impressive as his latest efforts but may be interesting anyway. This example was actually included in version 0.5.1 but I didn’t draw attention to it. In 0.6.0 I’ve improved performance and used it to demonstrate the Watchmaker GUI components mentioned above.  You can try it for yourself here.  Maybe you can come up with a combination of parameters that works better than the defaults I have provided?

Useful Watchmaker Links

If you are new to Evolutionary Computation in Java,  these previous articles may be of interest:

The Java Language Features that Nobody Uses

Posted in Java by Dan on April 17th, 2009

I read Anthony Goubard’s “Top 10 Unused Java Features” on JavaLobby earlier today. I agree with some of his selections but I think he missed out a few key features that nobody uses. Restricting myself to just language features (the API is too huge), here are four more widely unused features of Java.

4. The short data type

You use it? I don’t believe you. Everybody* uses int when they want integers, even if they don’t need a 32-bit range.

3. Octal Literals

Who uses Octal these days?** Hexadecimal is a more useful shorthand for binary values. Worse, the leading-zero notation for Octal literals is just confusing:

int a = 60;
int b = 060;
System.out.println(a + b); // Prints 108.

2. Local Classes

Java has four types of nested class, three of which are widely used. As well as static nested classes, named inner classes and anonymous inner classes, you can also define named classes within methods, though it’s rare to see one in the wild.

public class TopLevelClass
{
    public void someMethod()
    {
        class LocalClass
        {
            // Some fields and methods here.
        }
 
        LocalClass forLocalPeople = new LocalClass();
    }
}

1. Strict FP

There is probably a programmer out there somewhere for whom Java’s strictfp is vital, but I haven’t met him or her. If you already know what strictfp is used for then you are probably in the top 5% of Java programmers. If you don’t know what strictfp does, here you go, welcome to the top 5%. It’s basically about making sure that your calculations are equally wrong on all platforms.

* OK, maybe you used to be a C programmer.
** Here’s your rhetorical answer.
