Watchmaker Framework at JavaOne – Using Java and Genetic Algorithms to Beat the Market

Posted in Evolutionary Computation, Java by Dan on July 29th, 2011

Those of you who will be attending JavaOne this year (2nd – 6th October in San Francisco) might be interested in Matthew Ring’s BoF session on genetic algorithms in Java. Matthew tells me that he will be presenting his software that uses my Watchmaker Framework for Evolutionary Computation as part of his session Using Java and Genetic Algorithms to Beat the Market. Here’s the abstract:

Predicting the future is dangerous business. Is Java up to the task? This session discusses and demonstrates a fun and understandable stock-picking system built with Java 6 and various Java-based open source projects. The discussion ranges from machine learning concepts (specifically genetic algorithms) and core Java development with specific code examples to trading system back-testing. Attendees are sure to come away excited, ready to tackle that next hard big-money problem, and confident that Java is a capable partner.

Artificial Intelligence StackExchange – The Final Push

Posted in Evolutionary Computation by Dan on December 7th, 2010

After some months of slow progress, Jon Hawkins’ proposal to create a StackExchange site dedicated to artificial intelligence questions nears success. If you are not familiar with StackExchange, it is the software behind StackOverflow and its family of sister sites.

The artificial intelligence proposal is currently at 91% complete, with 195 people committed. That means that commitments from another 15-20 people should be enough to push it over the finishing line. If you would like to see a Q&A site dedicated to artificial intelligence, please commit to this proposal.

Update: The site is now live at ai.stackexchange.com.

Update 2: After a very brief beta period, the site has had the plug pulled due to lack of activity.

Practical Evolutionary Computation: Island Models

Posted in Evolutionary Computation, Java by Dan on February 23rd, 2010

Some time ago, I promised more details about the support for island models that was added to the Watchmaker Framework for Evolutionary Computation in version 0.7.0.  I’ve finally completed a first draft of some documentation for this feature, attempting to cover both the motivation for this approach to evolutionary algorithms and the practicalities of implementing this idea in your own programs that use Watchmaker. If anything is unclear or wrong please leave suggestions for improvements in the comments.

Island Models

In the natural world, populations of organisms might be separated by geography. Left to evolve in isolation over millions of years, vastly different species will occur in different locations. Consider Australia, an island continent protected by its seas. With little opportunity for outside organisms to interfere, and few opportunities for its land-based organisms to migrate to other land masses, Australian wildlife evolved to be distinctly different from that of other continents and countries. The majority of Australia’s plant and animal species, including 84% of its mammals, are endemic. They occur nowhere else in the world.

Australia is not the only island to exhibit such levels of endemism. It was a visit to the Galápagos Islands in 1835 that started Charles Darwin on the path to formulating his theory of evolution. Darwin noticed the pronounced differences between the species of mocking birds and tortoises present on the different islands of the archipelago and began to speculate on how such variations might have occurred.

In the world of evolutionary computation we can mimic this idea of having multiple isolated populations evolving in parallel. Having additional populations would increase the likelihood of finding a solution that is close to the global optimum. However, it is not just a question of having a larger global population. A system of 10 islands each with a population of 50 individuals is not equivalent to a single island with a population of 500. The reason for this is that the island system partitions the search. If one island prematurely converges on a sub-optimal solution it does not affect the evolution happening on the other islands; they are following their own paths. A single large population does not have this in-built resilience.

Migration

There is of course no real difference between evolving 10 completely separate islands in parallel and running the same single-population evolution 10 times in a row, other than how the computing resources are utilised. In practice the populations are not kept permanently isolated from each other and there are occasional opportunities for individuals to migrate between islands.

In nature external species have been introduced to foreign ecosystems in several ways. In an ice age the waters that previously separated two land masses might freeze providing a route for land animals to migrate to previously unreachable places. Microorganisms and insects have often strayed beyond their usual environment by hitching a ride with larger species.

The effect of introducing a foreign species to a new environment can vary. The new species might be ill-adapted to its new surroundings and quickly perish. Alternatively, a lack of natural predators may cause it to flourish, often to the detriment of indigenous species. One such example is the introduction of rabbits to Australia. Australia was a land without rabbits until the arrival of European settlers. An Englishman named Thomas Austin released 24 rabbits into the wild of Victoria in October 1859 with the intention of hunting them. If rabbits are famous for one thing it is for reproducing prodigiously. The mild winters allowed year-round breeding and the absence of any natural rabbit predators, such as foxes, allowed the Australian rabbit population to explode unchecked. Within 10 years an annual cull of two million rabbits was having no noticeable effect on rabbit numbers and the habitats of some native animals were being destroyed by the floppy-eared pests. Today there are hundreds of millions of rabbits in Australia, despite efforts to reduce the population, and the name of Thomas Austin is widely cursed for his catastrophic lack of foresight.

While such invasions of separate species provide a useful analogy for what can happen when we introduce migration into island model evolutionary algorithms, we are specifically interested in the effects of migration involving genetically different members of the same species. This is because, in our simplified model of evolution, all individuals are compatible and can reproduce. The island model of evolution provides the isolation necessary for diversity to thrive while still providing opportunities for diverse individuals to be combined to produce potentially fitter offspring.

In an island model, the isolation of the separate populations often leads to different traits originating on different islands. Migration brings these diverse individuals together occasionally to see what happens when they are combined. Remember that, even if the immigrants are weak, cross-over can result in offspring that are fitter than either of their parents. In this way, the introduction to the population of new genetic building blocks may result in evolutionary progress even if the immigrants themselves are not viable in the new population.

Islands in the Watchmaker Framework

The Watchmaker Framework for Evolutionary Computation supports islands models via the IslandEvolution class. Each island is a self-contained EvolutionEngine just like those we have been using previously for single-population evolutionary algorithms. The evolution is divided into epochs. Each epoch consists of a fixed number of generations that each island completes in isolation. At the end of an epoch migration occurs. Then, if the termination conditions are not yet satisfied, a new epoch begins.

The IslandEvolution supports pluggable migration strategies via different implementations of the Migration interface. An island version of the string evolution example from this previous article might look something like this:

IslandEvolution engine  = new IslandEvolution(5, // Number of islands.
                                              new RingMigration(),
                                              candidateFactory,
                                              evolutionaryOperator,
                                              fitnessEvaluator,
                                              selectionStrategy,
                                              rng);
 
engine.evolve(100, // Population size per island.
              5, // Elitism for each island.
              50, // Epoch length (no. generations).
              3, // Migrations from each island at each epoch.
              new TargetFitness(0, false));

We can add listeners to an IslandEvolution object, just as we can with individual EvolutionEngines. We use a different interface for this though, IslandEvolutionObserver, which provides two call-backs. The populationUpdate method reports the global state of the combined population of all islands at the end of each epoch. The islandPopulationUpdate method reports the state of individual island populations at the end of each generation.

Advanced Usage

In the example code above we specified how many islands we wanted to use and the IslandEvolution class created one GenerationalEvolutionEngine for each island. Using this approach all of the islands have the same configuration; they use the same candidate factory, evolutionary operator(s) and selection strategy. This is the easiest way to create an island system but it is also possible to construct each island individually for ultimate flexibility.

List> islands = new ArrayList>();
 
// Create individual islands here and add them to the list.
// ...
 
IslandEvolution engine  = new IslandEvolution(islands,
                                              new RingMigration(),
                                              false, // Natural fitness?
                                              rng);

One reason you might choose to construct the islands explicitly is that it makes it possible to configure individual islands differently. You may choose to have different islands use different parameters for evolutionary operators, or even to use different evolutionary operators all together. Alternatively, you could use the same evolutionary operators and parameters but have different selection strategies so that some islands have stronger selection pressure than others. You should generally use the same fitness function for all islands though, otherwise you might get some strange results.

Another possible reason for creating the islands explicitly is so you don’t have to use the standard GenerationalEvolutionEngine for the islands. You can choose to use any implementation of the EvolutionEngine interface, such as the SteadyStateEvolutionEngine class or the EvolutionStrategyEngine class. You can even use a mixture of different island types with the same IslandEvolution object.

This is the fourth entry in a short and increasingly infrequent series of articles on practical Evolutionary Computation, based on the work-in-progress documentation for the Watchmaker Framework for Evolutionary Computation.  The first article provided an introduction to the field of evolutionary computation, the second article showed how to implement a simple evolutionary algorithm using the Watchmaker Framework, and the third article briefly discussed the concept of elitism in evolutionary algorithms.

Further Reading

An Introduction to Genetic Algorithms Introduction to Evolutionary Computing The Blind Watchmaker

Open Source Graphic Design – New Watchmaker Framework Logo

Posted in Evolutionary Computation, The Internet by Dan on January 11th, 2010

A while ago I created a new website for my main Open Source project, the Watchmaker Framework for Evolutionary Computation. While the new website was a definite improvement over the previous effort, it was still lacking something. It wasn’t distinctive. What I really needed was a logo, something that visually identified the project. But how do you represent evolution and/or a watchmaker in the form of a simple, distinct picture?

It was beyond my modest artistic skills so I headed to Reddit and floated the question of how to find a graphic designer who would be willing to contribute to an Open Source project. I wasn’t expecting much, after all I was asking somebody to work for free on a fairly obscure project.

The usual way to get somebody to make you a logo is to find a professional graphic designer and pay them, or to stump up some cash to fund a contest on worth1000.com or 99designs.com. The prices start at $204 at the latter. I got a few suggestions from the Reddit crowd that I should try this monetary compensation idea.

I also got several people offering to produce a logo for me free-of-charge, and even a few who spontaneously decided to create something and submit it as a suggestion (see some of the links in the Reddit thread).  I really wasn’t sure if there would be many graphic designers who were interested in helping out Open Source software projects but it seems there are plenty. What is lacking is somewhere on the web to connect these willing designers with needy projects.

One of the people who got in touch to offer his services was Charles Burdett. He sent me a link to his impressive portfolio. I quickly accepted Charles’ offer before he had a chance to change his mind. This meant turning down a number of other offers that I received. I’m extremely grateful to all of the people who were willing to help me out, and some of the concepts that people suggested would have made very good logos.

Charles came up with the concept that now adorns the Watchmaker project website – the clockwork Ichthyostega. To be honest, I don’t even know how to pronounce that but Wikipedia tells me that Ichthyostega was a creature from the Upper Devonian Period. It was an intermediate form between fish and amphibian, so a significant step in the evolution of life on this planet.

This logo fits in very nicely with the existing site design and, because it’s a simple outline drawing, it should be very versatile for use in different contexts. I’m very happy with the result and I’d like to thank Charles for all of his work on this. If you like this logo, or any of Charles’ other work, he is available for hire.

Watchmaker Framework 0.7.1 – Evolution Strategies

Posted in Evolutionary Computation, Java by Dan on January 9th, 2010

The Watchmaker Framework for Evolutionary Computation has reached version 0.7.1.  This is an incremental release that refines a couple of the changes made in the substantial 0.7.0 update.  It also adds support for Evolution Strategies to complement the existing support for generational and steady-state evolutionary/genetic algorithms.

In addition, the (still incomplete) user manual has a new section on the different selection strategies that are available.

Watchmaker 0.7.0 – Island models, steady-state evolution and more

Posted in Evolutionary Computation, Java by Dan on December 15th, 2009

The Watchmaker Framework for Evolutionary Computation has reached version 0.7.0. If you’re new here, Watchmaker is a Java library for implementing evolutionary/genetic algorithms. This release is the most substantial update for some time.

Backwards-Incompatibilities

Firstly, I’ve refactored the evolution engine so that it is not as tightly tied to the standard generational model of evolution. What this means in practice is that the class formerly known as ConcurrentEvolutionEngine is now called GenerationalEvolutionEngine. For most users this should be the only backwards-incompatibility in this release. The constructor arguments are the same and all the old methods are still there so it should just be a case of updating your code to use the new name. If you were using SequentialEvolutionEngine, and I think this mostly affects the Mahout guys, this class no longer exists. Concurrency is no longer managed via different sub-classes. Instead you should also use GenerationalEvolutionEngine and call the new setSingleThreaded method to force all work to be done synchronously on the request thread.

If you use Watchmaker to evolve BitString objects, this release improves the performance of the provided mutation and cross-over operators. One side-effect of this is that the semantics of the BitStringMutation class have been modified, so you may need to adjust your parameters.

Steady-State Evolution

OK, on to the good stuff. The reason for the evolution engine changes was to make it easier for the framework to support different types of evolutionary algorithm. The benefits of these changes will become more apparent in subsequent releases but there is one example in this release. The framework now has first-class support for steady-state evolution via the new SteadyStateEvolutionEngine. In steady-state evolution the population is evolved one member at a time rather than all members in parallel. This was possible in previous versions of the Watchmaker Framework but only via a nasty hack.

Island Model Evolution

The major new feature in 0.7.0 is support for island model evolution. Instead of evolving a single population, you evolve multiple populations in parallel. The idea is that the different populations will randomly proceed in different directions making it less likely that the evolution will get stuck at a local optimum. What makes this different from simply evolving a single population multiple times is that there is periodic migration of individuals between islands. The motivation behind the migration is that combining independently evolved individuals with different traits will hopefully result in even fitter offspring. There’s a lot more detail that I’m glossing over that I intend to cover in a later post, but if you want to get started right away the IslandEvolution class is the place to start.

Sigma-Scaling Selection

0.7.0 also includes a new selection strategy. SigmaScaling uses the population’s fitness standard deviation to regulate the selection pressure, making premature convergence less likely.

More Information

Demo Applets

New Watchmaker Framework Website / Development Roadmap

Posted in Evolutionary Computation by Dan on December 2nd, 2009

The Watchmaker Framework for Evolutionary Computation has a new website at http://watchmaker.uncommons.org. The Java.net web hosting is slow and restrictive, so I’ve decided to self-host the project pages.

The project will continue to use the other Java.net tools such as Subversion, Issuezilla and the project forum for the foreseeable future, though I am weighing up the benefits and drawbacks of a possible relocation to GitHub.

I have no plans to move my other projects that are currently on Java.net (Uncommons Maths and ReportNG being the main ones).

Part of the new Watchmaker website is the development roadmap, which should answer some of the questions I’ve been asked about where the project is heading. Most of the work for version 0.7.0 is done and in Subversion, there are just a couple of rough edges to be worked out. I hope to do a release before the new year. Version 1.0 seems as far away as it did 3 and a half years ago.

Evolutionary Computation in Java – ECJ, JGAP and Watchmaker Compared

Posted in Evolutionary Computation, Java by Dan on September 17th, 2009

In the days before the Watchmaker Framework the two most popular Java evolutionary computation libraries were probably ECJ (Evolutionary Computation [in/for] Java) and JGAP (Java Genetic Algorithms Package). Since the advent of Watchmaker the two most popular Java evolutionary computation libraries are probably ECJ and JGAP. So that worked out well then, but at least the Watchmaker Framework is now being mentioned alongside these established projects.

A recent series of posts over at the Hidden Clause blog compares the pros and cons of implementing the same genetic programming (GP) examples using each of the three frameworks. The article about the Watchmaker Framework is here. In the final analysis Watchmaker is ranked second, ahead of JGAP but behind ECJ, with mostly positive comments (particularly in respect to Watchmaker being the only one of the three frameworks that takes advantage of Java 5’s generics).

Ultimately it seems that the lack of specialist GP support is the main black mark against Watchmaker in this review. Watchmaker is, in its current incarnation, a more general-purpose evolutionary computation framework rather than a specialist system for genetic programming. There is proof-of-concept GP code included but the classes are not part of the core framework. Eventually I would like to include proper GP support, either as part of the core library or as an official add-on library.

I understand that there will be further articles on GP in Java (using ECJ, JGAP and Watchmaker) at Hidden Clause, so it might be worth subscribing if you are interested in this kind of stuff.

Watchmaker Framework for Evolutionary Computation – Version 0.6.2

Posted in Evolutionary Computation, Java by Dan on September 13th, 2009

This is a bug fix release that addresses a couple of issues with thread management. In version 0.6.1, if you were creating and discarding multiple ConcurrentEvolutionEngines, the threads from the discarded engines would not be cleared up properly.  This could eventually lead to OutOfMemoryErrors if you created a large number of evolution engines.

In version 0.6.2, all ConcurrentEvolutionEngines share a common thread pool so there is no need to create and destroy additional threads.

0.6.2 also reverts the switch to non-daemon threads in version 0.6.1. If you were having problems with the JVM not exiting when your program completed, this should fix it.

Watchmaker Framework for Evolutionary Computation – Version 0.6.1: Terracotta Clustering and more…

Posted in Evolutionary Computation, Java by Dan on August 3rd, 2009

I’ve just uploaded version 0.6.1 of the Watchmaker Framework for Evolutionary Computation.  If you’re not already familiar with the project, it is a library for implementing evolutionary/genetic algorithms in Java.  It’s multi-threaded, cross-platform, fast and has a modern, unobtrusive and flexible API.

API Improvements

One user-requested addition to the API in this release is the getSatisfiedTerminationCondtions method.  This makes it easy to determine which termination condition (elapsed time, generation count, stagnation, etc.) caused the evolution to terminate when you are using multiple termination conditions.

The API documentation has also been improved in a few places to make things clearer. Firstly, the framework does not support negative fitness scores.  In previous releases it may have worked under some circumstances, but it was undefined behaviour.  In this release you will get an IllegalArgumentException if you try it.

Secondly, if you are using an EvolutionObserver to update a Swing GUI, be careful not to overwhelm the AWT thread with updates (this can happen if you are processing dozens or hundreds of generations per second). It appears that the framework is so fast that AWT can’t keep up.  If this happens (it’s more likely with small population sizes), the GUI will become sluggish and unresponsive.  This problem is mitigated by minimising the work that your EvolutionObserver does on the AWT thread or by only updating the GUI every nth generation.

Distributed Fitness Evaluations with Terracotta

There have also been some internal modifications to make the framework more amenable to clustering with Terracotta.  Terracotta can now be used to distribute the workload across multiple machines.  It’s only at the proof-of-concept stage at the moment – there is no support for handling node failures.  It’s also only really worthwhile for evolutionary programs that have expensive fitness functions.  The fitness function has to be expensive enough to justify the cost of transferring the candidate across the network for evaluation on another machine, otherwise clustering just makes things slower.

I will likely provide more detail on how to use Watchmaker with Terracotta in a future article, but for now here’s what to do if you want to try it out.  The field that you need to configure Terracotta to share is the  private workQueue field in the org.uncommons.watchmaker.framework.FitnessEvaluationWorker class. Run your unmodified program using Terracotta then run extra instances of the FitnessEvaluationWorker on other nodes.

Remember, you also have the option of using Hadoop with Watchmaker (via the Apache Mahout project).

See the changelog for full details of changes in this release.

« Older Posts