The Unbeatable Draughts (Checkers) Player

Posted in Evolutionary Computation, Software Development by Dan on July 21st, 2007

While some of us can just about muddle through a Sudoku, others are aiming higher. The BBC has the story of how a team from the University of Alberta has solved Checkers (or “Draughts” as we like to call it in this part of the world). The Chinook program was already well capable of beating the best human players; now it’s not even worth trying, since it cannot be beaten. Earlier incarnations of Chinook made use of evolutionary approaches. The latest release is the result of a phenomenal amount of CPU time dedicated to analysing every possible game position and determining the best move for each. A solution for Chess is still some way off.

Ohloh – Forgettable name, interesting site

Posted in Evolutionary Computation, Software Development by Dan on July 16th, 2007

I discovered the Open Source project directory Ohloh yesterday (thanks to Andres Almiray’s post that I noticed on JavaBlogs). Oddly, the name Ohloh is instantly forgettable and I’m having to look up the URL each time I go there.

By querying CVS and Subversion repositories, Ohloh generates information about the state of a project and ties individual contributions to developer profiles. It also allows individuals to list which Open Source projects they are users of. The site uses Google Maps to show how both contributors and users are distributed geographically.

I added my Watchmaker evolutionary computation project (the Ohloh entry is here). Unfortunately java.net’s strategy of hosting web content in the Subversion repository does skew the statistics somewhat (my contributions appear to be mostly HTML content, but this is mainly generated Javadoc output).

Of the information presented, most pleasing for me is the nice green tick the project gets for being well commented:

Across all Java projects on Ohloh, 35% of all source code lines are comments. For Watchmaker Evolution Framework, this figure is 45%.

This high number of comments puts Watchmaker Evolution Framework among the highest one-third of all Java projects on Ohloh.

A high number of comments might indicate that the code is well-documented and organized, and could be a sign of a helpful and disciplined development team.

That’s a good reference for my CV: “Dan is a helpful and disciplined development team.” 🙂

Scholarpedia: Wikipedia with better standards?

Posted in Evolutionary Computation, Software Development by Dan on June 16th, 2007

I’ve just stumbled upon Scholarpedia, a MediaWiki-based encyclopedia. The key difference between it and Wikipedia is its focus on peer review of content. Of course, this immediately means that it has substantially less content than Wikipedia, but less is more, right? Contributors are nominated and voted in by the public based on their reputation in their area of expertise, most being notable academics.

At present Scholarpedia seems to have quite a narrow focus (most current articles are about various kinds of adaptive systems in computer science). It will be interesting to see how the project progresses. Perhaps the most promising aspect is the quality of the authors who have apparently signed up to write various sections. For example, if you could ask anybody in the world to explain Genetic Algorithms, your first choice would probably be John Holland. And who better to write about Hopfield Networks than John Hopfield himself?

Haskell – Ready for the mainstream?

Posted in Haskell, Software Development by Dan on May 25th, 2007

Having been exposed to pure functional programming in Miranda at university, I have a lot of affection for Haskell. I’ve dabbled with it on and off using GHC but I’ve never written a really substantial program in it. Certainly nothing to rival the Java behemoths that I develop day-to-day. Programs written (well) in Haskell are concise, expressive and have an elegance that is hard to match in more general-purpose languages such as C and Java.

Despite the appeal of strongly-typed, side-effect-free programming, Haskell (and non-strict, pure functional languages in general) has long been regarded as primarily the domain of academics. Haskell’s design owes much to the proprietary Miranda system but, unencumbered by commercial constraints, it has rapidly supplanted its predecessor. Even the University of Kent, previously the principal champion of Miranda, has long since switched to Haskell for teaching purposes. Haskell is now a part of undergraduate computer science courses around the world but, despite this, has little visibility in the world of commercial software development.

However, all that may be about to change. This week saw the announcement of the book “Real World Haskell”, to be published by O’Reilly. If the authors deliver on the proposed outline, it promises to provide a real boost for Haskell as a viable language for mainstream software development. The book will cover topics that are often ignored by existing Haskell guides but that are essential for solving real world problems. I/O, databases, GUI development, concurrency and unit testing are just some of the items to be addressed.

At present there is nothing much to see, but I’ll be monitoring their progress closely over the coming months in eager anticipation. Hopefully, by the time the book is out, GHC will have broken free from its LGPL shackles to remove another barrier to widespread Haskell adoption.

Why pay for a Continuous Integration server?

Posted in Software Development by Dan on December 8th, 2006

With plenty of good, free alternatives, such as Continuum and CruiseControl, you’d need a compelling reason to spend money on per-user licensing in order to use JetBrains’ new TeamCity product.

JetBrains’ legendary reputation for productivity-boosting, easy-to-use tools has taken a bit of a dent recently (certainly within our office) thanks to the rather buggy 6.0 releases of their flagship product, IntelliJ IDEA. Regardless, they have managed to attract a legion of devoted followers eager to pay $500 each to avoid having to use Eclipse.

But can they repeat the trick for Continuous Integration? Continuum does the job with minimal fuss and is admirably simple to configure, so why pay more?

TeamCity has several nice-to-have features that put it ahead of the free competition, but not enough of them to justify $399 per user (or even the current offer price of $199 per user). Briefly, these plus points are:

  1. Support for .Net projects, as well as Java, in the same product (nice if you need it).
  2. Server-side code coverage analysis (you could get the same results by running EMMA from the Ant build.xml).
  3. Server-side static code analysis using IDEA inspections (nice, but relies on using IDEA for development – Checkstyle and FindBugs could do something similar from Ant).
  4. Flashy AJAX web console (perhaps too flashy).
  5. Pre-tested commits. Sends your changes to the CI server for building before committing to version control. Your changes are only checked in if the build succeeds and all tests pass.

So if, like me, you got a free TeamCity licence with your IDEA 6.0 purchase, there is enough in it to make it worthwhile. However, unless your entire team uses IDEA and has free TeamCity licences, you may find it a little restrictive compared to the free alternatives. You can enable guest access to allow anybody to view your builds and download artifacts, but only licensed users can control builds and receive notifications when builds fail.

This review could end here with a lukewarm thumbs-up and a recommendation to use the product if you got it for free but not to spend any money on it. But to do so would be to ignore the crucial architectural difference between TeamCity and its free competitors. The TeamCity website alludes to this key advantage without fully explaining the implications.

Continuum, CruiseControl and the like share a common approach: the web application running inside the servlet container checks out the code from version control and builds it locally (on the web server machine). In TeamCity, the web application is merely the coordinator; the building happens elsewhere. TeamCity has a grid of Build Agents to which the work is delegated. You may choose to install a Build Agent on the TeamCity server itself, and/or you may decide to have one or more other machines do the building.

The advantages of this approach were not initially obvious to me, beyond the ability to share large workloads across several nodes. The important point is that the build grid does not have to be homogeneous: build agents can be installed on different platforms, with different resources available to them.

To understand the possibilities that this provides, allow me to introduce you to my current project. The server is a J2EE application but has a mission-critical native component written in C and invoked via JNI. The target platform is Solaris 10 on Sparc hardware, but developer desktops are x86 Windows XP machines. When running on Windows, the native component is replaced by a Java stub (the Ant script detects the platform and skips building the native component). In addition, we have two native client applications, one for Win32 and one for Mac OS X.
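To make the stub arrangement a little more concrete, here is a minimal sketch of how such a platform-dependent fallback might look in Java. The names are purely illustrative and are not taken from the real project, and whereas this sketch selects the implementation at runtime, the actual project makes the choice at build time via the Ant script.

    // Hypothetical illustration only -- not code from the project described above.
    interface NativeService {
        int compute(byte[] data);
    }

    // JNI-backed implementation; only loadable where the native library is present.
    class JniNativeService implements NativeService {
        static {
            System.loadLibrary("nativeservice");
        }
        public native int compute(byte[] data);
    }

    // Pure-Java stand-in used on developer desktops.
    class StubNativeService implements NativeService {
        public int compute(byte[] data) {
            return data.length; // deliberately trivial fake result
        }
    }

    final class NativeServices {
        static NativeService create() {
            String os = System.getProperty("os.name").toLowerCase();
            boolean solaris = os.contains("sunos") || os.contains("solaris");
            return solaris ? new JniNativeService() : new StubNativeService();
        }
    }

Either way, the rest of the server only ever talks to the interface, so the stub and the native implementation are interchangeable as far as the application code is concerned.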

In TeamCity you can have multiple configurations per project. Each configuration can express preferences about the build agent used to execute it. In the above example we can have a server configuration that will only be built by a Solaris build agent and two client configurations, one for Windows and one for the Mac. Assuming we have at least one build agent available to satisfy each configuration, this is where the magic happens. A change is checked into Subversion. TeamCity detects that the repository has been updated and springs into life. A Solaris machine is instructed to check out the project and run the server build. Simultaneously, a Mac is prompted to build the Mac client and a Windows box gets to work on the Windows client. After x minutes, assuming there are no errors, the built and tested server and clients are available to download from the TeamCity web console. Without any human intervention we’ve built and tested across three different platforms.

But it doesn’t end there. How do we develop the Solaris-specific parts of the server on Windows desktops? There are two options. The first is to make the changes on Windows, check them into Subversion, log on to the Solaris machine, check out and test the updates, and then go back to Windows to fix the bugs. This has the potential to break other people’s builds if they do an update before we have finished.

The alternative is to shorten the write-compile-debug cycle by doing the actual development on the Solaris machine using a remote session. The drawback of this approach is that using a heavyweight IDE in an X-session on the remote server may not be feasible, particularly if there are other users vying for resources.

Fortunately, TeamCity’s build grid gives us another option. Using the TeamCity IDEA plugin (plugins for Eclipse, NetBeans and Visual Studio are apparently coming soon), we can develop on Windows and submit a remote build to the grid. Without anything being committed to Subversion, a Solaris agent will do the build and report the results. When we have the final version ready to check in, we can do a pre-tested commit as described earlier. This will make sure that the build is successful on the target platform before committing to version control.

TeamCity has its rough edges (see Hani’s review of version 1.0), but it has raised the bar for Continuous Integration functionality. If you’re going to pay money for it though, wait for version 1.2 as there are still a few bugs that might cause frustration.

Effective Bug Tracking – Ownership & Responsibility

Posted in Software Development by Dan on April 5th, 2006

The first key premise of effective bug tracking is the concept of ownership. Ownership is crucial in software teams because without ownership there is no responsibility, and without responsibility there is no direction. In order for a bug to get fixed in a timely manner there must, at all times, be somebody, a single individual, who is personally responsible for ensuring that that particular bug gets fixed. This person may be a programmer, but it could equally be the project manager or someone else. All that is important is that there is one person, and no more than one person, to be held accountable.

If this all seems obvious so far, consider that in many bug tracking applications, bugs are entered and then remain unassigned until somebody goes in and chooses who will deal with the problem. Where’s our accountability there? From the bug tracking application’s perspective at least, there is nobody responsible. We can do better than that. Golden rule number one is this:

“From the point at which the bug is entered into the system, and at all times afterwards, there will be somebody, a single individual, who owns that bug.”

In other words, there can never, at any time, be such a terrible thing as an unassigned, unresolved bug.

So how do we achieve this? Who does the bug get assigned to when it is first entered?

We could let the person who reports the bug decide, but more often than not the bug reporter is a software tester. They may well know who is the best person to assign the bug to, but we shouldn’t assume that they do. Furthermore, a software tester really shouldn’t, and probably won’t, have the authority to decide the workload of individual developers. All in all, it’s probably best that we remove these destructively-minded pessimists from this decision-making process.

Ideally the system should be able to provide us with a sensible default owner. Fortunately this is pretty straightforward to achieve. If we apply the same ownership rules to projects that we have applied to bugs, every project in the system will have an owner. Therefore every bug reported against a project has a ready-made default owner.

This approach works great for small projects with just one or two team members but doesn’t really scale. With a team of 20 developers the project owner would have to go through the list of bugs and re-assign each one to an appropriate team member. This is not much of an improvement on systems that leave the bugs unassigned initially.

This problem, however, is just one of granularity. By allowing projects to be sub-divided into components (and of course having an owner for each component), we can make the approach scale quite nicely. Bugs are then reported against components and initially assigned to the owner of that component. When components are added to projects their default owner is the project owner, but this can be overridden.
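As a rough sketch of how this default-ownership chain might be modelled (the class names here are illustrative, not taken from any particular bug tracking product), assuming nothing more elaborate than a component falling back to its project’s owner:

    // Illustrative sketch of default bug ownership; all names are hypothetical.
    class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    class Project {
        final Person owner; // a project always has an owner
        Project(Person owner) { this.owner = owner; }
    }

    class Component {
        final Project project;
        private Person owner; // optional override of the project owner
        Component(Project project) { this.project = project; }
        void setOwner(Person owner) { this.owner = owner; }
        Person owner() { return owner != null ? owner : project.owner; }
    }

    class Bug {
        final Component component;
        Person assignee; // never null: defaults to the component's owner
        Bug(Component component) {
            this.component = component;
            this.assignee = component.owner();
        }
    }

A newly reported bug therefore has an assignee from the moment it is created, which is exactly what golden rule number one demands; re-assigning it later is simply a matter of changing that field.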

The sub-division of a project into components must necessarily be quite coarse-grained. Bugs are reported by testers, and those testers must be able to identify the various application components easily, without an intimate knowledge of the implementation. I see no reason to further divide components into sub-components; there is little to be gained from such a hierarchical composition.