Installing GHC 6.10.1 on OS X 10.4

Posted in Haskell, Mac by Dan on November 21st, 2008

Every time I need/decide to upgrade GHC, it seems there’s a different set of hoops I need to jump through to get it working on OS X 10.4 (Tiger). I don’t have OS X 10.5 (Leopard) and I don’t intend to buy it, so unfortunately I don’t get to use the nice-and-simple installer. I’ve decided to write down the exact steps that I’m taking this time so that I have a reference if I need to do it again (or if somebody else needs to do the same).

I’m pretty certain that this isn’t the way I did it last time. I seem to recall manually building the whole thing from a source tarball and having to resolve the dependencies myself. Then again, that’s probably why I have to upgrade now – my 6.8.2 install appears to be broken.

MacPorts and Xcode

The GHC site recommends that Tiger users use MacPorts, so that’s what I’m doing. I would have used fink, because I already have that set-up, but they don’t have a recent GHC build available for Tiger (6.6.2 is relatively ancient).

First I tried to install MacPorts without upgrading Xcode. It hung. So then I did what I had been told (and had ignored) and downloaded the latest version of Xcode from the Apple Developer Connection. For Tiger the latest version is 2.5. 3.0 and above are for Leopard only. At 903mb the download is not exactly slimline. After running the Xcode installer the MacPorts installer worked properly, which was nice.

Installing GHC from MacPorts

After that, it’s supposed to be easy:

$ sudo port install ghc
Password:
sudo: port: command not found

MacPorts installs to /opt/local and I didn’t have /opt/local/bin on the path (it seems that the “postflight script” mentioned here didn’t run or didn’t work). No problem:

$ sudo /opt/local/bin/port install ghc

This is meant to download, build and install the latest GHC and all its dependenices (GMP, yet another version of Perl, etc.).  After some time had elapsed my first attempt failed with this helpful message:

Error: Status 1 encountered during processing.

The GCC output seemed to suggest that it couldn’t find the GMP library that MacPorts had just installed. Google revealed this to be a bug in the Portfile. Somebody else had run into the same problem earlier the same day and the maintainer was on the case. After leaving it for a day, the bug is now fixed and I tried again. This time the installation proceeded without problems, although it took a fecking long time to complete.

Paths and Symlinks

Once the install was done, I removed all traces of the previous 6.8.2 install (which was under /usr/local) and made sure that /opt/local/bin was on my path (in ~/.bash_login).

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.1

Excellent. The thing that prompted me to upgrade was that Haddock wasn’t working for me since I upgraded it to the latest version.  So that was the next thing to check:

$ ./setup.hs haddock
setup.hs: haddock version >=0.6 is required but it could not be found.

It seems that for this particular build of GHC the Haddock executable is called haddock-ghc, rather than haddock as in 6.8.2. Cabal is still looking for haddock though, so I added a symlink and everything was fine again:

$ sudo ln -s /opt/local/bin/haddock-ghc /opt/local/bin/haddock

I think I now have a working GHC 6.10.1 installation.

Smaller Java

Posted in Java by Dan on November 9th, 2008

In my previous post I talked about how to reduce the size of your Java binaries without sacrificing functionality. Using Proguard to strip out unused and redundant code, I was able to squeeze 1.4 megabytes of already-compressed JAR files (an applet plus its libraries) into 276kb. The motivation was to reduce download times and data transfer costs for network-launched software (applets and Web Start applications). An 80% size reduction is pretty impressive but can we do better? Yes we can.

GZip

A JAR file (or Java ARchive) is simply a zip file with a different extension. In other words the contents are already compressed. Given that we’ve already removed redundant information and zipped the files, you might think that we couldn’t compress the code much further.

Files in a zip archive are compressed individually. In practise, better compression is often achieved by compressing an archive as a whole so that similarities across separate files can be exploited. For this we can use gzip. Compressing the 276kb JAR file results in a 251kb GZipped JAR file. This is a reduction of about 9%. Good but not spectacular.

GZip is more effective when its input is uncompressed. If we expand the 276kb JAR file, repack it as an uncompressed JAR (use jar -0) and then gzip that, the resultant jar.gz file is a mere 193kb. Now that’s more like it, a 30% reduction on our already spartan 276kb binary.

At this point it is worth noting that you can’t just embed a jar.gz file in an HTML page. It won’t work. Instead what you need to do is set-up content-negotiation on the web server so that when a browser requests the vanilla applet JAR file it receives the GZipped version. This is actually pretty straightforward and we’ll cover that shortly. But first, I don’t think that 193kb is anywhere near small enough. Can we do better? Yes we can.

Pack200

We’ve reached the limits of what we can achieve with general purpose compression techniques. We’ve also reached a lower limit for applets, since we are reliant on the browser for compression. For Web Start applications though it’s a different story.

The Sun JDK includes a little-known tool called pack200. It is a compression utility designed specifically for compressing JAR files. Because Pack200 understands the class file format used by the archive contents, it is able to make optimisations that are unavailable to general purpose tools. Pack200 restructures the archive and the class files it contains and then GZips the result.

At this point I really didn’t think there was much scope for further reductions in size. I was wrong. That 276kb JAR that we started with, the one that started out as 1.4mb of compressed Java bytecode, the one that was squashed to just 193kb by GZip, was reduced to a tiny 81 kilobytes after Pack200 had finished with it.

Over the course of two blog posts, I’ve reduced my data transfer requirements by 94%. Despite the difference in size, the two programs are functionally equivalent. Of course, I shouldn’t pretend that compression is completely free. The client machine will have to unpack the archive. This will increase start-up processing, but since smaller files are downloaded quicker, it’s still likely to be faster overall.

Content-Negotiation

To use either Pack200 or GZip to compress network-launched Java applications requires content-negotiation on the web server. The client tells the web server what encodings it supports and the web server responds with the most appropriate option. In the case of applets, the client is the web browser. None of the browsers that I have tested support Pack200, but they will all accept GZip.  For Web Start applications, the javaws launcher does accept Pack200 so that will be used where available.

Sun’s Pack200 page describes a servlet that can perform the necessary content-negotiation. However, if you’re not already running a servlet container, it’s probably easier to use the features of your web server.  Chris Nokleberg has written some straightforward instructions on how to achieve this with Apache.

The Incredible Shrinking Software

Posted in Java by Dan on November 7th, 2008

It may not seem important to those of us who develop server-side Java software, but size matters.  If you distribute your software on CD or DVD, you aren’t going to worry about 10 megabytes here and there.  The only Java developers who tend to put as much thought into optimising for size as they do into optimising for performance are those that work with JavaME and its constrained environments.  However, network-launched software, such as applets and Web Start applications, can suffer greatly from bloated binaries too.

In an era when users are used to interacting with AJAX web applictions and snappy Flash-powered content, a start-up time that rivals that of a cassette game on a C64 is going to set your application apart for all the wrong reasons.  Of course, maybe your software is so good that it doesn’t matter about the load time and people will use it anyway.  Then you have another problem.  If you aspire to have a huge number of users for your huge application, data transfer costs are going to bite.

1995 called, it wants its RIA technology back…

So where am I going with this?  If you’ve been following along at home, you’ll know that I’ve been playing around with applets again recently.  Version 1.4.3 of this applet (the last build of its initial incarnation, circa 2002) weighed in at 20.9 kilobytes.  Version 2.0.3 (the last build from the 2005 rewrite) was a relatively bloated 39.7 kilobytes, but still smaller than many of the image files that people embed in their web pages.  But version 3.0 was/is going to be much more ambitious.  More features means more code.  And since I’d finally be giving up on AWT and moving to Swing, I really had no excuse not to replace the horrific custom graphs I’d hacked together for version 2.

Every Swing developer knows that if you need graphs, there’s really only one place you need to look: JFreeChart.  JFreeChart does everything you could ever possibly need to do with charts and graphs.  It does it well and even looks pretty good.  You can tweak just about everything in order to get exactly what you are looking for.

Your library is so fat it’s got its own postcode

There was just one problem with JFreeChart: its size.  1.6 megabytes is not huge in most contexts, but something about having a 50kb applet towing a 1.6mb dependency offended me.  Perhaps it was because it made my contribution to the whole a lot less significant, but it seemed to me a lot like a bicycle pulling a caravan.

I could have just accepted it as a fact of life.  Good, comprehensive libraries are unlikely to be small.  Broadband users would probably be able to accept the slightly longer start-up, but any dial-up users would give up long before they’d get to see anything.

So I looked at the alternatives.  Most were smaller, some were ugly and some seemed a bit too basic.  A couple looked like feasible replacements but, other than the size, I was happy with JFreeChart.  Would I be able to get the results I wanted with these more limited libraries?  I’d also have to spend some time figuring out how to use them.  As I saw it, there was only one solution.  JFreeChart would just have to become smaller.

It was pretty obvious that there was ample scope to achieve a significant reduction in JFreeChart’s size.  As already mentioned, JFreeChart is very comprehensive.  It supports several different types of charts, customisable renderers and all sorts of other optional stuff.  My applet was using only a small fraction of this functionality (line graphs and pie charts).  The rest of it could go.

The labour-intensive approach would have been to check out JFreeChart’s source code, start deleting code and hack around until something smaller emerged (and hopefully compiled).  Too much effort for me.  Which is where the “Incredible Shrinking Software” of the title comes in…

Enter Proguard

You may already be familiar with Proguard.  It’s arguably the premier open source obfuscator.  If you’ve ever wanted to make your Java software difficult to reverse-engineer, then you’ve probably already used it.  But obfuscation is only one of two complementary functions that Proguard performs.  The other is shrinking.

Obfuscation and shrinking are inextricably linked since both involve removing unnecessary information from an application’s compiled binaries.  Java class files contain information that is not necessary to run the code, JAR files contain classes that you don’t use, the classes that you do use contain methods that you will never call, and all those descriptive identifiers you used are way too verbose.

The low-hanging fruit of class file shrinkage is the debug information inserted by the compiler.  This includes the line number table that is used to insert something more useful than “unknown source” into exception stack-traces.  You can simply instruct the compiler to omit this data (-g:none) and the class files will be smaller.  The downside of this is that the information won’t be there if you need it.

Proguard goes much further than this though.  You give it one or more entry points (in this case a class that extends Applet, but it could be a class with a main method, or something else). From this, Proguard finds code from the input JAR(s) that is definitely not used and removes it. In addtion, where it won’t break anything, members are renamed with shorter names, such as ‘a’ and ‘b’, resulting in further reductions in code size (again at the expense of easy debugging – stack traces will have neither meaningful names nor source code line numbers).

How low can you go?

So, how small did I make JFreechart? Well, I cheated slightly by reverting from version 1.0.11 to version 1.0, which had all the functionality I needed but was only 1.3mb in size. With my applet code weighing in at just over 100kb unobfuscated, the combined applet + JFreeChart size was 276kb after shrinking, a total size reduction (for applet and library combined) of about 80%.

Pretty good, hey? Still bigger than I’d like though. Proguard has taken me as far as it can, but 276kb is not the limit of my ambition. There are still further reductions that can be made without sacrificing functionality. To be continued…

Swing Applications on OS X

Posted in Java, Mac by Dan on November 6th, 2008

This post is mostly for my future reference as I keep forgetting this information and have to search for it each time. These links demonstrate the little tweaks that you can make to your Swing applications to improve the user experience under OS X.

Preview Version of FSA 3.0 / Full Historical FA Premier League Statistics

Posted in Java by Dan on November 6th, 2008

A while ago I wrote about my decision to resurrect and open source one of my oldest projects, the Football Statistics Applet. Being an AWT-based Java 1.1 applet, written by my less experienced self, the code was pretty clunky.

After a period of inactivity, I’ve spent a lot of time in the last week on the new Swing UI and other enhancements. Many of the improvements I have in mind for version 3.0 are still to be done, but the updated software is at least useful, if you are interested in football statistics.

Version 3.0 head-to-head view (OS X)

I have set up a demo page for a pre-release build of 3.0 that displays statistics for every English Premier League season since 1992.  It also generates an all-time table that combines the separate seasons into one.  If you follow English football, you may find it interesting. Any feedback should be directed to the issue tracker.

Are applets dead? Maybe, though Sun doesn’t think so. Even so, the refactored FSA codebase offers opportunities for other non-applet projects in the future.