Understanding PHP – A Journey into the darkness…

Posted in PHP by Dan on July 31st, 2009

I knew PHP was a bit crufty before I got seriously involved with it. I’ve been trying to avoid writing a rant about how horrible it is as the web has enough of those already and, after all, it doesn’t really matter, does it? Still, to preserve my sanity I’ve been keeping a list of everything that’s bad about PHP, mostly for my own amusement. However, the most recent entry on my list cannot be allowed to pass without comment.

"01a4" != "001a4"

We start with something simple and non-controversial. If you have two strings that contain a different number of characters, they can’t be considered equal. The leading zeros are important because these are strings not numbers.

"01e4" == "001e4"

However, PHP doesn’t like strings. It’s looking for any excuse it can find to treat your values as numbers. And here we have it. Change the hexadecimal characters in those strings slightly and suddenly PHP decides that these aren’t strings any more, they are numbers in scientific notation (PHP doesn’t care that you used quotes) and they are equivalent because leading zeros are ignored for numbers. To reinforce this point you will find that PHP also evaluates "01e4" == "10000" as true because these are numbers with equivalent values. This is documented behaviour; it’s just not very sensible.
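The rule described above can be modelled in a few lines of Java. This is only a rough sketch of the behaviour as described here, not PHP's actual implementation: the class and method names are made up for illustration, and the regular expression deliberately covers just plain decimal and scientific notation (it ignores hexadecimal strings and whatever other forms PHP may accept).

```java
import java.util.OptionalDouble;

/** Hypothetical model of the loose comparison rule; not PHP's real algorithm. */
final class LooseCompare {

    /** Returns the numeric value of s if it looks like a decimal number, else empty. */
    static OptionalDouble asNumber(String s) {
        // Digits, optional fraction, optional exponent. The check keeps out
        // Java-only forms such as "1d" or "NaN" that Double.parseDouble accepts.
        if (!s.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?")) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(Double.parseDouble(s));
    }

    /** Compares numerically when both strings look numeric, otherwise as strings. */
    static boolean looseEquals(String a, String b) {
        OptionalDouble x = asNumber(a);
        OptionalDouble y = asNumber(b);
        if (x.isPresent() && y.isPresent()) {
            return x.getAsDouble() == y.getAsDouble();
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(looseEquals("01a4", "001a4")); // false: not numeric, compared as strings
        System.out.println(looseEquals("01e4", "001e4")); // true: both parse to 10000.0
        System.out.println(looseEquals("01e4", "10000")); // true: both parse to 10000.0
    }
}
```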

Enter ===

At this point the PHP apologists chime in with the suggestion to use the === operator. This is an equality operator that compares not only the values of the arguments but their types as well. This doesn’t seem like it should make any difference as the literals on both sides of the comparison already have identical types, regardless of whether that type is string or number. Of course that’s not the case: when you use the extra equals sign the values remain as strings rather than being interpreted as numbers. "01e4" === "001e4" evaluates to false (correct, but not entirely convincing).

"0x001a4" == 0x01a4

So it seems that the rule in PHP is that if the contents of a string can be parsed as a numeric literal then, for comparisons, the string is treated as that number, as we see with the above hexadecimals (note the difference in notation from the first example, specifically the use of the 0x prefix). Leading zeros are ignored when numbers are involved.

"0012" != 0012

Unfortunately that’s not the full story as the final example shows. Like many other languages, PHP interprets numbers beginning with a zero as octal values, but not when that number is within a string. This is completely inconsistent with the way it processes hexadecimal values and scientific notation within strings.
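For comparison, here is how the leading-zero rule looks in Java, one of the "many other languages" that treat such literals as octal. It is just an illustration of the literal-versus-string distinction and says nothing about how PHP is implemented; the class and field names are invented for the example.

```java
/** Illustration only: octal literals versus strings containing the same characters. */
final class OctalDemo {
    // A leading zero makes this an octal literal: 1*8 + 2 = 10.
    static final int LITERAL = 0012;

    // Integer.parseInt always reads base 10 unless told otherwise: 12.
    static final int PARSED = Integer.parseInt("0012");

    // Integer.decode applies the literal rules to a string, so it reads octal: 10.
    static final int DECODED = Integer.decode("0012");

    public static void main(String[] args) {
        System.out.println(LITERAL + " " + PARSED + " " + DECODED); // prints "10 12 10"
    }
}
```

In Java the caller has to pick one of the two string conversions explicitly, so neither interpretation is applied behind your back.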

10 Tips for Publishing Open Source Java Libraries

Posted in Java, Software Development by Dan on July 29th, 2009

One of the strengths of the Java ecosystem is the huge number of open source libraries that are available.  There are often several alternatives when you need a library that provides some specific functionality.  Some library authors make it easy to evaluate and use their libraries while others don’t.  Open source developers may not care whether their libraries are widely used but I suspect that many are at least partially motivated by the desire to see their projects succeed.  With that in mind, here’s a checklist of things to consider to give your open source Java library the best chance of widespread adoption.

1. Make the download link prominent.

If other people can’t figure out how to download your project, it’s not going to be very successful. I’m bemused by the number of open source projects that hide their download links some place obscure. Put it in a prominent location on the front page. Use the word “download” and use large, bold text so that it can’t be missed.

2. Be explicit about the licence.

Potential users will want to know whether your licensing is compatible with their project. Don’t make users have to download and unzip your software in order to find out which licence you use. Display this information prominently on the project’s home page (don’t leave it hidden away in some dark corner of SourceForge’s project pages).

3. Prefer Apache, BSD or LGPL rather than GPL.

Obviously you are free to release your library under any terms that you choose. It’s your work and you get to decide who uses it and how. That said, while the GPL may be a fine choice for end user applications, it doesn’t make much sense for libraries. If you pick a strong copyleft licence, such as the GPL, your library will be doomed to irrelevance. Even the Free Software Foundation acknowledges this (albeit grudgingly), hence the existence of the LGPL.

The viral nature of the GPL effectively prevents commercial exploitation of your work.  This may be exactly what you want, but it also prevents your library from being used by open source projects that use a more permissive licence.  This is because they would have to abandon the non-copyleft licence and switch to your chosen licence. That isn’t going to happen.

4. Be conservative about adding dependencies.

Every third-party library that your library depends on is a potential source of pain for your users. They may already depend on a different version of the same library, which can lead to JAR Hell (such problems can be mitigated by using a tool such as Jar Jar Links to isolate dependencies). Injudicious dependencies can also greatly increase the size of your project and every project that uses it.  Don’t introduce a dependency unless it adds real value to your library.

5. Document dependencies.

Ideally you should bundle all of the JARs that your library depends on with your distribution. This makes it much easier for users to get started. Regardless, you should document exactly which versions of which libraries your library requires. NoClassDefFoundError is not the most friendly way to communicate this information.

6. Avoid depending on a logging framework.

Depending on a particular logging framework will cause a world of pain for half of your users. Some people like to use Sun’s JDK logging classes to avoid an external dependency, and some people like to use Log4J because Sun’s JDK logging isn’t very good. SimpleLog is another alternative.

If you pick the “wrong” logging framework you force your users to make a difficult choice.  Either they maintain two separate logging mechanisms in their application, or they replace their preferred framework with the one you insisted that they use, or (more likely) they replace your library with something else.

For most small to medium sized libraries logging is not a necessity. Problems can be reported to the application code via exceptions and can be logged there.  Incidental informational logging can usually be omitted (unless you’ve written something like Hibernate, which really does need trace logging so that you can figure out what is going on).
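As a minimal sketch of what that looks like in practice, the library code below has no logging dependency at all. It reports problems by throwing an exception that carries the detail, and the application decides whether and how to log it. All of the class names here (PortParser, PortFormatException and so on) are hypothetical, and the application side happens to use JDK logging purely as an example.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ExceptionReportingDemo {

    /** Hypothetical library exception: holds the detail a log message would have held. */
    static final class PortFormatException extends RuntimeException {
        PortFormatException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical library class: no logger, problems are reported by throwing. */
    static final class PortParser {
        static int parse(String text) {
            int port;
            try {
                port = Integer.parseInt(text.trim());
            } catch (NumberFormatException e) {
                throw new PortFormatException("Not a valid port number: '" + text + "'", e);
            }
            if (port < 1 || port > 65535) {
                throw new PortFormatException("Port out of range: " + port, null);
            }
            return port;
        }
    }

    /** Application code: this is the only place that knows about a logging framework. */
    private static final Logger LOG = Logger.getLogger("app");

    public static void main(String[] args) {
        try {
            System.out.println(PortParser.parse("8080"));
            PortParser.parse("eighty");
        } catch (PortFormatException e) {
            LOG.log(Level.WARNING, "Bad configuration", e);
        }
    }
}
```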

7. If you really need logging, use an indirect dependency.

OK, so not all libraries can realistically avoid logging.  The solution is to use a logging adapter such as SLF4J.  This allows you to write log messages and your users to have the final say over which logging back-end gets used.
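The idea behind the indirection can be shown without any external JARs. The sketch below is not SLF4J and does not use its API: LogSink and Cache are hypothetical names, and the back-end is passed in by hand, whereas a real adapter library handles that wiring for you. It is only meant to show the shape of the arrangement, where the library writes to an interface and the application chooses what sits behind it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

final class LoggingIndirectionDemo {

    /** Hypothetical stand-in for an adapter's logger interface. */
    interface LogSink {
        void debug(String message);
    }

    /** Hypothetical library class: writes to the sink, never to a concrete framework. */
    static final class Cache {
        private final LogSink log;
        private final List<String> entries = new ArrayList<>();

        Cache(LogSink log) {
            this.log = log;
        }

        void put(String key) {
            entries.add(key);
            log.debug("cached " + key + " (size " + entries.size() + ")");
        }
    }

    public static void main(String[] args) {
        // The application picks the back-end. Here it is JDK logging,
        // but any framework could sit behind the same interface.
        Logger jdk = Logger.getLogger("app");
        new Cache(message -> jdk.fine(message)).put("a");

        // Or the application can discard the library's messages entirely.
        new Cache(message -> { }).put("b");
    }
}
```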

8. Make the Javadocs available online.

Some libraries only include API docs in the download or, worse still, don’t generate them at all. If you’re going to have API documentation (and it’s not exactly much effort with Javadoc), put it on the website. Potential users can get a feel for an API by browsing its classes and methods.

9. Provide a minimal example.

In an ideal world your library will be accompanied by a beautiful user manual complete with step-by-step examples for all scenarios. In the real world all we want is a code snippet that shows how to get started with the library. Your online Javadocs can be intimidating if we don’t know which classes to start with.

10. Make the JAR files available in a Maven repository.

This is one that I haven’t really followed through on properly for all of my projects yet, though I intend to. That’s because I don’t use Maven, but some people like to. These people will be more likely to use your library if you make the JAR file(s) available in a public Maven repository (such as Java.net’s). You don’t have to use Maven yourself to do this as there is a set of Ant tasks that you can use to publish artifacts.