New Adventures in Software » Software Development

Broken pipe errors after upgrading to Django 3.2

Posted in Python, Software Development by Dan on September 20th, 2021

I recently upgraded a project from Django 2.2.24 to the latest 3.2.x release. This was mostly straightforward except for one problem that took a while to figure out.

I have a view that generates SVG and I have another view that generates HTML that embeds this SVG using an <Object> tag. For some reason I was getting “broken pipe” errors on the server when accessing the HTML view and the SVG would not load. Accessing the SVG directly was not a problem. This was the same whether using the development server or Apache with mod_wsgi.

I tried both Django 3.1 and 3.0 as well with similar results but different errors. With these versions the error was “Connection reset by peer”. For some reason the browser was closing the connection before reading the SVG.

I couldn’t find an explanation and it wasn’t urgent so I gave up for a while. Today I took another look and found the reason could be found in the Django 3.0 release notes:

X_FRAME_OPTIONS now defaults to 'DENY'. In older versions, the X_FRAME_OPTIONS setting defaults to 'SAMEORIGIN'. If your site uses frames of itself, you will need to explicitly set X_FRAME_OPTIONS = 'SAMEORIGIN' for them to continue working.

This is a change to prevent click-jacking attacks. Evidently browsers treat <object> elements the same way as an <iframe> in this respect. You can resolve the issue globally as described in the release notes above, or you can decorate the view in question to allow it to be embedded:

from django.views.decorators.clickjacking import xframe_options_sameorigin

@xframe_options_sameorigin
def svg_view(request):
    # Do something

Carmack on Static Analysis and Functional Purity

Posted in Haskell, Software Development by Dan on October 16th, 2012

In the absence of anything worthwhile of my own to write here at the moment, I thought I’d instead highlight a couple of interesting blog posts that I’ve discovered recently by id Software’s John Carmack.

I read Carmack’s thoughts on static analysis a few weeks ago and his reports on the effectiveness of tools for analysis of C++ code chimed with my own experiences with Java tools such as FindBugs and IntelliJ IDEA.

The first step is fully admitting that the code you write is riddled with errors. That is a bitter pill to swallow for a lot of people, but without it, most suggestions for change will be viewed with irritation or outright hostility. You have to want criticism of your code.

Automation is necessary. It is common to take a sort of smug satisfaction in reports of colossal failures of automatic systems, but for every failure of automation, the failures of humans are legion. Exhortations to “write better code” plans for more code reviews, pair programming, and so on just don’t cut it, especially in an environment with dozens of programmers under a lot of time pressure. The value in catching even the small subset of errors that are tractable to static analysis every single time is huge.

In the other article, which I only read today, Carmack espouses the virtues of pure (i.e. side-effect-free) functions. His is a pragmatic approach concerned with how to exploit purity in mainstream languages, where it is entirely optional, as opposed to advocating jumping ship to Haskell. Even when 100% purity is impractical there are still benefits in minimising impurity.

A large fraction of the flaws in software development are due to programmers not fully understanding all the possible states their code may execute in. In a multithreaded environment, the lack of understanding and the resulting problems are greatly amplified, almost to the point of panic if you are paying attention. Programming in a functional style makes the state presented to your code explicit, which makes it much easier to reason about, and, in a completely pure system, makes thread race conditions impossible.

My experiences with Haskell have informed how I approach coding in other languages. In certain cases I’ve ended up taking the functional approach to such extremes that I’ve been left questioning my original choice of implementation language.

No matter what language you work in, programming in a functional style provides benefits. You should do it whenever it is convenient, and you should think hard about the decision when it isn’t convenient.

Free StackOverflow Careers Invites

Posted in Software Development, The Internet by Dan on June 30th, 2011

About 18 months ago when I was looking for contract work, I signed up for Stack Overflow Careers. This was back when you had to pay to use the service (it’s now free but registration is by invitation only – more on that in a minute). The asking price had quickly dropped from an ambitious $99 a year to a merely illegal $9 by the time I signed up.

The Careers site has evolved considerably since it launched and now allows you to create quite a sophisticated online CV that aggregates your Stack Exchange activity, open source projects (GitHub, Google Code, etc.) and more. The idea of the service is that employers can search for candidates who appear to be good matches for their vacancies. All the enquiries I had were from companies developing financial software in London, which would have been ideal except for the work and the location.

Anyway, it seems that Stack Overflow is desperate for more people to sign up for Careers. When it went free-but-invite-only I was awarded 5 invites to distribute to suitable individuals. Last week I was given 20 more invites. Today I have been given two further batches of 20 invites. I currently have 63 invites available. If you want to try out the service for yourself, all you have to do is follow this link to claim one of my invites.

ReportNG 1.1.1 – The Less Embarrassingly Bad Version

Posted in Software Development by Dan on May 21st, 2010

It seems that ReportNG 1.1 fared badly when it came into contact with the real world. It was a buggy piece of crap. If you upgraded and suffered IllegalStateExceptions or NullPointerExceptions, I’m sorry for wasting your time.

The new chronology page was the root of all the problems. The main one (symptom: IllegalStateException) was triggered when you used TestNG’s @AfterXXX annotations. My tests included only @BeforeXXX annotations so I didn’t detect the issue. I have improved the tests and fixed the cause.

Having fixed the stability issues I am left with a chronology page that has a couple of problems with the accuracy of the information it displays. These are due to invalid assumptions on my part about what the TestNG API would return (assumptions I should have tested more thoroughly). There may well be other ways to get TestNG to provide the information required to produce a worthwhile chronology but for now I have disabled it in version 1.1.1. Compared to 1.0 this version offers i18n and a fix for problems with Gradle. We’ll just forget that 1.1 ever happened.

ReportNG has moved to GitHub. The project website is at http://reportng.uncommons.org. The SVN repository and issue tracker at Java.net are no longer in use.

Download ReportNG version 1.1.1

Moving Projects from Java.net to GitHub

Posted in Java, Software Development by Dan on May 17th, 2010

How to move your project from Subversion on Java.net to Git on GitHub without losing the change history.

Cloning the Subversion Repository

The normal way to duplicate a Subversion repository with full history is to use the svnadmin dump and load commands. Unfortunately most SVN hosting services, including Java.net, do not provide access to svnadmin commands or direct access to the file system.

Fortunately there is another way to clone a repository, complete with all its history, that requires only read access to the repository: svnsync.

The first step is to create a local SVN repository into which you will mirror the remote Java.net repository.

svnadmin create myproject

Before cloning the contents it is important that you add a pre-revprop-change hook to your new local repository. This is a script that must complete successfully (exit code 0) before any changes to revision properties are accepted. The easiest way to do this to add an empty script and make it executable.

echo '#!/bin/sh' > myproject/hooks/pre-revprop-change
chmod +x myproject/hooks/pre-revprop/change

Bearing in mind that we want to preserve tags and branches too, not just the trunk, we can then mirror the top-level of the remote repository.

svnsync init file:///pathto/myproject https://myproject.dev.java.net/svn/myproject
svnsync sync file:///pathto/myproject

This may take a while if the repository is large and/or your connection is slow.

Stripping Java.net Web Content from the Repository

Java.net uses the project SVN repository to manage the project website, with the files stored under trunk/www. When migrating from Java.net you probably don’t want to continue with this approach. If that’s the case then you’ll want to remove all traces of the www directory from the repository. The usual way of deleting a file – removing it from the working copy and then committing – will not purge its history so instead we use the svndumpfilter command.

First dump the mirrored repository:

svnadmin dump myproject > dump

We can then remove all traces of the www directory. Any commits that only touched files under www are now empty and are dropped completely. All remaining revisions are renumbered to avoid gaps in the sequence.

svndumpfilter --drop-empty-revs --renumber-revs exclude trunk/www < dump > filtered

The final step is to restore the filtered dump in place of our local repository.

rm -rf myproject
svnadmin create myproject
svnadmin load myproject < filtered

Migrating the Repository from SVN to Git

Now that the local SVN repository contains only what we wish to keep, we are ready to migrate it to Git. To achieve this I follow Paul Dowman’s instructions.

The first step is to import the SVN repository into a new Git repository.

git svn clone file:///pathto/myproject --no-metadata -A authors.txt -t tags -b branches -T trunk myproject-git

The authors.txt file maps SVN users to Git users. Your Java.net repository may have commits attributed to the users root and httpd. You should probably just map these to your own user name. There should be one entry for each committer:

root = Your Name <you@example.com>
httpd = Your Name <you@example.com>
you = Your Name <you@example.com>
other = Someone Else <other@example.com>

Refer to Paul’s full instructions if you have branches and tags to maintain.

Pushing to GitHub

Create a new repository on GitHub and then add this as a remote for your local Git repository.

git remote add origin git@github.com:username/myproject.git

And then push:

git push origin master --tags

Job done.

ReportNG 1.1 – i18n, Gradle fix, chronological ordering

Posted in Java, Software Development by Dan on May 15th, 2010

Over the last 6 weeks or so I seem to have taken an unintentional extended break from programming (and from posting here). It’s time to get back into the swing of things and top of my TODO list was putting the finishing touches to ReportNG 1.1 (ReportNG being an alternative HTML reporting plug-in for TestNG).

This new release fixes a problem people have been having using ReportNG with Gradle. It also adds internationalisation support. So now, as well as the default English text, there is also support for French and Portuguese. The Portuguese translation was contributed by Felipe Knorr Kuhn. The French text is just something I added as a proof of concept. It is likely to be offensively bad to a native French speaker and I welcome any corrections. I’d also appreciate any translations for other languages (just open an issue in the issue tracker and attach your translated version of this file).

The other major change is the addition of the “Chronology” page. This is something that exists in the default TestNG reports that I originally decided to leave out of ReportNG. I didn’t really have a use for it but several people have asked for something similar so I have added it.

Zeitgeist 1.0 – An Intelligent RSS News Aggregator

Posted in Java, Software Development by Dan on November 26th, 2009

I recently signed-up for GitHub. Compared to Java.net or Sourceforge, it provides a much lower barrier of entry for code hosting. There’s no need to wait an indeterminate period of time for somebody to approve your project, you just upload it. And because it’s a DVCS, it’s easy for other people to fork your projects and submit patches. Open Source project hosting has become so straightforward, thanks to sites like GitHub and the Bazaar-based Launchpad, that it encourages developers to open up code that they might otherwise have kept to themselves. After all, why bother with local repositories and back-ups when you can get somebody else to do it for you and get free web-hosting and issue-tracking too?

I have a number of trivial and incomplete projects hosted in local Subversion repositories. I am slowly adding to GitHub those that have any worthwhile substance to them. I’m making no promises about the quality of this code, and I don’t intend to spend much time supporting it, but I’m putting it out there in case somebody might have a use for it.

First up is Zeitgeist. This is a small Java library/application for identifying common topics among a set of news articles downloaded from RSS feeds. It’s sort of like what Google News does. There is a basic HTML publisher included that generates a web page for displaying the current top news stories, including relevant pictures.

You give the program a list of RSS feeds that cover a certain topic (maybe world news, or music news, or a particular sport) and it uses non-negative matrix factorisation to detect similarities in the article contents and to group the articles by topic. The original idea comes from Programming Collective Intelligence.

The default HTML output looks a bit like the image below, but you could customise it with CSS or by hacking the default templates to modify what information is included (for example, you could add an excerpt instead of just displaying headlines).

The algorithm is not infallible and how well it works depends a lot on the feeds that you select. It’s also non-deterministic, so if you run it multiple times with the same input you will get variations in the output. Perhaps Zeitgeist is not that useful in it’s current form but it could be used for adding on-topic news headlines to a website or as the basis for something more advanced.

Attention to Detail

Posted in Software Development by Dan on September 23rd, 2009

A thought for the day, courtesy of Landon Dyer (no relation) a.k.a DadHacker.

“Good programs do not contain spelling errors or have grammatical mistakes. I think this is probably a result of fractal attention to detail; in great programs things are correct at all levels, down to the periods at the ends of sentences in comments.”

10 Tips for Publishing Open Source Java Libraries

Posted in Java, Software Development by Dan on July 29th, 2009

One of the strengths of the Java ecosystem is the huge number of open source libraries that are available. There are often several alternatives when you need a library that provides some specific functionality. Some library authors make it easy to evaluate and use their libraries while others don’t. Open source developers may not care whether their libraries are widely used but I suspect that many are at least partially motivated by the desire to see their projects succeed. With that in mind, here’s a checklist of things to consider to give your open source Java library the best chance of widespread adoption.

1. Make the download link prominent.

If other people can’t figure out how to download your project, it’s not going to be very successful. I’m bemused by the number of open source projects that hide their download links some place obscure. Put it in a prominent location on the front page. Use the word “download” and use large, bold text so that it can’t be missed.

2. Be explicit about the licence.

Potential users will want to know whether your licensing is compatible with their project. Don’t make users have to download and unzip your software in order to find out which licence you use. Display this information prominently on the project’s home page (don’t leave it hidden away in some dark corner of SourceForge’s project pages).

3. Prefer Apache, BSD or LGPL rather than GPL.

Obviously you are free to release your library under any terms that you choose. It’s your work and you get to decide who uses it and how. That said, while the GPL may be a fine choice for end user applications, it doesn’t make much sense for libraries. If you pick a copyleft licence, such as the GPL, your library will be doomed to irrelevance. Even the Free Software Foundation acknowledges this (albeit grudgingly), hence the existence of the LGPL.

The viral nature of the GPL effectively prevents commercial exploitation of your work. This may be exactly what you want, but it also prevents your library from being used by open source projects that use a more permissive licence. This is because they would have to abandon the non-copyleft licence and switch to your chosen licence. That isn’t going to happen.

4. Be conservative about adding dependencies.

Every third-party library that your library depends on is a potential source of pain for your users. They may already depend on a different version of the same library, which can lead to JAR Hell (such problems can be mitigated by using a tool such as Jar Jar Links to isolate dependencies). Injudicious dependencies can also greatly increase the size of your project and every project that uses it. Don’t introduce a dependency unless it adds real value to your library.

5. Document dependencies.

Ideally you should bundle all dependent JARs with your distribution. This makes it much easier for users to get started. Regardless, you should document exactly which versions of which libraries your library requires. NoClassDefFoundError is not the most friendly way to communicate this information.

6. Avoid depending on a logging framework.

Depending on a particular logging framework will cause a world of pain for half of your users. Some people like to use Sun’s JDK logging classes to avoid an external dependency; and some people like to use Log4J because Sun’s JDK logging isn’t very good. SimpleLog is another alternative.

If you pick the “wrong” logging framework you force your users to make a difficult choice. Either they maintain two separate logging mechanisms in their application, or they replace their preferred framework with the one you insisted that they use, or (more likely) they replace your library with something else.

For most small to medium sized libraries logging is not a necessity. Problems can be reported to the application code via exceptions and can be logged there. Incidental informational logging can usually be omitted (unless you’ve written something like Hibernate, which really does need trace logging so that you can figure out what is going on).

7. If you really need logging, use an indirect dependency.

OK, so not all libraries can realistically avoid logging. The solution is to use a logging adapter such as SLF4J. This allows you to write log messages and your users to have the final say over which logging back-end gets used.

8. Make the Javadocs available online.

Some libraries only include API docs in the download or, worse still, don’t generate it at all. If you’re going to have API documentation (and it’s not exactly much effort with Javadoc), put it on the website. Potential users can get a feel for an API by browsing its classes and methods.

9. Provide a minimal example.

In an ideal world your library will be accompanied by a beautiful user manual complete with step-by-step examples for all scenarios. In the real world all we want is a code snippet that shows how to get started with the library. Your online Javadocs can be intimidating if we don’t know which classes to start with.

10. Make the JAR files available in a Maven repository.

This one that I haven’t really followed through on properly for all of my projects yet, though I intend to. That’s because I don’t use Maven, but some people like to. These people will be more likely to use your library if you make the JAR file(s) available in a public Maven repository (such as Java.net’s). You don’t have to use Maven yourself to do this as there is a set of Ant tasks that you can use to publish artifacts.

Practical Evolutionary Computation: Elitism

Posted in Evolutionary Computation, Java, Software Development by Dan on February 12th, 2009

In my previous article about evolutionary computation, I glossed over the concept of elitism. The Watchmaker Framework‘s evolve methods require you to specify an elite count. I told you to set this parameter to zero and forget about it. This brief article ties up that loose end by explaining how to use elitism to improve the performance of your evolutionary algorithm.

In an evolutionary algorithm (EA), sometimes good candidates can be lost when cross-over or mutation results in offspring that are weaker than the parents. Often the EA will re-discover these lost improvements in a subsequent generation but there is no guarantee of this. To combat this we can use a feature known as elitism. Elitism involves copying a small proportion of the fittest candidates, unchanged, into the next generation. This can sometimes have a dramatic impact on performance by ensuring that the EA does not waste time re-discovering previously discarded partial solutions. Candidate solutions that are preserved unchanged through elitism remain eligible for selection as parents when breeding the remainder of the next generation.

NOTE: One potential downside of elitism is that it may make it more likely that the evolution converges on a sub-optimal local maximum.

The Watchmaker Framework supports elitism via the second parameter to the evolve method of an EvolutionEngine. This elite count is the number of candidates in a generation that should be copied unchanged from the previous generation, rather than created via evolution. Collectively these candidates are the elite. So for a population size of 100, setting the elite count to 5 will result in the fittest 5% of each generation being copied, without modification, into the next generation.

If you run the Hello World example from the previous article both with and without elitism, you will see that it completes in fewer generations with elitism enabled (22 generations vs. 40 when I ran it – though your mileage may vary due to the random nature of the evolution).

Source code for the Hello World example (and several other, more interesting evolutionary programs) is included in the download.

This is the third in a short and irregular series of articles on practical Evolutionary Computation, based on the work-in-progress documentation for the Watchmaker Framework for Evolutionary Computation. The first article provided an introduction to the field of evolutionary computation and the second article showed how to implement a simple evolutionary algorithm using the Watchmaker Framework.

New Adventures in Software by Dan Dyer