Stopping Being Part of the Problem

Posted in The Internet by Dan on July 27th, 2015

No trackers on this site.

Some time ago I wrote about my issues with Google and its anti-privacy agenda. It was my intention to avoid using Google services where possible because I don’t like Larry Page’s vision and Google’s (and Facebook’s and others’) tracking of individuals’ activity across the web. I have very little to hide but that’s not the point.

Two of the biggest enablers of Google’s surveillance machinery are Adsense and Google Analytics. These services give Mountain View a foothold on millions of properties across the web, from which they can observe users’ activities and track them from site to site. Today I finally got around to removing these from my own sites. I never paid much attention to the analytics anyway and there are more privacy-friendly alternatives if required. Adsense never generated enough income to compensate for the cheapening of the message.

Other unrelated changes I’ve made to this blog include changing the canonical domain to instead of (the old URLs will redirect). In addition, some time ago I disabled comments. There were occasionally a few worthwhile responses but over 99.6% of all comments submitted were spam. The bottom half of the web is generally a cesspit. If you feel the need to respond to something I’ve written, you can contact me on Twitter or e-mail dan at this domain.

Google Skynet

Posted in The Internet by Dan on May 16th, 2013

Google policy is to get right up to the creepy line and not cross it.

These are the words of former Google CEO Eric Schmidt, speaking to The Atlantic in October 2010.

We don’t need you to type at all. We know where you are. We know where you’ve been. We can more or less know what you’re thinking about.

This was two and a half years ago. Today Google is so far over the creepy line it can’t even see the line any more. The problem is not so much individual Google products but the way in which the vast data from all these disparate services is combined to build a very detailed profile of you. Not just what you do on Google sites but also any other site you visit that includes Google’s +1, Adsense or Analytics JavaScript (including this one). Google can read every e-mail you send and receive; it knows everything you search for online and pretty much every website you ever visit. It knows where you are, where you’ve been and probably with whom. Google knows more about you than your mother does and with Google+ it’s all neatly connected to your real identity.

I’ve so far avoided signing up for Google+ but it’s increasingly difficult to ignore, particularly as an Android developer, as it’s becoming more tightly integrated into everything that Google does. It’s the keystone of Google’s anti-privacy agenda.

The privacy implications are not the only issue. Most users perceive Google’s search results to represent some sort of objective truth, with links ranked only by their relevance to the search query, but in fact each user is served personalised results. If you and I both search for information on some contentious political topic, Google won’t necessarily give us the same response; it will show us each what it thinks we want to see based on its profiling of us.

In his Google I/O Q&A yesterday, Google CEO Larry Page dismissed concerns over the implications of this profiling:

[Audience member]: Most of my opinion, I can trace back to a Google search. As search becomes more and more personalized, and predictive, I worry that it informs my world view and rules out the possibility of some other serendipitous discovery. Any comment on that?

[Page]: People have a lot of concern about that – I’m totally not worried about that at all.

Personally, I don’t like Google’s all-encompassing vision. It’s not the only company that employs such methods but I can easily ignore the likes of Facebook and Bing as I don’t use them. Google is everywhere.

Spurred more by the closure of Google Reader than anything else, a couple of months ago I began to consider alternatives to relying on the benevolence of one omniscient company for the services I use every day.

The first thing to go was GMail. E-mail is important enough to be worth paying for, so I signed up with FastMail (now owned by Opera), which offers an ad-free service from $4.95 per year. It has a refined web interface, IMAP access, and a simple way to import your existing messages from GMail.

I eventually replaced Google Reader with The Old Reader and for search I’m now using DuckDuckGo, which promises not to track its users or filter search results. DuckDuckGo doesn’t quite match Google in terms of the freshness or depth of its results but it’s fast and has some useful features. On DuckDuckGo’s recommendation I also installed Ghostery for Opera to block Google and others’ attempts to track me on third-party sites. I’ve removed the Google +1 buttons from this blog and will be looking for more privacy-friendly alternatives to Google Analytics and Adsense (both of which would be useless anyway if everybody is driven to use the likes of Ghostery).

I’m not giving up on Google entirely; I just prefer to keep it at arm’s length. I’ll still be developing for Android (for which there were many very welcome developer announcements at I/O yesterday) and no doubt I’ll continue to use some of its other services.

We Can Rebuild Twitter – We Have the Technology

Posted in The Internet by Dan on June 30th, 2012

Why do we need Twitter? I don’t mean in the sense of my previous question from 2008 (i.e. what’s the point?), I mean why is it necessary to have a single company controlling “microblogging”? It’s a question that hadn’t occurred to me until I saw Matt Gemmell’s tweet this morning linking to Brent Simmons’ suggestion for a decentralised Twitter-like service:

The discussion is prompted by Twitter tightening its control of the service by restricting what third-party clients and developers can do.

It took me some time to accept Twitter as even vaguely worthwhile. One unexpected way that I’ve found myself using Twitter is as an RSS reader replacement (these days most sites post links to their articles on Twitter), particularly since Google crippled Google Reader’s sharing in order to try to drive adoption of Google+. The fact that Twitter is effectively just doing what RSS feeds and aggregators have been doing for years (microblogging is just blogging with a higher volume of shorter messages) raises the question of why it can’t use a similar decentralised architecture.

Like RSS feeds, Twitter-like feeds could be published anywhere on the open web as long as they conform to some standard format (e.g. RSS with a 140-character limit). People who don’t have a web server to host their feed could use one of a number of competing services just as they might choose Tumblr, Blogger or WordPress to host their blog. In theory clients could fetch the feeds directly from the source although in practice, as with RSS feeds, there would be some efficiencies and other benefits from using intermediate third-party aggregator services. Searching could be handled by Google or another search engine while notifications of mentions could be handled with blog-like pingbacks.
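The idea can be sketched in a few lines. The Python fragment below is purely illustrative (the element names, the plain RSS 2.0 structure and the 140-character limit are my assumptions, not a proposed standard): it publishes a list of short posts as a minimal feed and shows how an aggregator or client would read them back.

```python
import xml.etree.ElementTree as ET

MAX_LEN = 140  # a Twitter-style limit layered on top of an ordinary feed


def build_feed(author_url, posts):
    """Serialise a list of short posts as a minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "link").text = author_url
    for text in posts:
        if len(text) > MAX_LEN:
            raise ValueError(f"post exceeds {MAX_LEN} characters")
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "description").text = text
    return ET.tostring(rss, encoding="unicode")


def read_feed(xml_text):
    """A client or aggregator would fetch a feed URL and parse it like this."""
    root = ET.fromstring(xml_text)
    return [item.findtext("description") for item in root.iter("item")]


feed = build_feed("https://example.net/feed", ["Hello, open web!"])
print(read_feed(feed))  # ['Hello, open web!']
```

Because the feed itself is just XML at a URL, anything that can serve a static file can participate in the network, which is exactly the property that made blogging resistant to centralised control.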

A decentralised system based on open standards would allow different services and clients to compete on implementation and would be resilient to censorship. Service providers could experiment with business models such as different advertising strategies or paid access for a premium service. Client developers could innovate in how they present information without the risk of being blocked from accessing the network. Network admins could set up their own sub-networks based on the same standards, such as a Yammer-like network restricted to a company’s intranet.

Free StackOverflow Careers Invites

Posted in Software Development, The Internet by Dan on June 30th, 2011

About 18 months ago when I was looking for contract work, I signed up for Stack Overflow Careers. This was back when you had to pay to use the service (it’s now free but registration is by invitation only – more on that in a minute). The asking price had quickly dropped from an ambitious $99 a year to a more modest $9 by the time I signed up.

The Careers site has evolved considerably since it launched and now allows you to create quite a sophisticated online CV that aggregates your Stack Exchange activity, open source projects (GitHub, Google Code, etc.) and more. The idea of the service is that employers can search for candidates who appear to be good matches for their vacancies. All the enquiries I had were from companies developing financial software in London, which would have been ideal except for the work and the location.

Anyway, it seems that Stack Overflow is desperate for more people to sign up for Careers. When it went free-but-invite-only I was awarded 5 invites to distribute to suitable individuals. Last week I was given 20 more invites. Today I have been given two further batches of 20 invites. I currently have 63 invites available. If you want to try out the service for yourself, all you have to do is follow this link to claim one of my invites.

Zeitgeist in the Wild – Auto-updated Football Headlines

Posted in The Internet by Dan on May 25th, 2010

Having started with the other day, I am continuing with the novel idea of actually putting code I have written into action. If you are a regular reader here you may remember that some months ago I mentioned a library that I had written called Zeitgeist. Zeitgeist analyses a set of RSS feeds and groups articles by topic.

As a companion to the aforementioned, I am now using Zeitgeist to track football news from various sites across the Internet. The result is This is simply the Zeitgeist publisher application being invoked every 30 minutes by cron.  The site should probably be considered a beta since occasionally the article groupings are unsatisfactory, but for the most part it’s a useful way of keeping up to date with the latest stories without having to monitor dozens of sites yourself.


Posted in Haskell, The Internet by Dan on May 23rd, 2010

A long time ago I wrote a brief round-up of the options for generating HTML output from Haskell.  The reason I was looking into this at that time was because, as an exercise to learn more about programming in Haskell, I was attempting to replicate the functionality of my Football Statistics Applet (FSA) but with pure HTML output rather than a heavyweight interactive Java applet.

The result of this effort was a Haskell program I call Anorak, the vast majority of which I wrote quite a while ago (it’s not going to win any prizes for beautiful Haskell code). It processes FSA data files and, using HStringTemplate, generates static HTML pages containing league tables, form tables, sequences and more.

Having left Anorak dormant for months, yesterday I tidied up a few rough edges and created This online resource provides current and archive statistics for the main football leagues in England, including an all-time Premier League table that incorporates the result of every match played since 1992. I also intend to include all of the main Scottish divisions soon but the Scottish Premier League has a bizarre split structure that, though supported by FSA, is not yet supported by Anorak. Other European leagues (Serie A, La Liga, Bundesliga) will follow some time in the future.

Update: Anorak has been updated to deal with the SPL-style split format and the site has now been expanded to include the SPL and all divisions of the Scottish Football League.

Open Source Graphic Design – New Watchmaker Framework Logo

Posted in Evolutionary Computation, The Internet by Dan on January 11th, 2010

A while ago I created a new website for my main Open Source project, the Watchmaker Framework for Evolutionary Computation. While the new website was a definite improvement over the previous effort, it was still lacking something. It wasn’t distinctive. What I really needed was a logo, something that visually identified the project. But how do you represent evolution and/or a watchmaker in the form of a simple, distinct picture?

It was beyond my modest artistic skills so I headed to Reddit and floated the question of how to find a graphic designer who would be willing to contribute to an Open Source project. I wasn’t expecting much, after all I was asking somebody to work for free on a fairly obscure project.

The usual way to get somebody to make you a logo is to find a professional graphic designer and pay them, or to stump up some cash to fund a contest on or The prices start at $204 at the latter. I got a few suggestions from the Reddit crowd that I should try this monetary compensation idea.

I also got several people offering to produce a logo for me free-of-charge, and even a few who spontaneously decided to create something and submit it as a suggestion (see some of the links in the Reddit thread).  I really wasn’t sure if there would be many graphic designers who were interested in helping out Open Source software projects but it seems there are plenty. What is lacking is somewhere on the web to connect these willing designers with needy projects.

One of the people who got in touch to offer his services was Charles Burdett. He sent me a link to his impressive portfolio. I quickly accepted Charles’ offer before he had a chance to change his mind. This meant turning down a number of other offers that I received. I’m extremely grateful to all of the people who were willing to help me out, and some of the concepts that people suggested would have made very good logos.

Charles came up with the concept that now adorns the Watchmaker project website – the clockwork Ichthyostega. To be honest, I don’t even know how to pronounce that but Wikipedia tells me that Ichthyostega was a creature from the Upper Devonian Period. It was an intermediate form between fish and amphibian, so a significant step in the evolution of life on this planet.

This logo fits in very nicely with the existing site design and, because it’s a simple outline drawing, it should be very versatile for use in different contexts. I’m very happy with the result and I’d like to thank Charles for all of his work on this. If you like this logo, or any of Charles’ other work, he is available for hire.

New Adventures in Software – Top 10 Most Popular Articles of 2009

Posted in The Internet by Dan on December 31st, 2009

The end of the year is here and, as is traditional among bloggers and mainstream media alike, I’ve lazily compiled a top 10 list to mark the occasion without having to exert myself. So here it is, according to Google Analytics, the top 10 articles of 2009 from this sporadically updated blog.

When I checked the stats, four of the top five most read articles this year were actually from 2008, with Why are you still not using Hudson? claiming first place. This list includes only those articles that were first posted in 2009.

  1. 5 Ways to Become a Famous Programmer
  2. Practical Evolutionary Computation: An Introduction
  3. Random Number Generators: There Should be Only One
  4. Practical Evolutionary Computation: Implementation
  5. Debugging Java Web Start Applications
  6. Using PHPUnit with Hudson
  7. Understanding PHP – A journey into the darkness…
  8. Programming the Semantic Web and Beautiful Data
  9. Uncommons Maths 1.2
  10. The Java Language Features that Nobody Uses

Happy New Year.

Programmers’ CVs – 20 years behind the times?

Posted in Software Development, The Internet by Dan on October 24th, 2009

Take a programmer’s CV/résumé from the late 1980s and one from today and, aside from the content, what has changed?

Not much. Both will typically be approximately two pages of static, word-processed, black text on white A4 paper (or US Letter in North America). Maybe the text doesn’t always arrive on actual paper these days thanks to the miracles of electronic document transfer, but the format of the typical CV has not really changed since the demise of the typewriter.

NOTE:  Just to be clear, I am considering CVs and résumés to be equivalent. Wikipedia makes a distinction but I’m assuming that’s mainly an American thing. In the UK there is typically no distinction and the Latin term “Curriculum Vitae”, abbreviated to “C.V.”, is almost universally preferred to the French “résumé” (I’ve no idea why there isn’t an English word for this type of document).

Have we really found the optimal way of communicating our skills and experiences, or has the humble CV been neglected by the Internet revolution? The IT recruitment industry seems wedded to the Microsoft .DOC format. This is partly because of the ubiquity of Microsoft Office in the corporate world and partly because agents prefer to receive CVs in a format that they can easily edit (which is why I insist on sending my CV as a PDF).

Dynamic CVs

Where’s the innovation? Shouldn’t a CV be something more dynamic? And why are we still e-mailing attachments every time we want somebody to see our CVs?  Attached files very quickly become outdated. I often have recruiters that I’ve spoken to in the past phoning me to ask for an updated CV. If my CV was a URL, people would always be able to see the latest version (assuming I let them have access). We do have LinkedIn profiles, which are fine in the context of your LinkedIn network but don’t really work as a general purpose CV.

Fortunately, there are some people trying to drag the programmer’s CV into the 21st century. Ben Northrop’s provides a free online home for your programming CV. The documents are nicely presented and the timeline view is a neat way to display your own personal history. VisualCV goes even further and embraces multimedia content, though the site is not IT-specific. Maybe it’s not a good idea to have a video as part of your CV but it’s nice to have the choice.

If you are planning to break a few conventions with your CV, either on the web or in a static file, it would be useful to measure the impact of any changes that you make, so you might be interested in, which I saw announced today on Reddit. You use it to make documents trackable in much the same way that spammers embed 1-pixel images in e-mails in order to see which messages get read.
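The 1-pixel trick is simple enough to sketch. The Python example below is a hypothetical illustration of the general technique, not how any particular service works: a tiny HTTP server hands out a 1×1 transparent GIF and records the query string, which identifies which copy of the document was opened.

```python
import base64
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A 1x1 transparent GIF, base64-encoded.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

opened = []  # a log of which documents have been seen to be opened


class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        opened.append(self.path)  # the query string identifies the document
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        self.end_headers()
        self.wfile.write(PIXEL)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), PixelHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The document embeds something like <img src=".../pixel.gif?doc=cv-v3">.
# Merely rendering that image is what reveals the document was opened.
port = server.server_address[1]
urllib.request.urlopen(f"http://127.0.0.1:{port}/pixel.gif?doc=cv-v3").read()
server.shutdown()
print(opened)  # ['/pixel.gif?doc=cv-v3']
```

In practice the server would also log the timestamp and requesting IP address, which is all the tracking data these services really need.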

StackOverflow Careers

Another attempt to bring programmer CVs into the Web 2.0 age is the recently announced StackOverflow Careers. The main Stack Overflow site has been a phenomenal success. Co-founder Joel Spolsky has had previous success with the Joel on Software jobs board and we are reminded that he wrote a book on recruiting programmers, so this kind of job-related monetisation was the obvious next step.

Voting peculiarities and reputation anomalies aside, StackOverflow is a meritocracy of sorts and it is this that Joel and business partner Jeff Atwood are attempting to exploit. A CV posted on StackOverflow Careers will be accompanied by a reputation score and a history of contributions to the programming community. The careers feature is currently in beta and is not particularly sophisticated at present but I expect it to expand over time.

I like the idea of expanding the scope of a CV to include other online evidence of programming competence. In this case it’s StackOverflow reputation but it could be Ohloh data or information from Google Code/SourceForge. However, at $99 to list your CV for a year, the current pricing is ridiculous. Most recruitment sites charge employers but let candidates use the service for free.  Joel and Jeff are taking a fee from both sides. The justification for charging candidates is that it will ensure that the only CVs listed on the service will be from people who are actively looking for work, increasing the value of the service to potential employers. It should also mean that your CV page is kept free from advertising.

I suspect that Joel and Jeff are aware that $99 is too much but it’s easier to start out too expensive and reduce the price than it is to do the opposite. The $99 figure also serves to make the introductory offer of $29 for 3 years look more reasonable in comparison. The problem with the introductory offer is that it only runs until 9th November. The site is still in beta and the functionality for employers to sign up has not been launched yet. So, if you sign up now to get the reduced rate, you’ll be paying $29 to list your CV on a site that is not used by any employers. It’s an unproven platform with a bootstrapping problem. Most programmers won’t want to pay money for a speculative premium service and employers are likely to be reluctant to sign up to search a database with so few potential hires.

Netflix Prize Snatched Away at Last Moment?

Posted in Software Development, The Internet by Dan on July 26th, 2009

30 days ago, the BellKor’s Pragmatic Chaos team submitted the first qualifying solution for the $1 million Netflix Prize. The prize is awarded to the best-performing solution 30 days after the first submission that achieves the 10% improvement threshold.

BellKor achieved 10.05% on 26th June and have since moved on to 10.08%. Several teams that were close to the qualifying mark responded by forming coalitions in a frantic race to find a hybrid solution that would surpass BellKor’s mark before the end of the 30-day period.
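For context, those percentages measure RMSE improvement over Netflix’s own Cinematch algorithm, whose published quiz-set RMSE was 0.9514. The arithmetic is straightforward; the helper names in this Python sketch are mine, not Netflix’s:

```python
# Published quiz-set RMSE of Netflix's own Cinematch recommender,
# the baseline against which all entries were scored.
CINEMATCH_RMSE = 0.9514


def improvement(rmse):
    """Percentage improvement of a submission's RMSE over the baseline."""
    return 100.0 * (CINEMATCH_RMSE - rmse) / CINEMATCH_RMSE


def qualifies(rmse):
    """A submission qualifies for the prize at a 10% improvement."""
    return improvement(rmse) >= 10.0


# A 10.05% improvement corresponds to an RMSE of roughly 0.8558.
print(round(CINEMATCH_RMSE * (1 - 0.1005), 4))  # 0.8558
print(qualifies(0.8558))  # True
```

The tiny margins being reported (0.01% between the leaders) thus correspond to RMSE differences of less than 0.0001, which is why the final submissions came down to the wire.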

The Ensemble is one of these super teams.  They achieved the 10% mark two days ago and then today, on the very last day of the competition, they appear to have dramatically snatched the prize with a submission that is just 0.01% better than BellKor’s.

UPDATE: BellKor subsequently submitted an entry that matched the Ensemble’s 10.09% only for the Ensemble to trump that 20 minutes later with a score of 10.10%, 4 minutes before the submissions closed.

UPDATE 2: Simon Owens has posted an interview with one of the members of the winning Ensemble team.

UPDATE 3: The Ensemble themselves have posted an account of the nail-biting final minutes of the competition.

« Older Posts