Paul Hammond's Journal

Some things of interest to Paul Hammond, arranged in a reverse chronological order with a feed

Image analysis on the cheap

The recently rebuilt Favcol presented a surprisingly interesting challenge: how to analyze the images.

Image processing at scale is effectively a solved problem. The algorithms are well optimized, and it's trivial to scale horizontally by adding more hardware to your image processing cluster. Sites like Flickr and Picasa have optimized the process enough to resize images on the fly if needed while serving thousands of requests a second.

Scaling image processing down is a different story. I think everyone I've ever talked to about processing images on a small site has a horror story. The story of Favcol is fairly typical.

The first version of Favcol was a Rails application, and used RMagick to manipulate images in memory. It was a disaster. Memory leaks caused processes to grow until the box crashed hard. Reaping processes helped a little, but the server I was running it on was supposed to be doing other things at the same time, and couldn't really wait 60 seconds to recover.

The next version shelled out to the gm GraphicsMagick command to manipulate files, then read the results back from disk. In theory this should have been slower and more expensive; in practice it was significantly more efficient. If there's one piece of advice I can give to anyone thinking about doing any kind of handling of large images, it's to do the hard work in a separate process unless you really know what you're doing. And if you think you know what you're doing, do the hard work in a separate process anyway because you're probably wrong.
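A minimal Python sketch of the separate-process approach, assuming the gm binary is on the PATH. The function names and the 60 second timeout are illustrative, not taken from Favcol's code (which was a Rails app at the time):

```python
import subprocess


def gm_resize_command(src, dst, size=20):
    """Build the argument list for a GraphicsMagick resize (hypothetical helper)."""
    return ["gm", "convert", src, "-resize", "%dx%d" % (size, size), dst]


def resize_in_subprocess(src, dst, size=20, timeout=60):
    """Run the resize in a child process and raise if it fails or hangs.

    Any memory the image library leaks is released when the child exits,
    so the long-running web or cron process stays small.
    """
    subprocess.run(gm_resize_command(src, dst, size), check=True, timeout=timeout)
```

The point of the design is isolation: a crash or a runaway allocation takes out one short-lived child, not the process that launched it.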

Even so, reading a few hundred huge files every five minutes was still killing my server. One day Favcol crashed the machine again. The cron job got disabled. The intent was to fix it quickly, but kids and work and life got in the way and that never happened.

Eventually I started looking at alternatives. Upgrading my virtual server was more expensive than I'm willing to pay to host something like Favcol. I could make the bills cheaper by bringing up an EC2 instance to batch process images for half an hour each day, but part of the fun of Favcol is seeing your photo appear within a few minutes. I looked for online services for image processing, and found many different ways to resize or post-process images and no services to give me an average color. I even briefly considered doing the work on visitors' computers with <canvas>.

Google App Engine kept bubbling up as a potential solution - it's free if you stay below a quota and has a built in image manipulation API. The only problem was that App Engine offers no easy way to get at the raw pixel data for an image that has been processed, which is the only data I needed.

Eventually I realized there is a workaround.

The trick is that PNG files are easy to read, even from high level scripting languages like Python. So you can use the App Engine Image Manipulation Service to convert an image into a smallish PNG, then read the raw data using a pure Python library like pypng:

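import urllib2

import png
from google.appengine.api import images

# go grab the image (url is the address of the photo to analyze)
result = urllib2.urlopen(url)

# resize to a 20px thumbnail
img = images.Image(result.read())
img.resize(20, 20)
thumbnail = img.execute_transforms(
              output_encoding=images.PNG)

# read the thumbnail; pixels is an iterator of rows of flat channel values
r = png.Reader(bytes = thumbnail)
png_w,png_h,pixels,info = r.asDirect()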
It's a hack, but it works well enough to process a few thousand images throughout the day without costing me any money.

The full code I use is up on github. It only does basic RGB mean average at the moment, but it should be easy to add other metrics like dominant colour.
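As a rough sketch of what a basic RGB mean could look like (this is not the code from github, and mean_rgb is a hypothetical name), assuming the rows come in the flat layout that pypng's asDirect() yields and that the thumbnail is RGB or RGBA:

```python
def mean_rgb(rows, planes=3):
    """Average the red, green and blue channels over every pixel.

    rows:   iterable of flat rows (r, g, b[, a], r, g, b[, a], ...).
    planes: values per pixel, 3 for RGB or 4 for RGBA; alpha is ignored.
    """
    totals = [0, 0, 0]
    count = 0
    for row in rows:
        row = list(row)
        for i in range(0, len(row), planes):
            totals[0] += row[i]
            totals[1] += row[i + 1]
            totals[2] += row[i + 2]
            count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Integer division keeps each channel in the 0-255 range.
    return tuple(t // count for t in totals)
```

With the names from the snippet above, the call would be something like mean_rgb(pixels, info['planes']).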

I hope it's useful.

Posted 2010-08-23

JS1k

Nine years ago I entered the 5k competition with a DHTML cellular automaton based on Langton's Ant.

I was pretty proud of being able to get everything into 5k until the JS1k contest launched this month and got me wondering how much smaller I could go.

The answer is 640 bytes.

Some of the savings come from not supporting Internet Explorer and removing unneeded functionality like color schemes. Others come from using canvas instead of absolutely positioned div elements. There are also a lot of optimizations, including some neat modulo arithmetic (like x+=G*(1-v)%2 and x=x<0?w-1:x%w) and a new algorithm using less data (which means less code).
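For anyone unfamiliar with the rules, here is an unminified Python sketch of one Langton's Ant step on a grid that wraps at the edges. It is an illustration of the standard rules, not a port of the 640 byte entry, and the names are made up for the example:

```python
# Direction index: 0 = up, 1 = right, 2 = down, 3 = left (clockwise on screen).
DX = (0, 1, 0, -1)
DY = (-1, 0, 1, 0)


def step(grid, x, y, d):
    """Advance the ant one step; grid is a list of rows of 0 (white) or 1 (black)."""
    h = len(grid)
    w = len(grid[0])
    if grid[y][x]:
        d = (d - 1) % 4  # black cell: turn left
    else:
        d = (d + 1) % 4  # white cell: turn right
    grid[y][x] ^= 1      # flip the cell the ant is leaving
    # Python's % never returns a negative result for a positive modulus, so
    # this wraps in both directions. JavaScript's % keeps the sign of the left
    # operand, which is why the JS version needs the explicit x<0 check.
    x = (x + DX[d]) % w
    y = (y + DY[d]) % h
    return x, y, d
```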

You can see this in context over at the js1k site, or take a look at a local copy. You can also take a look at the original uncompressed javascript source.

While you're at it take a look at the other demos, especially my favourite entry so far, Barry van Oudtshoorn's Swarm.

Posted 2010-08-18

Always ship trunk

I spoke at the O'Reilly Velocity conference this afternoon about using version control to manage web services.

Brian Moon summarized my talk in 140 characters:

“For those that have not gotten the message yet, there is no vcs that does what @ph wants. Your app has to do it. #velocityconf”

I obviously went into a little more detail, including why web applications are different to installed apps, problems that none of the revision control systems solve, and how you can solve some of them yourself within your app.

The slides are online as a PDF file.

Posted 2010-06-24

Favcol works again

For those that don't remember it from the first two times around, Favcol is an ongoing attempt to discover the web's favorite color by averaging out the colors in pictures. The first version (by Matt Webb) used MMS, then later on I rebuilt it to use Flickr.

I had to shut it down a few years back because processing the images was overwhelming my small virtual server. The new version offloads all the heavy lifting to someone else's servers. It's also a little bit prettier (but only a little).

Go play with it.

Posted 2010-06-03

Joining Typekit

Friday will be my last day at Flickr. On Monday I join the team at Small Batch working on Typekit.

We're at an inflection point with type on the web. Typography is important, but we've forgotten about it because our options have been so limited for so long. Growing browser support for @font-face rules makes it technically possible to embed fonts, but that's only part of the story. There are a bunch of other problems that need to be worked on, not least the legal ones, and Typekit is in an ideal position to solve them. I can't wait to be involved.

But I'm incredibly sad to be leaving Flickr. The team is (and always has been) full of the most talented people I've ever worked with, and the product itself remains special in an intangible way that nobody has really replicated since. I have learned so much while working there, in some cases from making some huge mistakes, but mostly by watching people much better than me making their impossible jobs look effortless. We wrote a lot of code, launched a lot of features, fixed a lot of bugs, and had a lot of fun. The goal was to kick ass, and I think we succeeded.

Working at Flickr and Yahoo also gave me the opportunity to move halfway round the world to San Francisco. I am happy here, it feels like home, and I cannot imagine living anywhere else. I'm hugely grateful to Stewart, Caterina, Bradley, Chad and Cal for making that possible.

In fact, I'm hugely grateful to everyone on the Flickr team. You've set high expectations for the next few years, and I'm going to miss working with you.

Posted 2010-03-22

NextMuni API

I spent a chunk of today catching up with the state of public data for the San Francisco Muni.

Minimuni has been broken for a few months now. At some point NextBus changed their website slightly and it broke the screenscraper. Mike told me he had some updated code that worked, but he never sent it to me and I never got round to asking for it.

Today I noticed (via lhl) that NextMuni has an API, so I spent an hour or two exploring it. It's really good.

It has all of the data I was scraping from the website and then some, including current locations of trains, the lat/long of every stop on the network, detailed description of routes, and most importantly expected arrival times. Most of the time spent making Minimuni was getting the scraper working, so I can see how easy access to data will encourage more transit based apps.

So Minimuni works again (with the updated source on github) and if you're interested in getting data about SF Muni you should go look at the SF Muni API.

Posted 2009-11-08

Pingr

Google released XMPP support for App Engine a few days ago and I spent a couple of hours playing with it last night. The result is Pingr, another app that solves a problem for me and me only.

My wonderful wife Amy has only one annoying habit: when she's at home she often leaves her phone in her bag next to the front door and doesn't notice when I'm trying to call or text her. We don't have a home phone, so when this happens I have to log onto IM to send her a message, and at some point she'll notice it on her laptop and get back to me. Pingr reduces this to one step: a "ping" message is sent whenever I press the button on my iPhone home screen:

Of course the app is completely pointless, it's just a good excuse to play with the technology.

As with the rest of App Engine, the XMPP support is (mostly) well documented, and just works. Treating incoming messages the same as incoming HTTP requests is a nice abstraction, and having both in the same framework makes writing apps that bridge the web and XMPP easy. It reminds me of the hacking I did on POE 5 years ago, but much much easier.

There were a couple of rough patches:

There's no documentation on how to test XMPP handlers locally. I found the test page in the SDK console at /_ah/admin/xmpp/ by accident; until then the only way I could think of to test was to deploy it to Google's servers and test in production.
The handling of JIDs is less polished than I'd expect. There's some documentation on "JIDs", "Resources" and "Bare JIDs", but the sample code ignores the details entirely and there are no convenience functions to handle them. It's nothing that a split('/') can't handle, but it's something that will trip up people not already familiar with how XMPP works.
The "Oops. Something went wrong." message gets really annoying really quickly. If I have debugging on, it'd be nice to get some kind of error in context without having to go to the admin console and check the logs.
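To illustrate the JID point, this is roughly the kind of helper that is missing. split_jid is a hypothetical name, not part of the App Engine SDK, and the addresses in the usage note are made up:

```python
def split_jid(jid):
    """Split a full JID into (bare JID, resource).

    The resource is whatever follows the first '/'. Only the first slash is
    a separator, since the resource itself may contain slashes.
    """
    parts = jid.split('/', 1)
    bare = parts[0]
    resource = parts[1] if len(parts) > 1 else None
    return bare, resource
```

So split_jid("amy@example.com/laptop") gives ("amy@example.com", "laptop"), and a JID with no resource gives None for the second value.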

Nitpicking aside, it took me longer to pick colors for the icon than it did to learn the framework from scratch and write the code. That alone makes XMPP in App Engine very interesting.

Posted 2009-09-04

Transparent grids

Inspired by Delta909's Pixel Matrix wallpaper, I've been playing with layering randomly generated alpha transparent PNG grids on top of photos:

This is three images layered on top of each other:

a photo
a grid of white squares with random opacity
a grid of black squares with random opacity

The grids are different sizes to decrease obvious tile repetition. They were generated using a small ruby script.
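The original script was Ruby; as a rough equivalent, here is a Python sketch that builds the pixel data for one such grid. The function name, parameters and cell sizes are assumptions for illustration, not a translation of that script:

```python
import random


def random_alpha_grid(cols, rows, cell, rgb=(255, 255, 255), rng=None):
    """Return RGBA pixel rows for a grid of squares with random opacity.

    cols, rows: number of squares across and down.
    cell:       width and height of each square in pixels.
    rgb:        colour of every square; only the alpha varies.
    """
    rng = rng or random.Random()
    r, g, b = rgb
    pixel_rows = []
    for _ in range(rows):
        # One random alpha per square in this band of the grid.
        alphas = [rng.randint(0, 255) for _ in range(cols)]
        row = []
        for a in alphas:
            row.extend([r, g, b, a] * cell)
        # Repeat the same pixel row to make each square `cell` pixels tall.
        pixel_rows.extend(list(row) for _ in range(cell))
    return pixel_rows
```

The rows are flat RGBA values, so they could be handed to a PNG writer such as pypng's png.Writer(width=cols * cell, height=rows * cell, greyscale=False, alpha=True). Calling it once with rgb=(255, 255, 255) and once with rgb=(0, 0, 0), using different cell sizes, would give the two layers described above.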

Posted 2009-07-05

Dev and Ops Cooperation at Velocity 2009

I spoke at Velocity 2009 on Tuesday with my friend John Allspaw about how we work together at Flickr to make a lot of changes to the site without breaking it too often.

This is the third talk I've given with others around how to work together, but the first time I've tackled the ops/dev connection. It seems this relationship is easier to reflect on; it felt more natural giving direct advice in our talk instead of vague handwaving about designers and developers being nice to each other.

The slides are available as a PDF, or you can look at them on slideshare. A video of the session is up on blip.tv, and there's even a code.flickr.com post about the whole thing.

Posted 2009-06-28

The old new webkit2png

I've had a new version of webkit2png ready to release for over 4 years now. Until today I'd just never got around to pushing it out the door.

The new version includes a bunch of random bug fixes and some new options. The most interesting change is a --delay flag that makes it possible to take screenshots of websites that do things after the page loads, which is particularly useful for sites that use Flash.

While I was at it, I also spruced up the instructions, relicensed the code under the MIT license, and uploaded the complete revision history to GitHub.

Go get it.

Posted 2009-03-28