
Thursday March 22, 2012

08:32 PM: A Monkeypatching Decorator for Python

I had to do a bit of Python monkeypatching today — purely in the name of expediency, you understand — and like most things that are discouraged in Python, monkeypatching turns out to be a rather laborious thing. So, I wrote a simple decorator that automates the tedious bits and works more or less along the lines of Phil Hagelberg’s Robert Hooke library in Clojure.

The operating principle is pretty straightforward: you define a function and use the decorator to specify the function or method to patch it over. When called, your patch function will receive the original function as its first positional argument, followed by any actual arguments that were intended for the real function.

from functools import wraps

def patches(target, name, external_decorator=None):

  def decorator(patch_function):
    # Hang on to whatever is currently installed under this name.
    original_function = getattr(target, name)

    @wraps(patch_function)
    def wrapper(*args, **kw):
      # Hand the original to the patch as an extra first argument.
      return patch_function(original_function, *args, **kw)

    # Re-wrap in a descriptor decorator (e.g. classmethod) if given.
    if external_decorator is not None:
      wrapper = external_decorator(wrapper)

    # Install the wrapper in the original's place.
    setattr(target, name, wrapper)
    return wrapper

  return decorator

Typical usage looks like this:

@patches(SomeClass, 'aMethod')
def aMethodFixed(aMethod, self, foo, bar):
  return aMethod(self, foo, bar + 1) * 2

(It doesn’t really matter what you call the decorated patch function, but anonymous functions aren’t the Python Way.)

Now this function will get called whenever someone invokes .aMethod(...) on an instance of SomeClass (subject to caveats with certain versions of Python, which won’t flush the method cache in some situations). You can do anything you want around calling the original SomeClass.aMethod (which is inserted as an extra initial argument), including not calling it at all.
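
For example, a patch along these lines (the guard condition and name here are invented for illustration) could bypass the original entirely:

@patches(SomeClass, 'aMethod')
def aMethodGuarded(aMethod, self, foo, bar):
  # Hypothetical short-circuit: skip the original call outright
  # for arguments it can't handle, returning a fallback instead.
  if bar < 0:
    return 0
  return aMethod(self, foo, bar)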

Instead of patching entire classes, you can also patch methods on individual instances. This works exactly the same way, except that you pass an instance rather than a class to the decorator, and (as with bound methods generally), you won’t get an explicit self argument when your patch function is called. Normally, lacking an explicit self argument is not a big inconvenience, since you’ll usually have a variable referring to the patched instance in the enclosing lexical environment of your patch function.
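
Here’s a sketch of what instance-level patching looks like (the instance and names are invented for the example):

some_instance = SomeClass()

@patches(some_instance, 'aMethod')
def aMethodForThisInstance(aMethod, foo, bar):
  # No self here: aMethod is already bound to some_instance, and
  # some_instance itself is reachable from the enclosing scope.
  return aMethod(foo, bar + 1)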

The main things that don’t work automagically are methods which use descriptors in some non-default way, such as class methods and static methods. In those cases, the staticmethod or classmethod decorator needs to be applied around the wrapper, but before it’s assigned to the target. In other words, neither of these will work:

@classmethod
@patches(SomeClass, 'someClassMethod')
def someClassMethodPatch(someClassMethod, cls):
  return someClassMethod()

@patches(SomeClass, 'someClassMethod')
@classmethod
def someClassMethodPatch(someClassMethod, cls):
  return someClassMethod()

You’d need to do this instead:

@patches(SomeClass, 'someClassMethod', classmethod)
def someClassMethodPatch(someClassMethod, cls):
  # The original someClassMethod arrives already bound to the class,
  # so there's no need to pass cls back in explicitly.
  return someClassMethod()

(This use case is basically the only reason that the external_decorator argument exists.)

Anyway, there you have it. I’d tell you to use it in good health, but, well—just don’t shoot your eye out. A good heuristic to determine whether it’s appropriate to use this decorator in your code is whether your first thought on reading this article is “Wow, this would be a great way to simplify the code running on my production system.”

(Hint: If that’s your first thought, then please don’t.)


Wednesday December 21, 2011

10:34 PM: www. and Redirects

I mentioned on Twitter that I think most websites ought to provide service at both www. and www.-less hostnames, and that one of these ought to be canonical, with the other redirecting to it.

I’d like to take a little more space to unpack my reasoning, and what I personally do about it.

Read more...


Thursday November 03, 2011

02:35 PM: Asynchronous is not "Fire and Forget": Joiners not Quitters

In all the recent enthusiasm for threaded and asynchronous programming, I’ve noticed people missing something really important: if you fire off an asynchronous task, you will eventually need to wait for it to complete — and you’ll probably need to be able to cancel it as well.

At the basic level, this is why Ruby’s Thread has instance methods such as Thread#join and Thread#kill (even if #kill, specifically, is such a blunt instrument that you should never use it in production code). For the sake of language- and library-neutrality, I’ll call these two sorts of operations join and cancel.
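
Purely as an illustration (in Python rather than Ruby), the concurrent.futures API spells these two operations result() and cancel():

from concurrent.futures import ThreadPoolExecutor

def work():
  return 42

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(work)  # fire off the asynchronous task

# join: block until the task finishes, then collect its result
value = future.result()

# cancel: ask for a not-yet-started task to be abandoned;
# returns False if the task is already running or finished
pending = executor.submit(work)
was_cancelled = pending.cancel()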

These are very important. Let’s begin by looking at join.

Read more...


Saturday June 18, 2011

11:55 AM: Sprockets Versus CommonJS: Require for Client-Side JavaScript

Once you’ve split your client-side JavaScript application into nice, self-contained individual files, you have two new problems:

  1. Your application has to make a bajillion individual HTTP requests to pick up each one of these individual files; the need for so many HTTP transactions kills your load time.
  2. Dependencies among the files require them to be loaded in depth-first order; this may be prohibitively difficult to arrange by hand.

Sprockets is a tool for concatenating multiple JavaScript files into a single one — in the right order. It’s included with Rails 3.1, and in that environment is the recommended way to approach this.

To make sure that a JavaScript file you depend on is included before the file you’re in, just add a special comment of the form //= require "foo" to the top, and the text of the referenced file (in this case, foo.js) will get included at that point if it hasn’t already been included earlier in the concatenation.

To me, though, it’s a little sad that we’ve come full-circle back to the same approach used by the C preprocessor to attain modularity. (Sprockets require even has similar semantics assigned to the use of quotes and angle brackets for included filenames — //= require <foo> searches “system” paths, whereas //= require "foo" looks in your project.) Everybody’s still pooping in the global namespace, and there’s no inter-module isolation except for what you happen to be disciplined enough to impose on yourself.

It’s doubly sad, because in the server-side JavaScript world, we’ve got a perfectly serviceable module system, specified as part of CommonJS. Admittedly, the CommonJS module API isn’t normally appropriate for client-side use, because the require function it specifies is synchronous, whereas loading individual scripts from a browser is an asynchronous activity. However, if you’re going to be concatenating all your JavaScripts together into a single giant file anyway… why not?

Implementing the CommonJS Module API itself isn’t that difficult; a simple implementation of the whole thing seems to require about 70 lines of code:

// module-prelude.js
var require;

(function () {
  var hasOwnProperty = Object.prototype.hasOwnProperty;
  var currentModuleId = "";
  var modules = {};

  require = function (moduleId) {
    moduleId = resolveModuleId(moduleId,
                               currentModuleId);
    if (!hasOwnProperty.call(modules, moduleId)) {
      var message = "No such module " + moduleId;
      throw new ReferenceError(message);
    }
    var module = modules[moduleId];

    // Run the module body on first require; afterwards, hand back
    // the cached exports object.
    var exports = module.exports;
    if (!exports) {
      exports = module.exports = {};
      var body = module.body;
      delete module.body;

      // Track the requiring module's id so that relative requires
      // made inside the body resolve against it.
      var savedModuleId = currentModuleId;
      try {
        currentModuleId = moduleId;
        body(exports, {id: moduleId});
      } finally {
        currentModuleId = savedModuleId;
      }
    }

    return exports;
  };

  // Non-standard extension: register a module body under an id.
  require.defineModule = function (moduleId, body) {
    modules[moduleId] = {body: body};
  };

  // Non-standard extension: eagerly execute every registered module.
  require.loadAllModules = function () {
    for (var moduleId in modules) {
      if (hasOwnProperty.call(modules, moduleId)) {
        require(moduleId);
      }
    }
  };

  // Resolve relative module ids ("./x", "../y") against a base id.
  function resolveModuleId(moduleId, baseId) {
    moduleId = moduleId.split("/");
    var absModuleId;
    if (moduleId[0] === "." || moduleId[0] === "..") {
      absModuleId = baseId.split("/").slice(0, -1);
    } else {
      absModuleId = [];
    }

    for (var i = 0; i < moduleId.length; i++) {
      var component = moduleId[i];
      if (component === ".") {
        // ignore
      } else if (component === "..") {
        absModuleId.pop();
      } else {
        absModuleId.push(component);
      }
    }

    return absModuleId.join("/");
  }
})();

Untested. require.defineModule and require.loadAllModules are non-standard extensions, and unlike in some other server-side contexts, all the modules share natives and the same global object.

Anyway, let’s say you’ve got two CommonJS modules, one of which uses an API exported by the other:

// foo.js
var bar = require("bar");
bar.displayMessage("Hello!");

// bar.js
function displayMessage(message) {
  alert("Message: " + message);
}
exports.displayMessage = displayMessage;

A Sprockets-like tool could then concatenate them all together like so:

// application.js
(function () {
  // content of module-prelude.js

  require.defineModule("foo", function (exports, module) {
    // content of foo.js
  });

  require.defineModule("bar", function (exports, module) {
    // content of bar.js
  });

  require.loadAllModules();
})();

Ideally, I’d like to see Sprockets support this itself, though the issue of backwards-compatibility needs to be considered. It’d be wonderful to be able to use CommonJS modules to structure client-side JavaScript.


Thursday March 03, 2011

01:06 PM: Queueing Theory: Why the Other Line Moves Faster

In this video, Bill Hammack breezes through an introduction to basic queueing theory, the discipline established by Danish mathematician Agner Krarup Erlang. (Yes, the Erlang programming language is named after him.)

In this era of parallel and distributed systems, I think queueing theory is actually a very important area for programmers to be familiar with, but Bill’s explanation should be accessible even to a non-technical audience.

Why the other line is likely to move faster


Sunday February 27, 2011

11:07 PM: Saving Money with Amazon S3 and BitTorrent

I’m not the biggest fan of Amazon lately, but if you happen to be using S3 for hosting big downloads, or if you want to permanently publish a file using BitTorrent without having to maintain your own seed for the rest of time, S3 has a little-used feature that could save you a lot of trouble — and potentially money.

Seeding Torrents from S3

It turns out that S3 will publish and seed a torrent for any publicly-available file stored in S3. This is pretty easy to set up:

  1. Upload a file to S3 and make it public
  2. Visit the file’s URL with ?torrent appended
  3. After a delay, you’ll get a .torrent for that file; save it to your computer
  4. Amazon will seed that torrent for as long as the file remains public

For example, if your uploaded file were available at bucketname.s3.amazonaws.com/my.mp3, the URL to get a .torrent for it would be bucketname.s3.amazonaws.com/my.mp3?torrent.

S3 doesn’t generate the torrent until the first time it’s requested, so you may have to wait a while for the .torrent to be generated if the original file is large.
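
As a sketch, fetching the torrent programmatically is only a few lines (using Python 2’s urllib2; the bucket and key are placeholders):

import urllib2

torrent_url = "http://bucketname.s3.amazonaws.com/my.mp3?torrent"
torrent_data = urllib2.urlopen(torrent_url).read()

# Save the .torrent locally so it can be opened in a client
# or have extra trackers added to it.
with open("my.mp3.torrent", "wb") as f:
  f.write(torrent_data)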

In my experience, when demand outstrips supply, Amazon will actually temporarily spin up additional seeds in order to keep download speeds up (each individual Amazon seed seems to max out around 72kbps). In terms of billing, you’re charged (at the normal S3 rates) for all data downloaded via the Amazon seeds, but peer-to-peer transfers and downloads from other seeds would obviously be free for you.

Technical details of working with BitTorrent and the S3 REST API can be found in Amazon’s developer documentation.

Saving Money

There are basically two scenarios (that I can think of) in which seeding from S3 has the potential to save money:

  1. You’re serving popular downloads from S3 and start using BitTorrent (which reduces the amount of data served from S3 for a given number of downloads)
  2. You’re seeding torrents from an EC2 instance (or other hosted server), where bandwidth costs are typically higher than from S3

In the first case, potential savings are going to be largely proportional to how busy the torrent is. If you only ever have one person downloading at a time, costs will be pretty much the same as if people were downloading via HTTP directly.

In the second case, any difference is going to depend on the exact pricing structure you’re dealing with — for example, the first gigabyte downloaded from EC2 in a billing cycle is free, so if your EC2 seeds never serve significantly more than that, seeding from S3 is actually the more expensive option.

In both cases, savings aren’t guaranteed; it’s important to keep an eye on costs and run the numbers. If you aren’t measuring, you’re losing.
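
As a crude illustration of running those numbers (the rate below is a placeholder — substitute current S3 pricing for your region):

# Placeholder per-GB transfer-out rate; check current S3 pricing.
S3_RATE = 0.12

def monthly_cost(gb_from_amazon_seeds, rate_per_gb):
  # You're billed only for bytes served by Amazon's seeds;
  # peer-to-peer transfers between downloaders cost you nothing.
  return gb_from_amazon_seeds * rate_per_gb

# 100 GB of downloads a month, with peers contributing 60% of it:
via_torrent = monthly_cost(40, S3_RATE)   # $4.80
via_http = monthly_cost(100, S3_RATE)     # $12.00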

To sum up:

Advantages

  1. Setting up torrents for files you already have in S3 is extremely simple
  2. You don’t have to maintain a seed or a tracker yourself
  3. Versus direct downloads from S3, you’re only billed for bytes downloaded from Amazon’s seeds

Limitations

  1. S3 won’t generate torrents for:
    • multiple files at once; multi-file torrents aren’t supported
    • files larger than 5GB
  2. You’re stuck using Amazon’s tracker if you want Amazon’s seeds to work for you. (On the other hand, it’s not that difficult to edit a .torrent to add extra trackers.)
  3. In some lower-usage situations it’s possible that — compared to a seed running on EC2 — S3 bandwidth costs would actually be more expensive.


Friday February 11, 2011

11:12 AM: Neil Gaiman on Copyright, Piracy, and the Web

The Open Rights Group interviews Neil Gaiman about his experiences with online piracy:

(Gaiman on Copyright Piracy and the Web)

Edit: Also, here’s a journal entry that Gaiman posted during his American Gods experiment, responding to the concerns of an independent bookseller:

I don’t see this as either they get it for free or they come and buy it from you. I see it as “Where do you get the people who come in and buy the books that keep you in business from?”

The books you sell have “pass-along” rates. They get bought by one person. Then they get passed along to other people. The other people find an author they like, or they don’t.

When they do, some of them may come in to your book store and buy some paperback backlist titles, or buy the book they read and liked so that they can read it again. You want this to happen.

(Read the rest…)


Thursday February 10, 2011

03:49 AM: Kitten (and Chipmunk) in Slow Motion
