Blog

Yehuda Katz is a member of the Ruby on Rails core team, and lead developer of the Merb project. He is a member of the jQuery Core Team, and a core contributor to DataMapper. He contributes to many open source projects, like Rubinius and Johnson, and works on some he created himself, like Thor.

JavaScript Needs Blocks

January 10th, 2012

While reading Hacker News posts about JavaScript, I often come across the misconception that Ruby’s blocks are essentially equivalent to JavaScript’s “first class functions”. Because the ability to pass functions around, especially when you can create them anonymously, is extremely powerful, the fact that both JavaScript and Ruby have a mechanism to do so makes it natural to assume equivalence.

In fact, when people talk about why Ruby’s blocks are different from Python’s functions, they usually talk about anonymity, something that Ruby and JavaScript share, but Python does not have. At first glance, a Ruby block is an “anonymous function” (or colloquially, a “closure”) just as a JavaScript function is one.

This impression, which I admittedly shared in my early days as a Ruby/JavaScript developer, misses an important subtlety that turns out to have large implications. This subtlety is often referred to as “Tennent’s Correspondence Principle”. In short, Tennent’s Correspondence Principle says:

“For a given expression expr, lambda expr should be equivalent.”

This is also known as the principle of abstraction, because it means that it is easy to refactor common code into methods that take a block. For instance, consider the common case of file resource management. Imagine that the block form of File.open didn’t exist in Ruby, and you saw a lot of the following in your code:

begin
  f = File.open(filename, "r")
  # do something with f
ensure
  f.close
end

In general, when you see some code that has the same beginning and end, but a different middle, it is natural to refactor it into a method that takes a block. You would write a method like this:

def read_file(filename)
  f = File.open(filename, "r")
  yield f
ensure
  f.close
end

And you’d refactor instances of the pattern in your code with:

read_file(filename) do |f|
  # do something with f
end
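
Here is a runnable sketch of that refactoring. It assumes a temporary file stands in for filename; the Tempfile setup and the names tmp, handle and first_line are illustrative and not part of the original code. The only change to read_file is a nil guard in the ensure clause, so a failed open does not raise a second error.

```ruby
require "tempfile"

# Same shape as the read_file method above: open, yield, always close.
def read_file(filename)
  f = File.open(filename, "r")
  yield f
ensure
  f.close if f
end

# Illustrative setup: a temporary file stands in for a real one.
tmp = Tempfile.new("read_file_demo")
tmp.write("hello\nworld\n")
tmp.close

handle = nil
first_line = read_file(tmp.path) do |f|
  handle = f          # keep a reference so we can inspect it afterwards
  f.gets.chomp        # the block's value becomes read_file's return value
end

puts first_line       # the line read inside the block
puts handle.closed?   # the ensure clause closed the file
```

The block body is exactly the code that would have sat between File.open and f.close before the refactoring.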

In order for this strategy to work, it’s important that the code inside the block look the same after refactoring as before. We can restate the correspondence principle in this case as:

# do something with f

should be equivalent to:

do
  # do something with f
end

At first glance, it looks like this is true in Ruby and JavaScript. For instance, let’s say that what you’re doing with the file is printing its mtime. You can easily refactor the equivalent in JavaScript:

try {
  // imaginary JS file API
  var f = File.open(filename, "r");
  sys.print(f.mtime);
} finally {
  f.close();
}

Into this:

read_file(filename, function(f) {
  sys.print(f.mtime);
});

Cases like this, which are genuinely quite elegant, give people the mistaken impression that Ruby and JavaScript have a roughly equivalent ability to refactor common functionality into anonymous functions.

However, consider a slightly more complicated example, first in Ruby. We’ll write a simple class that calculates a File’s mtime and retrieves its body:

class FileInfo
  def initialize(filename)
    @name = filename
  end
 
  # calculate the File's +mtime+
  def mtime
    f = File.open(@name, "r")
    mtime = mtime_for(f)
    return "too old" if mtime < (Time.now - 1000)
    puts "recent!"
    mtime
  ensure
    f.close
  end
 
  # retrieve that file's +body+
  def body
    f = File.open(@name, "r")
    f.read
  ensure
    f.close
  end
 
  # a helper method to retrieve the mtime of a file
  def mtime_for(f)
    File.mtime(f)
  end
end

We can easily refactor this code using blocks:

class FileInfo
  def initialize(filename)
    @name = filename
  end
 
  # refactor the common file management code into a method
  # that takes a block
  def mtime
    with_file do |f|
      mtime = mtime_for(f)
      return "too old" if mtime < (Time.now - 1000)
      puts "recent!"
      mtime
    end
  end
 
  def body
    with_file { |f| f.read }
  end
 
  def mtime_for(f)
    File.mtime(f)
  end
 
private
  # this method opens a file, calls a block with it, and
  # ensures that the file is closed once the block has
  # finished executing.
  def with_file
    f = File.open(@name, "r")
    yield f
  ensure
    f.close
  end
end

Again, the important thing to note here is that we could move the code into a block without changing it. Unfortunately, this same case does not work in JavaScript. Let’s first write the equivalent FileInfo class in JavaScript.

// constructor for the FileInfo class
FileInfo = function(filename) {
  this.name = filename;
};
 
FileInfo.prototype = {
  // retrieve the file's mtime
  mtime: function() {
    try {
      var f = File.open(this.name, "r");
      var mtime = this.mtimeFor(f);
      if (mtime < new Date() - 1000) {
        return "too old";
      }
      sys.print(mtime);
    } finally {
      f.close();
    }
  },
 
  // retrieve the file's body
  body: function() {
    try {
      var f = File.open(this.name, "r");
      return f.read();
    } finally {
      f.close();
    }
  },
 
  // a helper method to retrieve the mtime of a file
  mtimeFor: function(f) {
    return File.mtime(f);
  }
};

If we try to convert the repeated code into a method that takes a function, the mtime method will look something like:

function() {
  // refactor the common file management code into a method
  // that takes a block
  this.withFile(function(f) {
    var mtime = this.mtimeFor(f);
    if (mtime < new Date() - 1000) {
      return "too old";
    }
    sys.print(mtime);
  });
}

There are two very common problems here. First, this has changed contexts. We can fix this by allowing a binding as a second parameter, but that means every time we refactor to a lambda, we have to remember to accept a binding parameter and pass it in. The var self = this pattern emerged in JavaScript primarily because of the lack of correspondence.

This is annoying, but not deadly. More problematic is the fact that return has changed meaning. Instead of returning from the outer function, it returns from the inner one.

This is the right time for JavaScript lovers (and I write this as a sometimes JavaScript lover myself) to argue that return behaves exactly as intended, and this behavior is simpler and more elegant than the Ruby behavior. That may be true, but it doesn’t alter the fact that this behavior breaks the correspondence principle, with very real consequences.

Instead of effortlessly refactoring code with the same start and end into a function taking a function, JavaScript library authors need to consider the fact that consumers of their APIs will often need to perform some gymnastics when dealing with nested functions. In my experience as an author and consumer of JavaScript libraries, this leads to many cases where it’s just too much bother to provide a nice block-based API.

In order to have a language with return (and possibly super and other similar keywords) that satisfies the correspondence principle, the language must, like Ruby and Smalltalk before it, have a function lambda and a block lambda. Keywords like return always return from the function lambda, even inside of block lambdas nested inside. At first glance, this appears a bit inelegant, and language partisans often accuse Ruby of unnecessarily having two types of “callables”. In my experience as an author of large libraries in both Ruby and JavaScript, however, it results in more elegant abstractions in the end.
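
Ruby itself can show both behaviors side by side, because it has blocks and lambdas. The sketch below is only an illustration: the Checker class and its with_value helper are hypothetical stand-ins for FileInfo and with_file. The lambda version behaves the way a JavaScript function does.

```ruby
class Checker
  def initialize(limit)
    @limit = limit
  end

  # A minimal helper that takes a block, standing in for with_file.
  def with_value(value)
    yield value
  end

  # Block version: `return` leaves check_with_block itself, and
  # self (and therefore @limit) is still the Checker instance.
  def check_with_block(value)
    with_value(value) do |v|
      return "too big" if v > @limit
    end
    "ok"
  end

  # Lambda version: `return` only leaves the lambda, so execution
  # continues and the method always reaches its last line.
  def check_with_lambda(value)
    inner = lambda do |v|
      return "too big" if v > @limit
    end
    inner.call(value)
    "ok"
  end
end

checker = Checker.new(10)
puts checker.check_with_block(50)    # the block's return exits the method
puts checker.check_with_lambda(50)   # the lambda's return is swallowed
```

With the same input, the block version reports "too big" while the lambda version falls through to "ok", which is the change in meaning described above.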

Iterators and Callbacks

It’s worth noting that block lambdas only make sense for functions that take functions and invoke them immediately. In this context, keywords like return, super and Ruby’s yield make sense. These cases include iterators, mutex synchronization and resource management (like the block form of File.open).

In contrast, when functions are used as callbacks, those keywords no longer make sense. What does it mean to return from a function that has already returned? In these cases, typically involving callbacks, function lambdas make a lot of sense. In my view, this explains why JavaScript feels so elegant for evented code that involves a lot of callbacks, but somewhat clunky for the iterator case, and Ruby feels so elegant for the iterator case and somewhat more clunky for the evented case. In Ruby’s case, (again in my opinion), this clunkiness is more from the massively pervasive use of blocks for synchronous code than a real deficiency in its structures.
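
Ruby gives a concrete answer to that question. In this small sketch (register_callback is a hypothetical name), a block containing return is invoked after the method that created it has finished, and Ruby raises LocalJumpError:

```ruby
# Hypothetical callback factory, for illustration only.
def register_callback
  # The block's `return` refers to register_callback, which will
  # already have finished by the time the callback runs.
  proc { return :done }
end

callback = register_callback

result =
  begin
    callback.call
  rescue LocalJumpError => e
    "LocalJumpError: #{e.message}"
  end

puts result
```

This is why function lambdas, whose return is purely local, are the better fit for deferred callbacks.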

Because of these concerns, the ECMA working group responsible for ECMAScript, TC39, is considering adding block lambdas to the language. This would mean that the above example could be refactored to:

FileInfo = function(name) {
  this.name = name;
};
 
FileInfo.prototype = {
  mtime: function() {
    // use the proposed block syntax, `{ |args| }`.
    this.withFile { |f|
      // in block lambdas, +this+ is unchanged
      var mtime = this.mtimeFor(f);
      if (mtime < new Date() - 1000) {
        // block lambdas return from their nearest function
        return "too old";
      }
      sys.print(mtime);
    }
  },
 
  body: function() {
    this.withFile { |f| f.read(); }
  },
 
  mtimeFor: function(f) {
    return File.mtime(f);
  },
 
  withFile: function(block) {
    try {
      var f = File.open(this.name, "r");
      block(f);
    } finally {
      f.close();
    }
  }
};

Note that a parallel proposal, which replaces function-scoped var with block-scoped let, will almost certainly be accepted by TC39, which would slightly, but not substantively, change this example. Also note block lambdas automatically return their last statement.

Our experience with Smalltalk and Ruby shows that people do not need to understand the SCARY correspondence principle for a language that satisfies it to yield the desired results. I love the fact that the concept of “iterator” is not built into the language, but is instead a consequence of natural block semantics. This gives Ruby a rich, broadly useful set of built-in iterators, and language users commonly build custom ones. As a JavaScript practitioner, I often run into situations where using a for loop is significantly more straightforward than using forEach, always because of the lack of correspondence between the code inside a built-in for loop and the code inside the function passed to forEach.
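
As an illustration of a custom iterator, here is a minimal sketch. The method names are made up for this example, and Ruby already ships each_cons, which covers the same ground. The iterator is nothing more than a method that yields, and return inside the block exits the calling method just as it would inside a plain loop.

```ruby
# A custom iterator: yields each adjacent pair of elements.
def each_adjacent_pair(items)
  (0...items.length - 1).each do |i|
    yield items[i], items[i + 1]
  end
end

# `return` inside the block exits first_descent, not just the block,
# so an early exit reads the same as it would in a for loop.
def first_descent(numbers)
  each_adjacent_pair(numbers) do |a, b|
    return [a, b] if b < a
  end
  nil
end

p first_descent([1, 3, 7, 4, 9])   # first pair where the value drops
p first_descent([1, 2, 3])         # no such pair
```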

For the reasons described above, I strongly approve of the block lambda proposal and hope it is adopted.

Amber.js (formerly SproutCore 2.0) is now Ember.js

December 12th, 2011

After we announced Amber.js last week, a number of people brought Amber Smalltalk, a Smalltalk implementation written in JavaScript, to our attention. After some communication with the folks behind Amber Smalltalk, we started a discussion on Hacker News about what we should do.

Most people told us to stick with Amber.js, but a sizable minority told us to come up with a different name. After thinking about it, we didn’t feel good about the conflict and decided to choose a new name.

Henceforth, the project formerly known as SproutCore 2.0 will be known as Ember.js. Our new website is up at www.emberjs.com

(and yes, we know this is pretty ridiculous)

Announcing Amber.js

December 8th, 2011

A little over a year ago, I got my first serious glimpse at SproutCore, the JavaScript framework Apple used to build MobileMe (now iCloud). At the time, I had worked extensively with jQuery and Rails on client-side projects, and I had never found the arguments for the “solutions for big apps” very compelling. Back then, most of the arguments (at least within the jQuery community) focused on bringing more object orientation to JavaScript, but I never felt that they offered the layers of abstraction you really want to manage complexity.

When I first started to play with SproutCore, I realized that the bindings and computed properties were what gave it its real power. Bindings and computed properties provide a clean mechanism for building the layers of abstractions that improve the structure of large applications.

But even before I got involved in SproutCore, I had an epiphany one day when playing with Mustache.js. Because Mustache.js was a declarative way of describing a translation from a piece of JSON to HTML, it seemed to me that there was enough information in the template to also update the template when the underlying data changed. Unfortunately, Mustache.js itself lacked the power to implement this idea, and I was still lacking a robust enough observer library.

Not wanting to build an observer library in isolation (and believing that jQuery’s data support would work in a pinch), I started working on the first problem: building a template engine powerful enough to build automatically updating templates. The kernel of the idea for Handlebars (helpers and block helpers as the core primitives) came out of a discussion with Carl Lerche back when we were still at Engine Yard, and I got to work.

When I met SproutCore, I realized that it provided a more powerful observer library than anything I was considering at the time for the data-binding aspect of Handlebars, and that SproutCore’s biggest weakness was the lack of a good templating solution in its view layer. I also rapidly became convinced that bindings and computed properties were a significantly better abstraction, and allowed for hiding much more complexity, than manually binding observers.

After some months of retooling SproutCore with Tom Dale to take advantage of an auto-updating templating solution that fit cleanly into SproutCore’s binding model, we reached a crossroads. SproutCore itself was built from the ground up to provide a desktop-like experience on desktop browsers, and our ultimate plan had started to diverge from the widget-centric focus of many existing users of SproutCore. After a lot of soul-searching, we decided to start from scratch with SproutCore 2.0, taking with us the best, core ideas of SproutCore, but leaving the large, somewhat sprawling codebase behind.

Since early this year, we have worked with several large companies, including ZenDesk, BazaarVoice and LivingSocial, to iterate on the core ideas that we started from to build a powerful framework for building ambitious applications.

Throughout this time, though, we became increasingly convinced that calling what we were building “SproutCore 2.0” was causing a lot of confusion, because SproutCore 1.x was primarily a native-style widget library, while SproutCore 2.0 was a framework for building web-based applications using HTML and CSS for the presentation layer. This lack of overlap causes serious confusion in the IRC room, mailing list, blog, when searching on Google, etc.

To clear things up, we have decided to name the SproutCore-inspired framework we have been building (so far called “SproutCore 2.0”) “Amber.js”. Amber brings a proven MVC architecture to web applications, as well as features that eliminate common boilerplate. If you played with SproutCore and liked the concepts but felt like it was too heavy, give Amber a try. And if you’re a Backbone fan, I think you’ll love how little code you need to write with Amber.

In the next few days, we’ll be launching a new website with examples, documentation, and download links. Stay tuned for further updates soon.

UPDATE: The code for Amber.js is still, as of December 8, hosted at the SproutCore organization. It will be moved and re-namespaced within a few days.

How to Marshal Procs Using Rubinius

November 19th, 2011

The primary reason I enjoy working with Rubinius is that it exposes, to Ruby, much of the internal machinery that controls the runtime semantics of the language. Further, it exposes that machinery primarily in order to enable user-facing semantics that are typically implemented in the host language (C for MRI, C and C++ for MacRuby, Java for JRuby) to be implemented in Ruby itself.

There is, of course, quite a bit of low-level functionality in Rubinius implemented in C++, but a surprising number of things are implemented in pure Ruby.

One example is the Binding object. To create a new binding in Rubinius, you call Binding.setup:

def self.setup(variables, code, static_scope, recv=nil)
  bind = allocate()
 
  bind.self = recv || variables.self
  bind.variables = variables
  bind.code = code
  bind.static_scope = static_scope
  return bind
end

This method takes a number of more primitive constructs, which I will explain as this article progresses, but we can describe the constructs that make up the high-level Ruby Binding in pure Ruby.

In fact, Rubinius implements Kernel#binding itself in terms of Binding.setup.

def binding
  return Binding.setup(
    Rubinius::VariableScope.of_sender,
    Rubinius::CompiledMethod.of_sender,
    Rubinius::StaticScope.of_sender,
    self)
end

Yes, you’re reading that right. Rubinius exposes the ability to extract the constructs that make up a binding, one at a time, from a caller’s scope. And this is not just a hack (like Binding.of_caller for a short time in MRI). It’s core to how Rubinius manages eval, which of course makes heavy use of bindings.
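
For readers less familiar with bindings, here is a sketch of what a Binding carries, using only standard Ruby (Kernel#binding and eval) rather than any Rubinius internals. The make_binding helper and the secret variable are illustrative names.

```ruby
# Hypothetical helper: returns a Binding for its own local scope.
def make_binding
  secret = 42
  binding
end

b = make_binding

# Code evaluated against the binding can still see the local
# variables that were in scope when the binding was created,
# even though make_binding has already returned.
value = eval("secret", b)
puts value

# New code evaluated in the binding runs in that same scope.
puts eval("secret + 1", b)
```

The variables and self captured here correspond to the pieces that Binding.setup receives explicitly.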

Marshalling Procs

For a while, I have wanted the ability to Marshal.dump a proc in Ruby. MRI has historically disallowed it, but there’s nothing conceptually impossible about it. A proc itself is a blob of executable code, a local variable scope (which is just a bunch of pointers to other objects), and a constant lookup scope. Rubinius exposes each of these constructs to Ruby, so Marshaling a proc simply means figuring out how to Marshal each of these constructs.
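
To see the restriction, the following sketch tries to dump a proc. It assumes MRI, where Marshal.dump raises a TypeError for procs; the adder name is illustrative.

```ruby
adder = proc { |a, b| a + b }

outcome =
  begin
    Marshal.dump(adder)
    "dumped"
  rescue TypeError => e
    # On MRI the message explains that Proc has no dump support.
    "TypeError: #{e.message}"
  end

puts outcome
```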

Let’s take a quick detour to learn about the constructs in question.

Rubinius::StaticScope

Rubinius represents Ruby’s constant lookup scope as a Rubinius::StaticScope object. Perhaps the easiest way to understand it would be to look at Ruby’s built-in Module.nesting function.

module Foo
  p Module.nesting
 
  module Bar
    p Module.nesting
  end
end
 
module Foo::Bar
  p Module.nesting
end
 
# Output:
# [Foo]
# [Foo::Bar, Foo]
# [Foo::Bar]

Every execution context in Rubinius has a Rubinius::StaticScope, which may optionally have a parent scope. In general, the top static scope (the static scope with no parent) in any execution context is Object.

Because Rubinius allows us to get the static scope of a calling method, we can implement Module.nesting in Rubinius:

def nesting
  scope = Rubinius::StaticScope.of_sender
  nesting = []
  while scope and scope.module != Object
    nesting << scope.module
    scope = scope.parent
  end
  nesting
end

A static scope also has an additional property called current_module, which is used during class_eval to define which module the runtime should add new methods to.

Adding Marshal.dump support to a static scope is therefore quite easy:

class Rubinius::StaticScope
  def marshal_dump
    [@module, @current_module, @parent]
  end
 
  def marshal_load(array)
    @module, @current_module, @parent = array
  end
end

These three instance variables are defined as Rubinius slots, which means that they are fully accessible to Ruby as instance variables, but don’t show up in the instance_variables list. As a result, we need to explicitly dump the instance variables that we care about and reload them later.
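
The marshal_dump and marshal_load hooks are standard Ruby and work the same way on ordinary classes. The sketch below uses a hypothetical DemoScope class (not a Rubinius type) with a parent chain, and leaves one instance variable out of the dump on purpose to show that only the listed state travels.

```ruby
# Hypothetical stand-in for a scope object with a parent chain.
class DemoScope
  attr_reader :name, :parent

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @cache = {}   # transient state we choose not to serialize
  end

  # Marshal calls this and serializes the returned array instead of
  # the object's full set of instance variables.
  def marshal_dump
    [@name, @parent]
  end

  # Marshal calls this on a freshly allocated object when loading.
  def marshal_load(array)
    @name, @parent = array
  end
end

outer = DemoScope.new("Foo")
inner = DemoScope.new("Foo::Bar", outer)

copy = Marshal.load(Marshal.dump(inner))
puts copy.name          # the scope's own name
puts copy.parent.name   # the parent was dumped and restored recursively
```

Because @parent is itself a DemoScope, Marshal applies the same hooks to it, which is how a whole chain of scopes can round-trip.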

Rubinius::CompiledMethod

A compiled method holds the information necessary to execute a blob of Ruby code. Some important parts of a compiled method are its instruction sequence (a list of the compiled instructions for the code), a list of any literals it has access to, names of local variables, its method signature, and a number of other important characteristics.

It’s actually quite a complex structure, but Rubinius already knows how to convert an in-memory CompiledMethod into a String, as it dumps compiled Ruby files into compiled files as part of its normal operation. There is one small caveat: this String form that Rubinius uses for its compiled method does not include its static scope, so we will need to include the static scope separately in the marshaled form. Since we already told Rubinius how to marshal a static scope, this is easy.

class Rubinius::CompiledMethod
  def _dump(depth)
    Marshal.dump([@scope, Rubinius::CompiledFile::Marshal.new.marshal(self)])
  end
 
  def self._load(string)
    scope, dump = Marshal.load(string)
    cm = Rubinius::CompiledFile::Marshal.new.unmarshal(dump)
    cm.scope = scope
    cm
  end
end
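
The _dump and _load pair is the other standard Marshal hook: the instance method returns a String, and the class method rebuilds an object from it. A minimal sketch on a hypothetical DemoMethod class (again not a Rubinius type) follows, using the same trick of nesting a second Marshal.dump inside the String.

```ruby
# Hypothetical class whose payload already has its own string form,
# plus one extra field that the string form leaves out.
class DemoMethod
  attr_accessor :source, :scope_name

  def initialize(source, scope_name = nil)
    @source = source
    @scope_name = scope_name
  end

  # Instance hook: must return a String. Here we nest a second
  # Marshal.dump so the extra field travels with the payload.
  def _dump(depth)
    Marshal.dump([@scope_name, @source])
  end

  # Class hook: receives that String and rebuilds the object.
  def self._load(string)
    scope_name, source = Marshal.load(string)
    new(source, scope_name)
  end
end

original = DemoMethod.new("puts 1", "Foo")
restored = Marshal.load(Marshal.dump(original))
puts restored.source
puts restored.scope_name
```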

Rubinius::VariableScope

A variable scope represents the state of the current execution context. It contains all of the local variables in the current scope, the execution context currently in scope, the current self, and several other characteristics.

I wrote about the variable scope before. It’s one of my favorite Rubinius constructs, because it provides a ton of useful runtime information to Ruby that is usually locked away inside the native implementation.

Dumping and loading the VariableScope is also easy:

class VariableScope
  def _dump(depth)
    Marshal.dump([@method, @module, @parent, @self, nil, locals])
  end
 
  def self._load(string)
    VariableScope.synthesize *Marshal.


