Tender Lovemaking

My Jerky Setup

Posted by Aaron Patterson – January 14, 2012

I love making beef jerky. I started making jerky by using Alton Brown’s recipe, but I found his jerky making apparatus to be lacking in a few key areas, so I put together my own jerky making setup. I use a modified food dehydrator for making my jerky. I prefer using a food dehydrator because it’s easy to clean, efficient at circulating air, and easily adjustable to accommodate larger or smaller batches of meat. Jerky is supposed to be air dried with no heat. Drying with no heat required modification of my dehydrator. I would like to share my modification process with you!

Getting started

I use a Nesco American Harvest dehydrator. I found mine on sale for about $50, but it looks like the going rate on Amazon is about $65.

I like this dryer because the fan is on top, it’s easy to clean, and you can buy and use extra trays. I think it goes up to 10 or 15 trays, but I only have 8.

Step 1: Remove the top nob

In order to disconnect the heating element, we must remove the plastic bar that goes across the top of the fan. The first step is to remove the knob:

Just pull the knob straight up, and it should come off:

Step 2: Disconnect the bar

Flip the lid over. There should be four screws outside the ring. Remove just those four screws:

Flip the lid back over, and you should be able to remove the center plastic bar with the Nesco logo on it:

Step 3: Disconnect the heater

These two leads go to the potentiometer that controls the heater:

Disconnect one. The circuit will open, and the heater disabled:

I taped mine down with electrical tape:

That’s it!

Just follow the steps backwards to reassemble your dryer. The electrically inclined may want to enhance their dryer with a switch. I (on the other hand) am lazy. I bought a second dryer so that I could make jerky and do normal food dehydrating at the same time.

You can see more pictures of the process here. I plan to write a follow up post where I actually make jerky. Yum!

Category: life
5 Comments – Feed

Profiling Rails startup with DTrace

Posted by Aaron Patterson – December 5, 2011

TL;DR: I’m adding DTrace probes to Ruby, and I used those to make a 6% performance improvement to rake environment.

Lately I’ve been working on adding DTrace probes to Ruby 2.0. I am interested in using them to profile and improve rails startup time. Today I want to look at building a profiler with DTrace, finding slow method calls, and making some incremental improvements.

The examples I’m going to show will only work on systems where DTrace is available, and with either stock Apple MRI 1.8.7, or my special fork of Ruby. Likely you don’t have my fork, so if you’d like to play along, you’ll have to use Apple’s Ruby 1.8.7. I’ve published my branch here, but I will be adding and moving probes, so please don’t depend on it. Eventually it will be merged to ruby trunk (hopefully).

All of these examples will be executing in the context of a Rails application built from Rails master.

What to profile?

It’s easy to say that application boot time is not important. When we start our applications in production, they are only booted once, so the cost is only paid once. But as an application developer, the “boot once” argument doesn’t hold water. Every time we run a rake task, we boot the app. Every time we run tests, we boot the app. Every time we run migrations, we boot the app. Every time we restart our webserver, we boot the app. In the context of application development, class reloading helps to some extent, but not when someone runs rake test.

There are good efforts surrounding the improvement of Rails test speeds, but I would like to suggest that these are just working around a larger problem: booting Rails is slow. If we can make Rails boot times fast, I think it would eliminate the need for testing outside the rails context.

So where do we start? Typically, we begin at the beginning. Today we’ll start profiling the rake environment command. This task simply requires all of the files for your application, readying everything for execution against a Rails context. Many things can impact the startup time of our application: file parse time, method call time, GC, etc. Today we’re going to focus on slow methods, and using DTrace to find those methods.

Viewing method calls

The DTrace probes in Ruby allow us to detect when method calls are made. We can tell when Ruby enters a method by watching the function-entry probe. This probe is given:

recipient class name
method name
file name and line number of the call site

Here is an example DTrace script that prints out all of this information each time a function-entry probe is activated:

ruby*:::function-entry
{
  printf("-> %s::%s (%s:%d)\n", copyinstr(arg0), copyinstr(arg1), copyinstr(arg2), arg3);
}

Save this to a file, and run the file along with rake environment like so:

$ sudo dtrace -q -s calls.d -c'rake environment'

You’ll see all of the method call entries, and what file and line made the call. Here is a sample of the output:

-> Module::local_constant_names (/Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb:85)
-> Module::local_constants (/Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb:78)
-> Module::constants (/Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb:79)
-> Array::map (/Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb:86)
-> Symbol::to_s (/Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb:86)

Counting method calls

We were able to see all of the method calls, but this doesn’t give us a good understanding of where the program is spending time. Instead of just viewing method calls, let’s use the aggregation functionality in DTrace to count the number of times a method is called. To do this, we use @ variables in DTrace. @ variables are known as “aggregation variables” and let us, well, aggregate things.

Aggregate variables are associative arrays that allow arbitrary keys. We can modify our DTrace program to aggregate the count of the recipient class name and method name pairs like so:

ruby*:::function-entry
{
  @[copyinstr(arg0), copyinstr(arg1)] = count();
}

When the program finishes executing, DTrace will automatically output the aggregate variable. Running the program yields this output (I’ve truncated the output to the last 10 lines or so):

$ sudo dtrace -q -s calls.d -c'rake environment'

  Class                 core#define_method           6332
  Symbol                ==                           6612
  Module                method_added                 9329
  Kernel                hash                         9437
  Array                 hash                        11966
  Array                 each                        13101
  Fixnum                ^                           15540
  Module                ===                         20267
  String                hash                        37878
  Symbol                to_s                       214691
$

Great! We can see that when rake environment is run, we’re calling Symbol#to_s over 200,000 times. Certainly this will put a burden on the Garbage Collector. If we change our program to aggregate by class, method, and call site, we can find out who is making so many calls. Let’s modify the program to add file and line number information:

ruby*:::function-entry
{
  @[copyinstr(arg0), copyinstr(arg1), copyinstr(arg2), arg3] = count();
}

Rerun the program, and we’ll see output like this (again I’ve truncated the output):

$ sudo dtrace -q -s calls.d -c'rake environment'

  Gem::Dependency        requirement           /Users/aaron/.local/lib/ruby/1.9.1/rubygems/dependency.rb               98             4641
  Array                  hash                  /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1308             4697
  Bundler::DepProxy      method_missing        /Users/aaron/.local/lib/ruby/gems/1.9.1/gems/bundler-1.0.21/lib/bundler/rubygems_ext.rb              194             4809
  BasicObject            ==                    /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1334             5610
  Kernel                 ===                   /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1334             5610
  Module                 ===                   /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1334             5940
  Array                  hash                  /Users/aaron/.local/lib/ruby/1.9.1/rubygems/requirement.rb              135             6001
  Fixnum                 ^                     /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1308            11078
  String                 hash                  /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1308            31209
  Symbol                 to_s                  /Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb               86           208900
$

This shows us that introspection.rb line 86 is calling Symbol#to_s over 208k times. If we take a look at that method, we’ll see that on 1.9 it will call Symbol#to_s many times:

def local_constants #:nodoc:
  constants(false)
end

# Returns the names of the constants defined locally rather than the
# constants themselves. See <tt>local_constants</tt>;.
def local_constant_names
  local_constants.map { |c| c.to_s }
end

Knowing the number of times a method was called and who called it is pretty good. But we can’t tell if that’s why our program is slow. It could be that the method is very fast, so calling it many times doesn’t matter. Certainly we need to keep an eye on this method, but let’s try to find out where this program is spending the most time.

Sampling method calls

It would be best if we could time how long ruby spends in any particular method. But in order to do that, we would have to keep track of the Ruby stack. DTrace doesn’t provide us with a good way to access the Ruby stack. We could write our own program to manually keep track, but there are problems. For example, if something raises an exception, how far up the stack did it go? Which method caught the exception? What about catch / throw? It’s Friday and I don’t feel like figuring out all the edge cases for dealing with that.

Rather than keeping track of the stack, we can use a timer probe in DTrace. We will set up two probes. One probe will record function entries in a global variable, the other probe will run at fixed intervals and record the last function that was called. This approach assumes that the more time we spend in any particular function, the more likely we’ll see that function recorded. Every time a method is entered, this DTrace script will record the method call. Then we have a timer that runs at 5000hz taking a sample of the last method entered.

ruby*:::function-entry
{
  k = copyinstr(arg0);
  m = copyinstr(arg1);
}

tick-5000hz
/k != 0/
{
  @[k, m] = count();
}

Running this script against rake environment yields these results:

$ sudo dtrace -q -s calls.d -c'rake environment'

  Module            method_added               329
  Fixnum            ^                          330
  Mutex             unlock                     355
  String            hash                       477
  Array             -                          671
  Module            constants                  779
  Kernel            require                   2262
  IO                set_encoding              3713
  Symbol            to_s                      4095
  Kernel            `                         4122
$

The number on the right is the number of times the timer executed while ruby was executing the method on the left. Symbol#to_s is definitely in our top 10, but it seems we’re actually spending more time in Kernel#`. Let’s find all places where Kernel#` is called but using DTrace. We’ll use predicates to ensure our probe only fires when that method is called. Predicates are declared below the probe name and above the curly brace within slashes. Our probe body will simply aggregate based on the file where the call happened:

ruby*:::function-entry
/copyinstr(arg0) == "Kernel" && copyinstr(arg1) == "`"/
{
  @[copyinstr(arg2)] = count();
}

Execute this script:

$ sudo dtrace -q -s calls.d -c'rake environment'

  /Users/aaron/.local/lib/ruby/gems/1.9.1/gems/bundler-1.0.21/lib/bundler.rb                1
  /Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/coffee-rails-2120408a4063/coffee-rails.gemspec                3
  /Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/sass-rails-3c24e4fd8dd3/sass-rails.gemspec                3
  /Users/aaron/.local/lib/ruby/gems/1.9.1/gems/execjs-1.2.9/lib/execjs/external_runtime.rb                5
$

On the right side, we have the number of calls made to Kernel#`, and on the left we have the file where that call was made. We can see that running rake environment will create 12 subshells. Let’s see how much time we’re spending by creating new subshells:

ruby*:::function-entry
/copyinstr(arg0) == "Kernel" && copyinstr(arg1) == "`"/
{
  self->time = timestamp;
}

ruby*:::function-return
/copyinstr(arg0) == "Kernel" && copyinstr(arg1) == "`"/
{
  this->took = (timestamp - self->time) / 1000000;

  @[copyinstr(arg2)] = sum(this->took);
  @total = sum(this->took);
}

END
{
  printf("Time per file:\n");
  printa(@);
  printf("\nTOTAL: ");
  printa(@total);
}

Here we keep track of the total time spent in Kernel#` in the @total variable, and the aggregate per file in the @ variable. The report lists the amount of time in milliseconds that we’re spending. If we run this along with the time command, we can get a rough idea of what percentage we’re spending in Kernel#`:

$ time sudo dtrace -q -s calls.d -c'rake environment'
Time per file:

  /Users/aaron/.local/lib/ruby/gems/1.9.1/gems/bundler-1.0.21/lib/bundler.rb  15
  /Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/coffee-rails-2120408a4063/coffee-rails.gemspec  246
  /Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/sass-rails-3c24e4fd8dd3/sass-rails.gemspec  267
  /Users/aaron/.local/lib/ruby/gems/1.9.1/gems/execjs-1.2.9/lib/execjs/external_runtime.rb  398

TOTAL:
              926

real  0m7.942s
user  0m4.052s
sys   0m2.909s
$

With DTrace enabled, the total run time took about 8000ms and of that time 926ms was spent inside Kernel#`. So let’s say about 11% of our time running rake environment is spent in subshells.

Removing the Subshells

First we need to figure out where these subshells are being created. Since we’re studying DTrace, let’s write a script to aggregate call sites of Kernel#`:

ruby*:::function-entry
/copyinstr(arg0) == "Kernel" && copyinstr(arg1) == "`"/
{
  @[copyinstr(arg2), arg3] = count();
}

Running this, we can find the locations of our subshells:

/Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/coffee-rails-2120408a4063/coffee-rails.gemspec       19        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/coffee-rails-2120408a4063/coffee-rails.gemspec       20        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/coffee-rails-2120408a4063/coffee-rails.gemspec       21        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/sass-rails-3c24e4fd8dd3/sass-rails.gemspec       22        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/sass-rails-3c24e4fd8dd3/sass-rails.gemspec       23        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/bundler/gems/sass-rails-3c24e4fd8dd3/sass-rails.gemspec       24        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/gems/bundler-1.0.21/lib/bundler.rb      209        1
/Users/aaron/.local/lib/ruby/gems/1.9.1/gems/execjs-1.2.9/lib/execjs/external_runtime.rb      150        5

We’ll tackle fixing the gemspecs first because they are most boring to fix.

Fixing the gemspecs

Each gemspec shells out and runs git in order to populate various parts of the gemspec. Here are the relevant lines from one of the gemspecs:

s.files         = `git ls-files`.split("\n")
s.test_files    = `git ls-files -- {test,spec,features}/*`.split("\n")
s.executables   = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }

To fix this, we’re going to generate the gemspec in advance, then check in the generated gemspec. The downside of this solution is that whenever anybody adds or removes a file, the gemspec must be regenerated. The upside to this solution is that you see exactly what files are packaged in your gemspec. A more optimal solution would be to change bundler such that when a git repository is checked out, a gemspec for that repo is automatically built. If you’re looking for an opportunity to contribute to open source, this is it!

The gemspecs are now generated in advance. Next we should look at fixing subshells Bundler.

Fixing Bundler

In bundler.rb around line 209, we find that Bundler is trying to locate the sudo executable:

def requires_sudo?
  return @requires_sudo if defined?(@checked_for_sudo) && @checked_for_sudo

  path = bundle_path
  path = path.parent until path.exist?
  sudo_present = !(`which sudo` rescue '').empty?

  @checked_for_sudo = true
  @requires_sudo = settings.allow_sudo? && !File.writable?(path) && sudo_present
end

Let’s refactor this to use a which method that searches the PATH environment variable:

def which(cmd)
  if File.executable? cmd
    cmd
  else
    path = ENV['PATH'].split(File::PATH_SEPARATOR).find { |path|
      File.executable? File.join(path, cmd)
    }
    path && File.expand_path(cmd, path)
  end
end

def requires_sudo?
  # ...
  sudo_present = which "sudo"
  # ...
end

Our which implementation isn’t 100% perfect, (it can’t handle input like “./sudo” correctly) but it is good enough to find the sudo executable on a person’s system. Benchmarks reveal that the implementation that doesn’t shell out is over 10x faster than the version that does:

require 'rubygems'
require 'benchmark/ips'

def sand_which cmd
  `which #{cmd}`
end

def blair_which cmd
  if File.executable? cmd
    cmd
  else
    path = ENV['PATH'].split(File::PATH_SEPARATOR).find { |path|
      File.executable? File.join(path, cmd)
    }
    path && File.expand_path(cmd, path)
  end
end

Benchmark.ips do |x|
  x.report("sand") { sand_which 'sudo' }
  x.report("blair") { blair_which 'sudo' }
end

$ ruby lol.rb
                sand      251.0 (±4.8%) i/s -       1265 in   5.051489s (cycle=23)
               blair     3222.0 (±8.2%) i/s -      16218 in   5.076616s (cycle=306)
$

Now that we have bundler fixed, let’s take a look at ExecJS.

Fixing ExecJS

Our last place that shells out is inside ExecJS. The DTrace probes specifically list execjs/external_runtime.rb line 150. If we look there, we’ll find another implementation of which:

def which(command)
  Array(command).each do |name|
    name, args = name.split(/\s+/, 2)
    result = if ExecJS.windows?
      `#{shell_escape("#{ExecJS.root}/support/which.bat", name)}`
    else
      `#{shell_escape('command', '-v', name)} 2>/dev/null`
    end

    if path = result.strip.split("\n").first
      return args ? "#{path} #{args}" : path
    end
  end
  nil
end

When fixing performance problems, it’s helpful to understand the task the code is trying to accomplish, as well is how the method is called. Here is the stack trace for this method inside ExecJS:

lib/execjs/external_runtime.rb:134:in `locate_binary'
lib/execjs/external_runtime.rb:97:in `initialize'
lib/execjs/runtimes.rb:37:in `new'
lib/execjs/runtimes.rb:37:in `'
lib/execjs/runtimes.rb:9:in `'
lib/execjs/runtimes.rb:8:in `'
lib/execjs.rb:2:in `require'
lib/execjs.rb:2:in `'

There are a couple important things we can learn from this stack trace: 1) this method is called immediately when the file is required, and 2) it’s called from inside a constructor (on line 97). Let’s look at locate_binary, the initializer, and how the objects are constructed in runtimes.rb:

external_runtime.rb:

def initialize(options)
  @name        = options[:name]
  @command     = options[:command]
  @runner_path = options[:runner_path]
  @test_args   = options[:test_args]
  @test_match  = options[:test_match]
  @encoding    = options[:encoding]
  @binary      = locate_binary
end

protected
def locate_binary
  if binary = which(@command)
    if @test_args
      output = `#{shell_escape(binary, @test_args)} 2>&1`
      binary if output.match(@test_match)
    else
      binary
    end
  end
end

runtimes.rb:

module ExecJS
  module Runtimes
    # ...

    Node = ExternalRuntime.new(
      :name        => "Node.js (V8)",
      :command     => ["nodejs", "node"],
      :runner_path => ExecJS.root + "/support/node_runner.js",
      :encoding    => 'UTF-8'
    )

    # ...
  end
end

The initializer calls locate_binary which calls which which shells out. This is a great example of a hidden dependency in a constructor, our hidden dependency being the ability to shell out, and the availability of the command command. This isn’t to say that the command command will sometimes not be available, the problem is that people constructing this object will not know that it depends on these things. It’s “hidden” from the caller. The first thing we should do is stop the constructor from immediately shelling out. We’ll do this by initializing @binary to nil, adding a binary method, and changing all places that access the instance variable to use the method instead. Our diff looks like this:

diff --git a/lib/execjs/external_runtime.rb b/lib/execjs/external_runtime.rb
index a8180d1..65f46f9 100644
--- a/lib/execjs/external_runtime.rb
+++ b/lib/execjs/external_runtime.rb
@@ -94,7 +94,7 @@ module ExecJS
       @test_args   = options[:test_args]
       @test_match  = options[:test_match]
       @encoding    = options[:encoding]
-      @binary      = locate_binary
+      @binary      = nil
     end

     def exec(source)
@@ -113,16 +113,21 @@ module ExecJS

     def available?
       require "multi_json"
-      @binary ? true : false
+      binary ? true : false
     end

+    private
+      def binary
+        @binary ||= locate_binary
+      end
+
     protected
       def runner_source
         @runner_source ||= IO.read(@runner_path)
       end

       def exec_runtime(filename)
-        output = sh("#{shell_escape(*(@binary.split(' ') << filename))} 2>&1")
+        output = sh("#{shell_escape(*(binary.split(' ') << filename))} 2>&1")
         if $?.success?
           output
         else

Now the object will only shell out if the value is absolutely necessary. This technique is called “extracting a getter”, and you can read more about it in Michael Feathers’s book Working Effectively with Legacy Code (IMO this book is a mandatory read for any software engineer). Now we’re lazily shelling out, but if we look at the impact, we’ve gone from ExecJS creating 5 subshells down to 3 subshells:

$ sudo dtrace -q -s calls.d -c'rake environment'
/Users/aaron/.local/lib/ruby/gems/1.9.1/gems/execjs-1.2.9/lib/execjs/external_runtime.rb      155      3
$

The target is 0 subshells, so we should continue. This code is trying to locate executables on the file system, similar to what bundler was doing. We can change the which method in ExecJS to be a Ruby implementation similar to the implementation we put in bundler:

diff --git a/lib/execjs/external_runtime.rb b/lib/execjs/external_runtime.rb
index 65f46f9..88a08ce 100644
--- a/lib/execjs/external_runtime.rb
+++ b/lib/execjs/external_runtime.rb
@@ -121,6 +121,22 @@ module ExecJS
         @binary ||= locate_binary
       end

+      def which_windows(name)
+        result = `#{shell_escape("#{ExecJS.root}/support/which.bat", name)}`
+        result.strip.split("\n").first
+      end
+
+      def which_unix(name)
+        if File.executable? cmd
+          cmd
+        else
+          path = ENV['PATH'].split(File::PATH_SEPARATOR).find { |path|
+            File.executable? File.join(path, cmd)
+          }
+          path && File.expand_path(cmd, path)
+        end
+      end
+
     protected
       def runner_source
         @runner_source ||= IO.read(@runner_path)
@@ -147,19 +163,14 @@ module ExecJS
       end

       def which(command)
-        Array(command).each do |name|
+        Array(command).find do |name|
           name, args = name.split(/\s+/, 2)
-          result = if ExecJS.windows?
-            `#{shell_escape("#{ExecJS.root}/support/which.bat", name)}`
-          else
-            `#{shell_escape('command', '-v', name)} 2>/dev/null`
-          end
+          path = ExecJS.windows? ? which_windows(name) : which_unix(name)

-          if path = result.strip.split("\n").first
-            return args ? "#{path} #{args}" : path
-          end
+          next unless path
+
+          args ? "#{path} #{args}" : path
         end
-        nil
       end

       if "".respond_to?(:force_encoding)

It isn’t clear what the utility of the Windows batch file is, so it’s been extracted to it’s own method in order to maintain the current functionality. After these patches (which I have submitted here), the number of subshells is now down to zero.

Total Impact

Before:

$ time rake environment

real    0m2.377s
user    0m2.028s
sys 0m0.320s

After:

$ time rake environment 

real    0m2.224s
user    0m1.946s
sys 0m0.260s

After eliminating subshells, we’ve saved 153 milliseconds for approximately 6% savings. Not as much as the 11% we were hoping for, but still greater than 0. If we re-run our sampling profiler, we’ll find no more references to Kernel#`:

$ sudo dtrace -q -s calls.d -c'rake environment'

  Module           ===                306
  Mutex            unlock             324
  Fixnum           ^                  353
  Module           method_added       368
  Array            -                  509
  String           hash               603
  Module           constants          859
  Kernel           require           2292
  IO               set_encoding      3371
  Symbol           to_s              4561

Conclusion

Removing subshells from the Rails startup process wasn’t a silver bullet. From my experience in doing performance work, there is rarely a silver bullet that will fix performance problems. That said, even 2.2 seconds is still too slow for Rails startup time. If we collate the sampling profiler by file and method call, I think we can get a more clear picture about our runtime cost:

$ sudo dtrace -q -s calls.d -c'rake environment'

  Array            hash                  /Users/aaron/.local/lib/ruby/1.9.1/rubygems/requirement.rb              255
  Mutex            unlock                                                              365
  Fixnum           ^                     /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb              493
  Array            -                     /Users/aaron/git/rails/activesupport/lib/active_support/dependencies.rb              507
  IO               set_encoding          /Users/aaron/.local/lib/ruby/1.9.1/rubygems/custom_require.rb              636
  Module           constants             /Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb              841
  IO               set_encoding          /Users/aaron/git/rails/activesupport/lib/active_support/dependencies.rb             1098
  String           hash                  /Users/aaron/.local/lib/ruby/1.9.1/rubygems/specification.rb             1188
  Kernel           require               /Users/aaron/git/rails/activesupport/lib/active_support/dependencies.rb             1253
  Symbol           to_s                  /Users/aaron/git/rails/activesupport/lib/active_support/core_ext/module/introspection.rb             8981

Of the top ten slow methods, five of them are inside Rails. These particular methods are all associated with the way that Rails handles missing constants and code reloading. The total amount of time spent dealing with code reloading is likely far greater than the subshelling issue we fixed. I knew this before writing this article, but tackling the reloading code in Rails is much longer than a single blog post. The goal of this blog post was to show that:

Startup performance can be improved
Even small changes can make an impact
DTrace is pretty awesome

I’d like to write an article with a more comprehensive analysis of where Rails spends it’s time on boot (GC, file parsing, method calls), but I don’t have all the information I need yet. My next steps are to:

Try and understand Rails code reloading better
Add more GC related probes to Ruby
Add source compile probes to ruby

Once I’ve made more progress, I’ll write another article.

<3<3<3<3

Category: life
14 Comments – Feed

PSA: The number of gems installed on your system can impact rails boot time

Posted by Aaron Patterson – November 30, 2011

Just a PSA: the number of gems you have installed on your system will impact your rails boot time. Why is this?

When you require a file, rubygems must find the file. If the file isn’t on the load path, it must find it in your installed gems. To do this, it loads all gemspecs and searches for the file. This means all gemspecs on your system are evaluated, so the more gems you have, the longer it takes. You can read more in the rubygems comments.

The second problem is when bundler comes in to the picture. When rubygems loads the gemspec files, it turns them in to Gem::Specification objects and caches them. Sometimes bundler must modify your GEM_HOME and GEM_PATH environment variables. These environment variables tell rubygems where to locate gems on your file system. Unfortunately, if these variables are modified after the specification cache is built, that means the cache must be cleared. Until recently, bundler always cleared the rubygems cache. Since rubygems can load gemspecs before bundler gets activated (or may even while bundler is being activated!), this means it’s not unusual for all gemspecs on your system to be evaluated twice. Earlier today, I submitted a patch that will only clear the cache if the environment variables are modified. Hopefully this will mean fewer gemspec evals in the next version of bundler.

To help reduce boot time, just remember:

Use gemsets (if you’re on RVM)
Clean unused gems often

To clean up your gems, just use the command gem cleanup.

Hope that helps!

Category: life
10 Comments – Feed

Connection Management in ActiveRecord

Posted by Aaron Patterson – October 20, 2011

OMG! Happy Thursday! I am trying to be totally enthusiastic, but the truth is that I have a cold, so there will be fewer uppercase letters and exclamation points than usual.

Anyway, I want to talk about database connection management in ActiveRecord. I am not too pleased with its current state of affairs. I would like to describe how ActiveRecord connection management works today, how I think it should work, and steps towards fixing the current system.

TL;DR: database connection API in ActiveRecord should be more similar to File API

Thinking in terms of files

It’s convenient to think of our database connection as a file. Dealing with files is very common. When we work with files, the basic sequence goes something like this:

Open the file
Do some work on the file handle
Close the file

We’re very used to doing these steps when dealing with files. Typically our code will look something like this:

File.open('somefile.txt', 'wb') do |fh| # Open the file
  fh.write "hello world"                # Do some work with the file
end                                     # Close file when block returns

We don’t want to share open files among threads because dealing with synchronization around reading and writing to the file is too difficult (and time consuming). So maybe we’ll store the handle in a thread local or something until we’re ready to close it.

Our basic requirements for dealing with a database connection are essentially the same as when dealing with files. We need to open our database connection, do some work with the connection (send and receive queries), and close the connection. We have these similarities, yet the API for dealing with database connections in ActiveRecord is vastly different. Let’s look at how each of these steps are performed in ActiveRecord today.

Opening a connection

Opening a connection to the database is very easy. First we configure ActiveRecord with the database specification, then we call connection to actually get back a database handle:

ActiveRecord::Base.establish_connection(
  :adapter  => "sqlite",
  :database => "path/to/dbfile")

connection_handle = ActiveRecord::Base.connection

The main difference between this API and the File API is that we’ve separated the connection specification from actually opening the connection. In the case of opening a file, we call open along with a “specification” which includes the file name and how we want to open it. In this case, we’ve separated the two; essentially storing the specification in a global place, then opening the connection later.

This leads to two questions:

Where is the specification stored?
When I call connection, what specification is used?

The answer to the first question can be found by reading the establish_connection method. Specifically if we look at line 63 we’ll find a clue. Since this method is a class method, the call to name returns the class name of the recipient. This name (along with our actual spec) is passed in to the connection handler object. If we jump through a few more layers of indirection, we’ll find that what we have is essentially a one to one mapping of class name to connection specification.

Armed with this information, we can tackle the second question. If we look at the implementation of connection, it calls retrieve_connection on itself, which calls retrieve_connection on the connection handler with itself. A few more method calls later, and we see that each ActiveRecord subclass walks up the inheritance tree looking for a connection:

def retrieve_connection_pool(klass)
  pool = @connection_pools[klass.name]
  return pool if pool
  return nil if ActiveRecord::Base == klass
  retrieve_connection_pool klass.superclass
end

If we read this code carefully, we’ll notice that not only are connection specifications mapped to classes so are database connections!

Why is this bad?

This behavior smells bad to me. The reason is because we’re tightly coupling classes along with database connections when really this relationship doesn’t need to exist.

How can it be improved?

If this tight coupling is removed, the complexity of ActiveRecord can be reduced and at the same time increasing the features available! The way we can reduce this coupling is by passing the connection specification to the method that actually opens the connection. Specifications can be stored on each class as a convenience, but nothing more.

What if opening a connection looked more like this?

spec = ActiveRecord::Base.specificiation
ActiveRecord::ConnectionPool.open(spec) do |conn|
  ...
end

We could maintain the current behavior by storing specifications on each class, but eliminate the coupling between connection and class. We would be able to delete all of the code that looks up connections by class hierarchy, and open the doors to having features like this:

spec = database_a
ActiveRecord::Connect