About
My name is Peter Williams. I am a father and software developer. (I do a few other things, of course, but family and software are what I am most passionate about.) This blog is primarily a place where I can rant without scaring too many people, but occasionally I record things of interest to my family here.
Peter Williams
Parallel Each
Parallel each allows you to iterate through any Enumerable,
handling the items in parallel.
results = []
[1,2,3,4].p_each do |i|
  results << i
end
results # => e.g. [3,2,4,1] (the order varies from run to run)
The parallel processing happens in threads. The given block is executed for each item in the array in a separate thread. Obviously, this could get dodgy for large Enumerables. To handle that, #p_each will never have more than a set number of threads running simultaneously. The default limit is 20, but you can change that by passing a number to #p_each.
results = []
[1,2,3,4].p_each(2) do |i|
  sleep 1
  results << i
end
results # => e.g. [2,1,3,4] (the order varies from run to run)
Parallel each iterates through the items in order, never getting ahead of what can be processed immediately. This means that you can use #p_each on very large lazy-loaded lists (such as the results of ActiveRecord#find) without instantiating every item in the list at once.
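The two behaviours described above, a cap on the number of running threads and in-order consumption that does not run ahead of the workers, can be illustrated with a minimal sketch. This is not the gem's actual implementation: `bounded_each` is a hypothetical helper written only for illustration, and using a `SizedQueue` as a counting semaphore is an assumption about one reasonable way to do it.

```ruby
# Hypothetical helper for illustration only (not the gem's code).
# Runs the block for each item in its own thread, with at most `limit`
# threads alive at any one time.
def bounded_each(enum, limit = 20, &block)
  slots = SizedQueue.new(limit)    # holds one token per running thread
  threads = []

  enum.each do |item|
    slots.push(true)               # blocks while `limit` threads are running
    threads << Thread.new(item) do |i|
      begin
        block.call(i)
      ensure
        slots.pop                  # release the slot for the next item
      end
    end
  end

  threads.each(&:join)             # wait for the threads still running
  enum
end

# Usage: collect results in a thread-safe Queue rather than an Array.
results = Queue.new
bounded_each([1, 2, 3, 4], 2) do |i|
  sleep 0.1
  results << i
end
puts results.size
```

In this sketch the source is pulled at most one item ahead of a free thread slot, which is why a lazily loaded list would never be fully instantiated at once.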
Installation
Install the gem with this command:
sudo gem install parallel-each
Performance
The performance characteristics of parallel iteration vary greatly depending on the platform and the work being performed.
I/O bound work is where #p_each will really shine. In this situation the overall processing time will, generally, be improved regardless of the version/flavor of Ruby that is being used or the number of CPUs.
If the work is CPU bound, the impact of parallelization will depend on the version/flavor of Ruby and the number of CPUs. With more than one CPU, running on JRuby should result in an improvement of the overall processing time. Ruby 1.9 (MRI) uses native threads, but its global interpreter lock keeps CPU bound Ruby code from running in parallel, so expect little or no improvement there. With one CPU, or when running Ruby 1.8, there will be no improvement (and possibly a slight degradation) of the overall processing time.
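The I/O bound case can be checked with a rough timing sketch. It uses plain threads rather than #p_each so that it runs without the gem, and `fake_io` is a hypothetical stand-in (a `sleep`) for real I/O such as a network call. Actual numbers will depend on the platform.

```ruby
require 'benchmark'

# Hypothetical stand-in for I/O bound work (e.g. a network request).
def fake_io(_item)
  sleep 0.05
end

# Time processing the items one after another.
def sequential_time(items)
  Benchmark.realtime { items.each { |i| fake_io(i) } }
end

# Time processing the items with one thread per item.
def threaded_time(items)
  Benchmark.realtime do
    items.map { |i| Thread.new { fake_io(i) } }.each(&:join)
  end
end

items = [1, 2, 3, 4]
puts format('sequential: %.2fs', sequential_time(items))
puts format('threaded:   %.2fs', threaded_time(items))
```

Because a sleeping (or I/O-waiting) thread lets other threads run, the threaded version should finish in roughly the time of the slowest single item rather than the sum of all of them.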
Links
- API docs: parallel-each.rubyforge.com
- Canonical repo: git://github.com/pezra/parallel-each.git
- Github page: github.com/pezra/parallel-each