Large Ruby File Downloads Done Right!

I have recently been writing a utility in Ruby to move some very large files from Rackspace's Cloud to Amazon S3. The utility first downloads a file from Rackspace and then uploads it to S3. This all seems very straightforward, but there are considerations to make when you download large files with any scripted utility.

Some of the files being downloaded are 1 GB or more in size, and I am using an Amazon EC2 Micro instance that has only 613 MB of RAM. Because the available RAM is usually smaller than the file being downloaded, the last thing I want to do is read the whole HTTP response into memory before writing it out to disk. This would cause all kinds of fail: the server would run out of memory and kill the download process before it completes. What is needed is to stream the download directly to disk in small segments, leaving the RAM well alone.

Below is code I have extracted from my utility and simplified.

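            require 'net/http'
            require 'erb'

            Net::HTTP.start("someurl_without_the_protocol.com") do |http|
              file = nil
              begin
                # Open in binary mode so the bytes are written unmodified
                file = File.open("/path/to/file.mov", 'wb')
                # URI.encode was removed in Ruby 3.0; ERB::Util.url_encode percent-encodes the name
                http.request_get('/' + ERB::Util.url_encode("file.mov")) do |response|
                  # With a block, read_body yields the body in segments instead of buffering it all
                  response.read_body do |segment|
                    file.write(segment)
                  end
                end
              ensure
                # file is still nil if File.open raised, so guard the close
                file.close if file
              end
            end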
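
A variation on the same idea is to use the block form of File.open, which closes the file for you even when an exception is raised, so the begin/ensure goes away. The sketch below is only an illustration and not the code from my utility: the `download_to_file` name, its parameters and the byte count it returns are all hypothetical. It also checks the response status before touching the disk, on the assumption that you would rather get an exception than save an error page as your download.

```ruby
require 'net/http'

# Hypothetical helper (not part of the original utility): streams the body of
# GET http://host:port/path into dest_path and returns the bytes written.
def download_to_file(host, path, dest_path, port: 80)
  bytes = 0
  Net::HTTP.start(host, port) do |http|
    http.request_get(path) do |response|
      # Raises an HTTP error unless the status is 2xx, before any file is created.
      response.value
      # Block form of File.open: the file is closed when the block exits.
      File.open(dest_path, 'wb') do |file|
        # Each segment is written as it arrives, so memory use stays small.
        response.read_body do |segment|
          bytes += file.write(segment)
        end
      end
    end
  end
  bytes
end
```

Calling it looks like `download_to_file("someurl_without_the_protocol.com", "/file.mov", "/path/to/file.mov")`. The same shape should work with Tempfile if you only need the file until the upload has finished.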
            
raggi
Wednesday, 28th September, 2011
why don't you use the block form of Kernel#open?
Stephen Touset
Wednesday, 28th September, 2011
Or Tempfile#open, for that matter.
