Saturday, August 06, 2011

S3 file bucket downloader in Ruby

Today I wanted to download files from a website that, as I happened to find out, stores all of its files in S3. When I accessed the website root, I realized that the page was just the response to an S3 ListBucket API call. For instance:

<ListBucketResult xmlns="">
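To illustrate, here is a minimal sketch of how the keys in such a listing can be read with REXML. The sample XML, the bucket name and both keys are made up for this example (the namespace attribute is left out), and `list_keys` is a hypothetical helper name, not something the website provides.

```ruby
require 'rexml/document'

# Hypothetical sample of a bucket listing; the name and keys are invented.
SAMPLE_LISTING = <<~XML
  <ListBucketResult>
    <Name>example-bucket</Name>
    <Contents><Key>photos/a.jpg</Key><Size>1024</Size></Contents>
    <Contents><Key>photos/b.jpg</Key><Size>2048</Size></Contents>
  </ListBucketResult>
XML

# Return the text of every Contents/Key element found in the listing.
def list_keys(xml)
  doc = REXML::Document.new(xml)
  keys = []
  doc.elements.each('ListBucketResult/Contents/Key') { |ele| keys << ele.text }
  keys
end

puts list_keys(SAMPLE_LISTING)
```

Each key is the path of one stored object, so requesting "/" plus the key from the same host should return that file.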

To download all the files more quickly, I wrote the following Ruby program, which fetches every file listed by the website. I hope it is useful to others:
require 'net/http'
require 'rexml/document'

baseurl = ''

# get the XML data as a string
xml_data = Net::HTTP.get_response(URI.parse("http://" + baseurl)).body

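# parse the listing and download every object key it contains
doc = REXML::Document.new(xml_data)
# make sure the output directory exists before writing to it
Dir.mkdir("images") unless Dir.exist?("images")
Net::HTTP.start(baseurl) do |http|
  doc.elements.each('ListBucketResult/Contents/Key') do |ele|
    puts "Downloading " + ele.text
    resp = http.get("/" + ele.text)
    # flatten the key into one file name (slashes become underscores)
    File.open("images/" + ele.text.gsub("/", "_") + ".jpg", "wb") { |file| file.write(resp.body) }
  end
end
puts "Done"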
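The way the program turns an S3 key into a local file name can be looked at in isolation. The sketch below mirrors that rule; `local_name` is a hypothetical helper introduced only for this illustration, and the keys passed to it are made up.

```ruby
# Hypothetical helper mirroring the naming rule used by the program:
# slashes in the key become underscores and ".jpg" is always appended.
def local_name(key, dir = "images")
  File.join(dir, key.gsub("/", "_") + ".jpg")
end

puts local_name("photos/2011/a")      # images/photos_2011_a.jpg
puts local_name("photos/2011/a.jpg")  # images/photos_2011_a.jpg.jpg
```

Note that a key that already ends in ".jpg" gets a doubled extension, so the suffix may need adjusting for buckets whose keys carry their own extensions.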