Saturday, August 06, 2011

Flickr interestingness downloader in Ruby

This time, here is some Ruby code that uses the Flickraw gem to download the large-size versions of Flickr's "interesting" photos.

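require 'flickraw'
require 'net/http'
require 'fileutils'

FlickRaw.api_key = "api_key"
FlickRaw.shared_secret = "shared_secret"

# Make sure the output directory exists before writing into it
FileUtils.mkdir_p("flickr")

# The interestingness list is public, so the unused frob/auth step is dropped
photos = flickr.interestingness.getList(:per_page => 500)

photos.each do |pic|
  # Look up the photo details and build the URL of the large-size image
  photo_info = flickr.photos.getInfo(:photo_id => pic.id)
  photo_url = FlickRaw.url_b(photo_info)

  puts "Downloading #{photo_url}"

  # Save the image body in binary mode, named after the photo id
  File.open("flickr/" + pic.id + ".jpg", "wb") do |file|
    file.write(Net::HTTP.get_response(URI.parse(photo_url)).body)
  end
end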
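
For readers curious about what a helper like FlickRaw.url_b produces, here is a minimal sketch that builds a large-size image URL by hand. It assumes Flickr's documented static photo URL pattern (server id, photo id, secret, and the "b" size suffix); the function name large_photo_url is hypothetical and not part of FlickRaw, and the gem's own implementation may use a different host.

```ruby
# Hypothetical helper (not part of FlickRaw): builds a large-size photo URL.
# Assumes the pattern https://live.staticflickr.com/{server}/{id}_{secret}_b.jpg
def large_photo_url(server, id, secret)
  "https://live.staticflickr.com/#{server}/#{id}_#{secret}_b.jpg"
end

# Example with made-up values for server, id and secret
puts large_photo_url("65535", "1234567890", "abcdef1234")
```

The server, id and secret fields are part of each photo record, which is why the program above fetches the photo details before asking for the URL.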

S3 file bucket downloader in Ruby

Today I wanted to download files from a website that, as I found out, stores all of its files in S3. When I accessed the website root, I realized that the response was just the output of an S3 ListBucket API call. For instance:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>foo.com</Name>
  <Prefix/>
  <Marker/>
  <MaxKeys>1000</MaxKeys>
  <IsTruncated>true</IsTruncated>
  <Contents>
    <Key>file/1</Key>
    <LastModified>2011-06-09T06:29:02.000Z</LastModified>
    <ETag>"5cb3930839817ff4a5c1ddf08e3fea1e"</ETag>
    <Size>1440231</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
  <Contents>
    <Key>file/2</Key>
    <LastModified>2011-06-09T06:29:18.000Z</LastModified>
    <ETag>"96fdc94d14b6d9817f80ac1e9e2049b4"</ETag>
    <Size>1310</Size>
    <StorageClass>STANDARD</StorageClass>
  </Contents>
</ListBucketResult>

To download everything more quickly than by hand, I wrote the following Ruby program, which fetches every file listed by this website. I hope it can be useful for others:

require 'net/http'
require 'rexml/document'
require 'fileutils'

baseurl = 'foo.com'

# Fetch the bucket listing (the ListBucketResult XML) as a string
xml_data = Net::HTTP.get_response(URI.parse("http://" + baseurl)).body

# Make sure the output directory exists before writing into it
FileUtils.mkdir_p("images")

# Extract every object key and download it over a single HTTP connection
doc = REXML::Document.new(xml_data)
Net::HTTP.start(baseurl) do |http|
  doc.elements.each('ListBucketResult/Contents/Key') do |ele|
    puts "Downloading " + ele.text
    resp = http.get("/" + ele.text)
    # Flatten the key into a file name by replacing "/" with "_"
    File.open("images/" + ele.text.gsub("/", "_") + ".jpg", "wb") do |file|
      file.write(resp.body)
    end
  end
end
puts "Done"
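
One caveat: the sample listing above has IsTruncated set to true, which means a single request only returns the first MaxKeys (1000) keys. The sketch below shows one way to page through the rest, assuming the bucket answers the classic ListBucket "marker" query parameter and that the last key of a truncated page can be used as the next marker. The helper names parse_listing and list_all_keys are my own and purely illustrative.

```ruby
require 'net/http'
require 'rexml/document'

# Hypothetical helper: parses one ListBucketResult page.
# Returns [keys, next_marker]; next_marker is nil when the listing is complete.
def parse_listing(xml_data)
  root = REXML::Document.new(xml_data).root
  keys = []
  truncated = false
  root.each_element do |child|
    case child.name
    when 'Contents'
      child.each_element { |e| keys << e.text if e.name == 'Key' }
    when 'IsTruncated'
      truncated = (child.text == 'true')
    end
  end
  # Assumption: the last key of a truncated page is the next marker
  [keys, truncated ? keys.last : nil]
end

# Hypothetical helper: follows the marker until the listing is no longer truncated.
def list_all_keys(baseurl)
  all_keys = []
  marker = nil
  Net::HTTP.start(baseurl) do |http|
    loop do
      path = marker ? "/?" + URI.encode_www_form(marker: marker) : "/"
      keys, marker = parse_listing(http.get(path).body)
      all_keys.concat(keys)
      break if marker.nil?
    end
  end
  all_keys
end
```

With this in place, the download loop could iterate over list_all_keys(baseurl) instead of the keys from a single response.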