intelligent internet agents


Page Caching
August 11, 2009, 7:33 am
Filed under: crawl, engine, software

Note: This does not work. Marshal.dump writes out only the pointer to the page object, not the object itself, so the page cannot be restored from the dump file.


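require "rubygems"
require "celerity"

# Start a browser that emulates Firefox, with logging switched off.
browser = Celerity::Browser.new :browser => :firefox, :log_level => :off
browser.clear_cookies

browser.goto("http://www.google.com/")

# Wait for the click to finish loading before continuing.
browser.resynchronized do
  browser.link(:text, "News").click
end

dump_file = File.join(Dir.pwd, "monster.dmp")

# Try to cache the current page on disk (binary mode, since Marshal data is binary).
File.open(dump_file, 'wb') do |f|
  f.write(Marshal.dump(browser.page))
end

# Try to restore the cached page into the browser.
File.open(dump_file, 'rb') do |f|
  browser.page = Marshal.load(f)
end

puts browser.url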

See also http://gist.github.com/165436
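
Since the page object itself does not survive marshaling, one possible workaround is to cache a plain-Ruby snapshot of the page instead. The sketch below is only an illustration under that assumption: PageSnapshot, save_snapshot and load_snapshot are hypothetical names, not part of Celerity, and the URL and HTML are stand-in strings (with a live browser they could presumably be taken from browser.url and browser.html). It caches the content only and does not put a live page back into the browser.

```ruby
require "tmpdir"

# Hypothetical snapshot type (not part of Celerity): holds only plain
# strings, which Marshal can serialize in full.
PageSnapshot = Struct.new(:url, :html)

# Write a snapshot to disk in binary mode, since Marshal output is binary.
def save_snapshot(path, snapshot)
  File.open(path, "wb") { |f| f.write(Marshal.dump(snapshot)) }
end

# Read a snapshot back from disk.
def load_snapshot(path)
  File.open(path, "rb") { |f| Marshal.load(f) }
end

# Stand-in values for illustration; with a live browser these could be
# taken from browser.url and browser.html (assumption).
snapshot = PageSnapshot.new("http://www.google.com/",
                            "<html><body>News</body></html>")

path = File.join(Dir.tmpdir, "monster_snapshot.dmp")
save_snapshot(path, snapshot)
restored = load_snapshot(path)

puts restored.url
puts restored == snapshot
```

The round trip works here because the struct holds only strings, so the dump contains the actual data and not a reference to an object living elsewhere.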
