Tuesday, November 9, 2010

Using Spawn to circumvent Ruby OCI driver connection issues

  In our system we use ActiveMQ with a set of pollers to run background jobs such as importing large data sets and publishing information over the wire. For the most part this works great: you generate a message, publish it to the queue, and forget about it. Normally everything rocks, except when the runners (pollers) have had no work for four hours or more. In this lovely case, Oracle and Ruby don't like each other any more. Ruby asks Oracle for connection information, and somewhere in the stack there is a fifteen-minute wait before both sides agree that the current connection to the database is no longer active.

  This sucks. Hard. We tried many things to get Ruby to release the connection without checking with Oracle. Most recently:
 
dbconfig = ActiveRecord::Base.remove_connection
ActiveRecord::Base.establish_connection(dbconfig)

and before that:

ActiveRecord::Base.establish_connection

and before that:

ActiveRecord::Base.verify_active_connections!

It was insane. Everything we tried got hung up in some magical part of the stack that didn't like the fact that we had long-running threads with no activity.

I finally found a solution that worked for us. Amusingly, I found it while working on a section of the code for my own gratification.

Spawn. That's right, Spawn. Spawn is a simple, clean plugin that takes the pain out of forking and threading in Ruby. Besides having a healthy number of fixes for threading issues in Ruby, it provides a straightforward way of creating child processes and waiting for them to complete. Of course there are a handful of options that you can specify, but the base case syntax is simple:

spawn do
  call_to_long_running_process_or_job
end


 That is it. It just works. If you want to wait for the process to complete before moving on:


fork_process = spawn do
  call_to_long_running_process_or_job
end

wait(fork_process)
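
If you don't have the plugin handy, the same fork-and-wait idea can be sketched in plain Ruby with Process.fork and Process.wait2. This is only an illustration of the pattern, not Spawn's actual implementation: run_in_child and wait_for_child are made-up names, and fork is only available on Unix-like platforms.

```ruby
# Run the given block in a forked child process and return the child's PID.
# Hypothetical helper for illustration; it is not part of Spawn.
def run_in_child
  Process.fork do
    begin
      yield
      exit!(0)   # leave the child right away, skipping inherited at_exit hooks
    rescue StandardError
      exit!(1)   # a non-zero status tells the parent the block failed
    end
  end
end

# Block until the child with the given PID exits and return its Process::Status.
def wait_for_child(pid)
  _, status = Process.wait2(pid)
  status
end

counter = 0
pid = run_in_child { counter += 1 }   # the increment happens in the child's copy only
status = wait_for_child(pid)

puts "child #{pid} succeeded: #{status.success?}, parent counter: #{counter}"
```

The counter is still 0 in the parent afterwards, because the child works on its own copy of the process memory. Spawn hides this bookkeeping behind the two calls shown above.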

Makes life easy. It also provides a workaround for the Oracle time-out issue.

Old code:

def on_message(message)
  logger.info("#{Time.now} Received request: #{message}")
  # Sometimes our connection goes away when a poller has been waiting a long
  # time for a job. The next line is where the 15-minute hang happens.
  ActiveRecord::Base.establish_connection
  logger.info("#{Time.now} Finished reconnecting to the database.")
  do_something(message)
end


New code:

def on_message(message)
  fork_process = spawn do
    do_something(message)
  end
  logger.info("Forked for processing. Parent PID (#{Process.pid}) is waiting for PID -- #{fork_process.handle}")
  wait(fork_process)
  logger.info("Completed message for Parent PID (#{Process.pid})")
end

No more waiting for Oracle to release the connection or provide a new one. The poller takes the message off the queue, forks itself, and runs the job immediately. This makes the users happy, and it gave me enough ammo for a follow-up post about threading with limits and lambdas.
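
One thing the handler above does not show is what the parent learns when the job blows up inside the child. Here is a rough sketch of the same take-a-message, fork, wait flow in plain Ruby, with the child's exit status reported back. It uses Process.fork and Process.wait2 rather than Spawn, and everything in it is an assumption for illustration: log stands in for the real logger, do_something is a dummy job that rejects empty messages, and the message strings are invented.

```ruby
# Stand-in for the poller's logger (assumption: the real one is a Rails logger).
def log(text)
  $stderr.puts("#{Time.now} #{text}")
end

# Hypothetical job body; it rejects empty messages so a failure can be shown.
def do_something(message)
  raise ArgumentError, "empty message" if message.empty?
end

# Fork one child per message, wait for it, and return true if the job succeeded.
def on_message(message)
  pid = Process.fork do
    begin
      do_something(message)
      exit!(0)
    rescue StandardError => e
      log("Job failed in PID #{Process.pid}: #{e.message}")
      exit!(1)
    end
  end
  log("Parent PID #{Process.pid} is waiting for PID #{pid}")
  _, status = Process.wait2(pid)
  log("PID #{pid} exited with status #{status.exitstatus}")
  status.success?
end

# Two made-up messages: the first job succeeds, the second fails in the child.
results = ["import batch 1", ""].map { |message| on_message(message) }
```

Either way the parent poller survives and moves on to the next message. Only the child sees the failure, and the exit status is how the parent finds out about it.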

Hope this helps someone out. If not, there are a few blog links in the Spawn README that also helped me out:

Scott Persinger's blog post on how to use fork in Rails for background processing:
http://geekblog.vodpod.com/?p=26

Jonathan Rochkind's blog post on threading in Rails:
http://bibwild.wordpress.com/2007/08/28/threading-in-rails/
