Faisal's Interactions


Beanstalk – Part 1

Posted in Uncategorized by f10i on October 29, 2009

As I mentioned in my previous post, I will write some articles about the Beanstalk queuing server to share what I find out about it. In this post I will cover installing and running Beanstalk.

One point worth mentioning is that Beanstalk is an in-memory queuing system. This means that everything is stored in memory, so if the power goes out, the machine hangs, or Beanstalk terminates for any reason, all jobs in the queue will be lost. This behavior can be changed with a startup option (explained below).
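
To make the trade-off concrete, here is a toy sketch in Python of an in-memory queue that can optionally append every operation to a log file and replay it on startup. This is not how beanstalkd implements its binary log; the ToyQueue class, the JSON-lines log format and the file name toy.log are all made up for illustration.

```python
import json
import os
import tempfile
from collections import deque


class ToyQueue:
    """In-memory FIFO queue with an optional append-only log (illustrative only)."""

    def __init__(self, log_path=None):
        self.jobs = deque()
        self.log_path = log_path
        # Replay any operations recorded by a previous run.
        if log_path and os.path.exists(log_path):
            with open(log_path) as log:
                for line in log:
                    record = json.loads(line)
                    if record["op"] == "put":
                        self.jobs.append(record["body"])
                    elif record["op"] == "take":
                        self.jobs.popleft()

    def _record(self, op, body=None):
        # Without a log path the queue lives in memory only.
        if self.log_path:
            with open(self.log_path, "a") as log:
                log.write(json.dumps({"op": op, "body": body}) + "\n")

    def put(self, body):
        self.jobs.append(body)
        self._record("put", body)

    def take(self):
        body = self.jobs.popleft()
        self._record("take")
        return body


def demo():
    """Simulate a crash by discarding the queue object and building a new one."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "toy.log")

        volatile = ToyQueue()
        volatile.put("send-email")
        volatile = ToyQueue()            # "restart" without a log
        lost = len(volatile.jobs)

        durable = ToyQueue(log_path)
        durable.put("send-email")
        durable.put("resize-image")
        durable.take()
        durable = ToyQueue(log_path)     # "restart" with the same log
        restored = list(durable.jobs)

    return lost, restored
```

Calling demo() shows the difference: the queue without a log comes back empty after the simulated restart, while the logged queue comes back holding the one job that had not been taken yet.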

Beanstalk requires libevent to be installed in order to work, so let's start by installing libevent. In a terminal on your Linux or Mac machine, run the following commands (depending on your setup, "make install" may need root privileges):


wget http://monkey.org/~provos/libevent-1.4.12-stable.tar.gz
tar -xzf libevent-1.4.12-stable.tar.gz
cd libevent-1.4.12-stable
./configure
make
make install

Now that you have libevent installed, we need to install beanstalk:

wget http://xph.us/dist/beanstalkd/beanstalkd-1.4.2.tar.gz
tar -xzf beanstalkd-1.4.2.tar.gz
cd beanstalkd-1.4.2
./configure
make
make install

That wasn’t too hard :)

To run beanstalk, all you have to do now is run the command “beanstalkd”. However, there are some interesting options you can pass to the command:

  • -d to detach the process, that is, to run it as a daemon. The process will be removed from the foreground and you will get your terminal back, but Beanstalk will still be running in the background.
  • -b DIR as mentioned above, Beanstalk is a memory-based queue. To let your jobs survive power outages or server crashes, the -b option stores every job Beanstalk receives in a binary log in the directory DIR. If Beanstalk terminates for any reason, you can start it again with the same -b option and your jobs will be restored.
  • -s BYTES Limit binary log file size to BYTES maximum. Default 10485760
  • -l specify the address to listen to. Default 0.0.0.0
  • -p specify the port to listen to. Default 11300
  • -u which user to run as
  • -z BYTES limit job size to BYTES maximum. Default 65535

I am running the server with the following command: beanstalkd -d -l 127.0.0.1 -p 11300 -b PATH_TO_BIN_DIR
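
If you start the server from a script, the options above can be assembled programmatically. The following is a minimal Python sketch, not part of Beanstalk itself: the function name beanstalkd_command, its parameters and the directory /var/lib/beanstalkd are assumptions for illustration, and only flags from the list above are used.

```python
import shlex


def beanstalkd_command(binlog_dir=None, listen_addr="127.0.0.1", port=11300,
                       detach=True, user=None, max_job_size=None):
    """Return the beanstalkd command line as an argument list."""
    args = ["beanstalkd"]
    if detach:
        args.append("-d")                      # run as a daemon
    args += ["-l", listen_addr, "-p", str(port)]
    if binlog_dir is not None:
        args += ["-b", binlog_dir]             # persist jobs to a binary log
    if user is not None:
        args += ["-u", user]                   # user to run as
    if max_job_size is not None:
        args += ["-z", str(max_job_size)]      # maximum job size in bytes
    return args


# Hypothetical binlog directory; substitute your own path.
command = beanstalkd_command(binlog_dir="/var/lib/beanstalkd")
print(shlex.join(command))
```

The returned list can be passed to subprocess.run(command, check=True) to launch the server, assuming beanstalkd is on your PATH.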

Now we have a running beanstalkd server. In my next article, I’ll talk about how to use a client to connect to the running beanstalkd instance, and interact with it.

Web Queues

Posted in Uncategorized by f10i on October 29, 2009

When writing a web application, developers usually think in terms of the “user request” -> “server processing” -> “response” cycle. And they are right to think so, since this is how HTTP works. However, sometimes you need to accomplish tasks outside the scope of this cycle: scheduled jobs, long-running jobs, or callback jobs.

There are many ways to go about this, but the best way, IMO, is to have separate workers listening to a queue. You create a queue, and whenever you want to run a job outside the scope of the web cycle, you push a message to the queue. The workers listen to that queue, and whenever there is a new job, one of them fetches it and runs it.
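
The pattern can be sketched in a few lines of Python. This is an in-process stand-in using the standard library's queue and threads, not a Beanstalk client: with a real queuing server the producer and the workers would be separate processes talking to the server over the network. The function name run_jobs and the upper-casing "work" are placeholders for illustration.

```python
import queue
import threading


def run_jobs(jobs, worker_count=2):
    """Push jobs onto a queue and let worker threads process them."""
    job_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()          # blocks until a job is available
            if job is None:                # sentinel: no more work
                break
            outcome = job.upper()          # stand-in for the real work
            with results_lock:
                results.append(outcome)

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()

    # The "web request" side: enqueue the job and move on immediately.
    for job in jobs:
        job_queue.put(job)

    # Shut down: one sentinel per worker, then wait for them to finish.
    for _ in workers:
        job_queue.put(None)
    for thread in workers:
        thread.join()
    return sorted(results)
```

For example, run_jobs(["resize image", "send email"]) hands both jobs to the workers and returns once they have been processed. The code that enqueues a job never waits for that job to run, which is the whole point of moving it out of the request cycle.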

So far, I have tried using three different systems for queuing jobs: ActiveMQ, Amazon SQS, and most recently Beanstalk. I have to say that my personal preference is Beanstalk. ActiveMQ was a complete pain to install and configure, and I actually gave up on it before getting it to run.

Amazon SQS needs no setup or configuration, since it is a service offered by Amazon. However, not having the queue locally means slower response times, and you don’t get the feeling of having full control over your jobs. More importantly, some jobs would get completely lost and I couldn’t figure out where they went. Having to manually reinsert jobs into queues every so often became tedious quickly, and after a year of this we started looking for an alternative.

Beanstalk is an extremely fast and lightweight queuing server, has almost no setup overhead (you have to compile it from source, but that’s easily done on Mac or Linux), and can take on some serious load. The only problem I found with Beanstalk is that it desperately lacks good documentation online, so I find myself resorting to the source code to figure out what functionality it offers.

As such, I will write articles about Beanstalk whenever I find a new feature that I didn’t know about to help spread the knowledge about this amazing queuing server. Stay tuned.

