Restarting mongrel_cluster with capistrano

Update April 2, 2009: This post is REALLY old. Do yourself a favor and install Passenger. I have and it’s a life saver for my Rails sites. No more crashing mongrels, or mongrels that don’t want to restart. Just sites that know they need to spin up if they’re down.

Update April 17, 2007: Bradley, the author of mongrel_cluster, was already aware of this issue and is getting ready to release a new version with some fixes for this issue. Best practice for now is to update your config files to place the pid files in /var/run/mongrel_cluster. He mentioned it way back on February 23rd. I shoulda read closer.

If you’ve been experiencing problems restarting your mongrel cluster through Capistrano, I have two solutions that have worked for me, and I’m pretty sure they will work for you as well.

THE PROBLEM

For a while now I’ve been having trouble restarting my clusters using Capistrano. It wouldn’t find the pid files, so I’d have to manually SSH in, forcibly kill my mongrel processes and restart ’em. I’ve seen this come up before on the Mongrel mailing list, but looking through the archive I hadn’t been able to find a suitable answer or fix.

All my machines are updated to the latest mongrel and mongrel_cluster gems, 1.0.1 and 1.0.1.1 respectively. Running a “cap restart” runs the correct command to restart the cluster (edited for brevity) …

sudo mongrel_rails cluster::restart -C [valid path to config] --clean

This works when you are sitting in your rails app root. However, Capistrano runs its commands from the ssh user’s home directory [1]. I ran that same command from there and got this dreaded error output.

already stopped port 8000
already stopped port 8001
starting port 8000
!!! PID file log/mongrel.8000.pid already exists. Mongrel could be running already. Check your log/mongrel.8000.log for errors.
!!! Exiting with error. You must stop mongrel and clear the .pid before I'll attempt a start.
starting port 8001
!!! PID file log/mongrel.8001.pid already exists. Mongrel could be running already. Check your log/mongrel.8001.log for errors.
!!! Exiting with error. You must stop mongrel and clear the .pid before I'll attempt a start.

What was brought up on the mailing list was that mongrel appears not to be changing the working directory to the one specified in the cluster config file. A patch was submitted but, by my reckoning, it has not been applied and released.

However, I believe the patch mentioned above may not be necessary because, according to my research, the problem isn’t with mongrel at all. It’s with mongrel_cluster, no offense Bradley. 🙂 I’ve found two issues. One I believe causes the other.

1. The basic problem is that the “start” and “stop” commands, when they are scanning for existing pid files, are not being run from the working directory, as specified by the :cwd setting in the mongrel_cluster config file. mongrel_cluster does not use the working directory setting until it is past that point and finally calling the mongrel_rails command. Thus, it isn’t going to find the pid files if you are also susceptible to problem #2.
2. A relative directory :pid_file setting in the mongrel_cluster config. If you’re like me, your :pid_file setting is “log/mongrel.pid”. Using a relative directory like that is supposed to be based on the value of the :cwd setting. But mongrel_cluster is not applying the :cwd setting when parsing the :pid_file setting for its internal pid file variables.
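
To make the two problems concrete, here’s a minimal Ruby sketch of how a relative pid file path behaves depending on the directory it is looked up from. This is NOT mongrel_cluster’s code. The pid_visible_from? helper and the temp directories (standing in for the app root and the ssh user’s home) are made up for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of mongrel_cluster): does a pid file at
# the given, possibly relative, path exist when looked up from dir?
def pid_visible_from?(dir, pid_file)
  Dir.chdir(dir) { File.exist?(pid_file) }
end

Dir.mktmpdir do |tmp|
  app_root = File.join(tmp, "app")   # stands in for the :cwd setting
  home     = File.join(tmp, "home")  # stands in for the ssh user's home
  FileUtils.mkdir_p(File.join(app_root, "log"))
  FileUtils.mkdir_p(home)

  # A pid file left behind in the app's log directory.
  File.write(File.join(app_root, "log", "mongrel.8000.pid"), "12345")

  relative = "log/mongrel.8000.pid"
  puts pid_visible_from?(app_root, relative)  # true: found from the app root
  puts pid_visible_from?(home, relative)      # false: invisible from the home dir
end
```

Same relative path, two different answers. That would explain how “stop” can report “already stopped” while the later start, run from the right directory, trips over the very same pid file.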

SOLUTIONS… FINALLY!! 🙂

1. The solution to the first problem is to patch mongrel_cluster/init.rb. Sprinkle in some directory change commands, like the “status” command uses. I’ve uploaded my patch to Pastie.
2. Don’t use relative directories for the pid_file setting. Once I changed to an absolute directory setting of ~~“/www/app/shared/log/mongrel.pid” for example,~~ “/var/run/mongrel_cluster/app.pid” then mongrel_cluster correctly found my pid files. Solution #1 is NOT needed in this instance.
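
If you go the absolute path route and have a pile of config files to audit, something like this rough Ruby sketch can flag the relative settings. The relative_path_settings helper is hypothetical, the inline YAML is a made-up config (including the “/www/app/current” path), and I’m assuming the config uses plain string keys like “cwd” and “pid_file”.

```ruby
require "yaml"
require "pathname"

# Hypothetical check (not part of mongrel_cluster): return the settings
# whose paths are relative and so depend on where the command is run from.
def relative_path_settings(options, keys = %w[pid_file log_file])
  keys.select do |key|
    value = options[key]
    value && Pathname.new(value).relative?
  end
end

# Made-up config for illustration; the key names are an assumption.
config = YAML.load(<<~YML)
  cwd: /www/app/current
  pid_file: log/mongrel.pid
  log_file: /www/app/shared/log/mongrel.log
YML

puts relative_path_settings(config).inspect  # ["pid_file"]
```

Anything it prints is a setting worth switching to an absolute path.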

Both solutions require the user to perform an action, but I believe that the first solution requires fewer steps for the end user. Instead of updating ALL of your mongrel_cluster config files, for every single app you’re running, just update to a patched mongrel_cluster.

I suppose there’s a THIRD solution, and that’s to patch the “read_options” function in init.rb, maybe around lines 28 and 29: prepend @options["cwd"] if @options["pid_file"] or @options["log_file"] are relative paths.
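
Here’s roughly what I mean by that third idea, as a standalone Ruby sketch. It is NOT the real read_options code. The absolutize_paths helper and the example paths are made up, and it assumes a Unix-style filesystem.

```ruby
# Hypothetical stand-in for the idea, not the real read_options method:
# resolve relative pid_file / log_file settings against the cwd setting.
def absolutize_paths(options, keys = %w[pid_file log_file])
  cwd = options["cwd"]
  return options unless cwd

  keys.each_with_object(options.dup) do |key, result|
    value = result[key]
    next if value.nil?
    # File.expand_path leaves absolute paths alone and resolves
    # relative ones against its second argument.
    result[key] = File.expand_path(value, cwd)
  end
end

# Made-up settings for illustration.
options = {
  "cwd"      => "/www/app/current",
  "pid_file" => "log/mongrel.pid",
  "log_file" => "/var/log/mongrel.log"
}

fixed = absolutize_paths(options)
puts fixed["pid_file"]  # /www/app/current/log/mongrel.pid
puts fixed["log_file"]  # /var/log/mongrel.log
```

With the paths made absolute up front, it no longer matters which directory the “start” and “stop” commands happen to be run from.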

Am I off base with all this? Let me know. And thanx for reading all the way to the end. 🙂

[1] Unless the recipe specifically calls a “cd” command. But even then, the cd command starts in the ssh user’s home directory.