Ask HN: keeping services up and running?

nixme · on May 21, 2010

Dustin Sallings says, "Don’t start programs, run programs."

See: http://dustin.github.com/2010/02/28/running-processes.html

aphyr · on May 21, 2010

I use init. Most of my services manage their own daemonization and forked workers. It's easier to design a correct system that detects failure, issues alerts, and restarts as necessary than to build unkillable workers. Coupled with some basic init scripts, it works pretty well.

I like monit, but it has a weird habit of running super-slowly, failing to restart services, or otherwise flaking out.

_delirium · on May 22, 2010

This post by patio11 is a more in-depth "how to keep everything up" post rather than a rundown of specific tools (though it mentions tools also), but I thought it was quite informative: http://www.kalzumeus.com/2010/04/20/building-highly-reliable...

njl · on May 21, 2010

I've used both daemontools and monit in anger. For a single server that I'm not particularly concerned about, monit has been more than sufficient and convenient. The web interface to see what's going on isn't bad either. On the other hand, it can send me annoying blizzards of emails when things break.

For the next project, I'm going back to daemontools, for a multitude of reasons. Between the /service directory and the svc command, automating stuff is ridiculously easy. I can get the active health check stuff by writing a ten-line script that does a much better check on a server than the generic checks monit provides. I can get emails when bad stuff happens with logcheck.
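
A rough sketch of the kind of short active health check described above, written in Python for illustration. This is not njl's actual script: the function names, the injectable `run` parameter, and the example URL and service directory are all assumptions. The only daemontools behavior relied on is `svc -t`, which sends TERM to the service so that supervise starts it again.

```python
import http.client
import subprocess
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, 4xx/5xx, or a malformed response.
        return False


def check_and_restart(check, service_dir, run=subprocess.call):
    """Run check(); if it fails, ask daemontools to restart the service.

    `svc -t` sends TERM; supervise then brings the service back up.
    `run` is injectable (an assumption of this sketch) so the logic can be
    exercised without daemontools installed.
    Returns True if a restart was requested.
    """
    if check():
        return False
    run(["svc", "-t", service_dir])
    return True
```

It could be invoked as `check_and_restart(lambda: http_ok("http://127.0.0.1:8080/health"), "/service/myapp")`, where both the URL and the service name are placeholders.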

Daemontools is just so goddamn unixy.

kineticac · on May 22, 2010

I used to use Monit a lot, but recently started going simpler. We run our entire backend / platform on Heroku, and all our processes are actually DelayedJob workers that run constantly. By wrapping the work in a begin/rescue/ensure, we can make sure the "process" here keeps going. It's a little harder to work in a cloud environment, but doing this has saved us from needing any type of sysadmin work at all.
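
For readers who don't write Ruby, here is a minimal Python analogue of the begin/rescue/ensure idea, using try/except/finally. It is only a sketch of the pattern, not the DelayedJob code in question; `run_forever`, `job`, `pause`, and `max_iterations` are hypothetical names, and `max_iterations` exists only so the loop can be stopped in an example.

```python
import logging
import time

log = logging.getLogger("worker")


def run_forever(job, pause=1.0, max_iterations=None, sleep=time.sleep):
    """Call job() repeatedly, logging failures instead of letting them end the loop.

    Only Exception is caught, so KeyboardInterrupt and SystemExit still stop
    the worker. Returns the number of failed iterations.
    """
    done = 0
    failures = 0
    while max_iterations is None or done < max_iterations:
        try:
            job()
        except Exception:
            failures += 1
            log.exception("job failed; continuing")
        finally:
            # Runs whether or not job() raised, like Ruby's `ensure`.
            done += 1
            sleep(pause)
    return failures
```

The pause in the `finally` branch keeps a job that fails instantly from spinning the CPU.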

It's great using tools like Heroku, which teams up with other awesome services and provides great tools that really make life easy as a developer.

skorgu · on May 21, 2010

Bear in mind that at least supervisord and daemontools expect to be the parent of a running, foregrounded process. Monit expects processes to run in the background and generate a pid file. I'm not very familiar with the others.
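
To make the pid-file model concrete, here is a small Python sketch of how a watcher can tell whether a backgrounded process is still alive. This is an illustration of the general technique, not Monit's implementation; the function names are hypothetical, and it assumes a POSIX system where signal 0 probes a pid without delivering anything.

```python
import os


def pid_from_file(path):
    """Return the integer pid stored in path, or None if missing or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid):
    """Probe a pid with signal 0, which checks existence without signalling."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A stale pid file is the weak point of this approach: if the pid has been reused by an unrelated process, the check still passes, which is one reason the foreground-child model of supervisord and daemontools is simpler to reason about.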

I really enjoy supervisord personally. It feels similar to daemontools in execution but has a somewhat friendlier interface all around.

jubbam · on May 22, 2010

I second this. supervisord is a pleasure to use: pretty straightforward, and quick to get up to speed with.

msisk6 · on May 21, 2010

Monit seems to be the currently preferred solution in the Rails world. I use it on many different sites with no problems.

I've also used daemontools in the past and never had an issue with it, either.

I've heard good things about god, but I've not used it myself. (Now there's a sentence that could be taken out of context.)

damienfir · on May 21, 2010

I use supervisord to run my python webservers. Works pretty well.

vimalg2 · on May 22, 2010

I pretty much use Monit for everything. (Though I wasn't aware of supervisord.)

I'm always looking forward to infrastructure/scaling/systems-engineering posts like these.

aditya · on May 21, 2010

god is nice too: http://god.rubyforge.org/

there · on May 22, 2010

i've had terrible experiences with god. after a few days of running, it would routinely be the one process eating up the most cpu and lots of memory, yet its only reason for running was to watch out for other processes eating cpu and memory.

i didn't need millisecond accuracy in detecting rogue processes, so i just switched to a small, custom script that runs from cron every minute.
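
A guess at what such a cron-driven check could look like, sketched in Python. This is not the script mentioned above: the thresholds, function names, and the exact `ps` columns are assumptions, and `rss` as an output column is widely supported but not guaranteed by POSIX.

```python
import subprocess


def parse_ps(text):
    """Parse 'pid pcpu rss comm' rows into (pid, cpu_percent, rss_kb, command)."""
    rows = []
    for line in text.splitlines():
        parts = line.strip().split(None, 3)
        if len(parts) < 4:
            continue
        try:
            rows.append((int(parts[0]), float(parts[1]), int(parts[2]), parts[3]))
        except ValueError:
            continue  # skip rows that do not match the expected layout
    return rows


def offenders(rows, max_cpu=90.0, max_rss_kb=512 * 1024):
    """Return rows whose CPU percentage or resident memory exceeds the limits."""
    return [r for r in rows if r[1] > max_cpu or r[2] > max_rss_kb]


def snapshot():
    """List every process; the trailing '=' on each column suppresses headers."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid=,pcpu=,rss=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)
```

Run once a minute from cron, a script built on `offenders(snapshot())` exits between checks, so it cannot itself accumulate memory the way a long-running watcher can.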

kineticac · on May 22, 2010

God has posed many problems for me in the past. I saw all the same problems "there" saw. It would slow down our entire slice on slicehost to a standstill after a few days of running. It was using more resources than everything else combined. I'm not sure how it is now, but we switched to Monit after God and it's been awesome.

I wrote a quick snippet on how we set up the basics of Monit, and talked about God as well: http://artchang.com/?sort=&search=monit

bensummers · on May 21, 2010

SMF on Solaris / OpenSolaris.

jacquesm · on May 21, 2010

monit works wonders for me.

I got it through a post much like this one.