Hacker News
Ask HN: keeping services up and running?
36 points by nprincigalli on May 21, 2010 | 15 comments
Services are born into a life of sweat and suffering, but some can't stand the beating or the garbage being thrown at them, and they die on you. Hence the question:

What are you using to keep them always up and running?

I used djb's daemontools some 6 years ago, and it was okay back then, but I wonder what is being used by those setting up their gear now. I've also compiled this quick list of players in this space after some googling and asking around:

  * monit http://mmonit.com/monit/
  * supervisord http://supervisord.org/
  * daemonize http://bmc.github.com/daemonize/
  * runit http://smarden.sunsite.dk/runit/
  * perp http://b0llix.net/perp/
  * launchd http://launchd.macosforge.org/
  * DJB's daemontools http://cr.yp.to/daemontools.html
Pointers to alternatives, too, are greatly appreciated!

Thank you!



Dustin Sallings says, "Don’t start programs, run programs."

See: http://dustin.github.com/2010/02/28/running-processes.html


I use init. Most of my services manage their own daemonization and forked workers. It's easier to design a correct system to detect failure, issue alerts, and restart as necessary than to build unkillable workers. Coupled with some basic init scripts, it works pretty well.
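
One way to do the "detect failure" part for services that daemonize themselves is to check the PID file they leave behind. This is only a minimal sketch, assuming a POSIX system and a daemon that writes its PID to a file; the function names are made up for illustration, and the alerting and restart steps are left to whatever init script is in use.

```python
import os


def read_pid(pid_file):
    """Return the integer PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def is_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def needs_restart(pid_file):
    """A daemon needs a restart when its PID file is absent, garbled, or stale."""
    return not is_alive(read_pid(pid_file))
```

A stale PID can be reused by an unrelated process, so a check like this can report a dead service as alive; a port or protocol check is a stronger signal.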

I like monit, but it has a weird habit of running super-slowly, failing to restart services, or otherwise flaking out.


This post by patio11 is a more in-depth "how to keep everything up" post rather than a rundown of specific tools (though it mentions tools also), but I thought it was quite informative: http://www.kalzumeus.com/2010/04/20/building-highly-reliable...


I've used both daemontools and monit in anger. For a single server that I'm not particularly concerned about, monit has been more than sufficient and convenient. The web interface to see what's going on isn't bad either. On the other hand, it can send me annoying blizzards of emails when things break.

For the next project, I'm going back to daemontools, for a multitude of reasons. Between the /service directory and the svc command, automating stuff is ridiculously easy. I can get the active health check stuff by writing a ten-line script that does a much better check on a server than the generic checks monit provides. I can get emails when bad stuff happens with logcheck.

Daemontools is just so goddamn unixy.
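
For what it's worth, the kind of short active health check described above might look roughly like this in Python. It is a sketch, not the commenter's actual script: the TCP check stands in for whatever "a much better check" means for a given server, and the host, port, and service directory are hypothetical. The only daemontools piece it relies on is `svc -t`, which sends TERM so that supervise starts the service again.

```python
import socket
import subprocess


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def restart_command(service_dir):
    """Build the daemontools command that sends TERM; supervise then restarts the service."""
    return ["svc", "-t", service_dir]


def check_and_restart(host, port, service_dir, run=subprocess.call):
    """Restart the service if the health check fails. Returns True if healthy."""
    if port_is_open(host, port):
        return True
    run(restart_command(service_dir))
    return False
```

Something like `check_and_restart("127.0.0.1", 8080, "/service/web")` could then be called from cron or from a loop; the `run` parameter exists only so the restart can be swapped out when testing.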


I used to use Monit a lot, but recently started going simpler. We run our entire backend / platform on Heroku, and all our processes are actually DelayedJob workers that run constantly. Using a begin/rescue/ensure we can make sure the "process" here keeps going. It's a little harder to work in a cloud environment, but doing this has saved us from needing any type of sys admin work at all.

It's great using tools like Heroku, which teams up with other awesome services and provides great tools that really make life easy as a developer.
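
The begin/rescue/ensure pattern mentioned above is Ruby; a rough Python analogue uses try/except/finally. This is a sketch of the general idea only, not DelayedJob's implementation: `fetch_job` and `handle_job` are hypothetical callables, and `max_iterations` is there just so the loop can be bounded for testing.

```python
import logging
import time


def run_worker(fetch_job, handle_job, idle_sleep=0.0, max_iterations=None):
    """Process jobs in a loop; one failing job never stops the loop.

    Returns (succeeded, failed) counts. max_iterations=None loops forever.
    """
    succeeded = failed = iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        job = fetch_job()
        if job is None:
            time.sleep(idle_sleep)  # nothing queued; wait before polling again
            continue
        try:
            handle_job(job)
            succeeded += 1
        except Exception:  # Ruby's `rescue`: log the failure and keep going
            failed += 1
            logging.exception("job %r failed", job)
        finally:  # Ruby's `ensure`: runs whether the job succeeded or not
            pass  # e.g. release a lock or close a connection here
    return succeeded, failed
```

Note that this only protects against exceptions raised while handling a job. If the process itself is killed, something outside it still has to bring it back.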


Bear in mind that at least supervisord and daemontools expect to be the parent of a running, foregrounded process. Monit expects processes to run in the background and generate a pid file. I'm not very familiar with the others.

I really enjoy supervisord personally. It feels similar to daemontools in execution but has a somewhat friendlier interface all around.
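
To illustrate why the foreground requirement matters, here is a toy supervisor loop in Python. It is not how supervisord or daemontools are implemented, just a sketch of the parent-waits-on-child model they share; the `max_restarts` and `delay` parameters are assumptions added for illustration.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=None, delay=1.0):
    """Run cmd in the foreground as a child and restart it whenever it exits.

    Returns the list of exit codes seen. max_restarts=None restarts forever.
    """
    exit_codes = []
    while True:
        child = subprocess.Popen(cmd)  # we are the parent, so wait() sees it exit
        exit_codes.append(child.wait())
        if max_restarts is not None and len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay)  # brief pause so a crash loop does not spin the CPU
```

If the child forks into the background and its original process exits, `wait()` returns right away and the loop starts a second copy. That is why tools built on this model need the program to stay in the foreground.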


I second this. supervisord is a pleasure to use: pretty straightforward, and quick to get up to speed with.


Monit seems to be the currently preferred solution in the Rails world. I use it on many different sites with no problems.

I've also used daemontools in the past and never had an issue with it, either.

I've heard good things about god, but I've not used it myself. (Now there's a sentence that could be taken out of context.)


I use supervisord to run my python webservers. Works pretty well.


I pretty much use Monit for everything. (Though I wasn't aware of supervisord.)

I'm always looking forward to infrastructure/scaling/systems-engineering posts like these.


god is nice too: http://god.rubyforge.org/


I've had terrible experiences with god. After a few days of running, it would routinely be the one process eating up the most CPU and lots of memory, yet its only reason for running was to watch out for other processes eating CPU and memory.

I didn't need millisecond accuracy in detecting rogue processes, so I just switched to a small, custom script that runs from cron every minute.
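
A cron-driven check along these lines could be sketched in Python as below. This is not the commenter's script: the thresholds are arbitrary, and it assumes a `ps` that accepts `-eo pid,pcpu,rss,comm` (procps on Linux does). What to do with the offenders, whether to kill, restart, or just send mail, is left out.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -eo pid,pcpu,rss,comm` output into (pid, cpu_percent, rss_kb, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) == 4:
            pid, cpu, rss, comm = parts
            rows.append((int(pid), float(cpu), int(rss), comm.strip()))
    return rows


def find_hogs(rows, max_cpu=90.0, max_rss_kb=500_000):
    """Return rows whose CPU percentage or resident memory exceeds a limit."""
    return [r for r in rows if r[1] > max_cpu or r[2] > max_rss_kb]


def current_hogs(**limits):
    """Take one snapshot of the process table and return the offenders."""
    out = subprocess.check_output(["ps", "-eo", "pid,pcpu,rss,comm"], text=True)
    return find_hogs(parse_ps(out), **limits)
```

Run once a minute from cron, the script exits after each pass, so it cannot itself grow into the biggest process on the box. One caveat: on Linux the `%CPU` column from `ps` is an average over the life of the process, so a short spike in a long-running process may not show up.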


God has posed many problems for me in the past. I saw all the same problems "there" saw. It would slow down our entire slice on Slicehost to a standstill after a few days of running. It was using more resources than everything else combined. I'm not sure how it is now, but we switched to Monit after God and it's been awesome.

I wrote a quick snippet on how we set up the basics of Monit and also talked about God: http://artchang.com/?sort=&search=monit


SMF on Solaris / OpenSolaris.


monit works wonders for me.

I got it through a post much like this one.



