
Monitoring server-side processes with God

Posted by Nicolas Cavigneaux in the Tools category

Have you ever needed to put a critical website online and make sure it stays up and running 24/7? If so, you know how painful it can be to check that everything is OK, that all services are running, and that no process is eating too many resources (CPU or memory).

Here at Synbioz we have to ensure service reliability for our customers. There are many ways to do that: you can write your own shell scripts, play with crontabs, send an email on failure, and so on. But it is hard to write effective scripts and to make sure they work well. Moreover, most of the time your homemade scripts will not be portable and will only work with one specific application. So what is the best way to handle this?

God is upon you!

Here comes God, a monitoring framework you can rely on to keep your processes and tasks running. God is written in Ruby and aims to be a simple, powerful and flexible way to write monitoring tasks.

Before going deeper into God, you should know that it only works on Unix-like systems. Sorry, Windows users, but hey, I know you would never want to deploy a production app on a Windows server…

God config files are written in Ruby, so you can do basically everything Ruby allows you to do, which is a lot.

God's main features:

  • Config files are written in Ruby
  • Easily write your own custom conditions in Ruby
  • Supports both poll and event based conditions
  • Different poll conditions can have different intervals
  • Integrated notification system (e.g. an XMPP notifier)
  • Easily control non-daemonizing scripts

God basics

Install

You must first install God on your system, that is, on the production server:

$ sudo gem install god

or add it to your Gemfile like so:

gem "god"

You can now create a God configuration file for the daemon you want to monitor:

$ touch config/unicorn.god

Naming config files with a .god extension is a convention, but the file is in fact plain Ruby.

Handling daemon start and stop

RAILS_ROOT = File.dirname(File.dirname(__FILE__))

God.watch do |w|
  pid_file = File.join(RAILS_ROOT, "tmp/pids/unicorn.pid")

  w.name = "unicorn"
  w.dir = RAILS_ROOT
  w.interval = 60.seconds
  w.start = "unicorn -c #{RAILS_ROOT}/config/unicorn.rb -D"
  w.stop = "kill -s QUIT $(cat #{pid_file})"
  w.restart = "kill -s HUP $(cat #{pid_file})"
  w.start_grace = 20.seconds
  w.restart_grace = 20.seconds
  w.pid_file = pid_file

  w.uid = 'nico'
  w.gid = 'team'

  w.env = { 'RAILS_ENV' => "production" }

  w.behavior(:clean_pid_file)
end
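
The RAILS_ROOT line works because the config file lives in config/, one level below the application root. A quick plain-Ruby check of that path arithmetic, assuming a hypothetical install path (/var/www/myapp) and a hypothetical helper name (app_root_for):

```ruby
# Hypothetical helper: derive the application root from the location
# of a God config file stored in <app root>/config/.
def app_root_for(config_file)
  # The first dirname strips the file name, the second strips "config".
  File.dirname(File.dirname(config_file))
end

# The install path below is an assumption made for the example.
root     = app_root_for("/var/www/myapp/config/unicorn.god")
pid_file = File.join(root, "tmp/pids/unicorn.pid")

puts root      # /var/www/myapp
puts pid_file  # /var/www/myapp/tmp/pids/unicorn.pid
```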

God monitoring

We’re now going to enhance our config file to add real process monitoring. Monitoring will allow us to check CPU and memory usage per process:

RAILS_ROOT = File.dirname(File.dirname(__FILE__))

God.watch do |w|
  pid_file = File.join(RAILS_ROOT, "tmp/pids/unicorn.pid")

  w.name = "unicorn"
  w.interval = 60.seconds
  w.start = "unicorn -c #{RAILS_ROOT}/config/unicorn.rb -D"
  w.stop = "kill -s QUIT $(cat #{pid_file})"
  w.restart = "kill -s HUP $(cat #{pid_file})"
  w.start_grace = 20.seconds
  w.restart_grace = 20.seconds
  w.pid_file = pid_file

  w.behavior(:clean_pid_file)

  # When to start?
  w.start_if do |start|
    start.condition(:process_running) do |c|
      # Check every ten seconds whether the daemon is running
      # and start it if it isn't
      c.interval = 10.seconds
      c.running = false
    end
  end

  # When to restart a running daemon?
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      # Look at the last five memory usage samples;
      # if three of them are above the memory limit (100 MB),
      # restart the daemon
      c.above = 100.megabytes
      c.times = [3, 5]
    end

    restart.condition(:cpu_usage) do |c|
      # Restart the daemon if CPU usage is
      # above 90% on five consecutive checks
      c.above = 90.percent
      c.times = 5
    end
  end

  w.lifecycle do |on|
    # Handle edge cases where the daemon
    # can't start for some reason
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart] # If God tries to start or restart
      c.times = 5                     # five times
      c.within = 5.minute             # within five minutes
      c.transition = :unmonitored     # we want to stop monitoring
      c.retry_in = 10.minutes         # for 10 minutes and monitor again
      c.retry_times = 5               # we'll loop over this five times
      c.retry_within = 2.hours        # and give up if flapping occurred five times in two hours
    end
  end
end
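
To make the `c.times = [3, 5]` rule concrete, here is a small plain-Ruby simulation of a "three of the last five samples" check. It only illustrates the idea and is not God's actual implementation; SampleWindow is a hypothetical name and the readings are made up.

```ruby
# Illustration only: a sliding window that trips when at least `needed`
# of the last `size` samples are above a limit. Not God's real code.
class SampleWindow
  def initialize(needed, size, limit)
    @needed  = needed
    @size    = size
    @limit   = limit
    @samples = []
  end

  # Record a sample and report whether the condition is now met.
  def push(value)
    @samples << value
    @samples.shift while @samples.size > @size
    @samples.count { |s| s > @limit } >= @needed
  end
end

window   = SampleWindow.new(3, 5, 100) # 3 of 5 samples above 100 (MB)
readings = [90, 120, 95, 130, 110]     # made-up memory readings
results  = readings.map { |mb| window.push(mb) }
p results # => [false, false, false, false, true]
```

In this sketch, SampleWindow.new(5, 5, 90) models a plain `c.times = 5`: it only trips when five samples in a row are above the limit.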

You can repeat the God watch block as many times as you need to handle the other daemons your application makes use of.
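
Since the config file is plain Ruby, the repetition can also be a loop. The sketch below is only one possible layout: the worker names and the script/worker command are assumptions, and worker_settings is a hypothetical helper. No pid_file is set, so the assumed command is expected to stay in the foreground and let God handle daemonization.

```ruby
# Hypothetical worker names; replace with the daemons your app really runs.
WORKERS = %w[mailer exporter].freeze

# Build the per-worker settings as plain data.
# script/worker is an assumed command, not something God provides.
def worker_settings(root, name)
  {
    :name  => "worker-#{name}",
    :start => "#{root}/script/worker #{name}"
  }
end

# This part only runs when the file is loaded by God itself.
if defined?(God)
  root = File.dirname(File.dirname(__FILE__))

  WORKERS.each do |name|
    settings = worker_settings(root, name)

    God.watch do |w|
      w.name     = settings[:name]
      w.dir      = root
      w.interval = 60.seconds
      w.start    = settings[:start]

      # Start the worker whenever it is found not running
      w.start_if do |start|
        start.condition(:process_running) do |c|
          c.running = false
        end
      end
    end
  end
end
```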

Usage

Now that your config file is ready, you can check the current God status:

$ god status

which will tell you that unicorn is not running yet (if God itself has not been started, it reports that the server is not available). So we’re going to start it:

$ god -c config/unicorn.god

The same, but not daemonized, so God stays in the foreground:

$ god -c config/unicorn.god -D

God status should now tell you that unicorn is up and running.

$ god log unicorn

will show you what God did with the daemon, along with monitoring results such as the latest memory and CPU usage, in real time.

If you like to play, you can now kill the unicorn process from another shell and watch what happens in the God logs:

$ kill $(cat tmp/pids/unicorn.pid)

You should see that God detected that unicorn was no longer running, deleted the pid file if it existed, and started the unicorn daemon again.

Now, if you need to stop God along with everything it monitors:

$ god terminate

or just a given watch:
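
The check behind this boils down to one question: does the PID stored in the pid file still belong to a live process? Here is a plain-Ruby sketch of that idea. It is not God's own code, and process_alive? is a hypothetical helper.

```ruby
# Illustration only: check whether the process named in a pid file is alive.
def process_alive?(pid_file)
  return false unless File.exist?(pid_file)

  pid = File.read(pid_file).to_i
  return false if pid <= 0 # empty or garbage pid file

  # Signal 0 performs error checking only; nothing is delivered.
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process: the pid file is stale
rescue Errno::EPERM
  true  # the process exists but belongs to another user
end

puts process_alive?("tmp/pids/unicorn.pid")
```

Run from the application root, this prints true while unicorn is up and false once it has been killed or its pid file is gone.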

$ god stop unicorn

Server init process

Great, we’re happy with our monitoring system, but how do we start this thing when the server boots or reboots? You have to write an init script! But relax, I have one for you:

Init script

#!/bin/bash
#
# God
#
# chkconfig: - 85 15
# description: start, stop, restart, status for God
#

RETVAL=0

case "$1" in
    start)
      god -P /var/run/god.pid -l /var/log/god.log
      god load /etc/god.conf
      RETVAL=$?
      ;;
    stop)
      kill `cat /var/run/god.pid`
      RETVAL=$?
      ;;
    restart)
      kill `cat /var/run/god.pid`
      god -P /var/run/god.pid -l /var/log/god.log
      god load /etc/god.conf
      RETVAL=$?
      ;;
    status)
      /usr/bin/god status
      RETVAL=$?
      ;;
    *)
      echo "Usage: god {start|stop|restart|status}"
      exit 1
      ;;
esac

exit $RETVAL

Global God config file

As you can see, the script above makes use of a file named /etc/god.conf. This file has one simple purpose: loading a bunch of God config files at once:

God.load "/etc/god/*.god"

This trick lets you symlink your app's God config files into the /etc/god/ directory to ensure they are loaded on server boot. This is very similar to the technique used for Mongrel.
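
The symlink step could be scripted, for example at deploy time. A minimal sketch in Ruby, where link_god_config is a hypothetical helper and all paths are assumptions; writing into /etc/god usually requires root.

```ruby
require "fileutils"

# Link an app's God config into the directory that /etc/god.conf globs.
# Returns the path of the created symlink.
def link_god_config(app_config, god_dir, link_name)
  FileUtils.mkdir_p(god_dir)
  target = File.join(god_dir, link_name)
  # ln_sf replaces an existing link, so repeated deploys stay idempotent.
  FileUtils.ln_sf(File.expand_path(app_config), target)
  target
end

# Only acts when run directly with two arguments:
#   <path to the app's .god file> <link name ending in .god>
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  puts link_god_config(ARGV[0], "/etc/god", ARGV[1])
end
```

The link name must end in .god, otherwise the "/etc/god/*.god" glob will not pick it up.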

Now you can do:

$ /etc/init.d/god start

$ /etc/init.d/god status

$ /etc/init.d/god stop

Notification on failures

Let’s say you want to be notified every time a process exits. You can add this to your God configuration file:

w.transition(:up, :start) do |on|
  on.condition(:process_exits) do |c|
    c.notify = 'devteam'
  end
end

Now God knows that every time our process exits while it is up, it should go back to the start state and send a notification to “devteam”. You can use notify in any condition block.

But what is “devteam”, and how on earth are notifications sent?!
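
For instance, the memory-based restart condition from the monitoring section could notify the same group. This is only a fragment, meant to sit inside the God.watch block for unicorn, and it reuses nothing but names already shown in this article:

```ruby
# Fragment: belongs inside the God.watch block for unicorn.
w.restart_if do |restart|
  restart.condition(:memory_usage) do |c|
    c.above  = 100.megabytes
    c.times  = [3, 5]
    c.notify = 'devteam' # notify the dev team whenever this condition triggers
  end
end
```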

Sending emails

The first option is to send email.

We start by defining some defaults for email in our God config file:

God::Contacts::Email.defaults do |d|
  d.from_email = 'god@synbioz.com'
  d.from_name = 'God'
  d.delivery_method = :sendmail
end

Then we need to define a contact:

God.contact(:email) do |c|
  c.name = 'Dev Team'
  c.group = 'devteam'
  c.to_email = 'team@synbioz.com'
end

You can define as many contacts as you need, but be sure each “name” attribute is unique! Now our dev team will receive an email notification whenever such a problem occurs.

Jabber notifications

You don’t like emails and want XMPP notifications? No problem:

God::Contacts::Jabber.defaults do |d|
  d.host = "jabber.synbioz.com"
  d.from_jid = "foo@synbioz.com"
  d.password = "bar"
end

God.contact(:jabber) do |c|
  c.name = 'Dev Team (Jabber)' # every contact needs a unique name
  c.group = 'devteam'
  c.to_jid = "baz@synbioz.com"
end

Other notification systems

You can also use Campfire, Prowl, Scout, Twitter and WebHook to send notifications. They are part of God core.

You can easily extend the notification system if you need to use your own, for instance an internal tracking system.

God is good for you!

I hope this quick introduction to God will be helpful for those of you who want to monitor their applications. Don’t think God is only for Rails apps or even Ruby apps. You can use God for anything you want to monitor, Rails projects or not!

Synbioz Team.

Comments (14)

  • 05/01/2012 at 10:33

    yannski

    Thanks for the article! If I understand correctly, God should run as a privileged user. How do you reconcile this constraint with the fact that deployments are usually done with an unprivileged user? How do you restart a service in that case?

  • 05/01/2012 at 11:57

    Nicolas Cavigneaux

    @yannski Do you mean when you do something like cap deploy? God isn't meant to be used on deploy but rather as a service started at server boot time.

    As you can see in the first example, each "watch" block can take a UID and group to be applied to the daemon when it starts, so you can easily interact with it (stop / restart) as an unprivileged user in your deploy scripts.

  • 05/01/2012 at 13:33

    Nicolas Cavigneaux

    @yannski If you are wondering how to make Capistrano and God communicate, here is a link for you: https://github.com/jnewland/san_juan

  • 05/01/2012 at 14:14

    yannski

    interesting ! thanks for the pointers

  • 05/01/2012 at 15:58

    yannski

    I received the emails. But I don't see the replies displayed here!

  • 05/01/2012 at 16:02

    fuse

    Sorry Yann, those were just dev tests ;-)

  • 12/02/2013 at 13:58

    Sahidur Rahman

    Thanks for the nice article.
    I added the God gem to monitor my application's delayed_job processes, but when I try to start the god service, with god start delayed_job or sudo god start delayed_job, it returns an error message: The server is not available (or you do not have permissions to access it).
    So what can I do now?
    Please give a suggestion.
    In my project I use Rails 2.3.16 and Ruby 1.8.7.

  • 12/02/2013 at 14:03

    Nicolas Cavigneaux

    @Sahidur: Did you start god? I think you forgot "/etc/init.d/god start" or "god load /etc/god.conf".

  • 14/02/2013 at 06:25

    Sahidur Rahman

    Thanks @Nicolas.
    My application's god service is working now.
    I had installed the god service as a normal user, which is why the permission-related errors occurred.
    God should be run as the root user.

  • 28/02/2014 at 10:44

    Derek

    Thank you for this! I'm having trouble understanding something, I know that god must be run as root, but what if you are using god to run a ruby program developed by a non-root user who uses ruby from RVM with the corresponding gems in RVM?
    I found this:
    https://rvm.io/integration/init-d
    but it still isn't clear to me how it should allow a ruby program to execute at startup with a user-specific RVM gemset. Thanks again!

  • 03/03/2014 at 09:40

    Nicolas Cavigneaux

    @Derek: Can't really help you out with this one since I'm not using RVM nor RVM gemsets. Maybe this link will help: https://rvm.io/deployment/god

  • 17/09/2015 at 03:15

    steakknife

    Simpler, UNIX-philosophy rvm replacement (sans gemsets):

    ruby-install :: build and install Rubies from sources without a hellazillion "richness" tweaks.

    chruby :: switch Rubies cleanly

    bundler :: package --all && --local --deployment will vendor and ignore globally-installed gems

  • 05/12/2015 at 06:29

    Max

    God will never kill unicorn workers which abuse the memory and cpu limits as it monitors master process only.
    So your approach does not work.

  • 05/11/2016 at 08:55

    Priyank Agrawal

    How do we send a custom message in the email triggered by god when watching a process for process_exit? It currently sends a default message, but I would like to send my logs in the email. A reference link or tutorial would be appreciated.
