ruby-collectd ============= `ruby-collectd` lets you send statistics to `collectd` from Ruby. This can be useful if you want to **instrument data** from within daemons or servers written in Ruby. `ruby-collectd` works by talking the `collectd` network protocol, and sending stats periodicially to a network-aware instance of `collectd`. Setup ----- You need to have `collectd` load the network plugin, and listen on UDP port `25826` (so it's acting as a server): :: # /etc/collectd.conf LoadPlugin network Listen "ff18::efc0:4a42" To install the gem, make sure the Gemcutter RubyGems server is known to your local RubyGems, then install it: :: gem sources -a http://gemcutter.org gem install collectd Add ``sudo`` in front of the ``gem install`` command if you want it to be installed system wide. Usage ----- :: require 'rubygems' require 'collectd' First of all, specify a server to send data to: :: Collectd.add_server(interval=10, addr='ff18::efc0:4a42', port=25826) Each server you add will receive all the data you push later. An interval of 10 is quite reasonable. Because of UDP and `collectd`'s network buffering, you can set the interval to less than 10, but you won't see much benefit. `ruby-collectd` gives you a free data collector out of the box, and it's a nice gentle introduction to instrumenting your app. To collect memory and CPU statistics of your Ruby process, do: :: Stats = Collectd.my_process(:woo_data) Stats.with_full_proc_stats In the first line, we set up a new plugin. ``my_process`` is the plugin name (magically handled by method_missing), and ``:woo_data`` is the plugin instance. A **plugin name** is generally an application's name, and a **plugin instance** is a unique identifier of an instance of an application (i.e. you have multiple daemons or scripts running at the same time). In the second line, ``with_full_proc_stats`` is a method provided by `ruby-collectd` that collects stats about the current running process. It makes use of polled gauges, which we talk about later. Behind the scenes, ``with_full_proc_stats`` is using a **simple interface** you can use to instrument your own data. Back in the first line we set up a plugin which we wanted to record some data on. ``with_full_proc_stats`` sets up **types**, which are a kind of data you are measuring (in this case CPU and memory usage). You can do this yourself like this: :: Stats = Collectd.my_daemon(:backend) # Set counter absolutely Stats.my_counter(:my_sleep).counter = 0 Stats.my_gauge(:my_gauge).gauge = 23 loop do # Increment counter relatively Stats.my_counter(:my_sleep).count! 5 # Set gauge absolutely Stats.my_gauge(:my_stack).gauge = rand(40) sleep 5 end (Don't worry if this doesn't make sense - gauges and counters are explained below) You can also **poll** for your data, if you feel comfortable with that: :: Stats.counter(:seconds_elapsed).polled_counter do Time.now.to_i end Glossary / collectd's data model -------------------------------- `collectd` groups data by **six categories:** * *hostname* is grabbed from ``hostname -f`` * *plugin* is the application's name * *plugin-instance* is passed from the programs' side with the programs instance identifier, useful if you're running the same script twice (PIDs are quite too random) * *type* is the kind of data you are measuring and must be defined in types.db_ for collectd to understand * *type-instance* provides further distinction and have no relation to other type-instances. Multiple type-instances are only rendered into one graph by collection3 if defined with module GenericStacked. * *values* are one or more field names and types belonging together. The exact amount of fields and their corresponding names (useful to collection3) are specified in collectd's types.db_. A value can be either of **two types:** * *COUNTER* is for increasing counters where you want to plot the delta. Network interface traffic counters are a good example. * *GAUGE* is values that go up and down to be plotted as-is, like a temperature graph. .. _types.db: http://collectd.org/documentation/manpages/types.db.5.shtml