# Ironfan + Chef Style Guide

**NOTE**: NEEDS UPDATE. This will be current by time of launch, but the below may differ from what you see in practice.

## Cookbooks

Ordinary cookbooks describe a single system, consisting of one or more components. For example, the `redis` cookbook has a `server` component (with a daemon and moving parts), and a `client` component (which is static).

You should crisply separate cookbook-wide concerns from component concerns. The server's attributes live in `node[:redis][:server]`, it is installed by the `redis::server` recipe, and so forth.

You should also separate system configuration from multi-system integration. Cookbooks should provide hooks that are neighborly but not exhibitionist, and otherwise mind their own business. The `hadoop_cluster` cookbook describes hadoop, the `pig` cookbook pig, and the `zookeeper` cookbook zookeeper. The job of tying those components together (copying zookeeper jars into the pig home dir, or the port+addr of hadoop daemons) should be isolated.

### Recipes

* Naming:
  - `foo/recipes/default.rb` -- information shared by anyone using foo, including support packages, directories
  - `foo/recipes/client.rb` -- configure me as a foo client
  - `foo/recipes/server.rb` -- configure me as a foo server
  - `foo/recipes/ec2_conf.rb` -- cloud-specific settings
* Always include a `default.rb` recipe, even if it is blank.
* *DO NOT* install daemons via the default recipe, even if that's currently the only thing it does. Remember, a node that is a client -- or refers to any current or future component of the system -- will include the default recipe.
* Do not repeat the cookbook name in a recipe title: `hbase::master`, not `hbase::hbase_master`; `zookeeper::server`, not `zookeeper::zookeeper_server`.
* Use only `[a-z0-9_]` for cookbook and component names. Do not use capital letters or dashes. Keep names to fewer than 15 characters.

### Cookbook Dependencies

* Dependencies should be announced in metadata.rb, of course.
* *DO* remember to explicitly `include_recipe` for system resources -- `runit`, `java`, `provides_service`, `thrift` and `apt`.
* *DO NOT* use `include_recipe` unless putting it in the role would be utterly uninteresting. You *want* the run to break unless it's explicitly included in the role.
  - *yes*: `java`, `ruby`, `provides_service`, etc.
  - *no*: `zookeeper::client`, `nfs::server`, or anything that will start a daemon

  Remember: ordinary cookbooks describe systems; roles and integration cookbooks coordinate them.
* `include_recipe` statements should only appear in recipes that are entry points. Recipes that are not meant to be called directly should assume their dependencies have been met.
* If a recipe is meant to be the primary entrypoint, it *should* include default, and it should do so explicitly: `include_recipe 'foo::default'` (not just `'foo'`).

### Templates

* *DO NOT* use `node[:foo]` in your templates except in rare circumstances. Instead, say `variables :foo => node[:foo]`; this lets folks use that cookbook from elsewhere.

### Attributes

* Scope concerns by *cookbook* or *cookbook and component*. `node[:hadoop]` holds cookbook-wide concerns, `node[:hadoop][:namenode]` holds component-specific concerns.
* Attributes shared by all components sit at cookbook level, and are always named for the cookbook: `node[:hadoop][:log_dir]` (since it is shared by all its components).
* Component-specific attributes sit at component level (`node[:cookbook_name][:component_name]`): eg `node[:hadoop][:namenode][:service_state]`. Do not use a prefix (NO: `node[:hadoop][:namenode_handler_count]`).

#### Attribute Files

* The main attribute file should be named `attributes/default.rb`.
* If there are a sizeable number of tunable attributes (hadoop, cassandra), place them in `attributes/tuneables.rb`.
* ?? Place integration attribute *hooks* in `attributes/integration.rb` ??
* Be generic when you're *simple and alone*, descriptive when you're not.
  - If a component has only one log file, call it 'log_file': `node[:foo][:server][:log_file]` and in general do not use a prefix.
  - If a component has more than one log_file, *always* use a prefix: `node[:foo][:server][:dashboard_log_file]` and `node[:foo][:server][:gc_log_file]`.
* If you don't have exactly the semantics and datatype of the convention, don't use the convention. That is, don't use `:port` and give it a comma-separated string, or `:addr` and give it an email address.
* (*this advice will change as we figure out integration rules*: use `foo_client` when you are a client of a service: so `[:rails][:mysql_client][:host]` to specify the hostname of your mysql server.)

## Attribute Names

### Universal Aspects

### File and Dir Aspects

A *file* is the full directory and basename for a file. A *dir* is a directory whose contents correspond to a single concern. A *root* is a prefix not intended to be used directly -- it will be decorated with suffixes to form dirs and files. A *basename* is only the leaf part of a file reference. Don't use the terms 'path' or 'filename'.

Ignore the temptation to make a one-true-home-for-my-system, or to fight the package maintainer's choices.

#### Application

* **home_dir**: Logical location for the cookbook's system code.
  - default: typically, leave it up to the package maintainer. Otherwise, `:prefix_root/share/:cookbook` should be a symlink to the `install_dir` (see below).
  - instead of: `xx_home` / `dir` alone / `install_dir`
* **prefix_root**: A container with directories bin, lib, share, src, to use according to convention
  - default: `/usr/local`.
* **install_dir**: The cookbook's system code, in case the home dir is a pointer to potential alternates.
  - default: `:prefix_root/share/:cookbook-:version` (if you don't need the directory after the cookbook runs, use `:prefix_root/src/:cookbook-:version` instead, eg `/usr/local/src/tokyo_tyrant-xx.xx`)
  - Make `home_dir` a symlink to this directory (eg home_dir `/usr/local/share/elasticsearch` links to install_dir `/usr/local/share/elasticsearch-0.17.8`).
* **src_dir**: holds the compressed tarball, its expanded contents, and the compiled files when installing from source. Use this when you will run `make install` or equivalent and use the files elsewhere.
  - default: `:prefix_root/src/:system_name-:version`, eg `/usr/local/src/pig-0.9`
  - do not: expand the tarball to `:prefix_root/src/(whatever)` if it will actually be used from there; instead, use the `install_dir` convention described above. (As a guideline, I should be able to blow away `/usr/local/src` and everything still works).
* **deploy_dir**: deployed code that follows the capistrano convention. See more about deploy variables below.
  - the `:deploy_dir/shared` directory holds common files
  - releases are checked out to `:deploy_dir/releases/{sha}`
  - the operational release is a symlink to the right release: `:deploy_dir/current -> :deploy_dir/releases/xxx`.
  - do not: use this when you mean `home_dir`.
* **scratch_roots**, **persistent_roots**: an array of directories spread across volumes, with expectations on persistence
  - `scratch_root`s have no guarantee of persistence -- for example, stop/start'ing a machine on EC2 destroys the contents of its local (ephemeral) drives. `persistent_root`s have the *best available* promise of persistence: if permanent (eg EBS) volumes are available, they will exclusively populate the `persistent_root`s; but if not, the ephemeral drives are used instead.
  - these attributes are provided by the `mountable_volume` meta-cookbook and its appropriate integration recipe.
    Ordinary cookbooks should always trust the integration cookbook's choices (or visit the integration cookbook to correct them).
  - each element in `persistent_roots` is by contract on a separate volume, and similarly each of the `scratch_roots` is on a separate volume. A volume *may* be in both scratch and persistent (for example, there may be only one volume!).
  - the singular forms **scratch_root** and **persistent_root** are provided for your convenience and always correspond to `scratch_roots.first` and `persistent_roots.first`. This means the first named volume gets picked on the heaviest -- if you don't like that, choose explicitly (but not randomly, or you won't be idempotent).
* **log_file**, **log_dir**, **xx_log_file**, **xx_log_dir**:
  - default:
    - if the log files will always be trivial in size, put them in `/var/log/:cookbook.log` or `/var/log/:cookbook/(whatever)`.
    - if it's a runit-managed service, leave them in `/etc/sv/:cookbook-:component/log/main/current`, and make a symlink from `/var/log/:cookbook-:component` to `/etc/sv/:cookbook-:component/log/main/`.
    - if the log files are non-trivial in size, set log dir `/:scratch_root/:cookbook/log/`, and symlink `/var/log/:cookbook/` to it.
    - if the log files should be persisted, place them in `/:persistent_root/:cookbook/log`, and symlink `/var/log/:cookbook/` to it.
    - in all cases, the directory is named `.../log`, not `.../logs`. Never put things in `/tmp`.
    - use the physical location for the `log_dir` attribute, not the `/var/log` symlink.
* **tmp_dir**:
  - default: `/:scratch_root/:cookbook/tmp/`
  - Do not put a symlink or directory in `/tmp` -- something else blows it away, the app recreates it as a physical directory, `/tmp` overflows, pagers go off, sadness spreads throughout the land.
* **conf_dir**:
  - default: `/etc/:cookbook`
* **bin_dir**:
  - default: `/:home_dir/bin`
* **pid_file**, **pid_dir**:
  - default: pid_file: `/var/run/:cookbook.pid` or `/var/run/:cookbook/:component.pid`; pid_dir: `/var/run/:cookbook/`
  - instead of: `job_dir`, `job_file`, `pidfile`, `run_dir`.
* **cache_dir**:
  - default: `/var/cache/:cookbook`.
* **data_dir**:
  - default: `:persistent_root/:cookbook/:component/data`
  - instead of: `datadir`, `dbfile`, `dbdir`
* **journal_dir**: high-speed local storage for commitlogs and so forth. Can be deleted, though you may rather it wasn't.
  - default: `:scratch_root/:cookbook/:component/scratch`
  - instead of: `commitlog_dir`

### Daemon Aspects

* **daemon_name**: daemon's actual service name, if it differs from the component. For example, the `hadoop-namenode` component's daemon is `hadoop-0.20-namenode` as installed by apt.
* **daemon_states**: an array of the verbs acceptable to the Chef `service` resource: `:enable`, `:start`, etc.
* **num_xx_processes**, **num_xx_threads**: the number of separate top-level processes (distinct PIDs) or internal threads to run
  - instead of `num_workers`, `num_servers`, `worker_processes`, `foo_threads`.
* **log_level**
  - application-specific; often takes values info, debug, warn
  - instead of `verbose`, `verbosity`, `loglevel`
* **user**, **group**, **uid**, **gid** -- `user` is the user name. The `user` and `group` should be strings, while the `uid` and `gid` should be integers.
  - instead of `username`, `group_name`, using uid for user name or vice versa.
  - if there are multiple users, use a prefix: `launcher_user` and `observer_user`.

### Install / Deploy Aspects

* **release_url**: URL for the release.
  - instead of: `install_url`, `package_url`, being careless about partial vs whole URLs
* **release_file**: Where to put the release.
  - default: `:prefix_root/src/system_name-version.ext`, eg `/usr/local/src/elasticsearch-0.17.8.tar.bz2`.
  - do not use `/tmp` -- let me decide when to blow it away (and make it easy to be idempotent).
  - do not use a non-versioned URL or file name.
* **release_file_sha** or **release_file_md5**: fingerprint
  - instead of: `whatever_checksum`, `whatever_fingerprint`
* **version**: if it's a simply-versioned resource that uses the `major.minor.patch-cruft` convention. Do not use unless this is true, and do not use the source control revision ID.
* **plugins**: array of system-specific plugins

Use `deploy_{}` for anything that would be true whatever SCM you're using; use `git_{}` (and so forth) where specific to that repo.

* **deploy_env**: production / staging / etc
* **deploy_strategy**
* **deploy_user**: user to run as
* **deploy_dir**: Only use `deploy_dir` if you are following the capistrano convention: see above.
* **git_repo**: url for the repo, eg `git@github.com:infochimps-labs/ironfan.git` or `http://github.com/infochimps-labs/ironfan.git`
  - instead of: `deploy_repo`, `git_url`
* **git_revision**: SHA or branch
  - instead of: `deploy_revision`
* **apt/{repo_name}**: Options for adding a cookbook's apt repo.
  - Note that this is filed under *apt*, not the cookbook.
  - Use the best name for the repo, which is not necessarily the cookbook's name: eg `apt/cloudera/{...}`, which is shared by hadoop, flume, pig, and so on.
  - `apt/{repo_name}/url` -- eg `http://archive.cloudera.com/debian`
  - `apt/{repo_name}/key` -- GPG key
  - `apt/{repo_name}/force_distro` -- forces the distro (eg, you are on natty but the apt repo only has maverick)

### Ports

* **xx_port**:
  - *do not* use 'port' on its own.
  - examples: `thrift_port`, `webui_port`, `zookeeper_port`, `carbon_port` and `whisper_port`.
  - xx_port: `default[:foo][:server][:port] = 5000`
  - xx_ports, if an array: `default[:foo][:server][:ports] = [5000, 5001, 5002]`
* **addr**, **xx_addr**
  - if all ports bind to the same interface, use `addr`. Otherwise, do *not* use `addr`, and use a unique `foo_addr` for each `foo_port`.
  - instead of: `hostname`, `binding`, `address`
* Want some way to announce my port is http or https.
* Need to distinguish client ports from service ports. You should be using cluster service discovery anyway though.

### Application Integration

* **jmx_port**

### Tunables

* **xx_heap_max**, **xx_heap_min**, **java_heap_eden**
* **java_home**
* AVOID **java_opts** if possible: assemble it in your recipe from intelligible attribute names.

### Nitpicks

* Always put file modes in quote marks: `mode "0664"` not `mode 0664`.

## Announcing Aspects

If your app does any of the following, announce it:

* **services** -- Any interesting long-running process.
* **ports** -- Any reserved open application port
  - *http*: HTTP application port
  - *https*: HTTPS application port
  - *internal*: port is on private IP, should *not* be visible through public IP
  - *external*: port *is* available through public IP
* metric_ports:
  - **jmx_ports** -- JMX diagnostic port (announced by many Java apps)
* **dashboards** -- Web interface to look inside a system; typically internal-facing only, and probably not performance-monitored by default.
* **logs** -- um, logs. You can also announce the logs' flavor: `:apache`, `:log4j`, etc.
* **scheduleds** -- regularly-occurring events that leave a trace
* **exports** -- jars or libs that other programs may wish to incorporate
* **consumes** -- placed there by any call to `discover`.

### Dummy aspects

Integration cookbooks that announce as

* Elastic Load Balancers

## Clusters

* Describe physical configuration:
  - machine size, number of instances per facet, etc
  - external assets (elastic IP, ebs volumes)
* Describe high-level assembly of systems via roles: `hadoop_namenode`, `nfs_client`, `flume_agent`, etc.
* Describe important modifications, such as `ironfan::system_internals`, mounts ebs volumes, etc
* Describe override attributes:
  - `heap size`, rvm versions, etc.
* roles and recipes
  - remove `cluster_role` and `facet_role` if empty
  - are not in `run_list`, but populated by the `role` and `recipe` directives
* remove big_package unless it's a dev machine (sandbox, etc)

## Roles

Roles define the high-level assembly of recipes into systems.

* Override attributes go into the cluster. Currently, those files are typically empty and are badly cluttering the `roles/` directory. The cluster and facet override attributes should be together, not scattered in different files. Roles shouldn't assemble systems. The contents of the `infochimps_chef/roles/plato_truth.rb` file belong in a facet.
* Deprecated:
  - Cluster and facet roles (`roles/gibbon_cluster.rb`, `roles/gibbon_namenode.rb`, etc) go away
  - roles should be service-oriented: `hadoop_master` considered harmful, you should explicitly enumerate the services
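
As a recap of the File and Dir Aspects defaults from earlier in this guide, here is a minimal sketch in plain Ruby (no Chef required) of how those default locations compose. The `conventional_dirs` helper and its keyword arguments are hypothetical names made up for illustration; they are not part of Chef or Ironfan. The `scratch_root` and `persistent_root` values are placeholder assumptions, since in practice they come from the `mountable_volume` integration.

```ruby
# Hypothetical helper (NOT part of Chef or Ironfan): builds the default
# locations described in this guide for one cookbook and component.
def conventional_dirs(cookbook:, component:, version:, ext: 'tar.gz',
                      prefix_root: '/usr/local',
                      scratch_root: '/mnt/scratch',        # assumed example value
                      persistent_root: '/mnt/persistent')  # assumed example value
  {
    # home_dir is a symlink to the versioned install_dir
    home_dir:     File.join(prefix_root, 'share', cookbook),
    install_dir:  File.join(prefix_root, 'share', "#{cookbook}-#{version}"),
    # versioned release file, never in /tmp
    release_file: File.join(prefix_root, 'src', "#{cookbook}-#{version}.#{ext}"),
    conf_dir:     File.join('/etc', cookbook),
    pid_dir:      File.join('/var/run', cookbook),
    pid_file:     File.join('/var/run', cookbook, "#{component}.pid"),
    cache_dir:    File.join('/var/cache', cookbook),
    # the "non-trivial size, not persisted" log case: physical dir on scratch
    log_dir:      File.join(scratch_root, cookbook, 'log'),
    tmp_dir:      File.join(scratch_root, cookbook, 'tmp'),
    data_dir:     File.join(persistent_root, cookbook, component, 'data')
  }
end

dirs = conventional_dirs(cookbook: 'elasticsearch', component: 'server',
                         version: '0.17.8', ext: 'tar.bz2')
dirs.each { |name, path| puts format('%-13s %s', name, path) }
# home_dir      /usr/local/share/elasticsearch
# install_dir   /usr/local/share/elasticsearch-0.17.8
# release_file  /usr/local/src/elasticsearch-0.17.8.tar.bz2
# ...
```

In a real cookbook these values would live in `attributes/default.rb` under `default[:cookbook][:component]`; the sketch only shows how the pieces fit together.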