==================
Grid'5000 Tutorial
==================

This tutorial will introduce Kameleon, a tool to build software appliances.
With Kameleon it is possible to generate appliances that can be deployed on different virtualization hypervisors or on baremetal.
It targets an important activity in Grid'5000: the customization of experimental environments.

---------------
Kameleon basics
---------------

First of all, let's look at the syntax that *Kameleon* has to offer.
From this point on, we assume that *Kameleon* has been installed and is working
on your system; otherwise, go to the :ref:`installation` section.
Kameleon can be seen as a shell sequencer that boosts your shell scripts.
It is based on the execution of shell scripts, but it provides some syntactic sugar that makes
working with them less painful.
Let's start with the basics.

Kameleon Hello world
~~~~~~~~~~~~~~~~~~~~

Everything we want to build has to be specified in a recipe. Kameleon reads this recipe
and executes the appropriate actions. Let's create a hello world recipe using Kameleon.
Open a text editor and write the following::

     setup:
     - first_step:
       - hello_microstep:
         - exec_local: echo "Hello world"
     # The end

Save the file as a YAML file, for instance *hello_world.yaml*.

.. note::
    Make sure to respect the YAML syntax and indentation (see `yaml`_).

.. _yaml: http://www.yaml.org/


Then, you run it like this::

     kameleon build hello_world.yaml

You will have some output that looks like this::

      [kameleon]: Starting recipe consistency check
      [kameleon]: Resolving variables
      [kameleon]: Calculating microstep identifiers
      [kameleon]: Creating kameleon working directory : /home/cristian/Repositories/exptools/setup_complex_exp/tests/new_version/build/hello_world
      [kameleon]: Building local context [local]
      [kameleon]: Building external context [out]
      [kameleon]: Building internal context [in]
      [kameleon]: Starting build recipe 'hello_world.yaml'
      [kameleon]: Step 1 : setup/first_step/hello_microstep
      [kameleon]:  ---> Running step
      [kameleon]: Starting process: "bash"
      [local_ctx]: The local_context has been initialized
      [local_ctx]: Hello world
      [kameleon]:
      [kameleon]: Build recipe 'hello_world.yaml' is completed !
      [kameleon]: Build total duration : 0 secs
      [kameleon]: Build directory : /home/cristian/Repositories/exptools/setup_complex_exp/tests/new_version/build/hello_world
      [kameleon]: Build recipe file : /home/cristian/Repositories/exptools/setup_complex_exp/tests/new_version/build/hello_world/kameleon_build_recipe.yaml
      [kameleon]: Log file : /home/cristian/Repositories/exptools/setup_complex_exp/tests/new_version/kameleon.log

With this simple example, we have already introduced most of the Kameleon concepts and syntax.
First, recipes are structured as a hierarchy composed of sections, steps and microsteps.

* Sections: they correspond to the minimal actions that have to be performed in order to have a software
  stack that can run almost anywhere. They give Kameleon a high degree of customizability and code reuse,
  and users have total control over when and where the
  sections take place. These minimal actions are: bootstrap, setup and export.

* Steps: a step is a specific action to be done inside a section
  (e.g., software installation, network configuration, kernel configuration).
  Steps can be declared in independent files, which improves reusability.

* Microsteps: procedures composed of shell commands. Dividing steps into microsteps makes it
  possible to activate only certain actions within a step and to checkpoint at a finer granularity.

The Kameleon hierarchy encourages code reuse (shareability) and modular procedures.
The minimal building blocks are the *exec_* commands, which wrap shell commands and add
simple error handling and interactivity in case of a problem.
These commands are executed in a given context, which can be local, in or out.
They can be used as follows::

     setup:
       - first_step:
         - hello_microstep:
           - exec_local: echo "Hello world"
           - exec_in: echo "Hello world"
           - exec_out: echo "Hello world"
     # The end


* Local context: it represents the Kameleon execution environment. It is normally the user's machine.

* OUT context: it is where the appliance will be bootstrapped. Some procedures have to be carried out in
  order to create the place where the software appliance is built (the IN context).
  One example is the user's own machine using chroot:
  the setup of the chroot takes place in this context.
  Other examples are setting up a virtual machine, accessing an infrastructure in order to get a reservation and be able to deploy, setting up
  a Docker container, etc.

* IN context: it refers to the inside of the newly
  created appliance. It can be mapped to a chroot,
  virtual machine, physical machine, Linux container, etc.

In the last example, all the contexts are executed on the user's machine.
This is the default behavior, which can be customized (as shown later in this tutorial).
Most of the time, users take advantage of the *In context* in order to customize a given appliance.
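
A quick way to check where each context actually runs is to print the host name from all three of them. The sketch below only uses the *exec_* commands introduced above; the step and microstep names are made up for illustration::

     setup:
       - show_contexts:
         - print_hostnames:
           # Runs on the machine where Kameleon was launched
           - exec_local: hostname
           # Runs where the appliance is bootstrapped
           - exec_out: hostname
           # Runs inside the appliance being built
           - exec_in: hostname

With the default configuration described above, the three commands should report the same host, since all the contexts are executed on the user's machine.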

We can add variables as well::

     setup:
       - first_step:
         - message: "Hello world"
         - hello_microstep:
           - exec_local: echo "Variable value $$message"
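
A variable declared in a step can be referenced from the microsteps of that step. The following sketch assumes that this also holds when the microsteps run in different contexts; *greeting* and the step names are hypothetical::

     setup:
       - greet:
         # Variable declared once at the step level
         - greeting: "Hello from Kameleon"
         - greet_locally:
           - exec_local: echo "$$greeting (local context)"
         - greet_inside:
           - exec_in: echo "$$greeting (in context)"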


Let's apply the syntax to a real example in the next section.

----------------------------------------
Building a simple Debian based appliance
----------------------------------------

Kameleon already provides tested recipes for building different software appliances based
on different Linux flavors. We can take a look at the provided templates by typing::

     $ kameleon templates

This will output::

    The following templates are available in /home/cristian/Repositories/kameleon_v2/templates:
    NAME                 | DESCRIPTION
    ---------------------|-------------------------------------------------------------
    archlinux            | Build an Archlinux base system system.
    archlinux-desktop    | Archlinux GNOME Desktop edition.
    debian-testing       | Debian Testing base system
    debian7              | Debian 7 (Wheezy) base system
    debian7-desktop      | Debian 7 (Wheezy) GNOME Desktop edition.
    debian7-oar-dev      | Debian 7 dev appliance with OAR-2.5 (node/server/frontend).
    fedora-rawhide       | Fedora Rawhide base system
    fedora20             | Fedora 20 base system
    fedora20-desktop     | Fedora 20 GNOME Desktop edition
    old-debian7          | [deprecated] Build a debian wheezy appliance using chroot...
    ubuntu-12.04         | Ubuntu 12.04 LTS (Precise Pangolin) base system.
    ubuntu-12.04-desktop | Ubuntu 12.04 LTS (Precise Pangolin) Desktop edition.
    ubuntu-14.04         | Ubuntu 14.04 LTS (Trusty Tahr) base system.
    ubuntu-14.04-desktop | Ubuntu 14.04 LTS (Trusty Tahr) Desktop edition.
    vagrant-debian7      | A standard Debian 7 vagrant base box


Let's import the template debian7::

    $ kameleon import debian7

This will generate the following files in the current directory::

    ├── debian7.yaml
    ├── kameleon.log
    └── steps
        ├── aliases
        │   └── defaults.yaml
        ├── bootstrap
        │   ├── debian
        │   │   └── debootstrap.yaml
        │   ├── initialize_disk_qemu.yaml
        │   ├── install_bootloader.yaml
        │   ├── prepare_qemu.yaml
        │   └── start_qemu.yaml
        ├── checkpoints
        │   └── qemu.yaml
        ├── export
        │   └── save_appliance.yaml
        └── setup
            ├── create_group.yaml
            ├── create_user.yaml
            └── debian
                ├── configure_apt.yaml
                ├── configure_kernel.yaml
                ├── configure_keyboard.yaml
                ├── configure_network.yaml
                ├── configure_system.yaml
                ├── install_software.yaml
                └── upgrade_system.yaml

     8 directories, 19 files

Here we can observe that a *steps* directory has been generated.
It contains all the steps needed to build the final software appliance,
organized by section. The *checkpoints* directory
will be explained later on.

Note that the whole build process is based on step files written with Kameleon syntax.
Separating the steps into different files gives a high degree of reusability.
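
As an illustration, a step file can be pictured as a YAML list of microsteps, optionally preceded by default variables. The sketch below is a hypothetical *steps/setup/say_hello.yaml*, not one of the imported files, and its layout is an assumption derived from the recipe syntax shown earlier::

     # steps/setup/say_hello.yaml (hypothetical step file)
     # Default value, meant to be overridable from the recipe
     - message: "Hello world"

     - print_message:
       - exec_in: echo "$$message"

A recipe would then refer to the step by its file name without the extension, and could override the variable, following the pattern used later in this tutorial for *install_software*::

     setup:
       - say_hello:
         - message: "Hello from my recipe"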

The recipe looks like this:

.. literalinclude:: debian7.yaml
   :lines: 69-125
   :language: yaml

The previous recipe builds a Debian Wheezy appliance using qemu.
It looks verbose, but as a user you normally won't have to read it:
you will use it as a template, in a way that will be explained later.
The recipe specifies all the steps and configuration values that are going to be used
to build the appliance. Kameleon recipes expose many details and hide few things,
which is good for reproducibility and for reporting bugs.

If all the required dependencies, such as qemu, qemu-tools and debootstrap, are installed, we can start building the appliance
with the following command::

     $ kameleon build debian7.yaml

The build will start and, after a few minutes,
a directory called *builds* will be generated in the current directory.
It will contain a qemu virtual disk with a base Debian Wheezy installed in it,
which you can try out by executing::

     $ sudo qemu-system-x86_64 -enable-kvm builds/debian7/debian7.qcow2


--------------------------------
Customizing a software appliance
--------------------------------

Now, let's customize a given template in order to create a software appliance that has OpenMPI, Taktuk and the tools necessary to compile source code.
Kameleon allows us to extend a given template. We will use this to add the necessary software. Type the following::

     $ kameleon new debian_customized debian7

This will create the file *debian_customized.yaml*, whose contents are::

     ---
     extend: debian7

     global:
     # You can see the base template `debian7.yaml` to know the
     # variables that you can override

     bootstrap:
       - @base

     setup:
       - @base

     export:
       - @base

If we build this recipe as is, it will generate exactly the same image as before.
The idea, however, is to change it in order to install the desired software.
Therefore, we will modify the setup section like this::

     extend: debian7

     global:
     # You can see the base template `debian7.yaml` to know the
     # variables that you can override

     bootstrap:
       - @base

     setup:
       - @base
       - install_software:
         - packages: >
            g++ make taktuk openssh-server openmpi-bin openmpi-common openmpi-dev

     export:
       - @base


To build it, execute::

     $ kameleon build debian_customized.yaml

Then, you can follow the same steps as before to try it out and verify that the software was installed.
Now, let's make things a little more complicated: we will compile and install TAU in our system.
For that, let's create a step file that looks like this:


.. literalinclude:: tau_install.yaml
   :language: yaml

You have to put it under the directory *steps/setup/*, for example as *tau_install.yaml*.
In order to use it, modify your recipe as follows::

     extend: debian7

     global:
     # You can see the base template `debian7.yaml` to know the
     # variables that you can override

     bootstrap:
       - @base

     setup:
       - @base
       - install_software:
         - packages: >
            g++ make taktuk openssh-server openmpi-bin openmpi-common openmpi-dev
       - tau_install
     export:
       - @base


Rebuild the image, and you will see that the build won't start from the beginning.
It will take advantage of the checkpoint system and start from the last
successfully executed step.

During the build, the following error occurs::


     [kameleon]: Step 46 : setup/tau_install/tau_install
     [kameleon]:  ---> Running step
        [in_ctx]: Unset ParaProf's cubeclasspath...
        [in_ctx]: Unset Perfdmf cubeclasspath...
        [in_ctx]: Error: Cannot access MPI include directory /usr/local/openmpi-install/include
     [kameleon]: Error occured when executing the following command :
     [kameleon]:
     [kameleon]: > exec_in: ./configure -prefix=/usr/local/tau-install -pdt=/usr/local/pdt-install/ -mpiinc=/usr/local/openmpi-install/include -mpilib=/usr/local/openmpi-install/lib
     [kameleon]: Press [r] to retry
     [kameleon]: Press [c] to continue with execution
     [kameleon]: Press [a] to abort execution
     [kameleon]: Press [l] to switch to local_context shell
     [kameleon]: Press [o] to switch to out_context shell
     [kameleon]: Press [i] to switch to in_context shell
     [kameleon]: answer ? [c/a/r/l/o/i]:

We can observe that the problem is related to the configure script, which cannot access the MPI include path.
It can be debugged using the interactive shell provided by Kameleon,
which allows us to log into a given context.
In this case the error happened in the in context, so let's type *i* to enter this context::

  [kameleon]: User choice : [i] launch in_context
     [in_ctx]: Starting interactive shell
  [kameleon]: Starting process: "LC_ALL=POSIX ssh -F /tmp/kameleon/debian_customized/ssh_config debian_customized -t /bin/bash"
  (in_context) root@cristiancomputer: / #

The commands executed by Kameleon remain in the bash history,
so they can be re-executed manually.
In this case, we only need to change the paths of the OpenMPI headers and libraries.
As we installed OpenMPI from the packages, they are available under the directories
*/usr/include/openmpi/* and */usr/lib/openmpi/* respectively.
If we try with the following parameters::

    # ./configure -prefix=/usr/local/tau-install -pdt=/usr/local/pdt-install/ -mpiinc=/usr/include/openmpi/ -mpilib=/usr/lib/openmpi/

It will finish without any problem. We have found the bug, so we can log out by typing *exit*,
then choose *abort* to stop the execution, and update the step file with the previous line.
If you run the build again, you will see that everything now goes smoothly.
Again, Kameleon will use the checkpoint system to avoid starting from scratch.
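
For reference, the corrected microstep could look like the sketch below. Only the *configure* line is taken from the session above; the other commands of the step (download, extraction, compilation) are left out, the step file layout is an assumption, and the microstep name is inferred from the build log (*setup/tau_install/tau_install*)::

     - tau_install:
       # ... earlier commands of the microstep are omitted here ...
       # MPI paths now point to the locations used by the packages
       - exec_in: ./configure -prefix=/usr/local/tau-install -pdt=/usr/local/pdt-install/ -mpiinc=/usr/include/openmpi/ -mpilib=/usr/lib/openmpi/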

---------------------------------
Creating a Grid'5000 environment
---------------------------------

Now, let's use the extend and export functionalities to create a Grid'5000 environment.
This step shows how code can be reused with Kameleon:
we simply extend the recipe created before::

     ---
     extend: debian_customized

     global:
         # You can see the base template `debian7.yaml` to know the
         # variables that you can override

     bootstrap:
       - @base

     setup:
       - @base

     export:
       - save_appliance:
         - input: $$image_disk
         - output: $$kameleon_cwd/$$kameleon_recipe_name
         - save_as_tgz

       - g5k_custom:
         - kadeploy_file:
           - write_local:
             - $$kameleon_cwd/$$kameleon_recipe_name.yaml
             - |
               #
               # Kameleon generated based on kadeploy description file
               #
               ---
               name: $$kameleon_recipe_name

               version: 1

               os: linux

               image:
                 file: $$kameleon_recipe_name.tar.gz
                 kind: tar
                 compression: gzip

               postinstalls:
                 - archive: server:///grid5000/postinstalls/debian-x64-base-2.5-post.tgz
                   compression: gzip
                   script: traitement.ash /rambin

               boot:
                 kernel: /vmlinuz
                 initrd: /initrd.img

               filesystem: $$filesystem_type

This recipe will generate a tar.gz image and a configuration file for Kadeploy in the build directory.
For example::

     $ ls -l builds
     total 8831536
     -rw-r--r-- 1 root root 18767806464 juin  15 23:04 base_debian_g5k.qcow2
     -rw-r--r-- 1 root root   206403737 juin  15 23:04 debian_g5k.tar.gz
     -rw-r--r-- 1 root root         379 juin  15 23:04 debian_g5k.yaml
     -rw-r--r-- 1 root root         426 juin  15 23:03 fstab.orig
     -rw------- 1 root root         672 juin  15 23:01 insecure_ssh_key

Then, if we log in to a Grid'5000 site (Grenoble, for instance), we can submit a deploy job and
deploy the image using kadeploy::


  user@fgrenoble:~$ oarsub -I -t deploy
  [ADMISSION RULE] Set default walltime to 3600.
  [ADMISSION RULE] Modify resource description with type constraints
  Generate a job key...
  OAR_JOB_ID=1663465
  Interactive mode : waiting...
  Starting...

  Connect to OAR job 1663465 via the node fgrenoble.grenoble.grid5000.fr

  user@fgrenoble:~$ kadeploy -a debian_g5k.yaml -f $OAR_NODEFILE


With luck, the image will be deployed on baremetal within a few minutes.


------------------------------
Playing with Kameleon contexts
------------------------------

The environment that has just been deployed is a basic Debian.
It doesn't have the modules required for InfiniBand or the
other configuration that site administrators apply for specific hardware
or site policies.
In this case, it would be good to be able to use the environments already
provided by Grid'5000. This can be done by using Kameleon contexts.
The idea is to reuse the recipe we have written before.

Kameleon already provides a recipe for interacting with Grid'5000 where
the configuration of the contexts is as follows:

* Local context: it is the user's machine.

* Context out: it is the site frontend.
  It is used for submitting a job and deploying
  a given Grid'5000 environment.

* Context in: it is the deployed node.


First, we import the G5k recipe::

  $ kameleon import debian7-g5k

Then we can simply make a copy of our previous recipe (debian_customized) and
call it, for instance, *debian_customized_g5k.yaml*.
This recipe will look like this:

.. literalinclude:: debian_customized_g5k.yaml
   :language: yaml

However, there will be a problem with the installation of TAU, because
we download the tarball directly from its web site, which is an
operation not allowed in Grid'5000: only certain sites are accessible,
through a web proxy.
To solve this, we have to modify the step *tau_install* like this:

.. literalinclude:: tau_install_g5k.yaml
   :language: yaml

Here, we change the context in which the download is performed.
From now on, the local context downloads the
tarballs. Then we have to put them into the *in context*; for
this operation we use a pipe. Pipes are a means of communication between
contexts. Here we use a pipe between the local context and the in context.
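
The idea can be sketched as follows. This is an illustration under stated assumptions rather than the content of *tau_install_g5k.yaml*: the URL and file names are placeholders, and it assumes that *pipe* takes two *exec_* commands and sends the standard output of the first one to the standard input of the second::

     - fetch_tarball:
       # Download on the user's machine, outside Grid'5000
       - exec_local: wget -q -O /tmp/tau.tgz http://example.org/tau.tgz
       # Stream the file from the local context into the in context
       - pipe:
         - exec_local: cat /tmp/tau.tgz
         - exec_in: cat > /tmp/tau.tgz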

With those changes we will be able to build a G5k environment with
our already tested configuration. The recipe saves
the environment in the Kameleon workdir on the frontend,
so it can be deployed as many times as needed.

-------------
Atlas example
-------------

Here is a more complicated example, where we install HPL, the benchmark
used to rank supercomputers for the TOP500 list:

.. literalinclude:: atlas_debian_g5k.yaml
   :language: yaml

We have to add the following files to the *steps/setup* directory: *install_atlas.yaml* and *install_hpl.yaml*, which install Atlas and HPL respectively.

Atlas:

.. literalinclude:: install_atlas.yaml
   :language: yaml

HPL:

.. literalinclude:: install_hpl.yaml
   :language: yaml


.. note::
   Building this appliance can take around half an hour.