:markdown

  The Entity subsystem
  =================

  ### Introduction: Entities and Entity Lists

  This is the operation mode that regular users most like use. Its
  organized quite different than other applications you might know.  The
  objective is to provide a simple abstraction to organize and interconnect
  functionalities and background information. Everything is built around
  the idea of Entities. An Entity is anything that can be the subject of
  investigation: 

    * genes
    * proteins
    * transcripts
    * drugs
    * SNPs
    * samples
    * studies
    * pathways
    * chromosomal ranges
    * ...

  Each type of entity has associated a number of reports. There is an
  *Entity Report*, which displays information about an entity. If the
  entity is a gene, then the report will contain the description of the
  gene, its isoforms, functional information, and many other things. There
  is also an *Entity List Report*, which covers not a single entity, but a
  list of entities. For instance, the report for a list of genomic
  mutations includes the types of mutations (transversions, transitions,
  etc), the genes affected, damage predictions, etc. 

  Each report links to other reports. For instance, a gene report will
  include links to reports for its isoforms, the pathways it is associated
  with, etc.  This offers a way to navigate the information and pursue
  interesting leads.

  In addition to reports, there are also *Actions* associated to Entities
  and Entity Lists. These Actions are accessible from each report page.
  The Actions allow the user to issue analysis jobs centered on the current
  Entity or Entity List. For instance, when viewing the report for a gene
  lists, which may contain for example all genes mutated recurrently in a
  cohort, the Actions available include performing enrichment analysis,
  examining the mutation frequencies of the genes in COSMIC or examining
  the mutational status of these genes on other genotyping studies that you
  may have access to.

  Both Entities and Entity Lists can be marked as *Favourites*, this will
  include a link in the menu on the top of the page. Favourite Lists have
  some special features, they can be *flagged* any link that points to an
  entity in the list will be highlighted. This provides a fast an easy way
  to track a list of interest throughout your explorations.  Additionally
  some Actions may take Entity Lists as inputs, and will allow the user to
  choose among her Favourite Lists. 

  There is a third type of report, the *Entity Map Report*. Its meant as
  additional way to connect functionalities and is less important right
  now. They will cover this on the section on *Tables* and ignore them until
  then.

  All reports are unambiguously identified from its URL, including Actions,
  which allows to bookmark or share anything with your collaborators.

  #### A simple example

  A common place to start using the application is from a *Study Report*.
  The Study entity represents a collection of datasets, for instance a
  cohort of exome sequenced samples for the ICGC CLL study. The report
  includes a link to all the genes that have mutations altering their
  protein isoforms in at least two samples; clicking on that link takes as
  to the *Gene List Report* for the list with name "Recurrently mutated
  genes in CLL".  From the Gene List Report we can access the Action
  "Enrichment", where we are presented with the option of performing an
  hypergeometric-based enrichment analysis for functional annotations that
  include Kegg, The Gene Ontology, Pfam domains, Reactome, and several
  other functional information databases.

  #### A very important note

  The system must be able to unambiguously and precisely identify the
  entities. For instance the string "TP53" may seem to clearly identify a
  gene, yet this is not entirely true. First we need to know the organism
  it refers to, in this case Human. Consider now that we ask for the
  genomic coordinates of the gene, this question cannot be answered until
  we known the version of the build we need to use. By default the system
  will try the most recent version of the genome in Ensembl, but other
  builds can be specified. The complete way to specify the organism will be
  "Hsa/may2009" for the hg18 build or "Hsa/jan2013" for the latest hg19
  build. This system downloads all data consistently from Ensembl using the
  builds. Using an explicit version of the build (Hsa/jan2013) instead of
  just the organism (Hsa) will prevent problems with inconsistencies that
  could result from downloading different files at different times, where
  there may have being some updates. The organism codes follow the convention of Kegg, one
  letter from the *genus*, two from the *species*; Examples: Homo sapiens
  -- Hsa; Mus musculus -- Mmu. The dates that specify the builds represent
  the different archives of Ensembl, which is the corner stone of all our
  genomic data.

  Additionally, there are several ways to refer to the same gene; for
  instance "TP53" can also be expressed as "ENSG00000141510". So, when
  asking the system for a gene, the information on the format of the
  identifier must also be provided. Fortunately, the system will take care
  of transparently translating identifiers between different formats to
  match the formats in different resources. It will also propagate the
  information about the organism across different reports.

  Last but by all means not least: This system is *CASE-SENSITIVE* almost
  all over the place. TP53 is a Human gene, Tp53 is a Mouse gene, no wiggle
  room allowed! It might take a little getting used to, but
  case-insensitive behaviour has been avoided almost all over the system
  for performance issues. Again, the user should not have to worry about
  this most of the time, as the system will take care of most details, but
  its very important to know.

  ### User Interface: Basics

  Before we continue let us see how the user interface is organized. The
  top bar contains the application title, which links to he main page,
  the *Reload* button, the *Start* button, the *Favourite* menus, and 
  the *Search* box. 

  When a report is first requested the result is saved for further use,
  even if there was an error producing it.  This is true for entities,
  lists, actions, and almost everything else. Clicking the reload button
  on the browser will just render the saved result. To force regenerating
  the report you need to use the Reload button on the top of the page.
  An exception to this are Actions. These can be opened in separate windows,
  in which case it works as usual, but are more often opened from the 
  *Action Section* of the report, which will be covered below. The Action 
  Section has its own Reload button.

  To Star button toggles the favourite status of the current Entity or
  Entity List. It only works for Entities, Entity Lists, and Entity.  It
  has no effect on Actions, for the time being. The Favourite Menus are
  updated when the Star button is clicked. If a Favourite is made on a
  different browser tab or window, the current Favourite Menus can be
  updated by reloading the page using the browser button (the page should
  already be saved), or by clicking on the Star button, which also updates
  the current menus. Input forms and actions that have Entity Lists as
  inputs will also be updated this way.

  Some parts of the page will be loaded in the background. These include
  portions of the report that are more costly to compute, or particular
  processes that are issues by the user interacting with the page. To make
  the user aware of these, the little number on the far right of the top
  bar displays the number of processes communicating with the server on
  background.

  On small devices (tablets and phones), the top-bar will be collapsed and
  some elements hidden. To display these elements, click on the "Menu"
  button on the top right.


  ### User Interface: Report template

  Reports may have any type of content, however, they are usually based on
  a common template. It has a title on the top row, a side bar on the left,
  a description on the right, at the top, and the Actions Section below
  it.  Note that, depending on the particular report, the description may
  be empty or there might not be any actions associated to it.

  In general, the sidebar is used to display technical information about
  the Entity at hand. For a Gene Report, for instance, the sidebar is used
  to display the format used to specify the gene (such as Ensembl Gene ID),
  and the organism it refers to. In the case of gene reports, the basic
  identification information is followed with additional information about
  isoforms, functional annotations, PubMed articles, etc. On small devices
  (tablets and phones), the sidebar will hide on the right of the screen,
  and will be displayed on clicking the blue button on the title section.


  ### User Interface: Actions

  Entities and Entity Lists may have associated Actions, depending on the
  type of Entity. When Actions are available for a Report, they are displayed
  in the Actions Section. This section is composed on an horizontal bar
  with a button for each each action. When a button is clicked the action
  is displayed below the bar. If the action takes some time, a 'Loading...'
  message will appear. The bar includes a button to reload an action that
  has already been computed. If the action accepts parameters, the user
  will be required to set them. To set the parameters click on the button
  with a gear; this will display the parameter section. Of course, if the
  parameters of an action are changed, a new report will be generated; 
  Actions are saved separately for each combination of parameters. 

  For actions with large reports, it may be better to open them in a separate
  window. You can do this using the mouse-right-click, just like any regular
  link.