h1. Getting Started Building Your Own Hydra Application h2. Before You Begin *!!! This is a WORK IN PROGRESS !!!* This tutorial is a work in progress. We are actively seeking feedback. If you run into _anything_ that is incorrect or confusing, please email the "hydra-tech":http://groups.google.com/group/hydra-tech mailing list and let us know. h2. What you will learn from this document Set up a Hydra Head (a Rails app that uses the hydra-head plugin) and run it. Define a new Content Type called JournalArticle. # Define the ActiveFedora Model for JournalArticles # Index JournalArticles into solr # Customize how JournalArticles appear in search results # Define JournalArticles “show” view # Define JournalArticles “edit” view h2. Set up a new Hydra Head See the "README":http://hudson.projecthydra.org/job/hydra-head-rails3-plugin/Documentation/file.README.html for instructions on creating a new Hydra Head We will be using both RSpec and Cucumber in this Tutorial., so make sure to follow the instructions in the "README":http://hudson.projecthydra.org/job/hydra-head-rails3-plugin/Documentation/file.README.html under "RSpec and Cucumber for Testing". h3. Starting the App out of the Box h4. (1) Migrate the development databases. From the rails application root directory, run
rake db:migrateDon't worry if you've already run the migrations. It's safe to re-run rake db:migrate. h4. (2) Start Jetty, pre-configured with Fedora and Solr _Stop any copies of jetty (or anything else using port 8983) before running this command._ (Note that java 1.6 must be invoked by the "java" command or Fedora won't work.)
rake hydra:jetty:config rake hydra:jetty:startThis will start up a pre-configured jetty instance running Fedora and Solr on port 8983. Ensure Solr is running: "http://localhost:8983/solr/development/admin":http://localhost:8983/solr/development/admin should show the Solr admin page. You should be able to click on the Statistics link (or go to "http://localhost:8983/solr/development/admin/stats.jsp":http://localhost:8983/solr/development/admin/stats.jsp) and there should be more than 0 documents in the index if you loaded the features properly. Or look for documents in http://localhost:8983/solr/development/select?q={!defType=dismax}*:* Ensure Fedora is running: "http://127.0.0.1:8983/fedora/describe":http://127.0.0.1:8983/fedora/describe should show the Fedora Repository Information View. The following query should return objects: "http://localhost:8983/fedora/objects?pid=true&title=true&terms=&query=&maxResults=20":http://localhost:8983/fedora/objects?pid=true&title=true&terms=&query=&maxResults=20 h4. (3) Start the Rails Application
rails serverTo ensure the rails app is running, go to "http://localhost:3000/":http://localhost:3000/ On some systems, the http://localhost:3000 address will be broken because WEBrick binds to 0.0.0.0 rather than 127.0.0.1. To get around this, you can tell the server which address to bind to using the -b flag. ie. "rails server -b 127.0.0.1" If Rails still does not seem to be working properly, consult the output on the command line to see if there are any errors. h3. Ensure your Hydra App is set up properly: Import Sample Content and Run Tests! h4. (1) Import some Sample Content At some point, we hope to create actual Sample objects for hydra-head. In the meantime, we're re-using the Fixture objects that we use to test the hydra-head gem. These fixture objects are not designed to look nice. Instead, they are designed to exercise various aspects of the hydra-head code. If you're interested in creating and contributing some sample objects for us to distribute with the hdyra-head code and use in this tutorial. Please email the "hydra-tech":http://groups.google.com/group/hydra-tech mailing list and let us know. The hydra-head fixture objects are useful for getting a sense of how hydra works and they will allow you to run the hydra-head gem's test suite in order to make sure that everything was set up properly. You will want to delete these fixture objects later and create your own fixtures to run your application's tests against. First, use the hydra:hyhead_fixtures generator to copy the fixture xml files into test_support/fixtures
rails g hydra:hyhead_fixturesNow use the hydra:fixtures:refresh rake task to import the fixtures into Fedora and Solr
rake hydra:fixtures:refreshYou must run this a second time to load the fixtures into the copies of Fedora & Solr that your tests use.
rake hydra:fixtures:refresh RAILS_ENV=test*IMPORTANT* If you look in config/fedora.yml and config/solr.yml, you will see that these files point to different urls for development, test, and production. This allows you to keep test data, development data, and production data separate by setting the _Rails environment_ accordingly. Rails defaults to using the _test_ environment when you run tests and using the _development_ environment when you run the app using "rails server". We want these fixture objects to be available in both environments because we want to run tests against them and we want to poke through them in our browser. That's why we imported the fixtures once without RAILS_ENV set (defaults to 'development') and once with RAILS_ENV set to test. h4. (2) Run the Hydra-Head gem's RSpec Tests
rake hyhead:specThis will print out a lot of information while the tests run. If all the tests passed, you will see something like this (the important part is "0 failures"):
... Finished in 22.53 seconds 240 examples, 0 failures, 19 pendingh4. (3) Run the Hydra-Head gem's Cucumber Tests Before running the cucumber tests, you have to remove the default index.html file that Rails provides
rm public/index.htmlNow run the cucumber tests
rake hyhead:cucumberThis will print out a lot of information while the tests run. If all the tests passed, you will see something like this (the important part is that the number of scenarios equals the number of tests passed):
37 scenarios (37 passed) 215 steps (215 passed) 0m45.591sh3. Exploring what comes out of the box h4. (1) Create a User Account and Log into the Hydra Head Blacklight & the hydra-head plugin provide a basic user authentication that is meant to be replaced by code that connects to your institutional authentication system like LDAP or Shibboleth. While setting up this sample hydra-head, you need to create user accounts in that system. To do this, run the application, visit http://localhost:3000 and click on "login" then click on "create an account" below the login form. In many of the hydra-head fixtures and in some of the examples below, we gave "edit" permissions to a user called "archivist1@example.com", so create an account with that login id. Choose whatever email & password you want. Now you can log in as that user when you want to use features that require you to be logged into the hydra head. h4. (2) Explore the Application * Create sample account & log in: _use archivist1@example.com as the email address. This user has access to read and edit most of the fixtures you loaded when you ran 'rake hydra:fixtures:refresh'_ * Use facets to browse the content: example: Drill down to all of the assets associated with "TOPIC 1" * Run a search. Example: search for "article" * Look at one of the sample objects * Edit one of the sample objects: some of the objects will display an "edit" link. Click on that to edit the document. Edit some of the Descriptive Metadata and then click "Save Description", then Edit the Permissions and click "Save Permissions" * Create a new MODS article by clicking "Add a Basic MODS Asset" * Upload a file … * Change the permissions on the new article, making it visible to the public (before setting permissions, prove that you can’t see it when logged out, then change permissions, log out again & prove that it’s visible) h2. Making local changes to your Hydra Application In order to make it easy to get any new functionality added to the hydra stack (see "README":http://hudson.projecthydra.org/job/hydra-head-rails3-plugin/Documentation/file.README.html), while retaining your Hydra application's localizations, your local hydra application code should be set up to override the upstream Hydra stack code. Luckily, rails engines has made this easy - the Hydra code is organized so your localizations are kept separate from the core hydra application code. Moreover, to ensure your localizations won't be broken by upgrading the Hydra core code, we STRONGLY recommend that you write tests for every local change you make. This will allow you to ensure that upgrading the core code doesn't break any local changes you have made. So, two key points: # always write tests for your local modifications # always change code at the app level, and never change anything in vendor/plugins. "INITIAL_APP_MODS":http://hudson.projecthydra.org/job/hydra-head-rails3-plugin/Documentation/file.INITIAL_APP_MODS.html explains how to make two basic changes to your hydra rails application and how to write the appropriate tests for them. h2. The new Content Type: JournalArticle In this tutorial, we are creating a new JournalArticle content type. This will allow us to create Journal Articles in a Fedora Repository, collect custom metadata for them, index them in solr, and display that custom metadata in the user interface. In order to describe our Fedora objects, we can use whatever metadata schemas suit our needs. Some metadata schemas currently being used in Hydra Heads include MODS, Dublin Core, EAD, PBcore, EAC-CPF and VRE. This list continues to grow as people set up Hydra Heads to deal with their own specialized content. The JournalArticle content type will use MODS (handily available in hydra-head) to track descriptive metadata about Articles. Some of the metadata is common to many types of content: * title * author (first name, last name, role) * abstract Other metadata fields are more specifically relevant to journal articles, but they still fit into the MODS schema: * journal title * publication date * journal volume * journal issue * start page * end page In addition to the MODS metadata, JournalArticle objects will use Hydra Rights Metadata to track information about licenses, rights, and which people/groups should be able to discover, view, and/or edit each Journal Article. h2. Define the ActiveFedora Model for JournalArticles The first thing to do when adding a new content type is to create the ActiveFedora Model. This model is a Ruby class that uses ActiveFedora to tell the application the structure of your content and its metadata. The model we create can be used in any application with ActiveFedora and OM (Opinionated Metadata), not just in a Hydra Head. For example, ActiveFedora models can be used in batch scripts, command line utilities, and robots that perform automated actions on your fedora objects based on information and behaviors stored in their ActiveFedora models. h3. Where to find the full tutorial on defining ActiveFedora Models This tutorial will only cover the very basics of defining an ActiveFedora Model and defining OM terminologies for its XML datastreams. For a complete introduction to those topics, see the "ActiveFedora Console Tour":http://hudson.projecthydra.org/job/active_fedora/Documentation/file.CONSOLE_GETTING_STARTED.html and the tutorials included in the "OM documentation":http://hudson.projecthydra.org/job/om/Documentation/. h3. Tests to Define Expected Behaviors (MUST HAVE EXAMPLES HERE PERTINENT TO EXPECTED JOURNAL ARTICLE BEHAVIORS) h4. Run the tests The test code should execute but the tests should fail because we haven't written the code yet. h3. Defining the Datastreams A Fedora object is made up of any number of _datastreams_. Datastreams can have content of any type and each datastream is identified by a _datastream id_ or _dsid_. The ActiveFedora model tells us which datastreams to expect or create in an object and tells us what kind of content is expected inside each datastream. For our JournalArticle model, we're particularly interested datastreams with XML content because a JournalArticle object is basically a container for XML metadata. The actual content of the Article (PDF,text,whatever.) will be stored in a separate Fedora object, a _primitive_, with the RDF isPartOf relationship connecting the JournalArticle (primarily metadata) to its content (any number of primitives with files in them). For more information about datastreams, primitives, and where the actual content of an object lives, see the Reference links at the end of this tutorial. h4. descMetadata -- the Descriptive Metadata Datastream Our MODS xml will go into a datastream with the datastream id of _descMetadata_. Technically, we could give it any name we want but the Hydra community has come up with some conventions to make things simpler. One of these conventions is to always put descriptive metadata in a datastream called descMetadata. As we said above, we want to create MODS metadata that keeps track of title, author (first name/last name/role), publication date, abstract, journal title, journal volume, journal issue, start page and end page. In order to do this we will use ActiveFedora to define a special type of Ruby object that uses OM to read and modify XML. Example of the MODS XML we will be creating:
The constraints of a metadata schema sometimes force us to put information into structures that don't map directly to the vocabulary that we use when talking about that information. The "title" has ended up in a spot that you might call the "mods titleInfo title" and the "start page" has ended up in a spot that you might call theARTICLE TITLE FAMILY NAME GIVEN NAMES Creator ABSTRACT TITLE OF HOST JOURNAL 2 2 195 230 FEB. 2007
"mods host-relatedItem part pages-extent start"or
"//mods/relatedItem[@type=host]/part/extent[@unit='pages']/start"... quite the mouthful, isn't it? This is where OM comes in. OM lets us define a Terminology that will allow us to access values in the xml based on our natural vocabulary (title, start page, etc.) while also allowing us to access those same values using the more cumbersome XML-speak when necessary. h5. WRITE TESTS HERE! (for the expected behaviors of a descMetadata datastream model?) specs to test model? features? h5. Coding the descMetadata Datastream model Here's how we define the datastream class for the descMetadata. Notice that we use set_terminology which defines its OM Terminology. Create a new file in app/models called journal_article_mods_datastream.rb and put this into it (NOTE: you could also save this file as lib/journal_article_mods_datastream.rb and get the same results):
# a Fedora Datastream object containing Mods XML for the descMetadata # datastream in the Journal Article hydra content type, defined using # ActiveFedora and OM. class JournalArticleModsDatastream < ActiveFedora::NokogiriDatastream # OM (Opinionated Metadata) terminology mapping for the mods xml set_terminology do |t| t.root(:path=>"mods", :xmlns=>"http://www.loc.gov/mods/v3", :schema=>"http://www.loc.gov/standards/mods/v3/mods-3-2.xsd") t.title_info(:path=>"titleInfo") { t.main_title(:index_as=>[:facetable],:path=>"title", :label=>"title") } t.author { t.first_name(:path=>"namePart", :attributes=>{:type=>"given"}) t.last_name(:path=>"namePart", :attributes=>{:type=>"family"}) t.role { t.text(:path=>"roleTerm",:attributes=>{:type=>"text"}) } } t.abstract t.journal(:path=>'relatedItem', :attributes=>{:type=>"host"}) { t.title_info(:ref=>[:title_info]) t.issue(:path=>"part") { t.volume(:path=>"detail", :attributes=>{:type=>"volume"}, :default_content_path=>"number") t.level(:path=>"detail", :attributes=>{:type=>"number"}, :default_content_path=>"number") t.pages(:path=>"extent", :attributes=>{:unit=>"pages"}) { t.start t.end } t.date } } # these proxy declarations allow you to use more familiar term/field names that hide the details of the XML structure t.title(:proxy=>[:title_info, :main_title]) t.start_page(:proxy=>[:journal, :issue, :pages, :start]) t.end_page(:proxy=>[:journal, :issue, :pages, :end]) t.publication_date(:proxy=>[:journal, :issue, :date]) end # set_terminology end # classThis will allow you to access the metadata values like this when dealing with a JournalArticleModsDatastream object:
ds.term_values(:author, :first_name) ds.term_values(:start_page) ds.term_values(:journal, :issue, :pages, :start) # same result as the previous line, but more detailed reference to the xml node(CAN WE DO TESTING NOW? OR DO WE NEED TO DEAL WITH THE ENTIRE FEDORA OBJECT BEFORE TESTING?) h4. rightsMetadata -- The Hydra Rights Metadata Datastream The hydra-head plugin provides a class definition for the rightsMetadata datastream, so you won't have to define the OM Terminology yourself. The definition is in the hydra-head plugin code in "lib/hydra/rights_metadata.rb":https://github.com/projecthydra/hydra-head/blob/master/lib/hydra/rights_metadata.rb. Here's an example of what rightsMetadata XML looks like:
h4. RELS-EXT & DC Datastreams (Fedora defaults) There are two special datastreams that Fedora creates for you -- the DC datastream and the RELS-EXT datastream. We don't really use the DC datastream. It contains simple Dublin Core metadata that mainly exists for Fedora's internal use. The RELS-EXT datastream contains RDF representing the relationships between the objects in a Fedora repository. ActiveFedora and Hydra both use these RDF relationships in a number of ways. For more information about how to work with RDF relationships in ActiveFedora, see the ActiveFedora documentation links at the end of this tutorial. h3. Defining the Model and adding the Datastreams to it Create a file in app/models called journal_article.rb and put these lines into it:(c)2009 The Hydra Project Blah Blah More blah
This work is in the Public Domain.hydra-policy:4502 public public researcher1 archivist
# a Fedora object for the Journal Article hydra content type class JournalArticle < ActiveFedora::Base include Hydra::ModelMethods has_metadata :name => "descMetadata", :type=> JournalArticleModsDatastream has_metadata :name => "rightsMetadata", :type => Hydra::RightsMetadata end(CAN WE DO TESTING NOW?) h3. Test Your Work THE TESTS WE WROTE ABOVE SHOULD PASS NOW ... (specs? if features ... jetty must be running?) h3. Loading your new Model in the Rails App If you have put the new files you've created into app/models and/or the lib directory, then they will be automatically loaded by the application when you run it using 'rails server' or 'rails console'. h2. Play with JournalArticle objects in the Rails Console First, rather than using the UI of the Rails app, let's interact directly with the code for a bit. If they are not already running, start up Fedora and Solr on port 8983:
rake hydra:jetty:startNow bring up the Rails console (command line interface for interacting with your Rails app)
rails consoleh3. (1) Create a new JournalArticle Object In the console, create a JournalArticle object by entering these lines:
article = JournalArticle.new article.save # In case you're curious, explore a bit... article.pid article.datastreams.keys article.datastreams["descMetadata"].class article.datastreams["rightsMetadata"].class article.relationshipsh3. (2) Set Basic Access Controls on the JournalArticle Object The access controls in the hydra-head plugin work on the assumption that you will put your access controls information into a datastream called rightsMetadata and then index the values from that datastream into Solr. The plugin provides all the tools that you need in order to do this. Most of the work is handled for you by this line in your model:
has_metadata :name => "rightsMetadata", :type => Hydra::RightsMetadataThe plugin defaults to *denying access* when the permissions are unclear. This means that in cases where there is no rightsMetadata datastream or the information in the datastream is unclear or incomplete, the Hydra head will act like the object does not exist. _The object will not show up in search results and you will not be able to view it._ To give yourself permission to view and edit an object, you need to set those permissions. For example, if we want the object to show up in search results for everyone but we only want a user called archivist1@example.com to edit it, we can do the following:
article = JournalArticle.load_instance("changeme:30") article.permissions({"group"=>"public"}, "read") article.permissions({"person"=>"archivist1@example.com"}, "edit") article.savehydra-head provides you with tools to expose a permissions editor to your end-users. This allows them to control who can discover/read/download/edit the objects they create. h3. (3) Save the Changes into Fedora Any time you change an object, you must tell the application when to save those changes to Fedora by calling .save
article.saveh3. How to Explicitly (re)Index the JournalArticle in Solr By default, ActiveFedora automatically updates Solr whenever you save an object to Fedora. If you want to re-index it, all you have to do is load an instance of the object and call its update_index method. For example, if you wanted to update an object that has a pid "changeme:30", you could do it like this:
article = JournalArticle.load_instance("changeme:30") article.update_indexInternally, the update_index method uses Solrizer to get the object into Solr. Solrizer is specifically designed to help you keep your solr indexes up to date. "Solrizer Documentation":http://hudson.projecthydra.org/job/solrizer/Documentation/ "Solrizer on GitHub":https://github.com/projecthydra/solrizer Solrizer is able to figure out how to index XML based on the OM Terminologies you've defined. It defaults to indexing nearly everything in generic ways, but there are many ways to customize this behavior. See the "Solrizer Documentation":http://hudson.projecthydra.org/job/solrizer/Documentation/ for more information about how it works. h2. Defining Custom Views for JournalArticles Now that you've created a couple objects and indexed them, you can view them in the application. Remember, _objects will not show up in search results unless the permissions are set properly in rightsMetadata_. h3. Add a link to create an Article First, we need to add a link to the Hydra Head that lets you create Journal Articles. To do this, you need to override the _add_asset_links view partial. Here's the cucumber test for what you want:
Feature: Button to Add Articles In order to create Journal Articles As a person with submit permissions I want to see a button for adding articles Scenario: button to add articles on home page Given I am logged in as "archivist1@example.com" When I am on the home page Then I should not see "Add a MODS Asset" And I should see "Add an Article"hydra-head puts a list of "add asset" links into the user_util_links section of the page. This list is defined in app/views/_add_asset_links.html.erb. By default, this list includes links for adding Images, MODS Assets and Generic Content. We want it to have just one link -- create an Article. To override the list, create a file at app/views/_add_asset_links.html.erb and put this into it:
Now start your application and visit http://localhost:3000 h3. Default Views JournalArticles will show up with the default views until you create custom views for that content type search results, facets detail & edit views h3. Customize how JournalArticles appear in search results h3. Change the Search Result Headings h3. Define JournalArticles “show” view h3. Define JournalArticles “edit” view displaying MODS edit rednering permissions_edit views rendering file uploader h3. The File List -- show & edit h2. Miscellaneous * Link to info on overriding styling * Overriding Helpers h2. Reference h3. Documentation & More Tutorials h4. API Docs ActiveFedora API Docs OM API Docs h4. Tutorials "ActiveFedora Console Tour":http://hudson.projecthydra.org/job/active_fedora/Documentation/file.CONSOLE_GETTING_STARTED.html "OM-based NokogiriDatastreams":http://hudson.projecthydra.org/job/active_fedora/Documentation/file.NOKOGIRI_DATASTREAMS.html "OM Documentation":http://hudson.projecthydra.org/job/om/Documentation/ h3. Hydra Modeling Conventions h4. Reference info on the Hydra Wiki "Hydra objects, content models(cModels) and disseminators":https://wiki.duraspace.org/display/hydra/Hydra+objects%2C+content+models+%28cModels%29+and+disseminators "Don’t Call it a Content Model":https://wiki.duraspace.org/display/hydra/Don%27t+call+it+a+%27content+model%27%21 h4. genericContent cModel * descMetadata (required): Descriptive Metadata like MODS or DC. MODS is recommended * rightsMetadata (recommended): Rights, License, and Permissions information. Hydra rightsMetadata XML is recommended. h4. Understanding Parts (Where will my uploaded files go?) h5. Primitives contain actual files bq. A primitive is a fundamental atom object that bears an actual file payload. They are single, (near-) universal content types which may either stand-alone or be incorporated into higher order content types (in a book or ETD, e.g.). h5. Higher Level Objects bq. Higher Level Objects represent higher-level, molecular objects that may have primitives and/or other Higher-level objects as children. h5. Relationships "isPartOf":info:fedora/fedora-system:def/relations-external#isPartOf -- Hydra reserves this predicate for use in representing part-whole relationships between objects. This occurs most often when representing which Primitives (ie. an uploaded PDF) are _part_ of a Higher Level object (ie. a MODS Article) h3. How to Add a New Facet
- <%= link_to_create_asset 'Add an Article', 'journal_article' %>