README.textile in solrizer-1.0.4 vs README.textile in solrizer-1.1.0

- old
+ new

@@ -1,8 +1,167 @@
h1. solrizer

A lightweight, configurable tool for indexing metadata into solr. Can be triggered from within your application, from the command line, or as a JMS listener.

+Solrizer provides the baseline and structures for the process of solrizing. In order to actually read objects from a
+datasource and write solr documents into a solr instance, you need to use an implementation-specific gem, such as
+"solrizer-fedora":https://github.com/projecthydra/solrizer-fedora, which provides the
+mechanics for reading from a fedora repository and writing to a solr instance.
+
+
+h2. Installation
+
+The gem is hosted on rubygems.org. The best way to manage the gems for your project is to use bundler. Create a Gemfile in the root of your application and include the following:
+
+<pre>
+source "http://rubygems.org"
+
+gem 'solrizer'
+</pre>
+
+Then:
+
+<pre>bundle install</pre>
+
+h2. Usage
+
+h3. Fire up the console:
+
+The code snippets in the following sections can be copied and pasted into your console, giving you the opportunity to play with Solrizer
+and explore the functionality underlying the implementation-specific gems, such as solrizer-fedora.
+
+
+
+Start up a console and load solrizer:
+
+<pre>
+irb
+require "rubygems"
+require "solrizer"
+</pre>
+
+
+h3. Field Mapper
+
+The FieldMapper maps term names and values to Solr fields, based on the term’s data type and any index_as options. Solrizer comes with default mappings (defined in config/solr_mappings.yml):
+
+<pre>
+default_mapper = Solrizer::FieldMapper::Default.new
+
+# some of the default mappings in solrizer
+default_mapper.solr_name("foo",:string) # returns foo_t
+default_mapper.solr_name("foo",:date) # returns foo_dt
+default_mapper.solr_name("foo",:integer) # returns foo_i
+default_mapper.solr_name("foo",:string,:facetable) # returns foo_facet
+default_mapper.solr_name("foo",:text,:facetable) # returns foo_facet
+default_mapper.solr_name("foo",:integer,:facetable) # returns foo_facet
+</pre>
+
+FieldMapper provides some defaults:
+<pre>
+default_mapper.solr_names_and_values("foo","bar",:string,[:facetable]) # returns searchable and facetable by default => {"foo_facet"=>["bar"], "foo_t"=>["bar"]}
+default_mapper.solr_names_and_values("foo","bar",:string,[:not_searchable, :facetable]) # returns just facetable => {"foo_facet"=>["bar"]}
+</pre>
+Which can be tweaked:
+<pre>
+default_mapper.default_index_types << :facetable
+default_mapper.solr_names_and_values("foo","bar",:string,[]) # returns searchable and facetable by default => {"foo_facet"=>["bar"], "foo_t"=>["bar"]}
+</pre>
+
+Custom mappings can also be provided (with custom converters):
+
+<pre>
+class CustomMapper < Solrizer::FieldMapper
+  index_as :searchable, :suffix => "_search" do |type|
+    type.reversed :suffix => "_reverse" do |value|
+      value.reverse
+    end
+  end
+end
+</pre>
+
+<pre>
+custom_mapper = CustomMapper.new
+
+custom_mapper.solr_names_and_values("foo","bar",:string,[:searchable]) # returns {"foo_search"=>["bar"]}
+custom_mapper.solr_names_and_values("foo","bar",:reversed,[:searchable]) # returns {"foo_reverse"=>["rab"]}
+</pre>
+
+For more detailed information on custom mappings, see the documentation for the FieldMapper class.
+
+h3. Extractor and Extractor Mixins
+
+Solrizer::Extractor provides utilities for extracting solr fields from objects or inserting solr fields into documents:
+
+<pre>
+extractor = Solrizer::Extractor.new
+
+extractor.format_node_value(["foo ","\n bar"]) # returns "foo bar"
+
+solr_doc = Hash.new
+extractor.insert_solr_field_value(solr_doc, "foo","bar") # solr_doc is now {"foo" => ["bar"]}
+extractor.insert_solr_field_value(solr_doc,"foo","baz") # solr_doc is now {"foo" => ["bar","baz"]}
+extractor.insert_solr_field_value(solr_doc, "boo","hoo") # solr_doc is now {"foo" => ["bar","baz"], "boo" => ["hoo"]}
+</pre>
+
+h4. Solrizer provides some default mixins:
+
+* Solrizer::HTML::Extractor -=> provides html_to_solr method
+* Solrizer::XML::Extractor -=> provides xml_to_solr method
+
+<pre>
+xml = "<fields><foo>bar</foo><bar>baz</bar></fields>"
+
+extractor.xml_to_solr(xml) # returns {:foo_t=>"bar", :bar_t=>"baz"}
+</pre>
+
+h4. Solrizer::XML::TerminologyBasedSolrizer
+
+Another powerful mixin for use with classes that include the OM::XML::Document module is Solrizer::XML::TerminologyBasedSolrizer.
+The methods provided by this module offer a robust way of mapping terms to solr fields via OM terminologies. A notable example
+can be found in ActiveFedora::NokogiriDatastream.
+
+
+h2. JMS Listener for Hydra Rails Applications
+
+h3. The executables: solrizer and solrizerd
+
+The solrizer gem provides two executables:
+
+ * solrizer is a stomp consumer which listens for fedora.apim.updates and solrizes (or de-solrizes) objects accordingly.
+ * solrizerd is a wrapper script that spawns a daemonized version of solrizer and handles start|stop|restart|status requests.
+
+h3. Usage
+
+The usage for solrizerd is as follows:
+
+<pre>
+ solrizerd command --hydra_home PATH [options]
+</pre>
+
+The commands are as follows:
+ * start: start an instance of the application
+ * stop: stop all instances of the application
+ * restart: stop all instances and restart them afterwards
+ * status: show status (PID) of application instances
+
+Required parameters:
+
+--hydra_home: the path to your hydra rails application's root directory. Solrizerd needs this in order to load all your models and corresponding terminologies.
+
+The options:
+ * -p, --port: Stomp port (default: 61613)
+ * -o, --host: Host to connect to (default: localhost)
+ * -u, --user: User name for stomp listener
+ * -w, --password: Password for stomp listener
+ * -d, --destination: Topic to listen to (default: /topic/fedora.apim.update)
+ * -h, --help: Display this screen
+
+Note:
+
+Because the solrizer script must fire up your hydra rails application, all the gems that your hydra instance needs must be installed.
+
+
h2. Note on Patches/Pull Requests

* Fork the project.
* Make your feature addition or bug fix.
* Add tests for it. This is important so I don't break it in a