README.rdoc in xapian_db-1.0 vs README.rdoc in xapian_db-1.1
- old
+ new
@@ -1,7 +1,11 @@
= XapianDb
+== Important Information
+
+If you upgrade from an earlier version of xapian_db to 1.1, you MUST rebuild your entire index (XapianDb.rebuild_xapian_index)!
+
== What's in the box?
XapianDb is a Ruby gem that combines features of NoSQL databases and full-text indexing into one piece. The result: Rich documents and very fast queries. It is based on {Xapian}[http://xapian.org/], an efficient and powerful indexing library.
XapianDb is inspired by {xapian-fu}[https://github.com/johnl/xapian-fu] and {xapit}[https://github.com/ryanb/xapit].
@@ -114,10 +118,18 @@
blueprint.attributes :name, :first_name, :profession
blueprint.index :notes, :remarks, :cv
blueprint.ignore_if {active == false}
end
+You can add type information to an attribute. Currently the special types :string, :date and :number are supported; a type is required for range queries:
+
+ XapianDb::DocumentBlueprint.setup(Person) do |blueprint|
+ blueprint.attribute :age, :as => :number
+ blueprint.attribute :date_of_birth, :as => :date
+ blueprint.attribute :name, :as => :string
+ end
+
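One plausible reason a type is needed for range queries is that values compared as plain strings do not sort the way numbers do. The sketch below is plain Ruby with no XapianDb calls; the zero-padded encoding is a hypothetical stand-in for whatever sortable encoding the index uses internally, and the method names are made up for illustration.

```ruby
# Illustration only: plain Ruby, no XapianDb calls.
AGES = [9, 10, 40, 100].freeze

# Numbers turned into plain strings sort character by character,
# so "100" lands before "40" and "9" comes last.
def sorted_as_strings(numbers)
  numbers.map(&:to_s).sort
end

# A fixed-width, zero-padded encoding (hypothetical, for illustration)
# makes string order agree with numeric order.
def sorted_as_padded_strings(numbers)
  numbers.map { |number| format("%010d", number) }.sort
end

puts sorted_as_strings(AGES).inspect
puts sorted_as_padded_strings(AGES).map(&:to_i).inspect
```

With untyped values, a range like 30..40 could therefore match or miss values in surprising ways, which is why the attribute is declared with <code>:as => :number</code>.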
You can override the global adapter configuration in a specific blueprint. Let's say you use ActiveRecord, but you have
one more class that is not stored in the database, but you want it to be indexed:
XapianDb::DocumentBlueprint.setup(SpecialClass) do |blueprint|
blueprint.adapter :generic
@@ -143,10 +155,14 @@
To rebuild the index for all blueprints, use
XapianDb.rebuild_xapian_index
+You can update the index for a single object, too (e.g. to reevaluate an ignore_if block without modifying and saving the object):
+
+ XapianDb.reindex object
+
=== Query the index
A simple query looks like this:
results = XapianDb.search "Foo"
@@ -178,12 +194,31 @@
On class queries you can specify order options:
results = Person.search "name:Foo", :order => :first_name
results = Person.search "Fo*", :order => [:name, :first_name], :sort_decending => true
-Please note that the order option is not available for global searches (XapianDb.search...)
+If you define an attribute with a supported type, you can do range searches:
+ XapianDb::DocumentBlueprint.setup(Person) do |blueprint|
+ blueprint.attribute :age, :as => :number
+ blueprint.attribute :date_of_birth, :as => :date
+ blueprint.attribute :name, :as => :string
+ end
+
+ result = XapianDb.search("date_of_birth:2011-01-01..2011-12-31")
+ result = XapianDb.search("age:30..40")
+ result = XapianDb.search("name:Adam..Chris")
+
+Open ranges are supported, too:
+
+ result = XapianDb.search("age:..40")
+ result = XapianDb.search("age:30..")
+
+You can combine range query expressions with other expressions:
+
+ result = XapianDb.search("age:30..40 AND city:Aarau")
+
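If your range boundaries come from Ruby objects, you may want to build the query string instead of typing it by hand. The following is a minimal sketch; <code>range_expression</code> is a hypothetical helper and not part of XapianDb. It only produces the "field:from..to" syntax shown in the examples above.

```ruby
require "date"

# Hypothetical helper, not part of XapianDb: turns a Ruby Range into the
# "field:from..to" query syntax. A nil boundary yields an open range.
# Note: the helper does not distinguish 30..40 from 30...40.
def range_expression(field, range)
  "#{field}:#{range.begin}..#{range.end}"
end

puts range_expression(:age, 30..40)   # closed range
puts range_expression(:age, (30..))   # open end, needs Ruby 2.6+
puts range_expression(:age, (..40))   # open start, needs Ruby 2.7+

# Date#to_s returns the ISO format used in the date range example.
year_2011 = Date.new(2011, 1, 1)..Date.new(2011, 12, 31)
puts range_expression(:date_of_birth, year_2011)

# The result is an ordinary string, so it can be combined with other expressions.
puts "#{range_expression(:age, 30..40)} AND city:Aarau"
```

The resulting strings can be passed to <code>XapianDb.search</code> like any other search expression.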
=== Process the results
<code>XapianDb.search</code> returns a resultset object. You can access the number of hits directly:
results.hits # Very fast, does not load the resulting documents; always returns the actual hit count
@@ -214,27 +249,35 @@
=== Facets
If you want to implement a simple drilldown for your searches, you can use a global facets query:
search_expression = "Foo"
- facets = XapianDb.facets(search_expression)
+ facets = XapianDb.facets(:name, search_expression)
+ facets.each do |name, count|
+ puts "#{name}: #{count} hits"
+ end
+
+If you want the facets based on the indexed class, use the special attribute :indexed_class:
+
+ search_expression = "Foo"
+ facets = XapianDb.facets(:indexed_class, search_expression)
facets.each do |klass, count|
puts "#{klass.name}: #{count} hits"
# This is how you would get all documents for the facet
# doc = klass.search search_expression
end
-A global facet search always groups the results by the class of the indexed objects. There is a class level facet query syntax available, too:
+A class level facet query is possible, too:
search_expression = "Foo"
facets = Person.facets(:name, search_expression)
facets.each do |name, count|
puts "#{name}: #{count} hits"
end
-At the class level, any attribute can be used for a facet query. Use facet queries on attributes that store atomic values like strings, numbers or dates.
+Any attribute declared in a blueprint can be used for a facet query. Use facet queries on attributes that store atomic values like strings, numbers or dates.
If you use it on attributes that contain collections (like an array of strings), you might get unexpected results.
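To see why atomic values work best, it helps to think of a facet query as counting how often each attribute value occurs among the hits. The sketch below is a conceptual model in plain Ruby, not how XapianDb computes facets; <code>Hit</code>, <code>HITS</code> and <code>facet_counts</code> are made-up names for illustration.

```ruby
# Conceptual sketch in plain Ruby; XapianDb is not involved here.
Hit = Struct.new(:name, :languages)

HITS = [
  Hit.new("Adam",  %w[ruby java]),
  Hit.new("Adam",  %w[ruby]),
  Hit.new("Chris", %w[java ruby])
].freeze

# Count how often each value of an attribute occurs among the hits.
# Enumerable#tally needs Ruby 2.7+.
def facet_counts(hits, attribute)
  hits.map { |hit| hit.public_send(attribute) }.tally
end

# Atomic values group as expected.
puts facet_counts(HITS, :name).inspect

# Collections are counted as whole values, so every distinct array
# (including a different element order) becomes its own facet.
puts facet_counts(HITS, :languages).inspect
```

In this model the <code>:languages</code> facet yields three groups with one hit each rather than counts per language, which is one way a collection-valued attribute can surprise you.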
=== Find similar documents
If you have a search result, you can search for similar documents by selecting one or more documents from your result and passing them to the find_similar_to method:
@@ -266,9 +309,15 @@
# change person
person.save
end
end
Person.rebuild_xapian_index
+
+== Add your own serializers for special objects
+
+XapianDb serializes objects to xapian documents using YAML by default. This way, type information is preserved and you get back what you put into a xapian document, not just a string.
+
+However, dates need special handling to support date range queries. For this reason, and to allow the addition of other custom data types in the future, XapianDb uses a simple, extensible mechanism to serialize / deserialize your objects. An example of how to extend this mechanism is provided in examples/custom_serialization.rb.
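
The two ideas above can be sketched in plain Ruby. This is not XapianDb's serializer mechanism (see examples/custom_serialization.rb for that); the method names are made up for illustration. The first part shows that a YAML round trip gives back the original type where <code>to_s</code> does not. The second part shows one property a date encoding needs for range queries: the encoded strings must sort in chronological order, which ISO 8601 strings do for four-digit years.

```ruby
require "yaml"
require "date"

# A YAML round trip keeps the Ruby type of basic values.
def yaml_round_trip(value)
  YAML.load(YAML.dump(value))
end

puts yaml_round_trip(42).class   # the Integer survives
puts 42.to_s.class               # to_s leaves you with a String

# Illustration of a range-friendly date encoding (an assumption, not
# necessarily what XapianDb stores): sort dates by their ISO 8601 strings.
def sorted_by_iso_string(dates)
  dates.sort_by(&:iso8601)
end

DATES = [Date.new(2011, 12, 31), Date.new(2011, 1, 1), Date.new(2010, 6, 15)].freeze

# String order and chronological order agree.
puts sorted_by_iso_string(DATES) == DATES.sort
```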
== Production setup
Since Xapian allows only one database instance to write to the index, the default setup of XapianDb will not work
with multiple app instances trying to write to the same database (you will get lock errors).