= Neo4j.rb

Neo4j.rb is a graph database for JRuby.

It provides:

* Mapping of Ruby objects to nodes in networks rather than in tables.
* Dynamic and schema-free - no need to declare nodes/properties/relationships in advance.
* Storage of Ruby objects on the file system.
* Fast traversal of relationships between nodes in a huge node space.
* Transactions with rollback support.
* Indexing and querying of Ruby objects.
* Migration and BatchInserter support.
* Can be used instead of ActiveRecord in Ruby on Rails or Merb.
* Nodes can be exposed as REST resources.

It uses two powerful and mature Java libraries:

* Neo4J (http://www.neo4j.org/) - for persistence and traversal of the graph.
* Lucene (http://lucene.apache.org/java/docs/index.html) - for querying and indexing.

=== Status

* There are over 500 RSpecs.
* Has been tested with Rails applications that use Neo4j.rb instead of ActiveRecord.

=== Project information

* GitHub - http://github.com/andreasronge/neo4j/tree/master
* Issue Tracking - http://neo4j.lighthouseapp.com
* Twitter - http://twitter.com/ronge
* IRC - #neo4j @ irc.freenode.net
* API Documentation - http://neo4j.rubyforge.org/ (of the released version)
* Source repo - git://github.com/andreasronge/neo4j.git
* Mailing list - http://groups.google.com/group/neo4jrb (neo4jrb@googlegroups.com)

=== Presentation Materials and other URLs

* Ruby Manor 2008 - Jonathan Conway: http://jaikoo.com/assets/presentations/neo4j.pdf
* Nordic Ruby 2010 (upcoming May 21-23) http://nordicruby.org/speakers#user_29
* Neo4j wiki - http://wiki.neo4j.org/content/Main_Page (check the guidelines and domain modeling gallery pages)

=== Contributing

Have you found a bug, do you need help or do you have a patch?
Just clone neo4j.rb and send me a pull request or email me.
Do you need help? Send me an email (andreas.ronge at gmail dot com).
Please also check/add issues at lighthouse, http://neo4j.lighthouseapp.com

=== License

* Neo4j.rb - MIT, see the LICENSE file http://github.com/andreasronge/neo4j/tree/master/LICENSE.
* Lucene - Apache, see http://lucene.apache.org/java/docs/features.html
* Neo4j - Dual free software/commercial license, see http://neo4j.org/

=== Content

This page contains the following information:

* Installation guide
* Three Minute Tutorial
* Ten Minute Tutorial
* Neo4j API Documentation
* Extensions: REST (see Neo4j::RestMixin) and find_path (Neo4j::GraphAlgo::AllSimplePaths)
* Performance issues
* Ruby on Rails with Neo4j.rb
* Lucene API Documentation

There are also some complete examples in the example folder:

* admin - an incomplete admin web GUI for the Neo4j.rb/REST interface
* railway - an example of a railway network application
* imdb - an example of a Neo4j database consisting of movie, role and actor nodes/relationships (over 18000 nodes).
* rest - an example of how to expose Neo4j nodes as REST resources
* Ruby on Rails - see http://github.com/andreasronge/neo4j-rails-example/tree/master or http://github.com/sashaagafonoff/peoplemap
* Rails3/ActiveModel integration - see http://github.com/nicksieger/neo4j-rails

For most of the examples below there are RSpecs available, check the test/neo4j/readme_spec.rb file.
== Installation

To install it:

  jruby -S gem install neo4j

To install from the latest source:

  git clone git://github.com/andreasronge/neo4j.git
  cd neo4j
  gem install bundler   # only needed for development
  bundle install        # to install all test dependencies needed for development
  gem build neo4j.gemspec
  gem install neo4j-x.y.z.gem

This has been verified to work on JRuby 1.5.2

==== Running all RSpecs

To check that neo4j.rb is working:

  cd neo4j # the folder containing the Rakefile
  rake     # you may have to type jruby -S rake depending on how you installed JRuby

= Three Minute Tutorial

A Neo node space consists of three basic elements: nodes, relationships that connect nodes, and properties attached to both nodes and relationships.
All relationships have a type. For example, if the node space represents a social network, a relationship type could be KNOWS.
If a relationship of the type KNOWS connects two nodes, that probably represents two people that know each other.
A lot of the semantics, the meaning, of a node space is encoded in the relationship types of the application.

=== Creating Nodes

Example of creating a Neo4j::Node

  require "rubygems"
  require 'neo4j'

  Neo4j::Transaction.run do
    node = Neo4j::Node.new
  end

=== Transactions

Almost all Neo4j operations must be wrapped in a transaction as shown above.
In all the following examples we assume that the operations are inside a Neo4j transaction.
There are two ways of creating a transaction - in a block or with the Transaction.new method.

Using a block:

  Neo4j::Transaction.run do
    # neo4j operations go here
  end

Using the Neo4j::Transaction.new and Neo4j::Transaction.finish methods:

  Neo4j::Transaction.new
  # neo4j operations go here
  Neo4j::Transaction.finish

=== Properties

Example of setting properties

  Neo4j::Node.new :name=>'foo', :age=>123, :hungry => false, 4 => 3.14

  # which is the same as the following:
  node = Neo4j::Node.new
  node[:name] = 'foo'
  node[:age] = 123
  node[:hungry] = false
  node[4] = 3.14
  node[:age] # => 123

=== Creating Relationships

Example of creating an outgoing Neo4j::Relationship from node1 to node2 of type friends

  node1 = Neo4j::Node.new
  node2 = Neo4j::Node.new
  Neo4j::Relationship.new(:friends, node1, node2)

  # which is the same as
  node1.rels.outgoing(:friends) << node2

=== Accessing Relationships

Example of getting relationships

  node1.rels.empty? # => false

  # The rels method returns an enumeration of relationship objects.
  # The nodes method on the relationships returns the nodes instead.
  node1.rels.nodes.include?(node2) # => true

  node1.rels.first # => the first relationship of any type that node1 has, here the one between node1 and node2
  node1.rels.nodes.first # => node2, the first node of any relationship type
  node2.rels.incoming(:friends).nodes.first # => node1, the first node of relationship type 'friends'
  node2.rels.incoming(:friends).first # => a relationship object between node1 and node2

=== Properties on Relationships

Example of setting properties on relationships

  rel = node1.rels.outgoing(:friends).first
  rel[:since] = 1982
  node1.rels.first[:since] # => 1982 (there is only one relationship defined on node1 in this example)

= Ten Minute Tutorial

=== Creating a Model

The following example specifies how to map a Neo4j node to a Ruby Person instance.

  require "rubygems"
  require "neo4j"

  class Person
    include Neo4j::NodeMixin

    # define Neo4j properties
    property :name, :salary, :age, :country

    # define a one-way relationship to any other node
    has_n :friends

    # adds a Lucene index on the following properties
    index :name, :salary, :age, :country
  end

Neo properties and relationships are declared using the 'property' and 'has_n'/'has_one' NodeMixin class methods.
New types of properties and relationships can also be added without declaring them, by using the '[]' operator
on Neo4j::NodeMixin and the '<<' operator on Neo4j::Relationships::RelationshipTraverser.

By using the NodeMixin and by declaring properties and indices, all instances of the Person class can now be stored in
the Neo4j node space and be retrieved/queried by traversing the node space or performing Lucene queries.

The Lucene index will be updated when any of the indexed properties (name, salary, age or country) changes.

=== Creating a node

Creating a Person node instance

  person = Person.new

=== Setting properties

Setting a property:

  person.name = 'kalle'
  person.salary = 10000

You can also set this (or any property) when you create the node:

  person = Person.new :name => 'kalle', :salary => 10000, :foo => 'bar'

=== Properties and the [] operator

Notice that it is not required to specify which attributes should be available on a node.
Any attribute can be set using the [] operator.
Declared properties set an expectation, not a requirement. They can be used for documenting your model objects and catching typos.

Example:

  person['an_undefined_property'] = 'hello'

So, why declare properties in the class at all? By declaring a property in the class, you get the sexy dot notation.
In addition, if you declare a Lucene index on the declared property and update the value, the Lucene index will automatically be updated.
A property must be declared before an index can be declared on it.
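
The relation between declared properties and the [] operator can be pictured with a small plain-Ruby sketch. It does not use or reproduce Neo4j.rb: ToyNodeMixin and ToyPerson are hypothetical names, and an in-memory Hash stands in for the node. The only point is that a 'property' declaration may be little more than generated accessors on top of the same [] store.

```ruby
# Plain Ruby toy, NOT Neo4j.rb code: a Hash stands in for the stored node.
module ToyNodeMixin
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Generates a dot-notation reader and writer for each declared property.
    def property(*names)
      names.each do |name|
        define_method(name) { self[name] }
        define_method("#{name}=") { |value| self[name] = value }
      end
    end
  end

  # Every property, declared or not, ends up in the same store.
  def [](key)
    props[key.to_s]
  end

  def []=(key, value)
    props[key.to_s] = value
  end

  private

  def props
    @props ||= {}
  end
end

class ToyPerson
  include ToyNodeMixin
  property :name, :salary
end

person = ToyPerson.new
person.name = 'kalle'                      # declared: dot notation works
person['an_undefined_property'] = 'hello'  # undeclared: only [] works
puts person[:name]                         # prints kalle
```

In this toy a typo such as person.nmae = 'x' raises NoMethodError, while person['nmae'] = 'x' is silently accepted. That is the "catching typos" benefit of declaring properties.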
=== Relationships

Like properties, relationships do not have to be defined using has_n or has_one for a class.
A relationship can be added at any time on any node.

Example:

  person.rels.outgoing(:best_friends) << other_node
  person.rels.outgoing(:best_friends).first.end_node # => other_node (if there is only one relationship of type 'best_friends' on person)

  # the line above can also be written as below - take the first outgoing relationship:
  person.rel(:best_friends).end_node

=== Finding Nodes and Queries

There are three ways of finding/querying nodes in Neo4j:

1. by traversing the graph
2. by using Lucene queries
3. by using the unique neo4j id (Neo4j::NodeMixin#neo_id).

When doing a traversal one starts from a node and traverses one or more relationships (one or more levels deep).
The start node can be either the reference node, which is always available (Neo4j#ref_node), or a node found by a Lucene query.

=== Lucene Queries

There are different ways to write Lucene queries.

Using a hash:

  Person.find(:name => 'kalle', :salary => 20000..30000) # find people with name kalle and salary between 20000 and 30000

or using the Lucene query language:

  Person.find("name:kalle AND salary:[10000 TO 30000]")

The Lucene query language supports wildcard, grouping, boolean and fuzzy queries, among others.
For more information see: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html

=== Sorting, example

  Person.find(:age => 25).sort_by(:salary)
  Person.find(:age => 25).sort_by(Lucene::Desc[:salary], Lucene::Asc[:country])
  Person.find(:age => 25).sort_by(Lucene::Desc[:salary, :country])

=== Search Results

The query is not performed until the search result is requested.
Example of using the search result.

  res = Person.find(:name => 'kalle')
  res.size # => 10
  res.each {|x| puts x.name}
  res[0].name = 'sune'

=== Creating Relationships

Since we declared a relationship in the example above with has_n :friends (see Neo4j::RelClassMethods#has_n)
we can use the generated methods Person#friends and Person#friends_rels.
The friends_rels method is used to access relationships and the friends method to access nodes.

Adding a relationship between two nodes:

  person2 = Person.new
  person.friends << person2

The person.friends method returns an object that has a number of useful methods (it also includes the Enumerable mixin).
Example

  person.friends.empty? # => false
  person.friends.first # => person2
  person.friends.include?(person2) # => true

=== Deleting a Relationship

To delete the relationship between person and person2:

  person.friends_rels.first.del

If a node is deleted then all its relationships will also be deleted.
Deleting a node is performed by using the del method:

  person.del

=== Node Traversals

The has_one and has_n methods create convenient methods for traversing and managing relationships to other nodes.
Example:

  Person.has_n :friends # generates the friends instance method
  # all instances of Person now have a friends method so that we can do the following
  person.friends.each {|n| ... }

Traversing using a filter

  person.friends{ salary == 10000 }.each {|n| ...}

Traversing with a specific depth (depth 1 is default)

  person.friends{ salary == 10000}.depth(3).each { ... }

There are also more powerful methods for traversing several relationships at the same time -
Neo4j::NodeMixin#traverse, Neo4j::JavaNodeMixin#outgoing and Neo4j::JavaNodeMixin#incoming, see below.

=== Example on Relationships

In the first example the friends relationship can have relationships to any other node of any class.
In the next example we specify that the 'acted_in' relationship should use the Ruby classes Actor, Role and Movie.
This is done by using the has_n class method:

  class Role
    include Neo4j::RelationshipMixin
    # notice that neo4j relationships can also have properties
    property :name
  end

  # forward declaration, since Actor refers to Movie before Movie is defined
  class Movie; end

  class Actor
    include Neo4j::NodeMixin

    # The following line defines the acted_in relationship
    # using the following classes:
    # Actor[Node] --(Role[Relationship])--> Movie[Node]
    #
    has_n(:acted_in).to(Movie).relationship(Role)
  end

  class Movie
    include Neo4j::NodeMixin
    property :title
    property :year

    # defines a method for traversing incoming acted_in relationships from Actor
    has_n(:actors).from(Actor, :acted_in)
  end

Creating a new Actor-Role-Movie relationship can be done like this:

  keanu_reeves = Actor.new
  matrix = Movie.new
  keanu_reeves.acted_in << matrix

or you can also specify this relationship on the incoming node
(since we provided that information in the has_n methods).

  keanu_reeves = Actor.new
  matrix = Movie.new
  matrix.actors << keanu_reeves

More information about neo4j can be found after the Lucene section below.

= Neo4j API Documentation

=== Starting and Stopping Neo4j

Unlike the Java Neo4j implementation it is not necessary to start Neo4j explicitly.
It will automatically be started when needed.
It also uses a hook to automatically shut down Neo4j.
Neo4j can also be shut down explicitly using the stop method, example:

  Neo4j.stop

==== Neo4j Configuration

Before using Neo4j the location where the database is stored on disk should be configured.
The Neo4j configuration is kept in the Neo4j::Config class:

  Neo4j::Config[:storage_path] = '/home/neo/neodb'

==== Accessing the Java Neo4j API

You can access the org.neo4j.kernel.EmbeddedGraphDatabase instance with

  Neo4j.instance

You can create an org.neo4j.graphdb.Node object by using the Neo4j::Node.new method (!)

  node = Neo4j::Node.new # => an instance of org.neo4j.graphdb.Node

To load a specific node by its ID (see javadoc org.neo4j.graphdb.Node.getId()), do

  node_id = node.neo_id
  loaded_node = Neo4j.load_node(node_id)

The neo_id method works both for the Neo4j::NodeMixin and for org.neo4j.graphdb.Node Java objects.

You can create a relationship of type org.neo4j.graphdb.Relationship by

  a = Neo4j::Node.new
  b = Neo4j::Node.new
  r = a.add_rel(:friends, b)
  r.java_class # => class org.neo4j.kernel.impl.core.RelationshipProxy

=== Lucene Integration

Neo4j.rb uses the Lucene module. That means that the Neo4j::NodeMixin has methods for both traversal and Lucene queries/indexing.

==== Lucene Configuration

By default Lucene indexes are kept in memory.
Keeping the index in memory will increase the performance of Lucene operations (such as updating the index).

Example of configuring Lucene to store indexes on disk instead

  Lucene::Config[:store_on_file] = true
  Lucene::Config[:storage_path] = '/home/neo/lucene-db'

==== Lucene Index in Memory

If the index is stored in memory then all nodes need to be reindexed when the application starts up again.

  MyNode.update_index # will traverse all MyNode instances and (re)create the Lucene index in memory.

=== Neo4j::NodeMixin

Neo4j::NodeMixin is a mixin that lets instances be stored as nodes in the Neo node space on disk.
A node can have properties and relationships to other nodes.

Example of how to declare a class that has this behaviour:

  class MyNode
    include Neo4j::NodeMixin
  end

=== Neo4j::Node

If you do not need to map a node to a Ruby instance you can simply use the Neo4j::Node object.

Example:

  node = Neo4j::Node.new
  node[:name] = 'foo'

The Neo4j::Node.new method actually returns a Java object that implements the org.neo4j.graphdb.Node interface.
That Java interface is extended with methods so that it behaves almost like your own Ruby Neo4j::NodeMixin class.
=== Create a Node

  node = MyNode.new

=== Delete a Node

The Neo4j::NodeMixin mixin defines a del method that will delete the node and all its relationships.

Example:

  node = MyNode.new
  node.del

The node in the example above will be removed from the Neo database on the filesystem and from the Lucene index.

==== Neo4j::Node#del method

Since one can use either the org.neo4j.graphdb.Node directly or the Neo4j::NodeMixin, there might be a clash in method names.
For example the method Neo4j::NodeMixin#del deletes the node and all its relationships.
The org.neo4j.graphdb.Node#delete method (on the object created by Neo4j::Node.new) will raise an exception if not all relationships are already deleted.

=== Node and Relationship Identity

Each node has a unique identity (neo_id) which can be used for loading the node:

  id = Neo4j::Node.new.neo_id
  Neo4j.load_node(id) # will return the node that was created above

And for relationships:

  rel = Neo4j::Node.new.add_rel(:some_relationship_type, Neo4j::Node.new)
  id = rel.neo_id
  # Load the relationship
  Neo4j.load_rel(id) # will return the relationship (rel) that was created above

=== Node Properties

In order to use properties with the dot notation they have to be declared first

  class MyNode
    include Neo4j::NodeMixin
    property :foo, :bar
  end

These properties (foo and bar) will be stored in the Neo database.
You can set those properties:

  # create a node with two properties in one transaction
  node = MyNode.new { |n|
    n.foo = 123
    n.bar = 3.14
  }

  # access those properties
  puts node.foo

You can also set a property like this:

  node = MyNode.new
  node.foo = 123

Neo4j.rb supports properties of type String, Fixnum, Float and true/false.

=== Property Types and Marshalling

If you want to set a property of a different type than String, Fixnum, Float or true/false
you have to specify its type.

Example, to set a property to any type

  class MyNode
    include Neo4j::NodeMixin
    property :foo, :type => Object
  end

  node = MyNode.new
  node.foo = [1,"3", 3.14]

  Neo4j.load_node(node.neo_id).foo.class # => Array

=== Property of type Date and DateTime

Example of using Date queries:

  class MyNode
    include Neo4j::NodeMixin
    property :born, :type => Date
    index :born, :type => Date
  end

  Neo4j::Transaction.run do
    node = MyNode.new
    node.born = Date.new 2008, 05, 06
  end

  Neo4j::Transaction.run do
    MyNode.find("born:[20080427 TO 20100203]")[0].born # => Date
  end

Example of using DateTime queries:

  class MyNode
    include Neo4j::NodeMixin
    property :since, :type => DateTime
    index :since, :type => DateTime
  end

  Neo4j::Transaction.run do
    node = MyNode.new
    node.since = DateTime.civil 2008, 04, 27, 15, 25, 59
  end

  Neo4j::Transaction.run do
    MyNode.find("since:[200804271504 TO 201002031534]")[0].since # => DateTime
  end

Only the UTC timezone is allowed.
Notice that the query must be performed in a new transaction.

=== Finding all nodes

To find all nodes use the Neo4j#all_nodes method.

Example

  Neo4j.all_nodes{|node| puts node}

=== Declared Relationships

Neo relationships are asymmetrical. That means that if A has a relationship to B
then it may not be true that B has a relationship to A.

Relationships can be declared by using the 'has_n', 'has_one' or 'has_list' Neo4j::RelClassMethods methods (included in the Neo4j::NodeMixin).
These methods generate accessor methods for relationships.
By using those accessor methods we no longer need to know which direction to navigate in the relationship.
There are accessor methods for both relationships and nodes.

The 'has_n', 'has_one' and 'has_list' Neo4j::RelClassMethods methods return a Neo4j::Relationships::DeclRelationshipDsl.

=== has_n

The Neo4j::NodeMixin#has_n class method (see Neo4j::RelClassMethods#has_n) creates a new instance method that can be used for both traversing and adding new objects to a specific relationship type.
The has_n method returns a DSL object, Neo4j::Relationships::DeclRelationshipDsl.

For example, let's say that Person can have a relationship to any other node class with the type 'knows':

  class Person
    include Neo4j::NodeMixin
    has_n :knows # will generate a knows method for outgoing relationships
  end

The generated knows method will allow you to add new relationships, example:

  me = Person.new
  neo = Person.new
  me.knows << neo # me knows neo but neo does not know me

You can add any object to the 'knows' relationship as long as it
includes the Neo4j::NodeMixin or is an org.neo4j.graphdb.Node object, example:

  person = Person.new
  another_node = Neo4j::Node.new
  person.knows << another_node

  person.knows.include?(another_node) # => true

==== has_n to an outgoing class

If you want to express that the relationship should point to a specific class
use the 'to' method on the has_n method.
(It is still possible to add any node to the relationship - no validation is performed.)

  class Person
    include Neo4j::NodeMixin
    has_n(:knows).to(Person)
  end

The difference between specifying a 'to' class and not doing so is that Neo4j.rb will create relationships of type 'Person#knows'.
Example:

  a = Person.new
  b = Neo4j::Node.new
  a.knows << b
  # is the same as
  a.add_rel('Person#knows', b)

The given class 'Person' will act like a namespace for the relationship 'knows'.

==== has_n from an incoming class

It is also possible to generate methods for incoming relationships by using the
'from' method on the has_n method.

Example:

  class Person
    include Neo4j::NodeMixin
    has_n :knows # will generate a knows method for outgoing relationships
    has_n(:known_by).from(Person, :knows) # will generate a known_by method for incoming knows relationships
  end

In the example above we can find outgoing nodes using the 'knows' method and
incoming nodes using the 'known_by' method.

Example:

  person = Person.new
  other_person = Person.new
  person.knows << other_person
  other_person.known_by.include?(person) # => true

You can also add a relationship on either the incoming or the outgoing node.
The from method can also take an additional class parameter if it has incoming nodes from
a different node class (see the Actor-Role-Movie example at the top of this document).

Example of adding a 'knows' relationship from the other node:

  me = Person.new
  neo = Person.new
  neo.known_by << me # me knows neo but neo does not know me

  me.knows.include?(neo) # => true
  neo.knows.include?(me) # => false

The known_by method creates a 'knows' relationship between the me and neo nodes.
This is the same as doing:

  me.knows << neo # me knows neo but neo does not know me

==== has_n from an incoming class with 'namespace'

In the example above we only provided the parameter :knows for the from method.
That means that incoming relationships of type 'knows' will be accessible with the known_by method.
The following many-to-many example demonstrates how to specify a from class.

  # forward declaration, since Product refers to Order before Order is defined
  class Order; end

  class Product
    include Neo4j::NodeMixin
    has_n(:orders).to(Order)
  end

  class Order
    include Neo4j::NodeMixin
    has_n(:products).from(Product, :orders)
  end

Then you can add an order on the Product object or add a Product on the Order object.

  p = Product.new
  o = Order.new
  o.products << p

Which is the same as

  p = Product.new
  o = Order.new
  p.orders << o

Notice that we must provide the class name in the from method since we use the 'namespace' Order for the outgoing orders relationship.

==== Accessing Declared Relationships

Neo4j.rb generates methods for accessing declared relationships.

Example, let's say that class Product declares a relationship 'orders' to class Order.

  class Product
    include Neo4j::NodeMixin
    has_n(:orders).to(Order)
  end

To access the relationships between Product and Order without specifying the 'Order#' namespace one can use
the '_rels' method.

Example:

  product = Product.new
  order = Order.new
  product.orders << order
  prod_order_relationship = product.orders_rels.first
  prod_order_relationship.start_node # => product
  prod_order_relationship.end_node # => order

For a has_one relationship the '_rel' method will be generated instead.
This works for both incoming and outgoing nodes.

=== Relationship has_one

Example: A person can have at most one Address

  class Address; end

  class Person
    include Neo4j::NodeMixin
    has_one(:address).to(Address)
  end

  class Address
    include Neo4j::NodeMixin
    property :city, :road
    has_n(:people).from(Person, :address)
  end

In the example above Neo4j.rb will generate the following methods

* in Person, the methods 'address=' and 'address'
* in Address, the traversal method 'people' for traversing incoming relationships from the Person node.

Example of usage:

  p = Person.new
  p.address = Address.new
  p.address.city = 'malmoe'
  p.address.people.include?(p) # => true

Or from the incoming 'address' relationship

  a = Address.new {|n| n.city = 'malmoe'}
  a.people << Person.new
  a.people.first.address # => a

For more documentation see the Neo4j::RelClassMethods#has_one.

=== Relationship has_list

The has_n relationship does not maintain the order in which items are inserted into the relationship.
If order should be preserved then use the has_list class method instead.

Example

  class Company
    include Neo4j::NodeMixin
    has_list :employees
  end

  company = Company.new
  company.employees << employee1 << employee2

  # prints first employee2 and then employee1
  company.employees.each {|employee| puts employee.name}

If the optional parameter :counter is given then the list will contain a size counter.

Example

  class Company
    include Neo4j::NodeMixin
    has_list :employees, :counter => true
  end

  company = Company.new
  company.employees << employee1 << employee2
  company.employees.size # => 2

For more documentation see the Neo4j::RelClassMethods#has_list.

==== Deleted List Items

The list will be updated if an item in the list is deleted.

Example:

  company = Company.new
  company.employees << employee1 << employee2 << employee3
  company.employees.size # => 3

  employee2.del

  company.employees.to_a # => [employee1, employee3]
  company.employees.size # => 2

==== Memberships in lists

Each node in a list knows which lists it belongs to, and the next and previous item in the list.

Example:

  employee1.list(:employees).prev # => employee2
  employee2.list(:employees).next # => employee1
  employee1.list(:employees).size # => 3 # the size counter is available if the :counter parameter is given as shown above

(The list method takes an optional extra parameter - the list node. It is needed if one node is a member of more than one list with the same name.)

=== Cascade delete

The has_one, has_n and has_list methods all support cascade delete.
There are two types of cascade delete - incoming and outgoing.
For an outgoing cascade delete the members (of the has_one/has_n/has_list) will all be deleted when the 'root' node is deleted.
For an incoming cascade delete the 'root' node will be deleted when all its members are deleted.

Example, outgoing

  class Person
    include Neo4j::NodeMixin
    has_list :phone_nbr, :cascade_delete => :outgoing
  end

  p = Person.new
  phone1 = Neo4j::Node.new
  phone1[:number] = '+46123456789'
  phone2 = Neo4j::Node.new
  p.phone_nbr << phone1
  p.phone_nbr << phone2
  p.del
  # then the phone1 and phone2 nodes will also be deleted.

Example, incoming

  class Phone
    include Neo4j::NodeMixin
    has_list :people, :cascade_delete => :incoming # a list of people having this phone number
  end

  phone1 = Phone.new
  p1 = Person.new
  p2 = Person.new
  phone1.people << p1
  phone1.people << p2
  p1.del
  p2.del
  # then phone1 will be deleted

=== Finding all nodes

To find all nodes of a specific type use the all method.

Example

  require 'neo4j/extensions/reindexer'

  class Car
    include Neo4j::NodeMixin
    property :wheels
  end

  class Volvo < Car
  end

  v = Volvo.new
  c = Car.new

  Car.all   # will return all relationships from the reference node to car objects
  Volvo.all # will return the same as Car.all

To return nodes (just like the relationships method)

  Car.all.nodes   # => [c,v]
  Volvo.all.nodes # => [v]

The reindexer extension that is used in the example above will, for each created node, create a relationship from the index node (Neo4j#ref_node.relationships.outgoing(:index_node)) to that new node.
The all method uses these relationships in order to return nodes of a certain class.
The update_index method also uses this all method in order to update the index for all nodes of a specific class.

=== Traversing Relationships

Each type of relationship has a method that returns an Enumerable object that enables you
to traverse that type of relationship.

For example the Person example above declares one relationship of type friends.
You can traverse all of a Person's friends (depth 1 is default)

  f.friends.each { |n| puts n }

It is also possible to traverse a relationship to an arbitrary depth.
Example, finding all friends and friends of friends.

  f.friends.depth(2).each { ...}

Traversing to the end of the graph

  f.friends.depth(:all).each { ...}

==== Filtering Nodes

If you want to find one node in a relationship you can use a filter.
Example, let's say we want to find a friend with the name 'andreas'

  n1 = Person.new
  n2 = Person.new :name => 'andreas'
  n3 = Person.new
  n1.friends << n2 << n3
  n1.friends{ name == 'andreas' }.to_a # => [n2]

The block { name == 'andreas' } will be evaluated on each node in the relationship.
If the evaluation returns true the node will be included in the filter search result.

=== Traversing Nodes

The Neo4j::NodeMixin#incoming and Neo4j::NodeMixin#outgoing methods are more powerful than the generated has_n and has_one methods.
Unlike the generated methods they can traverse several relationship types at the same time.
The types of relationships being traversed must therefore always be specified in the incoming, outgoing or both method.
The three methods can take one or more relationship type parameters if more than one type of relationship should be traversed.

==== Traversing Nodes of Arbitrary Depth

The depth method allows you to specify how deep the traversal should be.
If not specified, only one level is traversed.

Example:

  me.incoming(:friends).depth(4).each {} # => people with a friend relationship to me

==== Traversing Nodes With Several Relationship Types

It is possible to traverse several relationship types at the same time.
The incoming, both and outgoing methods take a list of arguments.

Example, given the following holiday trip domain:

  # A location contains a hierarchy of other locations
  # Example region (asia) contains countries which contains cities etc...
  class Location
    include Neo4j::NodeMixin
    has_n :contains
    has_n :trips
    property :name
    index :name
  end

  # A Trip can be specific for one global area, such as "see all of sweden" or
  # local such as a 'city tour of malmoe'
  class Trip
    include Neo4j::NodeMixin
    property :name
  end

  # create all nodes
  # ...

  # setup the relationship between all nodes
  @europe.contains << @sweden << @denmark
  @sweden.contains << @malmoe << @stockholm

  @sweden.trips << @sweden_trip
  @malmoe.trips << @malmoe_trip
  @malmoe.trips << @city_tour
  @stockholm.trips << @city_tour # the same city tour is available both in malmoe and stockholm

Then we can traverse both the contains and the trips relationship types.

Example:

  @sweden.outgoing(:contains, :trips).to_a # => [@malmoe, @stockholm, @sweden_trip]

It is also possible to traverse both incoming and outgoing relationships, example:

  @sweden.outgoing(:contains, :trips).incoming(:contains).to_a # => [@malmoe, @stockholm, @sweden_trip, @europe]

==== Traversing Nodes With a Filter

It is possible to filter which nodes should be returned from the traverser by using the filter function.
This filter function will be evaluated differently depending on the number of arguments it takes, see below.

==== Filtering: Using Evaluation in the Context of the Current Node

If the provided filter function does not take any parameter it will be evaluated in the context of the current node being traversed.
That means that one can write filter functions like this:

  @sweden.outgoing(:contains, :trips).filter { name == 'sweden' }

==== Filtering: Using the TraversalPosition

If the filter method takes one parameter then it will be given an object of type TraversalPosition
which contains information about the current node, how many nodes have been returned, the depth etc.

The information contained in the TraversalPosition can be used in order to decide if the node
should be included in the traversal search result. If the provided block returns true then
the node will be included in the search result.

The filter function will not be evaluated in the context of the current node when this parameter is provided.

The TraversalPosition is a thin wrapper around the Java interface TraversalPosition, see http://api.neo4j.org/current/org/neo4j/api/core/TraversalPosition.html

For example if we only want to return the Trip objects in the example above:

  # notice how the tp (TraversalPosition) parameter is used in order to only
  # return nodes included in a 'trips' relationship.
  traverser = @sweden.outgoing(:contains, :trips).filter do |tp|
    tp.last_relationship_traversed.relationship_type == :trips
  end
  traverser.to_a # => [@sweden_trip]

=== Relationships

A relationship between two nodes can have properties just like a node.

Example:

  p1 = Person.new
  p2 = Person.new

  relationship = p1.friends.new(p2)

  # set a property 'since' on this relationship between p1 and p2
  relationship.since = 1992

If a Relationship class has not been specified for a relationship then any properties can be set on the relationship.
It has a default relationship class: Neo4j::Relationships::Relationship

If you instead want to use your own class for a relationship use the Neo4j::NodeMixin#has_n.relationship method, example:

  class Role
    # This class can be used as the relationship between two nodes
    # since it includes the following mixin
    include Neo4j::RelationMixin
    property :name
  end

  class Actor
    include Neo4j::NodeMixin
    # use the Role class above in the relationship between Actor and Movie
    has_n(:acted_in).to(Movie).relationship(Role)
  end

=== Finding Relationships

The Neo4j::NodeMixin#relationships method can be used to find incoming or outgoing relationship objects.
Example of listing all types of outgoing (default) relationship objects (of depth one) from the me node.

  me.relationships.each {|rel| ... }

If we instead want to list the nodes that those relationships point to then the nodes method can be used.

  me.rels.nodes.each {|node| ... }

Listing all incoming relationship objects of any relationship type:

  me.rels.incoming.each { ... }

Listing both incoming and outgoing relationship objects of a specific type:

  me.rels.both(:friends) { }

Finding one outgoing relationship of a specific type and node (you)

  me.rels.outgoing(:friends)[you]

A relationship object knows the two nodes it connects. For the first relationship of a node n1 that points to a node n2:

  n1.rels[0].start_node # => n1
  n1.rels[0].end_node # => n2

A RelationshipMixin contains the relationship type which connects the two nodes

  n1.rels[0].relationship_type # => :friends

Relationships can also have properties just like a node (NodeMixin).
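The ideas in this section (a relationship is an object with a type, a start node, an end node and its own properties) can be tried out without a database. The following is a plain-Ruby toy model, not Neo4j.rb code: ToyNode, ToyRelationship and the connect method are hypothetical names that only mimic the shape of the calls shown above.

```ruby
# Toy model only: ToyNode and ToyRelationship are hypothetical classes,
# they are not part of Neo4j.rb.
class ToyRelationship
  attr_reader :relationship_type, :start_node, :end_node

  def initialize(type, start_node, end_node)
    @relationship_type = type
    @start_node = start_node
    @end_node = end_node
    @props = {} # a relationship carries properties, just like a node
  end

  def [](key)
    @props[key]
  end

  def []=(key, value)
    @props[key] = value
  end
end

class ToyNode
  attr_reader :name, :rels

  def initialize(name)
    @name = name
    @rels = [] # every relationship that touches this node
  end

  # Create a directed relationship from self to other and
  # register it on both ends.
  def connect(type, other)
    rel = ToyRelationship.new(type, self, other)
    rels << rel
    other.rels << rel
    rel
  end

  def outgoing(type = nil)
    rels.select { |r| r.start_node.equal?(self) && (type.nil? || r.relationship_type == type) }
  end

  def incoming(type = nil)
    rels.select { |r| r.end_node.equal?(self) && (type.nil? || r.relationship_type == type) }
  end
end

n1 = ToyNode.new('n1')
n2 = ToyNode.new('n2')
rel = n1.connect(:friends, n2)
rel[:since] = 1992

n1.outgoing(:friends).first.end_node.name           # => "n2"
n2.incoming(:friends).map { |r| r.start_node.name } # => ["n1"]
n2.outgoing(:friends)                               # => [] (relationships are directed)
```

The last call returns an empty list because the relationship is stored once, with a direction, and is only seen as outgoing from its start node.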
=== Finding outgoing and incoming relationships

If we are only interested in the incoming relationships of a node, or in the nodes at the other end of them, we can do

  n2.rels.incoming       # the incoming relationship objects
  n2.rels.incoming.nodes # => [n1]

=== Finding outgoing/incoming nodes of a specific relationship type

Let's say we want to find who has my phone number and who considers me a friend

  # who has my phone numbers
  me.rels.incoming(:phone_numbers).nodes # => people with my phone numbers

  # who consider me as a friend
  me.rels.incoming(:friends).nodes # => people with a friend relationship to me

Remember that relationships are not symmetrical.
Notice that there is also another way of finding nodes, see the Neo4j::NodeMixin#traverse method below.

=== Transactions

All operations that work with the node space (even read operations) must be wrapped in a transaction.
For example all get, set and find operations will start a new transaction if none is already running (for that thread).

If you want to perform a set of operations in a single transaction, use the Neo4j::Transaction.run method:

Example

  Neo4j::Transaction.run {
    node1.foo = "value"
    node2.bar = "hi"
  }

There is also an auto commit feature available which is enabled by requiring 'neo4j/auto_tx' instead of 'neo4j',
see the three minute tutorial above.

You can also run it without a block, like this:

  transaction = Neo4j::Transaction.new
  transaction.start
  # do something
  transaction.finish

==== Rollback

Neo4j supports rollback of transactions.

Example:

  require 'neo4j'

  node = MyNode.new

  Neo4j::Transaction.run { |t|
    node.foo = "hej"
    # something failed so we signal for a failure
    t.failure # will cause a rollback, node.foo will not be updated
  }

=== Indexing

Properties and relationships which should be indexed by Lucene can be specified by the index class method.
For example to index the properties foo and bar

  class SomeNode
    include Neo4j::NodeMixin
    property :foo, :bar
    index :foo, :bar
  end

Every time a node of type SomeNode (or a subclass) is created, deleted or updated the Lucene index will be updated.
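To make the "declare once, update on every write" idea concrete, here is a minimal plain-Ruby sketch. It is not how Neo4j.rb or Lucene implement indexing: ToyIndexed, its lookup table and the two-argument find are hypothetical, and unlike the real index this toy is not shared with subclasses and ignores transactions.

```ruby
# Toy model only: ToyIndexed is a hypothetical mixin, not the Neo4j.rb implementation.
module ToyIndexed
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declare which properties are indexed, mirroring `index :foo, :bar`.
    def index(*props)
      indexed_props.concat(props)
    end

    def indexed_props
      @indexed_props ||= []
    end

    # property name => { value => [objects having that value] }
    def lookup
      @lookup ||= Hash.new { |h, prop| h[prop] = Hash.new { |hh, value| hh[value] = [] } }
    end

    def find(prop, value)
      lookup[prop][value].dup
    end
  end

  def [](prop)
    (@props ||= {})[prop]
  end

  # Every write to an indexed property keeps the lookup table in sync.
  def []=(prop, value)
    @props ||= {}
    indexed = self.class.indexed_props.include?(prop)
    # drop the stale entry before adding the new one
    self.class.lookup[prop][@props[prop]].delete(self) if indexed && @props.key?(prop)
    @props[prop] = value
    self.class.lookup[prop][value] << self if indexed
  end
end

class SomeThing
  include ToyIndexed
  index :foo
end

a = SomeThing.new
a[:foo] = 'x'
SomeThing.find(:foo, 'x') # => [a]

a[:foo] = 'y'
SomeThing.find(:foo, 'x') # => []
SomeThing.find(:foo, 'y') # => [a]
```

Properties that were not declared with index (for example :bar here) are stored but never appear in the lookup table, which matches the rule that only declared properties are searchable.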
=== Reindexing

Sometimes it's necessary to change the index of a class after a lot of node instances have already been created.
To delete an index use the class method 'remove_index'.
To update an index use the class method 'update_index', which will update all already created nodes in the Neo database.

Example:

  require 'neo4j'
  require 'neo4j/extensions/reindexer' # needed for the update_index method

  class Person
    include Neo4j::NodeMixin
    property :name, :age, :phone
    index :name, :age
  end

  p1 = Person.new :name => 'andreas', :phone => 123
  Person.find(:name => 'andreas') # => [p1]
  Person.find(:phone => 123) # => []

  # change index and reindex all person nodes already created in the Neo database.
  Person.remove_index :name
  Person.index :phone # add an index on phone
  Person.update_index

  Person.find(:name => 'andreas') # => []
  Person.find(:phone => 123) # => [p1]

In order to use the update_index method you must include the reindexer neo4j.rb extension.
This extension keeps a relationship to each created node so that it can later recreate the index by traversing those relationships.

=== Updating Lucene Index

The Lucene index will be updated after the transaction commits.
It is not possible to query for something that has been created inside the same transaction as where the query is performed.

=== Querying (using Lucene)

You can declare properties to be indexed by Lucene by the index method:

Example

  class Person
    include Neo4j::NodeMixin
    property :name, :age
    index :name, :age
  end

  node = Person.new
  node.name = 'foo'
  node.age = 42

  Person.find(:name => 'foo', :age => 42) # => [node]

The query parameter (like a property on a Neo4j::NodeMixin) can be of type String, Fixnum, Float, boolean or Range.
The query above can also be written in a Lucene query DSL:

  Person.find{(name == 'foo') & (age == 42)} # => [node]

Or Lucene query language:

  Person.find("name:foo AND age:42")

For more information see: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html or the Lucene module below.
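The correspondence between the hash form and the Lucene query language can be sketched in a few lines of plain Ruby. The helper below is hypothetical (to_lucene_query is not part of Neo4j.rb or lucene.rb) and only shows the shape of the resulting query string. It assumes simple terms without spaces or special characters, treats every Range as inclusive, and does not pad numbers.

```ruby
# Hypothetical helper, not part of Neo4j.rb or lucene.rb: renders a query
# hash in Lucene query parser syntax. Values are not escaped.
def to_lucene_query(conditions)
  conditions.map do |field, value|
    if value.is_a?(Range)
      "#{field}:[#{value.begin} TO #{value.end}]" # Lucene inclusive range syntax
    else
      "#{field}:#{value}"
    end
  end.join(' AND ')
end

to_lucene_query(:name => 'foo', :age => 42) # => "name:foo AND age:42"
to_lucene_query(:age => 30..40)             # => "age:[30 TO 40]"
```

All conditions are joined with AND because the hash form of find matches nodes that satisfy every key, as in the Person.find(:name => 'foo', :age => 42) example.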
=== Indexing and Property Types

In order to use range queries on numbers the property types must be converted.
This is done by using the optional :type parameter:

  class Person
    include Neo4j::NodeMixin
    property :name, :age
    index :age, :type => Fixnum
  end

By using :type => Fixnum the age will be padded with '0's (Lucene only supports string comparison).

Example, if the :type => Fixnum was not specified then the following query would match,
because compared as strings '100' sorts between '0' and '8':

  p = Person.new {|n| n.age = 100 }
  Person.find(:age => 0..8) # => [p]

=== Indexing and Querying Relationships

The Neo4j::NodeMixin#index method can be used to index relationships to other classes.

Example, let's say we have two classes, Customer and Order:

  class Customer
    include Neo4j::NodeMixin

    property :name

    # specifies outgoing relationships to Order
    has_n(:orders).to(Order)

    # create an index on customer-->order#total_cost
    index "orders.total_cost"
  end

  class Order
    include Neo4j::NodeMixin

    property :total_cost

    # specifies one incoming relationship from Customer
    has_one(:customer).from(Customer, :orders)

    # create an index on the order<--customer#name relationship
    index "customer.name"
  end

Notice that we can index both incoming and outgoing relationships.

Let's create a customer and one order for that customer

  Neo4j::Transaction.run do
    cust = Customer.new
    order = Order.new
    cust.name = "kalle"
    order.total_cost = "1000"

    cust.orders << order
  end

Now we can find both Orders with a total cost between 500 and 2000 and Customers with name 'kalle' using Lucene

Example:

  customers = Customer.find('orders.total_cost' => 500..2000, 'name' => 'kalle')

The same search is also possible from the other side:

  orders = Order.find('total_cost' => 500..2000, 'customer.name' => 'kalle')

=== Full text search

Neo4j supports full text search by setting the tokenized property to true on an index.
(see JavaDoc for org.apache.lucene.document.Field.Index.ANALYZED).
  class Comment
    include Neo4j::NodeMixin

    property :comment
    index :comment, :tokenized => true
  end

=== Keyword searches

If we want to search for exact matches, for example language codes like 'se', 'it', we must make sure
that Lucene does not filter away stop words like 'it'

  class LangCodes
    include Neo4j::NodeMixin
    property :code
    index :code, :analyzer => :keyword
  end

By using the keyword analyzer (instead of the default StandardAnalyzer) we make sure that Lucene indexes everything.
For more info, see the Lucene chapter below.

=== Unmarshalling

The Neo module will automatically unmarshal nodes to the correct ruby class.
It does this by reading the classname property and loading that ruby class with that node.
If this classname property does not exist it will use the default Neo4j::Node for nodes and
Neo4j::Relationships::Relationship for relationships.

  class Person
    include Neo4j::NodeMixin

    def hello
    end
  end

  f1 = Person.new {}

  # load the class again
  f2 = Neo4j.load_node(f1.neo_id)

  # f2 will now be a new instance of Person, but will be == f1
  f1 == f2 # => true

=== Reference node

There is one node that can always be found - the reference node, Neo4j::ReferenceNode.

Example:

  Neo4j.ref_node

This node can have a relationship to the index node (Neo4j::IndexNode), which has relationships to all created nodes.
You can add relationships from this node to your nodes.

== Performance Issues

It is recommended to wrap several Neo4j operations, including read operations, in a single transaction if possible for better performance.
Updating a Lucene index can be slow. A solution to this is to keep the index in memory instead of on disk.

Using raw java nodes (Neo4j::Node) and relationships (Neo4j::Relationship) will also increase performance.
Here is an example of how to traverse using only Java objects (instead of Ruby wrappers):

  iter = folder.outgoing(:child_folders).raw(true).depth(:all).iterator
  iter.hasNext()

The example above gives you access to the raw Java iterator class.
Another way to improve performance is to rewrite the performance critical part of your application in Java and access it from neo4j.rb in JRuby.
Traversing in pure Java is orders of magnitude faster than doing it in JRuby.

== Migrations

By using migrations you can keep the code and the database in sync.
There are two types of migrations: non-lazy and lazy.
In a non-lazy migration the database is upgraded/downgraded all at once, while in a lazy migration a node or relationship is only upgraded/downgraded when it is loaded.

=== Non-Lazy Migration

Here is an example of a use case for this feature.
Let's say that we already have a database with nodes that have one property 'name'.
Now we want to split that property into two properties: 'surname' and 'given_name'.
We want to upgrade the database when it starts so we don't use the lazy migration feature.
The neo database starts at version 0 by default.

  Neo4j.migrate 1, "split name" do
    up do
      # find all people and split the name property
      Person.all.each do |p|
        p[:surname] = p[:name].split[0]
        p[:given_name] = p[:name].split[1]
        p.delete_property(:name)
      end
    end

    down do
      # join the two properties back into one name property
      Person.all.each do |p|
        p[:name] = "#{p[:surname]} #{p[:given_name]}"
        p.delete_property(:surname)
        p.delete_property(:given_name)
      end
    end
  end

If the code above has been loaded before the neo database starts it will automatically upgrade to version 1 (running all the migrations up to the highest migration available).
You can force neo to go to a specific version by using the Neo4j#migrate! method.
For more information see the example/imdb application or the RSpecs.

=== Lazy Migration

The example above can also be run as a lazy migration, i.e. the upgrade/downgrade is performed when the node is loaded instead of all at once.

The following example demonstrates this feature:

  class Person
    include Neo4j::NodeMixin
    include Neo4j::MigrationMixin # you need to include this in order to use lazy migrations
    ...
  end

  Person.migration 1, :split_name do
    up do
      self[:surname] = self[:name].split[0]
      self[:given_name] = self[:name].split[1]
      delete_property(:name)
    end

    down do
      self[:name] = "#{self[:surname]} #{self[:given_name]}"
      delete_property(:surname)
      delete_property(:given_name)
    end
  end

== Batch Insert

Sometimes you need a fast way to insert a lot of data into the database without any transactional support.
Neo4j.rb wraps the Java BatchInserter API.

  Neo4j::BatchInserter.new do |inserter|
    a = Neo4j::Node.new :name => 'a'
    b = Neo4j::Node.new :name => 'b'
    c = Foo.new :key1 => 'val1', :key2 => 'val2'
    Neo4j::Relationship.new(:friend, a, b, :since => '2001-01-01')
  end

Creating nodes and relationships inside the code block uses the batch inserter API.
Only a limited set of the API for nodes and relationships is available inside the code block (e.g. traversing is not possible).

If you need Lucene indexing you have to wrap your code inside a transaction, since the Lucene database is only updated when the transaction finishes
(the neo4j transaction is disabled).

Example:

  Neo4j::BatchInserter.new do
    Neo4j::Transaction.new
    foo = Foo98.new
    foo.name = 'hej'
    Neo4j::Transaction.finish # update the lucene index, neo4j transaction is disabled here.
  end

To get even better insertion speed one can use the raw java Batch Inserter API: http://wiki.neo4j.org/content/Batch_Insert.

Example:

  Neo4j::BatchInserter.new do |b|
    b.createNode({'name' => 'me'})
  end

Notice that the BatchInserter can be used together with Migrations.

== Extensions: Replication

There is an experimental extension that makes it possible to replicate a Neo4j database to another machine.
For an example of how to use it see test/replication/test_master.rb and test_slave.rb.
It has only been tested to work with a very simple node space.

== Extension: REST

There is a REST extension to Neo4j.rb.
It requires the following gems

* Sinatra >= 0.9.4
* Rack >= 1.0
* json-jruby >= 1.1.6

For RSpec testing it also needs:

* rack-test

For more information see the examples/rest/example.rb or the examples/admin or Neo4j::RestMixin.

== Extension: find_path

Extension which finds the shortest path (in terms of number of links) between two nodes.
Use something like this:

  require 'neo4j/extensions/find_path'
  node1.traverse.both(:knows).depth(:all).path_to(node2) # => [node1, node42, node1234, node256, node2]

This extension is still rather experimental.
The algorithm is based on the one used in the Neo4j Java IMDB example.
For more information see Neo4j::Relationships::NodeTraverser#path_to or the RSpec find_path_spec.rb.

== Extension: graph_algo

This extension uses the Java Neo4j Graph Algo package - http://components.neo4j.org/graph-algo/
Currently only the AllSimplePaths algorithm is supported.
If you want the other algorithms you can either access the Java methods directly or write a new wrapper (like my AllSimplePaths wrapper).

== Ruby on Rails with Neo4j.rb

Neo4j.rb works nicely with Ruby on Rails.

There are two ways to use neo4j.rb with rails - embedded or accessing it via REST.

=== Embedded Rails

A complete example of embedding Neo4j with rails can be found at http://github.com/andreasronge/neo4j-rails-example/tree/master (please fork and improve it).

==== Config rails

To configure rails to use Neo4j.rb instead of ActiveRecord, edit movies/config/environment.rb

environment.rb:

  config.frameworks -= [ :active_record ] #, :active_resource, :action_mailer ]
  config.gem "neo4j", :version => "0.3.1" # or the latest one

If you need to reindex all nodes or use the Neo4j::NodeMixin#all method you must require the reindexer neo4j.rb extension.
Add a require in the environment.rb file:

  require 'neo4j/extensions/reindexer'

==== Models

Create a new file for each Neo4j node or relationship class.
For example, for an Actor class create the file app/models/actor.rb

  # filename app/models/actor.rb
  class Actor
    include Neo4j::NodeMixin
    property :name, :phone, :salary
    has_n(:acted_in).to(Movie).relationship(Role)
    index :name
  end

==== Create RESTful routes

Edit the config/routes.rb file.

Example:

  ActionController::Routing::Routes.draw do |map|
    map.resources :actors do |actor|
      actor.resources :acted_in
      actor.resource :movies, :controller => 'acted_in'
    end
  end

==== Create Controllers

Since all Neo4j operations must be wrapped in a transaction, add an around filter for all operations.

Example: acted_in_controller.rb:

  class ActedInController < ApplicationController
    around_filter :neo_tx

    def index
      @actor = Neo4j.load_node(params[:actor_id])
      @movies = @actor.acted_in.nodes
    end

    def create
      @actor = Neo4j.load_node(params[:actor_id])
      @movie = Movie.new
      @movie.update(params[:movie])
      @actor.acted_in << @movie
      flash[:notice] = 'Movie was successfully created.'
      redirect_to(@actor)
    end

    def update
      @actor = Neo4j.load_node(params[:actor_id])
      @movie = Movie.new
      @movie.update(params[:movie])
      @actor.acted_in.new @movie
      @movie.update(params[:movie])
      flash[:notice] = 'Movie was successfully updated.'
      redirect_to(@movie)
    end

    def show
      @movie = Neo4j.load_node(params[:id])
    end

    def new
      @actor = Neo4j.load_node(params[:actor_id])
      @movie = Movie.value_object.new
    end

    def edit
      @movie = Neo4j.load_node(params[:id])
    end

    private

    # wraps every action in one Neo4j transaction
    def neo_tx
      Neo4j::Transaction.new
      yield
      Neo4j::Transaction.finish
    end
  end

==== Add views

Add the following views in app/views/actors

index.html.erb:

  <h1>Listing actors</h1>

  <table>
    <tr>
      <th>Name</th>
    </tr>

  <% for actor in @actors %>
    <tr>
      <td><%=h actor.name %></td>
      <td><%= link_to 'Edit', edit_actor_path(actor) %></td>
      <td><%= link_to 'Show', actor %></td>
      <td><%= link_to 'Destroy', actor, :confirm => 'Are you sure?', :method => :delete %></td>
    </tr>
  <% end %>
  </table>

  <%= link_to 'New actor', new_actor_path %>

new.html.erb:

  <h1>New Actor</h1>

  <% form_for(@actor) do |f| %>
    <p>
      <%= f.label :name %><br />
      <%= f.text_field :name %>
    </p>
    <p>
      <%= f.label :phone %><br />
      <%= f.text_field :phone %>
    </p>
    <p>
      <%= f.label :salary %><br />
      <%= f.text_field :salary %>
    </p>
    <p>
      <%= f.submit "Update" %>
    </p>
  <% end %>

  <%= link_to 'Back', actors_path %>

== The Lucene Module

You can use this module without using the Neo4j module.

Lucene provides:

* Flexible Queries - Phrases, Wildcards, Compound boolean expressions etc...
* Field-specific Queries eg. title, artist, album
* Sorting
* Ranked Searching

=== Lucene Document

In Lucene everything is a Document. A document can represent anything textual:
A Word Document, a DVD (the textual metadata only), or a Neo4j.rb node.
A document is like a record or row in a relational database.

The following example shows how a document can be created by using the << operator
on the Lucene::Index class and found using the Lucene::Index#find method.

Example of how to write a document and find it:

  require 'lucene'

  include Lucene

  # the var/myindex parameter is either a path where to store the index or
  # just a key if index is kept in memory (see below)
  index = Index.new('var/myindex')

  # add one document (a document is like a record or row in a relational database)
  index << {:id=>'1', :name=>'foo'}

  # write to the index file
  index.commit

  # find a document with name foo
  # hits is a ruby Enumeration of documents
  hits = index.find{name == 'foo'}

  # show the id of the first document (document 0) found
  # (the document contains all stored fields - see below)
  hits[0][:id] # => '1'

Notice that you have to call the commit method in order to update the index (both disk and in memory indexes).
Performing several update and delete operations before a commit will give much
better performance than committing after each operation.

=== Keep indexing on disk

By default Neo4j::Lucene keeps indexes in memory. That means that when the application restarts
the index will be gone and you have to reindex everything again.
To store indexes on file:

  Lucene::Config[:store_on_file] = true
  Lucene::Config[:storage_path] = '/home/neo/lucene-db'

When creating a new index the location of the index will be the Lucene::Config[:storage_path] + index path

Example:

  Lucene::Config[:store_on_file] = true
  Lucene::Config[:storage_path] = '/home/neo/lucene-db'
  index = Index.new('/foo/lucene')

The example above will store the index at /home/neo/lucene-db/foo/lucene

=== Indexing several values with the same key

Let's say a person can have several phone numbers. How do we index that?

  index << {:id=>'1', :name=>'adam', :phone => ['987-654', '1234-5678']}

=== Id field

All Documents must have one id field. If an id is not specified, the default will be: :id of type String.
A different id can be specified using the field_infos id_field property on the index:

  index = Index.new('some/path/to/the/index')
  index.field_infos.id_field = :my_id

To change the type of my_id from String to a different type see below.

=== Conversion of types

Lucene.rb can handle type conversion for you. (The Java Lucene library stores all the fields as Strings)
For example if you want the id field to be a Fixnum

  require 'lucene'
  include Lucene

  index = Index.new('var/myindex') # store the index at dir: var/myindex
  index.field_infos[:id][:type] = Fixnum

  index << {:id=>1, :name=>'foo'} # notice 1 is not a string now

  index.commit

  # find that document, hits is a ruby Enumeration of documents
  hits = index.find(:name => 'foo')

  # show the id of the first document (document 0) found
  # (the document contains all stored fields - see below)
  hits[0][:id] # => 1

If the field_info type parameter is not set then it has a default value of String.

=== Storage of fields

By default only the id field will be stored.
That means that in the example above the :name field will not be included in the document.
Example

  doc = index.find('name' => 'foo')[0]
  doc[:id] # => 1
  doc[:name] # => nil

Use the field info :store => true if you want a field to be stored in the index
(otherwise it will only be searchable).

Example

  require 'lucene'
  include Lucene

  index = Index.new('var/myindex') # store the index at dir: var/myindex
  index.field_infos[:id][:type] = Fixnum
  index.field_infos[:name][:store] = true # store this field

  index << {:id=>1, :name=>'foo'} # notice 1 is not a string now

  index.commit

  # find that document, hits is a ruby Enumeration of documents
  hits = index.find('name' => 'foo')

  # let's say hits only contains one document so we can use hits[0] for that one
  # that document contains all stored fields (see below)
  hits[0][:id] # => 1
  hits[0][:name] # => 'foo'

=== Setting field infos

As shown above you can set field infos like this

  index.field_infos[:id][:type] = Fixnum

Or you can set several properties like this:

  index.field_infos[:id] = {:type => Fixnum, :store => true}

==== Tokenized

Field infos can be used to specify if the field should be tokenized.
If this value is not set then the entire content of the field will be considered as a single term.

Example

  index.field_infos[:text][:tokenized] = true

If not specified, the default is 'false'

==== Analyzer

Field infos can also be used to set which analyzer should be used.
If none is specified, the default analyzer - org.apache.lucene.analysis.standard.StandardAnalyzer (:standard) will be used.
  index.field_infos[:code][:tokenized] = false
  index.field_infos[:code][:analyzer] = :standard

The following analyzers are supported

* :standard (default) - org.apache.lucene.analysis.standard.StandardAnalyzer
* :keyword - org.apache.lucene.analysis.KeywordAnalyzer
* :simple - org.apache.lucene.analysis.SimpleAnalyzer
* :whitespace - org.apache.lucene.analysis.WhitespaceAnalyzer
* :stop - org.apache.lucene.analysis.StopAnalyzer

For more info, check the Lucene documentation, http://lucene.apache.org/java/docs/

=== Simple Queries

Lucene.rb supports searching in several fields:

Example:

  # finds all documents having both name 'foo' and age 42
  hits = index.find('name' => 'foo', :age=>42)

Range queries:

  # finds all documents having both name 'foo' and age between 3 and 30
  hits = index.find('name' => 'foo', :age=>3..30)

=== Lucene Queries

If the query is a string then the string is treated as a Lucene query.

  hits = index.find('name:foo')

For more information see: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html

=== Advanced Queries (DSL)

The queries above can also be written in a lucene.rb DSL:

  hits = index.find { (name == 'andreas') & (foo == 'bar')}

Expressions with OR (|) are supported, example

  # find all documents with name 'andreas' or age between 30 and 40
  hits = index.find { (name == 'andreas') | (age == 30..40)}

=== Sorting

Sorting is specified by the 'sort_by' parameter

Example:

  hits = index.find(:name => 'foo', :sort_by=>:category)

To sort by several fields:

  hits = index.find(:name => 'foo', :sort_by=>[:category, :country])

Example sort order:

  hits = index.find(:name => 'foo', :sort_by=>[Desc[:category, :country], Asc[:city]])

=== Thread-safety

The Lucene::Index is thread safe.
It guarantees that an index is not updated from two threads at the same time.

=== Lucene Transactions

Use the Lucene::Transaction in order to do atomic commits.
By using a transaction you do not need to call the Index.commit method.
Example:

  Transaction.run do |t|
    index = Index.new('var/index/foo')
    index << { :id=>42, :name=>'andreas'}
    t.failure # rollback
  end

  # Index.new returns the same index object for the same path (see below)
  result = Index.new('var/index/foo').find('name' => 'andreas')
  result.size.should == 0

You can find uncommitted documents with the uncommitted index property.

Example:

  index = Index.new('var/index/foo')
  index.uncommited #=> [document1, document2]

Notice that even if it looks like a new Index instance object was created, index.uncommited may return a non-empty array.
This is because Index.new is a singleton - a new instance object is not created.
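The "Index.new is a singleton" behaviour can be illustrated with a small plain-Ruby sketch. This is not the lucene.rb implementation: ToyIndex is a hypothetical class that caches one instance per path and keeps added documents in a buffer until commit, which is enough to show why a seemingly new index can already hold uncommitted documents.

```ruby
# Toy model only: ToyIndex is a hypothetical class, not the lucene.rb implementation.
class ToyIndex
  @instances = {}

  class << self
    # Return the cached instance for a path instead of allocating a new one.
    def new(path)
      @instances[path] ||= super(path)
    end
  end

  attr_reader :path, :uncommitted

  def initialize(path)
    @path = path
    @uncommitted = [] # documents added since the last commit
    @committed = []
  end

  def <<(doc)
    @uncommitted << doc
    self
  end

  # Move the buffered documents into the searchable set.
  def commit
    @committed.concat(@uncommitted)
    @uncommitted.clear
  end

  # Only committed documents are searchable.
  def find(conditions)
    @committed.select { |doc| conditions.all? { |key, value| doc[key] == value } }
  end
end

first = ToyIndex.new('var/index/foo')
first << { :id => 1, :name => 'andreas' }

second = ToyIndex.new('var/index/foo')
second.equal?(first)                     # => true, same object for the same path
second.uncommitted.size                  # => 1
second.find(:name => 'andreas').size     # => 0, nothing committed yet

second.commit
first.find(:name => 'andreas').size      # => 1
```

Caching by path means every caller sees the same buffer, so one commit (or one rollback) applies to all pending changes for that index regardless of where the index object was obtained.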