= English Ruby Linguistics Module - Synopsis This is an overview of the functionality currently in the English functions of the Ruby Linguistics module as of version 0.02: == Pluralization require 'linguistics' Linguistics::use( :en ) # extends Array, String, and Numeric "box".en.plural # => "boxes" "mouse".en.plural # => "mice" "ruby".en.plural # => "rubies" == Indefinite Articles "book".en.a # => "a book" "article".en.a # => "an article" == Present Participles "runs".en.present_participle # => "running" "eats".en.present_participle # => "eating" "spies".en.present_participle # => "spying" == Ordinal Numbers 5.en.ordinal # => "5th" 2004.en.ordinal # => "2004th" == Numbers to Words 5.en.numwords # => "five" 2004.en.numwords # => "two thousand and four" 2385762345876.en.numwords # => "two trillion, three hundred and eighty-five billion, seven hundred and sixty-two million, three hundred and forty-five thousand, eight hundred and seventy-six" == Quantification "cow".en.quantify( 5 ) # => "several cows" "cow".en.quantify( 1005 ) # => "thousands of cows" "cow".en.quantify( 20_432_123_000_000 ) # => "tens of trillions of cows" == Conjunctions animals = %w{dog cow ox chicken goose goat cow dog rooster llama pig goat dog cat cat dog cow goat goose goose ox alpaca} puts "The farm has: " + animals.en.conjunction # => The farm has: four dogs, three cows, three geese, three goats, two oxen, two cats, a chicken, a rooster, a llama, a pig, and an alpaca Note that 'goose' and 'ox' are both correctly pluralized, and the correct indefinite article 'an' has been used for 'alpaca'. You can also use the generalization function of the #quantify method to give general descriptions of object lists instead of literal counts: allobjs = [] ObjectSpace::each_object {|obj| allobjs << obj.class.name} puts "The current Ruby objectspace contains: " + allobjs.en.conjunction( :generalize => true ) which will print something like: The current Ruby objectspace contains: thousands of Strings, thousands of Arrays, hundreds of Hashes, hundreds of Classes, many Regexps, a number of Ranges, a number of Modules, several Floats, several Procs, several MatchDatas, several Objects, several IOS, several Files, a Binding, a NoMemoryError, a SystemStackError, a fatal, a ThreadGroup, and a Thread == Infinitives New in version 0.02: "leaving".en.infinitive # => "leave" "left".en.infinitive # => "leave" "leaving".en.infinitive.suffix # => "ing" == WordNet® Integration Also new in version 0.02, if you have the Ruby-WordNet module installed, you can look up WordNet synsets using the Linguistics interface: # Test to be sure the WordNet module loaded okay. Linguistics::EN.has_wordnet? # => true # Fetch the default synset for the word "balance" "balance".synset # => # # Fetch the synset for the first verb sense of "balance" "balance".en.synset( :verb ) # => # # Fetch the second noun sense "balance".en.synset( 2, :noun ) # => # # Fetch the second noun sense's hypernyms (more-general words, like a superclass) "balance".en.synset( 2, :noun ).hypernyms # => [#] # A simpler way of doing the same thing: "balance".en.hypernyms( 2, :noun ) # => [#] # Fetch the first hypernym's hypernyms "balance".en.synset( 2, :noun ).hypernyms.first.hypernyms # => [#] # Find the synset to which both the second noun sense of "balance" and the # default sense of "shovel" belong. ("balance".en.synset( 2, :noun ) | "shovel".en.synset) # => # # Fetch just the words for the other kinds of "instruments" "instrument".en.hyponyms.collect {|synset| synset.words}.flatten # => ["analyzer", "analyser", "cautery", "cauterant", "drafting instrument", "extractor", "instrument of execution", "instrument of punishment", "measuring instrument", "measuring system", "measuring device", "medical instrument", "navigational instrument", "optical instrument", "plotter", "scientific instrument", "sonograph", "surveying instrument", "surveyor's instrument", "tracer", "weapon", "arm", "weapon system", "whip"] There are many more WordNet methods supported Ð too many to list here. See the documentation for the complete list. == LinkParser Integration Another new feature in version 0.02 is integration with the Ruby version of the CMU Link Grammar Parser by Martin Chase. If you have the LinkParser module installed, you can create linkages from English sentences that let you query for parts of speech: # Test to see whether or not the link parser is loaded. Linguistics::EN.has_link_parser? # => true # Diagram the first linkage for a test sentence puts "he is a big dog".sentence.linkages.first.to_s +---O*---+ | +--Ds--+ +Ss+ | +-A-+ | | | | | he is a big dog # Find the verb in the sentence "he is a big dog".en.sentence.verb.to_s # => "is" # Combined infinitive + LinkParser: Find the infinitive form of the verb of the given sentence. "he is a big dog".en.sentence.verb.infinitive # => "be" # Find the direct object of the sentence "he is a big dog".en.sentence.object.to_s # => "dog" # Look at the raw LinkParser::Word for the direct object of the sentence. "he is a big dog".en.sentence.object # => #, @lword=#], @suffix="", @left=[], @name="is", @position=1>, @subName="*", @name="O", @length=3>], @name="dog", @position=4> # Combine WordNet + LinkParser to find the definition of the direct object of # the sentence "he is a big dog".en.sentence.object.gloss # => "a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds; \"the dog barked all night\""