Sha256: f2434445d91d2f8db0dd0538d374ce0f1d2a2e70f6a3a3a2011bc96424335de2
Contents?: true
Size: 1.13 KB
Versions: 2
Compression:
Stored size: 1.13 KB
Contents
#!/usr/bin/env ruby
require 'rubygems'
gem 'dimus-biodiversity'

$LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__) + "/../lib"))
require 'biodiversity'
require 'yaml'

if ARGV.empty?
  puts "Usage:\n\nnnparse file_with_scientific_names [output_file]\n\ndefault output_file is parsed.yml\n\n"
  exit
end

parser = ScientificNameParser.new
output = ARGV[1] || 'parsed.yml'
o = File.open(output, 'w')

# Parse a file with one scientific name per line.
# count  = names that could not be parsed
# count2 = total names read
count = count2 = 0
names = []
IO.foreach(ARGV[0]) do |n|
  puts 'Parsing names' if count2 == 0
  count2 += 1
  # Print a progress marker every 5000 names.
  p count2 if count2 % 5000 == 0
  n.strip!
  name_dict = {:input => n}
  parsed = parser.parse n
  unless parsed
    name_dict[:details] = {:parsed => false}
    count += 1
  else
    name_dict[:output] = parsed.value
    name_dict[:canonical] = parsed.canonical
    name_dict[:details] = parsed.details
    # Mark the record as successfully parsed.
    name_dict[:parsed] = true
  end
  names << name_dict
end

$KCODE = 'UTF8'
puts "Converting results to YAML"
results = YAML.dump(names)
puts "Writing results to #{output} file"
o.write(results)
o.close
puts "Found #{count2} records, #{count} of them could not be parsed."
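The script above writes a YAML list with one record per input name, and a name that could not be parsed carries `:details => {:parsed => false}`. As a rough sketch of how that output might be inspected afterwards, the snippet below round-trips a small sample through YAML and selects the failed records. The helper `unparsed_records` and the sample data are hypothetical and not part of the gem. The `:details` value of the successful record is only a placeholder, because the structure returned by `parsed.details` is not shown in this file.

```ruby
require 'yaml'

# Hypothetical helper: select the records nnparse marked as unparseable.
# A failed record carries :details => {:parsed => false}.
def unparsed_records(records)
  records.select do |rec|
    details = rec[:details]
    details.is_a?(Hash) && details[:parsed] == false
  end
end

# Hypothetical sample data. The :details value of the parsed record is a
# placeholder; the gem's real structure is not shown in this file.
sample = [
  { :input => 'Homo sapiens', :output => 'Homo sapiens', :details => 'placeholder' },
  { :input => '12345', :details => { :parsed => false } }
]

# Round-trip through YAML, the same serialization the script uses.
records = YAML.load(YAML.dump(sample))
failed = unparsed_records(records)
puts "#{records.size} records, #{failed.size} could not be parsed."
```

For a real run, `records` would presumably come from loading the output file (by default `parsed.yml`) instead of the in-memory sample.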
Version data entries
2 entries across 2 versions & 1 rubygems
Version | Path |
---|---|
dimus-biodiversity-0.0.5 | bin/nnparse |
dimus-biodiversity-0.0.6 | bin/nnparse |