zcc

Get Version

0.2.0

What

Z Copy Cataloging is a command line tool written in Ruby to make your MARC record copy cataloging faster and more accurate. The ‘Z’ may stand for zippy if you really want it to.

THIS IS beta SOFTWARE. IT MIGHT MANGLE YOUR MARC AND CORRUPT YOUR CATALOG. I call it beta because I care about your data.

Requirements

Ruby 1.8

YAZ. I suggest adding the Index Data repositories for your distro (Debian or Red Hat) and installing YAZ that way. From a footnote to ruby-zoom: if you build from source, make sure you pass the --enable-shared option to the configure script before building YAZ; by default it does not build the shared libraries required by Ruby/ZOOM.

Gems

sudo gem install zcc marc zoom unicode highline term-ansicolor

Linux? ZCC has only been tested on Linux (Debian Etch). It probably won’t work under other operating systems, but hopefully works with different distros. Feedback is appreciated on how it functions under other operating systems. I’m willing to try to make it work if there is enough interest.

The gems bin folder must be added to your PATH. For me it is /var/lib/gems/1.8/bin or /usr/bin

Repository

A new gem is super easy to release to rubyforge, so expect that the gem on rubyforge is up-to-date for working features, though not necessarily for small changes. If you want to build the gem yourself, it will also be super easy once you set things up.

To make sure you have all the dependencies needed to build the gem:

$ sudo gem install newgem --include-dependencies

Grab the latest from the svn repository:

$ svn checkout svn://rubyforge.org/var/svn/zcc/zcc/trunk

To build and install the gem as a user most easily, the user must be added to the sudoers list. On my system I do this by running visudo as root. For more information on sudoers and possible settings that may be more secure consult: Sudoers Manual

You can add the following lines using visudo:

Cmnd_Alias GEM=/usr/bin/gem
user     ALL=GEM

Now as a user you can build and install the gem on your own system:

$ cd zcc
$ rake local_deploy

or just

$ rake install_gem

For more tasks take a look at the output of:

$ rake -T 

Features

Z39.50 search for records. Configure ZCC for as many targets as you like grouped in order of preference. A relatively current list of targets is provided in proper yaml format based on the targettest list (see examples/zservers.yaml). Searches are grouped so if you find the perfect record with the first batch of targets other targets do not have to be searched. Include your preferred zservers in a lower group to get better quality records faster.
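The grouped search described above can be sketched as follows. This is a minimal illustration, not ZCC's code: the YAML layout, the target names and hosts, and the `search_by_group` helper are all assumptions made for the example, and the real examples/zservers.yaml may be organized differently. The searcher is a stand-in lambda rather than a real Z39.50 connection.

```ruby
require 'yaml'

# Hypothetical layout: a list of groups, each group a list of targets.
# The real examples/zservers.yaml may look different.
ZSERVERS_YAML = <<~YAML
  - - name: Local catalog
      host: localhost
      port: 9999
  - - name: Big library A
      host: z3950.example.org
      port: 210
    - name: Big library B
      host: z3950.example.net
      port: 210
YAML

# Search one group at a time and stop at the first group that returns
# anything. `searcher` is any callable taking (target, query) and
# returning an array of records.
def search_by_group(groups, query, searcher)
  groups.each_with_index do |targets, index|
    records = targets.flat_map { |target| searcher.call(target, query) }
    return { group: index, records: records } unless records.empty?
  end
  { group: nil, records: [] }
end

groups = YAML.safe_load(ZSERVERS_YAML)

# Stand-in searcher: only "Big library B" has the title.
fake_searcher = lambda do |target, query|
  target['name'] == 'Big library B' ? ["#{query} (from #{target['name']})"] : []
end

result = search_by_group(groups, 'Moby Dick', fake_searcher)
puts "Found #{result[:records].size} record(s) in group #{result[:group]}"
```

Because the loop returns as soon as a group yields records, later groups are never contacted when an earlier group already has a hit.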

Search by Title, ISBN or LCCN. Three searches are currently supported. From the same prompt you may search by Title, ISBN (no dashes) or LCCN (with dash).
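One way a single prompt could tell the three searches apart is sketched below. The detection rules and the helper names `classify_query` and `to_pqf` are guesses made for this example, not ZCC's actual logic. The Bib-1 Use attribute numbers (4 for Title, 7 for ISBN, 9 for LC card number) come from the Z39.50 attribute set.

```ruby
# Bib-1 Use attributes: 4 = Title, 7 = ISBN, 9 = LC card number.
USE_ATTRIBUTE = { title: 4, isbn: 7, lccn: 9 }.freeze

# Guess the search type from the shape of the input (assumed rules).
def classify_query(input)
  term = input.strip
  if term =~ /\A(\d{9}[\dXx]|\d{13})\z/
    :isbn   # 10 or 13 characters, no dashes
  elsif term =~ /\A\d{2,4}-\d+\z/
    :lccn   # digits, a dash, more digits
  else
    :title
  end
end

# Build a Prefix Query Format (PQF) string for the term.
# Quotes inside the term are not escaped in this sketch.
def to_pqf(input)
  term = input.strip
  %(@attr 1=#{USE_ATTRIBUTE[classify_query(term)]} "#{term}")
end

puts to_pqf('0596516177')
puts to_pqf('2001-12345')
puts to_pqf('Programming Ruby')
```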

Sort your result sets by relevancy (title search only and on by default), date, content (AACR, ISBD), or any chosen subfield (first instance only).
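Sorting on the first instance of a chosen subfield might look like the sketch below. The record shape (arrays of fields with a tag and code/value pairs) and the helper names are stand-ins for illustration; ZCC works on real MARC records. Records that lack the subfield are placed last here, which is an assumption about the desired behavior.

```ruby
# Stand-in records: arrays of fields, each with a tag and [code, value]
# pairs. This shape exists only for the sketch.
RECORDS = [
  [{ tag: '245', subfields: [%w[a Walden]] },
   { tag: '260', subfields: [%w[c 1995]] }],
  [{ tag: '245', subfields: [%w[a Emma]] },
   { tag: '260', subfields: [%w[c 2003], %w[c 1816]] }],
  [{ tag: '245', subfields: [%w[a Ulysses]] }]
].freeze

# Value of the first matching subfield, or nil when the record lacks it.
def first_subfield(record, tag, code)
  record.each do |field|
    next unless field[:tag] == tag

    pair = field[:subfields].find { |subfield_code, _| subfield_code == code }
    return pair[1] if pair
  end
  nil
end

# Sort on the first instance of tag$code; records without it go last.
def sort_by_subfield(records, tag, code)
  records.sort_by do |record|
    value = first_subfield(record, tag, code)
    [value.nil? ? 1 : 0, value.to_s]
  end
end

sort_by_subfield(RECORDS, '260', 'c').each do |record|
  puts "#{first_subfield(record, '245', 'a')}: #{first_subfield(record, '260', 'c').inspect}"
end
```

Note that the second record sorts on 2003, not 1816, because only the first 260$c is consulted.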

See the most important fields for copy cataloging when choosing correct records. In the initial list view ZCC presents the user with the full 245 (title and statement of responsibility) and 300 (extent, which includes page numbers). This gives you a quick way to determine potential records. The fields displayed are configurable as of the 0.1.0 release, so you may include whatever fields you need.

View the full MARC record before choosing. ZCC allows you to see the full MARC record in a pretty line format. ZCC does not hide MARC from you, even as ugly as it can be. If there’s only one record in your result set then you see the full record.

Compare two records and choose the best. ZCC compares field by field (i.e., line by line) for matched fields. You can quickly see which fields come from which record. The comparison is similar to the way diff works. Matches between records are denoted with an ‘m’ while one record is denoted with a plus-sign and the other with a minus-sign. TODO: improvements would allow for color coding differences.
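A rough sketch of such a comparison follows. It treats each record as a list of line-format field strings and uses simple membership tests; that approach, the `compare_records` name, and which record gets the plus or minus sign are assumptions for illustration, and ZCC's own matching may differ.

```ruby
# Lines present in both records are marked 'm', lines only in the first
# record '-', and lines only in the second record '+'.
def compare_records(first, second)
  marked = first.map { |line| [second.include?(line) ? 'm' : '-', line] }
  marked += second.reject { |line| first.include?(line) }.map { |line| ['+', line] }
  # Order by MARC tag (first three characters), keeping the original
  # order within a tag, so differing fields sit next to each other.
  marked.each_with_index
        .sort_by { |(_marker, line), index| [line[0, 3], index] }
        .map { |(marker, line), _index| "#{marker} #{line}" }
end

record_a = [
  '245 10 $a Walden / $c Henry David Thoreau.',
  '300    $a 352 p. ;',
  '650  0 $a Nature.'
]
record_b = [
  '245 10 $a Walden / $c Henry David Thoreau.',
  '300    $a 352 p. : $b ill. ;'
]

puts compare_records(record_a, record_b)
```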

Optional: Check the record for common errors. If you have Perl’s MARC::Lint installed you can see if all your indicators have proper values and only repeatable fields repeat. An added feature is determining what encoding level (ISBD, AACR2) the record is in.

Choose records from different targets and then choose among them to find the best record. In the end you may just want one record and you want the best one. All the same features available for each target are also available for all the records chosen from all targets, so users can pick the best record for their location. This is called the winnowing stage. View the full records, compare two records, and optionally lint records. This allows you to choose possible records quickly from various targets and then make a final decision later on. Choose all the records, none of them or just one for processing.

Scripted changes to records. ZCC is highly configurable for making scripted changes to each record, like macros. Want to delete all 852 or 9XX fields from a record before importing into your library system? ZCC can do that and more. Add fields and subfields with pre-determined information or prompt for information. With version 0.2.0 you can now set up more than one scripting profile and choose which profile to use on a per-record basis. The ‘start’ and ‘end’ scripts run before and after the script you choose. The previous examples are easy to configure. With a little Ruby scripting you can make more complex changes. A few sample scripts are provided that do things like take the Dewey call number from the 082a field and copy it to the fields Koha uses for call numbers; if there is no 082a field, the script prompts for the proper call number parts. Don’t need scripting? Just turn it off. Need a script that isn’t available and you don’t know any Ruby? Let me know and we’ll see if we can work something out. TODO: Put each script in a separate file in a ‘scripts’ directory under the ZCC root directory. Make writing plugins in Ruby easier.
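To give a feel for profile-based scripting, here is a small sketch. The record shape, the helper names `delete_fields`, `add_field` and `run_profile`, the profile names, and the 942 value are all hypothetical; the real scripts are configured in zcc.yaml and operate on MARC records.

```ruby
# Stand-in record: an array of { tag:, value: } hashes.
def delete_fields(record, pattern)
  record.reject { |field| field[:tag] =~ pattern }
end

# Append a field and keep the record in tag order (stable for equal tags).
def add_field(record, tag, value)
  (record + [{ tag: tag, value: value }])
    .sort_by.with_index { |field, index| [field[:tag], index] }
end

# Each profile is a list of steps; 'start' and 'end' wrap the chosen one.
PROFILES = {
  'start' => [->(record) { delete_fields(record, /\A(852|9\d\d)\z/) }],
  'books' => [->(record) { add_field(record, '942', '$c BOOK') }],
  'none'  => [],
  'end'   => []
}.freeze

def run_profile(record, name)
  steps = PROFILES['start'] + PROFILES.fetch(name) + PROFILES['end']
  steps.reduce(record) { |current, step| step.call(current) }
end

record = [
  { tag: '245', value: '$a Walden' },
  { tag: '852', value: '$a Some other library' },
  { tag: '949', value: '$a local data' }
]

run_profile(record, 'books').each { |field| puts "#{field[:tag]} #{field[:value]}" }
```

Each step returns a new array instead of modifying the record in place, so the original record stays available if the user decides not to keep the changes.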

Output to CSV file. Want to print labels or keep statistics? ZCC allows you to choose which subfields you would like to export into a CSV file. Programs like glabels can accept CSV files for custom labelmaking. CSV files can also be imported into spreadsheet programs like OpenOffice Calc or Excel. (If a value is not found for a particular field ZCC will prompt the user for input, which can be blank. I’m considering turning off this feature.)
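A minimal sketch of exporting chosen subfields with Ruby's csv library is shown below. The column choices and the hash-based record shape are illustrative assumptions. Unlike the behavior described above, this sketch leaves missing values as empty cells rather than prompting.

```ruby
require 'csv'

# Columns to export, as [tag, subfield code] pairs (illustrative choices).
EXPORT_COLUMNS = [%w[245 a], %w[082 a], %w[100 a]].freeze

# Stand-in records keyed by "tag$code"; real MARC access would differ.
records = [
  { '245$a' => 'Walden', '082$a' => '818.303', '100$a' => 'Thoreau, Henry David' },
  { '245$a' => 'Emma', '100$a' => 'Austen, Jane' }
]

keys = EXPORT_COLUMNS.map { |tag, code| [tag, code].join('$') }

csv_text = CSV.generate do |csv|
  csv << keys                         # header row
  records.each do |record|
    # A missing subfield yields nil, which CSV writes as an empty cell.
    csv << keys.map { |key| record[key] }
  end
end

puts csv_text
```

Values containing commas, such as personal names, are quoted automatically, so the file opens cleanly in spreadsheet or label programs.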

Subfield editing. Since version 0.0.3 there is a facility for editing subfields. Have you found a good record but one or two subfields are different from the item you have in hand? You can still accept the record and make small edits to existing subfields. ZCC lets you pick the subfield you want to edit when there are repeatable fields and subfields. It is not a replacement for a MARC editor, but for copy cataloging it is probably good enough in many cases. It still lacks some features, like changing the timestamp.

Full record editing. Version 0.2.0 now has a full MARC editor. Well, sort of. ZCC uses yaz-marcdump to turn the file into line format. It then opens up this line formatted record in your favorite editor (vim by default). Once you edit the record and save it, yaz-marcdump translates the record from line format back into MARC format.
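The round trip can be sketched roughly as below. The -i and -o format names (marc, line) are taken from the yaz-marcdump manual, but whether ZCC passes exactly these flags is an assumption, as are the helper names and the use of the EDITOR environment variable. Character set options are left out. Nothing external runs unless `run_edit` is called, and that requires YAZ and an editor to be installed.

```ruby
require 'tmpdir'

# MARC (ISO 2709) to line format.
def to_line_command(marc_path)
  ['yaz-marcdump', '-i', 'marc', '-o', 'line', marc_path]
end

# Line format back to MARC.
def to_marc_command(line_path)
  ['yaz-marcdump', '-i', 'line', '-o', 'marc', line_path]
end

# vim is the fallback, matching the default mentioned above.
def editor_command(line_path, editor = ENV.fetch('EDITOR', 'vim'))
  [editor, line_path]
end

# Convert, edit, convert back, overwriting the original MARC file.
def run_edit(marc_path)
  Dir.mktmpdir do |dir|
    line_path = File.join(dir, 'record.txt')
    File.write(line_path, IO.popen(to_line_command(marc_path), &:read))
    system(*editor_command(line_path))
    File.binwrite(marc_path, IO.popen(to_marc_command(line_path), &:read))
  end
end

puts to_line_command('record.mrc').join(' ')
puts to_marc_command('record.txt').join(' ')
```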

Independent features. You may turn on or off particular features. Don’t want your copy catalogers to do more than minimal editing? Turn off the full record editing and only allow subfield editing.

Update zebra records. You can now use ZCC to update records that are already in your database. The 901$a field is reserved for ZCC, so when you select a record from your own database, edit it and save it again, ZCC matches on that field and updates the record with the edited version. You’ll probably want to select ‘none’ if you have scripting turned on, or you may want to have a yaml config file just for editing records that are already in your zebra database. TODO: Allow the recordID field to be configurable.

Sample simple zebra set up to enable a localhost. See: Very simple setup of local zebra server

Configuration

Because of all the scripting and csv features zcc requires a lot of configuration to get exactly what you want out of it. Currently the configuration files are my own. I use them and they work for my purposes, but they will not work for yours. Once you have configured zcc, though, you should be copy cataloging much faster.

An example zcc.yaml config file can be found in the examples directory of the gem. For instance if in your home directory you have a directory .zcc for all ZCC related configuration then:

$ cd ~/.zcc
$ cp -r /var/lib/gems/1.8/gems/zcc-0.2.0/examples/* .

Or if you use Debian and you’ve updated rubygems to use a version other than the one in apt, you might find it here:

$ cp -r /usr/lib/ruby/gems/1.8/gems/zcc-0.2.0/examples/* .

Edit the file zcc.yaml to your liking. YOU MUST AT LEAST CHANGE THE ROOT DIRECTORY. Detailed instructions are given in this file to aid in configuration. It refers you to some other configuration files for optional added configuration.

Now if you want your retrieved records to be indexed and searchable over Z39.50 via zebra check out this quick start page: Very simple setup of local zebra server

Use

If zcc.yaml is in the working directory: $ zcc

If your yaml configuration file is in a different directory, do something like:

$ zcc --yaml ~/.zcc/zcc.yaml

or

$ zcc -y /path/to/zcc_config.yaml

Your ZCC yaml configuration file can have a different name if the --yaml (-y) switch is used. It may also be placed in a different directory from the rest of your ZCC configuration and working files. You may wish to have multiple configuration files for different needs.
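For readers curious how such a switch is typically handled in Ruby, here is a small sketch using the standard optparse library. It is not ZCC's option-handling code; the `parse_options` name and the banner text are made up for the example, and the default simply mirrors the zcc.yaml in the working directory case described above.

```ruby
require 'optparse'

# Return the options hash for a given argument list.
def parse_options(argv)
  # Default: zcc.yaml in the current working directory.
  options = { yaml: File.join(Dir.pwd, 'zcc.yaml') }

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: zcc [--yaml FILE]'
    opts.on('-y', '--yaml FILE', 'Path to the YAML configuration file') do |path|
      options[:yaml] = File.expand_path(path)
    end
  end
  parser.parse(argv) # non-destructive; argv is left untouched

  options
end

p parse_options(%w[-y /path/to/zcc_config.yaml])
p parse_options([])
```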

  1. From the command line run zcc
  2. Next to each shown result is a number. Numbering starts with zero.
  3. ZCC command line: type ‘help’ to see the possible help options.

TODO

Smart character set conversion. Originally, incoming records were assumed to be MARC-8 and were converted to UTF-8. This is complex. Let me know what you need here. Versions 0.0.3 and later check leader byte 9 for the character encoding and either keep the record as UTF-8 or convert it from MARC-8 to UTF-8, using ruby-zoom’s xml method for the conversion.
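The leader check itself is small. In MARC 21, leader position 09 holds ‘a’ for UCS/Unicode and a blank for MARC-8. The sketch below only detects the encoding; the `record_encoding` name is made up for the example, and the actual conversion (done through ruby-zoom in ZCC) is not shown.

```ruby
# MARC 21 leader position 09: 'a' = UCS/Unicode, blank = MARC-8.
def record_encoding(leader)
  raise ArgumentError, 'leader must be 24 characters' unless leader.length == 24

  leader[9] == 'a' ? :utf8 : :marc8
end

puts record_encoding('00714cam a2200205 a 4500')
puts record_encoding('00714cam  2200205 a 4500')
```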

TUI. If there is interest in this script, I’m hoping to make a nice Text User Interface with curses or ncurses. Currently everything just scrolls up the terminal. Since v. 0.0.3 there are some nicer TUI elements like highlighting. With version 0.1.0, there’s a much nicer configurable TUI which uses the highline and term-ansicolor gems.

Automatic retrieval of authority records. I already have a separate script in the works that can retrieve authority records for names (not subjects). I’d like to work that as an option into the main script.

Unit Testing. While I’ve tested the script in my own work, I need to do more formal testing of the methods.

Exception handling. Currently there’s not great exception handling for all potential errors that may arise in the script. This needs to change, but hasn’t caused problems for me yet.

Internationalize. This can start by not hardcoding any fields and allowing them to be configurable. The initial display of fields on search of a target should show the relevant fields in the user’s preferred MARC flavor. You now have the choice of which fields display by default. If there is interest, I’d be willing to try making other text translatable.

Turn off/on displayed fields/subfields

Move all configuration to YAML file. Currently some configuration is made in the main script and some in the YAML file. As of version 0.2.0, the yaml file is getting quite long so I’m looking at ways to break out long configuration files like zservers.yaml and scripts while still allowing everything to be self-contained in one file if that is the desire.

Create full rdoc documentation.

Editing has been added now as two modules: simple subfield editing or full record editing. Still to do: Change the timestamp automatically and allow for adding subfields like 040d upon modification.

Suggestions

Perl and MARC::Lint To have error checking of records turned on you must also have Perl and MARC::Lint installed.

Using Koha with zcc

ZCC can now use Koha2’s bulkmarcimport script to insert records directly into the Koha database. For more information see this page: using Koha with zcc

Help/patches

email: Jason Ronallo

For bug reports: If relevant, please include z-target, search term and error messages.

Will work for Ruby books

If you’d like to sponsor the addition of a feature to ZCC or need a change to better meet your workflow, I will work for Ruby books. Up to now I’ve requested Programming Ruby, Ruby Cookbook, Agile Web Development with Rails and others from libraries via Inter-Library Loan. I never get to keep them as long as I’d like. I’d like to have my own copies of these invaluable resources.

License

Copyright © 2007 Jason Ronallo

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

Contact

Comments are welcome. Send an email to Jason Ronallo.