#--
# =============================================================================== 
# Copyright (c) 2005, Christopher Kleckner
# All rights reserved
#
# This file is part of the Rio library for ruby.
#
# Rio is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# Rio is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Rio; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
# =============================================================================== 
#++
#
# To create the documentation for Rio run the command
#  rake rdoc
# from the distribution directory. Then point your browser at the 'doc/rdoc' directory.
#
# Suggested Reading
# * RIO::Doc::SYNOPSIS
# * RIO::Doc::INTRO
# * RIO::Doc::HOWTO
# * RIO::Rio
#
# <b>Rio is pre-alpha software. 
# The documented interface and behavior is subject to change without notice.</b>


module RIO
module Doc
=begin rdoc


The following example are provided without comment

 array = rio('afile').readlines

 rio('afile') > rio('acopy')
 
 ary = rio('afile').chomp.lines[0...10]

 rio('adir').rename.all.files('*.htm') do |file| 
   file.ext = '.html'
 end

A basic familiarity with ruby and shell operations should allow a casual reader to guess what
these examples will do. How they are being performed may not be what a casual reader might expect.
I will explain these example to illustrate the Rio basics.

For many more examples please read the HOWTO document and the rdoc documentation.

== Example 1.

 array = rio('afile').readlines

This uses IO#readlines to read the lines of 'afile' into an array.

=== Creating a Rio

Rio extends the module Kernel by adding one function _rio_, which acts as a constructor returning a Rio. This 
constructor builds a description of the resource the Rio will access (usually a path). It does not open the
resource, check for its existance, or do anything except remember its specifcation. _rio_ returns the Rio 
which can be chained to a Rio method as in this example or stored in a variable. This coud have been written

 ario = rio('afile')
 array = ario.readlines

 ario = rio('afile')
In this case the resource specified is a relative path. After the first line
the Rio does know or care whether it
is a path to a file nor whether it exists. Rio provides many methods that only deal with a resource
at this level, much as the standard library classes Pathname and URI. It should be noted at this
point that Rio paths stored internally as a URL as specified in RFC 1738 and therefore use slashes as
separators. A resource can also be specified without separators, because _rio_ interprets multiple arguments
as parts of a path to be joined, and an array as an array of parts to be joined. So the following
all specify the same resource.

 rio('adir/afile')
 rio('adir','afile')
 rio(%w/adir afile/)

The rio constructor can be used to specify non-file-system resources, but for this example we will restrict 
our discussion to paths to entities on file-systems.

 array = ario.readlines
Now that we have a Rio, we can call one of its methods; in this case _readlines_. This is an example of using
a Rio as a proxy for the builtin IO#readlines. Given the method _readlines_, the Rio opens 'afile' for reading, 
calls readlines on the resulting IO object, closes the IO object, and returns the lines read.

== Example 2

 rio('afile') > rio('acopy')
 
This copies the file 'afile' into the file 'acopy'.

The first things that happen here are the creation of the Rios. As described in Example 1, when created
a Rio simply remembers the specifcation of its resource. In this case, a relative path 'afile' on the
left and a relative path 'acopy' on the right.

Next the Rio#> (copy-to) method is called on the 'afile' Rio with the 'acopy' Rio as its argument. If that
looks like a greater-than operator to you, think Unix shell, with Rios '>' is the copy-to operator.

Upon seeing the copy-to operator, the Rio has all the information it needs to proceed. It determines that
it must be opened for reading, that its argument must be opened for writing, and that it's contents must
be copied to the resource referenced by it' argument -- and that is what it does. Then it closes itself and
its argument. 

Consider if we had written this example this way.

 afile = rio('afile')
 acopy = rio('acopy')
 afile > acopy

In this case we would still have variables referencing the Rios, and perhaps we would like do things a little 
differently than described above. Be assured that the selection of mode and automatic closing of files are the
default behaviour and can be changed. Say we wanted 'afile' to remain open so that we could rewind it and make 
a second copy, we might do something like this:

 afile = rio('afile').nocloseoneof
 afile > rio('acopy1')
 afile.rewind > rio('acopy2')
 afile.close

Actually the 'thinking-process' of the Rio when it sees a copy-to operator is much more complex that described above.
If its argument had been a rio referencing a directory, it would not have opened itself for reading, 
but instead used FileUtils#cp to copy itself; if its argument had been a string, its contents would have ended up
in the string; If its argument had been an array, its lines would been elements of that array; if its argument had 
been a socket, the its contents would have been copied to the socket. See the documentation for details.

== Example 3.

 array = rio('afile').chomp.lines[0...10]

This fills +array+ with the first ten lines of 'afile', with each line chomped

The casual observer mentioned above might think that +lines+ returns an array of lines and that this
is a simple rewording of <tt>array = rio('afile').readlines[0...10]</tt> or even of 
<tt>array = File.new('afile').readlines[0...10]</tt>. They would be wrong.

+chomp+ is a configuration method which turns on chomp-mode and returns the Rio. Chomp-mode causes all
line oriented read operations to perform a String#chomp on each line

=== Reading files

Rio provides four methods to select which part of the file is read and how the file is divided. They are +lines+,
+records+, +rows+ and +bytes+. Briefly, +lines+ specifies that the file should be read line by line and +bytes(n)+
specifies that the file should be read in _n_ byte chunks. All four take arguments which can be used to
filter lines or chunks in or out. For simple Rios +records+ and +rows+ only specify the filter arguments and
are provided for use be extensions. For example, the CSV extension returns an array of the columns in a line
when +records+ is used. In the absence of an extension +records+ and +rows+ behave like +lines+.

First lets rewrite our example as:

 array = rio('afile').chomp.lines(0...10).to_a

The arguments to lines specify which records are to be read. 
Arguments are interpreted based on their class as follows:
* Range - interpreted as a range of record numbers to be read
* Integer - interpreted as a one-element range
* RegExp - only matching records are processed
* Symbol - sent to each record, which is processed unless the result is false or nil
* Proc - called for each record, the record is processed unless the return value is false or nil
See the documentation for details and examples.

In our example we have specified the Range (0...10). The +lines+ method is just configuring the Rio, it does 
not trigger
any IO operation. The fact that it was called and the arguments it was called with are stored away and the Rio 
is returned for further configuration or an actual IO operation. When an IO operation is called the Range will be
used to limit processing to the first ten records. For example:
 rio('afile').lines(0...10).each { |line| ... } # block will be called for the first 10 records
 rio('afile').lines(0...10).to_a                # the first 10 records will be returned in an array
 rio('afile').lines(0...10) > rio('acopy')      # the first 10 records will be copied to 'acopy'

"But wait", you say, "In our original example the range was an argument to the subscript operator, not to +lines+". 
This works because the subscript operator processes its arguments as if they had been arguments to the
most-recently-called selection method and then calls +to_a+ on the rio. So our rewrite of the example
does precisely the same thing as the original

The big difference between the original example and the casual-observer's solution is that hers
creates an array of the entire contents and only returns the first 10 while the original only puts
10 records into the array.

As a sidenote, Rios also have an optimization that can really help in certain situations. If records are only
selected using Ranges, it stops iterating when it is beyond the point where it could possibly ever match. This
can make a dramatic difference when one is only interested in the first few lines of very large files.

== Example 4.

 rio('adir').rename.all.files('*.htm') do |file| 
   file.ext = '.html'
 end

This changes the extension of all .htm files below 'adir' to '.html'

First we create the rio as always.

Next we process the +rename+ method. When used as it is here -- without arguments -- it just turns on rename-mode
and returns the Rio.

+all+ is another configuration method, which causes directories to be processed recursively

+files+ is another configuration method. In example 3 we used +lines+ to select what to process when
iterating through a file. +files+ is used to select what to process when iterating through 
directories. The arguments to +files+ can be the same as those for +lines+ except that Ranges can not
be used and globs can.

In our example, the argument to +files+ is a string which is treated as a glob. As with +lines+, +files+
does not trigger any IO, it just configures the Rio.

=== There's no action

The previous examples had something that triggered IO: +readlines+, +to_a+, +each+, <tt>> (copy-to)</tt>. This example
does not. This example illustrates Rio's 'implied each'. All the configuration methods will call each for you
if a block is given. So, because a block follows the files method, it calls +each+ and passes it the block.

Let's recap. At this point we have a Rio with a resource specified. We have configured with a couple of modes, 
'rename', and 'all', and we have limited the elements we want to process to entries that are files and
match the glob '*.htm'. +each+ causes the Rio to open the directory and call the block for each entry that is
both a file and matches the glob. It was also configured with +all+,so it descends into subdirectories to
find further matches and calles the block for each of them. The argument passed to the block is a Rio
referencing the entry on the file-system. 

The _rename_mode_ we set has had no effect on our iteration at all, so why is it there? In general, 
configuration options that are not applicable to a Rio are silently ignored, however, for directories
some of them are passed on to the Rios for each entry when iterating. Since +rename+ is one such option,
The example could have been written:

 rio('adir').all.files('*.htm') do |file| 
   file.rename.ext = '.html'
 end

The rename-with-no-args method affects the behaviour of the <tt>ext=</tt> option. In this case,
setting it for the directory, rather than for each file in the block seems to make the intent
of the code more clear, but that is a matter of personal taste. See the documentation for more 
information on the rename-with-no-args method

== Suggested Reading
* RIO::Doc::SYNOPSIS
* RIO::Doc::INTRO
* RIO::Doc::HOWTO
* RIO::Rio

=end
module MISC
end
end
end