## Table of Contents

* [Scope](#scope)
* [Motivation](#motivation)
* [Installation](#installation)
* [Usage](#usage)
  * [Loader](#loader)
  * [Filters](#filters)
    * [Defaults](#defaults)
    * [Custom](#custom)
  * [Worker](#worker)
    * [Batches](#batches)
    * [Enumerator](#enumerator)
    * [Logger](#logger)
    * [Factory](#factory)

## Scope
This gem is aimed to collect a set of file paths starting by a wildcard rule, filter them by any default/custom filters (access time, matching name and size range) and apply a set of actions via a block call.

## Motivation
This gem is helpful to purge obsolete files or to promote relevant ones, by calling external services (CDN APIs) and/or local file system actions (copy, move, delete, etc).

## Installation
Add this line to your application's Gemfile:
```ruby
gem "file_scanner"
```

And then execute:
```shell
bundle
```

Or install it yourself as:
```shell
gem install file_scanner
```

## Usage

### Loader
The first step is to create a `Loader` instance by specifying the path where the files need to be scanned with optional extensions list:
```ruby
require "file_scanner"

loader = FileScanner::Loader.new(path: ENV["HOME"], extensions: %w[html txt])
```

### Filters
The second step is to provide the filters list to select file paths for which the `call` method is *truthy*.  
Selection is done with the `any?` predicate, so also one matching filter will do the selection.

#### Defaults
If you specify no filters the default ones are loaded, selecting files by:
* checking if file is older than *30 days* 
* checking if file size is within *0KB and 5KB*
* checking if file *basename matches* the specified *regexp* (if any)

You can update default filters behaviours by passing custom arguments:
```ruby
a_week_ago = FileScanner::Filters::LastAccess.new(Time.now-7*24*3600)
one_two_mb = FileScanner::Filters::SizeRange.new(min: 1024**2, max: 2*1024**2)
hidden = FileScanner::Filters::MatchingName.new(/^\./)
filters = [a_week_ago, one_two_mb, hidden]
```

#### Custom
It is convenient to create custom filters by creating `Proc` instances that satisfy the `callable` protocol:
```ruby
filters << ->(file) { File.directory?(file) }
```

### Worker
Now that you have all of the collaborators in place, you can create the `Worker` instance to performs actions on the filtered paths:
```ruby
worker = FileScanner::Worker.new(loader: loader, filters: filters)
worker.call do |paths|
  # do whatever you want with the paths list
end
```

#### Batches
In case you are going to scan a large number of files, it is suggested to work in batches.  
The `Worker` constructor accepts a `slice` attribute to give you a chance to distribute loading:
```ruby
worker = FileScanner::Worker.new(loader: loader, slice: 1000)
worker.call do |slice|
  # perform action 1000 paths per time
end
```

#### Enumerator
In case you want access the sliced enumerator directly, just do not pass a block to the method:
```ruby
slices = worker.call
count = slices.flatten.size
```

#### Logger
If you dare to trace what the worker is doing (including errors), you can specify a logger to the worker class:
```ruby
my_logger = Logger.new("my_file.log")
worker = FileScanner::Worker.new(loader: loader, logger: my_logger)
worker.call do |slice|
  fail "Doh!" # will log error to my_file.log and re-raise exception
end
```

If you want to easily pass the same logger instance to the actions you are performing, it's available as the second argument of the block:
```ruby
require "fileutils"

worker.call do |slice, logger|
  logger.info { "going to remove #{slice.size} files from disk!" }
  FileUtils.rm_rf(slice)
end
```

#### Factory
You can create loader and worker instances at once by using the available factory:
```ruby
worker = FileScanner::Worker.factory(path: ENV["HOME"], extensions: %w[html txt], filters: filters, logger: my_logger, slice: 1000)
worker.call do |slice, logger|
  # perform action 1000 paths per time
end
```