## Table of Contents

* [Scope](#scope)
* [Motivation](#motivation)
* [Installation](#installation)
* [Usage](#usage)
  * [Loader](#loader)
  * [Filters](#filters)
    * [Defaults](#defaults)
    * [Custom](#custom)
  * [Worker](#worker)
    * [Mode](#mode)
    * [Batches](#batches)
    * [Limit](#limit)
    * [Enumerator](#enumerator)
    * [Logger](#logger)

## Scope
This gem is aimed to collect a set of file paths starting by a wildcard rule, filter them by any/all default/custom filters (access time, matching name and size range) and apply a set of actions via a block call.

## Motivation
This gem is helpful to purge obsolete files or to promote relevant ones, by calling external services (CDN APIs) and/or local file system actions (copy, move, delete, etc).

## Installation
Add this line to your application's Gemfile:
```ruby
gem "file_scanner"
```

And then execute:
```shell
bundle
```

Or install it yourself as:
```shell
gem install file_scanner
```

## Usage

### Loader
The first step is to create a `Loader` instance by specifying the path where the files need to be scanned with optional extensions list:
```ruby
require "file_scanner"

loader = FileScanner::Loader.new(path: ENV["HOME"], extensions: %w[html txt])
```

### Filters
The second step is to provide the filters list to select file paths for which the `call` method is *truthy*.  

#### Defaults
If you specify no filters the default ones are loaded, selecting files by:
* checking if file is older than *30 days* 
* checking if file size is within *0KB and 5KB*
* checking if file *basename matches* the specified *regexp* (if any)

You can update default filters behaviours by passing custom arguments:
```ruby
a_week_ago = FileScanner::Filters::LastAccess.new(Time.now-7*24*3600)
one_two_mb = FileScanner::Filters::SizeRange.new(min: 1024**2, max: 2*1024**2)
hidden = FileScanner::Filters::MatchingName.new(/^\./)
filters = [a_week_ago, one_two_mb, hidden]
```

#### Custom
It is convenient to create custom filters by creating `Proc` instances that satisfy the `callable` protocol:
```ruby
filters << ->(file) { File.directory?(file) }
```

### Worker
Now that you have all of the collaborators in place, you can create the `Worker` instance to performs actions on the filtered paths:
```ruby
worker = FileScanner::Worker.new(loader: loader, filters: filters)
worker.call do |paths|
  # do whatever you want with the paths list
end
```

### Mode
By default the worker will select paths by applying any of the matching filters: this is it, it suffice just one of the specified filters to be true to grab the path.  
In case you want restrict paths selection by all matching filters, just specify it:
```ruby
worker = FileScanner::Worker.new(loader: loader, filters: filters, all: true)
```

#### Batches
In case you are going to scan a large number of files, it is suggested to work in batches.  
The `Worker` constructor accepts a `slice` attribute to give you a chance to distribute loading:
```ruby
worker = FileScanner::Worker.new(loader: loader, slice: 1000)
worker.call do |slice|
  # perform action 1000 paths per time
end
```

#### Limit
In case you are going to apply some heavy filtering upon the selected files (i.e. reading the file in memory to get some creepy data), you can found helpful to limit the number of retuned paths before applying any filtering:
```ruby
worker = FileScanner::Worker.new(loader: loader, slice: 1000, limit: 6000)
worker.call do |slice|
  # filters applied on a maximum of 6000 paths, working a slice of 1000 files per time
end
```

#### Enumerator
In case you want access the sliced enumerator directly, just do not pass a block to the method:
```ruby
slices = worker.call
count = slices.flatten.size
```

#### Logger
If you dare to trace what the worker is doing (including errors), you can specify a logger to the worker class:
```ruby
my_logger = Logger.new("my_file.log")
worker = FileScanner::Worker.new(loader: loader, logger: my_logger)
worker.call do |slice|
  fail "Doh!" # will log error to my_file.log and re-raise exception
end
```

If you want to easily pass the same logger instance to the actions you are performing, it's available as the second argument of the block:
```ruby
require "fileutils"

worker.call do |slice, logger|
  logger.info { "going to remove #{slice.size} files from disk!" }
  FileUtils.rm_rf(slice)
end
```