[![Circle CI](https://circleci.com/gh/DMazzei/log-analyser.svg?style=shield)](https://app.circleci.com/pipelines/github/DMazzei/log-analyser)
[![Coverage Status](https://coveralls.io/repos/github/DMazzei/log-analyser/badge.svg?branch=master)](https://coveralls.io/github/DMazzei/log-analyser?branch=master)
[![Gem Version](https://badge.fury.io/rb/log-analyser.svg)](https://badge.fury.io/rb/log-analyser)
![GitHub code size in bytes](https://img.shields.io/github/languages/code-size/dmazzei/log-analyser)
![Gem](https://img.shields.io/gem/dv/log-analyser/stable)
 
# Log-Analyser

## About

A simple Ruby library that reads and parses web server log files and aggregates pageview data.

### TL;DR
<details>
<summary>check minimal instructions</summary>

Install the *log-analyser* gem, then instantiate its `PageviewsLogAggregator` class with the path to the logfile:

- the method `all` returns the pageview count for each page;
- the method `unique` returns the unique pageview count for each page.

</details>

### Table of Contents
<details>
<summary>click to expand the index</summary>

- [Installation](#installation)
    * [Gem](#gem)
    * [Project](#project)
- [Usage](#usage)
- [Logs and Pageviews](#logs-and-pageviews)
    * [Definitions](#definitions)
    * [Log Formatting](#log-formatting)
- [Development](#development)
- [Contributing](#contributing)
- [Next Steps](#next-steps)
- [License](#license)
    

</details>

## Installation

### Gem

To use *log-analyser* in your application, add this line to your Gemfile:

```ruby
gem 'log-analyser'
```

Or install it yourself as:

    $ gem install log-analyser

#### Gem Usage

```ruby
#!/usr/bin/env ruby

require 'pageviews_log_aggregator'

# Path to the logfile to analyse; adjust it to your environment.
file_path = '/Users/dmazzei/projects/personal/ruby/sp_test/log-analyser/resources/webserver.log'
log_aggregator = LogAnalyser::PageviewsLogAggregator.new(file_path)

# Total pageviews per page.
puts "\nAll pageviews"
log_aggregator.all.each do |key, value|
  puts "#{key&.to_s&.ljust(28, '.')} | #{value}"
end

# Unique pageviews per page.
puts "\nUnique pageviews"
log_aggregator.unique.each do |key, value|
  puts "#{key&.to_s&.ljust(28, '.')} | #{value}"
end
```

![image](https://user-images.githubusercontent.com/3502642/98482375-dbe8b880-21f8-11eb-853a-ea67acf643ae.png)

### Project

Install the Ruby version specified in `.ruby-version`, then clone the project and install Bundler:

```
git clone git@github.com:DMazzei/log-analyser.git
cd log-analyser
gem install bundler
```

#### Setup:

Run the initial setup

    $ bin/setup

> If you need to reinstall the dependencies:
> ```
> $ bundle install
> ```

#### Usage

Call `./bin/parse_pageview_file.rb`, passing the logfile path with `--file`; it prints the pageview count, ordered from most to least viewed.<br>
Check `--help` for more options.

![image](https://user-images.githubusercontent.com/3502642/98471556-0c265c00-21e5-11eb-8fc3-c029e09e41fa.png)

An example log can be found in the :file_folder:`resources` folder:

    $ ./bin/parse_pageview_file.rb --file 'resources/webserver.log'
    |--------------------------------------------------|
    | All pageviews                                    |
    |--------------------------------------------------|
    | /about/2.................... | 90                |
    | /contact.................... | 89                |
    | /index...................... | 82                |
    | /about...................... | 81                |
    | /help_page/1................ | 80                |
    | /home....................... | 78                |
    |--------------------------------------------------|
    
The `-u` or `--unique` option will also display the unique pageview count:

    $ ./bin/parse_pageview_file.rb --file 'resources/webserver.log' -u
    
And any specific page can be filtered with `-p` or `--page`:

    $ ./bin/parse_pageview_file.rb --file 'resources/webserver.log' -p '/index'
    |--------------------------------------------------|
    | View count for page: /index                      |
    |--------------------------------------------------|
    | All pageviews                                    |
    |--------------------------------------------------|
    | /index...................... | 82                |
    |--------------------------------------------------|
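
The ordering and `--page` filtering shown above can be approximated in a few lines of plain Ruby. This is only a sketch, not the script's actual implementation: `sort_by_views`, `filter_page` and `format_row` are hypothetical helpers, and the counts are copied from the sample output.

```ruby
# Counts copied from the sample output above.
counts = {
  '/index' => 82,
  '/about/2' => 90,
  '/home' => 78,
  '/contact' => 89,
  '/help_page/1' => 80,
  '/about' => 81
}

# Hypothetical helper: order pages from most to least viewed.
def sort_by_views(counts)
  counts.sort_by { |_page, views| -views }
end

# Hypothetical helper: keep only the requested page.
def filter_page(counts, page)
  counts.select { |name, _views| name == page }
end

# Hypothetical helper: pad the page name with dots, as in the gem usage example.
def format_row(page, views)
  "#{page.ljust(28, '.')} | #{views}"
end

sort_by_views(counts).each { |page, views| puts format_row(page, views) }
filter_page(counts, '/index').each { |page, views| puts format_row(page, views) }
```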

## Logs and Pageviews

### Definitions

> :page_facing_up: A pageview is defined as a view of a page on your site that is being tracked by the Analytics tracking code. If a user clicks reload after reaching the page, this is counted as an additional pageview. If a user navigates to a different page and then returns to the original page, a second pageview is recorded as well.

> :page_with_curl: A unique pageview, as seen in the Content Overview report, aggregates pageviews that are generated by the same user during the same session. A unique pageview represents the number of sessions during which that page was viewed one or more times.
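
The difference between the two counts can be illustrated with a short sketch. It assumes, since the log format carries no session information, that each distinct identifier stands for one visitor session; `count_all`, `count_unique` and the sample entries are hypothetical and not part of the gem's API.

```ruby
require 'set'

# Hypothetical sample entries: [page, visitor identifier] pairs.
entries = [
  ['/home', '184.123.665.067'],
  ['/home', '184.123.665.067'], # reload by the same visitor
  ['/home', '126.318.035.038'],
  ['/contact', '184.123.665.067']
]

# Every entry counts as one pageview, reloads included.
def count_all(entries)
  entries.each_with_object(Hash.new(0)) { |(page, _visitor), counts| counts[page] += 1 }
end

# Each distinct visitor counts only once per page.
def count_unique(entries)
  visitors = Hash.new { |hash, page| hash[page] = Set.new }
  entries.each { |page, visitor| visitors[page] << visitor }
  visitors.transform_values(&:size)
end

p count_all(entries)
p count_unique(entries)
```

With these entries, `/home` has three pageviews but only two unique pageviews, because one visitor reloaded the page.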


### Log Formatting

The library parses text files containing one entry per line, in the format `/page_name identifier`.

A space must separate the page name (first column) from the user identifier (e.g. IP address):

```
/help_page/1 126.318.035.038
/contact 184.123.665.067
/home 184.123.665.067
```
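
A line in this format can be split into its two columns as sketched below. `parse_line` is a hypothetical helper written for illustration, not the gem's parser, and returning `nil` for blank or incomplete lines is an assumption about how such input could be handled.

```ruby
# Hypothetical helper: split one log line into [page, identifier].
# Returns nil when the line is blank or has no identifier column.
def parse_line(line)
  page, identifier = line.strip.split(' ', 2)
  return nil if page.nil? || identifier.nil?

  [page, identifier.strip]
end

p parse_line("/help_page/1 126.318.035.038\n")
p parse_line('/home')
```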

## Development

#### Start with the project:

```
$ git clone git@github.com:DMazzei/log-analyser.git
$ cd log-analyser
$ gem install bundler
$ bundle install
```

And the world is your oyster...

You can also run `$ bundle exec console` for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `$ bundle exec rake install`. 
To release a new version, update the version number in `version.rb` and then run `$ bundle exec rake release`, which creates a git tag for the version, pushes git commits and tags, and pushes the `.gem` file to [rubygems.org](https://rubygems.org).

#### Linter (rubocop)

_*RuboCop*_ is used as a code analyser and to maintain consistent code formatting (as well as some best practices).

Use `$ bundle exec rake rubocop` to run the checks. 

#### Test coverage

[![Coverage Status](https://coveralls.io/repos/github/DMazzei/log-analyser/badge.svg?branch=master)](https://coveralls.io/github/DMazzei/log-analyser?branch=master)

Use `$ bundle exec rspec` or `$ bundle exec rake spec:all` to run all the tests.

:white_check_mark: To run only unit-tests

    $ bundle exec rake spec:unit

:white_check_mark: To run only integration tests

    $ bundle exec rake spec:integration

The test coverage is handled by `rspec`, `simplecov` and `coveralls`.<br>
After running the tests, a local version of the test coverage report (`coverage/index.html`) is available [here](http://localhost:63342/log-analyser/coverage/index.html).

Full status and coverage history can be checked online on [coveralls](https://coveralls.io/github/DMazzei/log-analyser). 

#### Deployment

Following the creation of a _*Pull Request*_, a CI workflow is triggered on CircleCI; it can be checked [here](https://app.circleci.com/pipelines/github/DMazzei/log-analyser).<br>
This workflow consists of _building_ the library, running _rubocop_ and _rspec_ to validate integrity and code quality, and lastly generating and pushing a _feature-gem_ that can be used for development and tests.

After passing all checks and requirements on GitHub, a *PR* can be merged as soon as it is reviewed and approved. 
Merging into the _*master branch*_ triggers the deployment workflow on CircleCI, which ends with the generation of a _*tagged-gem*_.

The deployment process finishes by building and tagging a new gem version and pushing it to [rubygems.org](https://rubygems.org/gems/log-analyser).

> :warning: To merge changes into _*master*_, the version must be bumped; otherwise the deployment will fail!<br>
> The version must be updated in `version.rb`.
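
Whether a version counts as bumped can be checked with RubyGems' own `Gem::Version` class, as in the sketch below. `version_bumped?` and the version numbers are hypothetical; this is not a check the project's CI is known to run.

```ruby
require 'rubygems'

# Hypothetical check: is the candidate version newer than the published one?
def version_bumped?(candidate, published)
  Gem::Version.new(candidate) > Gem::Version.new(published)
end

puts version_bumped?('0.1.10', '0.1.9') # true: compared segment by segment, not as strings
puts version_bumped?('0.1.9', '0.1.9')  # false: the version is unchanged
```

Comparing `Gem::Version` objects rather than plain strings matters here, because `'0.1.10' > '0.1.9'` is false as a string comparison.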

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/DMazzei/log-analyser.

## Next Steps
 
- One design trade-off that can be revisited is the choice between:
    * reading the file whilst aggregating data, preserving memory - e.g. using `Set`;
    * loading data into memory and leaving aggregation and counting for later, gaining flexibility and performance;
- Extend the accepted logfile format;
- Add more options for sorting and filtering;  
- Automate library version bump up;
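
The two options in the first item can be contrasted with a short sketch. All method names are hypothetical and the code is not taken from the gem; it assumes the two-column log format described in [Log Formatting](#log-formatting).

```ruby
require 'set'

# Option 1: stream the file line by line, keeping only one Set of
# identifiers per page in memory.
def aggregate_streaming(path)
  visitors = Hash.new { |hash, page| hash[page] = Set.new }
  File.foreach(path) do |line|
    page, identifier = line.split
    visitors[page] << identifier if page && identifier
  end
  visitors.transform_values(&:size)
end

# Option 2: load every well-formed entry first, then aggregate on demand.
def load_entries(path)
  File.readlines(path, chomp: true).map(&:split).select { |fields| fields.size == 2 }
end

# Total pageviews per page, computed from the loaded entries.
def total_counts(entries)
  entries.group_by(&:first).transform_values(&:size)
end

# Unique pageviews per page: drop duplicate [page, identifier] pairs first.
def unique_counts(entries)
  entries.uniq.group_by(&:first).transform_values(&:size)
end
```

The streaming version only answers the question it was written for (unique counts here), whereas the loaded entries can be re-aggregated in any way at the cost of holding the whole file in memory.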

## License

The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).