[Coverage Status](https://coveralls.io/github/DMazzei/log-analyser?branch=master)
[Gem Version](https://badge.fury.io/rb/log-analyser)
[CircleCI](https://app.circleci.com/pipelines/github/DMazzei/log-analyser)


# Log-Analyser
## About
A simple Ruby library that reads and parses web-server log files and aggregates pageview data.
### TL;DR
Install the *log-analyser* gem.
After instantiating *log-analyser's* `PageviewsLogAggregator` class with the path to the logfile:
- the `all` method returns the pageview count;
- the `unique` method returns the unique pageview count.
### Table of Contents
- [Installation](#installation)
  * [Gem](#gem)
  * [Project](#project)
- [Usage](#usage)
- [Logs and Pageviews](#logs-and-pageviews)
  * [Definitions](#definitions)
  * [Log Formatting](#log-formatting)
- [Development](#development)
- [Contributing](#contributing)
- [Next Steps](#next-steps)
- [License](#license)
## Installation
### Gem
To use *log-analyser* in your application, add this line to your Gemfile:
```ruby
gem 'log-analyser'
```
Or install it yourself:

    $ gem install log-analyser

#### Gem Usage
```ruby
#!/usr/bin/env ruby
require 'pageviews_log_aggregator'

# Path to the log file to analyse; adjust it to your log file's location.
file_path = 'resources/webserver.log'
log_aggregator = LogAnalyser::PageviewsLogAggregator.new(file_path)

# Print the total view count of each page.
puts "\nAll pageviews"
log_aggregator.all.each do |key, value|
  puts "#{key&.to_s&.ljust(28, '.')} | #{value}"
end

# Print the unique view count of each page.
puts "\nUnique pageviews"
log_aggregator.unique.each do |key, value|
  puts "#{key&.to_s&.ljust(28, '.')} | #{value}"
end
```

### Project
Install the Ruby version specified in `.ruby-version`, then clone the project and install Bundler:
```
git clone git@github.com:DMazzei/log-analyser.git
cd log-analyser
gem install bundler
```
#### Setup:
Run the initial setup:

    $ bin/setup

> If you need to reinstall the dependencies later, run:
> ```
> $ bundle install
> ```
#### Usage
Call `./bin/parse_pageview_file.rb` with a logfile path as an argument. It returns the pageview count, ordered from most to least viewed.
Check `--help` for more options.

An example log can be found in the :file_folder:`resources` folder:

    $ ./bin/parse_pageview_file.rb --file 'resources/webserver.log'
    |--------------------------------------------------|
    | All pageviews |
    |--------------------------------------------------|
    | /about/2.................... | 90 |
    | /contact.................... | 89 |
    | /index...................... | 82 |
    | /about...................... | 81 |
    | /help_page/1................ | 80 |
    | /home....................... | 78 |
    |--------------------------------------------------|

The `-u` or `--unique` option will also display the unique pageview count:

    $ ./bin/parse_pageview_file.rb --file 'resources/webserver.log' -u

Any specific page can be filtered with `-p` or `--page`:

    $ ./bin/parse_pageview_file.rb --file 'resources/webserver.log' -p '/index'
    |--------------------------------------------------|
    | View count for page: /index |
    |--------------------------------------------------|
    | All pageviews |
    |--------------------------------------------------|
    | /index...................... | 82 |
    |--------------------------------------------------|

## Logs and Pageviews
### Definitions
> :page_facing_up: A pageview is defined as a view of a page on your site that is being tracked by the Analytics tracking code. If a user clicks reload after reaching the page, this is counted as an additional pageview. If a user navigates to a different page and then returns to the original page, a second pageview is recorded as well.

> :page_with_curl: A unique pageview, as seen in the Content Overview report, aggregates pageviews that are generated by the same user during the same session. A unique pageview represents the number of sessions during which that page was viewed one or more times.
### Log Formatting
The library parses text files containing one entry per line, in the format `/page_name identifier`.
A space must separate the page name (first column) from the user identifier (e.g. IP address):
```
/help_page/1 126.318.035.038
/contact 184.123.665.067
/home 184.123.665.067
```
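
As a rough illustration of how entries in this format relate to the two counts defined above, here is a minimal sketch. It is not the gem's actual implementation: `parse_entries`, `count_all`, `count_unique` and `SAMPLE_LOG` are hypothetical names, and it assumes the identifier alone stands in for a user session. `Enumerable#tally` needs Ruby 2.7 or newer.

```ruby
# Hypothetical helper: split each entry on whitespace into [page, identifier]
# and drop lines that do not have exactly two columns.
def parse_entries(lines)
  lines.map(&:split).select { |fields| fields.size == 2 }
end

# All pageviews: every entry counts, including repeat visits.
def count_all(lines)
  parse_entries(lines).map(&:first).tally
end

# Unique pageviews: each (page, identifier) pair counts only once.
# Assumption: one identifier corresponds to one session.
def count_unique(lines)
  parse_entries(lines).uniq.map(&:first).tally
end

# Made-up sample data in the format described above.
SAMPLE_LOG = [
  '/help_page/1 126.318.035.038',
  '/contact 184.123.665.067',
  '/home 184.123.665.067',
  '/contact 184.123.665.067',
  '/home 126.318.035.038'
].freeze

puts 'All pageviews'
count_all(SAMPLE_LOG).each { |page, count| puts "#{page.ljust(28, '.')} | #{count}" }

puts 'Unique pageviews'
count_unique(SAMPLE_LOG).each { |page, count| puts "#{page.ljust(28, '.')} | #{count}" }
```

In this sample, `/contact` is viewed twice by the same identifier, so it counts twice in the first listing and once in the second.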
## Development
#### Start with the project:
```
$ git clone git@github.com:DMazzei/log-analyser.git
$ cd log-analyser
$ gem install bundler
$ bundle install
```
And the world is your oyster...
You can also run `$ bundle exec console` for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run `$ bundle exec rake install`.
To release a new version, update the version number in `version.rb`, and then run `$ bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
#### Linter (rubocop)
_*Rubocop*_ is used as a code analyser and to maintain code formatting (as well as to enforce some best practices).
Use `$ bundle exec rake rubocop` to run the checks.
#### Test coverage
Use `$ bundle exec rspec` or `$ bundle exec rake spec:all` to run all the tests.

:white_check_mark: To run only unit tests:

    $ bundle exec rake spec:unit

:white_check_mark: To run only integration tests:

    $ bundle exec rake spec:integration

The test coverage is handled by `rspec`, `simplecov` and `coveralls`.
Status and coverage history can be checked [here](https://coveralls.io/github/DMazzei/log-analyser).
#### Deployment
Creating a _*Pull Request*_ triggers a CI workflow in CircleCI, which can be checked [here](https://app.circleci.com/pipelines/github/DMazzei/log-analyser).
This workflow consists of _building_ the library, running _rubocop_ and _rspec_ to validate integrity and code quality, and lastly generating and pushing a _feature-gem_ that can be used for development and tests.
After passing all checks and requirements on GitHub, a *PR* can be merged as soon as it is reviewed and approved.
Merging into the _*master branch*_ triggers the deployment workflow on CircleCI, which finishes by building and tagging a new gem version (a _*tagged-gem*_) and pushing it to [rubygems.org](https://rubygems.org/gems/log-analyser).
> :warning: To merge changes into _*master*_, the version must be bumped up, otherwise the deployment will fail!
> The version must be updated in `version.rb`.
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/DMazzei/log-analyser.
## Next Steps
- Revisit one design trade-off faced during development, deciding between:
  * aggregating data whilst reading the file, preserving memory - e.g. using `Set`;
  * loading data into memory and leaving aggregation and counting to be dealt with later, gaining flexibility and performance;
- Extend the accepted logfile format;
- Add more options for sorting and filtering;
- Automate library version bump up;
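
The memory-preserving option from the first item above (aggregating whilst reading, using `Set`) could look roughly like the sketch below. It only illustrates the trade-off and is not the library's code: `aggregate_stream` is a hypothetical name, and a `StringIO` with made-up entries stands in for a real log file.

```ruby
require 'set'
require 'stringio'

# Hypothetical streaming aggregator: reads one line at a time and keeps only
# a counter and a Set of identifiers per page, never the full list of entries.
def aggregate_stream(io)
  all = Hash.new(0)
  visitors = Hash.new { |hash, page| hash[page] = Set.new }

  io.each_line do |line|
    page, identifier = line.split
    next if page.nil? || identifier.nil? # skip blank or malformed lines

    all[page] += 1
    visitors[page] << identifier # a Set silently ignores repeated identifiers
  end

  { all: all, unique: visitors.transform_values(&:size) }
end

# StringIO stands in for a log file here; a real file could be streamed with
# File.open(path) { |file| aggregate_stream(file) }.
log = StringIO.new(<<~LOG)
  /home 184.123.665.067
  /home 184.123.665.067
  /contact 184.123.665.067
  /home 126.318.035.038
LOG

result = aggregate_stream(log)
p result[:all]
p result[:unique]
```

The cost of this approach is flexibility: once the file has been read, only the aggregated counts remain, so any new kind of filter or grouping needs another pass over the file.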
## License
The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).