README.md in red_amber-0.3.0 vs README.md in red_amber-0.4.0

- old
+ new
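
The dependency lines in this diff move `red-arrow` and `red-parquet` from `~> 10.0.0` to `~> 11.0.0`. As a rough illustration of what that pessimistic constraint admits, the sketch below checks a few version strings with RubyGems' own `Gem::Requirement`. The helper `accepted?` and the sample version strings are hypothetical, chosen only for illustration; they are not part of RedAmber or Red Arrow, and they are not a list of actual releases.

```ruby
require 'rubygems'

# Constraint strings as they appear in the old and new README.
OLD_CONSTRAINT = Gem::Requirement.new('~> 10.0.0')
NEW_CONSTRAINT = Gem::Requirement.new('~> 11.0.0')

# Hypothetical helper (not a RedAmber or Red Arrow API):
# true when the version string satisfies the requirement.
def accepted?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

# Illustrative version strings only.
%w[10.0.1 11.0.0 11.0.3 11.1.0 12.0.0].each do |v|
  puts format('%-7s old: %-5s new: %s',
              v, accepted?(OLD_CONSTRAINT, v), accepted?(NEW_CONSTRAINT, v))
end
```

`~> 11.0.0` is equivalent to `>= 11.0.0, < 11.1`, so a Gemfile still pinned to `~> 10.0.0` would need updating alongside the Apache Arrow libraries themselves.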

@@ -1,45 +1,43 @@
 # RedAmber

-[![Gem Version](https://badge.fury.io/rb/red_amber.svg)](https://badge.fury.io/rb/red_amber)
+[![Gem Version](https://img.shields.io/gem/v/red_amber?color=brightgreen)](https://rubygems.org/gems/red_amber)
 [![CI](https://github.com/heronshoes/red_amber/actions/workflows/ci.yml/badge.svg)](https://github.com/heronshoes/red_amber/actions/workflows/ci.yml)
 [![Maintainability](https://api.codeclimate.com/v1/badges/b8a745047045d2f49daa/maintainability)](https://codeclimate.com/github/heronshoes/red_amber/maintainability)
 [![Test coverage](https://api.codeclimate.com/v1/badges/b8a745047045d2f49daa/test_coverage)](https://codeclimate.com/github/heronshoes/red_amber/test_coverage)
 [![Doc](https://img.shields.io/badge/docs-latest-blue)](https://heronshoes.github.io/red_amber/)
 [![Discussions](https://img.shields.io/github/discussions/heronshoes/red_amber)](https://github.com/heronshoes/red_amber/discussions)

 A simple dataframe library for Ruby.

 - Powered by [Red Arrow](https://github.com/apache/arrow/tree/master/ruby/red-arrow)
-[![Gitter Chat](https://badges.gitter.im/red-data-tools/en.svg)](https://gitter.im/red-data-tools/en)
+[![Gitter Chat](https://badges.gitter.im/red-data-tools/en.svg)](https://gitter.im/red-data-tools/en) [![Gem Version](https://img.shields.io/gem/v/red-arrow?color=brightgreen)](https://rubygems.org/gems/red-arrow)
 - Inspired by the dataframe library [Rover-df](https://github.com/ankane/rover)

 ![screenshot from jupyterlab](https://raw.githubusercontent.com/heronshoes/red_amber/main/doc/image/screenshot.png)

 ## Requirements
-
+### Ruby
 Supported Ruby version is >= 3.0 (since RedAmber 0.3.0).
+- I decided to remove Ruby 2.7 without waiting for EOL. See [Release note for v0.3.0](https://github.com/heronshoes/red_amber/discussions/162) for details.

-- I decided to remove Ruby 2.7 without waiting for EOL because it cannot solve the problem of simultaneous use of Hash and keyword arguments when implementing DataFrame#join.
-
+### Libraries
 ```ruby
-# Libraries required
-gem 'red-arrow', '~> 10.0.0' # Requires Apache Arrow (see installation below)
-
-gem 'red-parquet', '~> 10.0.0' # Optional, if you use IO from/to parquet
+gem 'red-arrow', '~> 11.0.0' # Requires Apache Arrow (see installation below)
+gem 'red-parquet', '~> 11.0.0' # Optional, if you use IO from/to parquet
 gem 'rover-df', '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
 ```

 ## Installation

 Install requirements before you install Red Amber.

-- Apache Arrow (~> 10.0.0)
-- Apache Arrow GLib (~> 10.0.0)
-- Apache Parquet GLib (~> 10.0.0) # If you use IO from/to parquet
+- Apache Arrow (~> 11.0.0)
+- Apache Arrow GLib (~> 11.0.0)
+- Apache Parquet GLib (~> 11.0.0) # If you use IO from/to parquet

-  See [Apache Arrow install document](https://arrow.apache.org/install/).
+See [Apache Arrow install document](https://arrow.apache.org/install/).

   - Minimum installation example for the latest Ubuntu:

     ```
     sudo apt update
@@ -56,43 +54,44 @@
     ```
     sudo dnf update
     sudo dnf -y install gcc-c++ libarrow-devel libarrow-glib-devel ruby-devel
     ```

-  - On macOS, you can install Apache Arrow C++ library using Homebrew:
+  - On macOS, using Homebrew:

     ```
     brew install apache-arrow
-    ```
-
-    and GLib (C) package with:
-
-    ```
     brew install apache-arrow-glib
     ```

 If you prepared Apache Arrow, add these lines to your Gemfile:

 ```ruby
-gem 'red-arrow', '~> 10.0.0'
+gem 'red-arrow', '~> 11.0.0'
 gem 'red_amber'
-gem 'red-parquet', '~> 10.0.0' # Optional, if you use IO from/to parquet
+gem 'red-parquet', '~> 11.0.0' # Optional, if you use IO from/to parquet
 gem 'rover-df', '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
 gem 'red-datasets-arrow' # Optional, recommended if you use Red Datasets
 gem 'red-arrow-numo-narray' # Optional, recommended if you use inputs from Numo::NArray
 ```

-And then execute `bundle install` or install it yourself as `gem install red_amber`.
+And then execute `bundle install` or install them yourself such as `gem install red_amber`.

 ## Docker image and Jupyter Notebook

-[RubyData Docker Stacks](https://github.com/RubyData/docker-stacks) is available as a ready-to-run Docker image containing Jupyter and useful data tools as well as RedAmber (Thanks to @mrkn).
+[RubyData Docker Stacks](https://github.com/RubyData/docker-stacks) is available as a ready-to-run Docker image containing Jupyter and useful data tools as well as RedAmber (Thanks to Kenta Murata).

 Also you can try the contents of this README interactively by [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=red-amber.ipynb).
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=red-amber.ipynb)

+## Comparison of DataFrames
+Comparison of basic features of RedAmber with Python
+[pandas](https://pandas.pydata.org/),
+R [Tidyverse](https://www.tidyverse.org/) and
+Julia [Dataframes](https://dataframes.juliadata.org/stable/) is [here](doc/DataFrame_Comparison.md) (Thanks to Benson Muite).
+
 ## Data frame in `RedAmber`

 Class `RedAmber::DataFrame` represents a set of data in 2D-shape.
 The entity is a Red Arrow's Table object.
@@ -135,11 +134,11 @@
 For example, we can compute mean prices per cut for the data larger than 1 carat.

 ```ruby
 df = diamonds
-  .slice { carat > 1 }
+  .slice { carat > 1 } # or use #filter instead of #slice
   .group(:cut)
   .mean(:price) # `pick` prior to `group` is not required if `:price` is specified here.
   .sort('-mean(price)')

 # =>
@@ -184,11 +183,11 @@
 starwars
   .drop(0) # delete unnecessary index column
   .remove { species == "NA" } # delete unnecessary rows
   .group(:species) { [count(:species), mean(:height, :mass)] }
-  .slice { count > 1 }
+  .slice { count > 1 } # or use #filter instead of slice

 # =>
 #<RedAmber::DataFrame : 8 x 4 Vectors, 0x000000000000f848>
 species count mean(height) mean(mass)
 <string> <int64> <double> <double>
@@ -211,10 +210,10 @@
 See [Vector.md](doc/Vector.md) for details.

 ## Jupyter notebook

-[89 Examples of Red Amber](https://github.com/heronshoes/docker-stacks/blob/RedAmber-binder/binder/examples_of_red_amber.ipynb)
+[Examples of Red Amber](https://github.com/heronshoes/docker-stacks/blob/RedAmber-binder/binder/examples_of_red_amber.ipynb)
 ([raw file](https://raw.githubusercontent.com/heronshoes/docker-stacks/RedAmber-binder/binder/examples_of_red_amber.ipynb)) shows more examples in jupyter notebook.

 You can try this notebook on [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=examples_of_red_amber.ipynb).
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=examples_of_red_amber.ipynb)