README.md in red_amber-0.2.2 vs README.md in red_amber-0.2.3

- old
+ new

@@ -1,9 +1,9 @@
 # RedAmber

 [![Gem Version](https://badge.fury.io/rb/red_amber.svg)](https://badge.fury.io/rb/red_amber)
-[![Ruby](https://github.com/heronshoes/red_amber/actions/workflows/test.yml/badge.svg)](https://github.com/heronshoes/red_amber/actions/workflows/test.yml)
+[![Ruby](https://github.com/heronshoes/red_amber/actions/workflows/ci.yml/badge.svg)](https://github.com/heronshoes/red_amber/actions/workflows/ci.yml)
 [![Discussions](https://img.shields.io/github/discussions/heronshoes/red_amber)](https://github.com/heronshoes/red_amber/discussions)

 A simple dataframe library for Ruby.

 - Powered by [Red Arrow](https://github.com/apache/arrow/tree/master/ruby/red-arrow) [![Gitter Chat](https://badges.gitter.im/red-data-tools/en.svg)](https://gitter.im/red-data-tools/en)
@@ -18,62 +18,77 @@
 Since v0.2.0, this library uses pattern matching which is an experimental feature in 2.7 . It is usable but a warning message will be shown in 2.7 .
 I recommend Ruby 3 for performance.

 ```ruby
 # Libraries required
-gem 'red-arrow', '>= 9.0.0'
+gem 'red-arrow', '~> 10.0.0' # Requires Apache Arrow (see installation below)
-gem 'red-parquet', '>= 9.0.0' # Optional, if you use IO from/to parquet
+gem 'red-parquet', '~> 10.0.0' # Optional, if you use IO from/to parquet
 gem 'rover-df', '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
 ```

 ## Installation

 Install requirements before you install Red Amber.

-- Apache Arrow GLib (>= 9.0.0)
+- Apache Arrow (~> 10.0.0)
+- Apache Arrow GLib (~> 10.0.0)
+- Apache Parquet GLib (~> 10.0.0) # If you use IO from/to parquet

-- Apache Parquet GLib (>= 9.0.0) # If you use IO from/to parquet
-
   See [Apache Arrow install document](https://arrow.apache.org/install/).

-  Minimum installation example for the latest Ubuntu is in the ['Prepare the Apache Arrow' section in ci test](https://github.com/heronshoes/red_amber/blob/master/.github/workflows/test.yml) of Red Amber.
+  - Minimum installation example for the latest Ubuntu:
+    ```
+    sudo apt update
+    sudo apt install -y -V ca-certificates lsb-release wget
+    wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
+    sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
+    sudo apt update
+    sudo apt install -y -V libarrow-dev
+    sudo apt install -y -V libarrow-glib-dev
+    ```
+  - On macOS, you can install Apache Arrow C++ library using Homebrew:
+
+    ```
+    brew install apache-arrow
+    ```

-Add this line to your Gemfile:
+    and GLib (C) package with:

+    ```
+    brew install apache-arrow-glib
+    ```
+
+If you prepared Apache Arrow, add these lines to your Gemfile:
+
 ```ruby
+gem 'red-arrow', '~> 10.0.0'
 gem 'red_amber'
+gem 'red-parquet', '~> 10.0.0' # Optional, if you use IO from/to parquet
+gem 'rover-df', '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
+gem 'red-datasets-arrow' # Optional, recommended if you use Red Datasets
+gem 'red-arrow-numo-narray' # Optional, recommended if you use inputs from Numo::NArray
 ```

-And then execute:
+And then execute `bundle install` or install it yourself as `gem install red_amber`.

-```shell
-bundle install
-```
-
-Or install it yourself as:
-
-```shell
-gem install red_amber
-```
-
 ## Docker image and Jupyter Notebook

 [RubyData Docker Stacks](https://github.com/RubyData/docker-stacks) is available as a ready-to-run Docker image containing Jupyter and useful data tools as well as RedAmber (Thanks to @mrkn).

-Also you can try the contents of this README interactively by [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=README.ipynb).
+Also you can try the contents of this README interactively by [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=red-amber.ipynb).
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=red-amber.ipynb)

 ## Data frame in `RedAmber`

 Class `RedAmber::DataFrame` represents a set of data in 2D-shape.
 The entity is a Red Arrow's Table object.

 ![dataframe model of RedAmber](doc/image/dataframe_model.png)

-Load the library.
+Let's load the library and try some examples.

 ```ruby
 require 'red_amber' # require 'red-amber' is also OK.
 include RedAmber
 ```
@@ -99,11 +114,11 @@
 53937 0.7 Very Good D SI1 62.8 60.0 2757 5.66 ... 3.56
 53938 0.86 Premium H SI2 61.0 58.0 2757 6.15 ... 3.74
 53939 0.75 Ideal D SI2 62.2 55.0 2757 5.83 ... 3.64
 ```

-For example, we can compute mean prices per 'cut' for the data larger than 1 carat.
+For example, we can compute mean prices per cut for the data larger than 1 carat.

 ```ruby
 df = diamonds
   .slice { carat > 1 }
   .group(:cut)
@@ -123,11 +138,11 @@
 Arrow data is immutable, so these methods always return new objects.
 Next example will rename a column and create a new column by simple calcuration.

 ```ruby
-usdjpy = 110.0
+usdjpy = 110.0 # when the yen was stronger

 df.rename('mean(price)': :mean_price_USD)
   .assign(:mean_price_JPY) { mean_price_USD * usdjpy }

 # =>
@@ -179,10 +194,11 @@
 See [Vector.md](doc/Vector.md) for details.

 ## Jupyter notebook

-[73 Examples of Red Amber](binder/examples_of_red_amber.ipynb) shows more examples in jupyter notebook.
+[83 Examples of Red Amber](https://github.com/heronshoes/docker-stacks/blob/RedAmber-binder/binder/examples_of_red_amber.ipynb)
+([raw file](https://raw.githubusercontent.com/heronshoes/docker-stacks/RedAmber-binder/binder/examples_of_red_amber.ipynb)) shows more examples in jupyter notebook.

 You can try this notebook on [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=examples_of_red_amber.ipynb).
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=examples_of_red_amber.ipynb)
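
Note on the dependency change: the diff moves the `red-arrow` and `red-parquet` constraints from `>= 9.0.0` to `~> 10.0.0`. The sketch below is not part of either README; it uses RubyGems' own `Gem::Requirement` to show which versions each constraint would accept. The candidate version numbers are hypothetical probes chosen for illustration, not a list of actual releases, and `compare_constraints` is a helper name invented here.

```ruby
require 'rubygems'

# Constraint strings copied from the old and new README.
OLD_CONSTRAINT = Gem::Requirement.new('>= 9.0.0')
NEW_CONSTRAINT = Gem::Requirement.new('~> 10.0.0')

# Hypothetical version numbers, chosen only to probe the constraints.
CANDIDATES = %w[9.0.0 10.0.0 10.0.1 10.1.0 11.0.0].freeze

# Returns a hash mapping each candidate version string to a pair
# [accepted_by_old_constraint, accepted_by_new_constraint].
def compare_constraints(candidates = CANDIDATES)
  candidates.to_h do |v|
    version = Gem::Version.new(v)
    [v, [OLD_CONSTRAINT.satisfied_by?(version), NEW_CONSTRAINT.satisfied_by?(version)]]
  end
end

compare_constraints.each do |version, (old_ok, new_ok)|
  puts format('%-8s old: %-5s new: %s', version, old_ok, new_ok)
end
```

Under these assumptions, the pessimistic operator `~> 10.0.0` accepts only the 10.0.x series, whereas the old `>= 9.0.0` accepted any later version as well.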