README.md in red_amber-0.2.2 vs README.md in red_amber-0.2.3
- old
+ new
@@ -1,9 +1,9 @@
# RedAmber
[![Gem Version](https://badge.fury.io/rb/red_amber.svg)](https://badge.fury.io/rb/red_amber)
-[![Ruby](https://github.com/heronshoes/red_amber/actions/workflows/test.yml/badge.svg)](https://github.com/heronshoes/red_amber/actions/workflows/test.yml)
+[![Ruby](https://github.com/heronshoes/red_amber/actions/workflows/ci.yml/badge.svg)](https://github.com/heronshoes/red_amber/actions/workflows/ci.yml)
[![Discussions](https://img.shields.io/github/discussions/heronshoes/red_amber)](https://github.com/heronshoes/red_amber/discussions)
A simple dataframe library for Ruby.
- Powered by [Red Arrow](https://github.com/apache/arrow/tree/master/ruby/red-arrow) [![Gitter Chat](https://badges.gitter.im/red-data-tools/en.svg)](https://gitter.im/red-data-tools/en)
@@ -18,62 +18,77 @@
Since v0.2.0, this library uses pattern matching, which is an experimental feature in Ruby 2.7. It is usable, but a warning message will be shown in 2.7.
I recommend Ruby 3 for performance.
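
For readers unfamiliar with the feature, the following is a minimal sketch of `case`/`in` pattern matching, the syntax that triggers the warning on Ruby 2.7. The `describe` method is a hypothetical helper written for illustration; it is not part of RedAmber's API and does not reflect how the library uses pattern matching internally.

```ruby
# On Ruby 2.7 the experimental warning is emitted when a file using
# pattern matching is compiled. To silence it for a library, this line
# would have to run before `require 'red_amber'`; it cannot silence
# patterns in the same file, which is compiled before the line executes.
if Gem::Version.new(RUBY_VERSION) < Gem::Version.new('3.0')
  Warning[:experimental] = false
end

# Hypothetical helper (not RedAmber API): dispatch on the shape of an argument.
def describe(arg)
  case arg
  in Integer => i
    "row index #{i}"
  in [Symbol, *] => keys
    "column keys #{keys.inspect}"
  in { name: String => name }
    "named #{name}"
  else
    'unsupported'
  end
end

puts describe(3)
puts describe(%i[carat price])
puts describe({ name: 'diamonds' })
puts describe('text')
```

On Ruby 3 this form of pattern matching is no longer experimental, so no warning is shown.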
```ruby
# Libraries required
-gem 'red-arrow', '>= 9.0.0'
+gem 'red-arrow', '~> 10.0.0' # Requires Apache Arrow (see installation below)
-gem 'red-parquet', '>= 9.0.0' # Optional, if you use IO from/to parquet
+gem 'red-parquet', '~> 10.0.0' # Optional, if you use IO from/to parquet
gem 'rover-df', '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
```
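
The change from `>= 9.0.0` to `~> 10.0.0` narrows the accepted versions considerably. As a rough illustration, the sketch below uses RubyGems' own `Gem::Requirement` class to compare the two constraints; the list of version numbers is an arbitrary sample chosen for the example, not a list of actual red-arrow releases.

```ruby
require 'rubygems' # normally loaded by default; shown for clarity

pessimistic = Gem::Requirement.new('~> 10.0.0') # constraint in 0.2.3
open_ended  = Gem::Requirement.new('>= 9.0.0')  # constraint in 0.2.2

# Sample version numbers, chosen only to show where each constraint cuts off.
%w[9.0.0 10.0.0 10.0.1 10.1.0 11.0.0].each do |v|
  version = Gem::Version.new(v)
  puts format('%-7s  ~> 10.0.0: %-5s  >= 9.0.0: %s',
              v,
              pessimistic.satisfied_by?(version),
              open_ended.satisfied_by?(version))
end
```

`~> 10.0.0` accepts any version from 10.0.0 up to, but not including, 10.1.0, whereas `>= 9.0.0` has no upper bound.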
## Installation
Install requirements before you install Red Amber.
-- Apache Arrow GLib (>= 9.0.0)
+- Apache Arrow (~> 10.0.0)
+- Apache Arrow GLib (~> 10.0.0)
+- Apache Parquet GLib (~> 10.0.0) # If you use IO from/to parquet
-- Apache Parquet GLib (>= 9.0.0) # If you use IO from/to parquet
-
See [Apache Arrow install document](https://arrow.apache.org/install/).
- Minimum installation example for the latest Ubuntu is in the ['Prepare the Apache Arrow' section in ci test](https://github.com/heronshoes/red_amber/blob/master/.github/workflows/test.yml) of Red Amber.
+ - Minimum installation example for the latest Ubuntu:
+ ```
+ sudo apt update
+ sudo apt install -y -V ca-certificates lsb-release wget
+ wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
+ sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
+ sudo apt update
+ sudo apt install -y -V libarrow-dev
+ sudo apt install -y -V libarrow-glib-dev
+ ```
+ - On macOS, you can install the Apache Arrow C++ library using Homebrew:
+
+ ```
+ brew install apache-arrow
+ ```
-Add this line to your Gemfile:
+ and the GLib (C) package with:
+ ```
+ brew install apache-arrow-glib
+ ```
+
+Once Apache Arrow is installed, add these lines to your Gemfile:
+
```ruby
+gem 'red-arrow', '~> 10.0.0'
gem 'red_amber'
+gem 'red-parquet', '~> 10.0.0' # Optional, if you use IO from/to parquet
+gem 'rover-df', '~> 0.3.0' # Optional, if you use IO from/to Rover::DataFrame
+gem 'red-datasets-arrow' # Optional, recommended if you use Red Datasets
+gem 'red-arrow-numo-narray' # Optional, recommended if you use inputs from Numo::NArray
```
-And then execute:
+Then run `bundle install`, or install the gem yourself with `gem install red_amber`.
-```shell
-bundle install
-```
-
-Or install it yourself as:
-
-```shell
-gem install red_amber
-```
-
## Docker image and Jupyter Notebook
[RubyData Docker Stacks](https://github.com/RubyData/docker-stacks) is available as a ready-to-run Docker image containing Jupyter and useful data tools as well as RedAmber (Thanks to @mrkn).
-Also you can try the contents of this README interactively by [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=README.ipynb).
+Also you can try the contents of this README interactively by [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=red-amber.ipynb).
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=red-amber.ipynb)
## Data frame in `RedAmber`
Class `RedAmber::DataFrame` represents a set of data in 2D-shape.
Its underlying entity is a Red Arrow Table object.
![dataframe model of RedAmber](doc/image/dataframe_model.png)
-Load the library.
+Let's load the library and try some examples.
```ruby
require 'red_amber' # require 'red-amber' is also OK.
include RedAmber
```
@@ -99,11 +114,11 @@
53937 0.7 Very Good D SI1 62.8 60.0 2757 5.66 ... 3.56
53938 0.86 Premium H SI2 61.0 58.0 2757 6.15 ... 3.74
53939 0.75 Ideal D SI2 62.2 55.0 2757 5.83 ... 3.64
```
-For example, we can compute mean prices per 'cut' for the data larger than 1 carat.
+For example, we can compute mean prices per cut for the data larger than 1 carat.
```ruby
df = diamonds
.slice { carat > 1 }
.group(:cut)
@@ -123,11 +138,11 @@
Arrow data is immutable, so these methods always return new objects.
The next example renames a column and creates a new column with a simple calculation.
```ruby
-usdjpy = 110.0
+usdjpy = 110.0 # when the yen was stronger
df.rename('mean(price)': :mean_price_USD)
.assign(:mean_price_JPY) { mean_price_USD * usdjpy }
# =>
@@ -179,10 +194,11 @@
See [Vector.md](doc/Vector.md) for details.
## Jupyter notebook
-[73 Examples of Red Amber](binder/examples_of_red_amber.ipynb) shows more examples in jupyter notebook.
+[83 Examples of Red Amber](https://github.com/heronshoes/docker-stacks/blob/RedAmber-binder/binder/examples_of_red_amber.ipynb)
+([raw file](https://raw.githubusercontent.com/heronshoes/docker-stacks/RedAmber-binder/binder/examples_of_red_amber.ipynb)) shows more examples in a Jupyter notebook.
You can try this notebook on [Binder](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=examples_of_red_amber.ipynb).
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/heronshoes/docker-stacks/RedAmber-binder?filepath=examples_of_red_amber.ipynb)