DataFrame.md in red_amber-0.1.7

- old
+ new

@@ -7,12 +7,10 @@
 - `variable`s with the same vector length are aligned and arranged to form a `DataFrame`.
 - The elements at the same position in each `Vector` of a `DataFrame` form a set of related data. We call it an `observation`.
 
 ![dataframe model image](doc/../image/dataframe_model.png)
 
-(No change in this model in v0.1.6 .)
-
 ## Constructors and saving
 
 ### `new` from a Hash
 
   ```ruby
@@ -35,10 +33,12 @@
 
 ### `new` from a Rover::DataFrame
 
 
   ```ruby
+  require 'rover'
+
   rover = Rover::DataFrame.new(x: [1, 2, 3])
   RedAmber::DataFrame.new(rover)
   ```
 
 ### `load` (class method)
@@ -59,10 +59,12 @@
   ```
 
 - from a Parquet file
 
   ```ruby
+  require 'parquet'
+
   dataframe = RedAmber::DataFrame.load("file.parquet")
   ```
 
 ### `save` (instance method)
 
@@ -73,10 +75,12 @@
 - to a URI
 
 - to a Parquet file
 
   ```ruby
+  require 'parquet'
+
   dataframe.save("file.parquet")
   ```
 
 ## Properties
 
@@ -173,16 +177,45 @@
 
 ## Output
 
 ### `to_s`
 
+`to_s` returns a preview of the Table.
+
+```ruby
+puts penguins.to_s
+
+# =>
+    species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+    <string> <string>        <double>      <double>           <uint8> ... <uint16>
+  1 Adelie   Torgersen           39.1          18.7               181 ...     2007
+  2 Adelie   Torgersen           39.5          17.4               186 ...     2007
+  3 Adelie   Torgersen           40.3          18.0               195 ...     2007
+  4 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
+  5 Adelie   Torgersen           36.7          19.3               193 ...     2007
+  : :        :                      :             :                 : ...        :
+342 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+343 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+344 Gentoo   Biscoe              49.9          16.1               213 ...     2009
+```
+### `inspect`
+
+`inspect` uses `to_s` output and also shows shape and object_id.
+
+
 ### `summary`, `describe` (not implemented)
 
 ### `to_rover`
 
 - Returns a `Rover::DataFrame`.
 
+```ruby
+require 'rover'
+
+penguins.to_rover
+```
+
 ### `to_iruby`
 
 - Show the DataFrame as a Table in Jupyter Notebook or Jupyter Lab with IRuby.
 
 ### `tdr(limit = 10, tally: 5, elements: 5)`
@@ -194,10 +227,11 @@
   require 'red_amber'
   require 'datasets-arrow'
 
   penguins = Datasets::Penguins.new.to_arrow
   RedAmber::DataFrame.new(penguins).tdr
+
   # =>
   RedAmber::DataFrame : 344 x 8 Vectors
   Vectors : 5 numeric, 3 strings
   # key                type   level data_preview
   1 :species           string     3 {"Adelie"=>152, "Chinstrap"=>68, "Gentoo"=>124}
@@ -212,26 +246,10 @@
 
   - limit: limit of variables to show. Default value is 10.
   - tally: max level to use tally mode.
   - elements: max num of element to show values in each observations.
 
-### `inspect`
-
-- Returns the information of self as `tdr(3)`, and also shows object id.
-
-  ```ruby
-  puts penguins.inspect
-  # =>
-  #<RedAmber::DataFrame : 344 x 8 Vectors, 0x000000000000f0b4>
-  Vectors : 5 numeric, 3 strings
-  # key                type   level data_preview
-  1 :species           string     3 {"Adelie"=>152, "Chinstrap"=>68, "Gentoo"=>124}
-  2 :island            string     3 {"Torgersen"=>52, "Biscoe"=>168, "Dream"=>124}
-  3 :bill_length_mm    double   165 [39.1, 39.5, 40.3, nil, 36.7, ... ], 2 nils
-   ... 5 more Vectors ...
-  ```
-
 ## Selecting
 
 ### Select variables (columns in a table) by `[]` as `[key]`, `[keys]`, `[keys[index]]`
 - Key in a Symbol: `df[:symbol]`
 - Key in a String: `df["string"]`
@@ -248,31 +266,34 @@
  
   ```ruby
   hash = {a: [1, 2, 3], b: %w[A B C], c: [1.0, 2, 3]}
   df = RedAmber::DataFrame.new(hash)
   df[:b..:c, "a"]
+
   # =>
-  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000b02c>
-  Vectors : 2 numeric, 1 string            
-  # key type   level data_preview         
-  1 :b  string     3 ["A", "B", "C"]      
-  2 :c  double     3 [1.0, 2.0, 3.0]      
-  3 :a  uint8      3 [1, 2, 3]
+  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000328fc>
+    b               c       a
+    <string> <double> <uint8>
+  1 A             1.0       1
+  2 B             2.0       2
+  3 C             3.0       3
   ```
 
   If `#[]` selects a single variable (column), it returns a Vector object.
 
   ```ruby
   df[:a]
+
   # =>
   #<RedAmber::Vector(:uint8, size=3):0x000000000000f140>
   [1, 2, 3]
   ```
   The `#v` method also returns a Vector for a key.
 
   ```ruby
   df.v(:a)
+
   # =>
   #<RedAmber::Vector(:uint8, size=3):0x000000000000f140>
   [1, 2, 3]
   ```
 
@@ -292,18 +313,20 @@
 - Mixed case: `df[2, 0..]`
 
   ```ruby
   hash = {a: [1, 2, 3], b: %w[A B C], c: [1.0, 2, 3]}
   df = RedAmber::DataFrame.new(hash)
-  df[:b..:c, "a"].tdr(tally_level: 0)
+  df[2, 0..]
+
   # =>
-  RedAmber::DataFrame : 4 x 3 Vectors
-  Vectors : 2 numeric, 1 string
-  # key type   level data_preview
-  1 :a  uint8      3 [3, 1, 2, 3]
-  2 :b  string     3 ["C", "A", "B", "C"]
-  3 :c  double     3 [3.0, 1.0, 2.0, 3.0]
+  #<RedAmber::DataFrame : 4 x 3 Vectors, 0x0000000000033270>
+          a b               c
+    <uint8> <string> <double>
+  1       3 C             3.0
+  2       1 A             1.0
+  3       2 B             2.0
+  4       3 C             3.0
   ```
 
 - Select obs. by a boolean Array or a boolean RedAmber::Vector of the same size as self.
 
   It returns a sub dataframe containing the observations where the boolean is true.
@@ -311,17 +334,16 @@
     ```ruby
     # with the same dataframe `df` above
     df[true, false, nil] # or
     df[[true, false, nil]] # or
     df[RedAmber::Vector.new([true, false, nil])]
+
     # =>
-    #<RedAmber::DataFrame : 1 x 3 Vectors, 0x000000000000f1a4>
-    Vectors : 2 numeric, 1 string
-    # key type   level data_preview
-    1 :a  uint8      1 [1]
-    2 :b  string     1 ["A"]
-    3 :c  double     1 [1.0]
+    #<RedAmber::DataFrame : 1 x 3 Vectors, 0x00000000000353e0>
+            a b               c
+      <uint8> <string> <double>
+    1       1 A             1.0
     ```
 
 ### Select rows from top or from bottom
 
   `head(n=5)`, `tail(n=5)`, `first(n=1)`, `last(n=1)`
@@ -338,47 +360,68 @@
 
   `pick(keys)` accepts keys as arguments in an Array.
 
     ```ruby
     penguins.pick(:species, :bill_length_mm)
+
     # =>
-    #<RedAmber::DataFrame : 344 x 2 Vectors, 0x000000000000f924>
-    Vectors : 1 numeric, 1 string
-    # key             type   level data_preview
-    1 :species        string     3 {"Adelie"=>152, "Chinstrap"=>68, "Gentoo"=>124}
-    2 :bill_length_mm double   165 [39.1, 39.5, 40.3, nil, 36.7, ... ], 2 nils
+    #<RedAmber::DataFrame : 344 x 2 Vectors, 0x0000000000035ebc>
+        species  bill_length_mm
+        <string>       <double>
+      1 Adelie             39.1
+      2 Adelie             39.5
+      3 Adelie             40.3
+      4 Adelie            (nil)
+      5 Adelie             36.7
+      : :                     :
+    342 Gentoo             50.4
+    343 Gentoo             45.2
+    344 Gentoo             49.9
     ```
 
 - Booleans as an argument
 
   `pick(booleans)` accepts booleans as an argument in an Array. Booleans must be the same length as `n_keys`.
 
     ```ruby
     penguins.pick(penguins.types.map { |type| type == :string })
+    
     # =>
-    #<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000000f938>
-    Vectors : 3 strings
-    # key      type   level data_preview
-    1 :species string     3 {"Adelie"=>152, "Chinstrap"=>68, "Gentoo"=>124}
-    2 :island  string     3 {"Torgersen"=>52, "Biscoe"=>168, "Dream"=>124}
-    3 :sex     string     3 {"male"=>168, "female"=>165, ""=>11}
+    #<RedAmber::DataFrame : 344 x 3 Vectors, 0x00000000000387ac>
+        species  island    sex
+        <string> <string>  <string>
+      1 Adelie   Torgersen male
+      2 Adelie   Torgersen female
+      3 Adelie   Torgersen female
+      4 Adelie   Torgersen (nil)
+      5 Adelie   Torgersen female
+      : :        :         :
+    342 Gentoo   Biscoe    male
+    343 Gentoo   Biscoe    female
+    344 Gentoo   Biscoe    male
     ```
 
  - Keys or booleans by a block
 
     `pick {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return keys, or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
 
     ```ruby
-    # It is ok to write `keys ...` in the block, not `penguins.keys ...`
     penguins.pick { keys.map { |key| key.end_with?('mm') } }
+
     # =>
-    #<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000000f1cc>
-    Vectors : 3 numeric
-    # key                type   level data_preview
-    1 :bill_length_mm    double   165 [39.1, 39.5, 40.3, nil, 36.7, ... ], 2 nils
-    2 :bill_depth_mm     double    81 [18.7, 17.4, 18.0, nil, 19.3, ... ], 2 nils
-    3 :flipper_length_mm int64     56 [181, 186, 195, nil, 193, ... ], 2 nils
+    #<RedAmber::DataFrame : 344 x 3 Vectors, 0x000000000003dd4c>
+        bill_length_mm bill_depth_mm flipper_length_mm
+              <double>      <double>           <uint8>
+      1           39.1          18.7               181
+      2           39.5          17.4               186
+      3           40.3          18.0               195
+      4          (nil)         (nil)             (nil)
+      5           36.7          19.3               193
+      :              :             :                 :
+    342           50.4          15.7               222
+    343           45.2          14.8               212
+    344           49.9          16.1               213
     ```
 
 ### `drop  ` - pick and drop -
 
   Drop some variables (columns) to create a DataFrame of the remaining variables.
@@ -412,17 +455,21 @@
 
   ```ruby
   df = RedAmber::DataFrame.new(a: [1, 2, 3], b: %w[A B C], c: [1.0, 2, 3])
   df.pick(:a) # or
   df.drop(:b, :c)
+
   # =>
-  #<RedAmber::DataFrame : 3 x 1 Vector, 0x000000000000f280>
-  Vector : 1 numeric
-  # key type  level data_preview
-  1 :a  uint8     3 [1, 2, 3]
+  #<RedAmber::DataFrame : 3 x 1 Vector, 0x000000000003f4bc>
+          a
+    <uint8>
+  1       1
+  2       2
+  3       3
 
   df[:a]
+
   # =>
   #<RedAmber::Vector(:uint8, size=3):0x000000000000f258>
   [1, 2, 3]
   ```
 
@@ -439,35 +486,47 @@
     Negative index from the tail like Ruby's Array is also acceptable.
 
     ```ruby
     # returns 5 obs. at start and 5 obs. from end
     penguins.slice(0...5, -5..-1)
+
     # =>
-    #<RedAmber::DataFrame : 10 x 8 Vectors, 0x000000000000f230>
-    Vectors : 5 numeric, 3 strings
-    # key                type   level data_preview
-    1 :species           string     2 {"Adelie"=>5, "Gentoo"=>5}
-    2 :island            string     2 {"Torgersen"=>5, "Biscoe"=>5}
-    3 :bill_length_mm    double     9 [39.1, 39.5, 40.3, nil, 36.7, ... ], 2 nils
-     ... 5 more Vectors ...
+    #<RedAmber::DataFrame : 10 x 8 Vectors, 0x0000000000042be4>
+       species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+       <string> <string>        <double>      <double>           <uint8> ... <uint16>
+     1 Adelie   Torgersen           39.1          18.7               181 ...     2007
+     2 Adelie   Torgersen           39.5          17.4               186 ...     2007
+     3 Adelie   Torgersen           40.3          18.0               195 ...     2007
+     4 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
+     5 Adelie   Torgersen           36.7          19.3               193 ...     2007
+     : :        :                      :             :                 : ...        :
+     8 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+     9 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    10 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
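
   The index handling above mirrors Ruby's own `Array`. As a rough analogy only (plain Ruby, no RedAmber involved; `positions` is a hypothetical stand-in for the 344 observation positions), `Array#values_at` accepts the same mix of Ranges and negative indices:

     ```ruby
     # Stand-in for the positions of 344 observations (0-based).
     positions = (0...344).to_a

     # Array#values_at takes Integers and Ranges; negative values count from the tail.
     picked = positions.values_at(0...5, -5..-1)

     p picked.size   # => 10
     p picked.first  # => 0
     p picked.last   # => 343
     ```

   The ten selected positions correspond to the `10 x 8` shape shown in the `slice` output above.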
 
 - Booleans as an argument
 
   `slice(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray. Booleans must be the same length as `size`.
 
     ```ruby
     vector = penguins[:bill_length_mm]
     penguins.slice(vector >= 40)
+
     # =>
-    #<RedAmber::DataFrame : 242 x 8 Vectors, 0x000000000000f2bc>
-    Vectors : 5 numeric, 3 strings
-    # key                type   level data_preview
-    1 :species           string     3 {"Adelie"=>51, "Chinstrap"=>68, "Gentoo"=>123}
-    2 :island            string     3 {"Torgersen"=>18, "Biscoe"=>139, "Dream"=>85}
-    3 :bill_length_mm    double   115 [40.3, 42.0, 41.1, 42.5, 46.0, ... ]
-     ... 5 more Vectors ...
+    #<RedAmber::DataFrame : 242 x 8 Vectors, 0x0000000000043d3c>
+        species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+        <string> <string>        <double>      <double>           <uint8> ... <uint16>
+      1 Adelie   Torgersen           40.3          18.0               195 ...     2007
+      2 Adelie   Torgersen           42.0          20.2               190 ...     2007
+      3 Adelie   Torgersen           41.1          17.6               182 ...     2007
+      4 Adelie   Torgersen           42.5          20.7               197 ...     2007
+      5 Adelie   Torgersen           46.0          21.5               194 ...     2007
+      : :        :                      :             :                 : ...        :
+    240 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    241 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    242 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 
 - Indices or booleans by a block
 
     `slice {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return indices or a boolean Array with the same length as `size`. The block is called in the context of self.
@@ -480,26 +539,32 @@
       max = vector.mean + vector.std
       vector.to_a.map { |e| (min..max).include? e }
     end
 
     # =>
-    #<RedAmber::DataFrame : 204 x 8 Vectors, 0x000000000000f30c>
-    Vectors : 5 numeric, 3 strings
-    # key                type   level data_preview
-    1 :species           string     3 {"Adelie"=>82, "Chinstrap"=>33, "Gentoo"=>89}
-    2 :island            string     3 {"Torgersen"=>31, "Biscoe"=>112, "Dream"=>61}
-    3 :bill_length_mm    double    90 [39.1, 39.5, 40.3, 39.3, 38.9, ... ]
-     ... 5 more Vectors ...
+    #<RedAmber::DataFrame : 204 x 8 Vectors, 0x0000000000047a40>
+        species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+        <string> <string>        <double>      <double>           <uint8> ... <uint16>
+      1 Adelie   Torgersen           39.1          18.7               181 ...     2007
+      2 Adelie   Torgersen           39.5          17.4               186 ...     2007
+      3 Adelie   Torgersen           40.3          18.0               195 ...     2007
+      4 Adelie   Torgersen           39.3          20.6               190 ...     2007
+      5 Adelie   Torgersen           38.9          17.8               181 ...     2007
+      : :        :                      :             :                 : ...        :
+    202 Gentoo   Biscoe              47.2          13.7               214 ...     2009
+    203 Gentoo   Biscoe              46.8          14.3               215 ...     2009
+    204 Gentoo   Biscoe              45.2          14.8               212 ...     2009
     ```
 
 - Notice: nil option
   - `Arrow::Table#slice` uses the `filter` method with the option `Arrow::FilterOptions.null_selection_behavior = :emit_null`. This will propagate nil at the same row.
     
     ```ruby
     hash = { a: [1, 2, 3], b: %w[A B C], c: [1.0, 2, 3] }
     table = Arrow::Table.new(hash)
     table.slice([true, false, nil])
+
     # =>
     #<Arrow::Table:0x7fdfe44b9e18 ptr=0x555e9fe744d0>
 	         a	b	            c
     0	     1  A      1.000000
     1	(null)	(null)   (null)
@@ -507,10 +572,11 @@
 
  - Whereas in RedAmber, nil in the booleans passed to `DataFrame#slice` is treated as false. This behavior comes from `Arrow::FilterOptions.null_selection_behavior = :drop`, which is the default value for the `Arrow::Table.filter` method.
 
     ```ruby
     RedAmber::DataFrame.new(table).slice([true, false, nil]).table
+
     # =>
     #<Arrow::Table:0x7fdfe44981c8 ptr=0x555e9febc330>
 	    a	b	         c
     0	1	A	  1.000000
     ``` 
@@ -526,40 +592,48 @@
     `remove(indices)` accepts indices as arguments. Indices should be an Integer or a Range of Integer.
 
     ```ruby
     # returns 6th to 339th obs.
     penguins.remove(0...5, -5..-1)
+
     # =>
-    #<RedAmber::DataFrame : 334 x 8 Vectors, 0x000000000000f320>
-    Vectors : 5 numeric, 3 strings
-    # key                type   level data_preview
-    1 :species           string     3 {"Adelie"=>147, "Chinstrap"=>68, "Gentoo"=>119}
-    2 :island            string     3 {"Torgersen"=>47, "Biscoe"=>163, "Dream"=>124}
-    3 :bill_length_mm    double   162 [39.3, 38.9, 39.2, 34.1, 42.0, ... ]
-     ... 5 more Vectors ...
+    #<RedAmber::DataFrame : 334 x 8 Vectors, 0x00000000000487c4>
+        species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+        <string> <string>        <double>      <double>           <uint8> ... <uint16>
+      1 Adelie   Torgersen           39.3          20.6               190 ...     2007
+      2 Adelie   Torgersen           38.9          17.8               181 ...     2007
+      3 Adelie   Torgersen           39.2          19.6               195 ...     2007
+      4 Adelie   Torgersen           34.1          18.1               193 ...     2007
+      5 Adelie   Torgersen           42.0          20.2               190 ...     2007
+      : :        :                      :             :                 : ...        :
+    332 Gentoo   Biscoe              44.5          15.7               217 ...     2009
+    333 Gentoo   Biscoe              48.8          16.2               222 ...     2009
+    334 Gentoo   Biscoe              47.2          13.7               214 ...     2009
     ```
 
 - Booleans as an argument
 
   `remove(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray. Booleans must be the same length as `size`.
 
     ```ruby
     # remove all observations containing nil
     removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
-    removed.tdr
+    removed
+
     # =>
-    RedAmber::DataFrame : 333 x 8 Vectors
-    Vectors : 5 numeric, 3 strings
-    # key                type   level data_preview
-    1 :species           string     3 {"Adelie"=>146, "Chinstrap"=>68, "Gentoo"=>119}
-    2 :island            string     3 {"Torgersen"=>47, "Biscoe"=>163, "Dream"=>123}
-    3 :bill_length_mm    double   163 [39.1, 39.5, 40.3, 36.7, 39.3, ... ]
-    4 :bill_depth_mm     double    79 [18.7, 17.4, 18.0, 19.3, 20.6, ... ]
-    5 :flipper_length_mm uint8     54 [181, 186, 195, 193, 190, ... ]
-    6 :body_mass_g       uint16    93 [3750, 3800, 3250, 3450, 3650, ... ]
-    7 :sex               string     2 {"male"=>168, "female"=>165}
-    8 :year              uint16     3 {2007=>103, 2008=>113, 2009=>117}    
+    #<RedAmber::DataFrame : 333 x 8 Vectors, 0x0000000000049fac>
+        species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+        <string> <string>        <double>      <double>           <uint8> ... <uint16>
+      1 Adelie   Torgersen           39.1          18.7               181 ...     2007
+      2 Adelie   Torgersen           39.5          17.4               186 ...     2007
+      3 Adelie   Torgersen           40.3          18.0               195 ...     2007
+      4 Adelie   Torgersen           36.7          19.3               193 ...     2007
+      5 Adelie   Torgersen           39.3          20.6               190 ...     2007
+      : :        :                      :             :                 : ...        :
+    331 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    332 Gentoo   Biscoe              45.2          14.8               212 ...     2009
+    333 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 
 - Indices or booleans by a block
 
     `remove {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return indices or a boolean Array with the same length as `size`. The block is called in the context of self.
@@ -569,47 +643,59 @@
       vector = self[:bill_length_mm]
       min = vector.mean - vector.std
       max = vector.mean + vector.std
       vector.to_a.map { |e| (min..max).include? e }
     end
+
     # =>
-    #<RedAmber::DataFrame : 140 x 8 Vectors, 0x000000000000f370>
-    Vectors : 5 numeric, 3 strings
-    # key                type   level data_preview
-    1 :species           string     3 {"Adelie"=>70, "Chinstrap"=>35, "Gentoo"=>35}
-    2 :island            string     3 {"Torgersen"=>21, "Biscoe"=>56, "Dream"=>63}
-    3 :bill_length_mm    double    75 [nil, 36.7, 34.1, 37.8, 37.8, ... ], 2 nils
-     ... 5 more Vectors ...
+    #<RedAmber::DataFrame : 140 x 8 Vectors, 0x000000000004de40>
+        species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
+        <string> <string>        <double>      <double>           <uint8> ... <uint16>
+      1 Adelie   Torgersen          (nil)         (nil)             (nil) ...     2007
+      2 Adelie   Torgersen           36.7          19.3               193 ...     2007
+      3 Adelie   Torgersen           34.1          18.1               193 ...     2007
+      4 Adelie   Torgersen           37.8          17.1               186 ...     2007
+      5 Adelie   Torgersen           37.8          17.3               180 ...     2007
+      : :        :                      :             :                 : ...        :
+    138 Gentoo   Biscoe             (nil)         (nil)             (nil) ...     2009
+    139 Gentoo   Biscoe              50.4          15.7               222 ...     2009
+    140 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 - Notice for nil
   - When `remove` is used with booleans, nil in booleans is treated as false. This behavior is aligned with Ruby's `nil#!`.
 
     ```ruby
     df = RedAmber::DataFrame.new(a: [1, 2, nil], b: %w[A B C], c: [1.0, 2, 3])
     booleans = df[:a] < 2
+    booleans
+
     # =>
     #<RedAmber::Vector(:boolean, size=3):0x000000000000f410>
     [true, false, nil]
 
     booleans_invert = booleans.to_a.map(&:!) # => [false, true, true]
+    
     df.slice(booleans) == df.remove(booleans_invert) # => true
     ```
+
   - Whereas `Vector#invert` returns nil for nil elements. This gives a different result.
 
     ```ruby
     booleans.invert
+
     # =>
     #<RedAmber::Vector(:boolean, size=3):0x000000000000f488>
     [false, true, nil]
 
     df.remove(booleans.invert)
-    #<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000000f474>
-    Vectors : 2 numeric, 1 string
-    # key type   level data_preview
-    1 :a  uint8      2 [1, nil], 1 nil
-    2 :b  string     2 ["A", "C"]
-    3 :c  double     2 [1.0, 3.0]
+
+    # =>
+    #<RedAmber::DataFrame : 2 x 3 Vectors, 0x000000000005df98>
+            a b               c
+      <uint8> <string> <double>
+    1       1 A             1.0
+    2   (nil) C             3.0
     ```
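
   The two cases above can be modelled in plain Ruby. This is only a sketch of the documented behavior, not RedAmber's implementation: rows are plain Arrays, and `slice_rows`, `remove_rows` and `three_valued_invert` are hypothetical names introduced for illustration.

     ```ruby
     # Rows and mask taken from the `df` example above.
     rows     = [[1, 'A', 1.0], [2, 'B', 2.0], [nil, 'C', 3.0]]
     booleans = [true, false, nil]

     # Hypothetical helpers: nil in the mask counts as false in both.
     slice_rows  = ->(data, mask) { data.zip(mask).select { |_, m| m }.map(&:first) }
     remove_rows = ->(data, mask) { data.zip(mask).reject { |_, m| m }.map(&:first) }

     # Ruby's `!` turns nil into true.
     ruby_invert = booleans.map(&:!)
     # A three-valued invert keeps nil as nil, as described for `Vector#invert`.
     three_valued_invert = booleans.map { |b| b.nil? ? nil : !b }

     sliced         = slice_rows.call(rows, booleans)
     removed_ruby   = remove_rows.call(rows, ruby_invert)
     removed_kleene = remove_rows.call(rows, three_valued_invert)

     p sliced == removed_ruby  # => true
     p removed_kleene          # => [[1, "A", 1.0], [nil, "C", 3.0]]
     ```

   With the three-valued invert, the nil entry is not removed, so the row whose `a` is nil survives, as in the output above.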
 
 ### `rename`
 
   Rename keys (column names) to create an updated DataFrame.
@@ -619,19 +705,20 @@
 - Key pairs as arguments
 
     `rename(key_pairs)` accepts key_pairs as arguments. key_pairs should be a Hash of `{existing_key => new_key}`.
 
     ```ruby
-    h = { 'name' => %w[Yasuko Rui Hinata], 'age' => [68, 49, 28] }
-    df = RedAmber::DataFrame.new(h)
+    df = RedAmber::DataFrame.new('name' => %w[Yasuko Rui Hinata], 'age' => [68, 49, 28])
     df.rename(:age => :age_in_1993)
+
     # =>
-    #<RedAmber::DataFrame : 3 x 2 Vectors, 0x000000000000f8fc>
-    Vectors : 1 numeric, 1 string
-    # key          type   level data_preview
-    1 :name        string     3 ["Yasuko", "Rui", "Hinata"]
-    2 :age_in_1993 uint8      3 [68, 49, 28]
+    #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000060838>
+      name     age_in_1993
+      <string>     <uint8>
+    1 Yasuko            68
+    2 Rui               49
+    3 Hinata            28
     ```
 
 - Key pairs by a block
 
     `rename {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return key_pairs as a Hash of `{existing_key => new_key}`. The block is called in the context of self.
@@ -653,29 +740,33 @@
 
     `assign(key_pairs)` accepts pairs of key and values as arguments. key_pairs should be a Hash of `{key => array}` or `{key => Vector}`.
 
     ```ruby
     df = RedAmber::DataFrame.new(
-      'name' => %w[Yasuko Rui Hinata],
-      'age' => [68, 49, 28])
+      name: %w[Yasuko Rui Hinata],
+      age: [68, 49, 28])
+    df
+    
     # =>
-    #<RedAmber::DataFrame : 3 x 2 Vectors, 0x000000000000f8fc>
-    Vectors : 1 numeric, 1 string
-    # key   type   level data_preview
-    1 :name string     3 ["Yasuko", "Rui", "Hinata"]
-    2 :age  uint8      3 [68, 49, 28]
+    #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000062804>
+      name         age
+      <string> <uint8>
+    1 Yasuko        68
+    2 Rui           49
+    3 Hinata        28
 
     # update :age and add :brother
     assigner = { age: [97, 78, 57], brother: ['Santa', nil, 'Momotaro'] }
     df.assign(assigner)
+
     # =>
-    #<RedAmber::DataFrame : 3 x 3 Vectors, 0x000000000000f960>
-    Vectors : 1 numeric, 2 strings
-    # key      type   level data_preview
-    1 :name    string     3 ["Yasuko", "Rui", "Hinata"]
-    2 :age     uint8      3 [97, 78, 57]
-    3 :brother string     3 ["Santa", nil, "Momotaro"], 1 nil
+    #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000658b0>
+      name         age brother
+      <string> <uint8> <string>
+    1 Yasuko        97 Santa
+    2 Rui           78 (nil)
+    3 Hinata        57 Momotaro
     ```
 
 - Key pairs by a block
 
     `assign {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return pairs of key and values as a Hash of `{key => array}` or `{key => Vector}`. The block is called in the context of self.
@@ -683,40 +774,48 @@
     ```ruby
     df = RedAmber::DataFrame.new(
       index: [0, 1, 2, 3, nil],
       float: [0.0, 1.1,  2.2, Float::NAN, nil],
       string: ['A', 'B', 'C', 'D', nil])
+    df
+
     # =>
-    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000000f8c0>
-    Vectors : 2 numeric, 1 string
-    # key     type   level data_preview
-    1 :index  uint8      5 [0, 1, 2, 3, nil], 1 nil
-    2 :float  double     5 [0.0, 1.1, 2.2, NaN, nil], 1 NaN, 1 nil
-    3 :string string     5 ["A", "B", "C", "D", nil], 1 nil
+    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x0000000000069e60>
+        index    float string
+      <uint8> <double> <string>
+    1       0      0.0 A
+    2       1      1.1 B
+    3       2      2.2 C
+    4       3      NaN D
+    5   (nil)    (nil) (nil)
 
     # update numeric variables
     df.assign do
       assigner = {}
       vectors.each_with_index do |v, i|
         assigner[keys[i]] = v * -1 if v.numeric?
       end
       assigner
     end
+
     # =>
-    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000000f924>
-    Vectors : 2 numeric, 1 string
-    # key     type   level data_preview
-    1 :index  int8       5 [0, -1, -2, -3, nil], 1 nil
-    2 :float  double     5 [-0.0, -1.1, -2.2, NaN, nil], 1 NaN, 1 nil
-    3 :string string     5 ["A", "B", "C", "D", nil], 1 nil
+    #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000006e000>
+       index    float string
+      <int8> <double> <string>
+    1      0     -0.0 A
+    2     -1     -1.1 B
+    3     -2     -2.2 C
+    4     -3      NaN D
+    5  (nil)    (nil) (nil)
 
     # Or it's shorter like this:
     df.assign do
       variables.select.with_object({}) do |(key, vector), assigner|
         assigner[key] = vector * -1 if vector.numeric?
       end
     end
+
     # => same as above
     ```
 
 - Key type
 
@@ -734,18 +833,21 @@
   df = RedAmber::DataFrame.new({
         index:  [1, 1, 0, nil, 0],
         string: ['C', 'B', nil, 'A', 'B'],
         bool:   [nil, true, false, true, false],
       })
-  df.sort(:index, '-bool').tdr(tally: 0)
+  df.sort(:index, '-bool')
+  
   # =>
-  RedAmber::DataFrame : 5 x 3 Vectors
-  Vectors : 1 numeric, 1 string, 1 boolean
-  # key     type    level data_preview
-  1 :index  uint8       3 [0, 0, 1, 1, nil], 1 nil
-  2 :string string      4 [nil, "B", "B", "C", "A"], 1 nil
-  3 :bool   boolean     3 [false, false, true, nil, true], 1 nil
+  #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000009b03c>
+      index string   bool
+    <uint8> <string> <boolean>
+  1       0 (nil)    false
+  2       0 B        false
+  3       1 B        true
+  4       1 C        (nil)
+  5   (nil) A        true
   ```
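
   The output above is consistent with ascending order on `:index`, descending order on `bool` (the `-` prefix), and nil placed last within each key. That reading is inferred from the output, so the plain-Ruby sketch below is a model under those assumptions, not RedAmber's implementation. The tie-break on input position is also an assumption.

   ```ruby
   # Same data as `df` above, one Hash per observation.
   rows = [
     { index: 1,   string: 'C', bool: nil },
     { index: 1,   string: 'B', bool: true },
     { index: 0,   string: nil, bool: false },
     { index: nil, string: 'A', bool: true },
     { index: 0,   string: 'B', bool: false }
   ]

   sorted = rows.each_with_index.sort_by do |row, position|
     [
       row[:index].nil? ? 1 : 0, row[:index] || 0,  # ascending, nil last
       row[:bool].nil? ? 1 : 0, row[:bool] ? 0 : 1, # descending (true first), nil last
       position                                     # assumed: keep input order on ties
     ]
   end.map(&:first)

   p sorted.map { |row| row[:index] }  # => [0, 0, 1, 1, nil]
   p sorted.map { |row| row[:bool] }   # => [false, false, true, nil, true]
   ```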
 
 - [ ] Clamp
 
 - [ ] Clear data
@@ -756,71 +858,21 @@
 
   Remove any observations containing nil.
 
 ## Grouping
 
-### `group(aggregating_keys, function, target_keys)`
+### `group(aggregating_keys)`
 
-  (This is a temporary API and may change in the future version.)
+  (
+    This API will change in a future version. In particular, I want to change:
+      - Order of the columns in the result (aggregation_keys should come first)
+      - DataFrame#group will accept a block (heronshoes/red_amber #28)
+  )
 
-  Create grouped dataframe by `aggregation_keys` and apply `function` to each group and returns in `target_keys`. Aggregated key name is `function(key)` style.
+  `group` creates an instance of the `Group` class. A `Group` accepts the functions below as methods.
+  Each method accepts `summary_keys` as options.
 
-  (The current implementation is not intuitive. Needs improvement.)
-
-  ```ruby
-  ds = Datasets::Rdatasets.new('dplyr', 'starwars')
-  starwars = RedAmber::DataFrame.new(ds.to_table.to_h)
-  starwars.tdr(11)
-  # =>
-  RedAmber::DataFrame : 87 x 11 Vectors
-  Vectors : 3 numeric, 8 strings
-  #  key         type   level data_preview
-  1  :name       string    87 ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader",   "Leia Organa", ... ]
-  2  :height     uint16    46 [172, 167, 96, 202, 150, ... ], 6 nils
-  3  :mass       double    39 [77.0, 75.0, 32.0, 136.0, 49.0, ... ], 28 nils
-  4  :hair_color string    13 ["blond", nil, nil, "none", "brown", ... ], 5 nils
-  5  :skin_color string    31 ["fair", "gold", "white, blue", "white", "light", ..  . ]
-  6  :eye_color  string    15 ["blue", "yellow", "red", "yellow", "brown", ... ]
-  7  :birth_year double    37 [19.0, 112.0, 33.0, 41.9, 19.0, ... ], 44 nils
-  8  :sex        string     5 {"male"=>60, "none"=>6, "female"=>16, "hermaphroditic"=>1, nil=>4}
-  9  :gender     string     3 {"masculine"=>66, "feminine"=>17, nil=>4}
-  10 :homeworld  string    49 ["Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan", ... ], 10 nils
-  11 :species    string    38 ["Human", "Droid", "Droid", "Human", "Human", ... ], 4 nils
-
-  grouped = starwars.group(:species, :mean, [:mass, :height])
-  # =>
-  #<RedAmber::DataFrame : 38 x 3 Vectors, 0x000000000000fbf4>
-  Vectors : 2 numeric, 1 string
-  # key             type   level data_preview
-  1 :"mean(mass)"   double    27 [82.78181818181818, 69.75, 124.0, 74.0, 1358.0, ... ], 6 nils
-  2 :"mean(height)" double    32 [176.6451612903226, 131.2, 231.0, 173.0, 175.0, ... ]
-  3 :species        string    38 ["Human", "Droid", "Wookiee", "Rodian", "Hutt", ... ], 1 nil
-
-  count = starwars.group(:species, :count, :species)[:"count(species)"]
-  df = grouped.slice(count > 1)
-  # =>
-  #<RedAmber::DataFrame : 8 x 3 Vectors, 0x000000000000fc44>
-  Vectors : 2 numeric, 1 string
-  # key             type   level data_preview
-  1 :"mean(mass)"   double     8 [82.78181818181818, 69.75, 124.0, 74.0, 80.0, ... ]
-  2 :"mean(height)" double     8 [176.6451612903226, 131.2, 231.0, 208.66666666666666, 173.0, ... ]
-  3 :species        string     8 ["Human", "Droid", "Wookiee", "Gungan", "Zabrak", ... ]
-
-  df.table
-  # =>
-  #<Arrow::Table:0x1165593c8 ptr=0x7fb3db144c70>
-	mean(mass)	mean(height)	species
-  0	 82.781818	  176.645161	Human  
-  1	 69.750000	  131.200000	Droid  
-  2	124.000000	  231.000000	Wookiee
-  3	 74.000000	  208.666667	Gungan 
-  4	 80.000000	  173.000000	Zabrak 
-  5	 55.000000	  179.000000	Twi'lek
-  6	 53.100000	  168.000000	Mirialan
-  7	 88.000000	  221.000000	Kaminoan
-  ```
-
   Available functions are:
 
   - [ ] all                 
   - [ ] any
   - [ ] approximate_median
@@ -835,13 +887,119 @@
   - ✓ stddev
   - ✓ sum
   - [ ] tdigest
   - ✓ variance
 
+  For each group defined by `aggregation_keys`, the aggregation `function` is applied to the `summary_keys`, and the result is returned as a new dataframe.
+  Each aggregated key is named in the `function(summary_key)` style.
+
+  This is an example of grouping the famous `starwars` dataset.
+
+  ```ruby
+  starwars =
+    RedAmber::DataFrame.load(URI("https://vincentarelbundock.github.io/Rdatasets/csv/dplyr/starwars.csv"))
+  starwars
+  
+  # =>
+  #<RedAmber::DataFrame : 87 x 12 Vectors, 0x00000000000773bc>
+  species     name            height     mass hair_color skin_color  eye_color ... homeworld
+  <string>    <string>       <int64> <double> <string>   <string>    <string>  ... <string>
+  Human     1 Luke Skywalker     172     77.0 blond      fair        blue      ... Tatooine
+  Droid     2 C-3PO              167     75.0 NA         gold        yellow    ... Tatooine
+  Droid     3 R2-D2               96     32.0 NA         white, blue red       ... Naboo
+  Human     4 Darth Vader        202    136.0 none       white       yellow    ... Tatooine
+  Human     5 Leia Organa        150     49.0 brown      light       brown     ... Alderaan
+  :         : :                    :        : :          :           :         ... :
+  Droid    85 BB8              (nil)    (nil) none       none        black     ... NA
+  NA       86 Captain Phasma   (nil)    (nil) unknown    unknown     unknown   ... NA
+  Human    87 Padmé Amidala      165     45.0 brown      light       brown     ... Naboo
+
+  starwars.tdr(12)
+
+  # =>
+  RedAmber::DataFrame : 87 x 12 Vectors
+  Vectors : 4 numeric, 8 strings
+  #  key         type   level data_preview
+  1  :""         int64     87 [1, 2, 3, 4, 5, ... ]
+  2  :name       string    87 ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa", ... ]
+  3  :height     int64     46 [172, 167, 96, 202, 150, ... ], 6 nils
+  4  :mass       double    39 [77.0, 75.0, 32.0, 136.0, 49.0, ... ], 28 nils
+  5  :hair_color string    13 ["blond", "NA", "NA", "none", "brown", ... ]
+  6  :skin_color string    31 ["fair", "gold", "white, blue", "white", "light", ... ]
+  7  :eye_color  string    15 ["blue", "yellow", "red", "yellow", "brown", ... ]
+  8  :birth_year double    37 [19.0, 112.0, 33.0, 41.9, 19.0, ... ], 44 nils
+  9  :sex        string     5 {"male"=>60, "none"=>6, "female"=>16, "hermaphroditic"=>1, "NA"=>4}
+  10 :gender     string     3 {"masculine"=>66, "feminine"=>17, "NA"=>4}
+  11 :homeworld  string    49 ["Tatooine", "Tatooine", "Naboo", "Tatooine", "Alderaan", ... ]
+  12 :species    string    38 ["Human", "Droid", "Droid", "Human", "Human", ... ]
+  ```
+
+  We can group by `:species` and calculate the mean of `:mass` and `:height`.
+
+  ```ruby
+  grouped = starwars.group(:species).mean(:mass, :height)
+  grouped
+
+  # =>
+  #<RedAmber::DataFrame : 38 x 3 Vectors, 0x000000000008e620>                                 
+     mean(mass) mean(height) species                                                          
+       <double>     <double> <string>                                                         
+   1       82.8        176.6 Human                                                            
+   2       69.8        131.2 Droid                                                            
+   3      124.0        231.0 Wookiee                                                          
+   4       74.0        173.0 Rodian                                                           
+   5     1358.0        175.0 Hutt                                                             
+   :          :            : :                                                                
+  36      159.0        216.0 Kaleesh                                                          
+  37       80.0        206.0 Pau'an
+  38       80.0        188.0 Kel Dor
+  ```
+
+  Select the rows whose species count is greater than 1.
+  
+  ```ruby
+  count = starwars.group(:species).count(:species)[:'count(species)'] # => Vector
+  grouped = grouped.slice(count > 1)
+
+  # =>
+  #<RedAmber::DataFrame : 9 x 3 Vectors, 0x0000000000098260>
+    mean(mass) mean(height) species       
+      <double>     <double> <string>      
+  1       82.8        176.6 Human         
+  2       69.8        131.2 Droid         
+  3      124.0        231.0 Wookiee       
+  4       74.0        208.7 Gungan        
+  5       48.0        181.3 NA            
+  :          :            : :
+  7       55.0        179.0 Twi'lek
+  8       53.1        168.0 Mirialan
+  9       88.0        221.0 Kaminoan
+  ```
+
+  Add the counts as a new column and change the order of the columns.
+
+  ```ruby
+  grouped.assign(count: count[count > 1]).pick { [2, 3, 0, 1].map { |i| keys[i] } }
+  
+  # =>
+  #<RedAmber::DataFrame : 9 x 4 Vectors, 0x0000000000141838>                                  
+    species    count mean(mass) mean(height)                                                  
+    <string> <uint8>   <double>     <double>                                                  
+  1 Human         35       82.8        176.6                                                  
+  2 Droid          6       69.8        131.2                                                  
+  3 Wookiee        2      124.0        231.0                                                  
+  4 Gungan         3       74.0        208.7                                                  
+  5 NA             4       48.0        181.3                                                  
+  : :              :          :            :                                                  
+  7 Twi'lek        2       55.0        179.0                                                  
+  8 Mirialan       2       53.1        168.0                                                  
+  9 Kaminoan       2       88.0        221.0
+  ```
+
 ## Combining DataFrames
 
-- [ ]  obs
+- [ ] Combining rows to a dataframe
 
 - [ ] Add vars
 
 - [ ] Inner join
 
@@ -850,5 +1008,7 @@
 ## Encoding
 
 - [ ] One-hot encoding
 
 ## Iteration (not implemented)
+
+- [ ] each_rows