DataFrame.md in red_amber-0.2.1

- old
+ new

@@ -153,12 +153,30 @@
 
 - Returns an Array of Vectors.
 
 ### `indices`, `indexes`
 
-- Returns all indexes in an Array.
+- Returns indexes in an Array.
+  Accepts an optional `start` value, which is used as the first index.
 
+  ```ruby
+  df = RedAmber::DataFrame.new(x: [1, 2, 3, 4, 5])
+  df.indices
+
+  # =>
+  [0, 1, 2, 3, 4]
+
+  df.indices(1)
+
+  # =>
+  [1, 2, 3, 4, 5]
+
+  df.indices(:a)
+  # =>
+  [:a, :b, :c, :d, :e]
+  ```
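
   The effect of `start` can be emulated in plain Ruby. This is a minimal sketch, assuming each following index is obtained by calling `succ` on the previous one; `indices_from` is a hypothetical helper written for illustration, not a RedAmber method, and RedAmber's own implementation may differ.

   ```ruby
   # Hypothetical helper (not part of RedAmber): builds `size` consecutive
   # indexes beginning at `start`, stepping with `succ`.
   def indices_from(start, size)
     result = []
     current = start
     size.times do
       result << current
       current = current.succ # Integer#succ and Symbol#succ both exist
     end
     result
   end

   from_zero = indices_from(0, 5)
   from_one = indices_from(1, 5)
   from_symbol = indices_from(:a, 5)
   ```

   Because the sketch relies only on `succ`, it works for any start value that responds to `succ`, such as an Integer or a Symbol.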
+
 ### `to_h`
 
 - Returns column-oriented data in a Hash.
 
 ### `to_a`, `raw_records`
@@ -370,17 +388,17 @@
 
 ## Sub DataFrame manipulations
 
 ### `pick  ` - pick up variables by key label -
 
-  Pick up some variables (columns) to create a sub DataFrame.
+  Pick up some columns (variables) to create a sub DataFrame.
 
   ![pick method image](doc/../image/dataframe/pick.png)
 
 - Keys as arguments
 
-  `pick(keys)` accepts keys as arguments in an Array.
+  `pick(keys)` accepts keys as arguments in an Array or a Range.
 
     ```ruby
     penguins.pick(:species, :bill_length_mm)
 
     # =>
@@ -396,15 +414,37 @@
     342 Gentoo             50.4
     343 Gentoo             45.2
     344 Gentoo             49.9
     ```
 
-- Booleans as a argument
+- Indices as arguments
 
-  `pick(booleans)` accepts booleans as a argument in an Array. Booleans must be same length as `n_keys`.
+  `pick(indices)` accepts indices as arguments. Indices should be Integers, Floats or Ranges of Integers.
 
     ```ruby
+    penguins.pick(0..2, -1)
+    
+    # =>
+    #<RedAmber::DataFrame : 344 x 4 Vectors, 0x0000000000055ce4>
+        species  island    bill_length_mm     year
+        <string> <string>        <double> <uint16>
+      1 Adelie   Torgersen           39.1     2007
+      2 Adelie   Torgersen           39.5     2007
+      3 Adelie   Torgersen           40.3     2007
+      4 Adelie   Torgersen          (nil)     2007
+      5 Adelie   Torgersen           36.7     2007
+      : :        :                      :        :
+    342 Gentoo   Biscoe              50.4     2009
+    343 Gentoo   Biscoe              45.2     2009
+    344 Gentoo   Biscoe              49.9     2009
+    ```
+
+- Booleans as arguments
+
+  `pick(booleans)` accepts booleans as arguments in an Array. Booleans must be the same length as `n_keys`.
+
+    ```ruby
     penguins.pick(penguins.types.map { |type| type == :string })
     
     # =>
     #<RedAmber::DataFrame : 344 x 3 Vectors, 0x00000000000387ac>
         species  island    sex
@@ -418,13 +458,13 @@
     342 Gentoo   Biscoe    male
     343 Gentoo   Biscoe    female
     344 Gentoo   Biscoe    male
     ```
 
- - Keys or booleans by a block
+- Keys or booleans by a block
 
-    `pick {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return keys, or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
+    `pick {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return keys, indices or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
 
     ```ruby
     penguins.pick { keys.map { |key| key.end_with?('mm') } }
 
     # =>
@@ -442,25 +482,29 @@
     344           49.9          16.1               213
     ```
 
 ### `drop  ` - pick and drop -
 
-  Drop some variables (columns) to create a remainder DataFrame.
+  Drop some columns (variables) to create a remainder DataFrame.
 
   ![drop method image](doc/../image/dataframe/drop.png)
 
 - Keys as arguments
 
-  `drop(keys)` accepts keys as arguments in an Array.
+  `drop(keys)` accepts keys as arguments in an Array or a Range.
 
-- Booleans as a argument
+- Indices as arguments
 
-  `drop(booleans)` accepts booleans as a argument in an Array. Booleans must be same length as `n_keys`.
+  `drop(indices)` accepts indices as arguments. Indices should be Integers, Floats or Ranges of Integers.
 
+- Booleans as arguments
+
+  `drop(booleans)` accepts booleans as arguments in an Array. Booleans must be the same length as `n_keys`.
+
 - Keys or booleans by a block
 
-  `drop {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return keys, or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
+  `drop {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return keys, indices or a boolean Array with the same length as `n_keys`. The block is called in the context of self.
   
 - Notice for nil
 
   When used with booleans, nil in booleans is treated as false. This behavior is aligned with Ruby's `nil#!`.
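
   The nil rule can be illustrated without RedAmber. The following is a minimal plain-Ruby sketch that selects or rejects keys by truthiness; `pick_keys` and `drop_keys` are hypothetical helpers for illustration only and are not RedAmber methods.

   ```ruby
   # Hypothetical helpers (not part of RedAmber): choose keys with a boolean mask.
   # nil is falsy in Ruby, so it behaves like false in both helpers.
   def pick_keys(keys, booleans)
     keys.zip(booleans).select { |_key, flag| flag }.map(&:first)
   end

   def drop_keys(keys, booleans)
     keys.zip(booleans).reject { |_key, flag| flag }.map(&:first)
   end

   keys = %i[a b c]
   booleans = [true, false, nil]

   picked = pick_keys(keys, booleans)   # :c is not picked because nil counts as false
   dropped = drop_keys(keys, booleans)  # :c is not dropped because nil counts as false
   inverted = booleans.map(&:!)         # nil.! is true, so nil inverts like false
   ```

   In this sketch, picking with a mask and dropping with the inverted mask give the same keys, which is the sense in which the rule follows `nil#!`.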
 
@@ -491,13 +535,24 @@
   # =>
   #<RedAmber::Vector(:uint8, size=3):0x000000000000f258>
   [1, 2, 3]
   ```
 
+  A simple key name is usable as a method of the DataFrame if the key name is acceptable as a method name.
+  It returns the same Vector as `[]`.
+
+  ```ruby
+  df.a
+
+  # =>
+  #<RedAmber::Vector(:uint8, size=3):0x000000000000f258>
+  [1, 2, 3]
+  ```
+
 ### `slice  `  - to cut vertically is slice -
 
-  Slice and select observations (rows) to create a sub DataFrame.
+  Slice and select rows (observations) to create a sub DataFrame.
 
   ![slice method image](doc/../image/dataframe/slice.png)
 
 - Indices as arguments
 
@@ -524,11 +579,11 @@
     10 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
 
 - Booleans as an argument
 
-  `slice(booleans)` accepts booleans as a argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
+  `slice(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray. Booleans must be the same length as `size`.
 
     ```ruby
     vector = penguins[:bill_length_mm]
     penguins.slice(vector >= 40)
 
@@ -601,11 +656,11 @@
     0	1	A	  1.000000
     ``` 
 
 ### `remove`
 
-  Slice and reject observations (rows) to create a remainder DataFrame.
+  Slice and reject rows (observations) to create a remainder DataFrame.
 
   ![remove method image](doc/../image/dataframe/remove.png)
 
 - Indices as arguments
 
@@ -630,11 +685,11 @@
     334 Gentoo   Biscoe              47.2          13.7               214 ...     2009
     ```
 
 - Booleans as an argument
 
-  `remove(booleans)` accepts booleans as a argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
+  `remove(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray. Booleans must be the same length as `size`.
 
     ```ruby
     # remove all observations that contain nil
     removed = penguins.remove { vectors.map(&:is_nil).reduce(&:|) }
     removed
@@ -658,14 +713,16 @@
 
     `remove {block}` is also acceptable. We can't use both arguments and a block at the same time. The block should return indices or a boolean Array with the same length as `size`. The block is called in the context of self.
 
     ```ruby
     penguins.remove do
-      vector = self[:bill_length_mm]
-      min = vector.mean - vector.std
-      max = vector.mean + vector.std
-      vector.to_a.map { |e| (min..max).include? e }
+      # We will use another style shown in slice
+      # self.bill_length_mm returns Vector
+      mean = bill_length_mm.mean
+      min = mean - bill_length_mm.std
+      max = mean + bill_length_mm.std
+      bill_length_mm.to_a.map { |e| (min..max).include? e }
     end
 
     # =>
     #<RedAmber::DataFrame : 140 x 8 Vectors, 0x000000000004de40>
         species  island    bill_length_mm bill_depth_mm flipper_length_mm ...     year
@@ -678,10 +735,11 @@
       : :        :                      :             :                 : ...        :
     138 Gentoo   Biscoe             (nil)         (nil)             (nil) ...     2009
     139 Gentoo   Biscoe              50.4          15.7               222 ...     2009
     140 Gentoo   Biscoe              49.9          16.1               213 ...     2009
     ```
+
 - Notice for nil
   - When `remove` is used with booleans, nil in booleans is treated as false. This behavior is aligned with Ruby's `nil#!`.
 
     ```ruby
     df = RedAmber::DataFrame.new(a: [1, 2, nil], b: %w[A B C], c: [1.0, 2, 3])
@@ -770,19 +828,23 @@
       age: [68, 49, 28])
     df
     
     # =>
     #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000062804>
-      name         age                      
-      <string> <uint8>                      
-    1 Yasuko        68                      
-    2 Rui           49                      
+      name         age
+      <string> <uint8>
+    1 Yasuko        68
+    2 Rui           49
     3 Hinata        28
 
     # update :age and add :brother
-    assigner = { age: [97, 78, 57], brother: ['Santa', nil, 'Momotaro'] }
-    df.assign(assigner)
+    df.assign do
+      {
+        age: age + 29,
+        brother: ['Santa', nil, 'Momotaro']
+      }
+    end
 
     # =>
     #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000658b0>
       name         age brother
       <string> <uint8> <string>
@@ -797,11 +859,12 @@
 
     ```ruby
     df = RedAmber::DataFrame.new(
       index: [0, 1, 2, 3, nil],
       float: [0.0, 1.1,  2.2, Float::NAN, nil],
-      string: ['A', 'B', 'C', 'D', nil])
+      string: ['A', 'B', 'C', 'D', nil]
+    )
     df
 
     # =>
     #<RedAmber::DataFrame : 5 x 3 Vectors, 0x0000000000069e60>
         index    float string
@@ -819,17 +882,17 @@
              .map { |v| [v.key, -v] }
     end
 
     # =>
     #<RedAmber::DataFrame : 5 x 3 Vectors, 0x00000000000dfffc>
-        index    float string             
-      <uint8> <double> <string>           
-    1       0     -0.0 A                  
-    2       1     -1.1 B                  
-    3       2     -2.2 C                  
-    4       3      NaN D                  
-    5   (nil)    (nil) (nil) 
+        index    float string
+      <uint8> <double> <string>
+    1       0     -0.0 A
+    2       1     -1.1 B
+    3       2     -2.2 C
+    4       3      NaN D
+    5   (nil)    (nil) (nil)
 
     # Or we can use assigner by a Hash
     df.assign do
       vectors.select.with_object({}) do |v, assigner|
         assigner[v.key] = -v if v.float?
@@ -850,11 +913,11 @@
 - Append from left
 
   `assign_left` method accepts the same parameters and block as `assign`, but appends new columns on the left side.
 
   ```ruby
-  df.assign_left(new_index: [1, 2, 3, 4, 5])
+  df.assign_left(new_index: df.indices(1))
   
   # => 
   #<RedAmber::DataFrame : 5 x 4 Vectors, 0x000000000001787c>
     new_index   index    float string
       <uint8> <uint8> <double> <string>
@@ -863,24 +926,92 @@
   3         3       2      2.2 C
   4         4       3      NaN D
   5         5   (nil)    (nil) (nil)
   ```
 
+### `slice_by(key, keep_key: false) { block }`
+
+`slice_by` accepts a key and a block to select rows.
+
+(Since 0.2.1)
+
+  ```ruby
+  df = RedAmber::DataFrame.new(
+    index: [0, 1, 2, 3, nil],
+    float: [0.0, 1.1,  2.2, Float::NAN, nil],
+    string: ['A', 'B', 'C', 'D', nil]
+  )
+  df
+
+  # =>
+  #<RedAmber::DataFrame : 5 x 3 Vectors, 0x0000000000069e60>
+      index    float string
+    <uint8> <double> <string>
+  1       0      0.0 A
+  2       1      1.1 B
+  3       2      2.2 C
+  4       3      NaN D
+  5   (nil)    (nil) (nil)
+
+  df.slice_by(:string) { ["A", "C"] }
+
+  # =>
+  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x000000000001b1ac>
+      index    float
+    <uint8> <double>
+  1       0      0.0
+  2       2      2.2
+  ```
+
+This behaves the same as:
+
+  ```ruby
+  df.slice { [string.index("A"), string.index("C")] }.drop(:string)
+  ```
+
+`slice_by` also accepts a Range.
+
+  ```ruby
+  df.slice_by(:string) { "A".."C" }
+
+  # =>
+  #<RedAmber::DataFrame : 3 x 2 Vectors, 0x0000000000069668>
+      index    float
+    <uint8> <double>
+  1       0      0.0
+  2       1      1.1
+  3       2      2.2
+  ```
+
+When the option `keep_key: true` is used, the column `key` will be preserved.
+
+  ```ruby
+  df.slice_by(:string, keep_key: true) { "A".."C" }
+
+  # =>
+  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x0000000000073c44>
+      index    float string
+    <uint8> <double> <string>
+  1       0      0.0 A
+  2       1      1.1 B
+  3       2      2.2 C
+  ```
+
 ## Updating
 
 ### `sort`
 
   `sort` accepts parameters as sort_keys thanks to the amazing Red Arrow feature.
     - :key, "key" or "+key" denotes ascending order
     - "-key" denotes descending order
 
   ```ruby
-  df = RedAmber::DataFrame.new({
+  df = RedAmber::DataFrame.new(
         index:  [1, 1, 0, nil, 0],
         string: ['C', 'B', nil, 'A', 'B'],
         bool:   [nil, true, false, true, false],
-      })
+      )
   df.sort(:index, '-bool')
   
   # =>
   #<RedAmber::DataFrame : 5 x 3 Vectors, 0x000000000009b03c>
       index string   bool
@@ -1033,106 +1164,107 @@
 
 ## Reshape
 
 ### `transpose`
 
-  Creates transposed DataFrame for wide type dataframe.
+  Creates a transposed DataFrame from a wide (messy) DataFrame.
 
   ```ruby
   import_cars = RedAmber::DataFrame.load('test/entity/import_cars.tsv')
 
   # =>
   #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000d520>
        Year    Audi     BMW BMW_MINI Mercedes-Benz      VW
     <int64> <int64> <int64>  <int64>       <int64> <int64>
-  1    2021   22535   35905    18211         51722   35215
-  2    2020   22304   35712    20196         57041   36576
+  1    2017   28336   52527    25427         68221   49040
+  2    2018   26473   50982    25984         67554   51961
   3    2019   24222   46814    23813         66553   46794
-  4    2018   26473   50982    25984         67554   51961
-  5    2017   28336   52527    25427         68221   49040
+  4    2020   22304   35712    20196         57041   36576
+  5    2021   22535   35905    18211         51722   35215
+  import_cars.transpose(:Manufacturer)
 
-  import_cars.transpose
-
   # =>
   #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000ef74>
-    name              2021     2020     2019     2018     2017
-    <dictionary>  <uint16> <uint16> <uint32> <uint32> <uint32>
-  1 Audi             22535    22304    24222    26473    28336
-  2 BMW              35905    35712    46814    50982    52527
-  3 BMW_MINI         18211    20196    23813    25984    25427
-  4 Mercedes-Benz    51722    57041    66553    67554    68221
-  5 VW               35215    36576    46794    51961    49040
+    Manufacturer      2017     2018     2019     2020     2021
+    <dictionary>  <uint32> <uint32> <uint32> <uint16> <uint16>
+  1 Audi             28336    26473    24222    22304    22535
+  2 BMW              52527    50982    46814    35712    35905
+  3 BMW_MINI         25427    25984    23813    20196    18211
+  4 Mercedes-Benz    68221    67554    66553    57041    51722
+  5 VW               49040    51961    46794    36576    35215
   ```
   
   The leftmost column is created by original keys. Key name of the column is
-  named by 'name'.
+  named by the parameter `:name`. If `:name` is not specified, `:N` is used as the key.
 
 ### `to_long(*keep_keys)`
 
-  Creates a 'long' DataFrame.
+  Creates a 'long' (tidy) DataFrame from a 'wide' DataFrame.
 
   - Parameter `keep_keys` specifies the key names to keep.
 
   ```ruby
   import_cars.to_long(:Year)
 
   # =>
   #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000012750>               
-         Year name             value
+         Year N                    V
      <uint16> <dictionary>  <uint32>
-   1     2021 Audi             22535
-   2     2021 BMW              35905
-   3     2021 BMW_MINI         18211
-   4     2021 Mercedes-Benz    51722
-   5     2021 VW               35215
+   1     2017 Audi             28336
+   2     2017 BMW              52527
+   3     2017 BMW_MINI         25427
+   4     2017 Mercedes-Benz    68221
+   5     2017 VW               49040
    :        : :                    :
-  23     2017 BMW_MINI         25427
-  24     2017 Mercedes-Benz    68221
-  25     2017 VW               49040
+  23     2021 BMW_MINI         18211
+  24     2021 Mercedes-Benz    51722
+  25     2021 VW               35215
   ```
 
-  - Option `:name` : key of the column which is come **from key names**.
-  - Option `:value` : key of the column which is come **from values**.
+  - Option `:name` is the key of the column which came **from key names**.
+  - Option `:value` is the key of the column which came **from values**.
 
   ```ruby
   import_cars.to_long(:Year, name: :Manufacturer, value: :Num_of_imported)
 
   # =>
   #<RedAmber::DataFrame : 25 x 3 Vectors, 0x0000000000017700>
          Year Manufacturer  Num_of_imported
      <uint16> <dictionary>         <uint32>
-   1     2021 Audi                    22535
-   2     2021 BMW                     35905
-   3     2021 BMW_MINI                18211
-   4     2021 Mercedes-Benz           51722
-   5     2021 VW                      35215
+   1     2017 Audi                    28336
+   2     2017 BMW                     52527
+   3     2017 BMW_MINI                25427
+   4     2017 Mercedes-Benz           68221
+   5     2017 VW                      49040
    :        : :                           :
-  23     2017 BMW_MINI                25427
-  24     2017 Mercedes-Benz           68221
-  25     2017 VW                      49040
+  23     2021 BMW_MINI                18211
+  24     2021 Mercedes-Benz           51722
+  25     2021 VW                      35215
   ```
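
   The wide-to-long reshaping above can be sketched with plain Ruby Hashes. This is only an illustration under stated assumptions: `to_long_rows` is a hypothetical helper operating on an Array of row Hashes, not a RedAmber method, and the two-row sample is a cut-down excerpt of the table shown earlier. The defaults `:N` and `:V` follow the column names in the output above.

   ```ruby
   # Hypothetical helper (not part of RedAmber): turns each wide row into one
   # long row per non-kept key, storing the key under `name` and its value
   # under `value`.
   def to_long_rows(rows, *keep_keys, name: :N, value: :V)
     rows.flat_map do |row|
       kept = row.slice(*keep_keys)
       row.reject { |key, _| keep_keys.include?(key) }
          .map { |key, val| kept.merge(name => key, value => val) }
     end
   end

   wide = [
     { Year: 2017, Audi: 28336, BMW: 52527 },
     { Year: 2018, Audi: 26473, BMW: 50982 }
   ]

   long = to_long_rows(wide, :Year, name: :Manufacturer, value: :Num_of_imported)
   ```

   Each wide row with two manufacturer columns becomes two long rows, so the two sample rows yield four rows in `long`.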
 
 ### `to_wide`
 
-  Creates a 'wide' DataFrame.
+  Creates a 'wide' (messy) DataFrame from a 'long' DataFrame.
 
-  - Option `:name` : key of the column which will be expanded **to key name**.
-  - Option `:value` : key of the column which will be expanded **to values**.
+  - Option `:name` is the key of the column which will be expanded **to key names**.
+  - Option `:value` is the key of the column which will be expanded **to values**.
 
   ```ruby
   import_cars.to_long(:Year).to_wide
-  # import_cars.to_long(:Year).to_wide(name: :name, value: :value)
+  # import_cars.to_long(:Year).to_wide(name: :N, value: :V)
   # is also OK
 
   # =>
   #<RedAmber::DataFrame : 5 x 6 Vectors, 0x000000000000f0f0>
         Year     Audi      BMW BMW_MINI Mercedes-Benz       VW
     <uint16> <uint16> <uint16> <uint16>      <uint32> <uint16>
-  1     2021    22535    35905    18211         51722    35215
-  2     2020    22304    35712    20196         57041    36576
+  1     2017    28336    52527    25427         68221    49040
+  2     2018    26473    50982    25984         67554    51961
   3     2019    24222    46814    23813         66553    46794
-  4     2018    26473    50982    25984         67554    51961
-  5     2017    28336    52527    25427         68221    49040
+  4     2020    22304    35712    20196         57041    36576
+  5     2021    22535    35905    18211         51722    35215
+
+  # == import_cars
   ```
 
 ## Combine
 
 - [ ] Combining dataframes