doc/DataFrame.md in red_amber-0.3.0 vs doc/DataFrame.md in red_amber-0.4.0

- old
+ new

@@ -55,10 +55,14 @@
  - from a `.arrow`, `.arrows`, `.csv`, `.csv.gz` or `.tsv` file
  ```ruby
  RedAmber::DataFrame.load("test/entity/with_header.csv")
  ```
+
+ ```ruby
+ RedAmber::DataFrame.load("test/entity/without_header.csv", headers: [:x, :y, :z])
+ ```
  - from a string buffer
  - from a URI
@@ -273,10 +277,11 @@
 ### `tdr(limit = 10, tally: 5, elements: 5)`
  - Shows some information about self in a transposed style.
  - `tdr_str` returns same info as a String.
+ - `glimpse` is an alias. It is similar to dplyr's (or Polars's) `glimpse()`.
  ```ruby
  require 'red_amber'
  require 'datasets-arrow'
@@ -566,11 +571,11 @@
  # =>
  #<RedAmber::Vector(:uint8, size=3):0x000000000000f258>
  [1, 2, 3]
  ```
-### `slice ` - slice and select records -
+### `slice ` - cut into slices of records -
  Slice and select records (rows) to create a sub DataFrame.
  ![slice method image](doc/../image/dataframe/slice.png)
@@ -599,15 +604,18 @@
  9 Gentoo Biscoe 49.9 16.1 213 ... 2009
  ```
  - Booleans as an argument
- `slice(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
+ `filter(booleans)` or `slice(booleans)` accepts booleans as an argument in an Array, a Vector or an Arrow::BooleanArray . Booleans must be same length as `size`.
+ note: `slice(booleans)` is acceptable for orthogonality of `slice`/`remove`.
+
  ```ruby
  vector = penguins[:bill_length_mm]
- penguins.slice(vector >= 40)
+ penguins.filter(vector >= 40)
+ # penguins.slice(vector >= 40) is also acceptable
  # =>
  #<RedAmber::DataFrame : 242 x 8 Vectors, 0x0000000000043d3c>
    species  island   bill_length_mm bill_depth_mm flipper_length_mm ... year
    <string> <string> <double>       <double>      <uint8>           ... <uint16>
@@ -831,18 +839,18 @@
 ### `assign`
  Assign new or updated variables (columns) and create an updated DataFrame.
- - Variables with new keys will append new columns from the right.
+ - Variables with new keys will append new columns from right.
  - Variables with exisiting keys will update corresponding vectors.
  ![assign method image](doc/../image/dataframe/assign.png)
  - Variables as arguments
- `assign(key_pairs)` accepts pairs of key and values as parameters. `key_pairs` should be a Hash of `{key => array_like}` or an Array of Arrays like `[[key, array_like], ... ]`. `array_like` is ether `Vector`, `Array` or `Arrow::Array`.
+ `assign(key_value_pairs)` accepts pairs of key and values as parameters. `key_value_pairs` should be a Hash of `{key => array_like}` or an Array of Arrays like `[[key, array_like], ... ]`. `array_like` is ether `Vector`, `Array` or `Arrow::Array`.
  ```ruby
  df = RedAmber::DataFrame.new(
    name: %w[Yasuko Rui Hinata],
    age: [68, 49, 28])
@@ -855,16 +863,16 @@
  0 Yasuko   68
  1 Rui      49
  2 Hinata   28
  # update :age and add :brother
- df.assign do
+ df.assign(
    {
      age: age + 29,
      brother: ['Santa', nil, 'Momotaro']
    }
- end
+ )
  # =>
  #<RedAmber::DataFrame : 3 x 3 Vectors, 0x00000000000658b0>
    name     age     brother
    <string> <uint8> <string>
@@ -930,11 +938,11 @@
  If assigner is empty or nil, returns self.
  - Append from left
- `assign_left` method accepts the same parameters and block as `assign`, but append new columns from leftside.
+ `assign_left` method accepts the same parameters and block as `assign`, but append new columns from left.
  ```ruby
  df.assign_left(new_index: df.indices(1))
  # =>
@@ -1451,10 +1459,12 @@
    <string> <uint8>
  0 A        1
  1 B        4
  2 D        5
  ```
+##### `set_operable?(other)`
+ Check if `types` of self and other are same.
 ##### `intersect(other)`
  Select records appearing in both self and other.
@@ -1496,19 +1506,27 @@
  #<RedAmber::DataFrame : 1 x 2 Vectors, 0x0000000000029fcc>
    KEY1     KEY2
    <string> <uint8>
  1 B        2
  2 C        3
+
+ other.differencr(df)
+ #=>
+ #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000040e0c>
+   KEY1     KEY2
+   <string> <uint8>
+ 0 B        4
+ 1 D        5
  ```
 ## Binding
 ### `concatenate(other)`
- Concatenate another DataFrame or Table onto the bottom of self. The shape and data type of other must be the same as self.
+ Concatenate another DataFrame or Table onto the bottom of self. The types of other must be the same as self.
- The alias is `concat`.
+ The alias is `concat` and `bind_rows`.
  An array of DataFrames or Tables is also acceptable as other.
  ```ruby
  df
@@ -1536,12 +1554,14 @@
  1 2 B
  2 3 C
  3 4 D
  ```
-### `merge(other)`
+### `merge(*other)`
- Concatenate another DataFrame or Table onto the bottom of self. The shape and data type of other must be the same as self.
+ Concatenate another DataFrame or Table onto the bottom of self. The size of other must be the same as self. Self and other must not share the same key.
+
+ The alias is `bind_cols`.
  ```ruby
  df
  #=>
  #<RedAmber::DataFrame : 2 x 2 Vectors, 0x0000000000009150>