## [0.5.2] - 2023-09-01 Support Apache Arrow 13.0.0 . This version is compatible with Arrow 12.0.0 . - Breaking change - Bug fixes - Fix bundle install issue by install libyaml-devel (#280) - Fix ownership in devcontainer ci (#280) - New features and improvements - Support Arrow 13.0.0 (#280) - Documentation and Example - Add dataframe_comparison_ja (#281) ## [0.5.1] - 2023-08-18 Docker environment is replaced by Dev Container, and Jupyter Notebooks will be created from qmd files. - Breaking change - Bug fixes - Fix timestamp test to set TZ locally (#249) - Fix regexp for beginning of String (#251) - Fix loading bin/Gemfile locally in bin/jupyter script (#261) - New features and improvements - Support sort and null_placement options in Vector#rank (#265) - Add Vector#find_substring method (#270) - Add Group#one method (#274) - Add Group#all and #any method (#274) - Add Group#median method (#274) - Add Group#count_uniq method (#274) - Introduce Dev Container environment - Introduce Devcontainer environment (#253) - Change lifecycle script from postCreate to onCreate (#253) - Move example to bin (#253) - Fix Python and Ruby versions in Dev Container (#254) - Add locale and timezone settings (#256) - Add quarto from devcontainer feature (#259) - Install HaranoAjiFonts as default Tex font (#259) - Refactoring - Rename boolean methods in VectorStringFunction (#263) - Refine Vector#inspect to show wheather chunked or not (#267) - Add an alias Group#count_all for #group_count (#274) - Improve in tests/CI - Create rake commands for Notebook convert/test (#269) - Fix rubocop warning of forwarding arguments in assign_update (#269) - Use rake to start example script (#269) - Add test in Vector#rank to cover illegal rank option error (#271) - Add bundle install to Rakefile (#276) - Use Dockerfile to create dev container (#276) - Save image to ghcr in ci (#276) - Documentation and Example - YARD - Update Docker Environment (#245) - Refine jupyter notebook environment (#253) - Refine yard in Group aggregations (#274) - Fix yard of Vector#rank (#269) - Fix yard of Group (#269) - Notebook - Start source management for jupyter notebook by qmd (#259) - Don't create ipynb if it exists (#261) - Add Group methods (125 in total) (#269) - Add ArrowFunction (126 in total) (#269) - Add DataFrame#auto_cast (127 in total) (#269) - Update required version in examples notebook (#269) - Update examples_of_red_amber (#269) - Update red-amber.qmd (#269) - GitHub site - Fix broken link in README/README.ja by Viktorius Suwandi (#262) - Change description in gemspec (#254) - Add documents for Dev Container (#254) - Thanks - Viktorius Suwandi ## [0.5.0] - 2023-05-24 - Breaking change - Use non keyword argument in #sub_by_value (#219) - Upgrade dependency to Arrow 12.0.0 (#238) - right_join will output columns as same order as Red Arrow. - DataFrame#join will not force ordering of original column by default - Join with type, such as full_join, sort after join by default - Bug fixes - Use truncate in Vector#sample(float) (#229) - Support options in DataFrame#tdra (#231) - Fix printing table with non-ascii strings (#233) - Fix join for Arrow 12.0.0 - New features and improvements - Add a singleton method Vector.[] (#218) - Add an alias #sub_group (#219) - Accept Group#summarize{Hash} to rename aggregated columns (#219) - Add Group#group_frame (#219) - Add Vector#cast (#224) - Add Vector#fill_nil(value) (#226) - Add Vector#one (#227) - Add Vector#mode (#228) - Add DataFrame#propagate (#235) - Add DataFrame#sample (#237) - Add DataFrame#shuffle (#237) - Support RankOptions in Vector#rank (#239) - Introduce MatchSubstringOptions family in Vector (#241) - Introduce Vector#match_substring? - Add Vector#end_with?, #start_with? method - Add Vector#match_like? - Add Vector#count_substring method - Refactoring - Refine Group and SubFrames function (#219) - Refine Group#group_count - Use Acero in Group#filters - Refine Group#filters, not using Acero - Refine Group#summarize(array) - Use Acero for renaming columns in join (#238) - Use index kernel with IndexOptions introduced in 12.0.0 (#240) - Improve in tests/CI - Use Fedra 39 Rawhide in CI (#238) - Documentation and Example - Add missing yard documents for SubFrames::Selectors (#219) - Update docker/example (#219) - Update Gemfile in docker (#219) - Add README.ja.md (#242) - GitHub site - Update link of Red Data Tools Chat to matrix (#242) - Thanks ## [0.4.2] - 2023-04-02 - Breaking change - Bug fixes - Fix Vector#modulo, #fdiv, #remainder (#203) - New features and improvements - Update SubFrames#take to return SubFrames (#212) - Refactoring - Refine SubFrames to support partial retrieval (#207) - Upgrade SubFrames#frames and promote to public (#207) - Use faster count in Group#inspect (#207) - Improve in tests/CI - Documentation and Example - Introduce minimum docker environment (#205) - Move example REPL to docker (#205) - Add readme.md in docker (#205) - Add example_of_red_amber.ipynb (#205) - Use smaller dataset in irb example - Fix docker/example - Updated link to red-data-tools (#213) - Thanks to Soumya Kushwaha - GitHub site - Migrated to [Red Data Tools](https://github.com/red-data-tools) - Thanks to Sutou Kouhei - Thanks - Sutou Kouhei - Soumya Kushwaha ## [0.4.1] - 2023-03-11 - Breaking change - Remove Vector.aggregate? method (#200) - Bug fixes - Return self in DataFrame#drop when dropper is empty (reverts 746ac263) (#193) - Return self in DataFrame#rename when renaming to same name (#193) - Return self in DataFrame#pick when pick itself (#199) - Fix column width for non-ascii elemnts in DataFrame#to_s (#193) - This change uses String#width. - Fix DataFrame#to_iruby when data is date32 type (#193) - Fix DataFrame#shorthand to show temporal type data simply (#193) - Fix Vector#rank when data is ChunkedArray (#198) - Fix Vector element-wise functions with nil as scalar (#198) - Support :force_order for all methods of join family (#199) - Supports :force_order option to force sorting after join for all #join familiy. - This will valuable in some cases such as large dataframes. - Ensure baseframe's schema for SubFrames (#200) - New features and improvements - Add Vector#first, #last method (#198) - This method will be used in SubFrames feature. - Add Vector#modulo method (#198) - The divmod function in Arrow C++ is still in draft state. This method was created by combining existing functions - Add Vector#quotient method (#198) - Add aliases #div, #mod, #mul, #pow, #quo and #sub for Vector (#198) - Add Vector#*_checked functions (#198) - This functions will check numeric range overflow. - Add 'tdra' and 'plain' in display mode (#193) - The plain mode and default inspect will show up to 128 rows and 128 columns. - Add String#width method in refinements (#193) - This will be used to update DataFrame#to_s. - Introduce pre-loaded REPL environment (#199) - This commit will add bin/example and it will start irb environment with enabled commonly used datasets such as penguins, diamonds, etc. - Upgrade SubFrames#aggregate to accept block (#200) - Refactoring - Use symbolized keys in refinements of Table#keys, #key? (#193) - This can be treat Tables and DataFrames as same manner. - Use key_name.succ in suffix of DataFrame#join (#193) - This will make simple to get name candidate. - Use ||= to memorize instance variables (#193) - Refine vector projection to use #variables (#193) - #variables is fastest when picking Vectors. - Refine Vector#is_in to avoid #pack (#198) - Refine Vector#index (#198) - Improve in tests/CI - Tests - Update benchmarks to test from older version (#193) - Refine test of Vector function with scalar (#198) - Refine test subframes and test_vector_selectable (#200) - Cops - CI - Documentation - Update documents(small fix) (#201) - GitHub site - Thanks ## [0.4.0] - 2023-02-25 - Breaking change - Upgrade dependency to Arrow 11.0.0 (#188) - Bug fixes - Add :force_order option for DataFrame#join (#174) - Return error for empty DataFrame in DataFrame#filter (#172) - Accept ChunkedArray in DataFrame#filter (#172) - Fix Vector#replace to accept Arrow::Array as a replacer (#179) - Fix Vector#round_to_multiple to accept Float or Integer (#180) - Change Vector atan2 to a class method (#180) - Fix Vector#shift when boolean Vector (#184) - Fix processing empty SubFrames (#183) - Do not check object id in DataFrame#rename, #drop for self (#188) - New features and improvements - Accept a block in DataFrame#filter (#172) - Add Vector.aggregate? method (#175) - Introduce Vector#propagate method (#175) - Add Vector#rank methods (#176) - Add Vector#sample method (#176) - Add Vector#sort method (#176) - Promote DataFrame#shape_str to public (#184) - Introduce Vector#concatenate (#184) - Add #numeric? in refinements of Array (#184) - Add Vector#cumulative_sum_checked and #cumsum (#184) - Add Vector#resolve method (#184) - Add DataFrame#tdra method (#184) - Add #expand as an alias for Vector#propagate (#184) - Add #glimpse as an alias for DataFrame#tdr (#184) - New class SubFrames (#183) - Introduce class SubFrames - Memorize dataframes in SubFrames - Add @frames to memorize sub DataFrames - Accept filters in SubFrames.new - Accept block in SubFrames.new - Add SubFrames.by_filter - Introduce methods creating SubFrames from DataFrame - Introduce SubFrames#each method - Add SubFrames#to_s method - Add SubFrames#concatenate method - Add SubFrames#offset_indices method - SubFrames#aggregate method - Redefine SubFrames#map to return SubFrames - Define SubFrame#map dynamically - Add SubFrames#assign method - Redefine SubFrames#select to return SubFrames - Add SubFrames#reject method - Add SubFrames#filter_map method - Refine DataFrame#indices memorizing @indices - Rename SubFrames#universal_frame as #baseframe - Set Group iteration feature to @api private - Refactoring - Generate Vector functions in class method (#177) - Set Constant visibility to private (#179) - Separate test_vector_function (#179) - Relocate methods in DataFrameIndexable (#179) - Rename Array refinements to the same name as Vector (#184) - Improve in tests/CI - Tests - Update benchmarks to set 0.3.0 as a reference (#167) - Move test of Vector#logb to proper location (#180) - Cops - Update .rubocop.yml to align with latest cops (#174) - Unify style of MethodCallIndentation as relative to reciever (#184) - CI - Fix setting up Arrow by homebrew in CI (#167) - Fix CI error on homebrew deleting python link (#167) - Set cache-version to get new C extensions in CI (#173) - Thanks to @kou for suggestion. - Documentation - Update DataFrame.md about loading csv without headers (#165) - Thanks to kojix2 - Update YARD in DataFrame combinable (#168) - Update comment for Ruby 2.7 support in README.md - Update license year - Update README (#172) - Update Vector.md and yardoc in #propagate (#175) - Use customized style sheet for YARD (#179) - Add examples for the doc of #pick and #drop (#179) - Add examples to YARD in DataFrame reshaping methods (#179) - Update documents in DataFrameDisplayable (#179) - Update documents in DataFrameVariableOperation (#179) - Update document for dynamically generated methods (#179) - Unify style in document (#179) - Update documents in DataFrameSelectable (#179) - Update documents of basic Vector methods (#179) - Update document in VectorUpdatable (#179) - Update document of Group (#179) - Update document of DataFrameLoadSave (#180) - Add examples for document of ArrowFunction (#180) - Update document of Vector_unary_aggregation (#180) - Update document of Vector_unary_element_wise (#180) - Update document of Vector_biary_element_wise (#180) - Add documentation to give comparison of dataframes(#169) - Thanks to Benson Muite - Update documents for consistency of method indentation (#189) - Update CHANGELOG (#189) - Update README for 0.4.0 (#189) - GitHub site - Thanks - kojix2 - Benson Muite ## [0.3.0] - 2022-12-18 - Breaking change - Supported Ruby version has changed from 2.7 to 3.0 - Upgrade minimum supported/required version of Ruby from 2.7 to 3.0 (#159, #160) - Bug fixes - Add check with #key? in DataFrame#method_missing (#140) - Delete unnecessary backslash to supress warning in unary functions (#140) - Fix syntax in code_climate.yml (144) - Temporary disable simplecov test report (#149) - Change Vector#[] to return Array or scalar (#148) - Add missing simplecov HTML formatter (#148) - Change return value of DataFrame#save to self (#160) - Originally reported by kojix2. - New features and improvements - Update Vector#take to accept block (#148) - Add properties of list Vectors (#148) - Add Vector#split, #split_to_column, #split_to_row (#148) - Add Vector#merge (#148) - Refactoring - Refactor code (#140) - Add DataFrame.create as a faster constructor - Refactor DataFrame.new using refinements and duck typing - Refactor Vector.new using refinements and duck typing - Add Vector.create as a faster constructor - Refactor Group - Refactor DataFrame#pick/#drop by refininig Array - Refactor DataFrame#pick/#drop - Refactor nil treatment in pick/drop - Refactor DataFrame#pick/#drop using new parser - Refactor DataFrame#[] - Refactor Vector#[], #take, #filter by updating parser - Add for_keys option to parse_args - Refactor Vector properties by refinements for Arrow::Array - Refactor DataFrame selectable using Arrow::Array refinements instead of Vector methods - Refactor DataFrame#assign - Refine error message in DataFrame#to_long/to_wide #143) - Refactor Vector#take/filter returns arrow array (#148) - Change LineLength in cop from 120 to 90 (#152) - Refine DataFrame combinable (join) operations (#159) - Refine DataFrame#join effectively using outputs options - Simplify DataFrame set operations - Improve in tests/CI - Tests - Update benchmark using 0.2.3 (#138) - Update benchmark basic#02/pick by [] (#140) - Update benchmark contexts and loop_count (#140) - Add benchmark for vector (#140) - Add tests for refinements (#140) - Add benchmark for the series of DataFrame operations (#140) - Add missing test for tdr and dictionary (#140) - Add missing test for group#method with foreign key (#152) - Add missing test for set operations and natural join (#152) - Add missing test for DataFrame#[] with selecting by Array of illegal type' (#152) - Add missing test for DataFrame#assign when assigner size is mismatch (#152) - Accept Hash as join keys in DataFrame join methods (#159) - Cops - Refactor/clean rubocop.yml (#138) - CI - Support Ruby 3.2 in CI test (#141) - Send test coverage report to Code Climate (#144) - Add test on Fedora (#151) - Thanks to Benson Muite. - Add workflow to generate document (#153) - Thanks to kojix2. - Support Code Climate test coverage report in CI (#155) - Documentation - Add YARD in data_frame.rb (#140) - Fix YARD document in the code (#140) - Add Code Climate badges of maintainability and coverage (#144) - Add installation for Fedora in README (#147) - Thanks to Benson Muite. - Add Vector#split/merge in Vector.md (#148) - Fix codeclimate badges in README (#155) - Update YARD in DataFrame join methods (#159) - Update jupyter notebook '89 examples of Redamber' (#160) - Thanks - Benson Muite - kojix2 ## [0.2.3] - 2022-11-16 - Bug fixes - Fix DataFrame#to_s when DataFrame.size == 0 (#125) - Remove unused lines in funcs (#128) - Remove unused methods in helper (#128) - Add test for invalid arg in DataFrame.new (#128) - Add test for Vector#shift(0) (#128) - Fix bugs for DataFrame#[], #pick and #drop with Range of Symbols and Symbol (#135) - New features and improvements - Upgrade dependency to Arrow 10.0.0 (#132) It is possible to initialize by the objects responsible to `to_arrow` since 0.2.3 . Arrays in Numo::NArray is responsible to `to_arrow` with Red Arrow Numo::NArray 0.0.6 . This feature is proposed by the Red Data Tools member @kojix2 and implemented by @kou. I made also Vector to be responsible to `to_arrow` and `to_arrow_array`. It becomes a member of ducks ('quack quack'). Thanks! - Change dev dependency to red-dataset-arrow (#117) - Add dev dependency for red-arrow-numo-narray (#132) - Support Numo::NArray in Vector.new (#132) - Support Vector#to_arrow_array (#132) - Update group (#118) - Introduce new DataFrame group support (experimental) This additional API will treat a grouped DataFrame as a list of DataFrames. I think this API has pros such as: - API is easy to understand and flexible. - It has good compatibility with Ruby's primitive Enumerables. - We can only use non hash-ed aggregation functions. - Do not need grouped DataFrame state, nor `#ungroup` method. - May be useful for concurrent operations. This feature is implemented by Ruby, so it is pretty slow and experimental. Use original Group API for practical purpose. - `include Enumerable` to Group (experimental) - Add Group#each, #inspect - Refactor Group to align with Arrow - Introduce DataFrame combining methods (#125) - Introduce DataFrame#concatenate method - Add DataFrame#merge method - Add DataFrame#inner_join method - Add DataFrame#full_join method - Add DataFrame#left_join method - Add DataFrame#right_join method - Add DataFrame#semi_join method - Add DataFrame#anti_join method - Add DataFrame#intersect method - Add DataFrame#union method - Add DataFrame#setdiff method - Rename #setdiff to #difference - Support natural join in DataFrame#join - Support partial join_key and renaming - Fix DataFrame#join to merge key columns - Add DataFrame#set_operable? method - Add join/set/bind image to DataFrame.md - Fix DataFrame#join, #right_semi, #right_anti (#128) - Miscellaneous - Return Vector in DataFrame#indices (#118) - Improve tests/ci - Improve CI - Add CI test on macOS (#133) - Enable bundler-cache on macOS (#128) - Add install gobject introspection prior to glib in CI (#133) This will stabilize CI system installation especially with cache. - Rename workflows/test.yml to ci.yml (#133) - Fix link in CI badge of README.md (#118) - Add github action for coverage (#128) - Add benchmark - Add benchmarks with Rover (#118) - Introduce benchmark suite (#134) - Add benchmark for combining operations (#134) - Measuring test coverage - Add test coverage measurement (#128) - Refactoring - Remove redundant string escape in `test_vector_function` (#132) - Refine tests to use `assert_equal_array` (#128) - Rewrite Vector#replace (#128) - Documentation - Update README.md for installation (#126) - Add clause that keys must be unique in doc. (#126) - Rows should be called as 'records' (#126) - Update Jupyter Notebook `83 examples of RedAmber` (#135) - GitHub site - Update Jupyter notebooks in Binder - Change default branch name from 'master' to 'main' (#127) - Thanks Ruby Association Grant committee It is a great honor for selecting RedAmber as a project of Ruby Association Grant 2022. ## [0.2.2] - 2022-10-04 - Bug fixes - Return self when no replacement happen in Vector#replace. (#92) - Limit n-digits in to_iruby. (#111) - Fix displaying space in to_iruby. (#111) - Raise error if key is duplicated. (#113) - Fix DataFrame#pick/#drop with endless Range. (#113) - Change type from dictionary to string in DataFrame reshaping methods. (#113) - Fix arguments parser to accept Enumerator. (#114) - New features and improvements - Support to make a data frame from a to_arrow-responsible object. (#106) [Patch by Kenta Murata] - Introduce DataFrame#auto_cast (experimental feature) (#105) - Change default name in DataFrame#transpose, #to_long, #to_wide. (#110) - Add Vector#dictionary? method. (#113) - Add display mode 'Plain' and 'Minimum'. (#113) - Refactor code - Refine test_vector_selectable. (#92) - Refine test_vector_updatable. (#92) - Refine Vector.new. (#113) - Refine DataFrame#pick, #drop. (#113) - Documents - Update images. (#90, #105, #113) - Update README to use simpler examples. (#112) - Update README with a new screenshot example. (#113) - GitHub site - Update Jupyter notebooks in Binder (#88, #115) - Move binder support to heronshoes/docker-stacks repository. - Update README notebook on binder. - Add examples_of_RedAmber notebook on binder. - Start to use discussions. - Thanks - Kenta Murata ## [0.2.1] - 2022-09-07 - Bug fixes - Fix `Vector#each` with block (#66) `Vector#each` will return value of each element with block. - Fix table format at size == 9 (#67) - Fix to support Vector in `DataFrame#assign` (#77) - Add `assert_delta` functionality for `assert_with_NaN` (#78) - Fix Vector#is_in when self is chunked (#79) - Fix Array type error (uint/int) (#79) - New features and improvements - Refine `DataFrame#indices` method (#67) - Update DataFrame reshaping methods (#73) - Change default option value of DataFrame reshaping - Change the order of import_cars example - Add `DataFrame#method_missing` to get column vector by method (#75) - Add `DataFrame#method_missing` to get column (#75) - Accept both args and block in `DataFrame#assign` (#75) - Accept indices in `DataFrame#pick` and `DataFrame#drop` (#76) - Add `DataFrame#slice_by` method (#77) - Add new Vector functions (#78) - Add inverse trigonometric function for Vector - `acos` - `asin` - Add logarithmic function for Vector - `ln` - `log10` - `log1p` - `log2` - Add binary function `Vector#logb` - Docker image and Jupyter Notebook [Thanks to Kenta Murata] - Add link to RubyData in README - Add link to interactive README by Binder - Update Jupyter Notebook `71 examples of RedAmber` - Thanks - Kenta Murata ## [0.2.0] - 2022-08-15 - Bump version up to 0.2.0 - Bug fixes - Fix order of multiple group keys (#55) Only 1 group key comes to left. Other keys remain in right. - Remove optional `require` for rover (#55) Fix DataFrame.new for argument with Rover::DataFrame. - Fix occasional failure in CI (#59) Sometimes the CI test fails. I added -dev dependency in Arrow install by apt, not doing in bundler. - Fix calling :take in V#[] (#56) Fixed to call Arrow function :take instead of :array_take in Vector#take_by_vector. This will prevent the error below when called with Arrow::ChunkedArray. - Raise error renaming non existing key (#61) Add error when specified key is not exist. - Fix DataFrame#rename #assign by array (#65) - New features and improvements - Support Arrow 9.0.0 - Upgrade to Arrow 9.0.0 (#59) - Add Vector#quantile method (#59) Arrow::QuantileOptions has supported in Arrow GLib 9.0.0 (ARROW-16623, Thanks!) - Add Vector#quantiles (#62) - Add DataFrame#each_row (#56) - Returns Enumerator if block is not given. - Change DataFrame#each_row to return a Hash {key => row} (#63) - Refactor to use pattern match in overloaded parameter parsing (#61) - Refine DataFrame.new to use pattern match - Use pattern match in DataFrame#assign - Use pattern match in DataFrame#rename - Accept Array for renamer/assigner in #rename/#assign (#61) - Accept assigner by Arrays in DataFrame#assign - Accept renamer pairs by Arrays in DataFrame#rename - Add DataFrame#assign_left method - Add summary/describe (#62) - Introduce DataFrame#summary(#describe) - Introduce reshaping methods for DataFrame (#64) - Introduce DataFrame#transpose method - Intorduce DataFrame#to_long method - Intorduce DataFrame#to_wide method - Others - Add alias sort_index for array_sort_indices (#59) - Enable :width option in DataFrame#to_s (#62) - Add options to DataFrame#format_table (#62) - Update Documents - Add Yard doc for some methods - Update Jupyter notebook '61 Examples of Red Amber' (#65) ## [0.1.8] - 2022-08-04 (experimental) - Bug fixes - Fix unnamed column in table formatter (#52) - Fix DataFrame#key?, DataFrame#key_index when @keys.nil? (#52) - Align order of replacer in Vector#replace (#53, resolved #38) - New features and improvements - Refine DataFrame.new for empty arguments (#50) - Delete .rubocop_todo.yml for not to use yoda condition (#50) - Refine Group (#52, resolved #28) - Refine Group methods creation - Make group key at first(left) - Show only one group count when same counts - Add block acceptability for group - Rename empty key to :unnamed in DataFrame.new - Rename Group#aggregated_by to #summarize (#54) - Add Vector#shift (#51) - Vector#[] accepts Range as an argument (#51) - Update documents - Add support for yard (#54) - Renew jupyter notebook '53 examples' (#54) - Add more examples and images in README (#52) - Add document of group manipulations in README (#52) - Renew DF#group document in DataFrame.md (#52) ## [0.1.7] - 2022-07-15 (experimental) - Bug fixes - Remove development dependency for red-dataset-arrow (#47) - To avoid irregular fails in CI test - Add red-datasets to development dependency instead (#49) - Supress useless log in tests (#46) Suppress log of Webrick and iruby. - New features and improvements - Use Table mode as default preview mode in `inspect`/`to_s` (#40) - Show examples in documents in Table - Use the word rows/columns - Update images of data processing in Table style - Introduce a new Table formatter (#47) - Migrate from the Arrow's formatter - Do not use TAB, format by spaces only. - Align column width with head rows and tail rows. - Show nils. - Show data types. - Refine documents to use new formatter output - Simplify options of Vector functions (#46) Vector functions with options use optional argument opt in previous code. - Add `#float?`, `#integer?` to Vector (#46) - Add `#each` to Vector (#47) - Introduce class `Group` (#48) - Refine `DataFrame#group` to use class Group - Add methods to Group - Move parquet and rover to development dependency (#49) - Refine text in `DataFrame#to_iruby` (#40) - Add badges in Github site - Gitter badge for Red Data Tools (#42) - Gem version and CI status badge (#45) - Exchange containers in red-amber.rb and red_amber.rb (#47) - Mainly use red_amber by consistency with the folder name - Add Jupyter notebook '47 Examples of Red Amber' (#49) ## [0.1.6] - 2022-06-26 (experimental) - Bug fixes - Fix mime-type of empty DataFrame in `#to_iruby` (#31) - Fix mime setting in `DataFrame#to_iruby` (#36) - Fix unmatched return val in Selectable (#34) - Fix to return same error as `#[]` in `DataFrame#slice` (#34) - New features and improvements - Introduce Jupyter support (#29, #30, #31, #32) - Add `DataFrame#to_html (changed to use #to_iruby) - Add feature to show nil in to_iruby - nil is expressed as (nil) - empty string('') is "" - blank spaces are " " - Enable to change DataFrame display mode by ENV (#36) - Support ENV['RED_AMBER_OUTPUT_STYLE'] to change display mode in `#inspect` and `#to_iruby` - ENV['RED_AMBER_OUTPUT_STYLE'] = 'table' # => Table mode - ENV['RED_AMBER_OUTPUT_STYLE'] = nil or other than 'table' # => TDR mode - Support `require 'red-amber'`, as well (#34) - Refine Vector slicing methods (#31) - Introduce `Vector#take` method - Introduce `Vector#filter` method - Improve `Vector#[]` to overload take and filter - Introduce `Vector#drop_nil` method - Introduce `Vector#if_else` method - Intorduce `Vector#is_in` method - Add alias `Vector#all?`, `#any?` methods (#32) - Add `Vector#has_nil?` method(#32) - Add `Vector#empty?` method - Add `Vector#primitive_invert` method - Refactor `Vector#take`, `#filter` - Move `Vector#if_else` from function to Updatable - Move if_else test to updatable - Rename updatable in test - Remove method `Vector#take_out_element_wise` - Rename inner metthod name - Refine DataFrame slicing methods (#31) - Introduce `DataFrame#take method - #take is implemented as vector calculation by #if_else - Introduce `DataFrame#fliter method - Change `DataFrame#[] to use take and filter - Float indices is acceptable (#10) - Negative index (like Array) is also acceptable - Further refinement in DataFrame slicing methods (#34) - Improve `DataFrame#[]`, `#slice`, `#remove` by a new engine - It parses arguments to Vector internally. - Used Kernel#Array to simplify code (#16) . - Move `DataFrame#slice`, `#remove` to Selectable - Refine `DataFrame#take`, `#filter` (undocumented) - Introduce coerce in Vector (#35) - Introduce `Vector#coerce` - Now we can `-1 * Vector.new([1, 2, 3])` - Add `Vector#to_ary` method - Now we can `[1, 2] + Vector.new([3, 4, 5])` - Other new feature or refinements - Common - Refactor helper as common for DataFrame and Vector (#35) - Change name row/col to obs/var (#34) - Rename internal function name (#34) - Delete unused methods (#34) - DataFrame - Change to return instance variable in `#to_arrow`, `#keys` and `#key_index` (#34) - Change to return an Array in `DataFrame#indices` (#35) - Vector - Introduce `Vector#replace` method - Accept Range and expanded Array in `Vector#new` - Add `Vector#indices` method (#35) - Add `Vector#index` method (#35) - Rename VectorCompensable to *Updatable (#33) - Documentation - Fix typo in DataFrame.md - Github site - Add gem and status badges in README. (#42) [Patch by kojix2] - Thanks - kojix2 ## [0.1.5] - 2022-06-12 (experimental) - Bug fixes - Fix DataFrame#tdr to display timestamp type (#19) - Add TZ setting in CI test to pass temporal tests (#19) - Fix example in document of #load(csv_from_URI) (#23) - New features and improvements - Improve usability of DataFrame manipulating block (#19) - Add `DataFrame#v` to select a Vector - Add `DataFrame#variables` method - Add `DataFrame#to_arrow` - Add instance variables in DataFrame with lazy initialization - Add `Vector#key` to get key name - Add `Vector#temporal?` to check if temporal type - Refine around DataFrame#variables - Refine init of instance variables - Refine DataFrame#type_classes, Vector#ectortype_class - Refine DataFrame#tdr to shorten temporal data - Add supports to make up for missing values (#20) - Add VectorArgumentError - Add `Vector#replace_with` - Add helper function to assert with NaN - To assert NaN == NaN - Add `Vector#fill_nil_backward`, `Vector#forward` - Add `DataFrame#remove_nil` method - Change to accept nil as replacement in Vector#replace_with - Introduce index related methods (#22) - Add `Vector#sort_indexes` method - Add `Vector#uniq` method - Add `Vector#tally` and `Vectorvalue_counts` methods - Add `DataFrame#sort` method - Add `DataFrame#group` method - Change to use DataFrame#map_indices in #[] - Add rounding functions with opts (#21) - With options :mode and :n_digits - :n_digits also can be specified with :multiple option in `Vector#round_to_multiple` - `Vector#round` - `Vector#ceil` - `Vector#floor` - `Vector#trunc` - Documentation - Update TDR, TDR_ja documents to latest (#18) - Refinement and small fix in DataFrame.md (#18) - Update README to use more effective example (#18) - Delete expired TDR_operations.pdf (#23) - Update README and dataframe_model image (#23) - Update description about rover-df in README (#23) - Add installation of Arrow in README (#23) - Others - Tried but cannot use bundler cache in ci test (#17) - Bump up requirements to Arrow 8.0.0 (#25) - Arrow 7.0.0 with Ubuntu 21.04 causes an fatal error in replace_with_mask function. - Update the description of gem (#23) - Add benchmark tests (#26) ## [0.1.4] - 2022-05-29 (experimental) - Bug fixes - Fix missing support for scalar argument (#1) - Fix type name of boolean in DataFrame#types to be same as Vector#type (#6, #7) - Fix zero picking to return empty DataFrame (#8) - Fix code at both args and a block given (#8) - New features and improvements - `DataFrame` - Refine module name `Displayable` - Rename nrow/ncol methods to `size`/`n_keys` to align with TDR concept (#4) - Remain `n_row`/`n_col` for compatibility - Rename `ls` method to `tdr` (#4) - Add limit option to `tdr` - Shorten option name (#11) - Introduce `pick` method to create sub DataFrame (#8) - Add boolean support (#8) - Refactor `pick` (#9) - Introduce `drop` method to create sub DataFrame (#8) - Add boolean support (#8) - Refactor `drop` (#9) - Add boolean array support for `[]` (#9) - Add `indexes`/`indices` to use with selecting observations (#9) - Introduce `slice` method to create sub DataFrame (#8) - Refactor `slice` (#9) - Introduce `remove` method to create sub DataFrame (#9) - Introduce `rename` method to create sub DataFrame (#14) - Introduce `assign` method to create sub DataFrame (#14) - Improve to call block by instance_eval (#13) - `Vector` - Refine `find(function)` - Add `min_max` method (#2) - Add `std`/`sd` method (ddof=0 version: `stddev`) (#2) - Add `var` method (ddof=0 version: `variance`) (#2) - Add `VectorFunctions.arrow_doc(func_name)` (temporally) - Documentation - Show code in README - Change row/column names for **TDR** concept (#4) - Add documents about **TDR** concept (#4) - Add example about TDR (#4) - Separate README to create DataFrame and Vector documents (#12) - Add DataFrame model concept image to README (#12) - GitHub site - Switched to use merge on GitHub (not to push merged master) (#1) - Create lifetime issue #3 to show the goal of this project (#3) ## [0.1.3] - 2022-05-15 (experimental) - Bug fixes - Fix boolean functions in `Vector` to align with Ruby's behavior - `&` == `and_kleene` - `|` == `or_kleene` - Quote strings of data-preview in `DataFrame#inspect` - Quote empty and blank keys in `DataFrame#inspect` - Respond to error for a wrong key in `DataFrame#[]` - New features and improvements - `DataFrame` - Display nil elements in `inspect` - Show NaN and nil counts in `inspect` - Refactor `inspect` - Add method `key` and `key_index` - Add how to load/save Parquet to README - `Vector` - Add categorization functions This is an important step to support `slice` method and NA treatment features. - `is_finite` - `is_inf` - `is_na` (RedAmber original) - `is_nan` - `is_nil`, `is_null` - `is_valid` - Show in a reduced representation for long array in `inspect` - Support options in aggregatiton functions - Return values in non-arrow object for scalar aggregation functions ## [0.1.2] - 2022-05-08 (experimental) - Bug fixes: - `DataFrame` - Fix bug in `#[]` with end-less Range - New features and improvements - Add support for Arrow 8.0.0 - `DataFrame` - `types` and `data_types` - Range is usable to specify columns in `#[]` - `Vector` - `type` and `data_type` ## [0.1.1] - 2022-05-06 (experimental) - Release on rubygems.org - Introduce class `DataFrame` - New from Hash, schema/rows, `Arrow::Table`, `Rover::DataFrame` - Load from file, string, URI - Save to file, string, URI - Methods for basic properties - Rich inspect method - Basic selecting by `#[]` - Introduce class `Vector` - New from a column in a `DataFlame` - New from `Arrow::Array`, `Arrow::ChunkedArray`, `Array` - Methods for basic properties - Function support - Unary aggregations - Unary element-wises - Binary element-wises - Some operators defined ## [0.1.0] - 2022-04-15 (unreleased) - Initial version