README.md in summaryse-1.0.0 vs README.md in summaryse-1.1.0

- old
+ new

@@ -1,6 +1,6 @@ -# Array#summaryse +# Array#summaryse (version 1.1.0) [sudo] gem install summaryse ## Links @@ -22,13 +22,13 @@ ## An opinionated use-case -- YAML merging In many projects of mine including {https://github.com/blambeau/noe noe}, {https://github.com/blambeau/agora agora} or -{https://github.com/blambeau/dbagile dbagile}, a common need is merge YAML files. -Merging YAML files is complex because you need full control of how merging applies -on specific tree nodes. Summaryse solves this very effectively. +{https://github.com/blambeau/dbagile dbagile}, a common need is to merge YAML +files. Merging YAML files is difficult because you need full control of how +merging applies on specific tree nodes. Summaryse solves this very effectively. # This is left.yaml left = YAML.load ... # syntactically wrong, but to avoid Yard's rewriting hobbies: - ruby @@ -140,22 +140,20 @@ { :hobbies => [:ruby], :size => 12 }, { :hobbies => [:music], :size => 17 } ].summaryse(:hobbies => :union, nil => :first) # => {:hobbies => [:ruby, :music], :size => 12} -### Specifying with lambdas +### Nonexistent keys -When no default summarization function fit your needs, just pass a lambda. It -will be called with the array of values on which aggregation must be done: +Specifying nonexistent keys is also permitted. In this case, the evaluation is +done on an empty array: [ { :hobbies => [:ruby], :size => 12 }, { :hobbies => [:music], :size => 17 } - ].summaryse(:hobbies => :union, :size => lambda{|a| - a.join(', ') - }) - # => {:hobbies => [:ruby, :music], :size => "12, 17"} + ].summaryse(:hello => lambda{|a| a}) + # => {:hello => []} ## On Arrays of Hash-es Summarizing an Array of Array-s of Hash-es yields -> an Array of Hash-es @@ -175,30 +173,103 @@ A quick remark: when merging arrays of hashes, #summaryse guarantees that the returned hashes are in order of encountered 'by key' values. 
That is, in the example above, yard comes before summaryse that comes before treetop because this is the order in which they have been seen initially. -# By the way, why this stupid name? +# Some extra goodness -Just because summarize was already an {https://rubygems.org/gems/summarize existing gem}. -Summaryse is also much less likely to cause a name clash on the Array class. And -I'm a french-speaking developer :-) +## On empty arrays -And where does 'summarize' come from? The name is inspired by (yet not equivalent -to) {http://en.wikipedia.org/wiki/D_(data_language_specification)#Tutorial_D -TUTORIAL D}'s summarization operator on relations. -See my {https://github.com/blambeau/alf alf} project. Array#summaryse is -rubyiesque in mind and does not conform to a purely relational vision of -summarization, though. +For now, no special support is provided for the corner cases. One could argue +that the sum of an empty array should be 0, but this is wrong because of duck +typing (maybe you try to sum something else)... A nil value is returned in +almost all empty cases unless the semantics is very clear: + [].summaryse(:count) # => 0 + [].summaryse(:sum) # => nil + [].summaryse(:avg) # => nil + + [].summaryse(:min) # => nil + [].summaryse(:max) # => nil + [].summaryse(:first) # => nil + [].summaryse(:last) # => nil + + [].summaryse(:intersection) # => nil + [].summaryse(:union) # => nil + +Special support for specifying a default value to use on empty arrays should +be provided around 2.0. Don't hesitate to contribute a patch if you need it +earlier. + +## Specifying with lambdas + +When no default summarization function fits your needs, just pass a lambda. 
It +will be called with the array of values on which aggregation must be done: + + [ + { :hobbies => [:ruby], :size => 12 }, + { :hobbies => [:music], :size => 17 } + ].summaryse(:hobbies => :union, :size => lambda{|a| + a.join(', ') + }) + # => {:hobbies => [:ruby, :music], :size => "12, 17"} + +## Registering your own aggregators + +Since 1.1, you can register your own aggregation functions. Such a function simply +takes a single argument which is an array of values to aggregate. This is +especially useful to install new and/or override existing aggregation functions. +This also allows handling parameters: + + Summaryse.register(:comma_join) do |ary| + ary.join(', ') + end + [1, 4, 12, 7].summaryse(:comma_join) # => "1, 4, 12, 7" + +## Aggregator objects + +Sometimes it is useful to use your own objects as aggregators. For this, simply +provide them with a to_summaryse method that returns an aggregation function: + + class Foo + def to_summaryse; :sum; end + end + [1, 2, 3].summaryse(Foo.new).should eq(6) + +The returned object is anything that can be seen as an aggregation function: a +Symbol, a Proc, a Hash, an Array, or even another such object: + + class Bar + def to_summaryse; Foo.new; end + end + [1, 2, 3].summaryse(Bar.new).should eq(6) + +## Bypassing Hash entries + +It is sometimes useful to explicitly bypass specific Hash entries as the result of +the computation. Entries for which the aggregator returns Summaryse::BYPASS will +be simply removed from the result: + + [ + { :hobbies => [:ruby], :size => 12 }, + { :hobbies => [:music], :size => 17 } + ].summaryse(:size => :max, :hobbies => lambda{|a| Summaryse::BYPASS}) + # => {:size => 17} + # Contribute, Versioning and so on. As usual: the code is on {http://github.com/blambeau/summaryse github}, I follow {http://semver.org/ semantic versioning} (the public API is almost everything but implementation details, that is, the method name, its recognized arguments and the semantics of the returned value), etc. 
-Now, frankly, you can also copy/paste the source code of this simple array -extension in your own project. This tend to be much friendly and much simpler -than using a gem, IMHO. Reuse by copy-pasting even has a name: -{http://revision-zero.org/reuse code scavenging}. - +## By the way, why this stupid name? + +Just because summarize was already an {https://rubygems.org/gems/summarize existing gem}. +Summaryse is also much less likely to cause a name clash on the Array class. And +I'm a French-speaking developer :-) + +And where does 'summarize' come from? The name is inspired by (yet not equivalent +to) {http://en.wikipedia.org/wiki/D_(data_language_specification)#Tutorial_D TUTORIAL D}'s +summarization operator on relations. See my {https://github.com/blambeau/alf alf} +project. Array#summaryse is rubyiesque in mind and does not conform to a purely +relational vision of summarization, though.
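
The 1.1.0 additions in this diff (registered aggregators, aggregator objects, bypassed Hash entries, nonexistent keys and empty arrays) can be tried out without installing the gem using the rough sketch below. It is not the gem's implementation: the `SummarySketch` module and its `register`, `resolve` and `summarize` methods are hypothetical names invented for illustration, and only the behaviour described in the README text is mimicked.

```ruby
# Hypothetical stand-in for the gem; none of these names come from summaryse.
module SummarySketch
  BYPASS   = Object.new.freeze # sentinel meaning "drop this Hash entry"
  REGISTRY = {}                # aggregator name => block taking an array

  # Store a named aggregation function that receives the array of values.
  def self.register(name, &block)
    REGISTRY[name] = block
  end

  # Turn a Symbol, a Proc, or an object with to_summaryse into a callable.
  def self.resolve(agg)
    case agg
    when Proc   then agg
    when Symbol then REGISTRY.fetch(agg)
    else
      unless agg.respond_to?(:to_summaryse)
        raise ArgumentError, "not an aggregator: #{agg.inspect}"
      end
      resolve(agg.to_summaryse) # may itself return another aggregator object
    end
  end

  # Summarize plain values, or an array of hashes when agg is a Hash.
  def self.summarize(values, agg)
    return resolve(agg).call(values) unless agg.is_a?(Hash)
    result = {}
    agg.each do |key, sub|
      # A key that no hash contains is evaluated on an empty array.
      column = values.select { |h| h.key?(key) }.map { |h| h[key] }
      out = summarize(column, sub)
      result[key] = out unless out.equal?(BYPASS)
    end
    result
  end
end

SummarySketch.register(:count)      { |ary| ary.size }
SummarySketch.register(:sum)        { |ary| ary.inject(:+) } # nil when empty
SummarySketch.register(:max)        { |ary| ary.max }
SummarySketch.register(:comma_join) { |ary| ary.join(', ') }

# Aggregator object, mirroring the Foo class from the README.
class Foo
  def to_summaryse; :sum; end
end

rows = [
  { :hobbies => [:ruby],  :size => 12 },
  { :hobbies => [:music], :size => 17 }
]
summary = SummarySketch.summarize(rows,
  :size    => :max,
  :hobbies => lambda { |a| SummarySketch::BYPASS },
  :hello   => lambda { |a| a })

p SummarySketch.summarize([], :count)                    # => 0
p SummarySketch.summarize([], :sum)                      # => nil
p SummarySketch.summarize([1, 4, 12, 7], :comma_join)    # => "1, 4, 12, 7"
p SummarySketch.summarize([1, 2, 3], Foo.new)            # => 6
p summary == { :size => 17, :hello => [] }               # => true
```

Using a unique frozen object as the sentinel lets the sketch tell a deliberate bypass apart from a legitimate `nil` result, which matters because empty arrays mostly summarize to `nil`.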