Enumerable#bucket_by (and #uniq_by and #uniq_by!)
I just learned about Enumerable#partition and immediately had the desire to partition into more than two parts. The #partition method is certainly ideal for implementing a quicksort, but for general bucketing there is a need for something less boolean:
module Enumerable
  def bucket_by
    hash = Hash.new { |h,k| h[k] = [] }
    each { |v| hash[yield(v)] << v }
    hash.default = nil
    hash
  end
end
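To see the "more than two parts" idea in action, here is a small usage sketch. The sample numbers and the sign-based block are only illustrative assumptions; the method body is the one from this post.

```ruby
module Enumerable
  # Same definition as in the post, repeated so the snippet runs on its own.
  def bucket_by
    hash = Hash.new { |h, k| h[k] = [] }
    each { |v| hash[yield(v)] << v }
    hash.default = nil # drop the auto-creating default so missing keys give nil
    hash
  end
end

# Three buckets instead of partition's two: negative, zero, positive.
numbers = [-2, 0, 3, -7, 5]
buckets = numbers.bucket_by { |n| n <=> 0 }
# buckets == { -1 => [-2, -7], 0 => [0], 1 => [3, 5] }

# Because the default was reset, an unknown key does not create a new bucket.
missing = buckets[42] # nil
```

Later Ruby versions include Enumerable#group_by, which returns the same kind of hash, so on a current Ruby you may not need the monkey patch at all.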
The bucket_by method takes a block and returns a hash of arrays of objects keyed by the block return values for each element. Simple and convenient. And hey, while we're using blocks for stand-in values and _by suffixes:
module Enumerable
  def uniq_by
    seen = {}
    select { |v|
      key = yield(v)
      (seen[key]) ? nil : (seen[key] = true)
    }
  end
end

class Array
  def select!
    reject! { |v| not yield(v) }
  end

  def uniq_by!
    seen = {}
    select! { |v|
      key = yield(v)
      (seen[key]) ? nil : (seen[key] = true)
    }
  end
end
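A quick usage sketch for uniq_by, with made-up sample data: the block supplies the stand-in value that decides whether two elements count as duplicates, and the first element seen for each value is the one that survives.

```ruby
module Enumerable
  # Same definition as in the post, repeated so the snippet runs on its own.
  def uniq_by
    seen = {}
    select { |v|
      key = yield(v)
      seen[key] ? nil : (seen[key] = true)
    }
  end
end

# Case-insensitive uniqueness: the downcased word is the stand-in value.
words = %w[apple Avocado APPLE banana avocado]
unique = words.uniq_by { |w| w.downcase }
# unique == ["apple", "Avocado", "banana"]
```

Newer Rubies let Array#uniq take a block with the same meaning, so words.uniq { |w| w.downcase } should give the same answer there.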
Enjoy!
Update! See this post with a slightly cleaner implementation.
2 Comments:
At 7/09/2006 09:53:00 AM, Martin DeMello said…
Here's my implementation of uniq_by - it preserves the order of elements in the original Enumerable.
def uniq_by(*args, &blk)
  blk ||= lambda {|i| i.send(*args)}
  h = {}
  res = []
  self.each {|i|
    j = blk[i]
    unless h[j]
      h[j] = i
      res << i
    end
  }
  res
end
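A usage sketch for the version above. Wrapping it in module Enumerable and the sample words are my assumptions; the method body is as posted. The extra feature here is that, when no block is given, the arguments are passed to send, so a method name can stand in for the block.

```ruby
module Enumerable
  # Body as posted in the comment; the module wrapper is an assumption.
  def uniq_by(*args, &blk)
    blk ||= lambda { |i| i.send(*args) }
    h = {}
    res = []
    each { |i|
      j = blk[i]
      unless h[j]
        h[j] = i
        res << i
      end
    }
    res
  end
end

words = %w[a bb cc d eee]

# No block: the symbol is sent to each element to compute its key.
by_name = words.uniq_by(:length)
# by_name == ["a", "bb", "eee"]

# With a block, it behaves like the block-only versions.
by_block = words.uniq_by { |w| w.length }
```

One thing to watch: the seen-check reads back the stored element, so an element that is itself nil or false would not be recognised as already seen.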
At 11/29/2006 07:27:00 PM, Anonymous said…
Hi--
Facets has these. Your bucket_by is called partition_by. Your implementation looks a tad better however, so I will apply it to Facets.
Also uniq_by:
def uniq_by #:yield:
h = {}; inject([]) {|a,x| h[yield(x)] ||= a << x}
end
Not sure which is the better implementation here, although I don't like the ||= in Facets' version.
Thanks.
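For anyone puzzling over the ||= in that one-liner, here is a spelled-out sketch of why it works; the comments are my reading of it, and the sample data is made up. The trick is that a << x returns the accumulator array itself, so the hash stores the accumulator under each new key, and for a repeated key ||= short-circuits and hands that same array back to inject without appending.

```ruby
module Enumerable
  def uniq_by
    h = {}
    inject([]) { |a, x|
      # New key:      append x, remember `a` under the key, and return `a`.
      # Repeated key: h[key] is already `a` (truthy), so nothing is appended
      #               and `a` is returned unchanged.
      h[yield(x)] ||= a << x
    }
  end
end

result = [1, 2, 3, 4, 5, 6].uniq_by { |n| n % 3 }
# result == [1, 2, 3]

empty = [].uniq_by { |n| n }
# empty == [] because inject returns its initial value for an empty collection
```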