Ruby, iOS, and Other Development

A place to share useful code snippets, ideas, and techniques

All code in posted articles shall be considered public domain unless otherwise noted.
Comments remain the property of their authors.

2006-06-15

Enumerable#bucket_by (and #uniq_by and #uniq_by!)

I just learned about Enumerable#partition and immediately wanted to partition into more than two parts. The #partition method is certainly ideal for implementing a quicksort, but general bucketing needs something less boolean:

module Enumerable
  def bucket_by
    hash = Hash.new { |h,k| h[k] = [] }
    each { |v| hash[yield(v)] << v }
    hash.default = nil  # drop the default block so a missing key returns nil instead of adding an empty bucket
    hash
  end
end
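As a rough usage sketch (the word list is made-up sample data, not from the original post, and the definition is restated only so the snippet runs on its own), bucketing by first letter might look like this:

```ruby
# Restated from the post so this snippet is self-contained.
module Enumerable
  def bucket_by
    hash = Hash.new { |h, k| h[k] = [] }
    each { |v| hash[yield(v)] << v }
    hash.default = nil
    hash
  end
end

# Hypothetical sample data for illustration only.
words = %w[apple avocado banana blueberry cherry]

# Group the words by their first character.
by_letter = words.bucket_by { |w| w[0, 1] }
# by_letter is now:
#   { "a" => ["apple", "avocado"],
#     "b" => ["banana", "blueberry"],
#     "c" => ["cherry"] }

# A missing key returns nil and does not add an empty bucket.
missing = by_letter["z"]
```

Unlike #partition, the number of buckets here is decided by however many distinct values the block returns.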

The bucket_by method takes a block and returns a hash that maps each distinct block return value to an array of the elements that produced it. Simple and convenient. And hey, while we're using blocks for stand-in values and _by suffixes:

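module Enumerable
  # Returns the elements whose block value has not been seen before.
  def uniq_by
    seen = {}
    select { |v|
      key = yield(v)
      # Truthy only the first time a given key appears.
      (seen[key]) ? nil : (seen[key] = true)
    }
  end
end

class Array
  # In-place counterpart of select, built on reject!.
  def select!
    reject! { |v| not yield(v) }
  end

  # In-place version of uniq_by.
  def uniq_by!
    seen = {}
    select! { |v|
      key = yield(v)
      (seen[key]) ? nil : (seen[key] = true)
    }
  end
end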
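Here is a small usage sketch for both methods. The names array is invented sample data, and the definitions are repeated from above only so the snippet stands alone. Note that later Ruby versions ship their own Array#select!, which this redefinition would override.

```ruby
# Restated from the post so this snippet is self-contained.
module Enumerable
  def uniq_by
    seen = {}
    select { |v|
      key = yield(v)
      (seen[key]) ? nil : (seen[key] = true)
    }
  end
end

class Array
  # Overrides the built-in Array#select! on Rubies that have one.
  def select!
    reject! { |v| not yield(v) }
  end

  def uniq_by!
    seen = {}
    select! { |v|
      key = yield(v)
      (seen[key]) ? nil : (seen[key] = true)
    }
  end
end

# Hypothetical sample data for illustration only.
names = %w[alice Bob ALICE carol bob]

# Non-destructive: keeps the first element seen for each downcased name.
unique = names.uniq_by { |n| n.downcase }
# unique is ["alice", "Bob", "carol"]; names still has all five entries.

untouched = names.dup

# Destructive: trims the receiver itself.
names.uniq_by! { |n| n.downcase }
# names is now ["alice", "Bob", "carol"]
```

In both cases the first element for each key wins, and the surviving elements stay in their original order.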

Enjoy!

Update! See this post with a slightly cleaner implementation.

2 Comments:

  • At 7/09/2006 09:53:00 AM, Blogger Martin DeMello said…

    Here's my implementation of uniq_by - it preserves the order of elements in the original Enumerable.

    def uniq_by(*args, &blk)
      blk ||= lambda {|i| i.send(*args)}
      h = {}
      res = []
      self.each {|i|
        j = blk[i]
        unless h[j]
          h[j] = i
          res << i
        end
      }
      res
    end

     
  • At 11/29/2006 07:27:00 PM, Anonymous Anonymous said…

    Hi--

    Facets has these. Your bucket_by is called partition_by. Your implementation looks a tad better however, so I will apply it to Facets.

    Also uniq_by:

    def uniq_by #:yield:
      h = {}; inject([]) {|a,x| h[yield(x)] ||= a << x}
    end

    Not sure which is the better implementation here, although I don't like the ||= in the Facets version.

    Thanks.

     
