Ruby, iOS, and Other Development

A place to share useful code snippets, ideas, and techniques

All code in posted articles shall be considered public domain unless otherwise noted.
Comments remain the property of their authors.

2007-05-21

Named Array Slots

Sometimes you have an array of data that isn't quite complicated enough for a full-fledged data model, but you want to access elements by name rather than positionally. Probably you even have a bunch of these arrays with positions corresponding to named fields. These arrays might have come from a DBI query, or CSV, or parsing some arbitrary data file, but ultimately you have a need to make your code more readable and avoid poking at these data structures with error-prone magic numbers. What you actually want is a Ruby module that defines methods to access the fields, with which you can then extend the Array objects.

Suppose that your arrays represent users and have four elements, in order: name, gender, email, zip. The naïve, ad hoc way of doing things, then, is:

module MyFields
  def name
    self[0]
  end
  def name=(val)
    self[0] = val
  end
  def gender
    self[1]
  end
  def gender=(val)
    self[1] = val
  end
  def email
    self[2]
  end
  def email=(val)
    self[2] = val
  end
  def zip
    self[3]
  end
  def zip=(val)
    self[3] = val
  end
end

What a mess, and that's for just four fields! Let's do a little metaprogramming. It's still simple and ad hoc, but it's better:

module MyFields
  %w(name gender email zip).each_with_index { |field,i|
    define_method(field) { self[i] }
    define_method("#{field}=") { |val| self[i] = val }
  }
end
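
To put the module to work, extend each row array with it. Here is a minimal sketch; the module is repeated under the hypothetical name UserFields so the snippet runs on its own, and the sample row is made up for illustration:

```ruby
# Same technique as above, repeated so this snippet is self-contained.
module UserFields
  %w(name gender email zip).each_with_index do |field, i|
    define_method(field) { self[i] }
    define_method("#{field}=") { |val| self[i] = val }
  end
end

# Hypothetical sample row; in practice it would come from DBI, CSV, etc.
row = ["Alice", "F", "alice@example.com", "02139"]
row.extend(UserFields)   # adds the accessors to this one Array object only

row.name                 # reads row[0]
row.zip = "10001"        # writes row[3]
row[3]                   # positional access still works and sees the change
```

One thing to watch for: a field name that matches an existing Array method, such as zip here, shadows that method on the extended object. Arrays that have not been extended are unaffected.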

Much better, and we can change the list of fields pretty easily. Still, if we have several different sets of fields (e.g. rows from several different database tables) that's a lot of syntax for something pretty simple. Also, if both the field names and data are coming from an external data source, you may only care about some limited number of those fields but still need to get all of them properly named in the correct order. Ultimately, you'd like to be able to take an array of arbitrary objects, convert the objects to strings, and get a module with which you can extend your row arrays out of it. Something like this:

class Array
  ConvertElementsToFields = lambda { |f|
    f = "#{f}" # get as a new string, even if it's already a String
    f.downcase!
    f.gsub!(/[^\w]+/, '_')
    f
  }
  def field_names_module(&convert)
    convert ||= ConvertElementsToFields
    fields = self
    Module.new do |mod|
      const_set 'Fields',
        fields.map(&convert).each_with_index { |f,i|
          f.freeze
          define_method(f) { self[i] }
          define_method("#{f}=") { |val| self[i] = val }
        }.freeze
      unless method_defined?(:field_list)
        define_method("field_list") { mod::Fields }
      end
    end
  end
end

The simple case, where we know the list of fields ahead of time, looks like this:

MyFields = %w(name gender email zip).field_names_module

The more complicated case where we don't know the field names/positions ahead of time is almost as easy. Consider a result from a DBI query:

MyFields = result.fetch_fields.field_names_module { |field| field.name }
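
Both calls can be tried without a database. The sketch below restates the Array extension in compact form so it runs on its own; the Column struct, the sample headers, and the sample rows are hypothetical stand-ins for whatever your data source actually returns:

```ruby
# Compact restatement of the Array extension above so this runs standalone.
class Array
  def field_names_module(&convert)
    # Default conversion: stringify, downcase, squash non-word runs to "_".
    convert ||= lambda { |f| f.to_s.downcase.gsub(/[^\w]+/, '_') }
    fields = self
    Module.new do |mod|
      const_set 'Fields',
        fields.map(&convert).each_with_index { |f, i|
          f.freeze
          define_method(f) { self[i] }
          define_method("#{f}=") { |val| self[i] = val }
        }.freeze
      define_method("field_list") { mod::Fields } unless method_defined?(:field_list)
    end
  end
end

# Simple case: names known ahead of time (hypothetical CSV-style headers).
ContactFields = ["Name", "Gender", "E-mail", "Zip Code"].field_names_module
row = ["Alice", "F", "alice@example.com", "02139"].extend(ContactFields)
row.e_mail              # "E-mail" became the method name e_mail
row.zip_code = "10001"  # writes row[3]

# Block case: Column is a hypothetical stand-in for a field descriptor.
Column = Struct.new(:name)
columns = [Column.new("id"), Column.new("login")]
AccountFields = columns.field_names_module { |col| col.name }
account = [42, "alice"].extend(AccountFields)
account.login           # reads account[1]
account.field_list      # the frozen array of field names
```

Note that a block replaces the default conversion rather than adding to it, so whatever the block returns is used verbatim as the method name.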

Still pretty easy, even for the complicated case. Enjoy!


3 Comments:

  • At 5/21/2007 02:20:00 PM, Anonymous said…

    Can you explain the context in which you needed this? I'm interested because a Hash is the obvious solution.

    If nothing else though, this does show what kind of cool stuff you can do with Ruby.

     
  • At 5/21/2007 06:58:00 PM, Blogger Gregory said…

    I actually mentioned the context in the posting, though sort of in passing. I ran into this both when using DBI to query a database and when using FasterCSV to load a file. (Yes, I know about the header row automation in FasterCSV, but it does not do what I want. See below.) In both cases there were enough rows that instantiating an additional object for each row posed significant performance issues. Extending each row with a module as it is read, however, is efficient and effective.

    On top of that, when I am dealing with something with a static structure (e.g. a row of structured data) I generally prefer to be able to treat it as a struct (or Struct). You'll notice that I am not redefining [] and []=, I am defining a getter and setter for each named field. I want a row to act like a struct, rather than a hash, because it is semantically more like a struct than a hash.

     
  • At 5/22/2007 12:39:00 PM, Anonymous said…

    That makes sense. I guess my concern is that you didn't discuss where this code would go, and readers might end up using this technique in the wrong place.

    For example, if I write some library that uses a CSV file as the data store, it'd be pretty bad of me to hand you an array and a link to this article. I should be using this technique within the library itself in order to give the client programmer a data structure that uses the domain language.

    I just wanted to clarify that in case other people are as dull as me and don't see it right away :)

     
