Named Array Slots
Sometimes you have an array of data that isn't quite complicated enough for a full-fledged data model, but you want to access elements by name rather than positionally. Probably you even have a bunch of these arrays with positions corresponding to named fields. These arrays might have come from a DBI query, or CSV, or parsing some arbitrary data file, but ultimately you have a need to make your code more readable and avoid poking at these data structures with error-prone magic numbers. What you actually want is a Ruby module that defines methods to access the fields, with which you can then extend the Array objects.
Suppose that your arrays represent users and have four elements, in order: name, gender, email, zip. The naïve, ad hoc way of doing things, then, is:
module MyFields
  def name
    self[0]
  end
  def name=(val)
    self[0] = val
  end
  def gender
    self[1]
  end
  def gender=(val)
    self[1] = val
  end
  def email
    self[2]
  end
  def email=(val)
    self[2] = val
  end
  def zip
    self[3]
  end
  def zip=(val)
    self[3] = val
  end
end
What a mess, and that's for just four fields! Let's do a little metaprogramming. It's still simple and ad hoc, but it's better:
module MyFields
  %w(name gender email zip).each_with_index { |field,i|
    define_method(field) { self[i] }
    define_method("#{field}=") { |val| self[i] = val }
  }
end
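For reference, here is a minimal sketch of how a row picks up these accessors through Object#extend. The module is repeated so the snippet runs on its own, and the sample rows are made up for illustration:

```ruby
# Same idea as the module above, repeated so this snippet is self-contained.
module MyFields
  %w(name gender email zip).each_with_index do |field, i|
    define_method(field) { self[i] }
    define_method("#{field}=") { |val| self[i] = val }
  end
end

# Made-up sample rows, standing in for whatever your data source returns.
row   = ["Alice", "F", "alice@example.com", "12345"]
other = ["Bob", "M", "bob@example.com", "67890"]

# Only the extended object gains the accessors; Array itself is untouched.
row.extend(MyFields)

row.name           # reads row[0]
row.zip = "54321"  # writes row[3]
```

Because extend works per object, other arrays in the program (like `other` here) are unaffected.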
Much better, and we can change the list of fields pretty easily. Still, if we have several different sets of fields (e.g. rows from several different database tables) that's a lot of syntax for something pretty simple. Also, if both the field names and data are coming from an external data source, you may only care about some limited number of those fields but still need to get all of them properly named in the correct order. Ultimately, you'd like to be able to take an array of arbitrary objects, convert the objects to strings, and get a module with which you can extend your row arrays out of it. Something like this:
class Array
  # Default conversion: stringify, downcase, and turn runs of non-word characters into "_"
  ConvertElementsToFields = lambda { |f|
    f = "#{f}" # get as a new string, even if it's already a String
    f.downcase!
    f.gsub!(/[^\w]+/, '_')
    f
  }

  def field_names_module(&convert)
    convert ||= ConvertElementsToFields
    fields = self
    Module.new do |mod|
      const_set 'Fields', fields.map(&convert).each_with_index { |f,i|
        f.freeze
        define_method(f) { self[i] }
        define_method("#{f}=") { |val| self[i] = val }
      }.freeze
      # method_defined? works whether instance_methods returns Strings (Ruby 1.8) or Symbols (1.9 and later)
      unless method_defined? "field_list"
        define_method("field_list") { mod::Fields }
      end
    end
  end
end
The simple case, where we know the list of fields ahead of time, looks like this:
MyFields = %w(name gender email zip).field_names_module
The more complicated case where we don't know the field names/positions ahead of time is almost as easy. Consider a result from a DBI query:
MyFields = result.fetch_fields.field_names_module { |field| field.name }
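If you don't have a database handy, the block form can be tried with stand-in objects. This is only a sketch: `Column` is a hypothetical substitute for the driver's column metadata, assumed to respond to `name` as in the line above, and the method body is a condensed copy of the one defined earlier (without the Fields constant and field_list) so that the snippet runs on its own.

```ruby
class Array
  # Condensed copy of field_names_module so this example is standalone.
  def field_names_module(&convert)
    convert ||= lambda { |f| f.to_s.downcase.gsub(/[^\w]+/, '_') }
    fields = self
    Module.new do
      fields.map(&convert).each_with_index do |f, i|
        define_method(f) { self[i] }
        define_method("#{f}=") { |val| self[i] = val }
      end
    end
  end
end

# Hypothetical stand-in for the column metadata a database driver hands back.
Column = Struct.new(:name)
columns = %w(name gender email zip).map { |n| Column.new(n) }

# The block extracts the field name from each metadata object.
RowFields = columns.field_names_module { |col| col.name }

# Made-up rows; extend each one as it is read.
rows = [["Alice", "F", "alice@example.com", "12345"]]
rows.each { |r| r.extend(RowFields) }

rows.first.email          # reads rows.first[2]
rows.first.zip = "99999"  # writes rows.first[3]

# Without a block, the default conversion normalizes arbitrary header text.
HeaderFields = ["First Name", "E-mail"].field_names_module
header_row = ["Alice", "alice@example.com"].extend(HeaderFields)
header_row.first_name     # reads header_row[0]
```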
Still pretty easy, even for the complicated case. Enjoy!
Labels: Metaprogramming, Ruby, Tip
3 Comments:
At 5/21/2007 02:20:00 PM, Anonymous said…
Can you explain the context in which you needed this? I'm interested because a Hash is the obvious solution.
If nothing else though, this does show what kind of cool stuff you can do with Ruby.
At 5/21/2007 06:58:00 PM, Gregory said…
I actually mentioned the context in the posting, though sort of in passing. I ran into this both when using DBI to query a database and when using FasterCSV to load a file. (Yes, I know about the header row automation in FasterCSV, but it does not do what I want. See below.) In both cases there were enough rows that instantiating an additional object for each row posed significant performance issues. Extending each row with a module as it is read, however, is efficient and effective.
On top of that, when I am dealing with something with a static structure (e.g. a row of structured data) I generally prefer to be able to treat it as a struct (or Struct). You'll notice that I am not redefining [] and []=, I am defining a getter and setter for each named field. I want a row to act like a struct, rather than a hash, because it is semantically more like a struct than a hash.
At 5/22/2007 12:39:00 PM, Anonymous said…
That makes sense. I guess my concern is that you didn't discuss where this code would go, and readers might end up using this technique in the wrong place.
For example, if I write some library that uses a CSV file as the data store, it'd be pretty bad of me to hand you an array and a link to this article. I should be using this technique within the library itself in order to give the client programmer a data structure that uses the domain language.
I just wanted to clarify that in case other people are as dull as me and don't see it right away :)