RCR 262: more enumerator functionality

Submitted by Kristof Bastiaensen (Thu Aug 05 20:08:25 UTC 2004)

Abstract

This RCR proposes two additions for "enumerator": to_enum with a block, and Enumerable#enum_if.

Problem

The enumerator library provides a way to create an enumerable from any method. This RCR adds two conveniences on top of that: transforming the values the method yields, and filtering them.

Proposal

to_enum creates an enumerable object from any method. I propose that to_enum also accept a block, which transforms each value yielded by the underlying method before it is passed on. For example:

powers = 4.to_enum(:times){ |i| i * i}
powers.to_a
=> [0, 1, 4, 9]
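
The proposed behaviour can be approximated with a small helper, shown here as a sketch only. The name mapped_enum is hypothetical and is not taken from the patch, and the sketch assumes a Ruby version where Enumerator is built in. Note that in current Ruby a block given to to_enum is used to compute the enumerator's size lazily rather than to transform values, so the syntax above does not behave as proposed there.

```ruby
# Hypothetical helper (not part of Ruby or of the patch): builds an
# enumerator over obj.meth(*args) whose yielded values are first passed
# through the given block, as the RCR proposes for to_enum.
def mapped_enum(obj, meth = :each, *args, &transform)
  Enumerator.new do |yielder|
    obj.public_send(meth, *args) do |*values|
      yielder << transform.call(*values)
    end
  end
end

powers = mapped_enum(4, :times) { |i| i * i }
p powers.to_a   # prints [0, 1, 4, 9]
```

The block is stored and only run when the enumerator is traversed, which is what makes the result reusable.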

A new method, Enumerable#enum_if, will also be provided for filtering. The enumerable it returns yields each value for which the given block does not return false. For example:

(0..4).enum_if { |i| i % 2 == 0 }.to_a
=> [0, 2, 4]
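
As a rough illustration of these semantics, here is a sketch that stands in for the proposed method. It is written as a plain method taking the source as an argument so that no core class is modified. That shape is an assumption made for the example, not how the patch implements it. For simplicity it keeps items for which the block returns a true value.

```ruby
# Hypothetical stand-in for the proposed Enumerable#enum_if.
# Nothing is filtered until the returned enumerator is traversed.
def enum_if(source, &test)
  Enumerator.new do |yielder|
    source.each { |item| yielder << item if test.call(item) }
  end
end

evens = enum_if(0..4) { |i| i % 2 == 0 }
p evens.to_a    # prints [0, 2, 4]
```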

This method can be useful as a replacement for Enumerable#select, where it may be inefficient to create a large temporary array:

large_dataset.enum_if { |d| sometest(d) }.collect do |d|
  <some transformations>
end
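
For comparison, later Ruby versions (2.0 and up) offer a similar effect through Enumerable#lazy, which is separate from this RCR. The sketch below uses a Range as a stand-in for large_dataset, and the predicate and transformation are placeholders chosen only for illustration.

```ruby
# Stand-in for large_dataset; no intermediate array of matches is built.
large_dataset = 1..1_000_000

tests_run = 0
result = large_dataset.lazy
                      .select { |d| tests_run += 1; d % 3 == 0 }
                      .map    { |d| d * 10 }
                      .first(3)

p result      # prints [30, 60, 90]
p tests_run   # prints 9
```

Only as many elements are tested as are needed to produce the requested results.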

Any enumerable created by enum_if will reflect any changes made to the original object:

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bar"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=>  [6, 9, -19, -19]

data.concat(["Bear", 20, 3])
ints.to_a
=> [6, 9, -19, -19, 20, 3]
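
The enumerable follows the original object, not the variable that names it. The sketch below illustrates the difference between mutating the array in place and rebinding the variable. The enum_if helper is the same hypothetical stand-in for the proposed method, repeated here so the example runs on its own.

```ruby
# Hypothetical stand-in for the proposed Enumerable#enum_if.
def enum_if(source, &test)
  Enumerator.new do |yielder|
    source.each { |item| yielder << item if test.call(item) }
  end
end

data = ["a", 6, 9, "foo", -19]
ints = enum_if(data) { |i| i.is_a?(Numeric) }
p ints.to_a                    # prints [6, 9, -19]

data.concat(["Bear", 20, 3])   # mutates the array that ints refers to
p ints.to_a                    # prints [6, 9, -19, 20, 3]

data += [42]                   # builds a new array and rebinds data
p ints.to_a                    # prints [6, 9, -19, 20, 3]
```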

Analysis

Currently filtering can be done using Enumerable#select, but this can be inefficient with large data sets, because a temporary array needs to be created. The enumerable created by enum_if will also reflect any changes made to the original object. Using to_enum with a block provides a way of creating custom enumerables that can perform any transformation on the original data.

Implementation

A patch is available, created by Nobu Nakada.

Comments

I haven't used enumerator much, so take these comments in that context.

The block idea: this sounds like you want a kind of default map behavior. But isn't the idea to return something which mimics an iterator -- and then (as with an iterator) tell it what to do? This seems maybe a little too magic.

#enum_if: I don't really like the idea of having an alternative #select because of implementation issues. That brings the implementation too directly into the language -- it's too low-level a decision, I think.

-- David Black


How about providing something very similar to Enumerator that returns a new enumerating object instead of an Array? I.e. what if

  obj.to_enum.select{|x| x.is_a? Numeric}

works like enum_if that is proposed here?

--matz.


"The block idea: this sounds like you want a kind of default map behavior."

Well, the idea is more to have a kind of transparent map behaviour. In this way you can transform one kind of iterator into another. You may want to save the enumerable to avoid doing the same transformation on an object over and over again. Someone had also implemented something similar to have a "lazy" iterator. It would be lazy because the block is only executed when a call to the original object's :each method is made.

  "obj.to_enum.select{|x| x.is_a? Numeric}"

That's nice! I am not sure if it will confuse people who expect Enumerator#select to return an Array (or break something), but I like the idea.

--Kristof Bastiaensen


I strongly advocate the first idea of the block. Just today I ran into a problem which seems quite typical to me and which could be solved elegantly with this. Suppose we have a container which contains multiple records (i.e. items that consist of several members you want to be able to query):

  Employee = Struct.new(:name, :age, :job)
  class Company
    def initialize
      @employees = []
    end
    def names
      @employees.to_enum(:each) {|emp| emp.name}
    end
    def ages
      @employees.to_enum(:each) {|emp| emp.age}
    end
    # even simpler
    def jobs() enum_elems(:job) end
    ...
  private
    def enum_elems(field)
      @employees.to_enum(:each) {|emp| emp[field]}
    end
  end

This would be very efficient even for large employee sets.

The only alternative I can think of at the moment would be to have an additional class that serves as a mapping proxy to an Enumerable. (Even with to_enum with block this might be the case for efficiency reasons.) But it feels more natural to have it integrated here.

About enum_if: I like Matz's idea - I'm just unsure about side effects on existing code. For example, someone may rely on the result being an array and then try to iterate by index for whatever reason (complicated calculations, for example). #enum_if might be a better solution here as it doesn't break compatibility. I'd probably choose another name like #enum_select, because #enum_if sounds to me as if the whole iteration were conditional.

-- Robert


Current voting

Strongly opposed 0
Opposed 3
Neutral 0
In favor 1
Strongly advocate 1