RCR 335: Sather-like iterators


I admit Sather iterators have superior functionality, but they're more complex in both behavior and implementation. I don't think it's worth it.

matz.

I agree there are implementation complexity issues, but I think that if Ruby ever wants to take on multi-threaded processing, this would be inevitable. Given the trend to move to multi-core CPUs, such a move would be necessary in the next few years. Sather-like iterators would make a wonderful first step in that direction, without actually having to deal with the full complexity of multiple threads (for a while anyway).

I am not sure how Sather iterators relate to multi-core CPUs. Can you elaborate?

Concerning multi-core, I have a vague idea of Enumerable with concurrent iteration. FYI, see <>.

matz.

Here is a crude implementation; it demonstrates that the implementation isn't that complex. I'm also uploading it as a gem to RubyForge.

  #
  # si.rb - Sather-like iterators for Ruby
  #
  # Author: Oren Ben-Kiki 2006
  #
  # == Overview
  #
  # This file extends Ruby with very basic support for Sather-like iterators.
  # Simply use "si_loop" instead of "loop" and prefix each iterator call with
  # ".si".
  #
  # === Example
  #
  #   a = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
  #   h = { :A => :a, :B => :b, :C => :c, :D => :d }
  #   si_loop do
  #     puts "---"
  #     v1 = a.si(:foo1).each
  #     v2 = a.si(:foo1).each
  #     p [ v1, v2 ]
  #     p a.si.each
  #     p h.si.each
  #   end
  #
  # Produces:
  #
  #   ---
  #   [0, 1]
  #   0
  #   [:D, :d]
  #   ---
  #   [2, 3]
  #   1
  #   [:A, :a]
  #   ---
  #   [4, 5]
  #   2
  #   [:B, :b]
  #   ---
  #   [6, 7]
  #   3
  #   [:C, :c]
  #   ---
  #   [8, 9]
  #   4
  #

  # Context for Sather iteration loops.
  #
  # This class is not meant to be used directly. See the documentation for si.rb
  # for an overview.
  class SiContext
    # Push a new context on the stack when a new loop begins.
    def SiContext.push
      @@maps ||= []
      @@maps.push({})
    end
    # Pop the current context from the stack when a loop ends.
    def SiContext.pop
      @@maps.pop
    end
    # Access an SiIterator in the current scope. The +id+ uniquely identifies
    # the iterator instance, and +object+ is the one that has the iterator
    # method that will be wrapped.
    def SiContext.iterator(id, object)
      @@maps.last[id] ||= SiIterator.new(object)
    end
  end
  # Wrap a Ruby iterator for a Sather iteration loop.
  #
  # This class is not meant to be used directly. Instances are created by calls
  # to SiContext.iterator. See the documentation for si.rb for an overview.
  class SiIterator
    # Create a new iterator wrapper.
    #
    # +object+ is the object that provides the Ruby iterator method.
    def initialize(object)
      @object = object
      @is_first_fetch = true
      @to_call_out = true
      @to_call_in = false
    end
    # Capture a call to a ruby iterator.
    #
    # It is implicitly assumed that the missing method is actually the call
    # to the Ruby iterator meant for the original object. As usual, +symbol+ is
    # the missing method name, and +args+ are the arguments passed to it.
    def method_missing(symbol, *args)
      si_fetch { |b| @object.send(symbol, *args, &b) }
    end
    # Fetch the next value from a Ruby iterator.
    def si_fetch
      callcc { |cont| @out_cont = cont }
      if @is_first_fetch
        @is_first_fetch = false
        yield Proc.new { |value| si_capture(value) }
        si_break
      elsif @to_call_in
        @to_call_in = false
        @in_cont.call
      else
        @to_call_in = true
      end
      @value
    end
    # Capture the value yielded by a Ruby iterator.
    def si_capture(v)
      @value = v
      callcc { |cont| @in_cont = cont }
      if @to_call_out
        @to_call_out = false
        @out_cont.call
      else
        @to_call_out = true
      end
    end
  end
  module Kernel
    # Return a Sather iterator wrapper for an iterator method. The optional +id+
    # parameter identifies the iterator. If it is not provided, the source file
    # and line number of the call serve as a unique id. This means that two calls
    # on the same line for the same object will be treated as a single iterator
    # unless each is given an explicit +id+.
    def si(id = caller[0])
      SiContext.iterator(id, self)
    end
    # Begin a Sather iterator loop.
    def si_loop(&block)
      SiContext.push
      catch(:si_break) { loop(&block) }
    ensure
      SiContext.pop
    end
    # Break a Sather iteration loop. A normal break statement will also work, but
    # this one will work even if called from a sub-function, which may be useful.
    def si_break
      throw :si_break
    end
  end

On Sather iterators and multi-threading - my point was that creating a producer/consumer pattern in a language requires either separate threads or co-routine support. This has little to do with employing multiple cores, although obviously using a separate thread will make use of multi-core CPUs.

Multi-threading in an imperative language is *hard* - it raises a huge number of issues about locking and nondeterminism. Using co-routines, on the other hand, is *easy*. There are no locking or safety issues and the code is deterministic.
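To make the co-routine point concrete, here is a minimal producer/consumer sketch. It assumes a later Ruby (1.9 and up) where the core Fiber class is available, which is not the Ruby this thread was written against; the method names make_producer and consume are hypothetical and chosen only for illustration.

```ruby
# Sketch only: uses Fiber (core class in Ruby 1.9+), not the continuation-based
# code posted above. make_producer and consume are hypothetical helper names.

# Producer co-routine: hands out one value per resume, then nil when exhausted.
def make_producer(items)
  Fiber.new do
    items.each { |item| Fiber.yield item }
    nil
  end
end

# Consumer: pulls values one at a time. Control changes hands only at
# Fiber.yield / resume, so no locks are needed and the order is fixed.
def consume(producer)
  received = []
  while (item = producer.resume)
    received << item
  end
  received
end

p consume(make_producer([1, 2, 3]))  # prints [1, 2, 3]
```

Because nil marks the end of the stream here, this sketch would stop early on a collection that contains nil or false; a real design would need a distinct end-of-stream signal.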

I view Ruby as an inherently single-threaded language. I happen to believe that concurrent languages should be based on side-effect-free code, write-once variables and encapsulating each thread in a monitor object. That's just me, though; the Java people would make a strong case that it is possible to do concurrency pretty well in an imperative language using a very tightly specified memory model and explicit locks. They at least have the advantage that the language and libraries were designed for this from the start. I find it hard to believe it would be possible to add concurrency to Ruby "well". I'd be more than happy to be proven wrong :-)

At any rate, a single-threaded Ruby doesn't mean that it should be barred from using producer/consumer patterns. With a bit of creative use of continuations (as demonstrated by the code I posted), it is possible to do so without any extensions to the runtime system. Continuations are much too low level for everyday use, however - they are the ultimate form of the goto statement. Sather happens to contain the best thought-out framework for implementing producer/consumer/coroutines in a structured, practical manner.

My sample code above demonstrates that this is doable in Ruby as well. There are of course details to work out. For example, my code handles nested loops, but not as nicely as Sather does. Hot variables aren't quite supported. My detection of separate iterator instances isn't as robust as it should be (it is based on the line number or an explicit identifier). Finally, the syntax sucks (using "si_loop" and prefixing the method call with ".si" are hacks).
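The line-number weakness can be seen in isolation with a small sketch. It reuses the same `id = caller[0]` default-argument trick as the `si` method; the helper names call_site_id, same_line_ids and separate_line_ids are hypothetical and exist only for this illustration.

```ruby
# Sketch only: isolates the default-id mechanism used by Kernel#si.
# All three method names are hypothetical.

# Returns the id that si would use when no explicit id is given:
# the "file:line..." string describing the call site.
def call_site_id(id = caller[0])
  id
end

# Two calls on one source line get the same id, so si would treat
# them as a single iterator.
def same_line_ids
  [call_site_id, call_site_id]
end

# Calls on different lines get different ids.
def separate_line_ids
  first = call_site_id
  second = call_site_id
  [first, second]
end

a, b = same_line_ids
c, d = separate_line_ids
p a == b  # true:  same line, same id
p c == d  # false: different lines, different ids
```

This is why the example in the file header passes an explicit id when it calls `a.si(:foo1).each` twice and wants both calls to share one iterator.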

If this were integrated into the Ruby language itself, the syntax could be cleaned up, the implementation could be more robust and probably more efficient, and Ruby programmers would gain a powerful new tool - support for producer/consumer/coroutine patterns, not to mention trivial iteration on several collections at once.
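As a rough illustration of what "iteration on several collections at once" looks like, here is a sketch using external enumerators. It assumes a later Ruby in which `each` without a block returns an Enumerator and `Kernel#loop` ends quietly on StopIteration; it is not the mechanism proposed in this RCR, and the method name pairwise is hypothetical.

```ruby
# Sketch only: lock-step iteration over two collections with Enumerator#next.
# pairwise is a hypothetical helper name.
def pairwise(first, second)
  e1 = first.each    # without a block, each returns an Enumerator
  e2 = second.each
  pairs = []
  loop do
    # next raises StopIteration when either side runs out;
    # loop rescues it and stops, much like si_loop ends on si_break.
    pairs << [e1.next, e2.next]
  end
  pairs
end

p pairwise([0, 1, 2], [:a, :b])  # prints [[0, :a], [1, :b]]
```

As in the si_loop example, the loop ends as soon as the shortest collection is exhausted.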

Oren.

Released as a Ruby Gem called SI (version 0.0.1), accessible in

Share & Enjoy,

    Oren.

