
Submitted by orenbenkiki (Wed May 03 18:24:11 UTC 2006)
seq1.each_index do |i|
v1 = seq1[i]
v2 = seq2[i]
...
end
This is awakward to write and does not generalize well. Sather is the first language (that I know of) to solve this problem. See for example.
This is one area where Python has the better technical solution (since each "iterator" has a "next" method and raising a specific exception aborts the loop), but being Python the syntax leaves much to be desired, and there is no support for "hot" arguments.
def ruby_iterator(args)
loop # of any type
yield stuff
end
end
Old style iteration:
ruby_iterator { |stuff| ... single sequence ... }
Sather-style iteration:
loop
stuff = <<ruby_iterator
other_stuff = <<other_iterator
... two sequences at once ...
end
The above uses << to indicate "fetch the next value from an iterator". Sather uses "!" which is already used by Ruby for another purpose. The exact syntax does not matter.
Returning from either iterator methods would terminate the loop (just like in Sather).
The above does not allow for Sather "hot" arguments and is a relatively minor enhancement to Ruby's syntax. While it is very useful already, supporting "hot" arguments brings this to a whole new level. A common idiom in many programs is producer/consumer, where the natural way to write each one is by functional decomposition. In a language without Sather iterators and "hot" arguments, this requires an explicit additional thread (with all the nastiness this requires), or writing either the producer or consumer as a callback (with a different sort of nastiness). Some functional languages avoid this trap by using lazy evaluation, which isn't natural for Ruby due to side effects. Using "hot" arguments, the problem disappears:
def produce
if condition
yield "A"
produce_a
else
yield "B"
produce_b
end
end
def produce_a
seq.each { |v| yield v }
yield "."
end
def produce_b
... etc. ...
end
def consume(&stuff)
if <<stuff == "A"
consume_a(&stuff) # With & - pass it on
else
... etc. ...
end
end
def consume_a(&stuff)
until <<stuff == "."
p stuff # Note no "<<" - uses current value
end
end
And then tying them together:
loop
consume(&produce)
end
The Sather compiler has demonstrated that it is possible to compile the above example into single-threaded code that works as expected without the need to muck around with threads on the one hand, and without forcing the developer to code either the producer or the consumer as a callback on the other hand.
The need for complex consumer/producer is less obvious, but supporting it has a deep impact on program structure. In effect, it gives the programmer many of the benefits of threading and/or lazy evaluation functional languages, while maintaining Ruby's simple clean syntax and avoiding the issues of locking, non-deterministic execution etc.
Support for Sather iterators requires some form of multiple stacks. This complicates the internal state of the interpreter (or virtual machine) in a way similar to that required for supporting multiple threads. So it might make sense to implement this feature as a stepping stone along the way to deep support for multiple threads within Ruby.
Even with deep built-in threading support, Sather iterators still have merit in that they don't have the overhead of an actual thread. Execution is completely deterministic. Since only one iterator is executing at any given time, there are no race conditions, dead/locking issues etc.

| Comments | Current voting | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
|


RCRchive copyright © David Alan Black, 2003-2005.
Powered by .
matz.
I agree there are implementation complexity issues, but I think that if Ruby ever wants to take on multi-threaded processing, this would be inevitable. Given the trend to move to multi-core CPUs, such a move would be necessary in the next few years. Sather-like iterators would make a wonderful first step in that direction, without actually having to deal with the full complexity of multiple threads (for a while anyway).
I am not sure how Sather iterator relates to mult-core CPUs. Can you elaborate?
Concerning multi-core, I have a vague idea of Enumerable with concurrent iteration. FYI, see <>.
matz.
Here is a crude implementation, it demonstrates that implementation isn't that complex. I'm also uploading it as a gem in RubyForge.
# # si.rb - Sather-like iterators for Ruby # # Author: Oren Ben-Kiki 2006 # # == Overview # # This file extends Ruby with very basic support for sather-like iterators. # Simply use "si_loop" instead of "loop" and prefix each iterator call with # ".si". # # === Example # # a = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ] # h = { :A => :a, :B => :b, :C => :c, :D => :d } # si_loop do # puts "---" # v1 = a.si(:foo1).each # v2 = a.si(:foo1).each # p [ v1, v2 ] # p a.si.each # p h.si.each # end # # Produces: # # --- # [0, 1] # 0 # [:D, :d] # --- # [2, 3] # 1 # [:A, :a] # --- # [4, 5] # 2 # [:B, :b] # --- # [6, 7] # 3 # [:C, :c] # --- # [8, 9] # 4 # Context for Sather iteration loops. # # This class is not meant to be used directly. See the documentation for si.rb # for an overview. class SiContext # Push a new context on the stack when a new loop begins. def SiContext.push @@maps ||= [] @@maps.push({}) end # Pop the current context from the stack when a loop ends. def SiContext.pop @@maps.pop end # Access an SiInterator in the current scope. The +id+ uniquely identifies # the iterator instance, and +object+ is the one that has the iterator # method that will be wrapped. def SiContext.iterator(id, object) @@maps.last[id] ||= SiIterator.new(object) end end # Wrap a Ruby iterator for a Sather iteration loop. # # This class is not meant to be used directly. Instances are created by calls # to SiContext.iterator. See the documentation for si.rb for an overview. class SiIterator # Create a new iterator wrapper. # # +object+ is the object that provides the Ruby iterator method. def initialize(object) @object = object @is_first_fetch = true @to_call_out = true @to_call_in = false end # Capture a call to a ruby iterator. # # It is implicitly assumed that the missing method is actually the call # to the Ruby iterator meant for the original object. As usual, +symbol+ is # the missing method name, and +args+ are the arguments passed to it. def method_missing(symbol, *args) si_fetch { |b| @object.send(symbol, *args, &b) } end # Fetch the next value from a Ruby iterator. def si_fetch callcc { |@out_cont| } if @is_first_fetch @is_first_fetch = false yield Proc.new { |value| si_capture(value) } si_break elsif @to_call_in @to_call_in = false @in_cont.call else @to_call_in = true end @value end # Capture the value yielded by a Ruby iterator. def si_capture(v) @value = v callcc { |@in_cont| } if @to_call_out @to_call_out = false @out_cont.call else @to_call_out = true end end end module Kernel # Return a Sather iterator wrapper for an iterator method. The optional +id+ # parameter identifies the iterator. If it is not provided, the source file # and line number of the call serve as a unique id. This means that two calls # in the same line for the same object will be treated as a single iterator # unless explicitly given explicit +id+. def si(id = caller[0]) SiContext.iterator(id, self) end # Begin a Sather iterator loop. def si_loop(&block) SiContext.push catch(:si_break) { loop(&block) } ensure SiContext.pop end # Break a Sather iteration loop. A normal break statement will also work, but # this one will work even if called from a sub-function, which may be useful. def si_break throw :si_break end endOn Sather iterators and multi-threading - my point was that creating a producer/consumer pattern in a language requires either seperate threads or co-routines support. This has little to do with employing multiple cores, although obviously using a seperate thread will make use of multi-core CPUs.
Multi-threading in an imperative language is *hard* - it raises a huge amount of issues about locking and indeterminism. Using co-routines, on the other hand, is *easy*. There are no locking and safery issues and the code is deterministic.
I view Ruby as an inherently single-threaded language. I happen to believe that concurrent languages should be based on side-effect-free code, write-once variables and encapsulating each thread in a monitor object. That's just me, though; The Java people would make a strong case that it is possible to do concurrency pretty well in an imperative language using a very tightly specified memory model and explicit locks. They at least have the advantage that the language and libraries were designed for this from the start. I find it hard to believe it would be possible to add concurrency to Ruby "well". I'd be more than happy to be proven wrong :-)
At any rate, a single-threaded Ruby doesn't mean that it should be barred from using producer/consumer patterns. With a bit of creative use of continuations (as demonstrated by the code I posted), it is possible to do so without any extensions to the run time system. Continuations are much too low level for every day use, however - they are the ultimate form of goto statement. Sather happens to contain the best thought-out framework for implementing producer/consumer/coroutines in a structured, practical manner.
My sample code above demonstrates that this is doable in Ruby as well. There are of course details to work out. For example, my code handles nested loops, but not as nicely as Sather does. Hot variables aren't quite supported. My detection of seperate iterator instances isn't as robust as it should be (it is based on the line number or an explicit identifier). Finally, the syntax sucks (using "si_loop" and prefixing the method call with ".si" are hacks).
If this were integrated into the Ruby language itself, the syntax could be cleaned up, the implementation could be more robust and probably more efficient, and Ruby programmers would gain a powerful new tool - support for producer/consumer/coroutine patterns, not to mention trivial iteration on several collections at once.
Oren.
Released as a Ruby Gem calles SI (version 0.0.1), accessible in
Share & Enjoy,