RCR 304: reference/pointer concept

Problem

Ruby seems to be missing a concept found in many other languages: references or pointers. A reference or pointer lets code ignore where a value is stored while still being able to modify whatever holds that value. It may also be more efficient when the same value is read or written multiple times.
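A minimal sketch of the limitation described above. The method names reset and clear_all are hypothetical and not part of the RCR. A method can mutate an object it receives, but reassigning its parameter never changes the caller's variable:

```ruby
# Hypothetical helper: tries to overwrite the caller's variable.
def reset(value)
  value = 0        # rebinds only the method-local name
  value
end

# Hypothetical helper: mutates the object the caller passed in.
def clear_all(list)
  list.clear       # visible to the caller, because the object is shared
end

count = 5
reset(count)
count              # => 5, the caller's variable is untouched

items = [1, 2, 3]
clear_all(items)
items              # => []
```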

Proposal


My proposal is to add a "ref" method to the Object class, together with a Reference class. These could be built in or shipped in the standard distribution (and loaded with a require). With the current implementation described below, references can be made in the following ways:

 ref { :<variable> }
 ref { "<assignable-expression>" }
 <obj>.ref
 <obj>.ref.<method>(<arguments>)
 <obj>.ref(:<gmeth>,:<smeth>,<arguments>)
 <obj>.ref([:<gmeth>,<gargs>],[:<pmeth>,<pargs>],<more-args>)
 Reference.new(method(:<gmeth>),method(:<pmeth>))
 Reference.new(proc{<get-code>},proc{|v|<set-code>})

In this implementation, references are dereferenced with the [] and []= methods (comparable to reading p[0] and assigning p[0] = v through a pointer p in C):

 <ref>[<optional-args>]         # get what is referenced
 <ref>[<optional-args>]=<value> # set what is referenced

Here are some examples:

 a = ("a".."h").to_a       # ["a", "b", "c", "d", "e", "f", "g", "h"]
 b = a                     # b and a have the same object
 p = ref{:a}               # reference to the variable a
 q = a.ref                 # reference to the object in a
 r = ref{"a[4]"}           # reference to the assignable expression a[4]
 s = a.ref[4]              # reference to the [4] attribute of a
 t = a.ref.[](4)           # same
 u = a.ref(:"[]",:"[]=",4) # same except set/get method names are explicit
 v = a.ref([:"[]",4],[:"[]=",4]) # same except arguments are explicit
 p[]                       # ["a", "b", "c", "d", "e", "f", "g", "h"]
 p[] = (0..7).to_a         # [0, 1, 2, 3, 4, 5, 6, 7]
 a                         # [0, 1, 2, 3, 4, 5, 6, 7]
 b                         # ["a", "b", "c", "d", "e", "f", "g", "h"]
 q[]                       # ["a", "b", "c", "d", "e", "f", "g", "h"]
 q[] = (-7..0).to_a        # [-7, -6, -5, -4, -3, -2, -1, 0]
 a                         # [0, 1, 2, 3, 4, 5, 6, 7]
 b                         # [-7, -6, -5, -4, -3, -2, -1, 0]
 r[]                       # 4
 r[] = "r"                 # "r"
 a                         # [0, 1, 2, 3, "r", 5, 6, 7]
 b                         # [-7, -6, -5, -4, -3, -2, -1, 0]
 # extra arg is slice length
 s[2]                      # [-3, -2]
 s[2] = ["s"]              # ["s"]
 a                         # [0, 1, 2, 3, "r", 5, 6, 7]
 b                         # [-7, -6, -5, -4, "s", -1, 0]
 t[]                       # "s"
 t[] = "t"                 # "t"
 a                         # [0, 1, 2, 3, "r", 5, 6, 7]
 b                         # [-7, -6, -5, -4, "t", -1, 0]
 u[]                       # "t"
 u[] = "u"                 # "u"
 a                         # [0, 1, 2, 3, "r", 5, 6, 7]
 b                         # [-7, -6, -5, -4, "u", -1, 0]
 v[]                       # "u"
 v[] = "v"                 # "v"
 a                         # [0, 1, 2, 3, "r", 5, 6, 7]
 b                         # [-7, -6, -5, -4, "v", -1, 0]
 # multiple ways to make a reference to lines in stdin/stdout
 lineref = Reference.new(method(:gets),method(:puts))
 lineref = Reference.new(proc{gets},proc{|v|puts(v)})
 line = lineref[] # line = gets
 lineref[]="hello world!" # puts("hello world!")

Other syntax could be used instead. This RCR is for the reference concept, not necessarily the above syntax.

The following would be other nice-to-have reference features. They cannot be implemented in pure Ruby and may require language changes. They are optional to this RCR.

Variable references: the ability to create more efficient pointer-like references that access variables (global, local, class, instance). These references could access the variable directly instead of going through proxy get/set procs as the current implementation does. The methods that make them could live in Kernel (global or any?), Binding (any), and Object (class or instance).

Built-in overrides: some of the built-in classes could override the default "ref" to create more efficient references, like the variable references mentioned above. Doing this for Array and Hash would avoid the lookup every time the reference is dereferenced. For Hash, in addition to a hash value reference, a hash key reference and a key/value reference would also be useful.

Unary reference-of operator: this unary operator could be prefix (like C's &) or postfix. Postfix might be easier because it could just be another optional suffix character on methods, like =, !, and ?. If the character were, say, "@", the expression xyz[5]@ would be translated to the method []@ (with an argument of 5) of the object xyz, which by default would create a reference using the [] and []= methods. The class could override it, possibly to create a more direct reference to an instance variable. With postfix, xyz@ would be equivalent to ref{:xyz}. With a prefix character ("&"), the syntax would be more familiar, but it may not fit as well with the Ruby parser. &xyz[5] would be translated to the method &[], which by default would make a reference using the [] and []= methods of object xyz with a starting argument of 5. &xyz would be equivalent to ref{:xyz}.

Dereferencing operators: a unary prefix * operator (equivalent to [] above) and a binary *= operator (equivalent to []= above) might be nice.
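For the first item in this list, later Ruby versions offer Binding#local_variable_get and Binding#local_variable_set (Ruby 2.1 and newer, as far as I know). They avoid building source strings for eval, although access still goes through procs and not directly to the variable. The sketch below assumes a minimal Reference class like the one in the Implementation section. variable_ref is a hypothetical helper name, not something this RCR defines:

```ruby
# Minimal stand-in for the Reference class shown under Implementation.
class Reference
  def initialize(getter, setter)
    @getter = getter
    @setter = setter
  end

  def [](*args)
    @getter[*args]
  end

  def []=(*args)
    @setter[*args]
  end
end

# Hypothetical helper: reference to an existing local variable in a binding.
def variable_ref(bind, name)
  Reference.new(
    proc { bind.local_variable_get(name) },
    proc { |v| bind.local_variable_set(name, v) }
  )
end

count = 1
r = variable_ref(binding, :count)
r[] = r[] + 41   # read through the reference, then write back
count            # => 42
```

This only covers local variables that already exist in the given binding. It does not provide the direct, proxy-free access the item asks for.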

Analysis

If you search ruby-talk for reference|pointer, you'll see that many have discussed this. This RCR provides a general solution for references/pointers and pass-by-reference. I'm not going to argue that whenever someone wants to use a reference/pointer, using one would be the best and most efficient solution. Many times, it probably isn't the best in Ruby. The reference/pointer concept just gives another way to do things and may help people coming from languages that have them. It gives another tool in the Ruby bag of tricks. Since an implementation is easily done in pure Ruby and it doesn't violate encapsulation (dereferencing uses methods to access), it can still be considered "the Ruby way".

For reference, here are some languages that support references/pointers ( ):

B, C, C#, C++, Perl, Pascal, BCPL, OCaml, SML, Haskell, Oz, Ada, Pliant

Here are a couple general cases where a reference might be useful:

1. pass-by-reference or pass an lvalue. From what I understand, right now the way you would write/use a Ruby method that modified lvalues would be like this:

 def byvalue(*args)
     # read and modify args
     args
 end
 a,b,c = *byvalue(a,b,c)

With references, you could do this:

 def byref(*args)
     # read and modify args[i][]
 end
 byref(ref{:a},ref{:b},ref{:c})

With this simple variable case, there isn't much advantage, but when you are dealing with something deep within an object, it can offer an advantage:

 a["hi"][3],io.pos = *byvalue(a["hi"][3],io.pos)

vs.

 byref(a["hi"].ref[3],io.ref.pos)

Also, with the byvalue case, you are forced into the assignment, whereas byref may choose not to assign at all. Depending on the object, this could even prevent an exception (e.g. from IO#pos=).
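A runnable sketch of the byref idea, including the point that the callee may skip the assignment. It assumes a minimal Reference class like the one in the Implementation section and builds references from explicit procs instead of the proposed ref method. zero_negatives and the writes counter are hypothetical, added only to make the behavior observable:

```ruby
# Minimal stand-in for the Reference class shown under Implementation.
class Reference
  def initialize(getter, setter)
    @getter = getter
    @setter = setter
  end

  def [](*args)
    @getter[*args]
  end

  def []=(*args)
    @setter[*args]
  end
end

# Hypothetical helper: writes through a reference only when it has to.
def zero_negatives(*refs)
  refs.each { |r| r[] = 0 if r[] < 0 }
end

a = -5
table = { "hi" => [0, 1, 2, 3] }
writes = 0   # counts how many times a setter actually runs

a_ref    = Reference.new(proc { a },
                         proc { |v| writes += 1; a = v })
slot_ref = Reference.new(proc { table["hi"][3] },
                         proc { |v| writes += 1; table["hi"][3] = v })

zero_negatives(a_ref, slot_ref)
a             # => 0
table["hi"]   # => [0, 1, 2, 3]
writes        # => 1, the setter for table["hi"][3] was never called
```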

2. Although not addressed in this implementation, a reference could provide some efficiency when there is some work that could be saved when you get or set the same thing 2 or more times. A primary example of this would be a reference to a hash value. With a good implementation of a reference to a hash value, you would only need to do the hash lookup once (while making the reference). Compare this to what you have now - you do a hash lookup every time you get/set a given hash value. Also consider the case where a custom Hash-like class is made where the lookup may be O(n) or O(log(n)) and a reference that incorporated an index from the lookup could easily be made.
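The custom Hash-like case can be sketched in pure Ruby. SortedMap and SlotRef are hypothetical names. The map keeps its keys in a sorted array, so a lookup is a binary search (O(log(n))), and a reference stores the index found by that search. This sketch assumes the key set is fixed after construction, because an insertion or deletion would invalidate stored indexes:

```ruby
# Hypothetical Hash-like class with O(log n) lookup over sorted keys.
class SortedMap
  # Reference to one value slot; keeps the index found at creation time.
  class SlotRef
    def initialize(values, index)
      @values = values
      @index = index
    end

    def []
      @values[@index]
    end

    def []=(value)
      @values[@index] = value
    end
  end

  def initialize(pairs)
    sorted  = pairs.sort_by { |key, _| key }
    @keys   = sorted.map { |key, _| key }
    @values = sorted.map { |_, value| value }
  end

  # Ordinary lookup: one binary search per call.
  def [](key)
    i = index_of(key)
    i && @values[i]
  end

  # One binary search here; later gets and sets reuse the index.
  def ref(key)
    i = index_of(key) or raise KeyError, "key not found: #{key.inspect}"
    SlotRef.new(@values, i)
  end

  private

  def index_of(key)
    i = @keys.bsearch_index { |k| k >= key }
    i if i && @keys[i] == key
  end
end

ages = SortedMap.new({ "ann" => 30, "bob" => 25, "cy" => 41 })
bob = ages.ref("bob")   # the only search for "bob"
bob[] = bob[] + 1       # no search on get or set
ages["bob"]             # => 26
```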

Implementation

An implementation is in this rubyforge project:

Here is the current source code:

class Reference

    def initialize(getter,setter)
        @getter = getter
        @setter = setter
    end
    def [](*args)
        @getter[*args]
    end
    def []=(*args)
        @setter[*args]
    end

end

class Binding

    def eval(s,*file_line)
        Kernel.eval(s,self,*file_line)
    end
    def variable_reference(var)
        Reference.new(
            eval("proc { #{var} }"),
            eval("proc {|_v| #{var} = _v }")
        )
    end

end

class Object

    class ObjectReference < Reference
        def initialize(obj)
            @obj = obj
        end
        def [](*args)
            if args.size==0
                @obj
            else
                method_missing(:"[]",*args)
            end
        end
        def []=(obj)
            if @obj.respond_to?:replace
                @obj.replace(obj)
            else
                @obj.become(obj) # need evil.rb
            end
            @obj
        end
        def method_missing(method,*args)
            get = @obj.method(method)
            set = @obj.method(method.to_s<<"=")
            if args.size>0
                getter = proc { |*a| get[*(args+a)] }
                setter = proc { |*a| set[*(args+a)] }
            else
                getter = get
                setter = set
            end
            Reference.new(getter,setter)
        end
    end
    def reference(*args,&block)
        if block
            args.size==0 or raise(ArgumentError,"no arguments allowed when block given")
            block.binding.variable_reference(block[])
        elsif args.size>0
            get,set,*args = *args
            if get.kind_of?Array
                get,*getargs = *(get+args)
            else
                getargs = args
            end
            set ||= get.to_s<<"="
            if set.kind_of?Array
                set,*setargs = *(set+args)
            else
                setargs = args
            end
            get = self.method(get)
            set = self.method(set)
            if getargs.size>0 then getter = proc { |*a| get[*(getargs+a)] }
            else getter = get
            end
            if setargs.size>0 then setter = proc { |*a| set[*(setargs+a)] }
            else setter = set
            end
            Reference.new(getter,setter)
        else
            ObjectReference.new(self)
        end
    end
    alias ref reference

end

Comments

Current voting

I am opposed to this for all of the reasons I enumerated on ruby-talk. This is a bad idea from a bad language with a really ugly implementation. Ruby doesn't need this. -Austin

Sorry, but this seems utterly useless. In C, it's often the only way to do certain things, but in Ruby your options are endless. This is, like Austin stated before me, a bad idea lifted from a bad (well, dated is perhaps a more appropriate term) language,

    nikolai

Which of the 12 listed above is the "bad language"? And Ruby was written in one of them (while using pointers all over the place). And what makes the implementation so ugly?

Oh, let's see. Could we say that C++ is a bad language? Can we say that this is one of the worst concepts of Perl? I thought we could. Most of the languages you've noted are statically typed and either have this because it's impossible to do anything useful without it (C, C++, Perl) or have better/alternative ways of specifying the same concept (Pascal/Ada with IN, IN OUT, and OUT parameters) and not making the implementation ugly.

The whole idea is useless in Ruby -- Ruby supports multiple return values. With a hash, you can even make those values *named* return values. In *all* of the code that I've written in Ruby, I've *needed* something like this exactly once. And there's a better way to do it than this. The sample implementation abuses the [] syntax something horribly and just looks ugly in a way that I didn't think that Ruby code could look ugly (IMO, even the IORCC entries look clean compared to the code that this RCR would spawn).

Ruby really doesn't need this because it will discourage people from thinking the Ruby way. They'll see that Ruby supports "references" and start abusing them just like they do in C, C++, and Perl -- and the code that is created would no longer be easy to read or debug. This is not a "Change Request", it's an "Obfuscation Request".

Does *that* give you an idea of how bad I think this proposal is? It's neat code and it's neat that you can do this. But just because you can do something neat in Ruby doesn't mean that it's a good idea, and this is one of the worst ideas that I think Ruby could adopt -- aside from static typing. This might be even worse, in some ways. -Austin
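For readers comparing the two approaches, the multiple-return-value alternative mentioned in the comment above could look like the following sketch. The method names min_max and stats are hypothetical:

```ruby
# Hypothetical example: several results returned as an array.
def min_max(values)
  [values.min, values.max]
end

lo, hi = min_max([3, 1, 4, 1, 5])   # destructured by the caller
lo   # => 1
hi   # => 5

# Hypothetical example: named results returned as a hash.
def stats(values)
  { min: values.min, max: values.max, count: values.size }
end

result = stats([3, 1, 4, 1, 5])
result[:max]     # => 5
result[:count]   # => 5
```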

I wrote something like this several years ago as a programming puzzle ( Since then I have never had an occasion to use it in real code. It is just not needed as a basic language feature. -- Jim Weirich

I oppose this as a language feature. Since it can be apparently implemented in ruby, I suggest a period of evaluation after which it can be considered for the stdlib.

I oppose this in the stdlib as well as a language feature. Its use is not good Ruby programming, and as Jim pointed out -- he hasn't needed it in real code after developing something similar almost two years ago. -Austin

This is a reasonably engaging thing to talk about on ruby-talk, but I'm a little surprised to see it suggested here.

Ruby is not "missing" pointers. It's differently designed from languages that have this pointer concept. One can have a library to do this (though no one ever did provide a real case where it would be the best solution to any problem in Ruby), but I don't see how this relates to or serves the interests of Ruby itself.

David Black

This would work for hashes, but would violate the OOP encapsulation of just about any other object. I understand the desire for it (I've run into that desire before) but I don't think it's worth it. - Gavin Kistner

Source code of a sample usage would be helpful.

The length of the implementation makes it look suspicious to me...

Other than that, I never missed this in Ruby so far. I'm slightly opposed ATM.

robert

I'm not suggesting that a change to the Ruby core or an addition to the built-in classes is needed. The main thing I'm suggesting is that this be in the distributed library so that someone can just "require 'reference'". Also, I have given an example where references would be useful: a reference to a hash value, or to anything that has an expensive lookup that you would only need to do once with a reference.

1. If it's in the default library, it's an addition to the Ruby core.

2. To get the supposed efficiency boosts (doubtful at best), you would need to embed this in the C code.

3. If it's in the default library, it has a stamp of approval that people can and should use it, when people who are experienced with Ruby tell you that they either haven't needed it or actively oppose it (or both). Even for a "hash" lookup, which is supposed to be O(1) and not require such a "speedup" in any case.

4. Even with the above, you still haven't shown why this would be needed in real code. I've written literally thousands of lines of Ruby without ever using anything like this. The only thing that I've needed it for will be far better served with a different implementation entirely -- a ref would be the worst possible thing I could use in this, because it would be too "easy" and not force me to rethink my design to be better designed in any case.

If I'm seeming a bit strident about this, it's because you don't seem to be listening to people who have experimented with this before and said that it's not a useful concept in Ruby. As David Black said, Ruby isn't missing pointers -- it doesn't need them. This simply isn't a concept that is applicable to Ruby. It never will be. -Austin

If you are not intending this to be in the core, why are you leaving an RCR? -rue

Very nice explanation given, but I am also against it. Ruby is very beautiful currently, I'd rather like to enhance the beauty with K.I.S.S. ideas, not with pointers to the Eilvn3ss of RAM issues.

Every value in Ruby is a reference to an Object already.

It sounds like you want to promote local variables to (mutable) objects in their own right, rather than just holders of object references. I oppose this; local variables are currently a lightweight mechanism.

If you want a mutable variable, you can define your own:

    class MutableVar
      attr_accessor :value
      def initialize(value=nil)
        @value = value
      end
    end
    # v is a reference to a MutableVar which can be passed around
    v = MutableVar.new(12)
    p v.value
    def set(x,y)
      x.value = y
    end
    set(v,99)
    p v.value

However, making *all* local variables be mutable objects in this sense would make the language a lot more inefficient; and a lot more prone to errors, as such references may be in scope when you don't expect them. A local variable in a method should, IMO, be local.

Why oh why would you want pointers in ruby :(

Strongly opposed	16
Opposed	2
Neutral	0
In favor	0
Strongly advocate	2
