
RCR 234: Uniform meta-access while parsing Ruby code (replaces RCR 22

Submitted by itsme213 (Thu Apr 08 14:04:25 UTC 2004)

Abstract

Ruby exposes self within a class body, and also calls certain methods (e.g. Module#included) to signal certain parsing events (e.g. inclusion of a module). These are powerful extensibility features, allowing new methods to be written at the meta-level (e.g. on Class) and easily invoked at many places within Ruby code. This RCR proposes providing both extensibility mechanisms uniformly throughout the source code, with access to many other syntactic elements (methods, parameters, constants, ...).

Problem

Facilities such as attr rely on methods on Class being invoked within the text of a class definition with self = current_class. I can write new methods on Class and explicitly invoke them directly from anywhere within a class body where self is bound to the class.
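A minimal sketch of the existing mechanism described above, in present-day Ruby. The names tagged_attr, attr_tags and Account are hypothetical and chosen only for illustration; they are not part of the RCR.

```ruby
class Class
  # Hypothetical attr-like macro: defines an accessor and records a tag for it.
  def tagged_attr(name, tag)
    attr_accessor name
    attr_tags[name] = tag
  end

  # Per-class table of the tags recorded by tagged_attr.
  def attr_tags
    @attr_tags ||= {}
  end
end

class Account
  # Inside the class body self is Account, so this calls Class#tagged_attr.
  tagged_attr :balance, :money
end

account = Account.new
account.balance = 5
```

The call works only because self is bound to the class at that point in the source; the RCR asks for similar binding at other syntactic positions.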

Similarly, methods such as Module#included and Class#inherited are always invoked by the Ruby interpreter on certain events like module inclusion or class inheritance. I can override these methods for my own module or class and they are automatically invoked when those events take place with my module or class.
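The two existing hooks can be sketched as follows. The module and class names (Tracked, Base, Child) and the bookkeeping methods are hypothetical; only included and inherited are Ruby's own hooks.

```ruby
module Tracked
  # Called by Ruby whenever some class or module includes Tracked.
  def self.included(base)
    includers << base
  end

  def self.includers
    @includers ||= []
  end
end

class Base
  # Called by Ruby whenever a class inherits from Base.
  def self.inherited(subclass)
    super
    subclasses_seen << subclass
  end

  def self.subclasses_seen
    @subclasses_seen ||= []
  end
end

class Child < Base   # triggers Base.inherited(Child)
  include Tracked    # triggers Tracked.included(Child)
end
```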

However, these two facilities are limited. I cannot define new methods on Method (or UnboundMethod, or whatever the underlying meta-level class will be called) and explicitly invoke them (similar to today's common attr invocation) at appropriate points (if any) in a method definition. I cannot write new methods on Parameter ... etc.

Similarly, I cannot override methods on Method, Parameter, etc. and have them automatically invoked by Ruby at key sentinel points in the code, e.g. Method#started, ParamList#finished.

This makes it more awkward than necessary to extend the meta-level, e.g. I may want to create explicit "Attribute" and "Method" objects and attach meta-data to them.

This facility could be extremely useful if more uniformly available. Classes representing the entire Ruby Abstract Syntax Tree should be published (Method, Parameter, Constant, ...), self should be bound to instances of these AST classes during parsing, and a rich set of events should be defined that call methods on those AST objects (Class#started, Class#finished, Module#included, Class#inherited, ParamList#started, ParamList#finished, etc.), just as SAX parsers raise events when processing an XML stream.

Proposal

The examples below are meant only to illustrate the idea. I do not have detailed enough knowledge of the Ruby grammar to know which points in the code would be reasonable choices for definition of new methods or explicit invocation of methods, or if some syntax change may be involved. I have annotated the code with [1][2]... as points in the code that I cross-reference subsequently.

class Method [1]
 def started [2]; p "New Method"; end
 def explicit [3]; p "Method#explicit"; end
end

class [4] A [5]
 def [6] foo [7] explicit [8] (a [9] ...)[10]
   # body
 end [11]
 def self.wrapped [12] (...) # used like self.included in a class
 end [13]
 [14]
end [15]

Explanation of above code refers to [1], [2], etc.

Analysis

With this scheme:

Implementation

If I knew more of how the Ruby grammar was being implemented I could probably suggest something. However, most of this seems to require a generalization of facilities currently available: self binding during parsing, and automatic invocation of pre-defined methods at key events during parsing.

Comments Current voting

If whoever posted this change is the original author of the RCR, you can edit the original RCR. The RCR does not use wiki syntax; it uses html syntax (the comments use wiki syntax).

itsme:thanks. done. the "edit rcr" option does not appear on the rcr page

Also, iirc Ripper is to be included with Ruby in the future. Would this provide the features you want?

itsme:not the usage i see.

-- Paul Brannan
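For context on the Ripper question: Ripper did later ship in Ruby's standard library, and it exposes parse events through on_* handler methods on a Ripper subclass. The sketch below assumes that API; the class name DefinitionListener and the events list are hypothetical. It reports events to a separate parser object and does not bind self to meta-objects inside the code being parsed, which is the part the RCR asks for.

```ruby
require "ripper"

# Hypothetical listener: records an entry for each class and method definition.
class DefinitionListener < Ripper
  attr_reader :events

  def initialize(source)
    super
    @events = []
  end

  # Parser event fired when a "def name(params) ... end" has been parsed.
  def on_def(name, params, body)
    @events << [:def, name]
    nil
  end

  # Parser event fired when a "class Const ... end" has been parsed.
  def on_class(const, superclass, body)
    @events << [:class, const]
    nil
  end
end

listener = DefinitionListener.new("class A\n  def foo(a)\n  end\nend\n")
listener.parse
```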


I am strongly opposed to this as I was previously. As I stated in the prior version of this RCR, this seems to be a bad mish-mash of the AOP that Matz has suggested will be in Ruby2 and other odd ideas. They may not be bad, but they aren't Ruby. In particular:

  • "class A" is a semantic unit, as is "def foo [<params>]". There is no way to sensibly execute code between "def foo" and "[<params>]", which seems to be what you're suggesting.
  • Parameters aren't "executed" right now, so what sense would it make to have point #9 (self = param)? How would making it so that code can be executed during the definition of a parameter list solve anything?
  • Indeed, what exactly are you trying to solve here? Is what you're trying to solve actually worth the reduction in code readability for such magic code?

I really don't think that this solves anything, and in fact would make code less readable and more complex than necessary.

-- Austin


Austin, There are 2 parts to this proposal, both about accessing the meta-level conveniently in source code. Neither is related to AOP, imho.

(1) Allow explicit invocation of methods (ala "Class#attr") but in potentially more places, by providing as fine-grained binding of "self" as possible.

(2) Guarantee certain predefined methods are invoked at key points/events in the code (ala "Module#included"), in many more places than is done today.

Are you disagreeing in principle with ALL of (1) and (2)? Or are you pointing out that the spot after "def" or after a param is not a good example of (1) with today's grammar rules? You may be right about the latter; but surely "def self.foo" does involve evaluation of "self"? And then perhaps the 2.0 method wrapping facility "def foo:after" could be handled, with some syntax revision, as invocation of an "after" method on a method object? And then perhaps Ruby could more uniformly offer explicit invocations on method objects?

Modulo today's grammar rules, I can see real value in being able to also have code executed as part of a Param list. If I was building up strings and doing "evals" I would use #{..} for this. I fondly recall that Lisp's backquote/comma combination was quite a powerful little combination of building up s-expressions while simultaneously evaluating (sub)s-expressions.

In any case, if that does not fly, as you suggested "def...end" returning the Method object may be a start, so I could do:

 def foo
 end . explicit (...)

That Method object would have to be a reliable handle (behavior of hash etc), and give access to its Param objects, its Class object, etc.

I don't see a downside to part (2).

If looking for motivation for meta-access to method objects, take just about any method naming convention used in Ruby code (e.g. Webrick uses "do_<xyz>"). These schemes do not scale, do not compose (how to name a method that needs 2 or more tags?), and make all CLIENT code harder to read. It is much better to attach meta-information to the method objects (just as Java is doing in 1.5 @meta-attributes vs. the older Java method-naming conventions).


Also, if

   def foo .... end 
returns a method object (which I hope it will), then the following should be allowed as a way to define a (singleton) method bar on that foo method object:

 def
   def foo
   end
   . bar  # as in "def self.bar ...end"
 end


[edited the conversation in place for clearer "threading". --Austin]

(1) Allow explicit invocation of methods (ala "Class#attr") but in potentially more places, by providing as fine-grained binding of "self" as possible.

(2) Guarantee certain predefined methods are invoked at key points/events in the code (ala "Module#included"), in many more places than is done today.
I am mostly opposed (nearly all, if not all) to (1), and slightly opposed to (2). With respect to (1), I'm not sure if "def self.foo" involves the evaluation of "self" per se;
itsme: It does. Any expression would work. I'm not sure you and I are using the same terminology here, but I'll grant your point.

I'm not that familiar with the interpreter. On the other hand, "def foo" almost certainly doesn't involve the evaluation of "self" anywhere. More to the point, the specification of parameters in Ruby doesn't involve evaluation of those parameters; it merely labels things for code to come. Invocations on method objects are uniform: meth_object[] or meth_object.call.
itsme: Correct, currently. But then i would guess earlier versions of ruby did not permit singleton methods with "def instance.foo" either.
I don't know. I believe it has since 1.6.1, so that's not quite three years of history supporting that particular feature.
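For reference, a small sketch of the singleton-definition form under discussion. The receiver of a singleton def is evaluated when the def executes, and a parenthesized expression is accepted in that position. The names registry and hello are hypothetical.

```ruby
registry = { greeter: Object.new }

# The parenthesized expression is evaluated first; hello is then defined
# as a singleton method on the object it returns.
def (registry[:greeter]).hello
  "hello from a singleton method"
end

greeting = registry[:greeter].hello
```

Only that one object gains the method; other instances of Object are unaffected.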
I am also opposed to the ugliness that your proposed changes end up encouraging in the code. It's not clean like most of Ruby is now; it no longer "feels" like Ruby as you've described it.

I don't know what value you see in code execution during the specification of the param list, and I've seen no useful examples provided here. Please, do so -- convince me that this is a good idea. (Well, you don't have to convince me, but you do have to convince matz. I just happen to not see the purpose behind what you're proposing, or that its purpose overcomes the absolute nightmare that such code would be to read.)
itsme: I want to record parameter names, types, other (extensible) properties into a meta-level structure. I want to declare these properties right where they apply. Just as I can declare attributes today using "attr :foo" right within the class body, rather than doing a bunch of uglier code in a different place to generate attribute accessors.

itsme: def foo ( bar meta {type Array; other :baz} ) : this would invoke Param#meta which could do all kinds of things starting with the Param and navigating to the Method, the Class (and superclass etc.).
I still think that is really ugly code that should be discouraged at all costs. What you're talking about here is a more flexible form of strong typing and metainformation that would be useful for class browsers and IDEs, but not useful for the workaday programmer who has to dig into "what does 'bar meta' mean?" Again, I don't have a problem with the specification of parametric meta-data, and I think that it would be useful to have it integral to the language, but where I'm opposed to what you're proposing is that it's ugly -- especially for something that would (theoretically, and hopefully) be used by a majority of Ruby developers. I wouldn't want to see either of the following:
  def foo (bar meta { type Array; other :baz }); end
  ...
  def foo (bar); end.meta(:bar, {type Array; other :baz})
I don't know what would be better, but I know that this is not what I see when I see Ruby in the future. It breaks the clean syntax that we have, it requires the use of parenthesis when defining a method so that you can apply metainformation to its parameters, etc. Not only that, it feels very heavy -- Java-type heavy, and that's too heavy. Ruby is nice because it's conceptually clean and simple; this adds too many exceptions for my feel.
I don't know anything about Lisp, so I don't know what backquote/comma does -- nor do I know what you would really use the building of s-expr while evaluating (sub)s-expr.
itsme: Lisp has a uniform tree AST that is fully accessible. Backquote lets you build up a tree with parentheses; matching parentheses correspond to a (sub)tree. Comma, used within a backquote, lets you evaluate code while building the tree. (fwiw, C++ templates have an uglier counterpart with compile-time expansion.)

itsme: `(bar (baz ,(+ 2 3))) would return approx. the tree bar<-baz<-5; now stick in my "meta {..}" for (+ 2 3) and you'll see why this is relevant.
That still makes no sense to me. Give me a concrete example. Why do I care about the AST bar<-baz<-5? Why does that make your meta {...} call relevant? You haven't made your case here. I understand the technical concept of building a tree; what I don't get is why you want to do that.
What might be more useful, from my perspective, is the ability to specify a default value that is itself a method call or a Proc. Something like:
  def default_enc
    (@media_type == 'text') ? 'quoted-printable' : 'base64'
  end
  ...
  def initialize(type, ext: [], enc: default_enc, system: nil)
    ...
  end
This would not evaluate default_enc at evaluation time, but like Hash.new's block argument, it would note that the method is to be called when the class is initialized.
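A sketch of how this reads in later Ruby versions, where keyword arguments exist and a default expression is evaluated at call time, each time the argument is omitted. The class name Attachment is hypothetical. Unlike the example above, the media type is passed to default_enc explicitly, because instance variables assigned in the body of initialize are not yet set when the defaults are evaluated.

```ruby
class Attachment
  attr_reader :media_type, :enc

  # Chooses an encoding from the media type; runs only when enc: is omitted.
  def default_enc(media_type)
    media_type == "text" ? "quoted-printable" : "base64"
  end

  # The default for enc: is a method call that may use the earlier parameter.
  def initialize(media_type, enc: default_enc(media_type))
    @media_type = media_type
    @enc = enc
  end
end

text_part  = Attachment.new("text")
image_part = Attachment.new("image")
forced     = Attachment.new("text", enc: "7bit")
```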
itsme: But this is just one example of meta-annotations. Useful, no doubt. But an extensible mechanism would be infinitely more valuable. Perhaps we should try to address the ugliness instead? When I first met "def x.foo" I went "Wow!". When I first met "class << x" I went "blech!". Now it's growing on me.
I disagree, actually. One of the odd strengths that Ruby has is that Matz has made some choices that limit the flexibility; I believe that Ruby is stronger for having those limited choices rather than infinite flexibility on some of those things. This may be one of those cases, IMO. There is nothing really unique in what I have specified above in my example -- instead of storing a value in the label, it stores a method call (or a proc that calls said method). There's no meta-evaluation involved with what I've suggested. If you're after method meta-data, there are better ways out there -- I haven't seen one that I like that I would use, yet, but adding an infinite extensible mechanism would not include it.
I don't particularly care for the example you use of "def foo; end.explicit" but that's no different than possibly doing:
  private def foo
    ...
  end
itsme: I don't understand. My code would invoke the method #explicit on the foo method object.
Nothing to understand. I don't like seeing code like:
  class << self; self; end.blah
This is no different than:
  def foo; end.explicit
However, such things are also no different than the concept of what I showed above (passing the returned method to the "private" method before it).
(Although, this one rather requires the returning of a Symbol the way that private is implemented. However, Module#private could be changed.) The other problem with returning a Method at this point is that you would probably get UnboundMethod, not Method (which is bound to a particular instance).
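As a later data point, Ruby 2.1 and newer resolve this along the lines sketched here: def evaluates to the method name as a Symbol, so "private def foo" works without changing Module#private, and an UnboundMethod can be fetched from that name when a handle is wanted. The class and constant names below are hypothetical.

```ruby
class Example
  # def is an expression; in Ruby 2.1+ its value is the method name.
  DEF_RESULT = def visible
    :visible
  end

  # private receives the Symbol returned by def.
  private def hidden
    :hidden
  end

  def call_hidden
    hidden
  end

  # The closest thing to a method handle at class level is an UnboundMethod.
  HANDLE = instance_method(:hidden)
end
```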
itsme: Correct. UnboundMethod would be the closest thing today, but today it is not a reliable handle. Tons of useful meta-information belong on (the equivalent of) UnboundMethod and its associated Params etc.
But there is no Parameter class. I think that asking for one is far too heavy-weight for most uses. I'm not kidding when I say that it feels like Java. A bunch of so-called features because people think that they're necessary that have, in fact, reduced the code velocity and increased cases of CTS and other wrist strains because you have to type so bloody much to use it.

While it might be nice to have Parameter classes, I again don't really see a useful purpose for this as parameter lists in Ruby are (mostly) advisory in nature.
itsme: If you had the library code for FxRuby, which when loaded could build up a full meta-structure of classes, methods, params, attributes, annotations (like your _default_enc_ example earlier), links between params and attributes...), I am sure you could find some very good use for it :-)
No, I don't think so.

As pertains to point (2), I don't necessarily see a downside to such event-based programming, but I don't see an upside to what you have proposed. Maybe I'm reacting to the names you have chosen and perhaps I'm conflating (1) and (2). Might it be better to separate these two as different RCRs so that these can be considered separately without one prejudicing the other?
itsme: They complement each other to provide extensible, fine-grained access to the meta-level, so I'm not sure splitting them would be good.
As long as they are tied together, I have to remain opposed to both of them. I don't think that they are complementary except perhaps in increasing the flexibility. They are certainly not dependent upon one another (at best, (1) depends on (2), but (2) does not in any way depend on (1)). I strongly recommend separating them, because I think that they are orthogonal concerns.

If looking for motivation for meta-access to method objects, take just about any method naming convention used in Ruby code (e.g. Webrick uses "do_<xyz>"). These schemes do not scale, do not compose (how to name a method that needs 2 or more tags?), and make all CLIENT code harder to read. It is much better to attach meta-information to the method objects (just as Java is doing in 1.5 @meta-attributes vs. the older Java method-naming conventions).
:: expand on this more.
itsme: Suppose "do_<x>" means that "<x>" is an action that can be offered on a UI. Suppose "start_<y>" means that <y> is a process that can be started, "stop_<y>" means that <y> is a process that can be stopped, and "start_<y>" and "stop_<y>" are expected to be about the same service (all naming convention based, note). How do I name methods <M>, <N> that start/stop a process, and offer that start/stop on a UI? This is just a tiny example of the issue. Meta-level access lets me simply add properties and relationships between <M> and <N> themselves, without mucking around with their names.
Too general. Give me a specific case. This really feels like a feature for the sake of a feature. Why do I care that <y> is a process that can be started or stopped? If <y> is a process that is startable/stoppable, then why don't I make it an object that responds to #start and #stop? Then, I make a callback on the UI that calls the appropriate #start and #stop methods on the process's object.

I'm not really sure what you're talking about, as I haven't really looked at Java2 1.5 at all. In my adaptation of Chad's RSS library that I did last year, I did some meta-programming for methods that was quite easy to do and it scaled well. I'm not sure what you mean by "method that needs 2 or more tags"? Can you give a concrete example here?
itsme: And there's lots I have not looked at, like Chad's RSS library. But we should not ignore lessons learned elsewhere. C# (and now Java 1.5) have both taken meta-level annotations quite seriously, and my examples above just scratch the surface of the reasons why.
Drop me an email at rss(at)halostatue.ca and I will send you a copy of the library for perusal. Basically, I created methods similar to attr_accessor that also added metadata for how to convert the object into proper XML. There are things that I would do differently with Ruby 1.8.1, but the Ruby is idiomatic and very clean to deal with.
As far as Webrick's do_GET, do_POST, etc., these don't need to scale very much -- there's only a few options available in HTTP :) What sort of meta-programming would help here in any case? If your application does not support HTTP's GET but does support POST, do you need to create a do_GET in Webrick, or do you just want to rely on Webrick's failover code? I would personally rather do the latter as opposed to writing an empty do_GET.
itsme: I hope this style of inserting comments is OK. If not, apologies. Also, either a 'preview' button or a link to the wiki conventions would be nice.
You can download Ruwiki from RubyForge and see the Wiki conventions.

--Austin Ziegler (sorry about all the edits -- the formatting needed tweaking)


Austin, thanks for the thoughtful dialog. It would be great if we can figure a clean way to provide "something that would hopefully be used by a majority of Ruby developers".

Re-reading this discussion I just had an "Aha!". Some (Austin? David?) might see a distinction between "what an IDE needs" and "what a programmer needs". I think the distinction between develop/introspect/compile/code-gen/run is artificial, more so with a language like Ruby. We increasingly use run-time exploration of the properties of code entities (classes, modules, methods, params, exceptions...). Development and runtime tools should have a common infrastructure. So let's not limit a capability because it sounds at first blush like "just an IDE concern".

I might want to build a component-connection language on top of Ruby: a component declares what it provides and requires, a container or configurer hooks up provided and required bits together; I have to have parameter information for this.

Or a UI is auto-generated and fully-connected to a domain model based on declarative meta-info. I have to have parameter (and more ... flow of [dialog] computation) information for this.

 Run-time class information is more useful ...
 ... with run-time method info, ... which is more useful with
 .... run-time parameter (return, exception, ...) info ... which is
 ..... [*] and any number of other interesting meta-relationships
 ...... (like which constructor params/defaults are for which inst-vars)

No one of us can claim to know quite the limits of the [*]. This RCR is about more complete (and extensible) support for reflective access, for any conceivable run-time purpose. Processing source code from scratch, or assuming that the source will always be available (byte-code compiled? a different 2.0 interpreter?), is not the best alternative. This RCR is not about IDEs or strong typing. Or even about parameter meta-data.

(Aside: That said, Ruby's meta-info on methods with no param information is unfortunate; "even" Java and C# do better. Smalltalk makes parameter names part of the 'method selector' name, requiring those param names in every call; the world in general has unfortunately chosen by-position over by-name calls. I think Ruby 2.0 should make its keyword parameters a part of method selectors, but that is a whole other discussion).
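For comparison with the aside above, later Ruby versions do expose parameter information at run time through Method#parameters and UnboundMethod#parameters. A minimal sketch, with the hypothetical class Widget and method resize:

```ruby
class Widget
  # One parameter of each kind, so every entry type shows up in the result.
  def resize(width, height = 10, *rest, unit:, scale: 1, **opts, &block)
  end
end

# Each entry is a [kind, name] pair, in declaration order.
params = Widget.instance_method(:resize).parameters
```

This reports names and kinds only; it carries no types or user-defined annotations, which is the extensible part the RCR is after.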

The Lisp analogy was just this: compiling Ruby source code builds a resulting tree (ok, a graph when done) of (meta)objects. The Lisp equivalent is the s-expression in source code, and produces the resulting tree equivalent of that s-expression, except that "," sub-expressions are evaluated during tree-building. Here is approximate Lisp for my Ruby 'def foo (bar meta...)' example; the Ruby version would be more powerful since the evaluation of "meta..." would have access to its context "foo...bar...".

   `(def foo (params bar ,(meta (type Array))) ...)

Re: "private def foo...end" Vs. "def foo ...end . private": one calls a method on Class, the other calls a method on (Unbound)Method. Each applicable, since different situations call for different localization of behaviors.

Re: CTS and having to type too much: nothing I have proposed adds anything extra to what a programmer types today, unless he wants to do more; in which case he would have to type some more anyway, and I prefer he do it right where he needs it. "def x.foo" is not the only precedent; even the constant "C" needed in a "class C" can be found by run-time evaluation:

  module M
  end

  m1 = { :module => M }

  class m1[:module]::X
    def f
      5
    end
  end

  p M::X.new.f #==> 5

Re: using naming conventions. Just about any place I have used naming conventions to assist some run-time processing, I could have done it better (both in the implementing code and in all the client code) with meta-data; the converse is definitely not true:

  def foo
  end meta {
         ui ("Please execute Foo")
         sync(:attr_x)
      }

I can express combinations (foo has properties 'ui' & 'synchronized') and richer (foo is synchronized on attr_x) properties than I could ever squeeze into naming conventions.
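The "end meta { ... }" form above is proposed syntax and does not run in Ruby. A rough approximation with what exists today is a class-level registry keyed by method name. Everything named here (MethodMeta, meta, method_meta, methods_with, Service) is hypothetical and is not the mechanism the RCR proposes.

```ruby
# Hypothetical module: extend a class with it to get a per-class metadata table.
module MethodMeta
  # Records properties for an already-defined method and returns its name.
  def meta(name, **props)
    method_meta_table[name] = props
    name
  end

  # Looks up the properties recorded for one method (empty if none).
  def method_meta(name)
    method_meta_table.fetch(name, {})
  end

  # Lists the methods that carry a given property, in definition order.
  def methods_with(key)
    method_meta_table.select { |_, props| props.key?(key) }.keys
  end

  private

  def method_meta_table
    @method_meta_table ||= {}
  end
end

class Service
  extend MethodMeta

  def foo
    :foo_ran
  end
  meta :foo, ui: "Please execute Foo", sync: :attr_x

  def bar
    :bar_ran
  end
  meta :bar, ui: "Please execute Bar"
end
```

A UI generator could then ask Service.methods_with(:ui) instead of scanning for a name prefix, and a method can carry several properties at once. The cost is that the metadata hangs off the class and a Symbol rather than off a method object.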

Hope that helps.


I think that you're slightly mischaracterising my objection to all of this. I program against a lot of different languages, and the one constant in all of that is that I use vim. Even when I'm working with a language that is "better" in an IDE, like Delphi or VBA (where the IDE is something like Access or Word), I tend to do 90% of my work in vim, flipping to the IDE for compiling and sometimes for method completion (or to bring up the context-sensitive help for the API). I'm afraid that I'm just so good with vim that I barely even give wonderful efforts by Ruby developers a second look (aeditor and FreeRIDE, specifically).

Everything that you've said has been from the perspective of what an IDE needs, and I think that you're speaking from a position of ignorance as to how others develop. (Which is, of course, to be expected. You're not us.) I certainly understand that increased introspective capabilities are needed -- this is why Matz has stated an intent to allow some sort of parameter type specification in Ruby2. I personally think that it's a bad idea.

You say that you might want to build a component-connection language. That's fine and all, but does the component connection require automated discovery of that information? I personally think not. In Delphi, something is not a "component property" unless it's published (which is a separate type specification from public). Only if a method or property is published is it available through RTTI. Even public methods aren't available through RTTI; only published methods are. Additionally, there's design-time specification information for a design-time component.

I don't necessarily think that extensible reflection is what is needed here. I do think that extended reflection is needed. I also think that it has to be presented in a humane way, because I don't ever want to see Ruby become a language that effectively requires an IDE to use. The proposal you have here -- the combined proposal -- is inhumane and detracts from the readability of Ruby.

Regarding your aside, I could not disagree more. Java and C# do "better" precisely because they are strongly typed languages and start from that assumption.

Your Lisp example still makes no sense. I am asking you to tell me how I would use this. Right now, all I'm hearing is that the extensible metadata capabilities would make Ruby much better. Well... how? Give me something concrete to wrap my head around. Show me. Why is it important that the meta has access to the context foo/bar?

Re: CTS and having to type too much: nothing I have proposed adds anything extra to what a programmer types today, unless he wants to do more; in which case he would have to type some more anyway, and I prefer he do it right where he needs it. "def x.foo" is not the only precedent; even the constant "C" needed in a "class C" can be found by run-time evaluation:

Um. I think that this is a misunderstanding on your part, to be quite honest. "class B" can be seen as syntax sugar for "B = Class.new; class << self". There is very little special about the run-time evaluation you're talking about in Ruby.

I can express combinations (foo has properties 'ui' & 'synchronized') and richer (foo is synchronized on attr_x) properties than I could ever squeeze into naming conventions.

What you have failed to do -- for me, I don't know about for anyone else -- is to convince me why the language needs to be extended to allow for such extensible properties, and why it should be done in-place in such a way as would make the code harder to read for those of us who don't use IDEs. -- Austin


Interesting that I came across as wanting IDE-related things; I tend to use emacs and some RDE. If a split proposal seems like an improvement, fine by me. I am confident Matz has better judgement on language evolution than I do (I'm relatively new to Ruby, perhaps not quite RCR-ready!), but I definitely hope some reasonable ways of finer-grained meta-level access will be available. In any case, it was a useful discussion for me. Cheers.



Strongly opposed 3
Opposed 0
Neutral 0
In favor 0
Strongly advocate 0