
RCR 279: User defined % literals

submitted by petertje on Mon Sep 06 2004 09:08:43 AM -0700

Status: pending


Abstract

This RCR proposes allowing the creation of custom % literals. % literals are especially useful for reducing code clutter when the same kind of data structure is created over and over. Presently, the built-in % literals (namely %q, %Q, %r, %s, %w, %W and %x) are handled opaquely by the Ruby interpreter. User-defined % literals could be provided via % methods, in the same way that the present `` (backquote) construct is already backed by an ordinary, overridable method, which %x invokes as well. Programmers would then be able to swiftly create data structures particular to their needs. For example, %y could be used for YAML::load.

Problem

Ruby literals provide a clean, concise way of specifying commonly used data structures such as strings, regular expressions and word lists, without the need for excessive escaping of quotes. Requests for new literals nevertheless arise from time to time, which indicates that people appreciate the convenience of literals and would like more than the presently provided set. Ruby currently offers no way to define additional literals.

Proposal

A user-defined literal constructor may be any single alphabetical character (upper or lower case) preceded by the % character. The syntax of these custom % literals conforms to the same rules as the current literals, i.e., matching braces, etc. Additionally, the literal's closing delimiter may be followed by any number of letters, serving as a limited form of parameter, congruent with the present behavior of %r.
When a lowercase % literal is evaluated (i.e. %m, where m is any lowercase letter), the literal's content and its parameters are passed as strings to a method of like name, e.g. def %m(string, options). This method then interprets the string according to any parameters given and returns the object that the literal represents. In the case of an uppercase % literal (%M, where M is an uppercase letter), the same lowercase method is called, but only after the interpreter has applied the additional substitutions used for double-quoted strings.
To demonstrate the definition of a % method, we will first give the trivial case of %q:
  module Kernel
    def %q(string, params)
      string
    end
  end
  
That was easy :-). Let's add an option to convert the string to uppercase and do error checking on the parameters. We will also move the method to another class, making it our own private version:
  class OurClass
    private
    def %q(string, params)
      params.split(//).each do |p|
        case p
          when 'u'
            string.upcase!
          else
            raise "unknown string option: #{p}"
        end
      end
      string
    end
  end
  
This new definition of the %q literal can then be used in any place where the usual method resolution would find OurClass#%q as the method to call. An example would be:
  class OurClass
    def test
      %q{Hello #{world}!}u
    end
  end
Note how the option u is passed in the same way as options are passed to the %r literal in Ruby now. When OurClass#test is called, evaluating the literal results in a call to OurClass#%q with 'Hello #{world}!' and 'u' as parameters. Because the lowercase variant performs no interpolation, this method then returns 'HELLO #{WORLD}!'.
Likewise, we can call the uppercase variant:
  class OurClass
    def test
      world = 'ruby-talk'
      %Q{Hello #{world}!}u
    end
  end
Calling OurClass#test will again result in a call to OurClass#%q, but this time with 'Hello ruby-talk!' and 'u' as parameters. This is because the uppercase variant performs string interpolation before the method is called. The result of it all would be 'HELLO RUBY-TALK!'.
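The two calls above can be approximated in present-day Ruby. What follows is only a sketch under stated assumptions, not the proposed implementation: define_method is used because def cannot spell a name such as %q today, percent_literal is a hypothetical helper standing in for the dispatch the interpreter would perform, and test_lowercase and test_uppercase are illustrative names. Interpolation for the uppercase variant is simulated by passing an ordinary double-quoted string.

```ruby
class OurClass
  # define_method accepts the name "%q", which `def` cannot express today.
  define_method(:"%q") do |string, params|
    result = string.dup # work on a copy so the caller's string is untouched
    params.each_char do |p|
      case p
      when "u" then result.upcase!
      else raise ArgumentError, "unknown string option: #{p}"
      end
    end
    result
  end
  private :"%q"

  # Emulates %q{Hello #{world}!}u: single quotes, so no interpolation.
  def test_lowercase
    percent_literal("q", 'Hello #{world}!', "u")
  end

  # Emulates %Q{Hello #{world}!}u: double quotes interpolate before dispatch.
  def test_uppercase
    world = "ruby-talk"
    percent_literal("Q", "Hello #{world}!", "u")
  end

  private

  # Hypothetical stand-in for the interpreter: both %q and %Q end up
  # calling the lowercase method through normal method resolution.
  def percent_literal(letter, string, params = "")
    send("%#{letter.downcase}", string, params)
  end
end
```

Because the lookup goes through send, a subclass or another class can supply its own %q without affecting anyone else, which is the scoping behavior the proposal relies on.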
Of course, a % literal method can return an object other than a string. For instance, this is how the aforementioned YAML case could be defined:
  module Kernel
    def %y(string, params)
      YAML::load(string)
    end
  end
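For experimenting with the idea today, the %y method can be approximated with define_method and invoked through send. This is a sketch only: the explicit send call is an assumption standing in for the %y{...} syntax, which current Ruby does not parse.

```ruby
require "yaml"

module Kernel
  # The name "%y" is not valid after `def`, so define_method is used instead.
  # The params argument is accepted but ignored, as in the definition above.
  define_method(:"%y") do |string, params = ""|
    YAML.load(string)
  end
  private :"%y"
end

# Stand-in for a hypothetical %y{...} literal.
config = send("%y", "name: ruby\nversion: 1.8\n", "")
puts config.inspect
```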
  
One last (almost illegible) example for a definition of %r:
  module Kernel
    def %r(string, options)
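      Regexp.new(string, options.split(//).inject(0) { |v, c|
        # OR together the flag for each option letter; the hash's default
        # block raises when the letter is not a known option.
        v | Hash.new { |h, k|
              raise "unknown regexp option - #{k}"
            }.update({"i" => Regexp::IGNORECASE,
                      "m" => Regexp::MULTILINE,
                      "x" => Regexp::EXTENDED})[c] })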
    end
  end
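The same option handling may be easier to follow when written out as an ordinary method. The sketch below is runnable in current Ruby; regexp_literal and REGEXP_FLAGS are hypothetical names used in place of the proposed %r method.

```ruby
# Option letters and the Regexp flags they stand for.
REGEXP_FLAGS = {
  "i" => Regexp::IGNORECASE,
  "m" => Regexp::MULTILINE,
  "x" => Regexp::EXTENDED
}.freeze

# Hypothetical stand-in for `def %r(string, options)`.
def regexp_literal(string, options = "")
  flags = options.each_char.inject(0) do |acc, letter|
    # fetch with a block raises for letters that are not in the table.
    acc | REGEXP_FLAGS.fetch(letter) do
      raise ArgumentError, "unknown regexp option - #{letter}"
    end
  end
  Regexp.new(string, flags)
end

pattern = regexp_literal("hello", "i")
puts pattern.match?("HELLO")
```

Folding the letters with a bitwise OR mirrors the inject call in the definition above; only the lookup has been moved into a named table.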
  

Possible extensions

Ruby does not allow an alphanumeric character as the delimiter of a % literal. The name after the % could therefore unambiguously consist of more than one letter, e.g., %yaml, which is less cryptic than %y (although longer). While not a necessity, this would increase the number of available literals.
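To illustrate why a multi-letter name stays unambiguous, here is a rough sketch of how the source text of such a literal could be split into name, body and option letters. It is not how the Ruby parser works: split_percent_literal and PAIRED_DELIMITERS are hypothetical names, and backslash escapes are deliberately ignored.

```ruby
# Opening delimiters that have a distinct closing counterpart.
PAIRED_DELIMITERS = { "{" => "}", "(" => ")", "[" => "]", "<" => ">" }.freeze

# Splits e.g. "%yaml{a: 1}x" into ["yaml", "a: 1", "x"].
def split_percent_literal(source)
  match = /\A%([A-Za-z]+)/.match(source)
  raise ArgumentError, "not a % literal: #{source}" unless match

  name = match[1]
  rest = match.post_match
  raise ArgumentError, "missing delimiter" if rest.empty?

  # The first non-letter character is the delimiter.
  opener = rest[0]
  closer = PAIRED_DELIMITERS.fetch(opener, opener)

  # Scan for the matching closing delimiter, counting nested pairs.
  depth = 1
  index = 1
  while index < rest.length
    char = rest[index]
    if char == closer
      depth -= 1
      break if depth.zero?
    elsif char == opener
      depth += 1
    end
    index += 1
  end
  raise ArgumentError, "unterminated literal" unless depth.zero?

  options = rest[(index + 1)..-1]
  raise ArgumentError, "invalid options: #{options}" unless options =~ /\A[A-Za-z]*\z/

  [name, rest[1...index], options]
end

p split_percent_literal("%yaml{a: {b: 1}}x")
```

Since the name ends at the first character that is not a letter, and that character must be the delimiter, no separator between name and body is needed.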

Analysis

Pros

Cons

Implementation

We don't have an implementation yet since we are no Ruby core wizards and it requires
If we manage to cook up an implementation, the patch will probably appear first at .

