
RCR 280: Unified type conversion framework

submitted by grenzi on Wed Sep 08 2004 11:45:27 PM -0700

Status: pending


Abstract

As of Ruby 1.8 there is no standard way to provide a transformation (conversion) from one "type" of object to another.
This RCR proposes a generalized, fully extensible framework that is more general and powerful than the simple convention used now.

Problem

Currently Ruby has no well-defined mechanism for converting one type of object into another; it only has a convention. Standard library methods such as to_s and to_i follow a naming convention that is not extensible (third-party code tends to write to_class, while the standard library writes to_c), lack error checking, and pollute the method namespaces of the classes that provide these conversions.
If error checking is desired, the user must call global methods such as String() and Integer(), which do not follow the usual convention and require a user-defined mechanism to dispatch on the source type.

Proposal

A method is added to Object (possible names: #to, #to_type, #as -- we'll use #to in this RCR) accepting a parameter which represents the target type of the transformation.
Transformation paths are kept as Callable objects in a global registry.

Analysis

We dare to use the word type, because this system is not limited to converting
between classes or modules. The proposed system does not limit in any way the
concept of starting and target type. Even though the basic building blocks are
still classes and modules, we can use properties or the state of an object to
direct the transformation. Under this proposal, types that are not represented
by Classes or Modules may be supplied to the #to method as a symbol, say:

a=some_string.to(:CapitalizedString)

The Callable object stored in the registry may return a newly created object,
wrap the original in a proxy, add singleton methods or include modules. It can
even be used as an assertion facility, say:

a=3.to :Odd #=> 3
a=4.to :Odd #=> TypeError

This way the various is_a?, kind_of? and respond_to? checks may be factored
into a conversion path that allows simple declarative usage, centralized
management and crash-early behaviour, say:

def foo a,b
  a=a.to Bla
  b=b.to Boo
  # ... do stuff
end

This is somewhat similar to the Lisp approach to type declarations, in that it
allows the author to hint types using the full power of the language.
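
As a rough sketch of the declarative style described above, the following uses a compact stand-in for the registry and for Object#to (the full sample implementation appears at the end of this RCR). The :Positive pseudoclass and the rectangle_area method are made up for illustration and are not part of the proposal.

```ruby
# Compact stand-in for the sample implementation's registry and Object#to.
class ConvRegistry
  @paths = {}

  class << self
    def []=(from_type, to_type, callable)
      @paths[[from_type, to_type]] = callable
    end

    # Find a path registered for from_type or for one of its ancestors.
    def [](from_type, to_type)
      from_type.ancestors.each do |anc|
        path = @paths[[anc, to_type]]
        return path if path
      end
      raise TypeError, "no conversion for #{from_type},#{to_type}"
    end
  end
end

class Object
  def to(target, *args, &blk)
    return self if target.is_a?(Module) && is_a?(target)
    ConvRegistry[self.class, target].call(self, *args, &blk)
  end
end

# Hypothetical pseudoclass: an assertion that passes the value through.
ConvRegistry[Numeric, :Positive] = proc do |n|
  raise TypeError, "#{n} is not positive" unless n > 0
  n
end

# Hypothetical method: arguments are checked on entry, so bad input
# fails here instead of somewhere deep inside the computation.
def rectangle_area(width, height)
  width  = width.to(Numeric).to(:Positive)
  height = height.to(Numeric).to(:Positive)
  width * height
end

rectangle_area(3, 4)        # both arguments pass the checks
begin
  rectangle_area("3", 4)    # no String -> Numeric path is registered
rescue TypeError => e
  puts "rejected: #{e.message}"
end
```

The checks at the top of the method double as documentation of what the method expects from its arguments.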


The system is extensible in that it allows conversion paths to be defined
by the developer of the starting type, by the developer of the target type,
or by third-party users.


It can be used to enhance interoperability between different
libraries, because it allows DeveloperA to build LibA while DeveloperB
develops LibB and ThirdUser can use both libraries if a simple conforming
path to LibT is provided. This is not something new, but a formalization of
a well known practice. This approach has proven itself useful in other languages (
for example,)


The system integrates with current Ruby (see sample implementation) but is not
limited to it.

The compiler/interpreter/VM may optimize the type conversions/declarations
based on compile-time or runtime analysis, for example in cases like this:

def sum a,b
  a=a.to Numeric; b=b.to Numeric
  a+b
end
sum 1,2

by removing the check once information about the arguments has been gathered.


Finally, this approach allows clean documentation of expected arguments, which
can be parsed by specialized tools such as rdoc.


For examples of usage look at the provided implementation; it comes with
simple tests.

Implementation

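A simple implementation is provided. The basic algorithm is:

1. If the target is a class or module and the object is already an instance
   of it, return the object itself.
2. Otherwise look in the registry for a path from the object's class to the
   target.
3. If none is found, look for a path registered for one of the ancestors of
   the object's class.
4. If a path is found, call it with the object and any extra arguments;
   otherwise raise a TypeError.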

Raising an error is better than returning false, nil, or a default value
because it allows objects to be transformed into Boolean values or NilClass,
so users can write their own boolean transformation (often requested as
#to_bool) without changing existing classes. It also makes error checking
possible, while still allowing easy use of default values, like:

foo= bla.to(Integer) rescue 0



Note that optional arguments for the conversion (such as the base
in Integer conversions) can be given to the #to (or #as, #to_type)
method and are passed on to the conversion Callable, so they still
work as expected, e.g.

'200'.to Integer, 16

would convert the string into an Integer considering it encoded in
base 16.
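
The two points above (errors instead of sentinel values, and extra arguments being forwarded) can be sketched as follows. This is only an illustration under stated assumptions: the registry is a compact stand-in for the sample implementation, and the String to Integer path is built on Kernel#Integer, which is stricter than the to_i based path used in the sample tests.

```ruby
# Compact stand-in for the sample implementation's registry and Object#to.
class ConvRegistry
  @paths = {}

  class << self
    def []=(from_type, to_type, callable)
      @paths[[from_type, to_type]] = callable
    end

    # Find a path registered for from_type or for one of its ancestors.
    def [](from_type, to_type)
      from_type.ancestors.each do |anc|
        path = @paths[[anc, to_type]]
        return path if path
      end
      raise TypeError, "no conversion for #{from_type},#{to_type}"
    end
  end
end

class Object
  def to(target, *args, &blk)
    return self if target.is_a?(Module) && is_a?(target)
    ConvRegistry[self.class, target].call(self, *args, &blk)
  end
end

# Assumption: a strict path built on Kernel#Integer, which raises
# ArgumentError on malformed input instead of returning 0 like to_i.
ConvRegistry[String, Integer] = proc do |str, base = 10|
  begin
    Integer(str, base)
  rescue ArgumentError => e
    raise TypeError, e.message
  end
end

# A :Bool path can legitimately return false, because failure is
# signalled by an exception rather than by the return value.
ConvRegistry[String, :Bool] = proc { |str| !str.empty? }

'200'.to(Integer)        # => 200
'200'.to(Integer, 16)    # => 512, the extra argument reaches the proc
''.to(:Bool)             # => false, a valid result, not a failure
count = 'many'.to(Integer) rescue 0   # conversion fails, default is used
```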



Writing transformation paths is really simple.
Suppose we have created a SortedCollection class. Converting an
Enumerable object to it would be easy:

ConvRegistry[Enumerable,SortedCollection]= proc do |enum|
  sc=SortedCollection.new
  enum.each {|i| sc.insert i }
  sc
end
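
The RCR does not define SortedCollection, so the class below is a hypothetical minimal version, written only to show that such a conversion path is an ordinary callable that can be tried out directly before it is stored in the registry.

```ruby
# Hypothetical SortedCollection: keeps its items ordered on insert.
class SortedCollection
  include Enumerable

  def initialize
    @items = []
  end

  # Insert item before the first existing element that is greater than it.
  def insert(item)
    index = @items.index { |existing| (existing <=> item) > 0 } || @items.size
    @items.insert(index, item)
    self
  end

  def each(&blk)
    @items.each(&blk)
  end
end

# Same body as the conversion path above, held in a constant here so it
# can be called without a registry.
ENUM_TO_SORTED = proc do |enum|
  sc = SortedCollection.new
  enum.each { |i| sc.insert(i) }
  sc
end

sorted = ENUM_TO_SORTED.call([3, 1, 2])
sorted.to_a   # => [1, 2, 3]
```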

Conversions to a pseudoclass are also simple, e.g. a conversion from
Enumerable to a pseudoclass :Bool can be as easy as

ConvRegistry[Enumerable,:Bool]=proc { |x| not x.empty? }



The behaviour of the proc is in any case not limited to simple statements.
The system allows many of the interface-like systems proposed for Ruby over the
years to be embedded in it. For example, people could tag an object as
implementing an interface and write a conversion path that just checks this tag.
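
One way such tagging could look is sketched below. The INTERFACES constant is an invented convention, not something this RCR specifies, and the registry is again a compact stand-in for the sample implementation.

```ruby
# Compact stand-in for the sample implementation's registry and Object#to.
class ConvRegistry
  @paths = {}

  class << self
    def []=(from_type, to_type, callable)
      @paths[[from_type, to_type]] = callable
    end

    # Find a path registered for from_type or for one of its ancestors.
    def [](from_type, to_type)
      from_type.ancestors.each do |anc|
        path = @paths[[anc, to_type]]
        return path if path
      end
      raise TypeError, "no conversion for #{from_type},#{to_type}"
    end
  end
end

class Object
  def to(target, *args, &blk)
    return self if target.is_a?(Module) && is_a?(target)
    ConvRegistry[self.class, target].call(self, *args, &blk)
  end
end

# Hypothetical convention: a class lists the interfaces it claims to
# implement in an INTERFACES constant.
class Circle
  INTERFACES = [:Drawable]
  def draw; "circle"; end
end

class Blob
  def draw; "blob"; end   # same method, but the class carries no tag
end

# The path only checks the tag and hands the object back unchanged.
ConvRegistry[Object, :Drawable] = proc do |obj|
  klass = obj.class
  tags = klass.const_defined?(:INTERFACES) ? klass::INTERFACES : []
  raise TypeError, "#{klass} is not tagged :Drawable" unless tags.include?(:Drawable)
  obj
end

Circle.new.to(:Drawable).draw   # => "circle"
begin
  Blob.new.to(:Drawable)
rescue TypeError => e
  puts "rejected: #{e.message}"
end
```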



If this RCR is accepted, it would be nice for this mechanism to be easily
accessible from C code. So the proposal is that, on acceptance, it be
rewritten in C with the same interface from Ruby, so that where this code
can be written in Ruby:
                                                                   
ConvRegistry[Enumerable,:Bool]=proc { ... }
ConvRegistry[String,Integer]= proc { ... }
ConvRegistry[Class,Enumerable]= proc { ... }

It would be possible to write this in C:

rb_define_conversion(rb_mEnumerable, rb_intern("Bool"),enum_to_bool);
rb_define_conversion(rb_cString, rb_cInteger, string_to_int);
rb_define_conversion(rb_cClass, rb_mEnumerable, class_to_enum);

One last thing:
The sample implementation does not chain conversions automatically between
arbitrary types. That is, even if a path from T1 to T2 exists, and another
one exists from T2 to T3, T1 is not automagically converted to T3.


This is not allowed because it makes the system more complex and less
predictable, plus there will be an ambiguity when two or more paths are defined
implicitly.
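
The ambiguity can be illustrated without the registry at all. The two routes below are hypothetical and use Ruby's built-in to_i, to_r and to_f in place of registered paths; both lead from String to Float through a different intermediate type, and they disagree on the result.

```ruby
# Two hypothetical two-step routes from String to Float.
VIA_INTEGER  = proc { |s| s.to_i.to_f }   # String -> Integer -> Float
VIA_RATIONAL = proc { |s| s.to_r.to_f }   # String -> Rational -> Float

VIA_INTEGER.call('3.7')    # => 3.0, the fraction is lost at the Integer step
VIA_RATIONAL.call('3.7')   # => 3.7
```

If a String to Float conversion is wanted, registering it explicitly makes the choice of route visible instead of leaving it to the lookup order.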



class ConvRegistry
  @@reg = Hash.new

  # Return the callable registered for from_type (or one of its
  # ancestors) and to_type, or raise TypeError if there is none.
  def self.[](from_type, to_type)
    if res = @@reg[[from_type, to_type]]
      res
    elsif res = find_in_ancestors(from_type, to_type)
      res
    else
      raise TypeError.new("no conversion for #{from_type},#{to_type}")
    end
  end

  def self.find_in_ancestors(from_type, to_type)
    from_type.ancestors[1..-1].each do |anc|
      if res2 = @@reg[[anc, to_type]]
        return res2
      end
    end
    nil
  end

  def self.[]=(from_type, to_type, func)
    @@reg[[from_type, to_type]] = func
  end
end

class Object
  def to(something, *args, &blk)
    # Already an instance of the target class or module: nothing to do.
    if something.is_a?(Module) and self.is_a?(something)
      return self
    end
    ConvRegistry[self.class, something].call(self, *args, &blk)
  end
end

if __FILE__ == $0
  require 'test/unit'
  require 'stringio'
  class MyTest < Test::Unit::TestCase
    def setup
      ConvRegistry[Enumerable, :Bool] = proc { |x| not x.empty? }
      ConvRegistry[String, Integer] = proc { |x, *args| x.to_i(*args) }
      ConvRegistry[Class, Enumerable] = proc { |x| x.send :include, Enumerable }
      ConvRegistry[Object, Enumerable] = proc { |x| x.send :extend, Enumerable }
      ConvRegistry[Object, :Frozen] = proc do |x|
        if x.frozen?
          x
        else
          raise TypeError.new
        end
      end
      ConvRegistry[Integer, :Odd] = proc do |x|
        if x % 2 == 1
          x
        else
          raise TypeError.new
        end
      end
      ConvRegistry[Object, :Readable] = proc do |x|
        if x.respond_to? :read
          x
        else
          raise TypeError.new
        end
      end
    end

    def test_class_class
      assert_equal '5'.to(Integer), 5
    end

    def test_subclass_class
      my = Class.new String
      m = my.new '5'
      assert_equal m.to(Integer), 5
    end

    def test_class_module
      f_class = Class.new(Object)
      f_class.class_eval do
        def each
          yield 1
        end
      end
      f_obj = f_class.to(Enumerable).new
      assert_equal f_obj.find_all { |x| x == 1 }, [1]
    end

    def test_more_arguments
      a = 'aa'
      assert_equal 170, a.to(Integer, 16)
    end

    def test_instance_module
      f_class = Class.new Object
      f_class.class_eval do
        def each
          yield 1
        end
      end
      f_obj = f_class.new
      f_obj = f_obj.to Enumerable
      assert_equal f_obj.find_all { |x| x == 1 }, [1]
    end

    def test_singleton_module
      f = Object.new
      def f.each
        yield 1
      end
      f = f.to(Enumerable)
      assert_equal f.find_all { |x| x == 1 }, [1]
    end

    def test_property_pseudoclass_ok
      a = 'ciao'
      a.freeze
      a = a.to(:Frozen)

      assert_equal a, 'ciao'
    end

    def test_property_pseudoclass_fail
      assert_raises(TypeError) { 'ciao'.to :Frozen }
    end

    def test_property_pseudoclass_ok2
      a = 5
      a = a.to :Odd
      assert_equal a, 5
    end

    def test_property_pseudoclass_fail2
      assert_raises(TypeError) { 6.to :Odd }
    end

    def test_object_superclass
      a = 42
      assert_equal a, a.to(Integer)
    end

    def test_object_mixin_ok
      a = []
      assert_equal a, a.to(Enumerable)
    end

    def test_object_mixin_fail
      a = 5
      assert_raises(TypeError) { a.to Enumerable }
    end

    def test_methodbag_pseudoclass_ok
      a = StringIO.new
      assert_equal a, a.to(:Readable)
    end

    def test_methodbag_pseudoclass_fail
      a = 24
      assert_raises(TypeError) { a.to :Readable }
    end

    def test_enumerable_bool
      a = ''
      assert_equal false, a.to(:Bool)
      a << 1
      assert_equal true, a.to(:Bool)
      a = []
      assert_equal false, a.to(:Bool)
      a << 1
      assert_equal true, a.to(:Bool)
    end
  end
end


