ruby picture

RCR 261: introducing Hash-like mixin

Submitted by grenzi (Fri Jun 18 12:06:53 UTC 2004)

Abstract

provide a mixin for key->value mappings

Problem

there are many classes that implements key->value mappings, mostly via #[] and #[]=, but they are not build around a common base, so they show up little quirks . Furthermore, it is hard to write a new hash-like class or inherit from Hash.

Proposal

Introducing a Map module would reduce code and allow developers to write their own Hash-like objects trusting that those could fit where others does.

Analysis

Please consider that Map is just a placeholder while people think of a name like Mappable that is more Likeable.

As of now writing an object that provides key->value mappings is hard. More than this, inherithing from an hash is hard cause it requires you to override near every method.

Key->value maps are extremely commons even in the stdlib (grepping "def []=" gives >30 results), yet the complete interface is somewhat hidden and hard to use, because few of the users of #[]= implemented the whole mapping functionality (say, PStore does not respond_to #store)

If a mixin is available for this it would be much easier for everyone to write this kind of objects.

As a (useful?) side effect it makes easier for users to use some form of type checking or interface checking (i.e. no confusion beetween Array#[] an Map#[])

Providing a simple set of methods would make classes mixing in this Map module very feature rich.

In the mixin we could rely on:

#[] or #fetch or a private #get method
#[]= or #store or a private #put method
#delete
#key? or #has_key?


This is near to the interface needed in perl to build an object that can override an hash with Tie.

The java Map interface is richer in that it provides a keySet() method that gives a set of the keys contained in the map.

If we choose that an Hashable/Map object can always provide a #keys method (or #each_key)we can include Enumerable in Map trivially, and implement a dumb #key? like #keys.member?

Possibly we could provide two different interfaces: OpaqueMap and Map (better names wanted) with the only difference of the #keys method.

Please note that this is not some kind of modification of the language, it is merely a refactoring based on actual usage.
It could have been useless to provide this in the early ruby days, yet it is important to provide it as of now, beacause the usage for Classes that provide such a behaviour is quite high, so consistence and reliability of an interface is a bigger need

Implementation

Mine would be buggy and long, but you can take a tentative loook at matju's MetaRuby package, where he provides an HollowHash that is mostly like the proposed Map mixin. See:
ruby picture
Comments Current voting

Good idea, I reckon. Existing mixins (Enumerable and Comparable) are part of the charm of Ruby. They dazzle by makin so many methods available for free. Mapping keys to values is one of the fundamental operations in modern computer languages, so seeing some standards and reuse would be nice.

AFAIC, Map is the perfect name, for that's what the object *is*. Hashable is complete garbage. Every single object is hashable: it can be a key in a hash. Not everything needs to end in -able!

-- Gavin Sinclair


FWIW, this is what I tried to do a few years ago with RubyCollections:

  http://cvs.sourceforge.net/viewcvs.py/rubycollections/rubycollections/rbc/mixins/

BinaryTree includes Enumerable, EnumerableContainer, and PairEnumerableContainer. I never was happy with the names, but I do like dividing the mixins so they are as simple as possible (Enumerable and EnumerableContainer require #each, PairEnumerableContainer requires #each_pair, and PairUpdatableContainer requires #[], #[]=, #clear, and #each -- and this last one could be even further subdivided).

The point writing those mixins was ONLY to provide an implementation for methods that could be implemented in terms of other methods. Using the modules themselves as an indicator of whether an object implements a particular interface would be wrong, as the object may in fact have the same interface but not include the module (e.g. if an extension implemented all of the methods provided by the mixins in the name of optimization). It may even be conceivable that the provided mixins have the wrong implementation for a particular class, but the class wants to provide the same interface as Hash anyway; in that case expecting the class to include the mixin just so it works with some type checking code is silly.

I think this is a good idea in principal (similar to RCR#65) but the proposal here isn't very concrete. Which methods exactly do you want this mixin to implement? What requirements should it put on the implmentation of the methods it relies on? What exactly do you want to call it (this is an important one; if it doesn't have a good name, it shouldn't be part of the standard Ruby distribution).

Also, would the author of this RCR please finish the analysis section? It looks like it got cut off in the middle of a sentence.

-- Paul Brannan


This is a good idea, but it needs to be fleshed out a bit. This certainly can be prototyped in current Ruby and it would be good to provide a sample implementation. What methods are will be the base methods? (e.g. in Enumerable, #each is the base method).

Also, what other X-able modules might we see as useful? It might be good to take good look at a well decomposed library (like EiffelBase for Eiffel) for ideas.

Regarding names, I agree Hashable is a poor choice. Mappable would work (I like to follow the X-able pattern with Enumerable and Comparable). MapMethods would work if you don't like the X-able pattern.

-- Jim Weirich


I'm sorry for not have been specific enough.
The reason I did not proposed a sample implementation was that I was not sure about the requirement for an enumeration method (wheter we choose #keys or #each_key) and wanted to hear people's voice. I'm not sure if we should:
- require put,get,has_key,delete (put/get may be []=/[] or store/fetch) - require put,get,has_key,delete and keys
- make two mixins one with require put,get,has_key,delete, the other with keys and including enumerable.

OTOH the implementation of HollowHash seemed quite good to show the concept, and I have to say that Paul Brannan's one is too. Just, why did you based it on #clear instead of #delete?

I'd like to provide a sample implementation now, but Murphy's law was waiting for me: my home pc died and I won't have access to a ruby box for a while. Anyway, I'll write out what needed methods and provided ones could be.

 If we choose to add just one mixin (java's Map interface):<br>
EnumerableMap: 
 needs: 
  key?,[],[]=,delete,keys (maybe each_key is better?)
 includes:
  Enumerable
 provides:
  ==,default,default=,delete_if, each, each_key(keys), each_pair, each_value, 
  empty?, fetch, has_key?, has_value?, include?, index, indexes, indices,
  length, member?,merge, merge!,reject, reject!, replace, select, shift,
  size, sort, store, to_a, to_hash,update, value?, values,
  values_at

#default/#default= could be provided as dumb methods by the mixin, or should be ignored? I'm not sure.

If we agree on providing a lower level Map without #keys (to be included in EnumerableMap, Perl's hash interface ) it could be:

Map: 
 needs: 
  key?,[],[]=,delete
 includes:
  Enumerable
 provides:
  ==,default,default=,fetch,has_key?,merge,merge!,replace,store,update

Last things: I'm not in favor of type checking. DBM could safely provide all the methods of Hash withouth including anything. If someone really wanted to type check like this, he could simply include Map in it without problems, surely it's not a problem for the DBM developers :)

About names: I believe MapMethods is nice, but it sounds really weird in case of : kind_of?MapMethods.

About cut-out rcr: dunno what happened, my ram is gone mad, I'm going to fix it. Thanks for all the feedback, I'll meditate on it :)

-- Gabriele Renzi


#clear is used for PairUpdatableContainer#replace, which is like #update but clears the container first. Doing it this way is actually wrong, since it doesn't meet the strong exception-safety guarantee (if update fails the container is not restored to its original state). I'm not sure how to implement #replace generically and efficiently and in an exception-safe manner, though, without using Object#become.

-- Paul Brannan


I see, and it is interesting. I believe working at the C level #become-like behaviour may be achieved easily, yet the same problem remains for, say, #delete_if. How is full-operation-safe behaviour in ruby core itself? Say, how does Enumerable#reject work without letting an object in a half-rejected state ?

Anyway, I'd say that #replace could be provided just for EnumerableMap, where #clear can be implemented in terms of #keys and #delete. If people agrees on that we could put it in Map withouth the need for #keys but based on a #become like behaviour. -- Gabriele Renzi


I suppose a default implementation of #clear could be provided, but most data structures would want to provide their own. It's silly to have an O(n) #clear when an O(1) version probably isn't difficult to write.

-- Paul Brannan


Add new comments here.


I've written a sample implementation for a "Map" mixin here:

A unit test is available here:

This implementation relies on #get_mapped_value(key) (which must either return the value for _key_ or raise IndexError), and optionally #each (to make things like #keys, #values, #size, #each_key etc. work).

It may be better to split this up into "Map" and "EnumerableMap", where the former only relies on #get_mapped_value(key) and provides mapping methods based on it, and the latter additionally relies on #each, includes Map and Enumerable and provides methods like #keys, #values, #each_key, #size etc. I've put this variant here:

-- Olaf Klischat


I think that an Array should also be considered one of these things. The keys are integers instead of arbitrary things and valid key/value pairs are those where the value is not nil. Given that, all of the methods described should be implementable.


Strongly opposed 0
Opposed 0
Neutral 1
In favor 3
Strongly advocate 8
ruby picture
If you have registered at RCRchive, you may now sign in below. If you have not registered, you may sign up for a username and password. Registering enables you to submit new RCRs, and vote and leave comments on existing RCRs.
Your username:
Your password:

ruby picture

Powered by .