Submitted by damelang (Sat Mar 19 20:08:31 UTC 2005)
Although the current behavior frequently causes frustration in newbies, the real victims are the more experienced rubyists that can't seem to get used to the behavior.
Although matz has referred to this request as 'premature optimization', some people have argued that this behavior has been needed long enough to be re-termed 'mature optimization'.
The most common case is in text processing, perhaps the most frequent of ruby's tasks. A not-so-contrived example of the new behavior in action:
gets.chomp!.strip!.downcase!.gsub!(/[^a-z]/, '').split(/ /)
If a text of several thousand lines were to be processed, the alternative non-destructive approach would incur significant overhead. Currently, to achieve the above optimization, one must split the expression into 6 separate lines, which is inefficient for the programmer.
It appears that the ruby community in general makes very little use of the current behavior (according to informal polls on ruby-talk).
Yet, many have expressed the need for the proposed behavior. Thus, it can be seen as a problem that Ruby's current behavior caters to a rare case rather than the common one.
Before:
line = gets
line.chomp!
line.strip!
line.downcase!
line.gsub!(/[^a-z]/, '')
words = line.split(/ /)
After:
words = gets.chomp!.strip!.downcase!.gsub!(/[^a-z]/, '').split(/ /)
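For reference, the one-liner cannot be written today because a destructive String method returns nil when it makes no change. A minimal sketch of that current behavior follows; the sample string and variable names are made up for illustration.

```ruby
# Current behavior: a bang method returns nil when it changes nothing.
line = "hello world\n".dup   # dup gives a mutable copy of the literal

first  = line.chomp!   # removes the newline, returns the receiver
second = line.chomp!   # nothing left to remove, returns nil

# strip! has nothing to strip, so it returns nil and the chain breaks.
chain_error =
  begin
    line.strip!.downcase!
    nil
  rescue NoMethodError => e
    e
  end
```

Under the proposal, each call would return the receiver regardless, so the chain would run to completion.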
In the rare circumstance that one needs the current behavior of the destructive methods, a comparison of the object state pre and post operation is sufficient. Yes, a very small amount of ruby code will run slower because of this change.
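As a rough sketch of the pre and post comparison mentioned above: `changed_by?` is a hypothetical helper, not part of Ruby or of this RCR. It works under both the current and the proposed behavior, at the cost of copying the string before each operation.

```ruby
# Hypothetical helper: runs a block that mutates str in place and
# reports whether the contents differ afterwards.
def changed_by?(str)
  before = str.dup   # snapshot; this costs one copy of the string
  yield str
  str != before
end

line = "  Hello  ".dup

stripped       = changed_by?(line) { |s| s.strip! }   # whitespace was removed
stripped_again = changed_by?(line) { |s| s.strip! }   # nothing left to strip
```

The extra copy is the source of the slowdown conceded above for code that relies on change detection.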
Perhaps a more interesting solution would be to get rid of all destructive methods and work on optimizing their existing non-destructive counterparts. Matz seems to prefer this solution:
If interest towards this approach materializes, I would be happy to supersede this RCR.
ObjectSpace.each_object do |o|
  if o.is_a? Class
    o.instance_methods.grep(/!$/).each do |m|
      o.class_eval <<-EOS
        alias :old_#{m} :#{m}
        def #{m}(*args)
          old_#{m}(*args)
          self
        end
      EOS
    end
  end
end
Comments
RCRchive copyright © David Alan Black, 2003-2005.
It seems it is this RCR that fails to suggest an alternative for the functionality that is lost by this proposal: the ability to efficiently detect whether the object was modified.
What is the suggested alternative for the above, if this RCR is accepted?
See Analysis section for answer.
Do you have any hard data to back up the following statement of your RCR:
>If a text of several thousand lines were to be processed, the alternative non-destructive approach would incur significant overhead.
As far as my tests go, your example gives about a 1 second difference between the destructive and the non-destructive versions for a million lines of input. In my book, that's not a "significant overhead".
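A measurement of this kind could be set up roughly as follows. This is only a sketch, not the test actually run for this comment: the input lines, the line count, and the method names are assumptions, and the timings will vary by machine and Ruby version.

```ruby
require "benchmark"

# Synthetic input; the contents and the count are made up for illustration.
LINES = Array.new(50_000) { "  Some Sample Text, with punctuation!  \n" }

# In-place version, one statement per step as in the RCR's "Before" example.
def destructive(raw)
  line = raw.dup   # stand-in for the fresh string that gets would return
  line.chomp!
  line.strip!
  line.downcase!
  line.gsub!(/[^a-z]/, '')
  line.split(/ /)
end

# Copying version; dup is kept so both variants start from a fresh string.
def non_destructive(raw)
  raw.dup.chomp.strip.downcase.gsub(/[^a-z]/, '').split(/ /)
end

Benchmark.bm(16) do |x|
  x.report("destructive:")     { LINES.each { |l| destructive(l) } }
  x.report("non-destructive:") { LINES.each { |l| non_destructive(l) } }
end
```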
You also state that:
>Yet, many have expressed the need for the proposed behavior. Thus, it can be seen as a problem that Ruby's current behavior caters to an rare case rather than the common one.
Perhaps the problem is that most don't understand when and why to use the destructive methods and when they fail to do it right they consider it a problem with the implementation instead of a problem with use?
The exclamation-point at the end of a destructive method is a warning sign that this method might not do what you may think it does; much like "monsters be here" written on an old map, it's a warning that you perhaps shouldn't go that way to arrive at your goal.
In your Analysis you state that
>This is the most straightforward solution to the problem. Opponents to this change usually don't suggest an alternative, they just refute the claim that it is a problem at all.
Perhaps a more straightforward solution would be to drop the destructive versions of the methods in question. Most people, as you say yourself, don't know how to use them and many don't use them. Perhaps the question isn't about how to make these methods more useful, but if they have any future at all.
Paul Graham has some interesting things to say about premature optimization in programming languages in his essay "The Hundred-Year Language" (see
In the end, the only reason for having destructive versions of the methods you suggest is that they are (ever so) slightly more efficient than their non-destructive counterparts. I would say that if you start to worry that a gsub! may be slightly more efficient than a gsub, you are missing the important part of gsub, namely getting the regular expression right and then making it as efficient as possible. After that, I'd be worrying about IO-efficiency...
It's great to see that Matz seems to agree with the analysis of the problem in the comment above. (Sorry for posting this separately, I missed it before submitting it.)
No problem. I'm just happy that we might be converging on a solution. Your comment is proof that there is interest in this alternative solution. If I get more support for this alternative approach, I'll be superseding this one. Anyone else?
BTW, the alternative solution is found in the Analysis section, second half.
The alternative being eliminating destructive methods (!) entirely.
While I'm not a fan of obfuscated-terse code, I do like terse. I specifically have wanted this functionality repeatedly because I'm doing a bunch of .gsub! operations on a string, and would prefer to chain them. As a workaround I can either put each .gsub! on its own line (which some might prefer, visually--I don't) or change all the gsub! to gsub and lose a certain amount of performance.
My applications tend not to need the level of performance I save using gsub!, but if the feature is there, why not use it? :)
I am in favour, not because of optimization but because if I want to modify an object, I want to *modify* it, not replace it with a new object. Conceptually.
Similarly, if one does hsh[:a] = 'b', one presumably is not creating a copy of the hash but modifying the existing one.
-rue