
RCR 241: in operator

Submitted by hal9000 (Sat Apr 10 23:04:27 UTC 2004)

Abstract

in operator

Problem

Ruby allows calling include? on a collection to determine whether an item is in that collection. In essence, we can think of this as: container operator item

But there is nothing to do that in reverse order, syntactically speaking: item operator container

The issues I perceive are these:

  1. Sometimes literal collections are used, and the literal is long and distracting:
    if [MS, AL, VA, TX, NJ, NY].include? state ...
  2. Sometimes the collection may be an expression like array1 + array2 (making parentheses necessary when calling include?).
  3. It is mathematically backwards, in that the familiar "epsilon" set operator goes in the opposite order.
  4. It is sometimes conceptually backwards, because we want to focus our attention on the item rather than the collection.
  5. There is precedent for this operator in computing, since Pascal has had it from the beginning; and (I believe) Python does also.
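
Issues 1 and 2 can be seen in a small sketch of how this reads in Ruby today. The variable names and the sample values are assumptions made for illustration only.

```ruby
# Illustrative values only; none of these names come from the RCR itself.
state = "TX"

# Issue 1: the long literal comes first, and the item of interest trails it.
in_list = %w[MS AL VA TX NJ NY].include?(state)

# Issue 2: a compound collection needs parentheses around it.
array1 = [1, 2]
array2 = [3, 4]
found_with_parens = (array1 + array2).include?(3)

# Without the parentheses, include? binds to array2 alone, and Array#+
# is then handed a boolean instead of an array.
type_error = begin
  array1 + array2.include?(3)
rescue TypeError => e
  e
end
```

Here `in_list` and `found_with_parens` are both true, while the unparenthesized form ends up in `type_error` as an exception rather than an answer.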

Proposal

Make the (existing) keyword in represent an operator in a syntax-sugar fashion.

Just as for x in y is syntax sugar that invokes the standard iterator each, let x in y be syntax sugar that invokes the method include? on the operand.

Analysis

In other words, x in y would always mean exactly the same as y.include? x, even when y does not define an include? method.

By analogy, note that for x in y will fail if y does not define each.
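
The analogy can be checked in current Ruby. The following is a minimal sketch; the Countdown class is hypothetical and exists only to show that for/in relies on each.

```ruby
# Hypothetical class: it defines each, so for/in can iterate over it.
class Countdown
  def initialize(start)
    @start = start
  end

  def each
    @start.downto(1) { |n| yield n }
  end
end

seen = []
for n in Countdown.new(3)
  seen << n
end

# An object without each makes for/in fail at run time...
for_error = begin
  for n in Object.new
  end
rescue NoMethodError => e
  e
end

# ...just as include? already fails on an object that does not define it.
include_error = begin
  Object.new.include?(1)
rescue NoMethodError => e
  e
end
```

Under the proposal, x in y would presumably fail in the same way as the last case whenever y lacks include?.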

Note that in is already a reserved word in Ruby (which Matz has said "will not go away"). We therefore don't have to add a new keyword.

Existing code will not break, as the tokens x in y are not meaningful in Ruby currently.

Use of the operator would be optional at the programmer's discretion, just as for may or may not be used in place of each.

I envision this expression always returning true or false, since as far as I know, that is what include? always does.
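
As a rough check of that expectation, here is a sketch over a few core and standard-library classes. It is not exhaustive, and a user-defined include? is free to return anything.

```ruby
require 'set'

results = [
  [1, 2, 3].include?(2),        # Array
  (0..4).include?(7),           # Range
  "hello".include?("ell"),      # String (substring test)
  { a: 1 }.include?(:a),        # Hash (tests keys)
  Set.new([1, 2]).include?(3)   # Set
]

# Every result here is exactly true or false, never some other truthy value.
all_boolean = results.all? { |r| r == true || r == false }
```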

Obviously common uses might be in loops and conditionals:

  words << (ch = STDIN.getc) until ch in ['.', ';', '!', '?']

or also

  if digit in 0..4 then...

The biggest question I am aware of is: What should the precedence of this operator be?

I favor putting it at the same level as the comparison operators. Note the following:

Implementation

I'm not qualified to comment much here. It seems to me this should be a small change. in is already a keyword; I would expect we could assign it a precedence, add some yacc rules, and add a bit of code calling the include? method.
Comments

I understand the case you're making, but (for reasons I probably can't express as well as you've expressed yours) it doesn't quite appeal to me. I think part of it is the syntactic sugar and/or special (or semi-special) case nature of it: a keyword, dedicated to providing an inversion of a method that only certain objects implement anyway. In other words, why does #include? deserve (so to speak) to have its inversion treated this specially?

It also seems a little incongruous for it not to be a method, specifically a '?' method. I'm not saying it isn't reasonably clear, or couldn't be learned, but it feels like "a.include?(x)" and "x in a" are cut from excessively different cloths. I actually like "x.in?(a)" better (and yes, for the minimalists out there, I do know it involves typing more characters :-)

-- David Black


I like the proposal. While I understand David's points, I feel that "for x in y" and "if x in y" are so similar that it makes sense for the latter to be in the language.

However, I do get by just fine with


          
  class Object
    def in?(enum)
      enum.include? self
    end
  end    
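
A short usage sketch for the in? helper. The definition is repeated so the snippet runs on its own, and the sample values are assumptions chosen to mirror the examples in the proposal.

```ruby
class Object
  # True when the receiver is a member of enum.
  def in?(enum)
    enum.include? self
  end
end

state = "TX"
state_found = state.in?(%w[MS AL VA TX NJ NY])       # literal collection
digit_found = 3.in?(0..4)                            # range, no extra parentheses on the range
punct_found = '?'.in?(['.', ';', '!', '?'])          # punctuation example
sum_found   = 5.in?([1, 2] + [3, 4])                 # compound collection
```

The first three are true and the last is false, since 5 is not in the combined array.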

          

(This is in the extensions project:

-- Gavin Sinclair


Given this (from the Analysis):

"[...] x in y would always mean exactly the same as y.include? x, even when y does not define an include? method."

... then "for x in y" should read as "for true" or "for false" (?), so I disagree that there's any similarity between "for x in y" and "if x in y".

While I agree with the perceived problem, I had to vote against the proposed solution.

-- daz


Having in is fine with me (even if it sounds rather Pythonic). But I'd prefer to have junctions in ruby (see with them this will become:

perl6-style

  elem == any('.', ';', '!', '?')

or flgr/ruby-style

  elem == ['.', ';', '!', '?'].any

Junctions solve lots of other problems as well, without requiring any syntax change, and with a clear sound: ".. elem is equal to any of ..".

-- gabriele renzi
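
For readers unfamiliar with junctions, here is a minimal sketch of an "any" junction in plain Ruby. AnyJunction and the any helper are hypothetical names, not the library referred to above, and the junction is written on the left because that is the side whose == method Ruby calls.

```ruby
# Hypothetical junction class, for illustration only.
class AnyJunction
  def initialize(*choices)
    @choices = choices
  end

  # True when at least one of the choices equals the other operand.
  def ==(other)
    @choices.any? { |choice| choice == other }
  end
end

# Hypothetical helper mirroring the perl6-style spelling.
def any(*choices)
  AnyJunction.new(*choices)
end

matches_semicolon = any('.', ';', '!', '?') == ';'
matches_letter    = any('.', ';', '!', '?') == 'a'

# With the element on the left, String#== is called instead and knows
# nothing about this junction, so the comparison is simply false.
reversed = ';' == any('.', ';', '!', '?')
```

A real junction library would need more machinery to make the element-first spelling work; this sketch only shows the basic idea.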
          


Personally, I hate the syntax of junctions even if they are powerful (for much the same reasons I hate the flipflop operator).

I'm not opposed to the functionality, just the horrible syntax.

-- Hal Fulton


And by the way: Here is one more little reason for the in-operator. It saves us using parentheses when we look at ranges:

  (1..10).include? x

would become

  x in 1..10
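
To see why the parentheses are needed today, consider this small sketch (the value of x is an arbitrary assumption):

```ruby
x = 5

with_parens = (1..10).include?(x)

# Without parentheses the call binds to 10 first, as in 1..(10.include?(x)),
# and Integer has no include? method.
range_error = begin
  1..10.include?(x)
rescue NoMethodError => e
  e
end
```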

Of course, that tidbit is not likely to sway anyone who hates this idea. :)

I just think that since we have for/in (and it's not going away), 'in' makes a nice complement to it.

I don't miss the dot and the question mark any more than I miss them when I say

  x == y

rather than saying

  x.==? y

But it's just my opinion.

-- Hal Fulton


I don't like the idea of 'in' having different meanings depending on the context.

The result of...

  if x in y

...should be similar to that of

  for x in y

To be consistent, the 'if x in y' version should assign the first element of 'y' or nil to 'x'.

  if x in y # if (x = y.first)

The operator proposed, if accepted, should be consistent with 'include?', thus the name should be something like 'is_included?' or just 'included?'. I don't see a benefit here and, speaking in terms of linguistics, I think in this case it's not worth the passive voice (placing the direct object as subject).

-- Michel Martens


The meaning of "in" is not changed; rather the meanings of "if" and "for" are different.

I have never seen nor imagined a language in which "if x in y" would perform an assignment operation.

As for passive/active, relational operators correspond to linking verbs in human language, not action verbs.

-- Hal Fulton


Let's see both expressions expanded:

  for [each element] x [that has a value assigned from an element] in [collection] y
    print x
  end

Note that 'x' has no value before the 'for' expression is evaluated. At each loop, a value from collection 'y' must be assigned.

Below is the 'in' as used in the proposal:

  if [a value equal to that of element] x [exists] in [collection] y
    print x
  end

Here, 'x' MUST have a value before the evaluation occurs, and no assignment is ever done.
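
The difference described here can be observed with the existing for loop and include?. This is a sketch with made-up values:

```ruby
y = [10, 20, 30]

# for assigns to x on every pass; x did not need a value beforehand,
# and it keeps the last element after the loop ends.
for x in y
end
x_after_for = x

# A membership test only reads its operand and never assigns to it.
item = 20
found = y.include?(item)
item_after_test = item

# Testing a variable that was never given a value is an error.
name_error = begin
  y.include?(undefined_item)
rescue NameError => e
  e
end
```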

To be consistent, an assignment should occur. It doesn't mean that I want to see that kind of operand, we already have one, but another wording should be used instead.

-- Michel Martens


I see your point, but I can't agree with your assessment.

It's the nature of a for-loop to assign successive elements of a collection to a control variable. The "in" in this case is almost a meaningless particle (like the "then" following an if). I don't believe it makes any sense to say that "in" denotes assignment. The "for" itself implies assignment.

"in" as an operator works the way I've shown in Pascal, learned by millions for over thirty years.

Though I don't know Python, my understanding is that it works the same way.

I've seen this used in algorithmic notation in my discrete structures class. And I've even heard my calculus professors (orally) read the "is a member of" operator as "in" or "is in."

So I think this usage is well-supported and very intuitive for most people.

-- Hal Fulton


Pascal and Python use 'in' as shown in this proposal, and it may be very intuitive for most people; you are right. I like the way Ruby handles this need (the OO way of include?) and prefer iterators and "object.boolean_functions?" over constructs like those presented (the Python and Pascal way). I won't be mad if enough people vote in favor of this syntax sugar and it gets included in the language; I just saw a chance to look for a better word. Sadly for this case, I'm a native Spanish speaker, so chances are I won't come up with a great, precise word to use.

-- Michel Martens


That is fascinating, I would have assumed you were a native English speaker, but I see now that "Michel" is spelled without an "a." Do you know Mauricio Fernandez (batsman)? He also speaks excellent English. Maybe Americans should go to Spain to learn English? :)

Anyway, I can understand why you would want a different word. But if it were a different one, I think that would tend to make me *oppose* the idea. If 'in' had not already been a keyword, I would not have made the suggestion. I think we don't want too many keywords in Ruby.

-- Hal Fulton


I agree with David Black. It seems to break the elegant symmetry to use the keyword and not the method in?. Additionally, I feel that x.in? y is actually clearer. One of the things that attracted me to Ruby is its excellent use of question marks, and it works wonderfully with in?.

I would also argue that "x in y" does *appear* to change the meaning of the keyword. Although on close inspection "in" may be a meaningless particle in the context of for-loops, its position between the variable name and the elements being assigned may cause people to view it as a different sort of assignment operator. It makes more sense for the assignment operator to be between the elements rather than in front (Lisp users excluded!). I have that impression, and it seems so did Michel Martens. I also share his wariness of new constructs. It would seem, at the very least, that using the keyword in this context would change "in" from being a meaningless particle to a meaningful one.

-- David Chen


Ruby needs more keyword clusters.

 list.frob! if foo is in list

Is the word `is' a keyword already? If not, it should be! This is a good opportunity to add it. Not every useful concept out there can be expressed with a single word. :-)

Though this is slightly tongue-in-cheek, I do like the concept of keyword clusters, and I do like `is in'. These two words are often used to transliterate the set element operator into ASCII, and are in fact how most people pronounce said operator.

We would also need to add `is not in', though.

Of course, these expressions could simply be sugar for `list.include? foo' and `not list.include? foo'.

Voting strongly in favor because that's how much I like sugar.

-- Daniel Brockman


Voting strongly in favor. This is how Python does it and it has never caused any confusion. Also, it makes the mentioned idioms much more readable and keeps the OO intact, albeit behind the scenes. Sweet, sweet sugar ;-)

-- Chris R


Current voting

Strongly opposed 7
Opposed 7
Neutral 1
In favor 12
Strongly advocate 13