Mike Vanier (mvanier) wrote,

Ruby critique, part 1

A couple of friends of mine have written a book about Ruby on Rails called RailsSpace. I was a tech reviewer for the book (at least for a while, long story). I was interested in the book because I've always wanted to learn web programming (well, maybe not in the womb, but immediately after that for sure) and Rails seemed like a mature and agile web framework. Of course, I'd have to learn Ruby to understand the code. I've been using Python for many years and I like it; it's clean and (for the most part) simple. Since Python and Ruby are quite similar, I figured I could follow the code without too much work, and I could. However, I was interested enough in the peculiarities of Ruby (those areas where it was markedly different from Python) to start reading the PickAxe book in order to learn the language more thoroughly. I was also considering teaching the language as part of my CS 11 course at Caltech.

I really, really, really wanted to like Ruby. In fact, there are many features of the language that I do like. However, I kept hitting misfeatures that made me cringe and not want to use the language. Recently, I've discovered the Django web framework, which seems to be just about as capable as Rails and written in Python, so my motivation for mastering Ruby has waned considerably. What I'd like to talk about here are the features of Ruby that bug me, as well as the features I like. This is not going to be a FUD-laced diatribe against the language; I just want to point out some features that I consider to be misfeatures in the hope that the Ruby community can wake up and fix things. I think that there is a terrific smaller language inside Ruby struggling to get out, and I'd like to see it get out. But to do that, we have to be able to separate the wheat from the chaff.

Basically, Ruby is a hybrid of two languages: Smalltalk and Perl. Put bluntly, most of the things I like about Ruby come from Smalltalk, and most of the things I don't like about it come (at least philosophically) from Perl. I personally believe strongly in the Python philosophy of "There should be one -- and preferably only one -- obvious way to do it." This is at odds with the Perl philosophy of "There's More Than One Way To Do It", and Ruby has followed Perl's lead in this area. But there's more to it than that. Having multiple ways to do the same thing isn't necessarily a problem (it just means you have to memorize more when learning the language). What I really hate is when the different ways to do the same thing turn out to be different ways to do almost the same thing, with tiny differences that you have to remember, and Ruby seems to be full of these. This is a Bad Thing. Any successful language is almost guaranteed to grow to the point where there is an awful lot to keep in your head when using it (i.e. the conceptual load of the language will increase over time); artificially inflating the conceptual load right from the start isn't wise.
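
One concrete case of "almost the same thing, but not quite" is the pair of logical operator families: the word forms (and, or) and the symbolic forms (&&, ||) compute the same values but sit at different precedence levels relative to assignment. The sketch below is only an illustration; the method names symbolic_operators and word_operators are made up for this example.

```ruby
# Hypothetical helper names; only the operators themselves are Ruby's.

def symbolic_operators
  a = true && false      # parsed as a = (true && false)
  x = nil || "default"   # parsed as x = (nil || "default")
  [a, x]
end

def word_operators
  b = true and false     # parsed as (b = true) and false
  y = nil or "default"   # parsed as (y = nil) or "default"
  [b, y]
end

p symbolic_operators     # prints [false, "default"]
p word_operators         # prints [true, nil]
```

The two methods look interchangeable at a glance, yet they leave different values in their variables, which is the kind of small difference a reader has to keep in mind.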

It reminds me of a time in my life when I would play wargames that had rule books that were 40+ pages long. I rarely finished any of those games. Now I play Go, which has trivially simple rules and yet is a better game than any wargame. OK, it's not a completely fair comparison, but hey, it's my blog.

Let me start by talking about some of the things I like, or at least don't mind, about Ruby. This list is by no means exhaustive.

  • The @ syntax for instance variables and @@ syntax for class variables is intuitive and neat. I don't like typing "self." all the time in Python.
  • The $ syntax for global variables is nice.
  • Using "end" to delimit code blocks of all kinds is clear and readable. It also eliminates the significant-whitespace issue (or at least diminishes it; Ruby uses significant whitespace in other contexts). Often in Python code I see people putting in explicit "end" statements in comments.
  • I like having a symbol type, and the symbol syntax is clean and nice.
  • Being able to drop the parentheses around no-argument methods is nice.
  • The explicit scope operator :: is a good thing. I tend to think that Python overloads the "." operator more than I care for.
  • I don't mind the use of the < operator for inheritance.
  • I like the fact that object-orientation is more pervasive (or at least feels more pervasive) in Ruby than in Python. Many operations that are handled by functions (in some cases, overloadable functions) in Python are handled by normal method calls in Ruby.
  • I like having a lightweight block syntax. Python's lambda expressions are bulky and don't mesh well with the whitespace-sensitive syntax.
  • I like the metaprogramming hooks to add accessors.
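
Several of the items above fit into one short sketch. The class names Point and LabeledPoint and the method norm_squared are hypothetical, chosen only for illustration; the syntax is standard Ruby.

```ruby
# Hypothetical classes illustrating the features listed above.
class Point
  attr_accessor :x, :y       # metaprogramming hook: defines x, x=, y, y=

  @@count = 0                # class variable, shared by all Points

  def initialize(x, y)
    @x = x                   # instance variables: no "self." required
    @y = y
    @@count += 1
  end

  def self.count
    @@count
  end

  def norm_squared
    @x * @x + @y * @y
  end
end

class LabeledPoint < Point   # "<" for inheritance
  attr_reader :label

  def initialize(x, y, label)
    super(x, y)
    @label = label           # typically a symbol such as :corner
  end
end

pt = LabeledPoint.new(3, 4, :corner)
puts pt.norm_squared                      # no parentheses needed; prints 25
puts Point.count                          # prints 1
puts [1, 2, 3].map { |n| n * 2 }.inspect  # lightweight block; prints [2, 4, 6]
puts Math::PI > 3                         # :: as scope operator; prints true
```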

There are other things I like as well, but this is supposed to be a blog post, not a journal article.

Now on to things I don't like. If I've made mistakes here, it's due to my misreading of the PickAxe book. In other words, it's all their fault. Again, this is not an exhaustive list.

  • I don't like that unqualified identifiers (usually) mean method calls to self, unless there's a local variable of the same name, in which case the local variable has precedence. So inside a class, if you want to distinguish "foo" the method call (to self) from "foo" the local variable (which has precedence), you have to write "self.foo", which is clunky. This is where Python's more uniform treatment pays off -- I never have to look at a name and go "is that a local variable or an attribute of the class?". Everything works the same way.

  • I don't like that identifier rules for method names are different from those for local variables.

  • Being able to use any kind of delimiters for the general delimited input syntax for special literals seems unnecessarily general, though it does make regular expressions look more like what we're used to. Frankly, I think having special syntactic support for regular expressions is completely unnecessary anyway.

  • Being able to leave off parentheses for methods with arguments can be visually nice on occasion, but it is easily abused. I've come to the conclusion that this is a misfeature.

  • I don't like that there are two completely different syntaxes for blocks (curly braces and do/end), and (worse) that they have different precedences. It's all so arbitrary and unnecessary.

  • I don't like that the additional block argument to a method (block as coroutine) needs to start on the same line as the method call, though I can imagine that it's necessary to make the parsing unambiguous. (Can you say "slippery slope"? I knew you could.) I also don't like the fact that you can alternately give an extra Proc argument prefixed with an ampersand, though again I can see why this might be useful. Frankly, the rules for blocks are just way too complicated, especially considering that all blocks should have been is a lightweight way to specify anonymous functions. I think the extra-block-argument-as-coroutine concept was put in mainly to avoid having to put the block inside a comma-separated, parentheses-delimited list of arguments, where it would look syntactically odd (there's the slippery slope again: making semantic decisions to make the syntax nicer).

  • Methods can only take ONE block argument (!) which is auto-converted to a Proc value (automatic conversions are evil!). It also requires special syntax (yuck!). Note that blocks are not first-class (though Proc objects are, being objects). There is a possibility of confusion between a true block argument and an additional block parameter used as a coroutine, especially if parentheses are left off. What a mess!

  • The rules for unpacking arrays are complicated and annoying.

  • The funky global variables like $! etc. are not horrible (except syntactically) but I would be happier if they weren't there. Are they really used so much that it's necessary to give them tiny symbolic names that clash with the normal rules for Ruby identifiers? I also don't like functions that auto-assign to $_.

  • if/then expressions: being able to use "then" or ":" or a line break to indicate the beginning of the first clause is ridiculously redundant. This also applies to unless and when expressions. It also applies to while/until/for expressions, though there the choice is between do/:/[newline]. Is any of this necessary?

  • The rules for array references preceded by asterisks in case statements are ridiculous.

  • There is no need for both the "lambda" and "proc" keywords. ARE they keywords?

  • Procs created with Proc.new and with lambda have different semantics! They should be different objects if that's the case!

  • I would only have allowed the => syntax in hashes rather than also allowing the comma-separated syntax.

  • The block scoping rules are weird, as many people have already pointed out.

  • The parallel assignment rules are WAY too complicated, and for very little benefit.

  • There is no need for two different logical and/or/not operators with different precedences.

  • Having two different range operators (.. and ...) is unnecessary. Allowing ranges for boolean expressions is downright stupid.

  • I don't like if and unless modifiers, or while and until modifiers. They are unnecessary.

  • The slight semantic differences between for loops and each method invocations are lame.

  • I don't see why we need all of break, redo, next and retry. break and next should be sufficient (as they are in other languages).

  • I don't get why we need throw/catch at all. It seems like just a hack to get a non-local goto.
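
A few of the complaints above are easier to see in code. What follows is a minimal sketch, not taken from the PickAxe book: every method and class name in it (describe, Shadow, and so on) is invented for illustration, and it shows how current Ruby releases behave, which may differ in detail from the version the book describes.

```ruby
# 1. Proc.new and lambda differ in what `return` means.
def return_from_lambda
  l = lambda { return :from_lambda }
  l.call
  :after_lambda              # reached: `return` only leaves the lambda
end

def return_from_proc
  pr = Proc.new { return :from_proc }
  pr.call                    # `return` leaves the enclosing method
  :after_proc                # never reached
end

# ...and in how strictly they check arguments.
def lambda_rejects_missing_argument?
  lambda { |a, b| [a, b] }.call(1)
  false
rescue ArgumentError
  true
end

def proc_pads_missing_argument
  Proc.new { |a, b| [a, b] }.call(1)
end

# 2. Curly braces and do/end attach the block to different calls.
def describe(arg)
  block_given? ? "block went to describe" : "describe got #{arg.inspect}"
end

def braces_version
  describe [1, 2].map { |n| n * 2 }   # block binds to map
end

def do_end_version
  describe [1, 2].map do |n|          # block binds to describe
    n * 2
  end
end

# 3. A bare name is a method call until a local of that name is assigned.
class Shadow
  def size
    10
  end

  def report
    before = size            # method call on self
    size = 3                 # from here on, `size` is a local variable
    [before, size, self.size]
  end
end

p return_from_lambda                  # prints :after_lambda
p return_from_proc                    # prints :from_proc
p lambda_rejects_missing_argument?    # prints true
p proc_pads_missing_argument          # prints [1, nil]
p braces_version                      # prints "describe got [2, 4]"
p do_end_version                      # prints "block went to describe"
p Shadow.new.report                   # prints [10, 3, 10]
```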

That's all I've got for now, though I'm sure there are other things as well. I think Ruby really needs a serious cleanup, though given the philosophy of the language, I doubt that it'll happen any time soon. For my part, I'm glad that I can use Python instead (not that Python has no flaws, but that's a topic for another time).

And all of this in a language which supposedly adheres to the "Principle of Least Surprise"! I sure found a lot of things that surprised me. What I want are languages which (in Occam's words) "do not multiply entities without necessity". Don't add stuff to the language just because you can -- add it because there is a big payoff in doing so. The introduction to the Scheme language standard puts it well:

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

So, for instance, the next major revision of Python is actually going to remove some misfeatures from the language (like the ever-so-convenient print statement) because they complicate the language unnecessarily with little benefit. That's the kind of philosophy I like in computer languages. Add stuff to your language, sure, but be careful what you add; only add things that give a big payoff, not things that make the syntax a bit more comfy in isolated cases.
