Last time I asked a simple (but quite hard) Ruby quiz:
1 2 3 4
Before giving the answer, let me give you a bit of background…
In a blog post, Ujihisa was discussing how to compare arrays in Ruby and I was curious about the implementation which deals with recursion.
So what’s recursion you may ask? Just check:
1 2 3
x is an array containing a single element:
x itself. At this point, the choice is yours. You can ask “why should I care?”. I have no good answer and you might as well stop reading now. Or you can say “cool” and read on.
So recursion happens whenever part of an object refers to the object itself. If you’re not careful about it,you can get infinite loops, for instance. For example, if you attempt to compare arrays naively by comparing their elements, you’ll get into trouble:
1 2 3 4 5 6
Can you guess the answer?
Older ruby 1.8.6 raise a StackOverflowError because it uses the naive algorithm of comparing the elements (
xx) over and over.
Current ruby 1.8.7 and 1.9 detect the recursion and say “woah, I don’t want to deal with that, let’s just say they’re different”, so it returns
How is that implemented exactly? Well, any call that can be recursive (like
x.==(xx) in this case) goes through
rb_exec_recursive which keeps track of the receiver (
x) on which the method (
:==) is called. Recursion is detected when an attempt to call the same method is made on the same object. The method
false for recursive cases.
x == x will return still
true, because before the call to rb_exec_recursive,
:== will check if the two objects being compared are the same.
What struck me immediately was the lack of symmetry. It didn’t smell good and it didn’t take long to find a problem.
y = [x] works fine, actually.
y are not the same object, so
rb_exec_recursive, which stores
x in its ‘deja-vu’ list. The elements of
y are examined, and since their are both the same object,
true is returned.
y == x also returns
true. So far so good.
z = [y] are another matter. Again,
y are not the same object, so rb_exec_recursive gets called. It pushes
x on the ‘deja-vu’ list, and compares its elements (x and y). Comparison of
y triggers is considered as recursion, because
x is already on the list. So
x == z returns
But what about
z == x?
x are not the same object, so
z is put on the recursion-list and elements are compared.
x are not the same, so a second call to rb_exec_recursive is made, but
y is not on the list (only
z is at this point) so their elements are compared.
x are the same object and thus the comparison returns
true. In summary:
1 2 3 4 5 6
Fixing this inconsistency is not that difficult. Can you imagine how? Instead of pushing only
x when calling
x.==(y), we need to push the pair
[x, y]. Recursion will be triggered only if
x.==(y) gets called again, but not for
x.==(z). I set out to make a patch in the C code. With the more strict criteria, we get that both
x == z and
z == x return
On the other hand, we still get
false for identical recursive arrays that are built independently, like
I then realized that if we detect a recursion when comparing
xx, it simply means that there is no use in looking further down for differences, so we should return
false. Unless a difference is detected somewhere else, then
xx are equal! This made it possible to compare recursive arrays that have the same contents, even though they were constructed differently:
1 2 3 4 5 6 7 8 9 10 11 12 13
If there is a difference between the arrays (say
x != y), then
xx == y returns
false. If no such ‘path’ exists, then
xx == y.
I was quite happy when my patch was accepted a week ago, so the current head of Ruby 1.9 deals with recursion perfectly and it’s no longer possible that
x == y while
y != x…
Details on redmine.