Dry migrations

I wanted to write a post about the many things that should be fixed with Rails.

Interestingly, Rails 3.1 fixes quite a few of these.

At last, jQuery takes over from Prototype. Prototype was nice and didn't exactly solve the same problem, but in my experience jQuery is mandatory for developing anything decent. Same thing for Sass, and I'm glad they have corrected the mistake of the default Sass location (which used to be /public/stylesheets/sass when it had to be in /app somewhere). Handling assets was also sorely missing; I've used Sprockets before and it's a fine choice.

I'm happily surprised at CoffeeScript. I've also been using it but I didn't expect it to become the default, especially given the fact that it's quite young and I'd argue it's a much bolder move than using Haml. I have no idea as to why Haml doesn't also come standard.

It's interesting that we are now targeting the web platform without writing anything directly in it: using Haml instead of HTML, Sass instead of CSS, CoffeeScript instead of JavaScript (and accessing the DOM more often via jQuery than directly).

The last goodie is DRY migrations. I find it irritating to write most migrations as I'd really like to generate them automatically from a change to the schema, maybe because my ancient development tool 4D gave me that 25 years ago...

I'd rather write the schema in the model (where it belongs IMO) and generate a "diff" as a migration, but at the very least I wanted to avoid writing the drop_table and remove_column that always correspond one to one with create_table and add_column.

I was actually looking at the code to see where one could have automatically undoable migrations, as it is much easier than my dream solution, and lo and behold, we can now do this!



Much better. Hopefully we'll soon be able to specify :from => ... when issuing change_column_default or similar so that they become undoable too.
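To see why some commands can be undone automatically while others need more information, here is a minimal plain-Ruby sketch of the idea. It is not Rails' actual implementation: the `ToyRecorder` class, its `record` and `inverse` methods and the `INVERSES` table are all hypothetical names made up for illustration.

```ruby
# Toy illustration only (hypothetical names, not Rails code):
# record migration-style commands, then derive the commands that undo them.
class ToyRecorder
  class Irreversible < StandardError; end

  # Commands whose inverse can be derived from their arguments alone.
  INVERSES = {
    create_table: ->(name, *)          { [:drop_table, name] },
    add_column:   ->(table, column, *) { [:remove_column, table, column] }
  }.freeze

  def initialize
    @commands = []
  end

  def record(name, *args)
    @commands << [name, *args]
  end

  # Undo in reverse order; fail for commands that lose information.
  def inverse
    @commands.reverse.map do |name, *args|
      inverter = INVERSES.fetch(name) { raise Irreversible, "cannot undo #{name}" }
      inverter.call(*args)
    end
  end
end

recorder = ToyRecorder.new
recorder.record(:create_table, :users)
recorder.record(:add_column, :users, :age, :integer)
p recorder.inverse
# => [[:remove_column, :users, :age], [:drop_table, :users]]

# The previous default is not part of the arguments, so nothing can be derived:
recorder.record(:change_column_default, :users, :age, 18)
# recorder.inverse would now raise ToyRecorder::Irreversible
```

In this sketch, a `:from` value among the arguments is exactly what an inverter for `change_column_default` would need.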

I still have a couple of gripes on my list. In no particular order:

Haml


Default template

Way too basic. There should be a basic solution for the page title (that isn't a static title!), default content_for, etc... Easy to do yourself, but why not encourage a standard convention?

test environment & fixtures

Also too basic. I find fixtures longer to generate and harder to maintain when the schema changes compared to factory-based data.

config/database.yml

It has the wrong idea in mixing important production information with less important and more local information for the test & dev environments. I've always had problems with source control and that file because I stick with SQLite for dev/test while other developers prefer other DBs.

Yaml

Now that I think of it, I'm not sure there should be any yml files in a Rails project. The gain over a strictly Ruby file is minimal, even more so in Ruby 1.9.2, and it's just less flexible. It also encourages crazy stuff like Cucumber's yml config file with ERB in it.

MVC...L?

Maybe it's just me, but I like to write separate functionality that acts like a library. It doesn't fit as a Model, so I stick that code in /lib, with the caveat that there is no default structure and that it neither autoloads nor reloads automatically. It should probably go in app/lib or similar.

Fingers crossed for Rails 3.2!

Method_missing, politely

In their Polite Programmer talk at Rubyconf, Jim Weirich and Chris Nelson pointed out that merely adding some behavior with method_missing wasn't quite polite, as shown below:
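The listing that belonged here did not survive, so what follows is a minimal sketch of the kind of code being discussed. The `StereoPlayer` class, the `PLAY` pattern and the returned string are assumptions for illustration; only the method name `play_some_Beethoven` comes from the text.

```ruby
# Hypothetical class: answers any play_* call through method_missing.
class StereoPlayer
  PLAY = /\Aplay_(\w+)\z/

  def method_missing(name, *args, &block)
    if name.to_s =~ PLAY
      "Here's #{$1}"
    else
      super
    end
  end
end

player = StereoPlayer.new
player.play_some_Beethoven               # => "Here's some_Beethoven"
player.respond_to?(:play_some_Beethoven) # => false, which is not very polite
```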


In order for respond_to? to return true, one can specialize it, as follows:
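A sketch of that specialization, again with a hypothetical `StereoPlayer` class and `PLAY` pattern that are not taken from the talk:

```ruby
class StereoPlayer
  PLAY = /\Aplay_(\w+)\z/

  def method_missing(name, *args, &block)
    if name.to_s =~ PLAY
      "Here's #{$1}"
    else
      super
    end
  end

  # Specialize respond_to? so that it agrees with method_missing.
  def respond_to?(name, include_private = false)
    name.to_s.match?(PLAY) || super
  end
end

player = StereoPlayer.new
player.respond_to?(:play_some_Beethoven) # => true
player.respond_to?(:rewind)              # => false
```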


This is better, but it still doesn't make play_some_Beethoven behave exactly like a method. Indeed:
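One way to see the remaining gap, assuming the same hypothetical `StereoPlayer` with a specialized `respond_to?`: asking for the method object still fails.

```ruby
class StereoPlayer
  PLAY = /\Aplay_(\w+)\z/

  def method_missing(name, *args, &block)
    name.to_s =~ PLAY ? "Here's #{$1}" : super
  end

  def respond_to?(name, include_private = false)
    name.to_s.match?(PLAY) || super
  end
end

player = StereoPlayer.new
player.respond_to?(:play_some_Beethoven) # => true

begin
  player.method(:play_some_Beethoven)
rescue NameError => e
  # Object#method does not consult a specialized respond_to?
  puts "NameError: #{e.message}"
end
```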


Ruby 1.9.2 introduces respond_to_missing? that provides for a clean solution to the problem. Instead of specializing respond_to? one specializes respond_to_missing?. Here's a full example:
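The original listing is missing; a minimal sketch under the same assumptions (the `StereoPlayer` class and `PLAY` pattern are hypothetical) could look like this:

```ruby
class StereoPlayer
  PLAY = /\Aplay_(\w+)\z/

  def method_missing(name, *args, &block)
    if name.to_s =~ PLAY
      "Here's #{$1}"
    else
      super
    end
  end

  # Specialize respond_to_missing? and leave respond_to? alone.
  def respond_to_missing?(name, include_private = false)
    name.to_s.match?(PLAY) || super
  end
end

player = StereoPlayer.new
player.play_some_Beethoven                   # => "Here's some_Beethoven"
player.respond_to?(:play_some_Beethoven)     # => true
beethoven = player.method(:play_some_Beethoven) # no NameError this time
beethoven.call                               # => "Here's some_Beethoven"
```

Both `respond_to?` and `method` go through `respond_to_missing?`, so a single definition keeps them consistent.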

Fixing MRI, a dozen steps at a time.

Is there a term like bugfield? You know, when every time you take a couple of steps in a code base you encounter a different bug, which leads to another one, ..., like a minefield of bugs?

Here was my last sequence in Ruby (MRI)...

Main goal: improve Matrix#determinant and #rank, after a suggestion from Yu Ichino. The bulk of the work took me quite a while, as I had to check a bunch of things, understand the algorithm, do some performance testing, etc...

When modifying Matrix#rank to use this different approach, I take the opportunity to improve the styling. A variable name of ii is not as clear as row, and... it actually reveals that something is amiss because that loop goes up to the number of columns, not rows...

1) So I find a minimal test case to convince myself I'm not mistaken. Yup, a simple 3x2 matrix has the wrong rank. I add that to the spec and fix Matrix#rank. When cleaning up, I make sure that Matrix#regular? and Matrix#singular? are using the right determinant function and not a bad variant that's now deprecated.

Turns out they are checking the rank of the matrix, which is not as efficient but more importantly...

2) they both return false if the matrix is not square. This doesn't make much mathematical sense.

Since I'm now the happy maintainer of the lib and I am confident there is no other reasonable solution, I have them raise an error for rectangular matrices. This means specs are either wrong or incomplete in Rubyspec, though, so I check them out...
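As a rough check of where steps 1 and 2 ended up, here is what I'd expect from the `matrix` library in a recent Ruby. The particular 3x2 matrix is made up for illustration and is not the test case from the spec; on newer Rubies `matrix` ships as a bundled gem, so this assumes it is installed.

```ruby
require 'matrix'

tall = Matrix[[1, 0], [0, 1], [1, 1]] # 3 rows, 2 columns
tall.square? # => false
tall.rank    # => 2

# regular? and singular? only make sense for square matrices.
begin
  tall.regular?
rescue ExceptionForMatrix::ErrDimensionMismatch => e
  puts "regular? raised #{e.class}"
end
```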

3) Turns out RubySpec is incomplete for those, so I specify what error should be raised in case of a rectangular matrix. Double-checking my change by running the specs gives me 0 assertions. Oups?

Turns out that the guard I wrote to signify this was a bug never passes. Ah, right, ruby_bug "", "1.9" means "this is a bug present in the whole 1.9.x line", so it will not be executed until Ruby 2.0!

My bad, but the program to run the specs shouldn't allow that though, so...

4) Discussion with Brian Ford, the maintainer of RubySpec. Good thing he's always on IRC. Anyways, he might put in a max version to avoid such nonsense in the future. Meanwhile...

5) A quick search in RubySpec reveals about half a dozen such bad guards, so I set about fixing each one, and...

6) One of the specs that was not guarded properly fails for the latest Ruby trunk. It's not clear it's a bug though. At least for me, as I've never tried to open the singleton class of a Bignum!

So I investigate, try a couple of things, and yeah, the more I dig, the more it looks like a bug, so I open an issue to confirm with ruby-core. There's one spec left...

7) The last spec shows clearly a small bug in String#sub! so I fix that in MRI... and I realize that the error message for the wrong number of parameters is misleading.

8) It takes about a microsecond to fix that error message. A quick find reveals other similar error messages in the MRI code. A quick review leads to... 18 issues of all sorts. Some more inaccuracies, some uninformative messages, some that don't follow the standard format and typos in the doc.

9) I fix all of these too. Ideally this should be refactored, but I'm getting tired. Yet I'm still awake enough to realize that one more method has the wrong doc...

10) From the code, I gather that the interface for SignalException.new is a bit more complex than advertised. I supplement the doc as best as I can.

Ouf, I'm done. Double check the commit... arghh, there's another method that accepts an undocumented extra parameter, so...

11) That extra param is a bit odd. Looks like you can build a regexp with a third parameter equal to "n" or "N" and the encoding switches to binary. Other values will get you a warning, and any letter after the "n" will be ignored. Smells like legacy.

git blame traces the change back several years, giving me a reference to the ruby-dev list. Lucky me, it's not in Japanese and refers to uri/common.rb. A quick check finds no Regexp.new with that third argument there. Ah, there's a Regexp.new(HEADER_PATTERN, 'N') in uri/mailto. The 'N' doesn't mean binary, though, since it's in second place (so it means "case insensitive", as would true), which....

12) is a bug; the regexp is already case insensitive so that 'N' has no effect. I don't understand well enough what an extra "N" really does to be sure whether it can be removed (since it doesn't have any effect right now) or should be put in third position.

I'm a bit dizzy. I should really go to sleep. Even though this is all pretty minor, I fire a redmine issue about the doc and another one about the lib and go to bed...

And I thought fixing Matrix#regular? would be trivial...

Best time to get involved in ruby-core

Apart from enjoying the summer, I've spent time hacking on MRI, especially since I've been accepted as a committer. The feature freeze for Ruby 1.9.2 was planned for yesterday, but a couple of days before, it was pushed back. Rejoice!

Why? The reason stated was that the next version of Ruby will, for the first time ever, pass the RubySpec. This makes RubySpec the official meeting point for all Ruby implementations, not just Rubinius (the originator of RubySpec), JRuby and others. This should also give a bit more time to decide on a couple of new features that might make it in 1.9.2.

Much work has been done to have the specs meet MRI 1.9.x and the language and core sections only have a couple of failures [1]. Most are due to cases for which the best decisions still have to be figured out. I'll remind you that it's easy to gain commit access to RubySpec: any accepted patch grants you your commit bit.

There is still quite a bit of work to be done spec'ing the libraries. Actually there's a lot of work to be done in the libraries themselves. Some are quite badly maintained, others don't even have an official maintainer. And that's all about to change, hopefully!

It was announced yesterday that being a maintainer is no longer for life. Not doing anything about open issues? Sorry, we'll get someone else to take care of it. Many libraries currently have no maintainer and there should be many others that won't be claimed in the confirmation process.

Feeling competent to maintain a library? You talk using only sockets? You dream in yaml? Might as well apply to maintain your favorite lib...

I sincerely hope 1.9.2 kicks some serious ass. It's bound to be the version of Ruby 1.9 that most people will use and target for the first time. More reason to get it right!



[1] Actually, the bulk of the work was spec'ing Ruby 1.8.6 under the supervision of Brian Ford. I helped finish the specs for 1.8.7 and the mysterious and tireless Run Paint Run Run did most of the 1.9 specific specs. Spec'ing Ruby usually leads to finding bugs or asking for clarifications. Indeed, Run Paint opened more issues on redmine than any other user!

A stickler in Silicon Valley

I have not been actively looking for a job yet. Nevertheless, I was contacted by a startup and invited to spend a week in Silicon Valley / San Francisco, hacking around with them to see if I could become part of their team, which I found quite flattering. I learned lots of new things in California. A couple of new words too. I'm still unsure as to what exactly a hipster is, but "stickler" was easier to grasp: one who insists on exactness or completeness in the observance of something.

It was fascinating to witness the startup culture. Tens of thousands of users is considered a small test bed; the target is millions. Every newcomer on the web scene is analysed & probed. There was technology and technology talk everywhere. It seems like everyone in the Bay Area has an iPhone. And I mean everyone! Lacking a decent map of the city, I asked two random strangers for directions and both dug out their iPhone to help me out. When I needed to call someone I was meeting, I asked another stranger if I could use his phone. It was an iPhone, of course, and after a thorough examination to estimate the chances I'd run away with it, he graciously let me use it. I found people particularly nice too, although maybe my tourist status helped, I don't know.

My timing for the trip was great because Brian Ford and Evan Phoenix were also in town and invited me to have a drink. It turns out the monthly SF Ruby meetup was on that very same day, so I met them there. I'd say the crowd was about three times that of a typical Montreal.rb meet. There were other noticeable differences too. Many people were part of pretty exciting projects and companies (EngineYard, GitHub, PeepCode and the like). Chris Wanstrath (of GitHub) presented his newest gem rip, while Mike Dirolf was presenting his mongoDB project. Three people stood up announcing they were looking for developers, which has yet to happen in Montreal... I guess recession doesn't have the same meaning in the Valley.

Back to Palo Alto and the startup. I realized a couple of things there. I really enjoy thinking about what a product could look like, how it should be presented to users. Finding ways to improve it by analysing its use is something I've never had the chance to do and is quite appealing. On the other hand, I somehow assumed that the "Joel" approach would be a sine qua non for an ambitious startup: hire the best, only the best, give them the best tools and let them loose.

It turns out that when considering what a good programmer is, different qualities can be given different weights. Most will agree that getting things done is the main one. Without it, not much can save you. As a reflection of my values though, I expected that embracing standards, learning the available tools and applying principles like DRY, refactoring, etc..., was also part of it. That's apparently not the case, and that's why we all realized I wouldn't mesh as nicely as we hoped in their startup.

I couldn't help but notice that all the Rails programmers were Windows guys. Except one; he is a Linux guy and although I didn't have the chance to really work with him, he gave me a really good impression. I'm ready to bet his values are more aligned with mine. The HTML/css/design expert was the only Mac guy and I could not have agreed more with his opinions and point of view. So is there a Windows/Mac divide? Something like "Get things done" vs "Design it well so it just works"?

Nah. Things are never that simple, as I was reminded when taking part in the interview of a Mac guy who clearly didn't care for DRY or nice tools like named scopes, despite otherwise decent technical skills. So no, I just have to face the fact that, for better or for worse, I'm a stickler for getting things done well.

Update: My friend Pascal suggested this be related to an Engineer/Scientist divide: using tools vs understanding them; making things work vs comprehension through abstraction. Interesting idea.

Lost in recursion

Last time I asked a simple (but quite hard) Ruby quiz:

Before giving the answer, let me give you a bit of background...
In a blog post, Ujihisa was discussing how to compare arrays in Ruby and I was curious about the implementation which deals with recursion.

So what's recursion you may ask? Just check:

x is an array containing a single element: x itself. At this point, the choice is yours. You can ask "why should I care?". I have no good answer and you might as well stop reading now. Or you can say "cool" and read on.
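The snippet that went with this is missing; a minimal way to build such an array and poke at it:

```ruby
x = []
x << x               # x now contains itself

x.size               # => 1
x[0].equal?(x)       # => true: the single element is x itself
x[0][0][0].equal?(x) # => true, however deep you go
x.inspect            # => "[[...]]"
```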

So recursion happens whenever part of an object refers to the object itself. If you're not careful about it, you can get infinite loops, for instance. For example, if you attempt to compare arrays naively by comparing their elements, you'll get into trouble:

Can you guess the answer?
Older Ruby 1.8.6 raises a SystemStackError because it uses the naive algorithm of comparing the elements (x and xx) over and over.
Current Ruby 1.8.7 and 1.9 detect the recursion and say "woah, I don't want to deal with that, let's just say they're different", so they return false.
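The naive algorithm can be sketched in plain Ruby. `naive_equal?` is a hypothetical helper written for this illustration, not MRI's C code:

```ruby
# Compare arrays element by element, with no protection against recursion.
def naive_equal?(a, b)
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  a.size == b.size && a.each_index.all? { |i| naive_equal?(a[i], b[i]) }
end

x  = []; x  << x
xx = []; xx << xx

naive_equal?([1, [2]], [1, [2]]) # => true, fine for ordinary arrays

begin
  naive_equal?(x, xx)            # compares x[0] and xx[0], i.e. x and xx, forever
rescue SystemStackError
  puts "stack level too deep"
end
```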

How is that implemented exactly? Well, any call that can be recursive (like x.==(xx) in this case) goes through rb_exec_recursive which keeps track of the receiver (x) on which the method (:==) is called. Recursion is detected when an attempt to call the same method is made on the same object. The method :== returns false for recursive cases.

Note that x == x will still return true, because before the call to rb_exec_recursive, :== will check if the two objects being compared are the same.

What struck me immediately was the lack of symmetry. It didn't smell good and it didn't take long to find a problem.

Comparing x and y = [x] works fine, actually. x and y are not the same object, so :== calls rb_exec_recursive, which stores x in its 'deja-vu' list. The elements of x and y are examined, and since they are both the same object, true is returned. y == x also returns true. So far so good.

Now x and z = [y] are another matter. Again, x and z are not the same object, so rb_exec_recursive gets called. It pushes x on the 'deja-vu' list, and compares the elements (x and y). The comparison of x and y is considered recursion, because x is already on the list. So x == z returns false.

But what about z == x? z and x are not the same object, so z is put on the recursion-list and elements are compared. y and x are not the same, so a second call to rb_exec_recursive is made, but y is not on the list (only z is at this point) so their elements are compared. x and x are the same object and thus the comparison returns true. In summary:
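The behaviour described above can be reproduced with a small plain-Ruby model. `old_equal?` and its `seen` list are hypothetical stand-ins for :== and the 'deja-vu' list of rb_exec_recursive; this mimics the description in this post, not the actual C code, and a current Ruby no longer behaves this way.

```ruby
# Model of the old check: only the receiver is remembered,
# and a detected recursion counts as "different".
def old_equal?(a, b, seen = [])
  return true if a.equal?(b)                     # same object: no need to recurse
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  return false unless a.size == b.size
  return false if seen.any? { |o| o.equal?(a) }  # recursion detected on receiver
  seen.push(a)
  begin
    a.each_index.all? { |i| old_equal?(a[i], b[i], seen) }
  ensure
    seen.pop
  end
end

x = []; x << x # x = [x]
y = [x]
z = [y]

old_equal?(x, y) # => true
old_equal?(y, x) # => true
old_equal?(x, z) # => false
old_equal?(z, x) # => true, hence the asymmetry
```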

Fixing this inconsistency is not that difficult. Can you imagine how? Instead of pushing only x when calling x.==(y), we need to push the pair [x, y]. Recursion will be triggered only if x.==(y) gets called again, but not for x.==(z). I set out to make a patch in the C code. With the more strict criteria, we get that both x == z and z == x return true.
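In the same plain-Ruby model (hypothetical names again, and only a sketch of the idea rather than the C patch itself), remembering pairs looks like this. It also treats a detected recursion as "no difference found on this path", which is the second half of the fix and returns true instead of false:

```ruby
# Model of the stricter check: remember [receiver, argument] pairs.
def paired_equal?(a, b, seen = [])
  return true if a.equal?(b)
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  return false unless a.size == b.size
  # Already comparing this exact pair further up: nothing new to learn here.
  return true if seen.any? { |l, r| l.equal?(a) && r.equal?(b) }
  seen.push([a, b])
  begin
    a.each_index.all? { |i| paired_equal?(a[i], b[i], seen) }
  ensure
    seen.pop
  end
end

x  = []; x  << x
xx = []; xx << xx
y = [x]
z = [y]

paired_equal?(x, z)  # => true
paired_equal?(z, x)  # => true
paired_equal?(x, xx) # => true
```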

On the other hand, we still get false for identical recursive arrays that are built independently, like x and xx.

I then realized that if we detect a recursion when comparing x and xx, it simply means that there is no use in looking further down for differences, so we should return true, not false. Unless a difference is detected somewhere else, then x and xx are equal! This made it possible to compare recursive arrays that have the same contents, even though they were constructed differently:

If there is a difference between the arrays (say x[0][1][0] != y[0][1][0]), then xx == y returns false. If no such 'path' exists, then xx == y.
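With a current Ruby this can be tried directly. The arrays `a` and `b` below are made up for illustration and are not the ones from the missing listing:

```ruby
x  = []; x  << x
xx = []; xx << xx
x == xx        # => true

a = [1]; a << a         # a = [1, a]
b = [1, [1]]; b[1] << b # b = [1, [1, b]]
a == b         # => true: same contents, built differently
b == a         # => true

b[1][0] = 2    # introduce a difference somewhere down one path
a == b         # => false
b == a         # => false
```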

I was quite happy when my patch was accepted a week ago, so the current head of Ruby 1.9 deals with recursion perfectly and it's no longer possible that x == y while y != x...

Details on redmine.

A schizo Ruby puzzle

Quick quirky quiz (schizo version)

Here's how I came to check out Ruby's source and stumble upon that.

Age of Innocence


This is all Mathieu's fault. He asked innocently if my backports gem was compatible with Rails. I thought "duh! of course!". After all, it's meant to be compatible with any Ruby code.

Of course, he was right, there were bugs. Hundreds of tests were failing! Turned out to be two bugs. It dawned on me that my small bunch of unit tests was not even close to being enough. I really needed to test some more.

So I set out to test it on JRuby. I found a bug, but it was JRuby's this time. It was easy to circumvent though, so "JRuby compatibility: check".

How about Rubinius? Well, that's where the story really begins... Rubinius is a bit different because most of the builtin library is written in Ruby and many methods use other core methods. That won't make a difference for you, until you fiddle with core methods. For example I was redefining String#upto by calling Range#each. Kosher in MRI, but Rubinius' Range#each handles string ranges by calling... String#upto!

There were other problems though, because Rubinius was doing all sorts of stuff it wasn't really supposed to do. And because Rubinius is mostly Ruby, it was easy for me to fix. Or should I say tempting to fix? I have difficulty resisting that kind of temptation, so I submitted my first patch and eagerly awaited my commit access (granted to anyone who submits a patch)...

Eye Opener


I discussed a bit with Evan Phoenix, the creator of Rubinius, about 'backports' and told him I'd build it into Rubinius, avoiding a bunch of alias_method_chain. I thought it would be dirt quick. That is, until I started.

See, to change things in Rubinius, you first start by showing they're broken. And to do that, enter RubySpecs. It's a huge collection of tests that check if what you're running works as expected. Or as MRI runs it, should I say. You knew that Ruby has no official spec, right?

With the help of Brian Ford, I started to modify my first RubySpecs. That's when I realized there were so many questions I never asked myself! Time for another quiz, this time with answers (just click on what you think is right!)

# Assume we have:
class MyArray < Array ; end
foo = MyArray.new
# What is the class of each of these: MyArray or Array?
foo.to_ary
foo.to_a
Array.try_convert(foo)
foo.dup
(foo+foo)
(foo*2)
foo.pop(2)
foo.shift(2)
foo[0..2]
foo.slice(0,2)
foo.slice!(0,2)
foo.first(2)
foo.sample(2)
foo.flatten
foo.product
foo.combination(1).first
foo.shuffle


Some are intuitive, like #shuffle, some less so, like #+. I wonder how you're going to do, because I think I did worse than a monkey would by guessing randomly!
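Since the clickable answers are gone, one way to check is to ask your own Ruby. The answers have changed between Ruby versions, so this sketch prints them instead of asserting any. It uses a non-empty `foo` (an assumption that differs from the quiz) so that calls like `combination(1).first` return an array rather than nil, and it builds a fresh `foo` for each line because some calls mutate the receiver.

```ruby
class MyArray < Array; end

QUIZ = {
  "foo.to_ary"               => ->(foo) { foo.to_ary },
  "foo.to_a"                 => ->(foo) { foo.to_a },
  "Array.try_convert(foo)"   => ->(foo) { Array.try_convert(foo) },
  "foo.dup"                  => ->(foo) { foo.dup },
  "(foo+foo)"                => ->(foo) { foo + foo },
  "(foo*2)"                  => ->(foo) { foo * 2 },
  "foo.pop(2)"               => ->(foo) { foo.pop(2) },
  "foo.shift(2)"             => ->(foo) { foo.shift(2) },
  "foo[0..2]"                => ->(foo) { foo[0..2] },
  "foo.slice(0,2)"           => ->(foo) { foo.slice(0, 2) },
  "foo.slice!(0,2)"          => ->(foo) { foo.slice!(0, 2) },
  "foo.first(2)"             => ->(foo) { foo.first(2) },
  "foo.sample(2)"            => ->(foo) { foo.sample(2) },
  "foo.flatten"              => ->(foo) { foo.flatten },
  "foo.product"              => ->(foo) { foo.product },
  "foo.combination(1).first" => ->(foo) { foo.combination(1).first },
  "foo.shuffle"              => ->(foo) { foo.shuffle }
}.freeze

puts "Ruby #{RUBY_VERSION}"
QUIZ.each do |label, expression|
  foo = MyArray.new([1, [2], 3])
  puts format("%-26s %s", label, expression.call(foo).class)
end
```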

The complexity and amount of detail found in RubySpecs was a real eye opener. The fact is, often you won't care about that level of detail about the implementation. But inevitably some people will.

So far I've ported all 1.8.7 Array methods and I'm working on the rest. Writing the specs is usually a bit longer than the implementation and damn difficult to get right. Well, at least for me; luckily there are people like Ujihisa that fix my specs minutes after I commit them.

It's because of a question he asked that I had to refer to the Ruby C source and realized there was a potential problem like the x == y but !(y == x).

That cost me a bunch of hours today, because fixing it was another of those challenges I can hardly refuse, even if I had to delve into the C code!

Next blog entry: update on that bug, along with the solution (unless someone posts them in the comments)!

Thanks to Brian Ford and Evan Phoenix for their help and Ujihisa for pointing me to the complexity of the <=> operator he calls the spacecraft operator. And yeah, to Mathieu Houle for his damn question! ;-)