# Hash

Why I won't squash my commits

2014-02-05T00:27:00-05:00

tl;dr

While I really appreciate the constructive comments I get on pull requests I make, there’s one type of comment I have a hard time with.

Please don’t ask me to squash the commits of my pull requests.

Don’t ask me because, with all due respect to the amazing work from committers, I won’t do it.

Unless I’ve made an actual mistake, if my pull request has 5 commits, it is because each of them is independent and I feel they should remain so.

If they really want to, project committers can manually squash and merge the commits, wait for someone else to make a PR with the commits squashed or even reject the PR.

Call it lunacy or pride, but I just won’t squash my commits. Hopefully committers won’t take offense at that as I’ll do my best to not take offense at their suggestion that I didn’t segment my commits correctly.

Why do committers ask to squash commits?

I imagine the historical reason is that many contributors aren’t super comfortable with git, and git rebase -i in particular, so their commits represent their train of thought, not a sequence of independent changes to apply. E.g.:

Introduce
Oops, fix typo
Oh oh, fix a bug of
Fix bug of (for real this time)

For these contributors, it is mandatory that the commits are not accepted as is. Squashing the whole thing is usually the safest bet.

There are many contributors that use git rebase -i and git commit -p in their sleep though, so that won’t be the situation.

The other reason I’ve been told is that “we generally squash all commits so that it’s easy to backport/revert”. I can’t agree with that. First, it is much more difficult to backport/revert just part of the PR if it is squashed. Moreover, it’s pretty easy to backport/revert the whole PR by using git revert and git cherry-pick either with a range of commits or the merge commit. Less than 2% of commits get reverted anyways.

When should committers not ask to squash commits?

If all commits can stand on their own, i.e. all tests pass after each individual commit, then the commits are atomic and do not need to be squashed. I’d even say they probably shouldn’t be squashed.

My commits are typically the smallest unit of change that will work and still pass all tests. The main exception is a commit of a bunch of trivial changes that are isolated (e.g. removing trailing whitespace, fixing a bunch of typos in the doc, renaming a local variable). Even then, I won’t commit doc typos and the renaming of a variable together, say.

When doing a refactor, I will usually split the changes in small independent refactoring commits. I believe it makes it easier to understand and judge than one big commit.

A rather telling example was this PR I made recently. It aims at fixing one bug, but I broke it down into 15 commits. Each consists of a single refactoring step in the right direction, until the last commit which is the one fixing the bug per se. It’s easier to see the validity of each change this way, while the combined diff has a lot of noise and does a bunch of different things at once. I just can’t grok the combined diff.

I’m not asking everyone to structure their commits with such detail and attention; I’m only asking that it be, if not appreciated, at least accepted to do so.

Method lookup in Ruby 2.0.0

2013-03-23T00:27:00-04:00

Tech.pro sponsored a tutorial on method lookup in Ruby 2.0.0.

It’s an in-depth review of how exactly Ruby deals with method calls.

Ruby 2.0.0 by example

2013-02-23T07:40:00-05:00

There’s a Portuguese translation by Rodrigo Martins if you prefer.

A quick summary of some of the new features of Ruby 2.0.0:

Language Changes

Keyword arguments

# Ruby 1.9:
  # (From action_view/helpers/text_helper.rb)
def cycle(first_value, *values)
  options = values.extract_options!
  name = options.fetch(:name, 'default')
  # ...
end

# Ruby 2.0:
def cycle(first_value, *values, name: 'default')
  # ...
end

# CAUTION: Not exactly identical, as keywords are enforced:
cycle('odd', 'even', nme: 'foo')
# => ArgumentError: unknown keyword: nme

# To get exact same result:
def cycle(first_value, *values, name: 'default', **ignore_extra)
  # ...
end

This makes method definitions very flexible. In summary:

def name({required_arguments, ...}
         {optional_arguments, ...}
         {*rest || additional_required_arguments...} # Did you know?
         {keyword_arguments: "with_defaults"...}
         {**rest_of_keyword_arguments}
         {&block_capture})

In Ruby 2.0.0, keyword arguments must have defaults, or else must be captured by **extra at the end. Next version will allow mandatory keyword arguments, e.g. def hello(optional: 'default', required:), but there are ways to do it now.
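One possible way to do it now, as a minimal sketch: since a default can be any expression, it can simply raise. The `required` helper and the `hello` method below are hypothetical names used for illustration; they are not part of Ruby.

```ruby
# Hypothetical helper: used as a default value, it only runs (and raises)
# when the caller did not supply the keyword.
def required(name)
  raise ArgumentError, "missing keyword: #{name}"
end

def hello(optional: 'default', name: required(:name))
  "#{optional}, #{name}"
end

hello(name: 'world') # => "default, world"

begin
  hello # no name given, so the default expression is evaluated
rescue ArgumentError => e
  e.message # => "missing keyword: name"
end
```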

Defaults, for optional parameters or keyword arguments, can be mostly any expression, including method calls for the current object and can use previous parameters.

A complex example showing most of this:

class C
  def hi(needed, needed2,
         maybe1 = "42", maybe2 = maybe1.upcase,
         *args,
         named1: 'hello', named2: a_method(named1, needed2),
         **options,
         &block)
  end

  def a_method(a, b)
    # ...
  end
end

C.instance_method(:hi).parameters
# => [ [:req, :needed], [:req, :needed2],
#      [:opt, :maybe1], [:opt, :maybe2],
#      [:rest, :args],
#      [:key, :named1], [:key, :named2],
#      [:keyrest, :options],
#      [:block, :block] ]

Known bug: it’s not currently possible to ignore extra options without naming the ** argument.

Symbol list creation

Easy way to create lists of symbols with %i and %I (where i is for intern):

# Ruby 1.9:
KEYS = [:foo, :bar, :baz]

# Ruby 2.0:
KEYS = %i[foo bar baz]
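The uppercase variant allows interpolation, the same way %W does for strings. A small sketch, where `prefix` is just an illustrative variable:

```ruby
prefix = 'user'

plain        = %i[foo bar baz]                  # no interpolation
interpolated = %I[#{prefix}_id #{prefix}_name]  # interpolation allowed

plain        # => [:foo, :bar, :baz]
interpolated # => [:user_id, :user_name]
```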

Default encoding is UTF-8

No magic comment is needed in case the encoding is utf-8.

# Ruby 1.9:
# encoding: utf-8
# ^^^ previous line was needed!
puts "❤ Marc-André ❤"

# Ruby 2.0:
puts "❤ Marc-André ❤"

Unused variables can start with _

Did you know that Ruby can warn you about unused variables?

# Any Ruby, with warning on:
ruby -w -e "
  def hi
    hello, world = 'hello, world'.split(', ')
    world
  end"
# => warning: assigned but unused variable - hello

The way to avoid the warning was to use _. Now we can use any variable name starting with an underscore:

# Ruby 1.9
ruby -w -e "
  def foo
    _, world = 'hello, world'.split(', ')
    world
  end"
# => no warning

# Ruby 2.0
ruby -w -e "
  def hi
    _hello, world = 'hello, world'.split(', ')
    world
  end"
# => no warning either

Core classes changes

Prepend

Module#prepend inserts a module at the beginning of the call chain. It can nicely replace alias_method_chain:

# Ruby 1.9:
class Range
  # From active_support/core_ext/range/include_range.rb
  # Extends the default Range#include? to support range comparisons.
  def include_with_range?(value)
    if value.is_a?(::Range)
      # 1...10 includes 1..9 but it does not include 1..10.
      operator = exclude_end? && !value.exclude_end? ? :< : :<=
      include_without_range?(value.first) && value.last.send(operator, last)
    else
      include_without_range?(value)
    end
  end

  alias_method_chain :include?, :range
end

Range.ancestors # => [Range, Enumerable, Object...]

# Ruby 2.0
module IncludeRangeExt
  # Extends the default Range#include? to support range comparisons.
  def include?(value)
    if value.is_a?(::Range)
      # 1...10 includes 1..9 but it does not include 1..10.
      operator = exclude_end? && !value.exclude_end? ? :< : :<=
      super(value.first) && value.last.send(operator, last)
    else
      super
    end
  end
end

class Range
  prepend IncludeRangeExt
end

Range.ancestors # => [IncludeRangeExt, Range, Enumerable, Object...]
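The same mechanism on a smaller scale, as a sketch that avoids touching core classes. `Greeter` and `Loud` are made-up names for illustration:

```ruby
module Loud
  # Runs before Greeter#greet because the module is prepended;
  # super then reaches the class's own definition.
  def greet
    super.upcase + '!'
  end
end

class Greeter
  prepend Loud

  def greet
    'hello'
  end
end

Greeter.new.greet          # => "HELLO!"
Greeter.ancestors.first(2) # => [Loud, Greeter]
```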

Refinements [experimental]

In Ruby 1.9, if you alias_method_chain a method, the new definition takes place everywhere. In Ruby 2.0.0, you can make this kind of change just for yourself using Module#refine:

# Ruby 2.0
module IncludeRangeExt
  refine Range do
    # Extends the default Range#include? to support range comparisons.
    def include?(value)
      if value.is_a?(::Range)
        # 1...10 includes 1..9 but it does not include 1..10.
        operator = exclude_end? && !value.exclude_end? ? :< : :<=
        super(value.first) && value.last.send(operator, last)
      else
        super
      end
    end
  end
end

def test_before(r)
  r.include?(2..3)
end
(1..4).include?(2..3) # => false (default behavior)

# Now turn on the refinement:
using IncludeRangeExt

(1..4).include?(2..3) # => true  (refined behavior)

def test_after(r)
  r.include?(2..3)
end
test_after(1..4) # => true (defined after using, so refined behavior)

3.times.all? do
  (1..4).include?(2..3)
end # => true  (refined behavior)

# But refined version happens only for calls defined after the using:
test_before(1..4) # => false (defined before, not affected)
require 'some_other_file' # => not affected, will use the default behavior

# Note:
(1..4).send :include?, 2..3 # => false (for now, send ignores refinements)

Full spec is here and is subject to change in later versions. More in-depth discussion here

Lazy enumerators

An Enumerable can be turned into a lazy one with the new Enumerable#lazy method:

# Ruby 2.0:
lines = File.foreach('a_very_large_file')
            .lazy # so we only read the necessary parts!
            .select {|line| line.length < 10 }
            .map(&:chomp)
            .each_slice(3)
            .map {|lines| lines.join(';').downcase }
            .take_while {|line| line.length > 20 }
  # => Lazy enumerator, nothing executed yet
lines.first(3) # => Reads the file until it returns 3 elements
               # or until an element of length <= 20 is
               # returned (because of the take_while)

# To consume the enumerable:
lines.to_a # or...
lines.force # => Reads the file and returns an array
lines.each{|elem| puts elem } # => Reads the file and prints the resulting elements

Note that lazy will often be slower than a non lazy version. It should be used only when it really makes sense, not just to avoid building an intermediary array.

require 'fruity'
r = 1..100
compare do
  lazy   { r.lazy.map(&:to_s).each_cons(2).map(&:join).to_a }
  direct { r     .map(&:to_s).each_cons(2).map(&:join)      }
end
# => direct is faster than lazy by 2x ± 0.1
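A case where lazy does make sense is a source that is infinite (or too big to process eagerly). A minimal sketch; without .lazy, the select below would never return:

```ruby
squares_of_evens = (1..Float::INFINITY)
                   .lazy
                   .select(&:even?)
                   .map { |n| n * n }

# Only as many elements as needed are ever generated:
squares_of_evens.first(3) # => [4, 16, 36]
```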

Lazy size

Enumerator#size can be called to get the size of the enumerator without consuming it (if available).

# Ruby 2.0:
(1..100).to_a.permutation(4).size # => 94109400
loop.size # => Float::INFINITY
(1..100).drop_while.size # => nil

When creating enumerators, either with to_enum, Enumerator.new, or Enumerator::Lazy.new, it is possible to define a size too:

# Ruby 2.0:
fib = Enumerator.new(Float::INFINITY) do |y|
  a = b = 1
  loop do
    y << a
    a, b = b, b+a
  end
end

still_lazy = fib.lazy.take(1_000_000).drop(42)
still_lazy.size # => 1_000_000 - 42

module Enumerable
  def skip(every)
    unless block_given?
      return to_enum(:skip, every) { size && (size+every)/(every + 1) }
    end
    each_slice(every+1) do |first, *ignore|
      yield first
    end
  end
end

(1..10).skip(3).to_a # => [1, 5, 9]
(1..10).skip(3).size # => 3, without executing the loop

Additional details and examples in the doc of to_enum

__dir__

Although require_relative makes the use of File.dirname(__FILE__) much less frequent, we can now use __dir__

# Ruby 1.8:
require File.dirname(__FILE__) + "/lib"
File.read(File.dirname(__FILE__) + "/.Gemfile")

# Ruby 1.9:
require_relative 'lib'
File.read(File.dirname(__FILE__) + '/.config')

# Ruby 2.0
require_relative 'lib' # no need to use __dir__ for this!
File.read(__dir__ + '/.config')

bsearch

Binary search is now available, using either Array#bsearch or Range#bsearch:

# Ruby 2.0:
ary = [0, 4, 7, 10, 12]
ary.bsearch {|x| x >=   6 } #=> 7
ary.bsearch {|x| x >= 100 } #=> nil

# Also on ranges, including ranges of floats:
(Math::PI * 6 .. Math::PI * 6.5).bsearch{|f| Math.cos(f) <= 0.5}
# => Math::PI * (6+1/3.0)
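Two more uses, sketched with made-up values: a range of integers works too, and when the block returns a number instead of true/false, bsearch looks for any element for which the block returns 0 (positive means "look further right", negative "look further left").

```ruby
# Smallest integer whose square is at least 50:
root = (0..100).bsearch { |i| i * i >= 50 }
root # => 8

# "Find any" mode: the block returns a number, as <=> does.
ary = [0, 4, 7, 10, 12]
found = ary.bsearch { |x| 7 <=> x }
found # => 7
```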

to_h

There is now an official way to convert an object to a Hash, using to_h:

# Ruby 2.0:
Car = Struct.new(:make, :model, :year) do
  def build
    #...
  end
end
car = Car.new('Toyota', 'Prius', 2014)
car.to_h # => {:make=>"Toyota", :model=>"Prius", :year=>2014}
nil.to_h # => {}

This has been implemented for nil, Struct and OpenStruct, but not for Enumerable/Array:

{hello: 'world'}.map{|k, v| [k.to_s, v.upcase]}
                .to_h # => NoMethodError:
# undefined method `to_h' for [["hello", "WORLD"]]:Array

If you think this would be a useful feature, you should try to convince Matz.
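For an array of pairs, Hash[] already does the conversion. A small sketch reusing the example above:

```ruby
pairs = { hello: 'world' }.map { |k, v| [k.to_s, v.upcase] }
pairs # => [["hello", "WORLD"]]

# Hash[] builds a hash from an array of [key, value] pairs:
converted = Hash[pairs]
converted # => {"hello"=>"WORLD"}
```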

caller_locations

It used to be tricky to know which method just called. It wasn’t very efficient either, since the whole backtrace had to be returned. Each frame was a string that needed to be first computed by Ruby and probably parsed afterwards.

Enter caller_locations, which returns the information as objects and has a better API that can limit the number of frames requested.

# Ruby 1.9:
def whoze_there_using_caller
  caller[0][/`([^']*)'/, 1]
end

# Ruby 2.0:
def whoze_there_using_locations
  caller_locations(1,1)[0].label
end

How much faster is it? A simple test gives me 45x speedup for a short stacktrace, and 100x for a stacktrace of 100 entries!

The extra information, like the file path and line number, is still accessible; instead of asking for label, ask for path or lineno.
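A sketch of reading those attributes; `call_site` and `outer` are hypothetical method names, and the exact text of label can vary between Ruby versions:

```ruby
# Describes the frame that called this method.
def call_site
  location = caller_locations(1, 1)[0] # skip our own frame, ask for 1 frame
  { path: location.path, lineno: location.lineno, label: location.label }
end

def outer
  call_site
end

site = outer
site[:path]   # file in which outer is defined
site[:lineno] # line of the call_site call inside outer
site[:label]  # names the calling method, outer
```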

Optimizations

It’s difficult to show most optimizations by code, but some nice optimizations made it in Ruby 2.0.0. In particular, the GC was optimized to make forking much faster.

One optimization we can demonstrate was to make many floats immediates on 64-bit systems. This avoids creating new objects in many cases:

# Ruby 1.9
4.2.object_id == 4.2.object_id # => false

# Ruby 2.0
warn "Optimization only on 64 bit systems" unless 42.size * 8 == 64
4.2.object_id == 4.2.object_id # => true (4.2 is immediate)
4.2e100.object_id == 4.2e100.object_id # => false (4.2e100 isn't)

What else?

An extensive list of changes is in the NEWS file.

I want it!

Try it out today:

install with rvm: rvm get head && rvm install 2.0.0 (note that rvm get stable is not sufficient!)
install with rbenv: rbenv install 2.0.0-p0 (maybe, see comment by Artur Hebda)
other installation: See the ruby-lang.org instructions

For those who can’t upgrade yet, you can still have some of the fun with my backports gem. It makes lazy, bsearch and a couple more available for any version of Ruby. The complete list is in the readme.

Enjoy Ruby 2.0.0!

DRY migrations

2011-04-25T00:00:00-04:00

I wanted to write a post about the many things that should be fixed with Rails.

Interestingly, Rails 3.1 fixes quite a few of these.

At last, jQuery takes over from Prototype. Prototype was nice and didn’t exactly solve the same problem, but in my experience jQuery is mandatory for developing anything decent. Same thing for Sass, and I’m glad they have corrected the mistake of the default sass location (which used to be /public/stylesheets/sass when it had to be in /app somewhere). Handling assets was also sorely missing; I’ve been using sprockets before and it’s a fine choice.

I’m happily surprised at CoffeeScript. I’ve also been using it but I didn’t expect it to become the default, especially given the fact that it’s quite young and I’d argue it’s a much bolder move than using Haml. I have no idea as to why Haml doesn’t also come standard.

It’s interesting that we are now targeting the web platform without writing anything directly in it: using HAML instead of HTML, Sass instead of CSS, CoffeeScript instead of Javascript (and accessing the DOM more often via jQuery than directly).

The last goodie is DRY migrations. I find it irritating to write most migrations as I’d really like to generate them automatically from a change to the schema, maybe because my ancient development tool 4D gave me that 25 years ago…

I’d rather write the schema in the model (where it belongs IMO) and generate a “diff” as a migration, but at the very least I wanted to avoid writing the drop_table and remove_column that always correspond one to one with create_table and add_column.

I was actually looking at the code to see where one could have automatically undoable migrations, as it is much easier than my dream solution, and lo and behold, we can now do this!

# Something like:

class AddFoo < ActiveRecord::Migration
  def self.up
    create_table :foos do |t|
      t.string :name
      # ...
    end

    change_table :products do |t|
      t.references     :foo
      # ...
    end

    add_index :products, :foo_id
  end

  def self.down
    remove_index :products, :foo_id

    change_table :products do |t|
      t.remove     :foo_id
      # ...
    end

    drop_table :foos

  end
end

# can now be dry:

class AddFoo < ActiveRecord::Migration
  def change
    create_table :foos do |t|
      t.string :name
      # ...
    end

    change_table :products do |t|
      t.references     :foo
      # ...
    end

    add_index :products, :foo_id
  end
end

Much better. Hopefully we’ll soon be able to specify :from => ... when issuing change_column_default or similar so that they become undoable too.

I still have a couple of gripes on my list. In no particular order:

Haml

Default template

Way too basic. There should be a basic solution for the page title (that isn’t a static title!), default content_for, etc… Easy to do yourself, but why not encourage a standard convention?

test environment & fixtures

Also too basic. I find fixtures take longer to generate and are harder to maintain when the schema changes, compared to factory-based data.

config/database.yml

It has the wrong idea in mixing important production information with less important and more local information for the test & dev environments. I’ve always had problems with source control and that file because I stick with SQLite for dev/test while other developers prefer other DBs.

Yaml

Now that I think of it, I’m not sure there should be any yml files in a rails project. The gain over a strictly Ruby file is minimal, even more so in Ruby 1.9.2, and it’s just less flexible. It also encourages crazy stuff like cucumber yml config file with ERB in it.

MVC…L?

Maybe it’s just me, but I like to write separate functionality that acts like a library. It doesn’t fit as a Model, so I stick that code in /lib with the caveat that there is no default structure, and that it neither autoloads nor auto-reloads. It should probably go in app/lib or similar.

Fingers crossed for Rails 3.2!

method_missing, politely

2010-11-15T00:00:00-05:00

In their Polite Programmer talk at Rubyconf, Jim Weirich and Chris Nelson pointed out that merely adding some behavior with method_missing wasn’t quite polite, as shown below:

class StereoPlayer
  def method_missing(method, *args, &block)
    if method.to_s =~ /play_(\w+)/
      puts "Here's #{$1}"
    else
      super
    end
  end
end

p = StereoPlayer.new
# ok:
p.play_some_Beethoven # => "Here's some_Beethoven"
# not very polite:
p.respond_to? :play_some_Beethoven # => false

In order for respond_to? to return true, one can specialize it, as follows:

class StereoPlayer
  # def method_missing ...
  #   ...
  # end

  def respond_to?(method, *)
    method.to_s =~ /play_(\w+)/ || super
  end
end
p.respond_to? :play_some_Beethoven # => true

This is better, but it still doesn’t make play_some_Beethoven behave exactly like a method. Indeed:

p.method :play_some_Beethoven
# => NameError: undefined method `play_some_Beethoven'
#               for class `StereoPlayer'

Ruby 1.9.2 introduces respond_to_missing? that provides for a clean solution to the problem. Instead of specializing respond_to? one specializes respond_to_missing?. Here’s a full example:

class StereoPlayer
  # def method_missing ...
  #   ...
  # end

  def respond_to_missing?(method, *)
    method =~ /play_(\w+)/ || super
  end
end

p = StereoPlayer.new
p.play_some_Beethoven # => "Here's some_Beethoven"
p.respond_to? :play_some_Beethoven # => true
m = p.method(:play_some_Beethoven) # => a Method object
# m acts like any other method:
m.call # => "Here's some_Beethoven"
m == p.method(:play_some_Beethoven) # => true
m.name # => :play_some_Beethoven
StereoPlayer.send :define_method, :ludwig, m
p.ludwig # => "Here's some_Beethoven"
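For reference, here is a self-contained sketch with both halves in one class. It differs from the snippets above in two ways, both assumptions made for the example: the class is named PoliteStereoPlayer, and method_missing returns the string instead of printing it, so the results are easy to check.

```ruby
class PoliteStereoPlayer
  PLAY = /\Aplay_(\w+)\z/

  def method_missing(name, *args, &block)
    if (match = PLAY.match(name.to_s))
      "Here's #{match[1]}" # returned rather than printed, for this sketch
    else
      super
    end
  end

  # Keeps respond_to? and method in sync with method_missing.
  def respond_to_missing?(name, include_private = false)
    PLAY.match(name.to_s) ? true : super
  end
end

player = PoliteStereoPlayer.new
player.play_some_jazz                # => "Here's some_jazz"
player.respond_to?(:play_some_jazz)  # => true
player.respond_to?(:rewind)          # => false
player.method(:play_some_jazz).call  # => "Here's some_jazz"
```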

Fixing MRI, a dozen steps at a time

2010-04-01T00:00:00-04:00

Is there a term like bugfield? You know, when every time you get to take a couple of steps in a code base you encounter a different bug, which leads to another one, …, like a minefield of bugs?

Here was my last sequence in Ruby (MRI)…

Main goal: improve Matrix#determinant and #rank after a suggestion of Yu Ichino. The bulk of the work took me quite a while, as I had to check a bunch of things, understand the algorithm, do some performance testing, etc…

When modifying Matrix#rank to use this different approach, I take the opportunity to improve the styling. A variable name of ii is not as clear as row, and… it actually reveals that something is amiss because that loop goes up to the number of columns, not rows…

1) So I find a minimal test case to convince myself I’m not mistaken. Yup, a simple 3x2 matrix has the wrong rank. I add that to the spec and fix Matrix#rank. When cleaning up, I make sure that Matrix#regular? and Matrix#singular? are using the right determinant function and not a bad variant that’s now deprecated.

Turns out they are checking the rank of the matrix, which is not as efficient but more importantly…

2) they both return false if the matrix is not square. This doesn’t make much mathematical sense.

Since I’m now the happy maintainer of the lib and I am confident there is no other reasonable solution, I have them raise an error for rectangular matrices. This means specs are either wrong or incomplete in Rubyspec, though, so I check them out…
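To make the first two steps concrete, here is a small sketch against the Matrix library as it behaves today. The matrix values are made up, and the rescue assumes the current behaviour described above, where singular? raises for a non-square matrix:

```ruby
require 'matrix'

# A 3x2 matrix with two independent columns has rank 2.
tall = Matrix[[1, 2], [3, 4], [5, 6]]
tall.rank    # => 2
tall.square? # => false

# Asking whether a rectangular matrix is singular raises
# instead of quietly answering false.
error = begin
  tall.singular?
  nil
rescue StandardError => e
  e
end
error.nil? # => false
```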

3) Turns out Rubyspec is incomplete for those, so I specify what error should be raised in case of a rectangular matrix. Double-checking my change by running it gives me 0 assertions. Oops?

Turns out that the guard I wrote to signify this was a bug never passes. Ah, right, ruby_bug "", "1.9" means “this is a bug present in the whole 1.9.x line”, so it will not be executed until Ruby 2.0!

My bad, but the program to run the specs shouldn’t allow that though, so…

4) Discussion with Brian Ford, the maintainer of RubySpec. Good thing he’s always on IRC. Anyways, he might put in a max version to avoid such nonsense in the future. Meanwhile…

5) A quick search in RubySpec reveals about a half dozen of such bad guards, so I set about fixing each one, and…

6) One of the specs that was not guarded properly fails for the latest Ruby trunk. It’s not clear it’s a bug though. At least for me, as I’ve never tried to open the singleton class of a Bignum!

So I investigate, try a couple of things, and yeah, the more I dig, the more it looks like a bug, so I open an issue to confirm with ruby-core. There’s one spec left…

7) The last spec shows clearly a small bug in String#sub! so I fix that in MRI… and I realize that the error message for the wrong number of parameters is misleading.

8) It takes about a microsecond to fix that error message. A quick find reveals other similar error messages in the MRI code. A quick review leads to… 18 issues of all sorts. Some more inaccuracies, some uninformative messages, some that don’t follow the standard format and typos in the doc.

9) I fix all of these too. Ideally this should be refactored, but I’m getting tired. Yet I’m still awake enough to realize that one more method has the wrong doc…

10) From the code, I gather that the interface for SignalException.new is a bit more complex than advertised. I supplement the doc as best as I can.

Ouf, I’m done. Double check the commit… arghh, there’s another method that accepts an undocumented extra parameter, so…

11) That extra param is a bit odd. Looks like you can build a regexp with a third parameter equal to “n” or “N” and the encoding switches to binary. Other values will get you a warning, and any letter after the “n” will be ignored. Smells like legacy.

git blame traces the change back years, giving me a reference to the ruby-dev list. Lucky me, it’s not in Japanese and refers to uri/common.rb. A quick check reveals no Regexp.new with that third argument. Ah, there’s a Regexp.new(HEADER_PATTERN, 'N') in uri/mailto. The ‘N’ doesn’t mean binary, though, since it’s in second place (so it means “case insensitive”, as would true), which….

12) is a bug; the regexp is already case insensitive so that ‘N’ has no effect. I don’t understand well enough what an extra “N” really does to be sure whether it can be removed (since it doesn’t have any effect right now) or should be put in third position.

I’m a bit dizzy. I should really go to sleep. Even though this is all pretty minor, I fire a redmine issue about the doc and another one about the lib and go to bed…

And I thought fixing Matrix#regular? would be trivial…

Best Time To Get Involved In Ruby Core

2009-09-01T00:00:00-04:00

Apart from enjoying the summer, I’ve spent time hacking on MRI, especially since I’ve been accepted as a committer. The feature freeze for Ruby 1.9.2 was planned for yesterday, but it was pushed back a couple of days beforehand. Rejoice!

Why? The reason stated was that the next version of Ruby will, for the first time ever, pass the RubySpec. This makes RubySpec the official meeting point for all Ruby implementations, not just Rubinius (the originator of RubySpec), JRuby and others. This should also give a bit more time to decide on a couple of new features that might make it in 1.9.2.

Much work has been done to have the specs meet MRI 1.9.x and the language and core sections only have a couple of failures¹. Most are due to cases for which the best decisions still have to be figured out. I’ll remind you that it’s easy to gain commit access to RubySpec: any accepted patch grants you your commit bit.

There is still quite a bit of work to be done spec’ing the libraries. Actually there’s a lot of work to be done in the libraries themselves. Some are quite badly maintained, others don’t even have an official maintainer. And that’s all about to change, hopefully!

It was announced yesterday that being a maintainer is no longer for life. Not doing anything about opened issues? Sorry, we’ll get someone else to take care of it. Many libraries currently have no maintainer and there should be many others that won’t be claimed in the confirmation process.

Feeling competent to maintain a library? You talk using only sockets? You dream in yaml? Might as well apply to maintain your favorite lib…

I sincerely hope 1.9.2 kicks some serious ass. It’s bound to be the version of Ruby 1.9 that most people will use and target for the first time. More reason to get it right!

¹Actually, the bulk of the work was spec’ing Ruby 1.8.6 under the supervision of Brian Ford. I helped finish the specs for 1.8.7 and the mysterious and tireless Run Paint Run Run did most of the 1.9 specific specs. Spec’ing Ruby usually leads to finding bugs or asking clarifications. Indeed, Run Paint opened more issues on redmine than any other user!

Stickler In Silicon Valley

2009-06-01T00:00:00-04:00

I have not been actively looking for a job yet. Nevertheless, I was contacted by a startup and invited to spend a week in Silicon Valley / San Francisco, hacking around with them to see if I could become part of their team, which I found quite flattering. I learned lots of new things in California. A couple of new words too. I’m still unsure as to what exactly a hipster is, but “stickler” was easier to grasp: one who insists on exactness or completeness in the observance of something.

It was fascinating to witness the startup culture. Tens of thousands of users is considered a small test bed; the target is millions. Every newcomer on the web scene is analysed & probed. There was technology and technology talk everywhere. It seems like everyone in the Bay Area has an iPhone. And I mean everyone! Lacking a decent map of the city, I asked two random strangers for directions and both dug out their iPhone to help me out. When I needed to call someone I was meeting, I asked another stranger if I could use his phone. It was an iPhone, of course, and after a thorough examination to estimate the chances I’d run away with it, he graciously let me use it. I found people particularly nice too, although maybe my tourist status helped, I don’t know.

My timing for the trip was great because Brian Ford and Evan Phoenix were also in town and invited me to have a drink. It turns out the monthly SF Ruby meetup was on that very same day, so I met them there. I’d say the crowd was about three times that of a typical Montreal.rb meet. There were other noticeable differences too. Many people were part of pretty exciting projects and companies (EngineYard, GitHub, PeepCode and the like). Chris Wanstrath (of GitHub) presented his newest gem rip, while Mike Dirolf was presenting his mongoDB project. Three people stood up announcing they were looking for developers, which has yet to happen in Montreal… I guess recession doesn’t have the same meaning in the Valley.

Back to Palo Alto and the startup. I realized a couple of things there. I really enjoy thinking about what a product could look like, how it should be presented to users. Finding ways to improve it by analysing its use is something I’ve never had the chance to do and is quite appealing. On the other hand, I somehow assumed that the “Joel” approach would be a sine qua non for an ambitious startup: hire the best, only the best, give them the best tools and let them loose.

It turns out that when considering what a good programmer is, different qualities can be given different weights. Most will agree that getting things done is the main one. Without it, not much can save you. As a reflection of my values though, I expected that embracing standards, learning the available tools and applying principles like DRY, refactoring, etc…, was also part of it. That’s apparently not the case, and that’s why we all realized I wouldn’t mesh as nicely as we hoped in their startup.

I couldn’t help but notice that all the Rails programmers are Windows guys. Except one; he is a Linux guy and although I didn’t have the chance to really work with him, he gave me a really good impression. I’m ready to bet his values are more aligned with mine. The HTML/CSS/design expert was the only Mac guy and I could not have agreed more with his opinions and point of view. So is there a Windows/Mac divide? Something like “Get things done” vs “Design it well so it just works”?

Nah. Things are never that simple, as I was reminded when taking part in the interview of a Mac guy who clearly didn’t care for DRY or nice tools like named scopes, despite otherwise decent technical skills. So no, I just have to face the fact that, for better or for worse, I’m a stickler for getting things done well.

Update: My friend Pascal suggested this be related to an Engineer/Scientist divide: using tools vs understanding them; making things work vs comprehension through abstraction. Interesting idea.

A schizo Ruby puzzle

2009-05-02T00:00:00-04:00

Quick quirky quiz (schizo version)

# Without writing any method/block/lambda,
# can you find ways to obtain, in Ruby 1.8.7 or 1.9:
x == y   # ==> true
y == x   # ==> false

Here’s how I got to check out Ruby’s source and stumble upon that.

Age of Innocence

This is all Mathieu’s fault. He asked innocently if my backports gem was compatible with Rails. I thought “duh! of course!”. After all, it’s meant to be compatible with any Ruby code.

Of course, he was right, there were bugs. Hundreds of tests were failing! Turned out to be two bugs. It dawned on me that my small bunch of unit tests was not even close to being enough. I really needed to test some more.

So I set out to test it on JRuby. I found a bug, but it was JRuby’s this time. It was easy to circumvent though, so “JRuby compatibility: check”.

How about rubinius? Well, that’s where the story really begins… Rubinius is a bit different because most of the builtin library is written in Ruby and many methods use other core methods. That won’t make a difference for you, until you fiddle with core methods. For example I was redefining String#upto by calling Range#each. Kosher in MRI, but rubinius’ Range#each handles string ranges by calling… String#upto!

There were other problems though, because rubinius was doing all sorts of stuff it wasn’t really supposed to do. And because rubinius is mostly Ruby, it was easy for me to fix. Or should I say tempting to fix? I have difficulty resisting that kind of temptation, so I submitted my first patch and eagerly awaited my commit access (granted to anyone who submits a patch)…

Eye Opener

I discussed ‘backports’ a bit with Evan Phoenix, the creator of rubinius, and told him I’d build it into rubinius, avoiding a bunch of alias_method_chain. I thought it would be dirt quick. That is, until I started.

See, to change things in rubinius, you first start by showing they’re broken. And to do that, enter RubySpecs. It’s a huge collection of tests that check if what you’re running works as expected. Or as MRI runs it, should I say. You knew that Ruby has no official spec, right?

With the help of Brian Ford, I started to modify my first RubySpecs. That’s when I realized there were so many questions I never asked myself! Time for another quiz, this time with answers (just click on what you think is right!)

# Assume we have:
class MyArray < Array ; end

foo = MyArray.new

# What is the class of:


foo.to_ary	MyArray	Array
foo.to_a	MyArray	Array
Array.try_convert(foo)	MyArray	Array
foo.dup	MyArray	Array
(foo+foo)	MyArray	Array
(foo*2)	MyArray	Array
foo.pop(2)	MyArray	Array
foo.shift(2)	MyArray	Array
foo[0..2]	MyArray	Array
foo.slice(0,2)	MyArray	Array
foo.slice!(0,2)	MyArray	Array
foo.first(2)	MyArray	Array
foo.sample(2)	MyArray	Array
foo.flatten	MyArray	Array
foo.product	MyArray	Array
foo.combination(1).first	MyArray	Array
foo.shuffle	MyArray	Array

Some are intuitive, like #shuffle, some less so, like #+. I wonder how you’re going to do, because I think I did worse than a monkey would by guessing randomly!
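If you would rather check than guess, here is a small sketch that prints the class returned by a few of the expressions above. The names CHECKS and RESULTS are mine, and I’m deliberately not listing expected answers: they depend on which Ruby you run it on, which is rather the point.

```ruby
# Prints the class returned by a few of the quiz expressions.
# The answers depend on the Ruby version, so none are listed here.
class MyArray < Array; end

# Each check receives a fresh MyArray, since pop mutates its receiver.
CHECKS = {
  "foo.to_a"     => lambda { |foo| foo.to_a },
  "foo.dup"      => lambda { |foo| foo.dup },
  "(foo+foo)"    => lambda { |foo| foo + foo },
  "(foo*2)"      => lambda { |foo| foo * 2 },
  "foo.pop(2)"   => lambda { |foo| foo.pop(2) },
  "foo[0..2]"    => lambda { |foo| foo[0..2] },
  "foo.first(2)" => lambda { |foo| foo.first(2) },
  "foo.flatten"  => lambda { |foo| foo.flatten },
  "foo.shuffle"  => lambda { |foo| foo.shuffle }
}

RESULTS = {}
CHECKS.each do |label, check|
  RESULTS[label] = check.call(MyArray.new([1, 2, 3])).class
  puts "#{label.ljust(14)} #{RESULTS[label]}"
end
```

Run it under two different Ruby versions and compare the columns.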

The complexity and amount of detail found in RubySpecs was a real eye opener. The fact is, often you won’t care about that level of detail about the implementation. But inevitably some people will.

So far I’ve ported all 1.8.7 Array methods and I’m working on the rest. Writing the specs usually takes a bit longer than the implementation and is damn difficult to get right. Well, at least for me; luckily there are people like Ujihisa who fix my specs minutes after I commit them.

It’s because of a question he asked that I had to refer to the Ruby C source and realized there was a potential problem like the x == y but !(y == x).

That cost me a bunch of hours today, because fixing it was another of those challenges I can hardly refuse, even if I had to delve into the C code!

Next blog entry: update on that bug, along with the solution (unless someone posts them in the comments)!

Thanks to Brian Ford and Evan Phoenix for their help and Ujihisa for pointing me to the complexity of the <=> operator he calls the spacecraft operator. And yeah, to Mathieu Houle for his damn question! ;-)

Lost In Recursion

2009-05-01T00:00:00-04:00

Last time I asked a simple (but quite hard) Ruby quiz:

# Without writing any method/block/lambda,
# can you find ways to obtain, in Ruby 1.8.7 or 1.9:
x == y   # ==> true
y == x   # ==> false

Before giving the answer, let me give you a bit of background…

In a blog post, Ujihisa was discussing how to compare arrays in Ruby and I was curious about the implementation which deals with recursion.

So what’s recursion you may ask? Just check:

x = []
x << x
# => [[...]]

x is an array containing a single element: x itself. At this point, the choice is yours. You can ask “why should I care?”. I have no good answer and you might as well stop reading now. Or you can say “cool” and read on.
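A minimal sketch to convince yourself that the array really does contain itself (nothing here beyond core Array methods):

```ruby
x = []
x << x

x.size                # => 1
x[0].equal?(x)        # => true, the single element is x itself
x[0][0][0].equal?(x)  # => true, however deep you go
x.inspect             # => "[[...]]"
```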

So recursion happens whenever part of an object refers to the object itself. If you’re not careful about it, you can get infinite loops. For example, if you attempt to compare arrays naively by comparing their elements, you’ll get into trouble:

x = [];  x << x
# => [[...]]
xx = []; xx << xx
# => [[...]]
x == xx
# => ???

Can you guess the answer?

Older Ruby 1.8.6 raises a SystemStackError (“stack level too deep”) because it uses the naive algorithm of comparing the elements (x and xx) over and over.

Current ruby 1.8.7 and 1.9 detect the recursion and say “woah, I don’t want to deal with that, let’s just say they’re different”, so it returns false.

How is that implemented exactly? Well, any call that can be recursive (like x.==(xx) in this case) goes through rb_exec_recursive which keeps track of the receiver (x) on which the method (:==) is called. Recursion is detected when an attempt to call the same method is made on the same object. The method :== returns false for recursive cases.

Note that x == x will still return true, because before the call to rb_exec_recursive, :== will check if the two objects being compared are the same.

What struck me immediately was the lack of symmetry. It didn’t smell good and it didn’t take long to find a problem.

Comparing x and y = [x] works fine, actually. x and y are not the same object, so :== calls rb_exec_recursive, which stores x in its ‘deja-vu’ list. The elements of x and y are examined, and since they are both the same object, true is returned. y == x also returns true. So far so good.

Now x and z = [y] are another matter. Again, x and z are not the same object, so rb_exec_recursive gets called. It pushes x on the ‘deja-vu’ list, and compares the elements (x and y). The comparison of x and y is considered recursion, because x is already on the list. So x == z returns false.

But what about z == x? z and x are not the same object, so z is put on the recursion-list and elements are compared. y and x are not the same, so a second call to rb_exec_recursive is made, but y is not on the list (only z is at this point) so their elements are compared. x and x are the same object and thus the comparison returns true. In summary:

x = [];  x << x
# => [[...]]
x == [[x]]
# => false
[[x]] == x
# => true
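To make the asymmetry easier to follow, here is a pure-Ruby model of the behaviour described above. It is only a sketch: receiver_guard_equal is a name I made up, the real logic lives in C behind rb_exec_recursive, and the model only tracks receivers the way the text describes.

```ruby
# Hypothetical model of the pre-patch comparison: only the receiver is
# remembered, and meeting it again counts as recursion (=> false).
def receiver_guard_equal(a, b, seen = [])
  return true if a.equal?(b)                      # same object: equal
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  return false unless a.size == b.size
  return false if seen.any? { |s| s.equal?(a) }   # receiver already on the list
  seen.push(a)
  begin
    a.each_index.all? { |i| receiver_guard_equal(a[i], b[i], seen) }
  ensure
    seen.pop                                      # leave the list as we found it
  end
end

x = []; x << x
z = [[x]]
receiver_guard_equal(x, z)  # => false
receiver_guard_equal(z, x)  # => true
```

The first call meets x again as a receiver and gives up; the second only ever sees z and [x] as receivers before reaching the identity shortcut.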

Fixing this inconsistency is not that difficult. Can you imagine how? Instead of pushing only x when calling x.==(y), we need to push the pair [x, y]. Recursion will be triggered only if x.==(y) gets called again, but not for x.==(z). I set out to make a patch in the C code. With this stricter criterion, we get that both x == z and z == x return true.

On the other hand, we still get false for identical recursive arrays that are built independently, like x and xx.

I then realized that if we detect a recursion when comparing x and xx, it simply means that there is no use in looking further down for differences, so we should return true, not false. Unless a difference is detected somewhere else, x and xx are equal! This made it possible to compare recursive arrays that have the same contents, even though they were constructed differently:

x = [];  x << x
# => [[...]]
step = []; stone = [step]; step << stone
# => [[[...]]]
x == step
# => true

tree = []; tree << tree << tree
# => [[...], [...]]
mixed = []; mixed << tree << mixed
# => [[[...], [...]], [...]]
tree == mixed
# => true

If there is a difference between the arrays (say x[0][1][0] != y[0][1][0]), then x == y returns false. If no such ‘path’ exists, then x == y.
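The two changes (track the pair, and answer true when the pair comes back) can be modelled in a few lines of Ruby. Again, this is a sketch of the idea and not the C patch itself; pair_guard_equal is a made-up name.

```ruby
# Hypothetical model of the patched comparison: the (receiver, argument)
# pair is remembered, and meeting the same pair again means there is
# nothing new to discover down that path, so it counts as equal.
def pair_guard_equal(a, b, seen = [])
  return true if a.equal?(b)
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  return false unless a.size == b.size
  return true if seen.any? { |left, right| left.equal?(a) && right.equal?(b) }
  seen.push([a, b])
  begin
    a.each_index.all? { |i| pair_guard_equal(a[i], b[i], seen) }
  ensure
    seen.pop
  end
end

x = []; x << x
step = []; stone = [step]; step << stone
pair_guard_equal(x, step)      # => true
pair_guard_equal(step, x)      # => true

one = []; one << one << 1
two = []; two << two << 2
pair_guard_equal(one, two)     # => false, the second elements differ
```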

I was quite happy when my patch was accepted a week ago, so the current head of Ruby 1.9 deals with recursion perfectly and it’s no longer possible that x == y while y != x…

Details on redmine.

Zombies Hashes Archaisms Of Ruby Core

2009-04-03T00:00:00-04:00

I just love hashes. So much so, I named my blog after them. I also like that the hash sign is used for comments, in Ruby, or the way hash resembles hatch, thus the messy graphic theme and all. But I really like hashes. They are like mini-objects (object hatchlings?) and I tend to use them to store all sorts of information or instead of many conditions with case x; when :a ...; when :b ....

So I was quite surprised to note that in Ruby, either it’s really easy and natural to create a hash (with the super nice {:key => value, ...} syntax) or, when you need to generate a hash programmatically, you’re basically stuck with

h = {}
foo.each do |key|
  h[key] = bar(key)
end

Well, that’s not quite true, there’s the Hash[key, value, key, value, ...] one can use. Do you use that one? So I decided to propose something. Now I don’t want to risk disturbing people. Especially important people. Except on my blog, of course; it’s your damn fault if you’ve read so far! I still have a bonus for you coming up for all your effort.
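For comparison, here is a sketch of the options mentioned so far, side by side. The bar method is a hypothetical stand-in for whatever computes the value; the inject variant is simply another common idiom, not something proposed in the feature request.

```ruby
# Hypothetical stand-in for the value computation
def bar(key)
  key.to_s.upcase
end

keys = [:a, :b, :c]

# 1. The explicit loop
by_loop = {}
keys.each { |key| by_loop[key] = bar(key) }

# 2. Hash[key, value, key, value, ...] fed with a flattened list of pairs
by_brackets = Hash[*keys.map { |key| [key, bar(key)] }.flatten(1)]

# 3. inject, carrying the hash along
by_inject = keys.inject({}) { |h, key| h[key] = bar(key); h }

by_loop      # => {:a=>"A", :b=>"B", :c=>"C"}
```

All three build the same hash; none of them reads as naturally as the literal syntax does.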

So I thought about this, researched it a bit and came up with the very best I could think of. I was quite nervous and excited when clicking on “Create”! My very first ruby posting was born: Feature #666: Enumerable::to_hash.

I didn’t quite know what to think of the strange omen of ID 666, though. In any event, I must admit that the excitement died down after waiting for anything to happen. It took a month for it to be assigned to Matz. Another two weeks for it to have the target set to “1.9.x”. Complete silence after that.

I must confess I was not registered to the ruby-core mailing list, so I would not have known of anyone writing directly to the list and not through the issue tracker. I believe no one did though. At least according to google because… there is no search on ruby-core’s archives! It’s quite an archaic system actually. The web front end is horrendous, the user interface is arcane (if not outright buggy). Don’t expect a web link to confirm your registration, you have to send a mail back with a specific body. Short of registering, everything is done by email, actually. There might be a search command you send via email? Argh!

The fact that the search on the issue tracker itself (an otherwise fine product) doesn’t appear to work makes it next to impossible to check previous discussions for something. Like why has Ruby not moved to git yet? I guess I shouldn’t complain since others moved to svn a couple of months ago! Or like why is the ruby C code indented using 4 spaces, then 1 tab, then 1 tab + 4 spaces, etc… How do you even indent like that using TextMate? I’m 37, I’m used to feeling old-generation and to finding that things are moving quite fast, but damn, how come it’s quite the contrary here?

I pointed out a simple bug two months ago, and even provided a patch for the small change in the C code. New releases of ruby 1.8.6 and .7 were made today, and still no update on my bug report. I presume that the whole ruby-core team has a lot on their plate, but it’s hard not to be discouraged from contributing with that kind of (non-)feedback. Even clueless tourists seem to get more attention.

All this to say that 6 months after my feature request, still nothing. That’s when I discovered a cool new way to create hashes out of key-value pairs that is undocumented. This time, I did my best so that it wouldn’t go unnoticed. I conjured demons, invoked strange incantations, made dubious attempts at being humorous and documented the whole thing (zombies will be next!). Here it is. So that’s my bonus to you. Matz coded it, I’m letting you know about it! ;-)

That at least got my original issue noticed… and shot down. Some mustered the courage to speak their mind, we’ll see if this goes anywhere.

(Updated after Matz explained better his reason)

2011 update: For those interested, a proposal similar to my original one can be seen in this ruby-core thread.

Whats Point Of Ruby 187

2009-04-02T00:00:00-04:00

Can you guess how many built-in methods were introduced or modified when Ruby 1.8.5 came out? How about Ruby 1.8.6? Or the most recent 1.8.7?

Ruby	Changes
1.8.5	2
1.8.6	3
1.8.7	137

I’d love to check that the number of changes was minimal for earlier 1.8.x releases, but I can’t find a good list of changes (other than going through the full changelogs). Does anyone have that info?

Are you writing code that targets 1.8.7? I know I’m not. The code I’m releasing on github is aimed at Ruby 1.8 and Ruby 1.9. The thing is, code that runs on 1.8.7 doesn’t necessarily run on 1.9, and is even less likely to run on 1.8.6 or earlier. At least if you’re writing Ruby in Ruby and using the new Enumerable features, among others. So you have to test all three?

The fact is, Ruby 1.8.7 has a different API than the rest of the 1.8.x line, but still different from Ruby 1.9. So not only is it already difficult to know if some code is compatible with Ruby 1.9 (e.g. isitruby19.com), there are many more possibilities: some gems can be compatible with Ruby 1.8.7 only, for example. Or 1.8.7 and 1.9.1 but not 1.8.6 and before. It’s actually possible to be compatible just with 1.8.7! Try [:red_pill, :blue_pill].choice.

The solution should have been clear, though. Don’t change the API. Instead, use forward compatibility, and that’s easy to do in ruby. I’ve written my own collection of backports after looking in vain for one. I’m still wondering why change the API instead of releasing a standard forward compatibility gem which would work for all Ruby 1.8.x. I mean, all those OS X users with their default 1.8.6 installation… I’m supposed to tell them to upgrade to 1.8.7 because I want to use map(&:to_s)? In any case, I hope that a single require "backports" will enable 1.8.7 specific code to run on earlier versions of Ruby.
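To show what forward compatibility looks like in practice, here is a minimal sketch in the spirit of a backport. It is not the actual code of the backports gem; it only uses the “define it unless it is already there” idiom, applied to the 1.8.7-only Array#choice mentioned above.

```ruby
# Sketch of a compatibility shim (not the backports gem's own code):
# define Array#choice only where the running Ruby does not provide it.
class Array
  def choice
    self[rand(size)]   # pick an element at a random index
  end unless method_defined?(:choice)
end

[:red_pill, :blue_pill].choice   # one of the two pills, on any Ruby
```

On a Ruby that already has the method, the definition is skipped and the native version keeps being used.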

PS: I know python has forward compatibility with their cute from __future__ import *. Does anyone know about Smalltalk, Scala, Lua, IO?

Love & Hate: Array#product

2009-04-01T00:00:00-04:00

Quick quirky quiz:

# What is the output of
p [40, 2].sum
p [2,3,7].product
# ?

Are you expecting a reference to the late Douglas Adams?

Sum

If you’re running Rails, sum will indeed return 42. In straight Ruby, though, sum won’t be defined.

Yes, not even in ruby 1.8.7 or 1.9. Many core extensions of rails were ‘ported’ to ruby.

Symbol::to_proc is probably the most notable one, but Enumerable::group_by, Float::round_with_precision, Integer::even? and Integer::odd? come to mind also.

Why was sum not included? Probably because the new inject makes it easier to sum enumerables (e.g. [40,2].inject(:+)) and because Matz wants the methods of Enumerable to remain as generic as possible (and not assume that elements respond to :+, for instance). Still, I quite like the idea of sum.
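A quick sketch of what inject with a symbol gives you, including the arithmetic product that the quiz was hinting at:

```ruby
[40, 2].inject(:+)     # => 42
[2, 3, 7].inject(:*)   # => 42

# Without an initial value, an empty array has nothing to fold:
[].inject(:+)          # => nil
[].inject(0, :+)       # => 0
```

The nil for an empty array is one reason a dedicated sum reads better than a bare inject.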

Product

Now the irony is that product is not defined in rails, but it is in ruby 1.8.7+.

You might be a bit surprised though! Indeed:

[2,3,7].product # ==> [[2], [3], [7]] !

Say what? Yeah, it turns out that Array::product produces the cartesian product:

(1..13).to_a.product([:spades, :hearts, :diamonds, :clubs])
# produces a full card deck:
# => [[1, :spades], [1, :hearts], ..., [2, :spades],...]

Naming methods is quite a delicate task. My belief is that a more appropriate and descriptive name would have been cartesian_product, cross_product or product_set. product might be shorter, but I think it runs against the principle of least surprise for a lot of folks. The most frustrating part is that product used without any argument is pretty useless. If you really need that result, there are other ways to get it!

[2,3,7].product
[2,3,7].combination(1).to_a
[2,3,7].each_slice(1).to_a
# => same result

So that’s the hate part.

Now the love part. I had some fun backporting more features of Ruby 1.8.7/1.9 to older Ruby in my backports gem. At some point I had ported enough that I decided I might as well port everything. As of version 1.6, that’s done. This includes, of course, Array#product… which turned out to be the most interesting thing to backport! My first version used a recursive function, but I then thought about using enumerators. After 3 refactors, I got to a really nice version:

class Array
  def product(*arg)
    trivial_enum = Enumerator.new{|yielder| yielder.yield [] }
    [self, *arg].inject(trivial_enum) do |enum, array|
      Enumerator.new do |yielder|
        enum.each do |partial_product|
          array.each do |obj|
            yielder.yield partial_product + [obj]
          end
        end
      end
    end.to_a
  end unless method_defined? :product
end

I get an enumerator for all the combinations by building it up successively using inject and starting from a trivial enumerator. It would be easy to have product accept a block but the standard simply returns an array, so you’ll find a simple call to to_a at the end. I love enumerators and… I love this implementation of product!
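Dropping the final to_a shows why the enumerator approach is appealing. The sketch below reuses the same inject construction in a standalone method; lazy_product is a hypothetical name and is not part of backports or of Ruby.

```ruby
# Hypothetical variant that returns the enumerator instead of an array,
# so combinations are only generated as they are consumed.
def lazy_product(*arrays)
  trivial_enum = Enumerator.new { |yielder| yielder.yield [] }
  arrays.inject(trivial_enum) do |enum, array|
    Enumerator.new do |yielder|
      enum.each do |partial_product|
        array.each { |obj| yielder.yield partial_product + [obj] }
      end
    end
  end
end

lazy_product([1, 2], [:a, :b]).to_a
# => [[1, :a], [1, :b], [2, :a], [2, :b]]

# Only the first two combinations are ever built here:
lazy_product(1..Float::INFINITY, [:a]).first(2)
# => [[1, :a], [2, :a]]
```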

Leave My Options Alone

2009-03-02T00:00:00-05:00

Let me start by asking you a small quiz:

# Will there be any difference between the output of:

<% content_tag_for(:tr, Foo.new, :class => "css_class") do %>
...<% end %>
<% content_tag_for(:tr, Bar.new, :class => "css_class") do %>
...<% end %>

# and the output of:

<%- @style = {:class => "css_class"} -%>
<% content_tag_for(:tr, Foo.new, @style) do %>
...<% end %>
<% content_tag_for(:tr, Bar.new, @style) do %>
...<% end %>

# ?

If you answered “Nope”, congratulations, you’re a normal, sane human being. I like you. Anyone answering “Yup” is either slightly crazy or guessed that I wasn’t asking a trivial question (or both?). Because indeed, the output is different. Why? Because the first content_tag_for modifies the @style[:class] argument.
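The effect is easy to reproduce outside Rails. The helper below is a hypothetical stand-in, not the source of content_tag_for; it only assumes that the helper writes the record’s class name back into the options hash it was given.

```ruby
Foo = Class.new
Bar = Class.new

# Hypothetical helper that, like the one in the quiz, mutates its options
def mutating_tag(record, options = {})
  options[:class] = "#{record.class.name.downcase} #{options[:class]}".strip
  "<tr class=\"#{options[:class]}\">"
end

style = { :class => "css_class" }
mutating_tag(Foo.new, style)   # => "<tr class=\"foo css_class\">"
mutating_tag(Bar.new, style)   # => "<tr class=\"bar foo css_class\">"
style                          # => {:class=>"bar foo css_class"}
```

The second row inherits the first row’s class because both calls share, and modify, the same hash.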

You’re probably not expecting yet another apparently trivial question, but here goes: is this a bug?

If you had to bet about my opinion, your 5 bucks would be safe on “yeah, it’s a bug”. But I can’t really say that it is a bug. I’ve never read anywhere that options won’t be modified. It’s (of course) not specified in the doc of content_tag_for. It’s generally not stated what happens when you pass an unrecognized option, so forget about things like that. I’m not aware of any official general rule of rails. I doubt there is one, because I can find many places where options are modified (e.g. error_message_on, truncate, highlight, excerpt, word_wrap, …). These other examples, though, won’t modify the options in a harmful way. Indeed, writing:

@options.reverse_merge!(:foo => "default_bar")

will not cause a problem like the one I just showed (unless anything else relies on options[:foo] being left unspecified).

# If this works:

truncate("hello", {:length => 4})

# Shouldn't this work too?

truncate("hello", {:length => 4}.freeze)

They won’t enable you to pass a frozen hash, though. Do you freeze your constants? I like freezing things. I freeze my constants, I freeze my settings, I freeze everything I can. And it upsets me when I can’t pass frozen options. Is this a bug too?
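Accepting frozen options costs almost nothing: merge into a new hash instead of updating in place. Here is a sketch; polite_truncate and its defaults are hypothetical and not Rails’ truncate.

```ruby
DEFAULTS = { :length => 30, :omission => "..." }.freeze

# Hypothetical helper that never touches the hash it is given
def polite_truncate(text, options = {})
  settings = DEFAULTS.merge(options)   # merge returns a new hash
  return text if text.length <= settings[:length]
  keep = [settings[:length] - settings[:omission].length, 0].max
  text[0, keep] + settings[:omission]
end

FROZEN_OPTIONS = { :length => 4 }.freeze
polite_truncate("hello", FROZEN_OPTIONS)   # => "h..."

# The in-place style, by contrast, cannot cope with the same input:
in_place_worked =
  begin
    FROZEN_OPTIONS.merge!(:omission => "...")
    true
  rescue StandardError   # the exact error class varies between Ruby versions
    false
  end
in_place_worked   # => false
```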

Unless I’m mistaken, the current stance is that options passed can be changed, tortured and abused as much as the implementation desires and god damn it, check the source if you care.

I believe rails should take a clear and reasonable stance on options. I can think of two:

1) options can be modified, but only in a way that is independent of any other arguments or internal state.

2) options will never be modified.

truncate would be considered buggy only if the second stance is taken, while content_tag_for, as in my example, would be buggy under either position, since it depends on the class of the second argument.

My personal vote goes for the second stance: leave my options alone!

Does Bill Gates use IE?

2009-03-01T00:00:00-05:00

Anyone who knows me personally is bound to know that I despise Windows (and Internet Explorer among other Microsoft products). I’m the first to admit that my hatred borders on irrationality. The fact that I’m a complete newbie on Windows probably doesn’t help either. I can count on my fingers the number of hours I spent playing/cursing on Windows. That being said, every single time I have to use Windows, I always wonder: does Bill Gates use it? What’s his reaction to all those things that pop up? Does he browse on Internet Explorer? Does he ever wonder if he just clicked properly and something is happening, or if the computer is just waiting for another click?

A couple of weeks ago, I was staying with my best friend’s family in the middle of the French Alps. They had internet through the owner’s extremely paranoid device that not only required a password to join the network but also needed a physical acknowledgment to allow the MAC address. I didn’t insist on having my trusty PowerBook blessed and instead accessed my mail with their Dell on Windows XP.

First task: browse. This was a machine owned by a reasonably technical person; Firefox was installed and the little keyboard gizmo was already there, giving me a quick way to switch from the french AZERTY keyboard to the US QWERTY. Side note: If anyone knows of a single reasonable motivation for changing the base layout for any latin language, please enlighten me!¹ Anyways, that’s not Microsoft’s fault and I’m glad I could switch easily. Well, most of the time. Sometimes I’d switch and the ‘FR’ just wouldn’t budge. Repeat, still no change. Again… still no change. After 5 or 6 attempts, woo-hoo, it changes. Another time it would change visually (the gizmo says ‘EN’) but the layout used when I entered text was still wrong. The menu disappeared altogether once. What am I supposed to do then?

Note that even if it worked more than half of the time, I’d still rant about the design lunacy of having this setting be per application. Why didn’t they make the only reasonable choice of a per session setting? Beats me. Luckily for me, I didn’t have to really use any other application besides Firefox.

Of course, I gave up entirely on typing any accents in my emails since I don’t know the dozen 3-digit codes I’d need. Did you know that on any mac you can type all special symbols with easily remembered keys, like alt-c for ç and alt-` + a,e,u for à,è,ù… ? That holding the shift key will yield the uppercase version, like alt-shift-c for Ç, and alt-` + A for À, … ? That Apple introduced this… 25 years ago? Before Windows even existed? Bonus point if you know the alt-code for É!

OK. Second task: Testing my luck, I thought uploading photos to Facebook would be fun. Alas when copying the photos from the USB keys, the machine would freeze about one time out of three. Better than my old compact flash adapter that would make any PC reboot when the card was 4 GB or bigger, but still! The reboot time was really long; actually most everything took forever. My five-year old powerbook was da-bomb compared to it. It took ages to copy everything to the Dell and I was finally able to upload stuff to Facebook.

After a couple of days, my website on Amazon EC2 froze and I then really wanted to have internet on my machine. We found an ethernet cable (note to self: always pack one) and enabled the internet sharing on the Dell. I would not have been able to find it myself, mind you. My friend Pascal showed me the intricate way². I’m still glad it was there at all! Like the keyboard switching, it would work a bit less than half the time. When it didn’t work, I had to go back in the settings, turn it off, click OK, wait for the window to close (~10 seconds), click ‘Advanced’ again, turn it on, click OK, wait some more (~ 1 minute!), and that would do the trick (most of the time, otherwise goto 10).

So back to my original thought: does Bill Gates use his computer at all? Presumably, he doesn’t change the keyboard layout a lot, type in french much, need to share an internet connection, or about anything worthwhile? Or else wouldn’t he see it doesn’t work properly? He must have the power to fix anything he wants, no? Even if he had to pay from his own pocket to have it fixed, what would it represent for him? He can buy 10 condos like mine, everyday, for the rest of his life without running out of money. If you had this money and power, wouldn’t you say “ok, I’ll just get that fixed”? Forget about making profits, forget about making things better for the planet. “Just fix it for me, yesterday, thank you very much”.

Note: I’ll try not to make more than one (or two?) rants against Microsoft per year!

1 The introduction of new letters (é, ß, …) justifies changes to the overall layout but I’m wondering why the common 26 letters couldn’t stay put. Geeks could still curse because the needed symbols [} and such would be placed differently, but for normal needs, there would be a common ground. Let’s thus focus on changes to letters only. One of the changes between AZERTY and QWERTY is a swap between the W and Z. These two letters are the two least frequently used in french. Ergo this is the swap, among all 325 possible swaps (ignoring the zillions of longer permutations), that will yield the least noticeable gain in efficiency! I leave the trivial proof as an exercise to the reader :-) If the most popular keyboard layout was Dvorak, I could see how a reasonable way to keep the layout optimized would yield different layouts depending on the language. The fact is, QWERTY is quite far from being optimized in any rational way. It’s reputed to have been designed to ensure that successive letters wouldn’t jam a typewriter. I call BS. The most frequent pairs of letters in english are th, he, an, re, er, in, on, at, nd, st, es, en, of, te, ed, or, ti, hi, as, to (source). You’ll notice that almost half of these are more or less adjacent (th, re, er, in, es, te, ed, as) while Dvorak has only th, st and hi that are (and the latter still fits Dvorak’s goals.) So not only is the “official” optimization practically obsolete, it’s not even fitting the bill. Anyways, all that to say:

QWERTY is a terrible layout in english
it’s not clear if it is worse in french or other latin languages, but small changes won’t lead to any noticeable gain and will confuse any globetrotter

Why, oh why, are we stuck in a world where not only do we use a bad layout, but we can’t stick with that bad layout for most latin languages that use A-Z?

2 On XP, it is: Control Panels -> Network Connections -> Local Area Connection -> Properties -> Advanced -> click on ‘Allow other network users to connect through this computer’s Internet connection’ -> OK

Compare that to OS X: System Preferences -> Sharing -> click on ‘Internet Sharing’

Not only will you notice that the number of operations is more than double on XP, but the choices to make are more difficult. On the mac, the only non trivial choice is between ‘Network’ and ‘Sharing’. On Windows, your first choice is between ‘Network Connections’ vs ‘Internet Options’ (and hesitation with ‘Network Setup Wizard’ and ‘Wireless Network Setup Wizard’). Since control panels are not grouped like on the mac, you have to consider all of them, if you don’t know what the answer is. On the mac, there are only 2 other possibilities under ‘Internet & Network’ besides ‘Network’ and ‘Sharing’. Then you have to select the currently active internet connection. It is the most probable choice but it’s not obvious which connection is the currently active one! Finally, you have to think of looking in the ‘Advanced’ tab. Undoubtedly, the most important difference is that on the mac… it works.

Thanks to Pascal for his comments on my first draft.

Please Write Ruby In Ruby

2009-02-27T00:00:00-05:00

I’m always surprised when I see bright people writing ruby code without using ruby’s standard lib. Do I need to point out that it’s less readable and more error prone?

I plead with all rubyists to re-read the doc for Array, Hash and Enumerable/Enumerator. Refer back to it. Use it. Please!

I was quite amazed to see the following code (written by an ex rails-core programmer, no less!). Check out the three methods and ask yourself what they do and how they should be written (the answers are in the second listing).

class Options < Hash
  #...
  def get_bar_settings
    bar_setting_keys.map do |bar_key|
      self[:bar][bar_key]
    end
  end

  def extract_is_cool!
    self[:is_cool] = options.has_key?(:is_cool) ?
                     options[:is_cool] : false
  end

  def check_validity(options)
    invalid_options = options.keys.select do |key|
      !VALID_OPTIONS.include?(key)
    end
    raise SomeError unless invalid_options.empty?
  end
  #...
end

class Options < Hash
  #...
  def get_bar_settings

    self[:bar].values_at(*bar_setting_keys)

  end

  def extract_is_cool!
    self[:is_cool] = options[:is_cool]
    # or options.fetch(:is_cool, false)
  end

  def check_validity(options)

    invalid_options = options.keys - VALID_OPTIONS

    raise SomeError unless invalid_options.empty?
  end
  #...
end

The extract_is_cool! method was actually not even needed because there was a merge!(options) later on, just adding insult to injury…
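For reference, here is what the three standard-library calls used in the rewrite do on some made-up data (the hashes and VALID_OPTIONS below are invented for the illustration):

```ruby
settings = { :bar => { :width => 10, :height => 20, :color => "red" } }
settings[:bar].values_at(:width, :color)   # => [10, "red"]

options = { :size => 3 }
options[:is_cool]                # => nil
options.fetch(:is_cool, false)   # => false, when a real false matters

VALID_OPTIONS = [:size, :is_cool]
{ :size => 3, :bogus => 1 }.keys - VALID_OPTIONS   # => [:bogus]
```

Note that options[:is_cool] and options.fetch(:is_cool, false) only differ in returning nil versus false for a missing key, which is why either can replace the original ternary in most code.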

Ruby Doesnt Dig Threads

2009-02-23T00:00:00-05:00

Either I’m missing something, or threads in both MRI and YARV just plain suck. My test program goes through a 10 MB file of random data, splits it in chunks (either 1K, 10K or 100K each). The results for MRI show the threaded version is much slower (~2x), in YARV performance is similar but usually slower for the threaded version. Mind you, I’m running this on 4 cores! rubinius looks like YARV on a valium overdose (20x slower…). Only in JRuby are things like what I expected, i.e. similar performance or faster for threads, with the difference being noticeable with more processing.

Code is here, detailed timings follow…

# Ruby 1.8.6:
process 0x 10kB, straight 1.150229
process 0x 10kB, threaded 1.343492
process 1x 10kB, straight 1.930851
process 1x 10kB, threaded 3.011537
process 2x 10kB, straight 3.014654
process 2x 10kB, threaded 4.519649
process 0x 100kB, straight 1.128152
process 0x 100kB, threaded 1.143609
process 1x 100kB, straight 1.948754
process 1x 100kB, threaded 2.245689
process 2x 100kB, straight 3.074676
process 2x 100kB, threaded 3.432552
process 0x 1000kB, straight 1.199003
process 0x 1000kB, threaded 3.646992
process 1x 1000kB, straight 2.606668
process 1x 1000kB, threaded 2.177998
process 2x 1000kB, straight 3.316180
process 2x 1000kB, threaded 3.706851

# Ruby 1.9.1:
process 0x 10kB, straight 1.343889
process 0x 10kB, threaded 1.490538
process 1x 10kB, straight 6.292696
process 1x 10kB, threaded 8.079034
process 2x 10kB, straight 11.767741
process 2x 10kB, threaded 15.155683
process 0x 100kB, straight 1.336428
process 0x 100kB, threaded 1.332375
process 1x 100kB, straight 6.467645
process 1x 100kB, threaded 6.359540
process 2x 100kB, straight 11.821027
process 2x 100kB, threaded 12.117181
process 0x 1000kB, straight 1.435732
process 0x 1000kB, threaded 1.784891
process 1x 1000kB, straight 6.212079
process 1x 1000kB, threaded 5.921470
process 2x 1000kB, straight 11.803677
process 2x 1000kB, threaded 11.386862

# JRuby
process 0x 10kB, straight 1.535674
process 0x 10kB, threaded 1.418075
process 1x 10kB, straight 2.900337
process 1x 10kB, threaded 3.036711
process 2x 10kB, straight 4.266761
process 2x 10kB, threaded 3.064340
process 0x 100kB, straight 1.555573
process 0x 100kB, threaded 1.365277
process 1x 100kB, straight 2.408831
process 1x 100kB, threaded 2.718737
process 2x 100kB, straight 3.930232
process 2x 100kB, threaded 2.891176
process 0x 1000kB, straight 3.688882
process 0x 1000kB, threaded 4.970055
process 1x 1000kB, straight 5.632520
process 1x 1000kB, threaded 3.801846
process 2x 1000kB, straight 6.860399
process 2x 1000kB, threaded 3.964439

# Rubinius
process 0x 10kB, straight 2.621673
process 0x 10kB, threaded 2.921372
process 1x 10kB, straight 85.343156
process 1x 10kB, threaded 84.173440
process 2x 10kB, straight 167.755588
process 2x 10kB, threaded 163.454284
process 0x 100kB, straight 2.838818
process 0x 100kB, threaded 2.764404
process 1x 100kB, straight 84.900132
process 1x 100kB, threaded ^C^C^CI'm bored

Note: it’s understandable that 1.9 is much slower than 1.8, because I process strings and only 1.9 deals with encodings.

Ruby Threads

2009-02-21T00:00:00-05:00

I’m pondering a really neat scheme for my upcoming FLV editor. My editor can be thought of as a series of processors acting on tags; the first processor reads them, then others analyse/modify them and the last one writes them. The scheme would need some sort of disconnection in the processing, either with continuations (which appear to be implemented two different ways in ruby 1.8 and 1.9) or threads. Which leads to the questions:

What’s the performance of a program that successively reads and writes chunks of data, compared to one where one thread reads and another one writes?

What about many.times { read; process; write } vs Thread { read } + Thread { process } + Thread { write }? Or doubling the processing (and the processing threads)?
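To make the two shapes concrete, here is a minimal sketch of what is being compared. It is not the benchmark script itself: the names straight, threaded, process and CHUNK_SIZE are made up for illustration, and the upcase call is only a placeholder for real tag processing. The threaded version assumes one thread per stage, connected by queues, with nil marking the end of the stream.

```ruby
require "stringio"
require "thread" # Queue; already loaded by default on recent rubies

CHUNK_SIZE = 10 * 1024 # hypothetical chunk size (10kB)

# Placeholder for the real work done on each chunk (assumption).
def process(chunk)
  chunk.upcase
end

# Straight version: one loop that reads, processes and writes in turn.
def straight(input, output, chunk_size = CHUNK_SIZE)
  while (chunk = input.read(chunk_size))
    output.write(process(chunk))
  end
end

# Threaded version: one thread per stage, chained by queues.
# A nil pushed on a queue tells the next stage that the stream is over.
def threaded(input, output, chunk_size = CHUNK_SIZE)
  to_process = Queue.new
  to_write   = Queue.new

  reader = Thread.new do
    while (chunk = input.read(chunk_size))
      to_process << chunk
    end
    to_process << nil
  end

  processor = Thread.new do
    while (chunk = to_process.pop)
      to_write << process(chunk)
    end
    to_write << nil
  end

  writer = Thread.new do
    while (chunk = to_write.pop)
      output.write(chunk)
    end
  end

  [reader, processor, writer].each(&:join)
end

# Both versions should produce the same output for the same input.
data = "flv tag " * 10_000
straight_out = StringIO.new
threaded_out = StringIO.new
straight(StringIO.new(data), straight_out)
threaded(StringIO.new(data), threaded_out)
puts straight_out.string == threaded_out.string
```

Since each stage has a single thread and the queues are first in, first out, the chunks come out in the order they were read. Doubling the processing threads would break that guarantee unless the chunks are numbered and reordered before writing.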

Results soon.

W3Schools Sucks

2008-10-18T00:00:00-04:00

I’m always mystified as to why w3schools is considered the HTML/CSS reference. It just sucks.

For example, check out the definition of position: absolute. Shouldn’t “its containing block” be “its containing positioned block”? Why is the example so simplistic as to be completely pointless?

Here’s a much better explanation.

Another example:

“The tag is partially supported in all major browsers.”

Geee, thanks. That sure helps a lot.

Update: Many reader comments (lost in migration, sorry) agreed with me, and some pointed out http://w3fools.com/

Please Don’t Abbreviate

2008-10-17T00:00:00-04:00

Abbreviation sucks. I’ll add famous people that agree with me here when I get the time. And if I find any!

I dislike the fact that ruby’s Time class has a mon method (c’mon!), but at least it is aliased by month. Now why oh why does ruby’s Time class have a min method and no minute method? Same goes for sec vs second. At least sec isn’t as ambiguous as min.
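For what it’s worth, the missing names are a two-line fix. The sketch below reopens Time to add minute and second as aliases of min and sec. Those two aliases are my own addition, not part of core ruby, and the guards leave any existing definition alone in case a library already provides one.

```ruby
# Reopen Time to add the unabbreviated names.
# minute and second are hypothetical additions; core Time only has min and sec.
class Time
  alias_method :minute, :min unless method_defined?(:minute)
  alias_method :second, :sec unless method_defined?(:second)
end

time = Time.utc(2008, 10, 17, 12, 34, 56)

puts time.mon    # → 10 (the abbreviated name)
puts time.month  # → 10 (the alias core ruby already provides)
puts time.minute # → 34 (same as time.min)
puts time.second # → 56 (same as time.sec)
```

Monkey patching a core class is heavy-handed for a naming preference, so take this as an illustration of how cheap the aliases would be rather than a recommendation.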

Can’t stand elsif (wow! one less character! impressive gain…), Enumerable.uniq, …

I’m counting on you to start a Facebook group “don’t abbreviate in ruby”!