More fun with the Ruby Symbol

So while I was in the #RubyOnRails IRC channel late at night (never a good idea), some silly conversation occurred:

(someone gists some code)
sevenseacat: empty array seems the silly thing to have
Tarential: sometimes I like to add empty arrays, you know, just in case I need them later
Tarential: so they’re close at hand
tagrudev: little kittens die when you define an unused var
brianpWins: but if you do want to leave empty arrays around. Make them a symbol so it only allocates once =) :[]
Tarential: Now that’s thinking.

Then some time passed and I ended up throwing a pastie back into the channel with the following code:

class Symbol
  def to_a
    @array ||= []
  end

  def <<(val)
    to_a << val
  end

  alias_method :previous_inspect, :inspect
  def inspect
    if to_a && to_s == "[]" 
      to_a.inspect
    else
      previous_inspect
    end
  end
end

:[]
=> []

:[] << 1 << 2 << 3 << 4
=> [1, 2, 3, 4]

:[].class
=> Symbol

:[].to_a
=> [1, 2, 3, 4]

:[]
=> [1, 2, 3, 4]
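One caveat if you try this today: in current Ruby versions symbols are frozen, so the `@array ||= []` assignment inside Symbol#to_a raises an error (FrozenError in Ruby 2.5 and later). Here is a minimal sketch of the same joke that keeps the arrays in a side table instead. The ARRAY_STORE constant is a name I made up for illustration; it is not part of the original pastie.

```ruby
class Symbol
  # Hypothetical side table: one backing array per symbol, created on first use.
  ARRAY_STORE = Hash.new { |store, sym| store[sym] = [] }

  def to_a
    ARRAY_STORE[self]
  end

  def <<(val)
    to_a << val
  end

  alias_method :previous_inspect, :inspect

  # Only :[] pretends to be an array; every other symbol inspects as usual.
  def inspect
    to_s == "[]" ? to_a.inspect : previous_inspect
  end
end

:[] << 1 << 2 << 3 << 4
p :[]          # prints [1, 2, 3, 4]
p :[].class    # prints Symbol
```

Still just as useless, but at least it runs without touching the frozen symbol itself.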

Mostly useless, but I did find out one interesting piece of information: Symbol has no public creation interface other than its literal definition (there is no Symbol.new). I’m assuming that at this point it drops down to the C level, but I couldn’t leave that stone unturned. I put the word out that I was in search of the Symbol implementation in Ruby: I asked some friends, tweeted about it and poked around on IRC.
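To make the "no constructor" point concrete, here is a small sketch you can paste into a console. The helper name symbol_new_error is mine, purely for illustration. Strictly speaking String#to_sym also hands you a symbol, but it is a conversion rather than a constructor, and it returns the very same object the literal does.

```ruby
# Returns the exception raised by calling Symbol.new, or nil if none was raised.
def symbol_new_error
  Symbol.new
  nil
rescue NoMethodError => e
  e
end

p symbol_new_error.class         # prints NoMethodError
p Symbol.respond_to?(:new)       # prints false
p :cats.equal?("cats".to_sym)    # prints true
p :cats.equal?(:"cats")          # prints true
```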

REPL in the Pry runtime environment
@banister (who brings you awesome projects like Pry) ended up giving me a nice intro to what happens when you put :cats into a Ruby runtime environment.

I’ll keep this brief and in layman’s terms. If you know more about this, feel free to email me with corrections or improvements. Keep in mind that this was explained to me at a very high level so it wouldn’t all go over my head.

At a very basic level the runtime is performing a simple REPL (Read, Evaluate, Print, Loop).

read --> eval --> puts #inspect on eval output --> loop

The Read simply reads in your input. In our case $ :cats

The Eval gets significantly deeper. When the value entered is a Symbol (or another immediate value), it is stored directly in the VALUE itself rather than behind a pointer to a separate object. @banister provided me with the following gist to show how the Symbol is stored: pry_gist.rb

(This is only a small piece of the total evaluation process.)

VALUE
rb_obj_id(VALUE obj)
{
    /*
     *                32-bit VALUE space
     *          MSB ------------------------ LSB
     *  false   00000000000000000000000000000000
     *  true    00000000000000000000000000000010
     *  nil     00000000000000000000000000000100
     *  undef   00000000000000000000000000000110
     *  symbol  ssssssssssssssssssssssss00001110
     *  object  oooooooooooooooooooooooooooooo00        = 0 (mod sizeof(RVALUE))
     *  fixnum  fffffffffffffffffffffffffffffff1
     *
     *                    object_id space
     *                                       LSB
     *  false   00000000000000000000000000000000
     *  true    00000000000000000000000000000010
     *  nil     00000000000000000000000000000100
     *  undef   00000000000000000000000000000110
     *  symbol   000SSSSSSSSSSSSSSSSSSSSSSSSSSS0        S...S % A = 4 (S...S = s...s * A + 4)
     *  object   oooooooooooooooooooooooooooooo0        o...o % A = 0
     *  fixnum  fffffffffffffffffffffffffffffff1        bignum if required
     *
     *  where A = sizeof(RVALUE)/4
     *
     *  sizeof(RVALUE) is
     *  20 if 32-bit, double is 4-byte aligned
     *  24 if 32-bit, double is 8-byte aligned
     *  40 if 64-bit
     */
    if (TYPE(obj) == T_SYMBOL) {
        return (SYM2ID(obj) * sizeof(RVALUE) + (4 << 2)) | FIXNUM_FLAG;
    }
    if (SPECIAL_CONST_P(obj)) {
        return LONG2NUM((SIGNED_VALUE)obj);
    }
    return (VALUE)((SIGNED_VALUE)obj|FIXNUM_FLAG);
}
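To convince myself the comment and the T_SYMBOL branch agree, I redid the arithmetic in Ruby. This is only a sketch of the snippet above, not of how every Ruby version computes object_id. The method names are mine, the symbol ID of 1000 is a made-up value standing in for SYM2ID(obj), and 40 is the 64-bit sizeof(RVALUE) from the comment.

```ruby
# Mirrors the T_SYMBOL branch of rb_obj_id shown above.
# symbol_id stands in for SYM2ID(obj), rvalue_size for sizeof(RVALUE).
def symbol_object_id_value(symbol_id, rvalue_size)
  fixnum_flag = 0x01
  (symbol_id * rvalue_size + (4 << 2)) | fixnum_flag
end

# A VALUE with its lowest bit set is a Fixnum; the integer is the remaining bits.
def fixnum_value(value)
  value >> 1
end

raw = symbol_object_id_value(1000, 40)  # hypothetical symbol ID, 64-bit RVALUE size
id  = fixnum_value(raw)                 # the number object_id would hand back
a   = 40 / 4                            # A = sizeof(RVALUE)/4 from the comment

puts raw                 # prints 40017
puts id                  # prints 20008
puts((id >> 1) % a)      # prints 4, matching "S...S % A = 4"
```

The last line strips the trailing 0 bit from the object_id layout to get S...S, which is s...s * A + 4, so the remainder is always 4 for a symbol and 0 for an ordinary heap object.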

He added:

Normal objects, created via Object.new for example, are normally allocated on the heap, and they go through a proper initialization process.

@banister
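You can see the difference between immediates and heap objects from plain Ruby by comparing object identity. This is a quick sketch under the assumption that you are using literals like :cats, small integers and nil, which are the cases the table above covers.

```ruby
# Immediates: the same literal always gives back the very same object.
p :cats.equal?(:cats)             # prints true
p 42.equal?(42)                   # prints true
p nil.equal?(nil)                 # prints true

# Heap objects: every Object.new allocates a fresh, distinct object.
p Object.new.equal?(Object.new)   # prints false

# object_id is stable for an immediate because it is derived from the value.
p :cats.object_id == :cats.object_id   # prints true
```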

Then the Print runs a simple #inspect on the Symbol and returns => :cats to you.

After that we Loop for more input.
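The whole cycle fits in a few lines of Ruby. This is a toy sketch of the idea, nothing like what Pry or IRB actually do internally; tiny_repl is a name I made up, and it takes its input and output as arguments so it can be driven from a string.

```ruby
require "stringio"

# Reads lines from input, evaluates each one, and prints the inspected result.
def tiny_repl(input, output)
  scope = binding                         # one evaluation context for all lines
  while (line = input.gets)               # Read (nil at end of input ends the Loop)
    result = scope.eval(line)             # Eval
    output.puts "=> #{result.inspect}"    # Print, using #inspect
  end
end

session = StringIO.new
tiny_repl(StringIO.new(":cats\n1 + 2\n"), session)
puts session.string
# prints:
# => :cats
# => 3
```

Calling tiny_repl($stdin, $stdout) instead gives you a bare-bones interactive prompt, minus error handling and everything else that makes a real REPL pleasant.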

So where does this leave us?
I ended up getting a great introduction to some things going on in my everyday environment that I simply take for granted. It all came out of some late-night nonsense and exploration, and now I feel like new doors have opened that I’m going to have to go through.

I encourage you to poke around. We write code all day long but really finding out what’s going on can be new and exciting. So open up a Ruby class and break something!