Monday 10 December 2007

Numeric Operators

Ruby allows built-in numeric types to co-exist seamlessly with user-defined types. To the client of a non-standard library, the interactions between the library and the built-in numeric types are largely invisible. This allows you to define a new type and pass instances of it as parameters to built-in methods. As an example, suppose we have implemented a class called MyNumber.

class MyNumber
...
end


If the right methods have been implemented on MyNumber then the following code will work - hopefully with the correct semantics.

x = MyNumber.new

30 + x
54.4 / x
0x80000000 << x


There are two strategies employed in the Numeric classes to achieve this:

Convert to Integer/Float
In some cases the parameter passed in must actually be of the same type as the Numeric. The bit shift operators are an example: it doesn't make sense to shift a value by a non-integer amount (or does it??). In these cases the Numeric class attempts to convert the parameter to the right type, here an Integer (or Fixnum to be more precise). This is done either directly, as in the case of Float, or indirectly by calling to_i or to_int on the parameter. So to allow your class to be used with the Fixnum#<< operator all you have to do is implement the integer conversion; note that the built-in operators in C Ruby look for the implicit conversion to_int here rather than to_i. There is a similar to_f method you can implement to convert to Float.
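As a rough illustration of the conversion route, here is a minimal sketch of a wrapper type that can be used as a shift amount. The class body and the wrapped @value are assumptions made for the example (the original MyNumber is elided), and it defines both to_int and to_i so that either lookup succeeds.

```ruby
# Hypothetical user-defined numeric type wrapping an integer value.
class MyNumber
  def initialize(value)
    @value = value
  end

  # Implicit integer conversion, which the built-in shift operator asks for.
  def to_int
    @value
  end

  # Explicit integer conversion, provided for completeness.
  def to_i
    @value
  end

  # Float conversion, used by Kernel#Float.
  def to_f
    @value.to_f
  end
end

x = MyNumber.new(4)
shifted  = 0x80000000 << x   # x is converted to the integer 4 before shifting
as_float = Float(x)          # Float() falls back to x.to_f
```

The built-in operator never needs to know anything about MyNumber beyond the conversion method it calls.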

Value Coercion
Sometimes it is simply necessary to ensure that the receiver of the method and the parameter have the same type. The aim of coercion is to convert built-in types to match the user-defined types. This is used frequently to allow Fixnum and Bignum objects to interact with Floats. For instance, in the following code, the Fixnum gets coerced up to a Float before the / operator is invoked. In this case it is the Float#/ operator that actually does the calculation.

3 / 4.3

This is achieved through implementing the coerce method. This method attempts to coerce its parameter to the same type as itself. The clever bit comes inside the Numeric class when a method is called that doesn't know about its parameter.
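The built-in classes already follow this protocol, so it can be observed directly. The snippet below is only an illustration of what coerce returns for the Fixnum and Float case; the variable names are mine.

```ruby
# Float#coerce converts its argument to a Float and returns
# [converted_argument, receiver].
pair = 4.3.coerce(3)

# Conceptually, 3 / 4.3 is then evaluated as pair[0] / pair[1],
# which is handled by Float#/.
a, b = pair
quotient = a / b
```

Both elements of the pair are Floats, which is the whole point: after coercion the operator is working with two values of the same type.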

Consider the following code:

class MyNumber
...
end

x = MyNumber.new
3 / x


The Fixnum#/ operator is invoked but it doesn't know what to do with a parameter of type MyNumber. Instead it calls MyNumber#coerce, passing in the self object (in this case the Fixnum 3) as the parameter. coerce returns a pair of values of compatible type, and the / operator is then invoked on the first of them with the second as its argument.
So, assuming the MyNumber class implements both MyNumber#coerce (which coerces Fixnums to MyNumber instances) and MyNumber#/, the above code will work. Again, hopefully with reasonable semantics.
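To make that concrete, here is a minimal sketch of what such a class might look like. The wrapped value and the division semantics are assumptions for the sake of the example, not the original MyNumber.

```ruby
# Hypothetical MyNumber that simply wraps a built-in number.
class MyNumber
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Called by the built-in numeric as x.coerce(12). Returns
  # [argument converted to MyNumber, self] so that the operator
  # can be retried on the first element.
  def coerce(numeric)
    [MyNumber.new(numeric), self]
  end

  # Division between two MyNumber instances (or a MyNumber and a plain number).
  def /(other)
    divisor = other.is_a?(MyNumber) ? other.value : other
    MyNumber.new(value / divisor)
  end
end

x = MyNumber.new(4)
result = 12 / x   # 12 is coerced to MyNumber(12), then MyNumber#/ is called
```

Note that the result is a MyNumber, not an Integer: once coercion has happened, it is the user-defined operator that decides the semantics.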

IronRuby Implementation
This is all well and good but what about implementing this mechanism in the IronRuby libraries?

The convert to Integer implementation is fairly straightforward: we can call Protocols.ConvertToInteger and let it do the work of invoking to_i or to_int accordingly. E.g.

[RubyMethod("|")]
public static object BitwiseOr(CodeContext/*!*/ context, int self, object other) {
    other = Protocols.ConvertToInteger(context, other);
    return self | Protocols.CastToFixnum(context, other);
}


In the case of coercion I wrote a bit of helper code in the Numeric class that will do the coercion and then call the original operator or method:

public static object CoerceAndCall(CodeContext/*!*/ context, object self, object other, DynamicInvocation invoke) {
    RubyArray coercedValues;
    try {
        // Swap self and other around to do the coercion.
        coercedValues = NumericSites.Coerce(context, other, self);
    } catch (MemberAccessException x) {
        throw MakeCoercionError(context, self, other, x);
    } catch (ArgumentException x) {
        throw MakeCoercionError(context, self, other, x);
    }
    // But then swap them back when invoking the operation
    return invoke(context, coercedValues[0], coercedValues[1]);
}


This method invokes Coerce on the other object passing in self and then invokes a method that you passed in, which corresponds to the original method called.
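For readers more at home in Ruby than C#, the same flow can be paraphrased roughly as follows. This is a sketch of the idea only: coerce_and_call, the Metres type and the error message are all made up for the example, and the exceptions rescued are Ruby stand-ins for the .NET ones above.

```ruby
# Ruby paraphrase of the helper: coerce via `other`, then retry `op`.
def coerce_and_call(receiver, other, op)
  begin
    # Ask the unknown argument to coerce the receiver.
    first, second = other.coerce(receiver)
  rescue NoMethodError, ArgumentError => e
    raise TypeError, "#{other.class} can't be coerced into #{receiver.class} (#{e.message})"
  end
  # The coerced receiver becomes the new self; the operator name is unchanged.
  first.public_send(op, second)
end

# Hypothetical type used only to exercise the helper.
Metres = Struct.new(:n) do
  def coerce(numeric)
    [Metres.new(numeric), self]
  end

  def %(other)
    Metres.new(n % other.n)
  end
end

result = coerce_and_call(7, Metres.new(4), :%)
```

Passing the operator in as a parameter is what lets one helper serve every binary operator, in the same way the delegate does in the C# version.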

The delegate defining what can be passed in is fairly generic:

public delegate object DynamicInvocation(CodeContext/*!*/ context, object self, object other);

This allows you to do the following for the Fixnum#% operator, when the other parameter is not of a known type:

[RubyMethod("%")]
public static object ModuloOp(CodeContext/*!*/ context, object self, object other) {
    return Numeric.CoerceAndCall(context, self, other, NumericSites.ModuloOp);
}


Operator Aliasing
In the case where an operator has been aliased, such as Fixnum#modulo and Fixnum#%, we have to be careful. Ruby insists that the same operator/method is invoked after the coercion regardless of aliasing. This means that despite Fixnum aliasing modulo and % to use the same implementation, MyNumber could implement both separately even with different semantics.

Therefore, when implementing Fixnum in IronRuby, modulo and % must be implemented as separate methods for the overload where the parameter is of arbitrary type.

[RubyMethod("%")]
public static object ModuloOp(CodeContext/*!*/ context, object self, object other) {
    return Numeric.CoerceAndCall(context, self, other, NumericSites.ModuloOp);
}

[RubyMethod("modulo")]
public static object Modulo(CodeContext/*!*/ context, object self, object other) {
    return Numeric.CoerceAndCall(context, self, other, NumericSites.Modulo);
}
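To see why the two names have to stay distinct, here is a small Ruby model of the dispatch. It does not call the built-in Integer methods; fixnum_percent and fixnum_modulo are hypothetical stand-ins for the two arbitrary-parameter overloads, and MyNumber deliberately gives % and modulo different results.

```ruby
# Hypothetical type whose % and modulo mean different things.
class MyNumber
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def coerce(numeric)
    [MyNumber.new(numeric), self]
  end

  def %(other)
    :percent_called
  end

  def modulo(other)
    :modulo_called
  end
end

# Stand-in for the "%" overload: after coercion it forwards "%".
def fixnum_percent(receiver, other)
  a, b = other.coerce(receiver)
  a % b
end

# Stand-in for the "modulo" overload: after coercion it forwards "modulo".
def fixnum_modulo(receiver, other)
  a, b = other.coerce(receiver)
  a.modulo(b)
end

via_percent = fixnum_percent(7, MyNumber.new(2))
via_modulo  = fixnum_modulo(7, MyNumber.new(2))
```

If both names were routed through a single implementation, one of these two calls would end up in the wrong MyNumber method.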


They can of course share implementation for overloads that have known parameters:

[RubyMethod("%"), RubyMethod("modulo")]
public static object ModuloOp(int self, int other) {
    RubyArray divmod = DivMod(self, other);
    return divmod[1];
}


Note the multiple RubyMethod attributes in this method where other is int (Fixnum).

Monday 3 December 2007

Method overloading

In IronRuby you can create a number of different implementations of a method that each take different parameters, much as you can in C#.
For example, BignumOps has a * (multiply) method of the form: self * other.
Here are the overloads:

[RubyMethod("*")]
public static object Multiply(BigInteger self, BigInteger other)
[RubyMethod("*")]
public static object Multiply(BigInteger self, double other)
[RubyMethod("*")]
public static object Multiply(CodeContext/*!*/ context, BigInteger self, object other)


As you can see, if Multiply is called with a BigInteger (or an int, for that matter) as the "other" parameter, the first method is executed. If it is a double, which maps to Float in Ruby, the second is executed. Any other type invokes the third implementation. (The context parameter is a hidden support mechanism that provides access to things such as the dynamic invocation mechanism.)

In other words the DLR is responsible for selecting the correct method overload to call at runtime based on the number and type of parameters.

This is different from how overloading is achieved in the standard Ruby implementation. There you have just one method, and that method works out for itself which implementation to execute. This is the equivalent C function in the C Ruby implementation:

VALUE
rb_big_mul(x, y)
    VALUE x, y;
{
    long i, j;
    BDIGIT_DBL n = 0;
    VALUE z;
    BDIGIT *zds;

    if (FIXNUM_P(x)) x = rb_int2big(FIX2LONG(x));
    switch (TYPE(y)) {
      case T_FIXNUM:
        y = rb_int2big(FIX2LONG(y));
        break;

      case T_BIGNUM:
        break;

      case T_FLOAT:
        return rb_float_new(rb_big2dbl(x) * RFLOAT(y)->value);

      default:
        return rb_num_coerce_bin(x, y);
    }

    j = RBIGNUM(x)->len + RBIGNUM(y)->len + 1;
    z = bignew(j, RBIGNUM(x)->sign==RBIGNUM(y)->sign);
    zds = BDIGITS(z);
    while (j--) zds[j] = 0;
    for (i = 0; i < RBIGNUM(x)->len; i++) {
        BDIGIT_DBL dd = BDIGITS(x)[i];
        if (dd == 0) continue;
        n = 0;
        for (j = 0; j < RBIGNUM(y)->len; j++) {
            BDIGIT_DBL ee = n + (BDIGIT_DBL)dd * BDIGITS(y)[j];
            n = zds[i + j] + ee;
            if (ee) zds[i + j] = BIGLO(n);
            n = BIGDN(n);
        }
        if (n) {
            zds[i + j] = n;
        }
    }

    return bignorm(z);
}


You can see here that there is a lot of type checking and conversion going on. I think that the IronRuby/DLR way is a much more pleasant and maintainable way of developing.
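The single-method style is easy to picture in Ruby terms too. The sketch below is a made-up wrapper (BigValue is not a real class) whose * does its own type switch, in the same shape as the switch (TYPE(y)) in the C function; it is only meant to show what the overloads save you from writing by hand.

```ruby
# Hypothetical wrapper; the case statement plays the role of switch (TYPE(y)).
class BigValue
  attr_reader :n

  def initialize(n)
    @n = n
  end

  def *(other)
    case other
    when BigValue then BigValue.new(n * other.n)   # same type: multiply directly
    when Integer  then BigValue.new(n * other)     # plain integer: wrap the product
    when Float    then n.to_f * other              # float: fall back to floating point
    else
      # Unknown type: let it coerce us, then retry the operator.
      a, b = other.coerce(self)
      a * b
    end
  end
end

int_product   = BigValue.new(6) * 7
float_product = BigValue.new(6) * 0.5
```

Each branch here corresponds to what would be a separate overload in the IronRuby style, with the else branch standing in for the arbitrary-object overload.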

One thing that did worry me though was what happens if someone monkey patches your class. For instance what if I did the following in a Ruby program?

class Bignum
  def *(other)
    puts "hello"
    self + other
  end
end


In the C Ruby implementation, the whole C function is overridden and so all that complex type conversion goes out the window and you are left with the simple method given:

>> 0x80000000 * 2
hello
=> 2147483650
>> 0x80000000 * 0.3
hello
=> 2147483648.3


What would happen in the IronRuby case?

I felt it could have gone either way. At first I expected that it would just add a method with the equivalent of the following signature:

[RubyMethod("*")]
public static object Multiply(CodeContext/*!*/ context, BigInteger self, object other)


In actual fact, it appears that all the overloaded methods are wiped out. This is perhaps common sense: the DLR now has no type information on the parameters to distinguish the new Ruby method from those written in C#, so it just blats them all.

Even in cases where different overloads accepted different numbers of parameters, so that you could in principle distinguish between them by argument count, those get wiped out too. You just get the one method, and calling it with a different number of parameters raises an ArgumentError.
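The end state is the same as what you see when you redefine a method in plain Ruby. The sketch below uses a made-up Scaler class rather than Bignum, and it runs on any Ruby rather than exercising IronRuby's overload handling, so treat it as an illustration of the observable behaviour only.

```ruby
class Scaler
  # Original definition accepts one or two arguments.
  def scale(value, factor = 2)
    value * factor
  end
end

before = Scaler.new.scale(3, 5)

# Reopening the class replaces the method outright; the old arity is gone.
class Scaler
  def scale(value)
    value + 1
  end
end

after = Scaler.new.scale(3)

# Calling with the old argument count now fails.
error = begin
  Scaler.new.scale(3, 5)
  nil
rescue ArgumentError => e
  e
end
```

After the second definition there is exactly one scale method, and the two-argument call is rejected rather than falling back to the earlier version.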

So all is good and you don't have to worry about spreading your code out into multiple overloaded methods to make it easier to read.