The Y-Combinator Derived

The Y-Combinator was something that came up in my Programming Languages class at Rice. It was bizarre and inscrutable to me, and I never really acquired a deep understanding of it. Frankly, it doesn't have a lot of practical application, but it is interesting. And Douglas Crockford said something like, "If you understand this, you can call yourself a computer scientist." Well, I certainly want to be able to do that. And maybe this post will create a few more computer scientists. I hope so.

First, I want to note that I depended heavily on this page for my derivation steps. I changed them as I saw fit, but it was nice to have them to refer to when I screwed up.

My intent is to derive the Y-Combinator in a way that makes it obvious what's going on. This has been done quite a few times (and in quite a few languages), but the derivations I've seen use meaningless function and variable names, and I find that makes them hard to follow. So I'm taking my stab at it. I've been writing in JavaScript lately, so that's what I'm using here.

The first step on that road is to understand what we're deriving. The Y-Combinator is a function that provides "anonymous recursion". That is, if your language doesn't provide a way for a function to call itself while it is being defined, the Y-Combinator can give you that.

Specifically, let's say you have a function that can be recursive only because you tell it how to call itself: it takes itself as its first argument.

function factorial (myself, n) {
 return n <= 2 ? n : n * myself(myself, n - 1);
}
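A quick usage sketch for the function above: because factorial takes itself as its first parameter, every call (including the outermost one) has to pass it along. The variable name result is only there for illustration.

```javascript
// The self-passing factorial from above.
function factorial (myself, n) {
  return n <= 2 ? n : n * myself(myself, n - 1);
}

// The caller has to hand the function to itself on every call.
var result = factorial(factorial, 5);
console.log(result); // 120
```

That extra argument on every call is the annoyance the rest of the derivation removes.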

What you want, though, is a function that doesn't have to tell itself how to call itself every time.

// This doesn't work. We need "myself" to be defined somehow
function factorial_wish (n) {
 return n <= 2 ? n : n * myself(n - 1);
}

The first step is to "curry" myself out of the original function:

function curried_function (curried_self) {
 return function factorial (n) {
   return n <= 2 ? n : n * curried_self(curried_self)(n - 1);
 }
}

With this, curried_function(curried_function) returns the factorial function you want, which you can then call with a single argument, and it will work:

var working_factorial = curried_function(curried_function);
working_factorial(5) // returns 120

But what we really want is for the factorial function to reference itself, and not the curried outer function, so that it would be written more like factorial_wish above.

We do that by wrapping a function around it that will define myself for it, and will take care of calling curried_self with itself.

function curried_v2 (curried_self) {
 return function factorial (userArgument) {
   var inner = function(myself) {
     return function(n) {
       return n <= 2 ? n : n * myself(n - 1);
     };
   };
   var myfact = inner(curried_self(curried_self));
   return myfact(userArgument);
 };
}

Now the innermost function is in the form we want; the function that wraps it, inner, defines its myself for us. The function that wraps that is our new factorial function, which will be what we pass an argument to, and which takes care of the curried mess.
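It may not be obvious why the function (userArgument) layer is needed at all. Here is a minimal sketch of what appears to go wrong without it; the names curried_eager and overflowed are made up for this illustration. JavaScript evaluates a function's arguments before calling it, so curried_self(curried_self) runs immediately, which lands on the same line again, and so on until the stack runs out.

```javascript
var inner = function (myself) {
  return function (n) {
    return n <= 2 ? n : n * myself(n - 1);
  };
};

// Hypothetical variant of curried_v2 with the function (userArgument) wrapper removed.
function curried_eager (curried_self) {
  // curried_self(curried_self) is evaluated right here, before inner ever runs,
  // and that evaluation comes straight back to this same line.
  return inner(curried_self(curried_self));
}

var overflowed = false;
try {
  curried_eager(curried_eager);
} catch (e) {
  // Engines typically report this as a stack-overflow error (a RangeError in V8).
  overflowed = true;
}
console.log(overflowed); // true
```

Wrapping the call in function (userArgument) { ... } delays curried_self(curried_self) until somebody actually asks for a result, which is what lets the recursion stop at the base case.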

The use of the variables inner and myfact in curried_v2 is not strictly necessary, but it helps make clear what's going on. In programming, it is often bad practice to combine multiple operations into one monolithic, incomprehensible statement. In math, though, that is called "reducing", and it's good. The Y-Combinator comes from a branch of math called lambda calculus; we're just trying to understand it in terms of programming.

The calling of this hasn't improved. You still have to call it with itself to get your factorial function out:

var working_factorial = curried_v2(curried_v2);
working_factorial(5) // returns 120

What has improved is that the inner function is independent of the stuff it's wrapped in, so it can be pulled out. You'll note that it looks exactly like the factorial_wish function from earlier, wrapped in a function that takes a myself. What remains is (almost) the goop we're looking for.

var inner = function(myself) {
 return function(n) {
   return n <= 2 ? n : n * myself(n - 1);
 };
};

function curried_v2a (curried_self) {
 return function factorial (userArgument) {
   // inner was here, and is still needed
   var myfact = inner(curried_self(curried_self));
   return myfact(userArgument);
 };
}

So the next thing to do is to pass inner to our function, so it's not hard-wired to look at some global(ish) variable. We'll do this by (of course!) wrapping it in yet another function layer, where we can do the double call, as well:

function basically_Y (inner) {
 function curried_v2a (curried_self) {
   return function (userArgument) { // not necessarily factorial, any more
     var myfact = inner(curried_self(curried_self));
     return myfact(userArgument);
   };
 }
 return curried_v2a(curried_v2a);
}

To use this, we pass in the inner function we previously extracted (or any function that is built like it), and we will get back a well-behaved recursive function that knows how to call itself.

var myfact = basically_Y(inner); // using the inner defined earlier
myfact(5)
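To illustrate the "any function that is built like it" part, here is a sketch that feeds a different function to basically_Y. The names fib_inner and fib are hypothetical, chosen for this example; basically_Y is repeated so the snippet runs on its own.

```javascript
function basically_Y (inner) {
  function curried_v2a (curried_self) {
    return function (userArgument) {
      var myfact = inner(curried_self(curried_self));
      return myfact(userArgument);
    };
  }
  return curried_v2a(curried_v2a);
}

// Hypothetical example: Fibonacci, written in the same shape as inner.
// It takes a myself and returns the function that does the real work.
var fib_inner = function (myself) {
  return function (n) {
    return n < 2 ? n : myself(n - 1) + myself(n - 2);
  };
};

var fib = basically_Y(fib_inner);
console.log(fib(10)); // 55
```

Nothing in basically_Y knows about factorials or Fibonacci numbers; it only supplies the myself.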

All that's really left is generalizing it to handle more than one userArgument.

function Y (inner) {
 function curried_v2a (curried_self) {
   return function (/* any number of user arguments */) {
     // .apply(null, arguments) is how you pass along all arguments
     return inner(curried_self(curried_self)).apply(null, arguments);
   };
 }
 return curried_v2a(curried_v2a);
}
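As a sketch of why the multi-argument version matters, here is a two-argument recursive function run through Y. The gcd example (Euclid's algorithm) is my own choice for illustration; Y is repeated so the snippet is self-contained.

```javascript
function Y (inner) {
  function curried_v2a (curried_self) {
    return function (/* any number of user arguments */) {
      return inner(curried_self(curried_self)).apply(null, arguments);
    };
  }
  return curried_v2a(curried_v2a);
}

// Hypothetical example: greatest common divisor, which needs two arguments.
var gcd = Y(function (myself) {
  return function (a, b) {
    return b === 0 ? a : myself(b, a % b);
  };
});

console.log(gcd(48, 18)); // 6
```

With basically_Y, the second argument would have been silently dropped, because its wrapper only forwards userArgument.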

I went ahead and substituted in for myfact, since it's not all that complex (and this is no longer just about factorials), but I have kept the curried_v2a variable.

To get to the no-variables form, we do a little re-writing: Instead of defining curried_v2a, we create a function that takes it as a parameter, and then pass its definition to that function. The function does the calling-it-on-itself part, and we're done!

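function Y (inner) {
 // The parentheses around the first function are conventional for an
 // immediately invoked function; they don't change what it does.
 return (function(curried_v2a) {
   return curried_v2a(curried_v2a);
 })(function(curried_self) {
   return function() {
     return inner(curried_self(curried_self)).apply(null, arguments);
   };
 });
}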

It's still incomprehensible, but if you've been through the derivation, the parameter names might jog your memory about what's what.
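To see the "anonymous" part pay off, here is a sketch that uses the final Y on a function that never refers to itself by name. Y is written with the immediately invoked function wrapped in parentheses, which does not change its behavior. The arrow-function version at the end is an assumption on my part (it needs an ES2015 or later engine), and the names result, Y_arrow and result_arrow are only for illustration.

```javascript
function Y (inner) {
  return (function (curried_v2a) {
    return curried_v2a(curried_v2a);
  })(function (curried_self) {
    return function () {
      return inner(curried_self(curried_self)).apply(null, arguments);
    };
  });
}

// No function below is ever called through its own name:
// the recursion goes through the myself parameter that Y supplies.
var result = Y(function (myself) {
  return function (n) {
    return n <= 2 ? n : n * myself(n - 1);
  };
})(5);
console.log(result); // 120

// The same shape in ES2015 arrow syntax, keeping the parameter names
// from the derivation. Rest/spread stands in for .apply(null, arguments).
var Y_arrow = inner =>
  (curried_v2a => curried_v2a(curried_v2a))(
    curried_self => (...args) => inner(curried_self(curried_self))(...args)
  );

var result_arrow = Y_arrow(myself => n => (n <= 2 ? n : n * myself(n - 1)))(5);
console.log(result_arrow); // 120
```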