The "letters drawn at random" story is true for the lambda in lambda-calculus, but not for (at least some) combinators: e.g. the K combinator derives from Konstanzfunktion, and I from Identitätsfunktion[1]. So I do think it's an interesting question why Y; we can't just assume it's arbitrary.
Then perhaps this reply comes closest to the truth:
> Y because the letter Y has one stem which splits in two, just like what the function does. That’s probably the reason, or at least that’s how I look at it.
I still think it's a device of rather dubious practical value. The most practical use of it I've seen is being able to do it at the type level (in one of Kiselyov's articles), because honestly, if you are allowed to name things at all, getting a function to refer to itself is pretty straightforward:
factorial' self n =
  if n == 0
    then 1
    else n * self self (n - 1)

factorial n = factorial' factorial' n
That's it. Unless your evaluation strategy is literally implemented as term substitution/rewriting, it's about as efficient as having an actual letrec primitive.
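For anyone who wants to run it: here is the same trick as a minimal Python sketch (Python stands in because the untyped self-application doesn't typecheck in plain Haskell; the names are mine):

```python
# Self-application recursion: the step function receives itself as an
# argument and calls "self(self, ...)" instead of using its own name.
def factorial_step(self, n):
    return 1 if n == 0 else n * self(self, n - 1)

def factorial(n):
    return factorial_step(factorial_step, n)

print(factorial(5))  # 120
```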
This technique is straightforwardly extended to the case of mutual recursion:
even' even odd n =
  if n == 0
    then True
    else odd even odd (n - 1)

odd' even odd n =
  if n == 0
    then False
    else even even odd (n - 1)

even n = even' even' odd' n
odd n = odd' even' odd' n
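The mutual-recursion version, again as a runnable Python sketch (hypothetical names; each step function receives both functions and threads them through every call):

```python
# Mutual recursion via explicit self-passing: no function refers to
# itself or its partner by name, only through the passed-in arguments.
def even_step(ev, od, n):
    return True if n == 0 else od(ev, od, n - 1)

def odd_step(ev, od, n):
    return False if n == 0 else ev(ev, od, n - 1)

def is_even(n):
    return even_step(even_step, odd_step, n)

def is_odd(n):
    return odd_step(even_step, odd_step, n)

print(is_even(10), is_odd(7))  # True True
```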
"if you are allowed to name things at all, getting the function to refer to itself is pretty straightforward"
The neat thing about the Y-combinator is that it allows recursion to be defined in systems, such as the lambda calculus, which don't have naming and therefore a function can't refer to itself by name.
Well, lambda calculus does have names, but yes, I agree, it's a very neat trick in environments where naming is heavily restricted, it gives you "anonymous recursion", so to speak. Church-encoding is another similarly neat trick, for environments with substitutions/applications but without built-in natural numbers.
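For comparison, a minimal Church-encoding sketch in Python (a numeral n is represented as "apply f to x n times"; all names here are mine):

```python
# Church numerals: the number n is the function that applies f n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

# Decode back to a built-in int by counting applications.
to_int = lambda n: n(lambda k: k + 1)(0)

two = succ(succ(zero))
print(to_int(add(two)(two)))  # 4
```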
It's just that it seems there are not that many such systems used in practice except for "advanced type systems".
The applicative-order Y-combinator that Mike derives at the end, λf.(λx.f (λy.x x y))(λx.f (λy.x x y)), can be straightforwardly compiled into SKI-combinators, which don't have naming at all.
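That applicative-order version (often called the Z combinator) also transcribes directly into any strict language; a Python sketch:

```python
# Z = \f.(\x. f (\y. x x y)) (\x. f (\y. x x y))
# The eta-expansion (\y. x x y) delays the self-application so a strict
# language doesn't loop while building the fixed point.
Z = lambda f: (lambda x: f(lambda y: x(x)(y)))(lambda x: f(lambda y: x(x)(y)))

fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
print(fact(5))  # 120
```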
I agree that it's rarely justifiable to actually run a translated version of this code, but sometimes it gives you an easy proof of non-termination for some kind of formal system, which often gives you an easy proof of undecidability, which can save you a lot of time trying to figure out how to compute the uncomputable. Or it may persuade you that adding some feature to your design is a bad idea because it eliminates termination guarantees.
If all you need is a proof of non-termination, then encoding the whole Y combinator is entirely superfluous: "(\f. f f) (\f. f f)" is quite enough already. In the case of the SKI calculus, that translates to "SII(SII)".
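In a strict language that omega term diverges all by itself; a Python sketch (the interpreter's stack limit stands in for "runs forever"):

```python
# Omega = (\f. f f) (\f. f f): every reduction step reproduces the term.
omega = lambda f: f(f)
try:
    omega(omega)  # diverges; Python's recursion limit cuts it short
except RecursionError:
    print("no normal form reached")
```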
That's true! But it's easy to think that that's something you can fix (no pun intended) because each reduction just takes you back to the same state again, which is easy to recognize as an infinite loop. After all, there are lots of things like finite state machines that can have infinite loops but no decidability difficulties. By contrast, the Y-combinator (together with the rest of the usual suspects) means you can write programs whose termination behavior is undecidable.
As the article points out, the value of the Y combinator is largely theoretical/aesthetic, and in both of those respects your examples suffer greatly: they are ad hoc, lack abstraction/code reuse (which is why, perhaps, you needed more examples), and they look ugly (at least to my untrained eye). The point of the Y combinator is that it is a beautiful, mathematically precise, abstract, purely-functional way of expressing recursion.
"Ad hoc, lack abstraction/code reuse"? How does this
fix f = f f
almost_factorial f n =
  if n == 0
    then 1
    else n * f f (n - 1)

factorial = fix almost_factorial
lack code reuse compared to
fix f = (\x. x x) (\x. f (\y. x x y))
almost_factorial f n =
  if n == 0
    then 1
    else n * f (n - 1)

factorial = fix almost_factorial
? As for ugliness, well, it is indeed in the eye of the beholder: I personally think the fixpoint combinators that enable mutual recursion are pretty ugly, even more so than "(\x. x x) (\x. f (\y. x x y))".
The only real problem is that in my approach the recursive calls look like "f closed_over_functions... new_args..." instead of "f new_args...", but that's what compilers are for: this transformation is called "closure conversion" and is pretty straightforward. Sure, if you have to encode those things manually, then perhaps using Y is clearer, and it may even be the only option if you can't mess with the original definitions.
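To make the comparison concrete, both fixpoint helpers can be sketched side by side in Python (hypothetical names; note how the self-application version threads f to itself explicitly, which is exactly the call shape a closure-converting compiler would produce):

```python
def fix_self(f):          # fix f = f f
    return lambda n: f(f, n)

def almost_factorial_self(f, n):
    # recursive calls pass f along: "f closed_over... new_args..."
    return 1 if n == 0 else n * f(f, n - 1)

def fix_z(f):             # fix f = (\x. x x) (\x. f (\y. x x y))
    return (lambda x: x(x))(lambda x: f(lambda y: x(x)(y)))

def almost_factorial_z(f):
    # recursive calls look like plain "f new_args..."
    return lambda n: 1 if n == 0 else n * f(n - 1)

print(fix_self(almost_factorial_self)(5))  # 120
print(fix_z(almost_factorial_z)(5))        # 120
```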
System F isn't consistent as a logic (pretty much precisely because it has general recursion). In languages with general recursion, you can do things like
Extensions with a letrec-like construct are common, and are sometimes inaccurately called 'System F', but those languages do not have the properties of System F.
You're right. I must have been thinking of one of those extensions you're talking about (F#, maybe?). I should have remembered that System F is part of the lambda cube, so it's at least as consistent as CoC.
No, they are not. Do you know what an "interface" is in Java, what a "class" is, and what a "method" is?
Interface Func doesn't refer to FuncToFunc in its declaration, and interface FuncToFunc doesn't refer to Func. The method Y does refer to those two interfaces, but it's declared after them, and they don't (and can't, actually) refer to it.
https://tromp.github.io/cl/diagrams.html