Hacker News

The problem with stack languages for me has always been in typing. I find it difficult to track the exact stack effects of each word (function), particularly if these functions are pushing, popping and applying functions that themselves modify the stack. Even when dealing with something as simple as optimized x87 code, I would annotate all the FXCH etc. to keep track of the bits of the arithmetic expression on the stack, as I tried to keep multiple operations in flight.

Yes, I know Christopher Diggins' Cat is statically typed. The problem isn't compiler diagnostics; it's readability.

In a stack language, an expression as simple as 'a b c d' is cryptic. If c consumes two arguments, in applicative terms it would be d(c(b, a)); if b consumes one argument and c only one, then it would be d(c(b(a))); if d consumes two arguments, b one, and c none, then it would be d(c, b(a)). And this ambiguity makes it very hard for me to read the code unless I'm intimate with all the individual words. You have to understand everything to understand anything.
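That arity-dependence can be made concrete with a tiny evaluator sketch in Python (the particular numbers and operations are made up purely for illustration). Each word declares how many items it pops; the same four-word shape produces completely different call trees depending on those arities:

```python
# Minimal stack evaluator: a word is (arity, fn); it pops `arity` items
# (top of stack first) and pushes fn's result.
def run(program):
    stack = []
    for arity, fn in program:
        args = [stack.pop() for _ in range(arity)]
        stack.append(fn(*args))
    return stack

def lit(v):
    # A literal is just a 0-ary word that pushes its value.
    return (0, lambda: v)

# Reading 1: a and b are literals, c is binary, d is unary -> d(c(b, a))
r1 = run([lit(10), lit(3), (2, lambda b, a: a - b), (1, lambda x: -x)])

# Reading 2: a is a literal, b and c are unary, d is unary -> d(c(b(a)))
r2 = run([lit(10), (1, lambda a: a + 1), (1, lambda x: 2 * x), (1, lambda x: -x)])

print(r1, r2)  # [-7] [-22]: same four-word sequence, two different call trees
```

Nothing in the surface text 'a b c d' distinguishes the two readings; you have to know every word's stack effect.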

Unfortunately, I suspect that concatenative languages tend to derive their power from this ambiguity, because the lack of specificity leaves considerable flexibility for what the words can do with the stack.

So my considered opinion for now - and this article, which is mostly a tutorial, doesn't really change it - is that stack languages are best as a compiler target rather than as a source form for humans.



The ambiguity produced by stack effects has a parallel in most infix procedural languages. C++ recognizes 18 levels of operator precedence[1], with two kinds of associativity, when evaluating expressions, and any of those operators can be overloaded to produce side effects for different combinations of types. Sometimes this behavior is intuitive; other times it can be deeply confusing.

For what it's worth, my experience is that practice and good factoring can make concatenative languages much less cryptic. In Forth, the ideal size of a procedure is about a line, and procedures that take more than three arguments are considered a "code smell" that suggests a different factoring might be more appropriate. When you only care about the top few stack elements within any given definition there's a lot less to juggle mentally.

[1] http://en.cppreference.com/w/cpp/language/operator_precedenc...


> The ambiguity produced by stack effects is also paralleled in most infix procedural languages. C++ recognizes 18 levels of operator precedence

Yes, but all of those operators are binary, instead of let-me-look-at-the-documentation-ary; lots of those precedence rules are drilled into people at a young age; and lots of developers compulsively wrap them in parentheses anyway, just to make their intent absolutely clear.

All of that is very different from the experience of a concatenative language.


Although I think you're right that there is less variation, they are not all binary. -, +, & and * all have unary and binary variants, while !, --, ++, ~, new, delete and delete[] are always unary, and of course the ternary operator is always ternary.


"In Forth, the ideal size of a procedure is about a line, and procedures that take more than three arguments are considered a "code smell" that suggests a different factoring might be more appropriate."

That might be the ideal, but if you take a look at real-world Forth code, most of it is much longer than the ideal.

There are a couple of other things that help readability in Forth. One is thorough commenting, and the other is the use of intermediate variables.

Unfortunately, real-world Forth code often doesn't have much of either -- intermediate variables are often frowned upon as being "un-Forthish", and thorough commenting results in a program that is so verbose that (except for the embedded domain) you might as well have written your code in a more conventional language in the first place.


I always liked this (slightly paraphrased) quote:

    Idiomatic Factor does not (explicitly) use the stack.
That is to say, in Factor, a popular concatenative language, you tend to use higher order functions and high level combinators instead of manually shuffling stuff around on the stack. Sure, from time to time you still need to move stuff into the right place, but it's easy to keep track of the stack if you only need to consider two or three items. At least, that seems to be the theory - I haven't used Factor in a number of years, so I don't know how well it holds up in practice.
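As an illustration of what a combinator buys you, Factor's `bi` applies two quotations to the same value, avoiding an explicit `dup`/`swap` dance. A rough Python analogue (illustration only, not Factor's actual API surface):

```python
# Rough analogue of Factor's 'bi' combinator: apply two quotations
# (here: plain Python functions) to one value.
def bi(x, p, q):
    return p(x), q(x)

# '3 [ sq ] [ 1 + ] bi' in Factor would leave 9 and 4 on the stack;
# here they come back as a tuple instead.
print(bi(3, lambda n: n * n, lambda n: n + 1))  # (9, 4)
```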

Also, Factor does allow you to use local (named) variables and infix syntax when it makes more sense to do so. Concatenative languages do not need to be unreadable at all and for some things I think the concatenative code is actually more naturally readable.

"If c consumes two arguments, in applicative terms it would be d(c(b, a)); if b consumes one argument and c only one, then it would be d(c(b(a))); if d consumes two arguments, b one, and c none, then it would be d(c, b(a))."

'a b c d' in a concatenative language is akin to 'a | b | c | d' in a shell script (where a stack is being threaded from function to function) or 'd(c(b(a(STACK))))' in most languages where each function has a type 'Stack func (Stack s)'. It doesn't change depending on how many arguments a function takes off the stack - it's always 'd(c(b(a(STACK))))'. Of course, since literals are essentially functions which push their value onto the stack, what you say is true if a, b and c are literals. I don't find this nearly as confusing as what it sounds like from your comment.
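The stack-threading reading can be made literal with a small Python sketch (`push`, `add`, `dup` are illustrative names): every word, literals included, is a function from stack to stack, and concatenation is always the same composition regardless of what each word consumes.

```python
# Every word has the same type: stack in, stack out. Concatenating words
# is then just function composition, whatever each word pops or pushes.
def compose(*words):
    def program(stack):
        for w in words:
            stack = w(stack)
        return stack
    return program

def push(v):
    return lambda s: s + [v]              # a literal pushes its value

add = lambda s: s[:-2] + [s[-2] + s[-1]]  # consumes two items, pushes one
dup = lambda s: s + [s[-1]]               # consumes one item, pushes two

# '1 2 + dup +' reads uniformly as add(dup(add(push2(push1(stack)))))
prog = compose(push(1), push(2), add, dup, add)
print(prog([]))  # [6]
```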

In code like '1 2 + print', all the words you see are functions. 1 pushes the value of 1 onto the stack and so on. This is interesting because you can easily factor out any part by replacing it with a new word that calls the replaced word:

    : A 1 2 + print ;
    
    : add2 2 + ;
    : B 1 add2 print ;
A and B are the exact same.

This is also interesting: http://concatenative.org/wiki/view/Concatenative%20language


Here's another way to look at it. The stack is the global state of the program, and this global state is passed as an argument to every function. By an unkind interpretation, a stack language is the opposite of functional; every function isn't just impure, it mutates the world.

In a shell pipeline of 'a | b | c | d', I can see the dataflow clearly. 'c' never sees the output of 'a' directly; it must be mediated by 'b'. But for a stack, 'a b c d', I have no idea if c uses the value a or not, unless I know the definition of b and how many items c consumes.

When reading stack code, I find myself mentally inserting parentheses to see how far back each operation reaches; for higher-level operators, I find myself wanting to insert comments describing the shape of the code being implicitly constructed. I need to see the "tree" (though frequently a dag) shape.

This is no different in complex infix expressions. There too, parentheses are often needed for clarity - and good programmers use them even when not strictly necessary - and for very complicated expressions I'll spread them out across multiple lines, with indentation to make the tree explicit, where performance or similar concerns prevent writing out the expression in a more long-hand fashion.


That's one reason I've found the pipeline metaphor not great as a way of introducing concatenative languages. To me, the shell pipeline, which has a data-in/data-out streaming kind of flavor, generalizes (with multiple input/outputs) into dataflow languages, not into concatenative languages. There are definite connections between the paradigms, but imo the stack-manipulation is a different thing not present in stream/pipe based languages. The graphical music language Pd literally draws the pipelines as wires connecting input and output ports on functions (inspired by physical patch cables), which seems much closer to what I'd expect a generalized version of shell pipelines to look like.


Statically typed concatenative languages are precisely dataflow languages. Dynamically typed ones aren’t, because they can have dynamic stack effects. But the diagrams in my article look just like PureData programs for a reason.


Dataflow languages are what I'm most interested in these days when it comes to programming languages and I've always liked the symmetry between concatenative languages and pure dataflow languages (visual ones or just stream/flow based textual languages).

I've also been doing a lot of Max/MSP programming lately and really love working in its "passing messages along patch cables" system, but am frustrated by its lack of higher order objects and lack of aggregate data types (hell, it won't even let me create a list of lists).

What I really want is a Max/Pd-like language with richer set of data types (algebraic types with pattern matching would be a good start) which can be trivially and unambiguously converted between visual and textual representations. A concatenative language may make this a lot easier.


I think you’re really going to like Prog, then. It’s concatenative but focused on ADTs and pattern matching. Abstraction is easy thanks to quotation. I have some neat ideas in mind for an editor, as well, that would do just what you say—display a flow diagram of a program or selection as you edit. Concatenative languages are indeed, as the bland Wikipedia article puts it, “highly amenable to algebraic manipulation” of this sort.


I recommend you check out Faust, if you haven't already.

http://faust.grame.fr/

It's an applicative language created especially for audio DSP. It seems to bridge the gap you describe by providing dataflow combinators and a graphical representation for programs.


This is exactly my issue with stack-based languages as well; i.e., '(f g h)' vs. '(f (g h))' are pretty easy to distinguish, but both of those would be an identical 'h g f' in a stack-based language.


On the other hand, in an applicative language, (((f .) .) . g) (for composing a unary function f with a ternary function g) gets confusing as well.



That ambiguity is the “uniform composition” I was talking about, which is the essence of the paradigm. But I agree that it can be a lot to keep in your head if you think about everything as a stack. That’s why I’m offering applicative syntax in Prog, because it’s far clearer in many cases than the concatenative equivalent.

However, giving Prog a concatenative basis lets me do all kinds of interesting things much more easily than if I were to work from lambda calculus. With the syntactic issues taken care of by sugar, it’s a net gain. I hope you’ll see your way to trying it out one day.


I agree, but I think that with a great IDE for stack languages which would track the stack and the stack effects of functions, and display it all in a convenient way, most of the problem would be solved. It would be hard to read code without it, but I'll argue that the same could be said about syntax highlighting. And honestly, I think reading this kind of code can become natural with practice.


Indeed, I think great tooling can really help keep track of stack effects (I'd also say that combinators, rather than dup, swap, drop, ..., make you generally think less about the stack). You should check out the Brief editor as a prototype idea. Just having immediate feedback with step forward/back as you edit makes a huge difference. http://trybrief.com


I agree, but even without IDE assistance, many concatenative languages either suggest annotating your code with stack effect diagrams[1] or refuse to compile without them (StrongForth, Factor).

[1] http://en.wikipedia.org/wiki/Stack-oriented_programming_lang...
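A stack effect diagram such as Forth's `( a b -- sum )` declares how many items a word consumes and produces. As an illustration only (the `effect` decorator below is hypothetical, not part of any of these languages), enforcing such an annotation at runtime might look like:

```python
def effect(consumed, produced):
    """Assert that a word removes `consumed` items and adds `produced` items."""
    def wrap(word):
        def checked(stack):
            before = len(stack)
            result = word(stack)
            assert len(result) == before - consumed + produced, word.__name__
            return result
        return checked
    return wrap

@effect(consumed=2, produced=1)
def add(stack):          # ( a b -- sum )
    return stack[:-2] + [stack[-2] + stack[-1]]

print(add([1, 2]))  # [3]
```

A static checker does the same bookkeeping at compile time, which is what lets languages like StrongForth reject definitions whose body doesn't match the declared diagram.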



