Existential Haskell

philzook · on Nov 27, 2020

A fun post. I liked seeing some things I knew summarized and some things I didn't know. It is fascinating to me that functional programming techniques and object oriented techniques become intersecting once you know enough about the two. The aversion some parts of the functional programming community have to object oriented is more about how it manifests in particular popular languages. People are dutifully intrigued by the CLOS or Smalltalk. Things that are function like should probably be modeled as functions and things that are object like should probably be modeled as objects.

Visitor patterns are a kind of fold, Functions as Closure objects, the various formulations of iterators and stream transformers. The relationship of existential types to objects was (is?) a hot research topic.

- Codata in Action https://www.microsoft.com/en-us/research/publication/codata-...

- Hasekll's Overlooked Object System https://arxiv.org/abs/cs/0509027

- Abadi, Martin; Luca Cardelli - A Theory of Objects.

- Type Theories and Object Oriented Programming http://users.csc.calpoly.edu/~gfisher/classes/530/handouts/r...

danidiaz · on Nov 27, 2020

One neat and practical application of existential types if for defining "collectors" for streaming sources. The internal state of the collector is hidden and it can be different form what the collector ultimately returns. For example, a collector that returns the average of its inputs can store sum and count separately as its internal state. This also makes it easier to combine collectors.

In Haskell, using GADTSyntax, it would be something like

    data Collector a b where
        MakeCollector :: (x -> a -> x) -> x -> (x -> b) -> Collector a b

That is: to construct a collector that ingests as and returns a single b, you need to supply a step function, the initial state of type x, and a "tally" function that calculates the result b from the final state. The type of the state ("x") is not present in the type "Collector a b"; it is an "existential".

In the Java Collectors framework, this is (roughly) analogous to

    Collector.of (Supplier<A> supplier, BiConsumer<A, T> accumulator, BinaryOperator<A> combiner, Function<A, R> finisher, Collector.Characteristics... characteristics)

And the type parameter that corresponds to the existential in Haskell is:

    A - the mutable accumulation type of the reduction operation (often hidden as an implementation detail)

cousin_it · on Nov 27, 2020

It seems to me that Collector a b is isomorphic to [a] -> b, so not clear why you need existentials or GADTs.

    iso1 :: Collector a b -> ([a] -> b)
    iso1 (MakeCollector f x g) = g . foldl f x

    iso2 :: ([a] -> b) -> Collector a b
    iso2 f = MakeCollector (flip (:)) [] (f . reverse)

danidiaz · on Nov 27, 2020

It's not isomorphic to [a] -> b, because you are not restricted to feed a Collector with a preexisting list, you can feed it with lines read incrementally from stdin for example.

Another difference is that you can combine a "Collector a b" and a "Collector a c" into a "Collector a (b,c)" that, like its components, only requires a single pass of the data. (This would be the Applicative instance for collectors.) Combining functions [a] -> b and [a] -> c doesn't necessarily "stream".

Also, I didn't use GADTs, only GADT "syntax" in which data constructors are provided as functions with signatures (MakeCollector :: ...)

Personally, I find this syntax much clearer for existential types. It amounts to having a type variable in a parameter which doesn't appear in the return type of the constructor.

rrobukef · on Nov 27, 2020

Since Haskell is lazy you can feed final list to its construction (like the prime sieve examples). Combination in a single-pass is optimisable by loop-fusion. Though optimisations are famous for 'flaking out' at the worst moment.

danidiaz · on Nov 27, 2020

With [a] -> b, when production of the input elements requires I/O and you don't want to read the entire list beforehand, you are forced to use lazy I/O, which is notoriously flaky and doesn't handle errors well.

Meanwhile, with a Collector, you can read the inputs with standard IO actions just fine, and feed them as they are produced.

rrobukef · on Nov 27, 2020

Because of non-functional properties of programming code. Of course the archetype of a Collector can be used as a backing data structure. But maybe, if you want an accumulator you're better of calculating the sum without constructing a list (or relying on fusion).

axelerator · on Nov 27, 2020

The irony:

"I write this not because I expect to break any new ground—all the techniques I use here are long-documented in the literature, and Haskell veterans will probably find little new in this post"

Yet I can't get rid of the feeling the article is written in a way only "Haskell veterans" are able to follow it :-)

noelwelsh · on Nov 27, 2020

Leaving aside the technical content the post uses uncommon words and constructions that obscure its meaning. In other words it is overwritten. Taking the first paragraph as an example, you could make it much easier to read by: replacing "tenet" and "antithetical" with simpler words, removing the "of course", and breaking some of those run-on sentences into smaller sentences.

I'm reminded of patio11's writing. There's a man who never uses one word when three will do. I expect I will now be excommunicated from HN. :-)

arc0 · on Nov 27, 2020

Not really, it definitely assumes some background with Haskell, but it's far from 'veteran' level. All of the code samples except for the very last section only use elementary concepts.

Cu3PO42 · on Nov 27, 2020

I can't disagree with your feeling, but I think it's sometimes difficult to put yourself in the shoes of someone who doesn't have experience in a certain area.

While explaining something to first-semester CS students, I said "but it obviously doesn't matter if you accept a pair as a single argument or two arguments". They, of course, asked why it didn't matter and I started off with "You see, functions are exponentials" and I was going to say "and normal algebraic laws apply", but realised after the first part that this way of looking at it surely wasn't going to help them without any background in more advanced theoretical topics.

johnisgood · on Nov 27, 2020

Yeah, I do not think it would have been too pragmatic. If the lesson is mathematics, sure, but for CS it is more of a trivia, IMO, or at least I cannot seem to find what changes by knowing that piece of information.

chriswarbo · on Nov 27, 2020

> I cannot seem to find what changes by knowing that piece of information

The most common change is that writing convertion functions between 'foo(bar, baz)', 'foo(bar)(baz)' and 'foo((bar, baz))' calling conventions becomes more annoying. It's not that I find myself doing it any more or less, it's that knowing they're equivalent makes it even more frustrating.

There can also be a difference in efficiency, e.g. in 'foo(bar)(baz)' we can have 'foo(bar)' pre-calculate something expensive, which will get re-used if we do things like 'f = foo(bar); f(a); f(b); f(c); myList.map(foo(baz)); etc. whilst the 'foo(bar, baz)' version will generally re-calculate such things every time.

Of course, this varies depending on compiler optimisations and other language features, but the nice thing about the 'foo(bar)(baz)' version is that it can simply be a matter of scope, e.g. in Haskell:

    foo1 x y = let cached = expensive x
                in ...

    foo2 x = let cached = expensive x
              in \y -> ...

We can achieve a similar result in other ways and in many languages, but many of those alternatives (e.g. 'static variables', intermediate classes, mutable variables, etc.) require much more ceremony and boilerplate.

johnisgood · on Nov 30, 2020

What do you think about the Forth way of doing it? I cannot get into it in detail here, hopefully you know about its internals. If not, then I recommend 3 sources.

- http://galileo.phys.virginia.edu/classes/551.jvn.fall01/prim...

- https://www.forth.com/starting-forth/11-forth-compiler-defin...

- https://www.forth.com/starting-forth/9-forth-execution/

> In Forth, there is virtually no excess overhead in recursive calls because Forth uses the stack directly. So there is no reason not to recurse if that is the best way to program the algorithm.

chriswarbo · on Nov 30, 2020

Yeah Forth is another example of how these calling conventions are equivalent. It's not just some quirk of Forth's implementation either, since languages like Joy and Kitten do the same thing with pure functions. Unfortunately I've not had much need to learn or use Forth in a significant way (the most I've done is play with OpenFirmware)

epsilonclose · on Nov 27, 2020

What do you mean by "functions are exponentials"?

Cu3PO42 · on Nov 27, 2020

A function with type A -> B is essentially equivalent to an object of type B^A. I can give you some relatively intuition for the case that A is finite. Imagine B^|A|, this is a list of B elements with one for every element in A. You can therefore interpret it as the return value for every possible input. Therefore it uniquely defines a function A -> B!

The type of pairs is just the cartesian product. Therefore (A×B)->C ~ C^(A×B) = (C^B)^A ~ A->B->C.

If you find this sort of thing, you might be interested in category theory :)

epsilonclose · on Nov 27, 2020

Thank you for clarifying. I was familiar with the notation A^B for the set of all f: A -> B, but this seemed like something more (and it does appear to be).

Putting some links I looked at here to save someone a few seconds of Googling: https://en.wikipedia.org/wiki/Exponential_object https://ncatlab.org/nlab/show/exponential+object

dllthomas · on Nov 27, 2020

I'm not sure if the notation you'd encountered differed, but in this case the equivalence is between A^B and B -> A (note the order).

Interestingly, every identity you learned in grade school algebra which involved only addition, multiplication, and exponentiation holds here as an isomorphism. It can be a fun exercise to write the functions that take you back and forth. For instance:

    A^(B+C) = A^B * A^C

    to: (Either b c -> a) -> (b -> a, c -> a)
    to f = (f . Left, f . Right)

    fro: (b -> a, c -> a) -> (Either b c -> a)
    fro (f, g) = \case
      Left b -> f b
      Right c -> g c

uryga · on Nov 27, 2020

google keyword: "exponential object"

("exponential function" clashes pretty badly ;) )

epsilonclose · on Nov 27, 2020

Thank you, that helped!

BestCatalonia · on Nov 27, 2020

such a long piece just to make one point..

augustk · on Nov 27, 2020

Are there any Haskell articles about solving actual problems?

arc0 · on Nov 27, 2020

Yes. Haskell is used by many companies, from small startups to hundred billion dollar corporations.

Fighting spam with Haskell -- https://engineering.fb.com/2015/06/26/security/fighting-spam...

The Joy and Agony of Haskell in Production -- https://www.stephendiehl.com/posts/production.html

Haskell in Production: Riskbook -- https://serokell.io/blog/haskell-in-industry-riskbook

Haskell at Barclays (talk, not an article) -- https://www.infoq.com/presentations/haskell-barclays/

Tainnor · on Nov 27, 2020

Why does it matter? Not everything needs to have immediate real world applications. Haskell is a good test bed for type theory and some of the things that it experiments with eventually find their way into other languages.

CyberDildonics · on Nov 27, 2020

Whenever this is asked the first response is always the facebook spam filter from five years ago, which is should help you calibrate to the reality. People who have been down the rabbit hole of getting haskell to work seem to consistently say that it is very difficult to debug not only logic but consistent speed and memory usage. It seems like the reality is that if you want something fast, C++ is there with a great ecosystem, if you want to hack something together lots of scripting languages are available and if you want a balance LuaJIT and even go might work.

ImprobableTruth · on Nov 27, 2020

Because that's the most notable case of a large company using Haskell. If that's your qualifying criterion, recommending LuaJIT is just as if not even more ridiculous.

Most commercial Haskell is just boring CRUD, like with virtually any other language. There are some interesting projects like Hasura (a decently fast GraphQL engine), but something like Facebook is just a million times more prestigious, so that's what will always be brought up.

> that it is very difficult to debug not only logic

Feel free to link any write up making an actual case for this.

CyberDildonics · on Nov 27, 2020

> Because that's the most notable case of a large company using Haskell

I know. It is something small that did not lead to more adoption and was five years ago.

> If that's your qualifying criterion, recommending LuaJIT is just as if not even more ridiculous.

I wasn't really recommending anything, I haven't even tried go, I was saying what seem to be reasons haskell isn't used in pragmatic scenarios, even by people who have invested a lot of time into it Still, cloudflare relies heavily on luaJIT and Love2D is built on it, meaning all these games have it at their core https://store.steampowered.com/curator/32659238/ The language is still lua and the speed is not disputed, so I don't think it is the same as haskell, since haskell still has a much bigger question of pragmatism and productivity.

> Feel free to link any write up making an actual case for this.

What I've seen has mostly been in depth comments here.