
were you validating during parsing before?


Validating during parsing is still parsing; there's a reason `Alternative f` exists, after all: you have to choose between branches of possibilities and failures. Now consider that there's another kind of validation that happens far from program boundaries (where broader-than-needed data is constrained in a callee rather than at the calling site) and that should have been expressed as `Alternative f` during parsing instead. That's the main point of the article, but you seem to focus only on the literal occurrence of the word "validation" here and there.
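A minimal Python sketch of that branching (names like `parse_abc` are invented for illustration): each branch either succeeds with a typed value or falls through, roughly what `pA <|> pB <|> pC` does in an `Alternative`-based parser:

```python
from enum import Enum

class ABC(Enum):
    A = "a"
    B = "b"
    C = "c"

def parse_abc(s: str) -> ABC:
    # Try each alternative in order; fail only if none matches,
    # mirroring the choice operator (<|>) of Alternative.
    for alt in ABC:
        if s == alt.value:
            return alt
    raise ValueError(f"expected one of 'a', 'b', 'c', got {s!r}")
```

The point is that the failure case is part of the parse itself, not a check bolted on afterwards.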


So you are saying that if, at a certain point in parsing, the only expected terms are 'a', 'b' and 'c', one should not put the corresponding parsed entry in a `char` (after checking it is one of these, aka validating); instead it should be put in some kind of enum type (parsed via `Alternative f`). Right?


You can put them however you like, be it in a char or a vector of chars, but the bottom line is that your parsed items carry the "sanitized" label, which lets you tuple-unpack or pattern-match (as long as it's near or literally zero-cost) without ever performing the same validation again for the lifetime of the parsed object. The callees that exclusively expect 'a', 'b' and 'c', and which perform an internal validation step, should be replaced with versions that accept only inputs bearing the sanitized labels. How you implement the labels depends on the language at hand; in Haskell they can be newtypes or labelled GADTs. But the crucial part is that "validation" is systematically pushed to the program boundaries, where it becomes part of the parsing interface, with `Alternative f` and sanitization labels acting on raw data. In other words, you collapse validation into the process of parsing, where the result value is assembled from a sequence of decisions to branch into either one of the possible successful options or an error.
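A rough Python analogue of such a sanitized label (all names hypothetical): a small frozen wrapper whose only sanctioned constructor is the boundary parser, so callees demand the wrapper type instead of re-validating a raw string:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SanitizedAttr:
    # "Sanitized label": by convention, the only way to obtain one
    # is through parse_attr at the program boundary.
    value: str

def parse_attr(raw: str) -> SanitizedAttr:
    # Validation happens exactly once, here at the boundary.
    if len(raw) < 5:
        raise ValueError("attr must be at least 5 characters")
    return SanitizedAttr(raw)

def handle(attr: SanitizedAttr) -> str:
    # Callee accepts only the labelled type; no re-validation needed.
    return attr.value.upper()
```

Python can't enforce the "only via the parser" rule the way a Haskell module hiding a newtype's constructor can, but the type annotation on `handle` still moves the check to one place.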


> but the crucial part is: the "validation" word is systematically pushed to the program boundaries

Yea, so again. Isn't that freaking obvious?! The author seems to be experienced in Haskell, where this kind of thing is common knowledge, and for some reason it seems to be some kind of revelation to them...


> Yea, so again. Isn't that freaking obvious?!

apparently not, as I keep finding snippets of this kind of pattern from my coworkers (and I've worked at many companies, including ones that require precision for legal compliance):

    def do_business_stuff(data):
        orders = data.get("orders")
        if not orders:
            return
        for order in orders:
            attr = order.get("attr")
            if attr and len(attr) < 5:
                continue
            ...
The industry's awareness baseline is very low, across tech stacks; Haskell is no exception. I've seen the stuff people do with Haskell at their 9-to-5, when the only thing devs cared about was carrying on (and preferably migrating to Go), and I wasn't impressed at all (compared to the pure gems that can be found on Hackage). So in that sense, having an article that says "actually, parse once, don't validate everywhere" is very useful: you can keep sending the link over and over until people either get tired of you or learn the pattern.
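For contrast, one possible rewrite of the snippet above in the parse-once style (names like `Order` and `parse_orders` are invented, and this sketch chooses to reject malformed input at the boundary rather than silently skip it):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    attr: str  # invariant: len(attr) >= 5, established by parse_orders

def parse_orders(data: dict) -> list[Order]:
    # All the scattered checks collapse into this single boundary parse;
    # malformed input fails here instead of being skipped deep in a loop.
    orders = data.get("orders")
    if not isinstance(orders, list):
        raise ValueError("payload must contain a list of orders")
    result = []
    for raw in orders:
        attr = raw.get("attr")
        if not attr or len(attr) < 5:
            raise ValueError(f"bad order attr: {attr!r}")
        result.append(Order(attr))
    return result

def do_business_stuff(orders: list[Order]) -> None:
    for order in orders:
        # order.attr is already known to be valid; no checks here.
        ...
```

The business function now states its real precondition in its signature instead of half-checking it on every call.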


But in all seriousness, devs could be both aware of it and indifferent to it at the same time.

And sometimes, if you are not 100% sure about the constraints, it might even be the safe thing to do.



