
It was a huge mistake for email receivers to take on the cost of filtering spam. Of course given the evolution of the internet and email it is easy to see how that mistake happened. Nobody had a crystal ball. But the only solution here is to raise the cost of sending email to the point where spam is no longer profitable.

It seems like one solution is to bcrypt hash (or some similarly expensive algorithm) the email and include the hash in a header. Of course you need to hash per receiver or a spammer can just hash it once and spam away.

The receiving client hashes the email and compares the result with the value in the header and discards emails that don't match.
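
A minimal sketch of the scheme described above, assuming PBKDF2 from Python's standard library stands in for bcrypt, the recipient address is used as the salt so each receiver needs a separate computation, and the result travels in a made-up header (`X-Work-Stamp`). The header name, the iteration count, and the function names are all illustrative assumptions, not part of any standard.

```python
import hashlib
import hmac

# Hypothetical header name and cost knob; neither comes from any email standard.
STAMP_HEADER = "X-Work-Stamp"
ITERATIONS = 200_000  # assumed work factor; higher means more CPU per stamp


def make_stamp(body: bytes, recipient: str, iterations: int = ITERATIONS) -> str:
    """Sender side: expensive hash of the body, salted with the recipient address.

    Salting by recipient means a stamp computed for one receiver is useless
    for any other, so the work must be repeated per receiver.
    """
    salt = recipient.lower().encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", body, salt, iterations).hex()


def verify_stamp(body: bytes, recipient: str, stamp: str,
                 iterations: int = ITERATIONS) -> bool:
    """Receiver side: recompute the hash and compare it with the header value."""
    expected = make_stamp(body, recipient, iterations)
    return hmac.compare_digest(expected, stamp)
```

Note that in this sketch the receiver has to repeat the same expensive computation to verify, so the cost is shared rather than shifted entirely onto the sender.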

You'll never get industry buy-in though - the FAANG companies don't want to pay that cost for their semi-legitimate email. They prefer to keep that cost externalized.

I believe there have been attempts at something like this, but it clearly never went anywhere.



This indeed looks like a good direction. A decade ago Freenet (not sure if it still exists?) had a problem with spam on its equivalent of USENET. It was pretty bad, until they changed the protocol so that it's the sending node that keeps the message, which is then pulled (or not) by the recipients. It made a lot of sense to me: I'm the one sending you the message, I want you to see it, while you don't even know whether you want to get that message. So it's me that should pay for storage and propagation of the message in terms of bandwidth and disk space. Not sure how it turned out in the end, but the approach seems right to me. It's like phone calls: it's the caller that pays the cost, not the one receiving the call.


Wow that's ingenious - and obvious when you think about it. I guess evolving the email standard is a much bigger obstacle though, no matter how clever the proposal.



it's ridiculously easy to create a template that is stored, and when the remote client pulls the message only a few variables have to be substituted. it's how spam is generated after all :)


Yes, but in the context of Freenet specifically, there's a lot of encryption and hashing involved in every transaction, which are computationally expensive, and there's no CGI equivalent at all (the whole network is basically static, append-only content addressed with the content's hash). So to create many messages you'd need to first generate the content, then encrypt and hash it, then insert it into the network starting at your node, then provide signature and checksum verification on every request. After some time, if the content is widely requested, some other nodes will cache it, but you can't really expect spam to propagate via web of trust; it's more probable that it will be added to spam filters which, themselves, will propagate. As everything is signed, you'd need to create a fresh node for each spam, but that means your new identity is not trusted by anyone - you'd need to convince at least some users that you're a user that won't abuse the system. Only to get your whole identity blacklisted once you do abuse it.

It wasn't perfect, and Freenet's usage was a huge PITA from the usability/UX perspective, but in a constrained environment like that this change in the protocol had a much bigger impact than it would on the open web. Not sure how it turned out, but the trend until I left the community was very positive with regard to spam.


Something different: a hash which is expensive to calculate but cheap to verify. E.g. calculating a string of bytes to append to a hash stream in order to produce a hash with a certain number of leading zeros; you provide the hash and the bytes, and it's trivial to verify.


There were proposals for this sort of thing a while back, but they never caught on:

https://en.wikipedia.org/wiki/Hashcash


That's almost exactly my off-the-cuff suggestion. Of course my brain is seeded by both bitcoin (finding a hash as proof of work) and HLL (HyperLogLog, which counts leading zeros in hashes to estimate set cardinality).


Like cryptocurrency, this will be a moving target. You would need a difficulty setting, but whereas cryptocurrency only has to track one thing, "total hash rate", this would need to track two: "minimum supported hardware for a legit sender" and "hardware threshold for an attacker to be successful with a mining rig they can afford, based on the income from the spam".

Each email provider might come up with different difficulty levels based on what they think these are. So some handshaking might be required. And less computer-literate people would be stressed about why their email is taking 6 minutes to send. I think it would be hard to implement.
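
To make the two-target problem concrete, here is a back-of-the-envelope calculation. It assumes a leading-zero-bits scheme where minting costs about 2**bits hash evaluations on average; the two hash rates are made-up round numbers for illustration, not measurements of any real device.

```python
def expected_seconds(bits: int, hashes_per_second: float) -> float:
    """Average time to mint a stamp needing `bits` leading zero bits."""
    return (2 ** bits) / hashes_per_second


# Illustrative, assumed hash rates (not benchmarks).
OLD_PHONE_RATE = 1e6      # hashes per second on weak legitimate hardware
ATTACKER_RIG_RATE = 1e9   # hashes per second on a dedicated rig

if __name__ == "__main__":
    for bits in (20, 24, 28):
        slow = expected_seconds(bits, OLD_PHONE_RATE)
        fast = expected_seconds(bits, ATTACKER_RIG_RATE)
        print(f"{bits} bits: phone {slow:.1f} s, rig {fast:.4f} s per message")
```

Under these assumed rates, 28 bits costs the phone roughly 268 seconds per message while the rig spends about a quarter of a second, which is the gap a single global difficulty setting would have to straddle.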


Not good from an energy-wasting perspective.


Oh I know. The point is something hard that the sender needs to do. Ideally it wouldn't waste lots of power, but it must consume some scarce resource, or else it wouldn't cost anything, and the cost should be borne by the sender, so e.g. a RAM-intensive hash function wouldn't be right.


That is a clever idea but I think it'll still fail so long as email (SMTP) is a fire-and-forget architecture. As long as you have that asymmetry, your SNR is going to suck.

If it were a back-and-forth protocol, more like TCP, then you have way more options for congestion control, error reporting, load balancing, and the like. The server can choose to accept the incoming request, ask for more verification, or interrogate the client in various ways. This could be something just like DKIM / DMARC / SPF, or even something more exotic, like making the client do proof-of-work with difficulty tied to how suspicious that client is to the server, and also the delivery scope/scale. Or forcing the client to wait for ACK for valid delivery while slow-walking it.

This gets around some of the issues in cousin comments, with respect to punishing botnets and rewarding lawful players. Established, high-trust players pay no cost. Suspicious players can still get through, albeit with a tax (that should be trivial for low-volume personal MX, but expensive for high-volume spam). Furthermore, it's adaptable.
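
One way the "difficulty tied to suspicion and delivery scale" idea might look on the server side. Everything here is hypothetical: the trust score in [0, 1], the 20-bit base cost, the one-extra-bit-per-doubling rule, and the 28-bit cap are invented for illustration and come from no existing protocol.

```python
import math

MAX_BITS = 28   # assumed ceiling so that no sender is locked out entirely
BASE_BITS = 20  # assumed cost for a completely unknown sender, one recipient


def required_bits(trust: float, recipients: int, max_bits: int = MAX_BITS) -> int:
    """Proof-of-work difficulty the server would demand before accepting mail.

    trust:      0.0 for a completely unknown sender, 1.0 for an established one.
    recipients: how many mailboxes the message is addressed to.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must be between 0 and 1")
    if recipients < 1:
        raise ValueError("recipients must be at least 1")
    suspicion = 1.0 - trust
    # One extra bit (double the work) each time the recipient count doubles.
    bits = suspicion * (BASE_BITS + math.log2(recipients))
    return min(max_bits, round(bits))
```

With these made-up constants a fully trusted sender pays nothing regardless of volume, an unknown sender pays 20 bits for a single recipient, and an unknown sender addressing 1024 recipients hits the cap.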


> If it were a back-and-forth protocol, more like TCP, then you have way more options for congestion control, error reporting, load balancing, and the like. The server can choose to accept the incoming request, ask for more verification, or interrogate the client in various ways.

That's basically what graylisting aims to achieve.


Yeah, this is essentially a form of greylisting. The difference is (as I understand it, this is fairly outside my domain), with the current setup, MTAs can accept an email, and it ends up getting blackhole'd or spam-folder'd anyways. My hypothetical scheme would put more onus on the first "boundary node" to report on errors/compliance. Basically the MTA tells the client what hoops to jump through, and the client gets some indication what will happen once those conditions are met.

That could be an exchange like: "Sign this nonce, and your message will be vetted", or "this is very suss, you have to do X difficulty hashes to have any chance of delivery, and regardless it'll be flagged as potential spam". Or perhaps just a guarantee on how an action would affect the message's "spam score".

This could be used alongside nested packets/envelopes and various headers/trust levels in a network of trust to give a message some overall trust level.


The problem with adding a cost to email is that it affects everyone. The amount of CPU power you need to waste to make most spam not viable is so much that it isn't worth it.


Most spam is sent by hijacked machines and botnets, is it not? They don't care about wasting CPU power; they aren't paying for it.


It definitely does not - you can allow different work loads for different senders. Mailing lists you actually want can be dropped to zero for instance.

Most spam comes from new address pairs, not existing ones. Requiring high cost to get past a first-contact filter, then near zero forever after, is completely reasonable and would practically eliminate unsolicited spam.
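
A small sketch of such a first-contact filter on the receiving side, assuming the receiver keeps a set of (sender, recipient) pairs it has already accepted. The class name, the 24-bit first-contact cost, and the in-memory set are illustrative assumptions; a real system would need persistence and some way to authenticate the sender address.

```python
class FirstContactPolicy:
    """High cost for a new sender/recipient pair, zero cost afterwards."""

    def __init__(self, first_contact_bits: int = 24):
        self.first_contact_bits = first_contact_bits  # assumed difficulty
        self.known_pairs: set[tuple[str, str]] = set()

    @staticmethod
    def _key(sender: str, recipient: str) -> tuple[str, str]:
        return (sender.lower(), recipient.lower())

    def required_bits(self, sender: str, recipient: str) -> int:
        """Work demanded from `sender` before delivering to `recipient`."""
        if self._key(sender, recipient) in self.known_pairs:
            return 0
        return self.first_contact_bits

    def record_accepted(self, sender: str, recipient: str) -> None:
        """Remember the pair once a message has been accepted."""
        self.known_pairs.add(self._key(sender, recipient))

    def revoke(self, sender: str, recipient: str) -> None:
        """Forget the pair, e.g. after the recipient marks a message as spam."""
        self.known_pairs.discard(self._key(sender, recipient))
```

A mailing list the recipient signed up for would pay once for the confirmation email and nothing after `record_accepted` is called for that pair.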


But now the sender needs to know the receiver's policy and whether the receiver remembers that there has been contact before. Or I guess you change SMTP, but we still allow unencrypted connections, so good luck with that.


For newsletter style stuff, nah. The "confirm your registration" email can "pay" to get past the wall, and then you're done - approved pair established, future letters can probably be zero cost and everyone receives the same one.

I wholly admit that this is arguing theoretical setups and that's always problematic, but of course patterns would be established pretty quickly. There are loads of simple tactics that would still make spam dramatically harder, and legitimate use nearly unaffected. The current reputation system has clear, massive gaps that really don't need to exist.


yeah, I believe it is called "HashCash" and works similarly to "proof of work" in cryptocurrencies


HashCash is actually referenced as part of the inspiration for Bitcoin by Satoshi themself. It's the birthplace of proof of work.


Right, I can't believe people are re-discovering HashCash. It was a brilliant idea way ahead of its time. Sadly it was not adopted for email.


Isn't this DKIM bh= ?



