Bitcoin is Not Anonymous (anonymity-in-bitcoin.blogspot.com)
135 points by harrigan on July 25, 2011 | 53 comments


I'm one of the authors.

This work is about looking at the Bitcoin transaction history as a network, and investigating privacy and anonymity, in practice, on it - something there's been a good bit of discussion around recently.

You can see a lot of non-obvious things, when you 'collapse' addresses, as we describe in the paper, and look at it as a network.

We're not really talking about the extent to which Bitcoin itself is useful as a currency, or investment - that's a whole other topic, and a big one.

If anyone has any questions on the work we did, if you post them here or on the blog, I'll try and answer.


This is very nice work, thank you. I've only had time to skim the paper; pardon me if these questions are answered therein:

1. You stop short of actually identifying the thief; is this primarily due to ethical concerns or the paucity of off-network information? Could you speculate on whether non-public information available to law enforcement would be enough to resolve the thief's identity?

2. How easy is it for users to protect themselves to foil your analysis techniques? Could client software automate some of these obfuscation mechanisms?

Incidentally, you cite my Netflix work, but my work on deanonymizing social networks based on topology (http://33bits.org/2011/03/09/link-prediction-by-de-anonymiza..., http://33bits.org/2009/03/19/de-anonymizing-social-networks/) might be more relevant, and some of the techniques potentially applicable to deanonymization of the bitcoin transaction network if and when it grows larger and gains a more substantial resemblance to the network of real-life relationships.


Those are great questions.

1) We made a decision that the purpose of this specific work was to illustrate anonymity pitfalls, for the benefit of users generally, and not to de-anonymise any individual users.

As such, we haven't dug deep to try and identify the thief. We've just examined the theft as a case study, to show that specific flows can be followed in practice.

We think that law enforcement would have, at the least, some leads to follow, if they used similar analysis techniques - we could also have looked deeper into this incident, but didn't.

We can't speculate on whether there's enough information to identify the thief - a lot would depend on whether the leads panned out, and on what sort of assumptions the thief made about trying to hide their identity - outside the scope of this work.

2) I think that some of our analysis would be possible to foil. It's probably possible for client software to avoid a lot of the account 'linking' that is due to transaction inputs being merged, perhaps by breaking the connected components formed, by putting merged Bitcoins through intermediate accounts, or perhaps by supporting mixing of some form.
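To make the 'linking' concrete: if addresses that are spent together as inputs of one transaction are assumed to belong to the same user, then users correspond to connected components over those co-spent addresses. The following is a minimal sketch of that idea using union-find, not the code used for the paper. The function name, the transaction list and the address labels are all made up for illustration.

```python
from collections import defaultdict


def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of the same transaction.

    `transactions` is a list of input-address lists (a hypothetical,
    simplified view of the transaction history).
    """
    parent = {}

    def find(addr):
        # Find the representative of addr's cluster, with path halving.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        find(inputs[0])  # register single-input transactions too
        for other in inputs[1:]:
            union(inputs[0], other)

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    return sorted(sorted(members) for members in clusters.values())


# Made-up example: A and B are co-spent, then B and C, so A, B, C collapse.
txs = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
clusters = cluster_addresses(txs)
print(clusters)
```

Routing merged coins through intermediate addresses, as suggested above, amounts to avoiding transactions whose input lists bridge two otherwise separate clusters.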

There are other leakages, of off-network information, such as the Bitcoin Faucet displaying IPs, that could trivially be turned off.

But as to whether this would render Bitcoin anonymous overall, it is very hard to say. It is extremely difficult to get anonymity into your system, unless it has been an explicit design goal; and it would be possible to take this kind of analysis much further than we did.

Thanks for the tip about the paper - we should probably reference it. That was nice work - it occurred to me it was possible to use such a strategy when the competition was announced, and when we saw the results, we knew someone had!


"2) I think that some of our analysis would be possible to foil. It's probably possible for client software to avoid a lot of the account 'linking' that is due to transaction inputs being merged,"

I already did some work on this:

http://coderrr.wordpress.com/2011/06/30/patching-the-bitcoin...


That's really nice work.

I think it should be adopted by the official client, and, ideally, the users educated as to its usage; it would mitigate a lot of the entity resolution, which our work shows is a widespread problem.


One assumption that the post (and I assume the paper as well) makes is that there _was_ a thief. It is entirely possible that one user played both the victim and the thief in this story.


We are aware that this is an assumption. We refer to the 'alleged thief' at several points in our paper, and I mention this in the blog comments.

We'd never get our point across if we exhaustively stated every single assumption we made - but you are right to highlight this - it's good to be aware these assumptions are there.


How much time (in hours) would law enforcement spend to perform this similar analysis technique? Or would this become trivial if a system that performed analysis continuously were built?


I honestly don't know how long it would take them. It would very much depend on their technical sophistication, their experience with writing network analysis code, etc.

I'm not sure it would make the analysis trivial, but you certainly could engineer a system that would have an easy user interface for conducting this sort of analysis on Bitcoin.

For example, the software tools we wrote would probably only require UI work to get them to a first draft of that - we've basically got a fully functional backend system, although I don't think we've any plans of taking it further.

I don't know if you've looked at the SVG we have on the blog, but it's a pretty useful way of looking at this data, even as it stands, with hyperlinks to the relevant blockexplorer blocks.


This is of a kind with several other very successful de-anonymization efforts in the past few years.

There's probably some clever way to express the general problem in information theory terms and prove that any set of data with certain characteristics must be de-anonymizable to some given extent. 6 billion people in the world is still just about 33 bits to uniquely identify an individual (and of course generally we're not talking global population), and so even a very small number of bits that can be correlated back to the real world in arbitrarily clever ways will reveal real-world identities in a putatively anonymous data set. It wouldn't take much to clean that up into a mathematically rigorous statement; no matter how you slice it, low tens of bits will tend to identify people and that's a low threshold.
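The "33 bits" figure is just the base-2 logarithm of the population, rounded up. A quick check of the arithmetic, with the 20-bit case added purely as an illustration of how fast the candidate set shrinks (it assumes each known bit is independent and halves the remaining candidates, which real attributes only approximate):

```python
import math

population = 6_000_000_000

# Bits needed to give every person a distinct identifier.
bits_exact = math.log2(population)
bits_needed = math.ceil(bits_exact)
print(f"log2(population) = {bits_exact:.2f}, so {bits_needed} bits suffice")

# Illustrative assumption: 20 independent bits are already known.
known_bits = 20
remaining_candidates = population / 2 ** known_bits
print(f"after {known_bits} bits: about {remaining_candidates:.0f} candidates left")
```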


Indeed.

Some academic with too much time on his hands wrote an entire Ph.D thesis on that very idea [hint: me :-)] http://33bits.org/about/


Well, that was easy; my work here is done. Now if you'll excuse me, I'm off to prove white is black and get myself killed on the next zebra crossing.

I'm browsing through there and see you've got good coverage on HN already. Is your thesis available publicly anywhere? I'd be interested in reading the full-strength version.


The introductory chapter of my thesis is available as a standalone document here: http://randomwalker.info/misc/thesis-intro.pdf It's a bit dull and academic.

The rest of my thesis is (almost literally) a concatenation of several of my papers, most of which I've covered on my blog; a quick list is here: http://randomwalker.info/publications/


First, thank you for raising this point again. It has been raised in the past and deserves to be restated: bitcoin is not about anonymity. If anything, it is more about traceability than anything else. With a trusted third party, some money laundering schemes become possible, but they are not that different from regular money laundering operations.

What bitcoin is about, however, is the feasibility of having pseudonymous entities that can't be traced to real persons, yet trade online commodities and services. Take lulzsec, for instance: they accepted BTC donations and could spend them to buy web hosting, to make donations to various associations, or to simply buy services from anyone accepting bitcoins.

Right now, the anonymity is lost as soon as BTC is converted into a regular currency, but as long as it stays in the BTC network, the account is nothing more than a number (and an IP if you don't use an anonymisation network).

You could have pseudonymous software developers writing code in exchange for hosting or gfx work, all of that happening in the gray legal area of international service exchanges. Today there is no way of creating a small international dematerialized company, which is a shame and a failure of international cooperation. Bitcoin could be a tool to address just that.


Do you have an answer to this question: http://www.quora.com/Would-it-be-theoretically-possible-to-m...

Thanks!


I'd say the most correct answer is 'no, it's not possible'.

Bitcoins don't exist as independently tracked entities in the system, as such. Let's say an address with 10 unmarked Bitcoins receives 10 marked Bitcoins, such that its balance is now 20 bitcoins.

If it then sends 5 bitcoins to another address, it is not possible to make statements about what proportion of those 5 bitcoins were marked - the individual Bitcoins are not individually identified.

So, it's more meaningful to think of balances getting transferred around, rather than Bitcoins.
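To put numbers on the example above: when only balances are tracked, the most that can be said about an outgoing payment is a range for how much of it could be 'marked'. This is a small sketch under the simplified balance model described here; the function name and the notion of a 'marked' amount are illustrative only.

```python
def marked_bounds(unmarked, marked, sent):
    """Return (min, max) amount of `sent` that could consist of marked coins.

    Assumes the address is just a pooled balance of `unmarked + marked`.
    """
    if sent < 0 or sent > unmarked + marked:
        raise ValueError("cannot send more than the balance")
    lowest = max(0, sent - unmarked)   # pay from unmarked coins first
    highest = min(sent, marked)        # pay from marked coins first
    return lowest, highest


# The example from the comment: 10 unmarked + 10 marked, then 5 are sent.
print(marked_bounds(10, 10, 5))   # (0, 5): anywhere from none to all of it
# Only when the payment exceeds the unmarked balance is some marking certain.
print(marked_bounds(10, 10, 15))  # (5, 10)
```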

What you can do, is the type of flow analysis we did, where you try and track the majority of flow in and out of addresses, and make inferences about how the Bitcoins flow around. As the network is currently used, this appears to work well - which is one of our main findings - but I'm sure this analysis, as we currently do it, could be frustrated by employing mixing of various types - especially if such mixing were done at a protocol level.

This still has other problems - you are then trusting the mixer to some extent, and I'm sure there are attacks that are possible where someone deploys a malicious mixer, or where someone constantly floods a mixer with coins under their control.
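As a rough illustration of what following the majority of the flow out of an address can look like, here is a greedy sketch that repeatedly follows the largest outgoing transfer. It is a deliberately naive stand-in, not the analysis from the paper; the graph, the address labels and the amounts are invented.

```python
def follow_largest_flow(edges, start, max_hops=10):
    """Follow the largest outgoing transfer from each address in turn.

    `edges` maps an address to a list of (destination, amount) pairs
    (a hypothetical, pre-aggregated view of the transaction history).
    """
    path = [start]
    seen = {start}
    current = start
    for _ in range(max_hops):
        outgoing = edges.get(current, [])
        if not outgoing:
            break  # no recorded spending from this address
        destination, _amount = max(outgoing, key=lambda edge: edge[1])
        if destination in seen:
            break  # stop rather than loop around a cycle
        path.append(destination)
        seen.add(destination)
        current = destination
    return path


edges = {
    "src": [("a", 90), ("b", 10)],
    "a": [("c", 60), ("d", 30)],
    "c": [("e", 55), ("src", 5)],
}
path = follow_largest_flow(edges, "src")
print(path)
```

A mixer defeats exactly this kind of heuristic: once many users' coins pass through the same address, the largest outgoing edge no longer says much about where any one user's funds went.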


Of course Bitcoin is not anonymous. The moment you make a purchase that contains some personal information about you (whether your name, IP, address, etc.) with your current wallet, any future purchases can be mapped back to you using your purchase graph. Difficult, yes, but the frequent intent with security is not to stop things cold in their tracks, but to make it enough of a chore to thwart all but the most dedicated intruders.

But that's not the point of Bitcoin's anonymous capabilities. The relative ease with which you can create multiple wallets and keep your questionable Silk Road and Wikileaks donation purchases separate, as opposed to creating multiple offshore bank accounts in Switzerland, can establish a high degree of anonymity. Almost like how drug dealers use prepaid cell phones and discard them for new ones the moment they suspect something is compromised.


But if you ever transfer money from your secret wallet to your identifiable wallet, the anonymity gets a lot weaker.


Could a service that creates a new 'wallet' for every single transaction provide effectively unbreakable anonymity? Forgive me if the question doesn't make sense... I haven't used bitcoin.


Even if it was possible to create new wallets for every transaction, wouldn't you want to spend it somehow? Then, the "flow" of cash from a subset of wallets at the same time indicates _something_.


A system could create, manage and automate a 'cloud' of several hundred wallets, all shifting small amounts of money around at random intervals. When the owner decided to make a particular payment, he could set a target sum to be accumulated by a single wallet, and within a few hours (or days), make payment from there.

Once this system had a few users, the difficulty of tracing any particular wallet 'owner' would be significant.


If these nodes aren't shared between users then you're not really getting anything more than minor obfuscation. If they are shared then you in fact have a third party cloudbank that hopefully is trustworthy and the money cloud aspect is a distraction that doesn't provide any benefits.


"all shifting small amounts of money around at random intervals"

I suspect that "random" in this instance would have to be very carefully defined, as a lot of signals only become more apparent (to the human eye) when cloaked with the right type of noise.


Such a service could work, although continuously updating wallet addresses that are tied to other services, such as bitcoin mining deposit accounts, could be a nuisance, as could using your bitcoin wallet to handle monthly subscriptions, like a credit card. For the sake of true anonymity, though, such inconvenience may still be worth it.


https://en.bitcoin.it/wiki/Anonymity says pretty much the same, though in general terms

"The main problem is that every transaction is publicly logged. Anyone can see the flow of Bitcoins from address to address (see first image). Alone, this information can't identify anyone because the addresses are just random numbers. However, if any of the addresses in a transaction's past or future can be tied to an actual identity, it might be possible to work from that point and figure out who owns all of the other addresses. This identity information might come from network analysis, surveillance, or just Googling the address. The officially-encouraged practice of using a new address for every transaction is designed to make this attack more difficult."

Not that you are saying they claimed otherwise, and it is exactly your article that made me look through this page in detail, so thanks for that.


Yes, absolutely.

We actually wrote a sentence in our paper addressing this: "While there is an understanding amongst Bitcoin’s technical users that anonymity is not a prominent design goal of the system, we believe that this awareness is not shared throughout the community."

Also, there is a gap between 'might be possible to work from that point' and actually trying to do it; and it is this gap that a lot of Bitcoin users are counting on. The idea is out there that while it might be possible to tie things together in theory, it's really not doable in practice.

The discussion mentioned on this blog, and the post it's replying to, is an interesting example of the uncertainty that's out there, even among very tech savvy users: http://blogs.forbes.com/timothylee/2011/07/14/advanced-bitco...

So, we knew that Bitcoin didn't try to make hard guarantees of anonymity, but we wondered how well analysis would work in practice; and it turned out to work much better than we expected.

The problem of linking accounts, too, turned out to give us a lot more information than we think most people would have expected.

We aren't trying to claim any more than that - some people will read this and say 'huh, obvious' but we think a large number of people will also be surprised this practically worked - we were.


I don't understand why someone would hold a large amount of money in BTC. The only real value I see for BTC is in secure online transactions. Basically a 'last mile' currency that should be used much like cash. So having some money in BTC would make sense (say < $200 US) just for the convenience of secure online purchases. But why would someone transfer a large amount of wealth into BTC? It seems like the digital equivalent of stuffing your money into your mattress. Am I wrong on this? Is there some major benefit that I'm missing?


Because to many people it is not the digital equivalent of stuffing money into your mattress; it is the exact opposite. It is the digital equivalent of investing.

Hell, it's not even the digital equivalent of investing. It's just plain old investing. (Or speculating, you might say.)


You can invest in a foreign currency and BTC is as foreign as it can possibly get


Yeah but when you do currency speculation you can hold the assets in banks and be protected by deposit insurance.


What you're saying is currency speculation is usually safe.

You should be familiar with the old tradeoff - safety (aka risk) vs reward. Traditional currencies do not fluctuate very much. Bitcoin does, and being a fledgling currency, has the potential to explode in value.


Bitcoin represents a set of capabilities that haven't been combined before. People believe that the market hasn't discovered the correct market price for this set of capabilities yet.


I totally agree.

There has been some criticism of bitcoin's technical aspects. I just wanted to post a thought I've been having for the last week or two, given your great answer (perhaps you could respond):

If someone creates a better cryptocurrency than bitcoin, they should lock in a 1:1 exchange with bitcoin - so give 1 bitcoin and get 1 of the new currency back. At any time, one can recall bitcoins with the new currency at the same 1:1 rate. This way current bitcoin holders don't lose out to the new currency. There might be a minuscule commission to the creator.

One issue in uptake is trusting the issuer of this new currency to stick around to make good on their promise.

One weakness I see with bitcoin is speed of transactions across the network.


That would just be a centralised version of bitcoin which is pretty much equivalent to PayPal or any of the many clones.


There is a centralized aspect to it, a fixed-rate exchange. Sort of like having Mt Gox lock in an exchange rate and promising not to shut down and being on the other side of all trades.


Great answer.


But what exactly is that set of capabilities? I've heard anonymity touted as one, but according to this article anonymity is clearly not a given.


The distributed ledger means that you don't have to trust anyone to perform a transaction. Bitcoins act as digital bearer bonds in the same way that cash does in meatspace.

In addition I can securely store an encrypted file containing my private keys on several cloud services and access it from anywhere in the world. If done correctly, it is possible that no one knows that I have this asset until I try to cash it into a national currency.

That's just bitcoin itself. The bitcoin protocol opens up the possibility of all sorts of distributed and secure interactions, contracts or property titles for example. Namecoin is a distributed DNS service based on this idea.


Besides anonymity, liquidity and the freedom from control by a government, particularly the US Treasury.


That is not the only real value of BTC.

For example, it can be used to transfer wealth halfway across the globe in about ten minutes.

Also, the network operates 24 hours a day, 7 days a week, so you don't have to plan around banking holidays or banking time nonsense.

It is also used as a store of wealth because the network puts an upper limit on the bitcoin money supply.
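The upper limit follows from the block reward schedule. As a sketch, assuming the protocol parameters as generally documented (an initial reward of 50 BTC per block, halved every 210,000 blocks, with amounts counted in integer units of 10^-8 BTC; none of these figures appear in this thread), the total that can ever be issued can be summed directly:

```python
COIN = 100_000_000          # smallest units per bitcoin (assumed parameter)
HALVING_INTERVAL = 210_000  # blocks between reward halvings (assumed parameter)

reward = 50 * COIN  # initial block reward, in smallest units
total = 0
while reward > 0:
    total += HALVING_INTERVAL * reward
    reward //= 2  # integer halving, so the reward eventually reaches zero

print(f"maximum supply: {total / COIN:,.4f} BTC")
```

The sum is a geometric series, so it converges to just under 21 million BTC; the integer rounding at each halving is why it falls slightly short of that round number.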


Banks are annoying, but they provide an extremely valuable service (deposit insurance) and interest, which is why people tend to store wealth in banks. Are there any similar institutions for BTC?


You are fixated on deposit insurance. The folks letting lots of cash float in Bitcoin are not using it as a bank. You need to shift your mental thinking from banks. "Investing" in BTC can probably be likened to stocks more than banks.

P.S. The concept of deposit insurance does not hold much meaning with Bitcoin. Deposit insurance applies to banks, which hold on to & handle your money. Bitcoin is money. Now, if a Bank of Bitcoin was founded, because maybe you are afraid your PC will get hacked, and you deposited BTC with them, then deposit insurance would be meaningful.


Banks have deposit insurance because they operate under the fractional reserve system and loan out almost all of their deposits, putting them at risk of failing and losing all your money. Hence FDIC and whatnot.

The bitcoin network doesn't do that, hence no need for FDIC. The apparent risks to the BTC network are that a malicious entity gains control of > 50% of the network's computing power and uses that to illegally modify the blockchain, or the network goes down. The network is so big now that only the largest botnets would have a hope of the former, and it would take a global calamity like a scorched earth world war, asteroid, or EMP from an intense solar flare to accomplish the latter.
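Why 50% is the threshold can be illustrated with the gambler's-ruin expression given in the Bitcoin whitepaper: an attacker controlling a fraction q of the computing power who is z blocks behind the honest chain eventually catches up with probability (q/p)^z when q < p = 1 - q, and with certainty otherwise. The sketch below evaluates only that simplified expression; the function name and the sample shares are illustrative.

```python
def catch_up_probability(q, z):
    """Chance that an attacker with hash-power share q ever catches up
    from z blocks behind (simplified gambler's-ruin model)."""
    if not 0.0 <= q <= 1.0 or z < 0:
        raise ValueError("q must be in [0, 1] and z must be non-negative")
    p = 1.0 - q  # share held by the honest network
    if q >= p:
        return 1.0  # at or above half the power, catching up is certain
    return (q / p) ** z


for share in (0.10, 0.30, 0.45, 0.51):
    print(f"share {share:.2f}: {catch_up_probability(share, 6):.6f}")
```

Below half the computing power the probability falls off exponentially with the number of blocks to be rewritten, which is why the comment treats a majority as the relevant line.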


There's no deposit insurance in New Zealand. And deposit insurance isn't that old to begin with.


I don't see why people would hold Japanese yen in the United States. (hint: investing)


Bitcoin is pseudonymous. I have always been a little irritated that the developers have not tried to dispel the myth that it is anonymous. I am not sure if they ever said it was anonymous, but they do not do enough (in my opinion) to stress that it is not anonymous.


All transactions are public.

It is one of the central ideas behind bitcoin and leaves no doubt about the anonymity of the transactions.

It is a well publicised idea and it is even mentioned on the main page of the bitcoin wiki (http://bitcoin.it).


The fact that the transactions are public is a separate issue vs. whether they are anonymous.

The reason people think Bitcoin is anonymous is because they think that identities cannot be linked to the addresses involved in the public transactions.


The feature that people confuse with anonymity is that everyone can create a wallet and join the network. No need to sign up or provide personal information. Hence the myth.

When careful enough, bitcoin can be used anonymously. But the developers do not claim that it's anonymous by default, certainly not in the current mainline client. For example, the wiki page about anonymity:

https://en.bitcoin.it/wiki/Anonymity

Anonymity is not guaranteed. There are various initiatives under way to improve this. But people should stop thinking that anonymity is the single redeeming feature of bitcoin, anyway.


Awesome depth, many many thanks for the analysis! I'll definitely be reading this more thoroughly when I'm more awake :)

The moral of the story is still what it's always been, and it's a two-parter: 1) anonymity is only as anonymous as how you use it. And, because Bitcoin's transaction history is public, it's very very hard to use it truly anonymously. And 2) very few people go to even reasonable lengths to stay anonymous. For most, I simply doubt they think it's worth the effort - why anonymize legitimate use?


I imagine it will be much harder to track the flow of bitcoins as soon as larger laundering services start popping up. For instance, if a major poker site switched entirely to bitcoin then it would be very easy for someone to stash a large amount of coins in the service and pull them out slowly over time to a separate wallet. Right now it's hard to stay anonymous because there are no large anonymous entities processing transactions to hide your own transactions within.


"We contract all vertices whose corresponding public-keys belong to the same user." How?


Okay, it's not anonymous but it's easy to receive money at a wallet that is otherwise unidentifiable (and thus can be sent in a way equally unlinked to your real identity or public wallet endpoint).



