
I'm one of the authors.

This work looks at the Bitcoin transaction history as a network and investigates how privacy and anonymity hold up on it in practice - something there's been a good bit of discussion around recently.

You can see a lot of non-obvious things, when you 'collapse' addresses, as we describe in the paper, and look at it as a network.

We're not really talking about the extent to which Bitcoin itself is useful as a currency, or investment - that's a whole other topic, and a big one.

If anyone has questions about the work we did, post them here or on the blog and I'll try to answer.



This is very nice work, thank you. I've only had time to skim the paper; pardon me if these questions are answered therein:

1. You stop short of actually identifying the thief; is this primarily due to ethical concerns or the paucity of off-network information? Could you speculate on whether non-public information available to law enforcement would be enough to resolve the thief's identity?

2. How easy is it for users to protect themselves to foil your analysis techniques? Could client software automate some of these obfuscation mechanisms?

Incidentally, you cite my Netflix work, but my work on deanonymizing social networks based on topology (http://33bits.org/2011/03/09/link-prediction-by-de-anonymiza..., http://33bits.org/2009/03/19/de-anonymizing-social-networks/) might be more relevant, and some of the techniques potentially applicable to deanonymization of the bitcoin transaction network if and when it grows larger and gains a more substantial resemblance to the network of real-life relationships.


Those are great questions.

1) We made a decision that the purpose of this specific work was to illustrate anonymity pitfalls, for the benefit of users generally, and not to de-anonymise any individual users.

As such, we haven't dug deep to try and identify the thief. We've just examined the theft as a case study, to show that specific flows can be followed in practice.

We think that law enforcement would have, at the least, some leads to follow, if they used similar analysis techniques - we could also have looked deeper into this incident, but didn't.

We can't speculate on whether there's enough information to identify the thief - a lot would depend on whether the leads panned out, and on what sort of assumptions the thief made about trying to hide their identity - outside the scope of this work.

2) I think that some of our analysis would be possible to foil. It's probably possible for client software to avoid a lot of the account 'linking' that is due to transaction inputs being merged, perhaps by breaking the connected components formed, by putting merged Bitcoins through intermediate accounts, or perhaps by supporting mixing of some form.
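To make the 'linking' concrete, here is a minimal sketch of the input-merging heuristic, assuming that all input addresses of a single transaction are controlled by the same entity. The toy transaction lists and the name `link_addresses` are made up for illustration; this is not the code used in the paper.

```python
from collections import defaultdict


def link_addresses(transactions):
    """Group addresses that appear together as inputs of a transaction.

    `transactions` is an iterable of lists of input addresses
    (hypothetical toy data, not a real Bitcoin API).
    Returns a list of sets, one set per inferred entity.
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        find(inputs[0])  # make sure single-input transactions are recorded
        for other in inputs[1:]:
            union(inputs[0], other)

    groups = defaultdict(set)
    for addr in list(parent):
        groups[find(addr)].add(addr)
    return list(groups.values())


toy_transactions = [
    ["A", "B"],  # A and B spent together, so they are linked
    ["B", "C"],  # B and C spent together, so C joins the same group
    ["D"],       # D is never merged with anything
]
entities = link_addresses(toy_transactions)
```

Each resulting set is one connected component. A client that never spends two of its addresses in the same transaction would not be linked by this particular heuristic, which is the kind of mitigation described above.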

There are other leakages, of off-network information, such as the Bitcoin Faucet displaying IPs, that could trivially be turned off.

But as to whether this would render Bitcoin anonymous overall, it is very hard to say. It is extremely difficult to get anonymity into your system, unless it has been an explicit design goal; and it would be possible to take this kind of analysis much further than we did.

Thanks for the tip about the paper - we should probably reference it. That was nice work - it occurred to me it was possible to use such a strategy when the competition was announced, and when we saw the results, we knew someone had!


"2) I think that some of our analysis would be possible to foil. It's probably possible for client software to avoid a lot of the account 'linking' that is due to transaction inputs being merged,"

I already did some work on this:

http://coderrr.wordpress.com/2011/06/30/patching-the-bitcoin...


That's really nice work.

I think it should be adopted by the official client, and, ideally, the users educated as to its usage; it would mitigate a lot of the entity resolution, which our work shows is a widespread problem.


One assumption that the post (and I assume the paper as well) makes is that there _was_ a thief. It is entirely possible that one user played both the victim and the thief in this story.


We are aware that this is an assumption. We refer to the 'alleged thief' at several points in our paper, and I mention this in the blog comments.

We'd never get our point across if we exhaustively stated every single assumption we made - but you are right to highlight this - it's good to be aware these assumptions are there.


How much time (in hours) would law enforcement need to perform a similar analysis? Or would this become trivial if a system that performed the analysis continuously were built?


I honestly don't know how long it would take them. It would very much depend on their technical sophistication, their experience with writing network analysis code, etc.

I'm not sure it would make the analysis trivial, but you certainly could engineer a system that would have an easy user interface for conducting this sort of analysis on Bitcoin.

For example, the software tools we wrote would probably only require UI work to get them to a first draft of that - we've basically got a fully functional backend system, although I don't think we've any plans of taking it further.

I don't know if you've looked at the SVG we have on the blog, but it's a pretty useful way of looking at this data, even as it stands, with hyperlinks to the relevant blockexplorer blocks.


This is of a kind with several other very successful de-anonymization efforts in the past few years.

There's probably some clever way to express the general problem in information theory terms and prove that any set of data with certain characteristics must be de-anonymizable to some given extent. 6 billion people in the world is still just about 33 bits to uniquely identify an individual (and of course generally we're not talking global population), and so even a very small number of bits that can be correlated back to the real world in arbitrarily clever ways will reveal real-world identities in a putatively anonymous data set. It wouldn't take much to clean that up into a mathematically rigorous statement; no matter how you slice it, low tens of bits will tend to identify people and that's a low threshold.
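The "33 bits" figure is just a base-2 logarithm, and a quick sketch shows how fast the candidate pool shrinks. The helper names are made up for illustration, and the calculation assumes an idealised case where every known bit is independent and splits the population evenly.

```python
import math


def bits_to_identify(population):
    """Bits of information needed to single out one member of a population."""
    return math.log2(population)


def remaining_candidates(population, known_bits):
    """Expected pool size after learning `known_bits` ideal, independent bits."""
    return population / 2 ** known_bits


world_bits = bits_to_identify(6_000_000_000)            # roughly 32.5, rounded up to 33
after_20_bits = remaining_candidates(6_000_000_000, 20)  # a few thousand people left
after_33_bits = remaining_candidates(6_000_000_000, 33)  # fewer than one person left
```

Real attributes are correlated and unevenly distributed, so each one usually carries less than its nominal number of bits. The sketch only shows why the threshold is so low.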


Indeed.

Some academic with too much time on his hands wrote an entire Ph.D. thesis on that very idea [hint: me :-)] http://33bits.org/about/


Well, that was easy; my work here is done. Now if you'll excuse me, I'm off to prove white is black and get myself killed on the next zebra crossing.

I'm browsing through there and see you've got good coverage on HN already. Is your thesis available publicly anywhere? I'd be interested in reading the full-strength version.


The introductory chapter of my thesis is available as a standalone document here: http://randomwalker.info/misc/thesis-intro.pdf It's a bit dull and academic.

The rest of my thesis is (almost literally) a concatenation of several of my papers, most of which I've covered on my blog; a quick list is here: http://randomwalker.info/publications/


First, thank you for raising this point again. It has been raised in the past and deserves to be restated: Bitcoin is not about anonymity. If anything, it is more about traceability than anything else. With a trusted third party, some money laundering schemes become possible, but they are not that different from regular money laundering operations.

What Bitcoin is about, however, is the feasibility of having pseudonymous entities that can't be traced to real persons, yet trade online commodities and services. Take LulzSec for instance: they accepted BTC donations and could spend them to buy web hosting, to make donations to various associations, or simply to buy services from anyone accepting bitcoins.

Right now, the anonymity is lost as soon as BTC is converted into a regular currency, but as long as it stays in the BTC network, the account is nothing more than a number (and an IP if you don't use an anonymisation network).

You could have pseudonymous software developers writing code in exchange for hosting or graphics work, all of that happening in the gray legal area of international service exchanges. Today there is no way of creating a small international dematerialized company, which is a shame and a failure of international cooperation. Bitcoin could be a tool to address just that.


Do you have an answer to this question: http://www.quora.com/Would-it-be-theoretically-possible-to-m...

Thanks!


I'd say the most correct answer is 'no, it's not possible'.

Bitcoins don't exist as independently tracked entities in the system, as such. Let's say an address with 10 unmarked Bitcoins receives 10 marked Bitcoins, such that its balance is now 20 Bitcoins.

If it then sends 5 Bitcoins to another address, it is not possible to make statements about what proportion of those 5 Bitcoins were marked - the individual Bitcoins are not individually identified.

So, it's more meaningful to think of balances getting transferred around, rather than Bitcoins.
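The arithmetic behind that example can be sketched as follows, assuming the pooled-balance view described here, where an address is treated as one undifferentiated balance. The function name `marked_bounds` is made up for illustration.

```python
def marked_bounds(unmarked, marked, sent):
    """Range of 'marked' coins that a payment could contain.

    Treats the address as a single pooled balance, so only the lowest and
    highest possible marked amounts can be stated, not the actual amount.
    """
    if sent > unmarked + marked:
        raise ValueError("cannot send more than the balance")
    lowest = max(0, sent - unmarked)  # marked coins are forced only once unmarked ones run out
    highest = min(sent, marked)       # cannot contain more marked coins than exist
    return lowest, highest


# The example above: 10 unmarked + 10 marked, then 5 sent on.
bounds = marked_bounds(unmarked=10, marked=10, sent=5)

# Only a payment larger than the unmarked balance must contain marked coins.
bounds_large = marked_bounds(unmarked=10, marked=10, sent=15)
```

For the 5 Bitcoin payment the range runs from 0 to 5, so under this view no statement about the marked proportion is justified.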

What you can do, is the type of flow analysis we did, where you try and track the majority of flow in and out of addresses, and make inferences about how the Bitcoins flow around. As the network is currently used, this appears to work well - which is one of our main findings - but I'm sure this analysis, as we currently do it, could be frustrated by employing mixing of various types - especially if such mixing were done at a protocol level.

This still has other problems - you are then trusting the mixer to some extent, and I'm sure there are attacks that are possible where someone deploys a malicious mixer, or where someone constantly floods a mixer with coins under their control.
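As a rough illustration of following the majority of flow, here is a toy sketch that repeatedly follows the largest outgoing transfer from an address. It is a deliberate simplification with made-up names and data, not the analysis from the paper.

```python
def follow_largest_flow(transfers, start, max_hops=10):
    """Follow the largest outgoing transfer from `start`, hop by hop.

    `transfers` is a list of (sender, receiver, amount) tuples
    (hypothetical toy data). Returns the list of addresses visited.
    """
    path = [start]
    current = start
    for _ in range(max_hops):
        outgoing = [t for t in transfers if t[0] == current]
        if not outgoing:
            break  # dead end: nothing was sent onward
        _, receiver, _ = max(outgoing, key=lambda t: t[2])
        if receiver in path:
            break  # stop rather than loop around a cycle
        path.append(receiver)
        current = receiver
    return path


toy_transfers = [
    ("S", "A", 100),
    ("A", "B", 95),  # most of the balance moves on to B
    ("A", "X", 5),   # small side payment, ignored by the heuristic
    ("B", "C", 90),
    ("B", "Y", 5),
]
path = follow_largest_flow(toy_transfers, "S")
```

A mixer that splits a balance into many similar-sized outputs, or merges it with other users' coins, would leave no clear majority flow for a heuristic like this to follow.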



