If Google has already accessed, indexed, and published it, you are in pretty goo...

dragonwriter · on May 15, 2016

No, you don't. Knowledge and intent are key factors in many crimes, and you and Google aren't similarly situated.

jsprogrammer · on May 16, 2016

If Google is providing you illegal knowledge, that is Google's problem.

dragonwriter · on May 16, 2016

The law doesn't work that way: knowledge is rarely illegal. Knowingly gathering without permission may be. Using a Google product as a tool in a crime doesn't make Google responsible for the crime and relieve your responsibility.

jsprogrammer · on May 16, 2016

Can you construct a hypothetical situation where clicking a link on a Google result page (or, any page, for that matter) would be a crime?

If such a thing were possible, I would view it as the ultimate betrayal of the browser's "sandbox". Certainly it would be a top priority to categorize links into "known safe to click" and "clicker beware". Who knows, maybe Google's successor will be such an engine.

dragonwriter · on May 16, 2016

"Clicking a link" on a results page by itself is unlikely to be a crime (in any US jurisdiction). OTOH, having a certain criminal intent, constructing a search query aimed to realize that intent, and then clicking on a resulting link to complete the realization of that intent might be.

Zancarius · on May 16, 2016

> Can you construct a hypothetical situation where clicking a link on a Google result page (or, any page, for that matter) would be a crime?

I'm not sure that's even necessary, and there's no point getting into a debate about the browser (you commanded it to do something, after all).

IANAL, but I don't think you need to be one to appreciate the potential for legal trouble. Depending on your interpretation of the CFAA and whether or not you agree with the assertion that the Ninth Circuit limited the scope of the CFAA's reach by requiring a certain degree of intent [1], unauthorized access alone could be construed as a crime. If you want a particularly extreme interpretation of the statute, you can find such almost anywhere you look (here's one from 2005 [2]).

In the latter case, it's notable that if you access material it 1) need not be trademarked, copyrighted, a trade secret, or even particularly sensitive--it need only be "valuable" and 2) unauthorized access is defined rather loosely as accessing "information in the computer that the accessor is not entitled so to obtain." One could argue that password protected resources or databases that are not publicly advertised are not considered something for dissemination to the public and therefore protected by statute.

So, if we apply the CFAA in a manner similar to what you might expect of a prosecutor who is up for re-election this year, let's look at the abuses the article's author committed:

Unauthorized access - check? There's no obvious revocation of the right to access Unilever's MongoDB database, but it probably passes the "reasonable person" test that this information isn't intended to be public. Playing the game of "intent" is a bit risky, so this might be another option in mounting a defense.

"Valuable" information - definite check (the author stated rather plainly: "Within the databases I found personal details like names, e-mail addresses and also private chat logs;" I suspect this would be considered "valuable" information). I don't think this is something I would have admitted. I certainly wouldn't have posted screen captures.

I admit the timing of this is funny, because I was just about to watch a few videos on bosnianbill's YouTube channel earlier when I got to thinking about how inconsistent lockpick possession laws are in the US, and it's interesting how it applies to this story. In some states (notably Tennessee), simply owning a lockpick without the appropriate license can land you a misdemeanor (fine, maybe jail time, depending on my memory of their law), while other states (like my own) require intent and/or possession of multiple "burglary tools" (e.g. a crowbar in addition to a lockpick). While intent alone is insufficient protection from particularly enthusiastic prosecutors, it does at least afford some defense if you wind up in front of a jury. Hoping for the same under the framework of the CFAA is a bit like playing with fire even if you successfully mount a defense (legal costs, opportunity costs from the time wasted on defense, etc).

Not worth it.

[1] http://www.bullivant.com/Computer-Fraud-Abuse-Act

[2] https://www.dorsey.com/newsresources/publications/2005/02/cf...

dragonwriter · on May 16, 2016

Federal prosecutors -- the only ones that can prosecute for criminal CFAA violations -- are Presidential appointees; they are never up for election. So that scenario never actually occurs.

owenmarshall · on May 14, 2016

Google isn't going to back you in any way - why would they?

e12e · on May 14, 2016

I think the parent alludes to Google being named as accomplices along with you. I do however think you're right that that might not mean much for your case. If nothing else, if two parties commit a crime, and one has a major legal team, it seems like the most probable outcome is that the other party will take the fall.

owenmarshall · on May 14, 2016

That's beyond a fantasy - Google isn't party to a conspiracy because they have a built-in affirmative defense: our actions weren't taken to further a conspiracy, they were incidental.

jsprogrammer · on May 15, 2016

The question is whether the content is fair game (to access). Google has already proved it to be fair game and if anyone wants to argue otherwise, they would need to then argue with the most flagrant offender, Google, who has much more than just "Confidential" PDFs.

Google would be guilty of any charge that could be levied against someone for accessing data that Google actively provides.

IanCal · on May 15, 2016

I'm sorry but I think this is rather ridiculous. Google's position is that they have automatically indexed everything that the server said it could, but will remove anything and provide websites a way of doing this.

Your position would have to be that you searched for obviously confidential documents, found them and downloaded them without knowing you shouldn't.

13of40 · on May 15, 2016

Guys, I think we got out in the weeds a little bit with the Google thing. The question is: if someone puts up a web server on the internet with no authentication and no notice that it's not open for public use, can they get me for "unauthorized access" if I download content from it? If not, what makes HTTP special - why not SQL or SMB?

comex · on May 15, 2016

The relevant question is not whether there is an explicit notice, but whether common sense suggests that you are intentionally making unauthorized accesses - as would be the case with the Google search you mentioned.

See also:

https://en.wikipedia.org/wiki/Goatse_Security#AT.26T.2FiPad_...

jsprogrammer · on May 16, 2016

Common sense?

If you send a valid HTTP GET to someone's server and they respond with a 200 OK and some content, the access was not unauthorized. The HTTP protocol actually makes authorization an explicit mechanism that may be disabled or loosened at the implementor's leisure.
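To make the technical half of that claim concrete, here is a minimal sketch of how a server decides between 200 OK and 401 Unauthorized. The function name `status_for`, the `required_credentials` parameter, and the use of Basic authentication are all assumptions chosen for illustration; a real server's policy could look quite different. It shows only what the protocol mechanism does, not what a court would make of it.

```python
import base64
from http import HTTPStatus


def status_for(headers, required_credentials=None):
    """Return the status a hypothetical server would send for a GET.

    required_credentials is a "user:password" string, or None when the
    operator has left authentication switched off for the resource.
    """
    if required_credentials is None:
        # No authorization configured: every well-formed request is served.
        return HTTPStatus.OK
    expected = "Basic " + base64.b64encode(required_credentials.encode()).decode()
    if headers.get("Authorization") == expected:
        return HTTPStatus.OK
    # Authorization configured, but missing or wrong credentials.
    return HTTPStatus.UNAUTHORIZED


print(int(status_for({})), status_for({}).phrase)
print(int(status_for({}, "admin:secret")), status_for({}, "admin:secret").phrase)
```

The first call models the server with authentication disabled, the second the same request against a server that requires credentials.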

comex · on May 16, 2016

To be fair, the EFF took a position in the case I linked that suggests they might agree with you in the present Google hypothetical too:

https://www.eff.org/deeplinks/2013/07/weevs-case-flawed-begi...

Not only that, I was actually surprised to find that the New Jersey court cited a state precedent along similar lines:

http://cdn.arstechnica.net/wp-content/uploads/2014/04/weevru...

->

http://caselaw.findlaw.com/nj-superior-court/1508996.html

...though that was interpreting a state law and brought up the fact that the state law has some subtle differences from the federal CFAA (despite very similar wording, quite vague in both cases).

On the other hand, in Craigslist v. 3Taps, a district judge found that simply evading an IP ban, while otherwise accessing entirely (intentionally) public information, counts as unauthorized access under the federal law. And then there's the case of Aaron Swartz.

But anyway, even under the more permissive of the possible standards, your logic is too simplistic. What if I send an HTTP GET like this?

    GET /viewarticle.php?title=x%27%20UNION%20ALL%20SELECT%20%2A%20FROM%20%27users HTTP/1.1

It's a perfectly valid and well-formed request according to the HTTP standard, and even valid at the application level, in the sense that you technically can't rule out that an article might exist titled "x' UNION ALL SELECT * FROM 'users", and a correctly written server-side script would interpret the request simply as searching for such an article. But suppose the script isn't correct, and instead of showing an article dumps its user table. Would you say that my access to user data is authorized?

Well, I actually don't know how you'd answer the previous question, but I strongly doubt any court would answer yes. If you say no, then the implication follows that either the difficulty of constructing the dubious request, or perhaps the intent, or something else relatively wishy-washy and subjective can make the difference between authorized and unauthorized. It can't be reduced to some strict technical standard.
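For anyone who wants to see the difference between the "correct" and "incorrect" script, here is a rough sketch. It uses Python with an in-memory SQLite database rather than PHP, and the `articles` and `users` tables and their columns are made up for illustration. The parameterized query treats the decoded value as nothing more than a title; the concatenated string is only built and printed, never executed, to show how the same input changes the shape of the statement.

```python
import sqlite3
from urllib.parse import parse_qs, urlsplit

# The request target from the GET line above.
request_target = (
    "/viewarticle.php?title=x%27%20UNION%20ALL%20SELECT%20%2A%20FROM%20%27users"
)
# Percent-decoding recovers the title the client asked for.
title = parse_qs(urlsplit(request_target).query)["title"][0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, body TEXT)")   # hypothetical schema
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")      # hypothetical schema
conn.execute("INSERT INTO articles VALUES ('hello', 'first post')")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

# Correct script: the value is bound as data, so this is just a search
# for an article with an odd title, and no such article exists.
safe_rows = conn.execute(
    "SELECT title, body FROM articles WHERE title = ?", (title,)
).fetchall()

# Incorrect script: the value is pasted into the SQL text, so the quote
# in the title closes the string literal and the rest becomes SQL.
unsafe_sql = "SELECT title, body FROM articles WHERE title = '" + title + "'"

print(title)
print(safe_rows)
print(unsafe_sql)
```

Both scripts receive byte-for-byte the same well-formed request; only the server-side handling differs.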

jsprogrammer · on May 16, 2016

If the script is mixing title comments into executed SQL code, then I don't think there is much hope for it. This line of argument allows post facto rationalization for determining unauthorized access. To make a claim that something was unauthorized is to claim that there is some procedure that can determine whether something is authorized or not. That procedure is the thing that should actually be executed when deciding to serve a request. We are talking about cases where the written procedure says the request was authorized, but someone else claims that the actual procedure gives a different result [insert ad-hoc, post-facto rationalization here (i.e. not policy)].

This is clearly nonsense, though it may take some time for courts to figure it out.

owenmarshall · on May 16, 2016

> This line of argument allows post facto rationalization for determining unauthorized access.

Ah, so the burglar with the bump key is allowed in because the action of the lock determines criminality? "If it opens it's allowed?"

You seem to be making the same fundamental mistake many technical individuals make when they interact with things outside of their knowledge sphere - you're attempting to map a space that is foreign to you into the world you know.

The legal system is not a computer. It does not run on rigid rules. That's actually a really good thing: it allows flexibility in considering whether an action is a crime or not.

There's a spectrum to consider. It's clear on one end that a person who searches for "not for release filetype:pdf" may be looking for historical documents, and a person who attempts a SQL injection against a web application has sufficient guilty knowledge and intent.

jsprogrammer · on May 16, 2016

The legal system does run on rigid rules. Yes, there is no perfect executor (subjectivity will still exist), but the rule of logic still applies. A legal system where you may be convicted of a crime on a whim is not a legal system, it is a farce.

Everyone seems to be ignoring that a 200 OK is explicit authorization, per the protocol. It would be one thing if we were talking about a protocol with no built in authorization primitive, but we aren't. Using HTTP establishes an authorization procedure. Claiming that it may be illegal to receive responses to well-formed requests to the server requires one to make the fundamental mistake of not understanding the technical protocols that are being used to communicate.

The legal system operates on a subset of the logic involved in the technical world. Its ideas and understanding will necessarily lag the reality being created and will be subservient to the logic being established, not adversarial.

Burglary is a crime because it is an intent to commit further crime, not because a door was opened. The difference with an HTTP authorization lock is that the authorizer gets to examine every request and must run their authorization policy on every one. Arguing that the policy that was actually run was "wrong" is an admission of incompetence.

The analogous situation is where a business posts an "OPEN 24/7" sign by their open front door, but shotgun-blasts people who walk through the door.

13of40 · on May 16, 2016

That's a good point. 401 Unauthorized... They even used the right word.

jsprogrammer · on May 15, 2016

Documents are not obviously confidential if there is an established process for removing confidential documents, but the documents still show up in a simple search.

Your position is that you viewed everything that Google thought it could publish in regard to your query. It is ridiculous that someone could be jailed as a result of clicking a link on a Google search result page.

hluska · on May 15, 2016

Consider the Google search that started this:

"not for public release filetype:pdf"

That's a pretty flagrant attempt at accessing confidential documents. It isn't like someone googles "how to catch a roadrunner" and accidentally downloads confidential Acme documents. This is a full on attempt to find poorly secured documents.

Now, consider what Google does. It runs bots (that respect things like robots.txt) and then publishes links to everything that they can find.

Maybe I'm missing some subtlety, but I don't understand how these are similar. Can you explain yourself further?
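For reference, the robots.txt check mentioned above looks roughly like this from the crawler's side. This is a sketch using Python's standard `urllib.robotparser`; the rules, the `ExampleBot` user agent, and the example.com URLs are invented for illustration and say nothing about how Google's crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that a site operator might publish.
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(robots_lines)

# A well-behaved bot asks before fetching each URL.
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/report.pdf")
allowed = parser.can_fetch("ExampleBot", "https://example.com/public/report.pdf")

print(blocked, allowed)
```

Note that robots.txt is advisory: it only works if the operator remembers to list the path and the bot chooses to honor it.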

jsprogrammer · on May 15, 2016

That is a perfectly legitimate query. I would expect to find all manner of historical documents. Further, it does not matter what a document says. Claiming to be not for public release doesn't make it a crime to release it. The only possible exception here is for national secrets, but even then many exceptions have been made.

hluska · on May 15, 2016

Good answer - thanks very much for clarifying!

stale2002 · on May 15, 2016

Because there isn't going to be anything confidential that the search result returns. And anything you access is something that was widely available.

It'd be like googling "Bank of America's Secret Backdoor Password to steal all its money".

hluska · on May 15, 2016

It's possible that I have missed some subtleties in your argument so let me ask for a bit of clarification.

> Because there isn't going to be anything confidential that the search result returns.

Doesn't this assume that sysadmins are actually competent? And isn't there a ton of evidence that suggests that sysadmins have routinely allowed confidential data to be indexed by Google?

In that case, isn't this analogous to what would happen if I left my front door unlocked and you 'broke' in and stole my collection of Taylor Swift CDs? (I don't actually own any Taylor Swift CDs, but it makes my point easier.)

Granted, I did a shitty job of securing my valuable music collection, and Taylor Swift CDs are widely available. But fundamentally, you still came in without permission and took something that belonged to me.

Recent history has shown that you can be prosecuted for all sorts of things in cyberspace. Accessing confidential directories, downloading poorly secured files, and exploiting poorly designed APIs have all been successfully prosecuted.

I wish that we lived in a world where doing things like that would be considered a part of intellectual freedom, but the unfortunate truth is that laws are applied in such a way as to make this highly risky. The silly thing is that the state of the law actually benefits hard core criminals...