
The content falls under copyright law. The problem is that you have to enter the company's servers to obtain this data, and the CFAA says that a company can treat its public-facing web servers like private property; if you're caught "trespassing", you can be sued and jailed. Scraping plaintiffs are usually granted an injunction based on "trespass to chattels" (among other rationales), i.e., trespass to personal property (as opposed to land).

Companies like PriceZombie are forced to stop because the CFAA says that Amazon can bar them from accessing its servers by decree alone. A ToS isn't even really necessary for this, but it helps Amazon pin down its argument.

PriceZombie could try to get the data from third-party caches, but that only solves part of the problem, because copyright and trademarks come back into the picture once you have a replica of the target page. In Ticketmaster v. RMG Technologies, the judge found RMG infringing on Ticketmaster's trademarks and copyrights because the page they were scraping included Ticketmaster's logo. The judge said the copy of the full page that existed momentarily in RAM while the scraper extracted the non-copyrightable data constituted a copy that infringed on Ticketmaster's rights, even though the logo was never used by the application in any way; it just happened to be on the page.


