
Serious question: What do you gain from having an extra layer like docker?


Well, it makes it easy to deploy a scrape node to any type of machine you might encounter. A diverse set of source IPs is important for scraping, which means you might need to deploy to AWS, Azure, Google Cloud, Rackspace, DigitalOcean, random VPS provider X, and so on. Instead of maintaining custom provisioning profiles for every hosting provider/image combination, you just need to get Docker running on a host and you're good to go.


Because you can run pre-packaged Selenium Docker images with a few commands: https://github.com/SeleniumHQ/docker-selenium
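For example, starting a standalone Chrome node is roughly this (image name from the docker-selenium project; the tag and recommended flags may differ by release):

```shell
# Pull and start a standalone Selenium + Chrome container,
# exposing the WebDriver endpoint on port 4444.
# --shm-size avoids Chrome crashing due to the tiny default /dev/shm.
docker run -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome
```

Your tests then point a Remote WebDriver at http://localhost:4444 instead of a locally installed browser.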


Selenium Grid runs in Docker, so it's easy to have multiple instances running, with better control over them.


Also, if you use Kubernetes to manage the grid, you can scale out to your credit card limit on GKE: https://github.com/kubernetes/kubernetes/tree/master/example...
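Scaling then becomes a one-liner along these lines (the resource name here is hypothetical; the linked Kubernetes example defines its own replication controllers for the hub and nodes):

```shell
# Grow the pool of Chrome browser nodes to 20 replicas
# (assumes a replication controller named selenium-node-chrome exists).
kubectl scale rc selenium-node-chrome --replicas=20
```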


What are the advantages of this versus a thread pool of web drivers? I'm not really familiar with Selenium Grid.


Grid can dynamically dispatch based on the browser and capabilities you want when you create the session.
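That dispatch idea can be illustrated with a toy sketch (this is not Grid's actual code, just the matching logic in miniature): the hub keeps a registry of nodes, each advertising its capabilities, and routes a new session request to a node that satisfies the requested capabilities.

```python
def match_node(nodes, requested):
    """Return the first node whose advertised capabilities satisfy the request."""
    for node in nodes:
        caps = node["capabilities"]
        if all(caps.get(key) == value for key, value in requested.items()):
            return node
    return None  # no node can serve this session

# Hypothetical registry of two nodes behind a hub
nodes = [
    {"url": "http://node-a:5555",
     "capabilities": {"browserName": "chrome", "platformName": "linux"}},
    {"url": "http://node-b:5555",
     "capabilities": {"browserName": "firefox", "platformName": "linux"}},
]

print(match_node(nodes, {"browserName": "firefox"})["url"])  # http://node-b:5555
```

A thread pool of local WebDrivers gives you concurrency on one machine with one browser install; Grid additionally lets heterogeneous nodes (different browsers, OSes, hosts) serve a single endpoint, with the hub doing this kind of matching per session.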



