Python + Jupyter are OK, but pandas actually reads everything into memory at once, doesn’t it? 100 MB is no problem, but bigger files could result in heavy swapping.
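To be fair, pandas doesn't have to read everything at once: `read_csv` accepts a `chunksize` argument that yields the file piece by piece. A minimal sketch (the tiny in-memory CSV just stands in for a big file on disk):

```python
import io
import pandas as pd

# Stand-in for a large CSV; in practice you'd pass a file path.
csv_data = io.StringIO("user,amount\na,10\nb,20\na,30\nc,40\n")

# chunksize makes read_csv return an iterator of DataFrames instead of
# loading the whole file into memory at once.
total = 0
for chunk in pd.read_csv(csv_data, chunksize=2):
    total += chunk["amount"].sum()

print(total)  # 100
```

Anything you can express as a running aggregate (sums, counts, group totals) works this way with a roughly constant memory footprint.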
I definitely agree that with this amount of data, you should move to a more programmatic way to handle it... pandas or R.
Keep in mind that pandas (and probably also R?) internally uses optimized structures based on numpy. So a 10 GB CSV, depending on its content, might end up with a much smaller memory footprint inside pandas.
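You can check this yourself with `memory_usage(deep=True)`. A quick sketch comparing the same values stored as Python-object strings versus a packed numpy dtype:

```python
import pandas as pd

# Same values, stored as object-dtype strings vs. a numeric numpy dtype.
as_text = pd.Series(["1", "2", "3"] * 1000)   # object dtype: one Python str per cell
as_numbers = pd.to_numeric(as_text)           # int64 dtype: 8 bytes per cell

text_bytes = as_text.memory_usage(deep=True)
number_bytes = as_numbers.memory_usage(deep=True)

print(text_bytes, number_bytes)  # the numeric column is several times smaller
```

The flip side is that a CSV full of long, unique strings can end up *larger* in memory than on disk, so it cuts both ways depending on the content.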
If you have a 10 GB CSV, I think you will be happy working with pandas locally, even on a laptop. If you go to CSV files with tens of GB, a cloud VM with a corresponding amount of memory might serve you well. If you need to handle big-data-scale CSVs (hundreds of GB or even >TB), a scalable parallel solution like Spark will be your thing.
Before you scale up, however, maybe your task allows you to pre-filter the data and reduce its volume by orders of magnitude... often, thinking the problem through reduces the amount of metal you need to throw at the problem...
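Concretely, `read_csv` lets you combine `usecols` (skip columns you don't need) with chunked filtering, so the full file never sits in memory. A sketch with made-up column names (`region`, `value` etc. are just illustrative):

```python
import io
import pandas as pd

# Stand-in for a large CSV with columns we mostly don't need.
csv_data = io.StringIO(
    "timestamp,region,value,notes\n"
    "2024-01-01,eu,1,x\n"
    "2024-01-02,us,2,y\n"
    "2024-01-03,eu,3,z\n"
)

# Load only the columns we need, keep only the rows we care about,
# chunk by chunk.
kept = []
for chunk in pd.read_csv(csv_data, usecols=["region", "value"], chunksize=2):
    kept.append(chunk[chunk["region"] == "eu"])

filtered = pd.concat(kept, ignore_index=True)
print(len(filtered))  # 2
```

If the filtered result fits comfortably in RAM, you've just replaced a cluster with a for loop.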
Starting to see a lot of these frameworks pop up to simplify deployment of machine learning models. I’m really hoping one or two start to stand out... but this one doesn’t feel like it.
As a data scientist with a BS and an MBA, I can attest to having been disqualified from jobs specifically because of my lack of a PhD. What's troubling is that employers think they need PhDs. It often doesn't matter that I have 10 years of experience applying data science in industry; without that PhD, companies think I'm unqualified.
From my perspective, the best data scientists strike a balance between technical and business knowledge. And it's the business knowledge that PhDs coming straight from academia often lack.