
I apologize for changing topic here:

Did you bulk download the arXiv metadata, PDFs, and/or LaTeX files?

I am trying to figure out how much space is required for just the most recent version of each PDF.

I can find mentions of the total size of their S3 bucket, but it is unclear whether that also includes older versions of the PDFs.

I also wonder whether the Kaggle dataset is kept up to date, since it lists only 1.7M articles instead of the 2.4M I read about elsewhere.

Edit: I just found the answers to my question here: https://info.arxiv.org/help/bulk_data_s3.html


