We're considering both, but mainly listening to the "community voice" right now :)
Linking to data on other sites (including archive.org) seems like a better way to go than pulling everything in (as you mentioned, storage requirements become a challenge fairly quickly).