I've been playing with GridGain lately. It's Java, but the API is really sweet: you implement a simple interface that defines the map and reduce operations, and your code is then copied to the cluster nodes through a peer-to-peer classloader.
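To give a rough idea of the shape of such a task, here is a minimal sketch in plain Java. It is not GridGain's actual API: the `SimpleTask` interface, `UrlCheckTask` and `LocalRunner` are names made up for illustration, a local thread pool stands in for the cluster nodes, and the "validation" is only a syntax check rather than a real network request.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical interface (not GridGain's): map splits the input into
// independent jobs, reduce combines the partial results.
interface SimpleTask<T, R> {
    List<Callable<R>> map(T input, int workers);
    R reduce(List<R> partials);
}

// Counts the strings that parse as http/https URLs with a host.
class UrlCheckTask implements SimpleTask<List<String>, Integer> {
    @Override
    public List<Callable<Integer>> map(List<String> input, int workers) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        // Ceiling division so every URL lands in some chunk.
        int size = Math.max(1, (input.size() + workers - 1) / workers);
        for (int i = 0; i < input.size(); i += size) {
            List<String> chunk =
                new ArrayList<>(input.subList(i, Math.min(i + size, input.size())));
            jobs.add(() -> countValid(chunk));
        }
        return jobs;
    }

    @Override
    public Integer reduce(List<Integer> partials) {
        int total = 0;
        for (int p : partials) total += p;
        return total;
    }

    private static int countValid(List<String> urls) {
        int ok = 0;
        for (String s : urls) {
            try {
                URI u = new URI(s);
                boolean http = "http".equalsIgnoreCase(u.getScheme())
                            || "https".equalsIgnoreCase(u.getScheme());
                if (http && u.getHost() != null) ok++;
            } catch (URISyntaxException e) {
                // Malformed string: not counted.
            }
        }
        return ok;
    }
}

// Stand-in for the grid: runs the mapped jobs on a local thread pool.
class LocalRunner {
    static <T, R> R execute(SimpleTask<T, R> task, T input, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<R> partials = new ArrayList<>();
            for (Future<R> f : pool.invokeAll(task.map(input, workers))) {
                partials.add(f.get());
            }
            return task.reduce(partials);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "http://example.com/a.png", "https://example.com/b.jpg",
            "not a url", "ftp://example.com/c.gif");
        System.out.println(LocalRunner.execute(new UrlCheckTask(), urls, 2));
    }
}
```

On a real grid the jobs would be shipped to other machines instead of local threads, but the split into a map step and a reduce step stays the same.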
In a recent test I validated a million image URLs in less than a minute on a small EC2 cluster running GridGain.
It's certainly worth looking into if you are interested in that kind of stuff.
http://www.gridgain.org