http://atbrox.com/2010/05/08/mapreduce-hadoop-algorithms-in-academic-papers-may-2010-update/
I would like to look at how to parse libpcap files in Hadoop. One problem is that the files are not easily ‘splittable’. However, PCAP files can be parsed in Java using PcapDumper (sample code is in the distribution’s SVN); the data then needs to be serialized using protocol buffers. Watch for this patch.
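To see why splitting is awkward, here is a minimal Python sketch (not the PcapDumper code mentioned above) that walks a classic pcap file record by record. It assumes the standard libpcap layout: a 24-byte global header followed by packets that each carry a 16-byte record header. The names iter_pcap_records and build_sample_pcap are made up for this illustration. Each record's position is known only after reading the length field of the record before it, and there is no sync marker between records, so a reader dropped at an arbitrary byte offset cannot reliably find the next packet boundary.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def iter_pcap_records(stream):
    """Yield (offset, ts_sec, ts_usec, payload) for each packet record."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order used by the writer.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24
    while True:
        rec = stream.read(16)
        if len(rec) < 16:
            return  # clean end of file (or a truncated record header)
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
        payload = stream.read(incl_len)
        if len(payload) < incl_len:
            return  # truncated packet data
        yield offset, ts_sec, ts_usec, payload
        # The next record starts right after this one; there is no marker.
        offset += 16 + incl_len


def build_sample_pcap(payloads):
    """Build a tiny little-endian pcap in memory (hypothetical test helper)."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    for i, data in enumerate(payloads):
        out += struct.pack("<IIII", 1000 + i, 0, len(data), len(data)) + data
    return out
```

For example, list(iter_pcap_records(io.BytesIO(build_sample_pcap([b"abc", b"hello"])))) walks both records in order. A Hadoop input format would have to do something similar from the start of the file, or rely on an index of record offsets built beforehand.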
Eucalyptus is a private cloud-computing platform that implements the Amazon specification for EC2, S3, and EBS.
http://open.eucalyptus.com/downloads
Dr. Rich Wolski’s talk at USENIX LISA 2009. http://www.usenix.org/events/lisa09/stream1/wolski.html
Cassandra version 0.6 supports Hadoop. Check out the documentation here: http://wiki.apache.org/cassandra/HadoopSupport
I have been learning R and plan to use it in a Hadoop environment.
Check out: http://www.stat.purdue.edu/~sguha/rhipe/
A great post on Yahoo’s blog about the Hadoop I/O pipeline: http://developer.yahoo.net/blogs/hadoop/2009/08/the_anatomy_of_hadoop_io_pipel.html
lftp -e 'pget -n url'
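In lftp, pget downloads a single file over several connections at once, and its -n option takes the maximum number of connections. The general idea behind a segmented download can be sketched as follows. This is only an illustration of splitting a file into byte ranges, not lftp's actual algorithm, and split_ranges is a made-up name.

```python
def split_ranges(size, parts):
    """Return inclusive (start, end) byte ranges that together cover a file.

    size  -- total file size in bytes
    parts -- desired number of parallel segments
    """
    if size <= 0 or parts <= 0:
        raise ValueError("size and parts must be positive")
    parts = min(parts, size)  # never create empty segments
    base, extra = divmod(size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # Spread the remainder over the first `extra` segments.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges
```

Each range could then be fetched on its own connection (for HTTP, with a Range header) and written at the matching offset in the output file.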