I have been learning R and plan to use it in a Hadoop environment.
Check out: http://www.stat.purdue.edu/~sguha/rhipe/
Increase the JVM heap space for Weka from 128M to 1024M in the properties file:
C:\Program Files\Weka-3-6\RunWeka.ini
Change the entry for maxheap to:
maxheap=1024m
I have been using Hadoop to parse web logs and extract multiple features from each entry. The output is written as comma-separated values, which can then be fed into Weka for clustering analysis.
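As a rough illustration of the parsing step, here is a minimal sketch of a mapper that could be run with Hadoop Streaming. It is not the job I actually run: it assumes the logs are in Apache Common Log Format, and the feature set (host, timestamp, method, path, status, size) and the names `LOG_PATTERN`, `extract_features` and `main` are made up for the example.

```python
import csv
import re
import sys

# Assumption: one request per line in Apache Common Log Format, e.g.
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /a.gif HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def extract_features(line):
    """Return a list of string features for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    size = match.group("size")
    return [
        match.group("host"),
        match.group("time"),
        match.group("method"),
        match.group("path"),
        match.group("status"),
        "0" if size == "-" else size,  # "-" means no body was sent
    ]


def main(stream_in=sys.stdin, stream_out=sys.stdout):
    """Read log lines and write one comma-separated row per parsed line."""
    # csv.writer quotes any field that itself contains a comma.
    writer = csv.writer(stream_out, lineterminator="\n")
    for line in stream_in:
        features = extract_features(line.rstrip("\n"))
        if features is not None:  # silently skip malformed lines
            writer.writerow(features)
```

To use it as a streaming mapper, add a call to `main()` at the end of the script so it reads from standard input and writes to standard output. Going through the `csv` module rather than joining with commas by hand keeps a row intact when a request path happens to contain a comma.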
I have been using Weka rather than Apache Mahout. Reasons:
I will move on to Apache Mahout soon, once I understand how the features relate to one another.