unsolvable memory size problems


unsolvable memory size problems

michel.plantie
hello

I have been using Weka for some time now.

I wanted to use the RandomForest classifier
on 2000 instances, each with 12100 numeric attributes.

Unfortunately, the heap of the Java virtual machine is not large enough to run the algorithm.

I used the -Xmx2000m option on a Windows platform,
and even the -Xmx3750m option on a Solaris platform,

but even with this much memory the algorithm crashes with the following message:
"Not enough memory.... load a smaller data set or use larger heap size"

I think the RandomForest algorithm consumes a lot of memory, probably because it is recursive.
Is there any way to reduce the algorithm's memory usage?


Kind regards,

michel

-- 


=========================================================
Michel Plantié
Laboratoire LGI2P
Site EERIE, Ecole des Mines d'Ales
Parc Scientifique Georges Besse 
30035 Nîmes Cedex 1 - France
phone: 33 466387035, fax: 33 466387099
email : [hidden email]
=========================================================

 

Re: unsolvable memory size problems

christian schulz
Hi,

Because I have been running into the same problems with large datasets in Weka these days, here is
a small workflow for R, which seems to me less memory-hungry for randomForest
with many trees.

(1) Take the read.arff function from
Craig A. Struble ( http://www.cs.waikato.ac.nz/ml/weka/example_code/readarff.r ).
(2) Install R and the randomForest package from r-project.org.

data  <- read.arff("c:/yourdata.arff)

splitP_GESAMT <- sample(2,nrow(P_GESAMTarff),replace=T,prob=c(0.7,0.3))
rfP_GESAMT <- randomForest(CLASS  ~ . ,data=P_GESAMT[splitP_GESAMT==1,],na.action=na.omit,importance=T,ntree=1000))
P_GESAMTpred <- predict(rfP_GESAMT,P_GESAMT[splitP_GESAMT==2,])

regards, christian






_______________________________________________
Wekalist mailing list
[hidden email]
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist