number of output decimal places in randomforest

number of output decimal places in randomforest

tortoise

Hello,

I'm working on a Java project that uses WEKA 3.6 and on an algorithm that applies the RandomForest classifier. I'm able to optimize the various options, but whatever options I choose, the probabilities I get from the classifier are rounded to the second decimal place. I found a post mentioning a similar problem, which was solved by increasing the number of trees. The optimal number of trees in my situation is 60, but even when I tried numbers like 200 trees, the probabilities still looked rounded to the second decimal place. In the online Javadoc, I saw that the default is 2 decimal places, and that the GUI uses the option "-num-decimal-places" to set the number to 6. I tried to add the option in my code, but I get an exception:

Exception in thread "main" java.lang.Exception: Illegal options: -num-decimal-places 6

This is how I'm setting the option:

String[] options = new String [] {"-num-decimal-places","6"};
RandomForest t = new RandomForest();
t.setNumTrees(numberOfTrees);
t.setNumFeatures(numberOfFeatures);
t.setOptions(options);

Could you please help me find out what I am doing wrong? Does the option not exist in version 3.6? And if not, is there a way to set it in this older version?

Kind regards,

Yves


_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: number of output decimal places in randomforest

Eibe Frank-2
Administrator
The -num-decimal-places option only affects the textual output of the model, e.g., the number of decimal places in the output of the split points. For example, with WEKA 3.8,

java -cp /Applications/weka-3-8-4/weka.jar weka.Run .RandomForest -t ~/datasets/UCI/sonar.arff -print -num-decimal-places 1

will output the individual decision trees and round the split points in those trees to 1 decimal place.

The discreteness of the probabilities is due to how they are computed. The probability estimates from WEKA’s RandomForest are obtained by averaging the probability estimates from the individual trees (averaging of probability estimates is what is performed in WEKA’s Bagging, and RandomForest just calls Bagging with RandomTree as the base learner). Because those trees are unpruned, they will often return probabilities that are 0 or 1.
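
To illustrate the averaging argument, here is a minimal standalone sketch in plain Java. It does not call WEKA; the class name ForestAveraging, its methods, and the 0.7 vote rate are made up for the illustration, and it assumes every tree returns exactly 0 or 1 for the class of interest. Under that assumption, the average over 60 trees can only be a multiple of 1/60 (steps of about 0.017), which can look like rounding to two decimals.

```java
import java.util.Random;

public class ForestAveraging {

    // Averages the per-tree probability estimates for one class,
    // which is the combination rule described above for bagged trees.
    static double averageProbability(double[] treeProbabilities) {
        double sum = 0.0;
        for (double p : treeProbabilities) {
            sum += p;
        }
        return sum / treeProbabilities.length;
    }

    // Hypothetical stand-in for unpruned trees with pure leaves:
    // each simulated tree returns either 0.0 or 1.0 for the class.
    static double[] simulatePureLeafVotes(int numTrees, double voteRate, long seed) {
        Random rng = new Random(seed);
        double[] votes = new double[numTrees];
        for (int i = 0; i < numTrees; i++) {
            votes[i] = rng.nextDouble() < voteRate ? 1.0 : 0.0;
        }
        return votes;
    }

    public static void main(String[] args) {
        int numTrees = 60;
        for (long seed = 0; seed < 5; seed++) {
            double p = averageProbability(simulatePureLeafVotes(numTrees, 0.7, seed));
            long k = Math.round(p * numTrees);
            // Each averaged probability is k/numTrees for some integer k.
            System.out.printf("p = %.6f = %d/%d%n", p, k, numTrees);
        }
    }
}
```

With real trees, some leaves are not pure, so not every estimate has to land exactly on this grid, but the same coarse spacing tends to dominate.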

A nice way to get around this problem is to adjust the individual trees' probability estimates using the Laplace correction or similar. Unfortunately, this is not currently possible in WEKA’s RandomTree, the base learner for Bagging used in RandomForest. (This is in contrast to REPTree or J48, where it can be done with options -I 1 and -A respectively.) Anyway, adjusting the probability estimates may affect the accuracy of the RandomForest so 60 may no longer be the optimum number.
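
For reference, the Laplace correction itself is simple: a leaf with class counts n_c, total N and k classes returns (n_c + 1) / (N + k) instead of n_c / N. The sketch below is plain Java for illustration only; the class LaplaceLeaf and its methods are hypothetical and are not part of WEKA's RandomTree, which, as noted above, does not currently offer this.

```java
public class LaplaceLeaf {

    // Raw relative frequencies in a leaf: n_c / N.
    // Assumes the leaf holds at least one training instance.
    static double[] rawEstimate(int[] classCounts) {
        int total = 0;
        for (int c : classCounts) {
            total += c;
        }
        double[] probs = new double[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            probs[i] = (double) classCounts[i] / total;
        }
        return probs;
    }

    // Laplace-corrected estimates: (n_c + 1) / (N + k), k = number of classes.
    static double[] laplaceEstimate(int[] classCounts) {
        int total = 0;
        for (int c : classCounts) {
            total += c;
        }
        int numClasses = classCounts.length;
        double[] probs = new double[numClasses];
        for (int i = 0; i < numClasses; i++) {
            probs[i] = (classCounts[i] + 1.0) / (total + numClasses);
        }
        return probs;
    }

    public static void main(String[] args) {
        int[] pureLeaf = {3, 0}; // three instances, all of the first class
        double[] raw = rawEstimate(pureLeaf);
        double[] smoothed = laplaceEstimate(pureLeaf);
        System.out.println("raw:     " + raw[0] + ", " + raw[1]);
        System.out.println("laplace: " + smoothed[0] + ", " + smoothed[1]);
    }
}
```

For the pure leaf {3, 0}, the raw estimate is (1, 0) while the corrected one is (4/5, 1/5), so small leaves no longer vote with full certainty and the averaged forest output becomes less coarse.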

A more drastic option, which is available in RandomForest as it stands, is to prune the individual trees using the following two parameters:

-M <minimum number of instances>
        Set minimum number of instances per leaf.
        (default 1)
-depth <num>
        The maximum depth of the tree, 0 for unlimited.
        (default 0)

Cheers,
Eibe

> On 21/08/2020, at 12:10 AM, Yves Frémat <[hidden email]> wrote:
> [...]

Re: number of output decimal places in randomforest

tortoise
The probabilities are indeed multiples of 1/60. Thank you very much for the
detailed answer and for the quick reply.
Cheers, Yves


