Random forest and number of trees

Random forest and number of trees

Ketil Oppedal
Hi,

is it the parameter "number of iterations" in RF that provides the number of trees?


Best regards,

Ketil

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Random forest and number of trees

Eibe Frank-2
Administrator
Yes, that’s correct.

Cheers,
Eibe

Re: Random forest and number of trees

Ketil Oppedal
Thanks a lot,

In earlier versions, RF had a default of ten trees; now it is one hundred. Why is that?

Regards,
Ketil

Re: Random forest and number of trees

Eibe Frank-2
Administrator
Because 10 is almost never enough in a random forest. It was a silly default. The random forest implementation in R uses 500 trees by default.

Cheers,
Eibe
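
A rough illustration of why 10 trees is rarely enough: the sketch below computes how often a majority vote is right when every tree votes independently and is correct with the same probability p. Both the independence and the value p = 0.6 are assumptions made for this example, not something stated in the thread. Trees in a real forest are correlated, so real gains are smaller, but the trend with ensemble size is the point.

```java
/** Idealized majority-vote accuracy; an illustration only, not WEKA code. */
final class MajorityVoteSketch {

    /**
     * Probability that a majority of n independent voters is correct when
     * each voter is correct with probability p (0 < p < 1).
     * A tie (possible only for even n) counts as half right.
     */
    static double majorityAccuracy(int n, double p) {
        double logChoose = 0.0; // log of C(n, 0)
        double total = 0.0;
        for (int k = 0; k <= n; k++) {
            // Binomial probability of exactly k correct votes, computed in log space.
            double prob = Math.exp(logChoose + k * Math.log(p) + (n - k) * Math.log(1.0 - p));
            if (2 * k > n) {
                total += prob;
            } else if (2 * k == n) {
                total += 0.5 * prob;
            }
            if (k < n) {
                // Update log C(n, k) to log C(n, k + 1).
                logChoose += Math.log(n - k) - Math.log(k + 1);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Ensemble sizes mentioned in the thread: 10 (old default), 100 (new default), 500 (R).
        for (int n : new int[] {1, 10, 100, 500}) {
            System.out.printf("%d trees -> %.4f%n", n, majorityAccuracy(n, 0.6));
        }
    }
}
```

Under these assumptions the computed accuracy keeps rising from 10 to 100 to 500 voters, which is consistent with the reasoning given for raising the default.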

Re: Random forest and number of trees

Michael Hall
> On Oct 12, 2016, at 3:49 PM, Eibe Frank <[hidden email]> wrote:
>
> Because 10 is almost never enough in a random forest. It was a silly default. The random forest implementation in R uses 500 trees by default.

How about Bagging? That also has a default of 10 iterations.

This paper,
Bagging Predictors
http://statistics.berkeley.edu/sites/default/files/tech-reports/421.pdf

in section 6.2, “How Many Bootstrap Replicates Are Enough?”,
suggests 50 for classification, 25 for regression. (Unless I’m incorrect in assuming “bootstrap replicates” and Weka iterations are equivalent.)
These numbers mainly appear to be what seemed right to the author, though.
The example he includes gets
“most of the improvement using only 10 bootstrap replicates.”
although it shows some improvement at 25 and slightly more at 50.

Having different defaults for classification and regression could be a little difficult for Weka since Bagging is selected first and then the actual classifier.

Michael Hall
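
For readers unsure how "bootstrap replicates" relate to iterations, here is a minimal sketch under the assumption made in the post above: each iteration draws one bootstrap sample and fits one base model to it. The base "model" here is deliberately trivial (the sample mean), and all class and method names are hypothetical; none of this is WEKA's Bagging code.

```java
import java.util.Random;

/** Toy bagging loop: one bootstrap replicate per iteration. Illustration only. */
final class BootstrapReplicatesSketch {

    /** Draws one bootstrap replicate: data.length values sampled with replacement. */
    static double[] bootstrapSample(double[] data, Random rng) {
        double[] sample = new double[data.length];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = data[rng.nextInt(data.length)];
        }
        return sample;
    }

    /** Stand-in for training a base learner: the "model" is just the sample mean. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Fits one trivial model per iteration and averages their predictions. */
    static double baggedMean(double[] data, int numIterations, long seed) {
        Random rng = new Random(seed);
        double sum = 0.0;
        for (int it = 0; it < numIterations; it++) {
            sum += mean(bootstrapSample(data, rng));
        }
        return sum / numIterations;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0, 5.0};
        // Replicate counts discussed above: 10 (default), 25 and 50 (paper's suggestions).
        for (int iterations : new int[] {10, 25, 50}) {
            System.out.println(iterations + " replicates -> " + baggedMean(data, iterations, 42L));
        }
    }
}
```

With a real base learner the per-iteration model would be a tree or similar, but the loop structure (sample with replacement, fit, aggregate) stays the same.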




Re: Random forest and number of trees

Eibe Frank-2
Administrator

> On 13/10/2016, at 10:51 AM, Michael Hall <[hidden email]> wrote:
>
> How about Bagging, that also has a default of 10 iterations.
>
> This,
> Bagging Predictors
> http://statistics.berkeley.edu/sites/default/files/tech-reports/421.pdf
>
> 6.2. How Many Bootstrap Replicates Are Enough?
>
> suggests 50 for classification, 25 for regression. (Unless I’m incorrect in assuming “bootstrap replicates” and Weka iterations are equivalent.)
> Although these numbers mainly appear to be what seems correct to the author.
> The example he includes gets
> "most of the improvement using only 10 bootstrap replicates.”
> although it shows some improvement at 25 and slightly more at 50.
>
> Having different defaults for classification and regression could be a little difficult for Weka since Bagging is selected first and then the actual classifier.

The preference is to leave defaults as they are, so that it is easier to reproduce results. However, in the case of RandomForest, we felt that we had to change the default because 10 trees often give very poor performance relative to what you can get with larger forests.

Cheers,
Eibe
Re: Random forest and number of trees

Michael Hall
OK. I’m not sure I’ve noticed much improvement from increasing the bagging iterations the couple of times I’ve tried it, but I have seen a couple of recent references to more iterations being used.

Michael Hall




Re: Random forest and number of trees

rby
In reply to this post by Eibe Frank-2
Why, then, is the number of trees called "number of iterations" instead of
what it really is? I had the same doubt until I found this thread.

Changing the semantics of parameters does not help anybody ;-(

Regards
Ric



Re: Random forest and number of trees

Eibe Frank-3
Hi Ric,

At some stage, I simplified the RandomForest code by making RandomForest a subclass of Bagging (because, in WEKA, RandomForest is basically just Bagging applied with RandomTree as the base learner). That meant that RandomForest inherited the numIterations parameter from there.

The command-line option never changed, it's always been -I, but yes, numTrees was thus replaced by numIterations in the GUI. I think at that point WEKA did not yet provide the ability to hide inherited options, otherwise I may have hidden numIterations and added numTrees back in (which, as you say, is more meaningful).

To improve things a little bit, I have just added more information to the tool tip and the other bits of online help for the numIterations parameter. This should be in the next nightly snapshot.

Thanks for the feedback!

Cheers,
Eibe
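
To make the inheritance point concrete, here is a toy model of the situation described above. The classes ToyBagging and ToyRandomForest and the getNumTrees alias are hypothetical stand-ins written for this note, not WEKA's actual source; only the parameter name numIterations, the -I flag, and the defaults of 10 and 100 come from this thread.

```java
/** Toy model of an inherited option; NOT WEKA's real classes. */
final class InheritedOptionSketch {

    /** Stand-in for a bagging ensemble that owns the iteration count. */
    static class ToyBagging {
        protected int numIterations = 10; // Bagging default mentioned in the thread

        void setNumIterations(int n) {
            if (n < 1) {
                throw new IllegalArgumentException("numIterations must be >= 1");
            }
            numIterations = n;
        }

        int getNumIterations() {
            return numIterations;
        }

        /** Reads "-I <n>" from an option array, mirroring the -I flag named in the thread. */
        void setOptions(String[] options) {
            for (int i = 0; i + 1 < options.length; i++) {
                if (options[i].equals("-I")) {
                    setNumIterations(Integer.parseInt(options[i + 1]));
                }
            }
        }
    }

    /** Stand-in for a forest: bagging with a tree base learner, so it inherits numIterations. */
    static class ToyRandomForest extends ToyBagging {
        ToyRandomForest() {
            numIterations = 100; // newer RandomForest default mentioned in the thread
        }

        /** Hypothetical alias: one tree is built per bagging iteration. */
        int getNumTrees() {
            return getNumIterations();
        }
    }

    public static void main(String[] args) {
        ToyRandomForest forest = new ToyRandomForest();
        forest.setOptions(new String[] {"-I", "500"});
        System.out.println("numIterations = " + forest.getNumIterations()
                + ", trees = " + forest.getNumTrees());
    }
}
```

The subclass never declares its own tree count, so the only name exposed is the inherited one, which is how "number of iterations" ends up meaning "number of trees".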


