> On 28/08/2019, at 9:44 AM, Nilhee2000 <[hidden email]> wrote:
> Can you explain the following parameters of the random forest classifier in a
> simple way?
> 1-BagSize Percent
By default, if a dataset has N instances, N instances will be sampled *with replacement* for each bag. If you set the bag size to 50, for example, only 0.5 * N instances will be sampled with replacement for each bag that is used to train a RandomTree.
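To make the bag size concrete, here is a minimal Python sketch of how one bag could be drawn. It is not WEKA's actual code: the function name `sample_bag`, the seed handling, and the rounding via `int()` are assumptions made purely for illustration.

```python
import random

def sample_bag(n_instances, bag_size_percent=100, seed=1):
    """Return the indices of one bag, drawn with replacement.

    Hypothetical helper, not part of WEKA; rounding down with int()
    is an assumption.
    """
    rng = random.Random(seed)
    bag_size = int(n_instances * bag_size_percent / 100)
    # Sampling with replacement: the same index may be drawn several times.
    return [rng.randrange(n_instances) for _ in range(bag_size)]

full_bag = sample_bag(1000)      # default: N draws for N instances
half_bag = sample_bag(1000, 50)  # bag size 50: only 0.5 * N draws
print(len(full_bag), len(half_bag))  # prints "1000 500"
```

Because the draws are made with replacement, even the full-size bag will normally contain duplicates and leave some instances out.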
The batch size is quite an obscure parameter, and it is a bit unfortunate that it is exposed for every single classifier in WEKA. Changing it will almost never have any effect. It tells WEKA what batch size to use when a classifier is used in batch prediction mode. For almost all classifiers in WEKA, there is no difference between batch prediction and regular prediction, but for a few, such as the MLRClassifier from RPlugin and the Python-based classifiers, batch prediction is much faster. Batch prediction means that a whole batch of instances is classified at once with the distributionsForInstances(Instances) method, rather than classifying each instance individually with the distributionForInstance(Instance) method. Efficient batch predictors are those classifiers for which the method implementsMoreEfficientBatchPrediction() returns true.
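The difference between the two prediction modes can be sketched roughly as follows. This is a toy Python illustration, not WEKA code: `ToyClassifier` and `predict_in_batches` are hypothetical names, and the two methods only mirror the roles of distributionForInstance and distributionsForInstances.

```python
class ToyClassifier:
    """Hypothetical two-class model: P(class 1) equals the clipped feature value."""

    def distribution_for_instance(self, x):
        p = min(max(x, 0.0), 1.0)
        return [1.0 - p, p]

    def distributions_for_instances(self, xs):
        # An efficient batch predictor would hand the whole batch to R or
        # Python in one call here; this toy version simply loops.
        return [self.distribution_for_instance(x) for x in xs]


def predict_in_batches(model, instances, batch_size=100):
    """Classify the instances in chunks of at most batch_size."""
    results = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        results.extend(model.distributions_for_instances(batch))
    return results


model = ToyClassifier()
data = [i / 9 for i in range(10)]
one_by_one = [model.distribution_for_instance(x) for x in data]
batched = predict_in_batches(model, data, batch_size=4)
print(batched == one_by_one)  # prints "True"
```

The predictions are identical either way; the batch size only changes how many instances are passed along per call, which is why it matters only for classifiers with an expensive per-call overhead.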
The maximum depth is the maximum length of any path in any tree that is grown. Growth of a path stops when this limit is reached, and a leaf node is made. For example, 1 means that you get a forest of random decision stumps (i.e., trees with a single split).
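As a rough illustration of the depth limit, the following Python sketch grows a toy tree over a list of numbers and stops each path at the given depth. It is not how RandomTree is implemented: the split is a fixed median split rather than a random one, the names `grow`, `count_splits`, and `longest_path` are made up for this example, and treating 0 as "unlimited" is my assumption about the parameter's convention.

```python
def grow(values, max_depth, depth=0):
    """Grow a toy tree over sorted numbers; returns nested dicts."""
    # Stop when the depth limit is reached (0 is assumed to mean unlimited)
    # or when there is nothing left to split.
    if len(values) < 2 or (max_depth > 0 and depth >= max_depth):
        return {"leaf": values}
    mid = len(values) // 2
    return {
        "split": values[mid],
        "left": grow(values[:mid], max_depth, depth + 1),
        "right": grow(values[mid:], max_depth, depth + 1),
    }


def count_splits(node):
    """Number of internal (split) nodes in the tree."""
    if "leaf" in node:
        return 0
    return 1 + count_splits(node["left"]) + count_splits(node["right"])


def longest_path(node):
    """Number of splits on the longest root-to-leaf path."""
    if "leaf" in node:
        return 0
    return 1 + max(longest_path(node["left"]), longest_path(node["right"]))


data = list(range(16))
stump = grow(data, max_depth=1)
print(count_splits(stump))                    # prints "1"
print(longest_path(grow(data, max_depth=3)))  # prints "3"
```

With a limit of 1 the tree has exactly one split, which is the decision stump case mentioned above.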