Consider using the SpreadSubsample filter. It can be used to undersample all classes other than the smallest class in the data. The distributionSpread parameter determines how much data is sampled from each of the other classes.

If the smallest class has M instances, then at most distributionSpread * M instances will be picked from each of the other classes. Thus, if you set distributionSpread to 1, each of the classes will have the same size in the resulting dataset.

The filter performs sampling without replacement.
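
To make the effect of distributionSpread concrete, here is a small plain-Java sketch of the size cap described above. It is not Weka's implementation and does not call Weka: the class name SpreadSketch and the method cappedCounts are hypothetical, and rejecting spreads below 1 is a simplifying assumption of this sketch only. It only computes how many instances each class could keep.

```java
// Illustrative sketch only; NOT Weka code. All names here are hypothetical.
final class SpreadSketch {

    // Caps every class at floor(spread * M), where M is the size of the
    // smallest class. Classes already below the cap keep all their instances.
    static int[] cappedCounts(int[] classCounts, double spread) {
        if (spread < 1.0) {
            // Assumption of this sketch: only spreads >= 1 are handled.
            throw new IllegalArgumentException("spread must be >= 1");
        }
        int smallest = Integer.MAX_VALUE;
        for (int count : classCounts) {
            smallest = Math.min(smallest, count);
        }
        int cap = (int) Math.floor(spread * smallest);
        int[] result = new int[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            result[i] = Math.min(classCounts[i], cap);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] counts = {500, 120, 40}; // made-up class sizes, smallest M = 40
        // spread 1: every class is cut down to M
        System.out.println(java.util.Arrays.toString(cappedCounts(counts, 1.0))); // [40, 40, 40]
        // spread 2: larger classes may keep up to 2 * M instances
        System.out.println(java.util.Arrays.toString(cappedCounts(counts, 2.0))); // [80, 80, 40]
    }
}
```

With three classes like those in the question, a spread of 1 leaves the minority class untouched and reduces only the larger classes.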

If you are working with a learning algorithm that implements the WeightedInstancesHandler interface, consider using ClassBalancer as an alternative filter. It doesn't do any sampling at all and just reweights the instances so that each class has the same total weight.
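
The reweighting idea can be sketched in a few lines of plain Java. Again, this is not Weka code: BalanceSketch and perInstanceWeights are hypothetical names, and the sketch assumes that every instance starts with weight 1 and that the overall sum of weights is kept unchanged.

```java
// Illustrative sketch only; NOT Weka code. All names here are hypothetical.
final class BalanceSketch {

    // Returns one weight per class, to be given to every instance of that
    // class, so that all classes end up with the same total weight.
    // Assumes every class has at least one instance and initial weights of 1.
    static double[] perInstanceWeights(int[] classCounts) {
        int total = 0;
        for (int count : classCounts) {
            total += count;
        }
        // Each class should carry an equal share of the overall weight.
        double targetPerClass = (double) total / classCounts.length;
        double[] weights = new double[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            weights[i] = targetPerClass / classCounts[i];
        }
        return weights;
    }

    public static void main(String[] args) {
        int[] counts = {500, 120, 40}; // made-up class sizes
        double[] w = perInstanceWeights(counts);
        for (int i = 0; i < counts.length; i++) {
            // Each class total (count * weight) comes out equal.
            System.out.println("class " + i + ": weight " + w[i]
                    + ", class total " + counts[i] * w[i]);
        }
    }
}
```

No instance is discarded here: the minority class simply counts for more per instance, which is why the learning algorithm has to be able to handle instance weights.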


Hello everyone!
My data set contains three imbalanced classes. I tried to use
filter.supervised.resample to undersample the majority class, but when I
did this, the number of instances in all classes (majority and minority)
was reduced. Now I want to know: is there any way to apply undersampling
only to the majority class in Weka?

