Unexpected output by Weka

Unexpected output by Weka

asadbtk
I use Genetic Search, Ant Search and PSO as feature selection algorithms on a dataset with 40 features. I am using Random Forest and MultiSearch for parameter optimization; in MultiSearch I wrap AttributeSelectedClassifier. It is a regression problem, so the output includes CC, RMSE, etc. I get different results, for example RMSE values of 0.99, 0.97 and 1.07 for the Genetic, Ant and PSO searches. Things seem normal up to here, but when I checked, the sets of features the three feature selection algorithms had selected were exactly the same.

Now, how on earth is it possible that I used the same ML model (Random Forest), the same dataset and the same optimization algorithm, got the same set of features, and still got different results? Is this abnormal for Weka? What causes Weka to change the results when all three algorithms select and filter the same set of features?
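
For reference, here is a minimal sketch of the setup described above (not the poster's exact configuration): it assumes CfsSubsetEval as the subset evaluator (the post does not say which one was used), uses GeneticSearch, and omits the MultiSearch wrapper, which comes from a separate Weka package. The Ant and PSO search methods likewise ship as separate packages and would simply replace the setSearch(...) argument.

import java.util.Random;

import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GeneticSearch;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.AttributeSelectedClassifier;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class FeatureSelectionCV {
  public static void main(String[] args) throws Exception {
    // Load the data ("dataset.arff" is a placeholder path); the class
    // attribute is assumed to be the last one and numeric (regression).
    Instances data = DataSource.read("dataset.arff");
    data.setClassIndex(data.numAttributes() - 1);

    // Wrap Random Forest so that attribute selection is re-run on each
    // training fold of the cross-validation.
    AttributeSelectedClassifier asc = new AttributeSelectedClassifier();
    asc.setClassifier(new RandomForest());
    asc.setEvaluator(new CfsSubsetEval()); // evaluator choice is an assumption
    asc.setSearch(new GeneticSearch());    // or an Ant/PSO search method

    // 10-fold cross-validation; CC and RMSE are aggregated over the folds.
    Evaluation eval = new Evaluation(data);
    eval.crossValidateModel(asc, data, 10, new Random(1));
    System.out.printf("CC   = %.4f%n", eval.correlationCoefficient());
    System.out.printf("RMSE = %.4f%n", eval.rootMeanSquaredError());
  }
}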


Re: Unexpected output by Weka

Peter Reutemann-3
On August 27, 2019 7:34:54 PM GMT+10:00, javed khan <[hidden email]> wrote:

> [...]

The schemes may choose the same attributes when applied to the full training data, but they may choose different ones during cross-validation. Hence the slightly different results, which are computed from the X cross-validation folds.
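
As a rough illustration (a sketch, assuming CfsSubsetEval and GeneticSearch; the Ant and PSO search methods come from separate Weka packages), you can re-run the attribute selection on each training fold and compare the chosen subsets:

import java.util.Arrays;
import java.util.Random;

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GeneticSearch;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class PerFoldSubsets {
  public static void main(String[] args) throws Exception {
    Instances data = DataSource.read("dataset.arff"); // placeholder path
    data.setClassIndex(data.numAttributes() - 1);
    data.randomize(new Random(1)); // use the same seed as the evaluation run

    int folds = 10;
    for (int i = 0; i < folds; i++) {
      // The selection only ever sees this fold's training data.
      Instances train = data.trainCV(folds, i, new Random(1));
      AttributeSelection sel = new AttributeSelection();
      sel.setEvaluator(new CfsSubsetEval()); // evaluator is an assumption
      sel.setSearch(new GeneticSearch());
      sel.SelectAttributes(train);
      // 0-based attribute indices; the class index is appended last.
      System.out.println("Fold " + i + ": "
          + Arrays.toString(sel.selectedAttributes()));
    }
  }
}

The printed subsets will typically not be identical across all folds, even when the selections on the full dataset agree between search methods.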

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: Unexpected output by Weka

asadbtk
Hello Peter, thanks for your reply. 

However, what explanation should one give for this if we have to write a report on the classifiers' results? Is there any literature on the web that explains this scenario?

On Tuesday, August 27, 2019, Peter Reutemann <[hidden email]> wrote:
> [...]

Re: Unexpected output by Weka

Peter Reutemann-3
On August 27, 2019 9:40:14 PM GMT+10:00, javed khan <[hidden email]> wrote:

> [...]

That's nothing special. Different data will usually give you different results for the various attribute selection schemes.

Look at the results from the different folds. The sets of selected attributes will most likely differ slightly, resulting in the small differences in the summary statistics.

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: Unexpected output by Weka

Eibe Frank-3
In the most recent versions of WEKA, you can turn on output of the models obtained from the individual training folds of a k-fold cross-validation, under “More options...”, to satisfy your curiosity.
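
Programmatically, a rough equivalent (a sketch, not the exact output of the GUI option) is to build the wrapped classifier on each training fold and print it; AttributeSelectedClassifier's toString() includes the attributes selected on that fold:

import java.util.Random;

import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GeneticSearch;
import weka.classifiers.meta.AttributeSelectedClassifier;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class PerFoldModels {
  public static void main(String[] args) throws Exception {
    Instances data = DataSource.read("dataset.arff"); // placeholder path
    data.setClassIndex(data.numAttributes() - 1);
    data.randomize(new Random(1)); // mirror the cross-validation's seed

    int folds = 10;
    for (int i = 0; i < folds; i++) {
      Instances train = data.trainCV(folds, i, new Random(1));
      AttributeSelectedClassifier asc = new AttributeSelectedClassifier();
      asc.setClassifier(new RandomForest());
      asc.setEvaluator(new CfsSubsetEval()); // evaluator is an assumption
      asc.setSearch(new GeneticSearch());
      asc.buildClassifier(train);
      System.out.println("=== Model for fold " + i + " ===");
      System.out.println(asc); // lists the attributes selected on this fold
    }
  }
}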

Cheers,
Eibe


On Tue, 27 Aug 2019 at 11:40 PM, javed khan <[hidden email]> wrote:
> [...]