Feature selection

Feature selection

abasian
I want to use a few feature selection algorithms. When I select the Wrapper method, it selects only 1 or at most 2 features out of 40, which is very surprising.
When I use CfsSubsetEval, it behaves OK and selects around 10 features. Which method should I use here, and which should I avoid?
Thanks in advance. 

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Feature selection

Eibe Frank-2
Administrator

Which approach, when used inside a k-fold cross-validation or similar, gives the greater estimated accuracy?

 

Note that the default base classifier inside WrapperSubsetEval is ZeroR, so with its default configuration, you will never select any attributes at all.
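To see why: ZeroR predicts the majority class and ignores the attributes entirely, so every candidate subset gets the same estimated accuracy and the search can never improve on the empty set. A toy illustration in plain Python (not Weka code):

```python
from collections import Counter

def zero_r_accuracy(labels):
    """ZeroR predicts the majority class, regardless of any features."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

labels = ["yes"] * 6 + ["no"] * 4

# The score is identical for EVERY feature subset, because ZeroR never
# looks at the features, so a wrapper search finds no subset that beats
# the empty one and selects no attributes.
print(zero_r_accuracy(labels))  # 0.6, whichever subset you "use"
```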

 

Cheers,

Eibe

 

From: [hidden email]
Sent: Tuesday, 9 July 2019 2:09 AM
To: [hidden email]
Subject: [Wekalist] Feature selection

 

Re: Feature selection

asadbtk
I had the same problem in the past and stopped working on it because of it. I was confused about whether to use CfsSubsetEval or the Wrapper method; the search algorithm I used was a Genetic Algorithm. I would appreciate it if Eibe could explain the difference here so the confusion is cleared up for all of us.

On Tue, Jul 9, 2019 at 9:16 AM Eibe Frank <[hidden email]> wrote:

Re: Feature selection

Eibe Frank-2
Administrator

When applying WrapperSubsetEval, the key is to go into its settings and change the classifier from ZeroR to something more appropriate. For example, if you want to select attributes that work well for building a J48 decision tree, use J48 as the classifier setting in WrapperSubsetEval. Read up on Ronny Kohavi’s work on the wrapper method for attribute subset evaluation if you want to get a better understanding of how it works.
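For intuition, here is a minimal wrapper-style forward search in plain Python, using leave-one-out 1-nearest-neighbour accuracy as the subset score. This is an illustrative sketch, not WEKA's implementation; WrapperSubsetEval scores subsets with internal cross-validation of whatever classifier you configure.

```python
def loo_accuracy_1nn(X, y, subset):
    """Leave-one-out accuracy of 1-nearest-neighbour restricted to `subset`."""
    if not subset:
        # No features: fall back to majority-class accuracy (ZeroR's score).
        return max(y.count(c) for c in set(y)) / len(y)
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j != i:
                d = sum((X[i][f] - X[j][f]) ** 2 for f in subset)
                if d < best_d:
                    best_d, best_j = d, j
        correct += y[best_j] == y[i]
    return correct / len(y)

def forward_wrapper_selection(X, y):
    """Greedy forward search: keep adding the feature that most improves
    the wrapped classifier's estimated accuracy; stop when nothing helps."""
    selected = []
    remaining = list(range(len(X[0])))
    best = loo_accuracy_1nn(X, y, selected)
    while remaining:
        score, f = max((loo_accuracy_1nn(X, y, selected + [f]), f)
                       for f in remaining)
        if score <= best:
            break
        best = score
        selected.append(f)
        remaining.remove(f)
    return selected

# Feature 0 separates the two classes; feature 1 is pure noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 2.0], [1.1, 8.0], [1.2, 3.0]]
y = ["a", "a", "a", "b", "b", "b"]
print(forward_wrapper_selection(X, y))  # [0]
```

Because feature 1 is noise, only feature 0 survives the search: the behaviour you get in WEKA once a real classifier replaces ZeroR.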

 

CfsSubsetEval, as a so-called “filter” approach to attribute subset evaluation, can be much faster than the wrapper method because it does not involve building a classifier. Instead, a correlation-based measure is used to measure the quality of each attribute subset encountered in the search through attribute subset space.
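Concretely, the heuristic CfsSubsetEval optimises is Hall's "merit": for a subset of k features with average feature-class correlation r_cf and average feature-feature correlation r_ff, merit = k * r_cf / sqrt(k + k(k-1) * r_ff). A quick sketch of how it trades relevance against redundancy (the correlation numbers below are made up):

```python
import math

def cfs_merit(k, avg_feat_class_corr, avg_feat_feat_corr):
    """CFS merit of a k-feature subset: rewards correlation with the
    class (numerator), penalises redundancy among features (denominator)."""
    return (k * avg_feat_class_corr) / math.sqrt(
        k + k * (k - 1) * avg_feat_feat_corr)

# Two 10-feature subsets, equally correlated with the class:
# the less redundant one scores higher.
redundant = cfs_merit(10, 0.4, 0.8)
diverse = cfs_merit(10, 0.4, 0.1)
print(diverse > redundant)  # True
```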

 

Either subset evaluator can be used with the different search algorithms implemented in WEKA, e.g., GeneticSearch. However, if you combine a computationally expensive search method such as a genetic search with an expensive attribute subset evaluator such as WrapperSubsetEval, you had better allocate plenty of compute time.

 

Regardless of which attribute subset selection method you use, make sure you use it inside the AttributeSelectedClassifier (or the FilteredClassifier) so that attributes are selected based on the information in the *training* data only and you won’t get optimistically biased performance estimates on the test data.
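That protocol can be sketched as follows, a plain-Python outline of what AttributeSelectedClassifier guarantees, with hypothetical stand-ins for the evaluator and classifier:

```python
from collections import Counter

def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(X, y, k, select_features, fit, predict):
    """Correct protocol: feature selection runs INSIDE each training fold,
    so the held-out fold never influences which features are kept."""
    correct = 0
    for fold in kfold_indices(len(X), k):
        train_idx = [i for i in range(len(X)) if i not in fold]
        Xtr = [X[i] for i in train_idx]
        ytr = [y[i] for i in train_idx]
        feats = select_features(Xtr, ytr)  # training data only!
        model = fit([[row[f] for f in feats] for row in Xtr], ytr)
        for i in fold:
            correct += predict(model, [X[i][f] for f in feats]) == y[i]
    return correct / len(X)

# Trivial stand-ins (hypothetical, just to make the outline runnable):
# keep every feature, and let the "model" predict the training majority class.
def select_all(X, y):
    return list(range(len(X[0])))

def fit_majority(X, y):
    return Counter(y).most_common(1)[0][0]

def predict_majority(model, x):
    return model

X_demo = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_demo = ["a"] * 6
print(cross_val_accuracy(X_demo, y_demo, 3,
                         select_all, fit_majority, predict_majority))  # 1.0
```

Running selection on the full dataset first and then cross-validating is the biased variant this paragraph warns against.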

 

Cheers,

Eibe

 

Re: Feature selection

asadbtk
Thanks Eibe for your explanation. Much appreciated.

Regards


On Wed, Jul 10, 2019 at 7:57 AM Eibe Frank <[hidden email]> wrote:
