Feature selection on unsupervised data

Ens SADEG Souhila
Hello everybody,

I would like to perform feature selection on unsupervised data, but PCA is not suitable for my application domain. Could you tell me how I can do it? Is there any other algorithm or method?

One solution I'm trying to experiment with is to run a clustering and then label the data according to the assignments the clusterer returns. Could this be an acceptable solution?

Best regards, and thank you in advance to anyone who can help.




_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Feature selection on unsupervised data

Eibe Frank-2
Administrator

> On 28/05/2017, at 9:07 AM, Ens SADEG Souhila <[hidden email]> wrote:
>
> I would like to perform feature selection on unsupervised data, but PCA is not suitable for my application domain. Could you tell me how I can do it? Is there any other algorithm or method?
>
> One solution I'm trying to experiment with is to run a clustering and then label the data according to the assignments the clusterer returns. Could this be an acceptable solution?

This seems a perfectly reasonable approach to me. It should give you the attributes that are primarily responsible for separating the clusters.
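
As a rough illustration of the cluster-then-label idea, here is a minimal sketch in plain Python (not WEKA). Everything in it is an assumption made for the example: the toy data, the attribute names, the simple k-means with farthest-point initialisation, and the choice of a between-cluster to within-cluster sum-of-squares ratio as the per-attribute score. Any supervised attribute evaluator could be used in place of that score once the cluster labels exist.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def farthest_point_init(rows, k):
    """Deterministic start: first row, then repeatedly the row farthest from all chosen centers."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    return centers


def kmeans(rows, k, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster index per row."""
    centers = farthest_point_init(rows, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(row, centers[c])) for row in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # leave the center where it is if the cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def separation_score(values, labels):
    """Between-cluster over within-cluster sum of squares for one attribute."""
    overall = sum(values) / len(values)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    means = {lab: sum(g) / len(g) for lab, g in groups.items()}
    between = sum(len(g) * (means[lab] - overall) ** 2 for lab, g in groups.items())
    within = sum((v - means[lab]) ** 2 for lab, g in groups.items() for v in g)
    return between / within if within > 0 else float("inf")


def rank_attributes(rows, names, k):
    """Cluster the rows, treat cluster indices as class labels, rank attributes by score."""
    labels = kmeans(rows, k)
    scores = {n: separation_score([row[i] for row in rows], labels) for i, n in enumerate(names)}
    return sorted(names, key=lambda n: scores[n], reverse=True)


# Toy data (hypothetical): "a" and "b" carry a two-group structure, "noise" does not.
names = ["a", "b", "noise"]
rows = [
    [1.0, 1.5, 5.0], [1.1, 0.6, 1.0], [0.9, 1.0, 3.0], [1.2, 1.3, 4.0],
    [5.0, 5.4, 4.5], [5.2, 4.6, 1.5], [4.9, 5.0, 3.5], [5.1, 5.6, 2.0],
]
ranking = rank_attributes(rows, names, k=2)
print(ranking)  # attribute names, most cluster-separating first
```

Keeping the first n names of the ranking gives a selection made of original attributes. Note that the result depends on the clustering: attributes on very different scales would normally be standardised first, and a different clusterer or number of clusters can change the ranking.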

There are several more sophisticated approaches for unsupervised feature selection in the literature, but they are not implemented in WEKA.

If your goal is unsupervised feature *extraction*, there are some alternatives to PCA in WEKA:

a) you can build an autoencoder using the MLPAutoencoder class in the multiLayerPerceptrons package

b) you can perform approximate kernel PCA using the Nystroem filter in the largeScaleKernelLearning package in conjunction with PCA.

These two methods can extract features that are not just weighted sums of the original attributes.
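
To make the "not just weighted sums" point concrete, here is a toy sketch in plain Python. It is neither the MLPAutoencoder nor the Nystroem filter; it only contrasts a weighted sum of the original attributes with one hand-picked RBF kernel feature. The data (two concentric rings), the landmark at the origin and the value of gamma are all assumptions chosen for the illustration.

```python
import math


def ring(radius, n=8):
    """n evenly spaced 2-D points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


def weighted_sum(point, weights):
    """A linear feature: the kind of combination PCA produces."""
    return sum(w * x for w, x in zip(weights, point))


def rbf_feature(point, landmark=(0.0, 0.0), gamma=0.5):
    """A nonlinear feature: RBF kernel value between the point and a landmark."""
    return math.exp(-gamma * sum((x - l) ** 2 for x, l in zip(point, landmark)))


def separates(first, second):
    """True if a single threshold on the feature splits the two groups."""
    return max(first) < min(second) or max(second) < min(first)


inner, outer = ring(1.0), ring(3.0)

# A few weight vectors; on this data no weighted sum puts the rings on
# opposite sides of a threshold, because the outer ring surrounds the inner one.
directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -2.0)]
linear_ok = [
    separates([weighted_sum(p, w) for p in inner], [weighted_sum(p, w) for p in outer])
    for w in directions
]

# The RBF feature depends on distance from the landmark, so it tells the rings apart.
rbf_ok = separates([rbf_feature(p) for p in inner], [rbf_feature(p) for p in outer])

print(linear_ok, rbf_ok)
```

The extracted feature here is useful precisely because it is not a linear combination of the two inputs, which is also why it cannot be read as "attribute x was selected".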

Cheers,
Eibe

Re: Feature selection on unsupervised data

Ens SADEG Souhila
Thank you very much for your answer.

In my application domain I must perform dimensionality reduction by choosing the n most relevant attributes among N attributes (N > n). It is important for me to know which of the N original attributes are selected. This is why PCA and similar methods are not of interest to me.

Thank you again for your answer; I had been waiting for it for several days :)

Cheers
Souhila