applying clustering to Sample data array within SMOTE filter

hisham
I tried to apply k-means clustering inside the SMOTE filter by adding the lines below.
When I run the filter inside the program, it gives me:

weka java problem filtering instances: weka.clusterers.SimpleKMeans: Cannot
handle any class attribute!

How can I solve this issue?



{
  SimpleKMeans kmeans = new SimpleKMeans();
  sample.deleteAttributeAt(0);

  kmeans.setNumClusters(2);
  kmeans.setPreserveInstancesOrder(true);
  // the reported error is thrown here while the class index is still set
  kmeans.buildClusterer(sample);

  // one empty dataset per cluster, sharing the header of sample
  weka.core.Instances[] datasets =
      new weka.core.Instances[kmeans.getNumClusters()];
  for (int i = 0; i < datasets.length; i++) {
    datasets[i] = new Instances(sample, 0);
  }

  // copy each instance into the dataset of its assigned cluster
  for (Instance inst : sample) {
    datasets[kmeans.clusterInstance(inst)].add(inst);
  }
}



--
Sent from: http://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: applying clustering to Sample data array within SMOTE filter

Eibe Frank-2
Administrator
You must have a class attribute set in the Instances object referred to by "sample".

You can use

sample.setClassIndex(-1)

to unset the class before you pass the data to kmeans.

Cheers,
Eibe

> On 26/11/2018, at 2:59 AM, hisham <[hidden email]> wrote:
>
> I tried to apply k-means clustering inside the SMOTE filter by adding the lines below.
> When I run the filter inside the program, it gives me:
>
> weka java problem filtering instances: weka.clusterers.SimpleKMeans: Cannot
> handle any class attribute!
>
> How can I solve this issue?

Re: applying clustering to Sample data array within SMOTE filter

hisham
It worked like a charm.

Thank you very much.

Cheers :)




Re: applying clustering to Sample data array within SMOTE filter

Peter Reutemann
In reply to this post by Eibe Frank-2
> You must have a class attribute set in the Instances object referred to by "sample".
>
> You can use
>
> sample.setClassIndex(-1)
>
> to unset the class before you pass the data to kmeans.

However, make sure that you don't include the class attribute in your
clustering process. The above API call only unsets the indicator for
which attribute in the dataset acts as the class attribute; it doesn't
actually remove the attribute. You can use the FilteredClusterer
meta-clusterer in conjunction with the Remove filter for that.

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: applying clustering to Sample data array within SMOTE filter

hisham
Well, I am clustering samples that have the same class.

So I use:
sample.setClassIndex(-1);

before the clustering, and then I am going to restore the previous class index on
the newly created cluster datasets, and finally apply my functions to those
datasets.
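
The plan described above could be sketched roughly as follows. This is an illustration only, not code from the thread: the class name ClusterWithinClass and the method splitByCluster are made up, and it assumes a Weka version in which these Instances and SimpleKMeans methods are available. Since setClassIndex(-1) only unsets the flag, the formerly flagged attribute's values are still seen by k-means.

```java
import weka.clusterers.SimpleKMeans;
import weka.core.Instance;
import weka.core.Instances;

public class ClusterWithinClass {

    /**
     * Splits sample into one dataset per k-means cluster.
     * The class index is unset only while clustering and restored afterwards,
     * both on sample and on every returned dataset.
     */
    public static Instances[] splitByCluster(Instances sample, int numClusters)
            throws Exception {
        int savedClassIndex = sample.classIndex(); // -1 if no class was set
        sample.setClassIndex(-1);
        try {
            SimpleKMeans kmeans = new SimpleKMeans();
            kmeans.setNumClusters(numClusters);
            kmeans.buildClusterer(sample);

            // empty datasets that share the header of sample
            Instances[] datasets = new Instances[kmeans.numberOfClusters()];
            for (int i = 0; i < datasets.length; i++) {
                datasets[i] = new Instances(sample, 0);
            }
            // copy each instance into the dataset of its assigned cluster
            for (int i = 0; i < sample.numInstances(); i++) {
                Instance inst = sample.instance(i);
                datasets[kmeans.clusterInstance(inst)].add(inst);
            }
            // put the class flag back on the per-cluster datasets
            for (Instances dataset : datasets) {
                dataset.setClassIndex(savedClassIndex);
            }
            return datasets;
        } finally {
            // restore the flag on the input even if clustering failed
            sample.setClassIndex(savedClassIndex);
        }
    }
}
```

The try/finally is a design choice so that the surrounding filter code sees the same class index on sample before and after the call.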






Re: applying clustering to Sample data array within SMOTE filter

Peter Reutemann
> Well, I am clustering samples that have the same class.
>
> So I use:
> sample.setClassIndex(-1);
>
> before the clustering, and then I am going to restore the previous class index on
> the newly created cluster datasets, and finally apply my functions to those
> datasets.

My comment was a caution not to include the information from the
attribute that you had previously flagged as the class attribute in the
clustering process (hence use FilteredClusterer with the Remove filter).
Otherwise you're leaking supervised information into an unsupervised
process, which will change your clustering output.
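
To make the effect concrete, here is a small plain-Java illustration. It is not Weka code and not how SimpleKMeans computes distances internally; the class ClassLeakDemo, its methods, and all values are made up. It only shows that a nearest-centroid assignment can flip once a column holding the former class value is included in a squared Euclidean distance.

```java
public class ClassLeakDemo {

    /** Squared Euclidean distance over the first dims values of a and b. */
    public static double squaredDistance(double[] a, double[] b, int dims) {
        double sum = 0.0;
        for (int i = 0; i < dims; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the nearest centroid, comparing only the first dims values. */
    public static int nearestCentroid(double[] point, double[][] centroids, int dims) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c], dims)
                    < squaredDistance(point, centroids[best], dims)) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // each row is {feature, formerClassAttribute}; the values are hypothetical
        double[][] centroids = { {0.0, 0.0}, {1.0, 1.0} };
        double[] point = {0.4, 1.0};

        System.out.println("feature only: cluster "
                + nearestCentroid(point, centroids, 1));
        System.out.println("feature + former class: cluster "
                + nearestCentroid(point, centroids, 2));
    }
}
```

With these values the point is assigned to cluster 0 on the feature alone and to cluster 1 once the former class column is included.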

Cheers, Peter

Re: applying clustering to Sample data array within SMOTE filter

hisham
I understand, it's better to use FilteredClusterer with the Remove filter
before applying k-means clustering to the data.

What commands can I use in order to do it correctly?

Can I undo the FilteredClusterer with the Remove filter after creating the
clusters?




Re: applying clustering to Sample data array within SMOTE filter

Peter Reutemann
> I understand, it's better to use FilteredClusterer with the Remove filter
> before applying k-means clustering to the data.
>
> What commands can I use in order to do it correctly?

From memory:

import weka.clusterers.FilteredClusterer;
import weka.clusterers.SimpleKMeans;
import weka.filters.unsupervised.attribute.Remove;

Remove remove = new Remove();
remove.setAttributeIndices("last");
SimpleKMeans kmeans = new SimpleKMeans();
// further options for kmeans
FilteredClusterer fc = new FilteredClusterer();
fc.setFilter(remove);
fc.setClusterer(kmeans);

Just use it as any other clustering algorithm.

Javadoc:
http://weka.sourceforge.net/doc.dev/weka/clusterers/FilteredClusterer.html
http://weka.sourceforge.net/doc.dev/weka/filters/unsupervised/attribute/Remove.html

> Can I undo the FilteredClusterer with the Remove filter after creating the
> clusters?

FilteredClusterer does not change the input data. It creates a new
dataset internally using the supplied filter. This internal dataset is
then passed on to the base clusterer for training. At clustering time,
the filter is applied internally again to keep the data consistent for
the base clusterer.

That's the beauty of the FilteredClusterer and FilteredClassifier
meta-algorithms: you can filter the data on the fly for the
base algorithm without having to think about reverting any changes
introduced by a filter.
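
As a rough sketch of how the setup above might be used end to end: the class FilteredKMeansSplit and the method split are made-up names, and the code assumes the class is the last attribute of the data. The unfiltered instances are passed to the FilteredClusterer directly, so the per-cluster datasets keep all original attributes.

```java
import weka.clusterers.FilteredClusterer;
import weka.clusterers.SimpleKMeans;
import weka.core.Instance;
import weka.core.Instances;
import weka.filters.unsupervised.attribute.Remove;

public class FilteredKMeansSplit {

    /**
     * Clusters sample while hiding its last attribute from k-means, then
     * returns one dataset per cluster with all original attributes intact.
     */
    public static Instances[] split(Instances sample, int numClusters)
            throws Exception {
        Remove remove = new Remove();
        remove.setAttributeIndices("last"); // assumes the class is the last attribute

        SimpleKMeans kmeans = new SimpleKMeans();
        kmeans.setNumClusters(numClusters);

        FilteredClusterer fc = new FilteredClusterer();
        fc.setFilter(remove);
        fc.setClusterer(kmeans);
        fc.buildClusterer(sample); // sample itself is passed in unfiltered

        // empty datasets that share the full header of sample
        Instances[] datasets = new Instances[fc.numberOfClusters()];
        for (int i = 0; i < datasets.length; i++) {
            datasets[i] = new Instances(sample, 0);
        }
        // the unfiltered instance is handed over; fc filters it internally
        for (int i = 0; i < sample.numInstances(); i++) {
            Instance inst = sample.instance(i);
            datasets[fc.clusterInstance(inst)].add(inst);
        }
        return datasets;
    }
}
```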

Cheers, Peter