Discrepancy between identifying entities in clustering approaches


Discrepancy between identifying entities in clustering approaches

Ernie Chang
To find which entities belong to which clusters when doing unsupervised learning with EM, I get different results if I use Preprocess > Filter > Choose EM [parameters] than if I start with the same dataset and use the Cluster tab > Choose EM [same parameters].

I deduce this by specifying 8 clusters: the breakdown of entities per cluster from the Cluster tab is not the same as the assigned cluster numbers for the entities in Preprocess, which I find by saving the dataset with the added field and counting the classes.

Are they supposed to be the same, or am I missing something important?

Thanks

Ernie Chang
Victoria
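
The "save the dataset with the added field and count the classes" check described above can be sketched in a few lines of Python. This is only an illustration, assuming a dense ARFF file with the cluster attribute in the last column and no quoted values containing commas; the function name `cluster_counts` and the sample data are made up for the example.

```python
from collections import Counter

def cluster_counts(arff_text, cluster_column=-1):
    """Count instances per cluster label in the @data section of a dense ARFF file."""
    counts = Counter()
    in_data = False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if not in_data:
            # Everything before @data is header (relation and attribute declarations).
            in_data = line.lower() == "@data"
            continue
        # Assumption: plain comma-separated values, no quoted commas.
        counts[line.split(",")[cluster_column].strip()] += 1
    return counts

# Hypothetical saved dataset with a cluster attribute appended as the last column.
sample = """@relation demo-clustered
@attribute x numeric
@attribute y numeric
@attribute cluster {cluster1,cluster2,cluster3}
@data
1.0,2.0,cluster1
1.1,2.1,cluster1
5.0,6.0,cluster2
9.0,9.5,cluster3
5.2,6.1,cluster2
1.2,1.9,cluster1
"""

counts = cluster_counts(sample)
for label, n in sorted(counts.items()):
    print(label, n)
```

The per-cluster counts can then be compared by eye with the cluster sizes reported in the Cluster tab output.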

_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Discrepancy between identifying entities in clustering approaches

Eibe Frank-2
Administrator
Good question. The reason is that the AddCluster filter removes the class attribute before the clusterer is applied. You need to select “No class” in the Preprocess Panel before applying AddCluster if you want the same result as in the Cluster Panel (with default settings).

Cheers,
Eibe


Re: Discrepancy between identifying entities in clustering approaches

Ernie Chang
Thank you Frank
I didn't think I had any Class covariates, but I will check out your advice.
Ernie


Re: Discrepancy between identifying entities in clustering approaches

Eibe Frank-3
By default, the Preprocess panel will automatically select the last attribute as the class attribute. You will need to explicitly deselect it.

Cheers,
Eibe


Re: Discrepancy between identifying entities in clustering approaches

Ernie Chang
Hi Frank
If by deselecting you mean 'remove', then I am taking away the attribute of interest in clustering. For some reason Weka seems to insist the last attribute is special. Is that only for classification?

In any case I need to find the entity numbers corresponding to the clusters. If the Cluster tab gives the correct clustering, how do I get the results I want?

Does adding an ID in Preprocess affect the cluster result?
Thanks
Ernie


Re: Discrepancy between identifying entities in clustering approaches

Ernie Chang
In reply to this post by Eibe Frank-3
Hi,
I am failing to understand some fundamentals.

I have a dataset with N entities in rows and Y attributes.

I assume that after some clustering algorithm like EM is applied, I can find which entities have been assigned to which clusters.

It seems I cannot do this in the Cluster tab, as the only thing I can save is the overall analysis results, not the assignments.

If I add an ID in Preprocess, the overall results change!

If I add an AddCluster filter, the results change again; they become part of the dataset, it seems.

And I don't understand: if the last attribute in my dataset is important, why should I remove it from the cluster analysis?

I have not bought the books, but the online materials do not clarify this.

Could you please help me understand how I might get a list of entities assigned to clusters without invoking the Uncertainty Principle? Using EM and 8 clusters as an example?

Thanks
Ernie


Re: Discrepancy between identifying entities in clustering approaches

Keith Werle
In reply to this post by Ernie Chang
I might suggest using the AddCluster filter.

You can tell it to ignore any attribute by giving the attribute number; in this case, the number that corresponds to the class value.
If I recall correctly, you don’t need to also unselect the class value, but you can to be safe.
I also believe you can select the clusterer to employ in it (k-means or other).

When clustering, even with the same random seed value, I have found that the cluster numbers can differ between the two approaches (cluster panel or classify). However, they are the same clusters with the same members. It’s a bit frustrating, and it’s been a while since I did it.
But the best thing to do is usually to output the clusters and compare member counts.

I usually end up saving the assignments to file and doing a compare.

I hope that helps


Keith
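
The comparison described above (same clusters and members, different cluster numbers) can be checked mechanically once both sets of assignments are saved. The following is a minimal Python sketch, not anything Weka provides: the function names and the label lists are hypothetical, and it assumes both lists are in the same instance order.

```python
from collections import Counter

def same_partition(labels_a, labels_b):
    """True if both labelings group the instances identically, ignoring label names."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same instances")
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # Each label in one run must map to exactly one label in the other run.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True

def size_profile(labels):
    """Sorted cluster sizes, a quick check that does not depend on cluster numbering."""
    return sorted(Counter(labels).values())

# Hypothetical assignments for five instances from two runs.
panel = ["cluster1", "cluster1", "cluster2", "cluster3", "cluster2"]
filt = ["cluster3", "cluster3", "cluster1", "cluster2", "cluster1"]
other = ["cluster1", "cluster2", "cluster2", "cluster3", "cluster1"]

print(same_partition(panel, filt))
print(size_profile(panel) == size_profile(other), same_partition(panel, other))
```

Matching member counts are a useful first check, but as the `other` list shows, two runs can have identical cluster sizes and still group the instances differently, so comparing the saved assignments is the stronger test.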


Re: Discrepancy between identifying entities in clustering approaches

Eibe Frank-2
Administrator
In reply to this post by Ernie Chang
Select “No class” from the drop-down menu above the plot showing the histogram.

Yes, adding the ID will affect the result if you do not configure the filter/clusterer to skip that attribute.

Cheers,
Eibe
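
To see why an ID column that is not skipped can change the clustering, here is a toy Python sketch. It uses a plain nearest-centroid rule with Euclidean distance rather than EM, purely for illustration, and all values are invented; the point is only that an extra attribute becomes part of what the clusterer models.

```python
import math

def nearest_centroid(point, centroids):
    """Index of the centroid closest to point by Euclidean distance."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

# Hypothetical data: one real feature per instance.
point = (1.0,)
centroids = [(0.0,), (3.0,)]

# The same data with a leading ID column that is not ignored.
point_with_id = (100.0, 1.0)
centroids_with_id = [(1.0, 0.0), (99.0, 3.0)]

without_id = nearest_centroid(point, centroids)
with_id = nearest_centroid(point_with_id, centroids_with_id)
print(without_id, with_id)
```

Without the ID the point is assigned to the first centroid; with the ID included, the large ID values dominate the distance and the assignment flips to the second one.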


Re: Discrepancy between identifying entities in clustering approaches

Ernie Chang
YESSS!!
It all makes sense now: a combination of Class, Ignore and filter Adds in the proper sequence.

Thank you.
Ernie Chang


Re: Discrepancy between identifying entities in clustering approaches

Keith Werle
Awesome
So glad you figured it out. 
And no, it’s not obvious even for us old seasoned users.

Enjoy

Sent from my iPhone

On Sep 22, 2019, at 1:29 PM, Ernie Chang <[hidden email]> wrote:

YESSS!!
It all makes sense now....combination of Class, Ignore and filter Adds in proper sequences.

Thank you.
Ernie Chang

On Sun, Sep 22, 2019, 23:19 Eibe Frank, <[hidden email]> wrote:
Select “No class” from the drop-down menu above the plot showing the histogram.

Yes, adding the ID will affect the result if you do not configure the filter/clusterer to skip that attribute.

Cheers,
Eibe

> On 22/09/2019, at 7:14 PM, Ernie Chang <[hidden email]> wrote:
>
> Hi Frank
> If by deselecting you mean 'remove', then I am taking away the attribute of interest in clustering. For some reason Weka seems to insist the last attribute is special. Is that only for classification?
>
> In any case, I need to find the entity numbers corresponding to the clusters. If the Cluster tab gives the correct clustering, how do I get the results I want?
>
> Does adding an ID in Preprocess affect the clustering result?
> Thanks
> Ernie
>
> On Sun, Sep 22, 2019, 01:20 Eibe Frank, <[hidden email]> wrote:
> By default, the Preprocess panel will automatically select the last attribute as the class attribute. You will need to explicitly deselect it.
>
> Cheers,
> Eibe
>
> On Sun, 22 Sep 2019 at 1:17 PM, Ernie Chang <[hidden email]> wrote:
> Thank you Frank
> I didn't think I had any Class covariates, but I will check out your advice.
> Ernie
>