

Dear members of this mailing list,
At the moment I am testing whether the results from the SimpleKMeans clustering algorithm can be reproduced in Python.
In Weka I have used all the default parameters except for the number of clusters. Now I am trying to fill in the parameters for the KMeans algorithm in Python, and one of the parameters is 'algorithm'. So my question is whether SimpleKMeans implements Lloyd's algorithm, HartiganWong's algorithm, or maybe even some other algorithm. I was not able to find a definitive answer on the internet, so I hope someone here knows. I'm doing this little test as part of my master's thesis, so if someone could enlighten me I'd be very grateful.
Kind regards,
Chivany van der Werff
Leiden Institute of Computer Science
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html


SimpleKMeans implements Lloyd’s algorithm.
You will probably need to look at EuclideanDistance as well to make sure that you are configuring the implementations as similarly as possible. For example, EuclideanDistance normalises all numeric attributes to the range [0,1] by default.
Anyway, given that kmeans converges to a local optimum, establishing equivalence of two implementations is tricky.
Cheers,
Eibe
From: [hidden email]
Sent: Monday, 15 July 2019 1:54 PM
To: [hidden email]
Subject: [Wekalist] SimpleKMeans: Lloyd, HartiganWong or something else?
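The two points in the reply above (Lloyd's algorithm, and numeric attributes rescaled to [0,1] before Euclidean distances are taken) can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not Weka's implementation: the function names, the toy data, the seed and the random-sample initialisation are all hypothetical, and nothing here tries to reproduce Weka's random number generator or its handling of nominal or missing values.

```python
import random


def min_max_normalise(rows):
    """Rescale every column to the range [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_kmeans(rows, k, seed=0, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps.

    Initial centroids are k distinct rows drawn at random (an assumption
    made for this sketch, not a claim about any particular library).
    """
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(rows, k)]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(row, centroids[j]))
            for row in rows
        ]
        if new_assignment == assignment:
            break  # no row changed cluster, so a local optimum is reached
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [row for row, a in zip(rows, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical toy data: two attributes on very different scales.
data = [[1.0, 100.0], [1.2, 110.0], [0.8, 90.0],
        [5.0, 500.0], [5.2, 520.0], [4.8, 480.0]]
normalised = min_max_normalise(data)
labels, centres = lloyd_kmeans(normalised, k=2, seed=0)
print(labels)
print(centres)
```

Because the result depends on the initial centroids, two implementations can only be expected to agree if they are given the same normalised data and the same starting centroids, and even then cluster labels may be permuted. Comparing the partitions rather than the raw label values is therefore the safer check.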


Dear Eibe,
Thank you so much for your quick response!
Your answer and the additional information are saving me a lot of experimenting.
Is there anywhere online that states that SimpleKMeans implements Lloyd's algorithm, so that I can cite it? Or is there another way to cite the information you have given me?
I'll do my best to find a similar implementation in Python, which is the experiment I am about to perform. If I more or less succeed, I'll share the results if anyone is interested. I also plan to try it in R, but I'm starting with Python.
Anyway, thanks again! I immensely appreciate it!
Kind regards,
Chivany van der Werff
Leiden Institute of Computer Science
From: [hidden email] <[hidden email]> on behalf of Eibe Frank <[hidden email]>
Sent: Monday, 15 July 2019 11:54
To: Weka machine learning workbench list.
Subject: Re: [Wekalist] SimpleKMeans: Lloyd, HartiganWong or something else?


You could refer to the source code, e.g., the main trunk version:
Cheers, Eibe
On Tue, Jul 16, 2019 at 9:15 AM Chivany van der Werff <[hidden email]> wrote:

