Implementing my clusterer "unclustered Instances" issue in GUI


ivan
Hello,

   I'm new here, and I'm trying to implement a BDSCAN algorithm for
clustering string and date type attributes.

   I overrode buildClusterer(), numberOfClusters(), clusterInstance() and
getCapabilities().
   I store the assignments in a HashMap<Instance, Integer> so that I know
which cluster every Instance belongs to and can look it up in
clusterInstance().
   I still don't understand what the prior probabilities returned by
clusterPriors() correspond to, or anything about
logDensityPerClusterForInstance(), so I left both as "return null". I also
tried hardcoding values, but that doesn't change my issue.

   When I run my code, I get this:
=== Clustering stats for training data ===
Clustered Instances
Unclustered Instances : 1799
Log likelihood: NaN

   It doesn't recognize the clusters I computed.

Any help in this regard would be greatly appreciated.

Regards,

--
ivan



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Implementing my clusterer "unclustered Instances" issue in GUI

Michael Hall


> On Nov 15, 2019, at 10:37 AM, ivan <[hidden email]> wrote:
>
> Hello,
>
>    I'm new here, and I'm trying to implement a BDSCAN algo for
> clustering
> string and date type attributes.
>
>  

Did you mean DBSCAN? In searching I’m not seeing too much for BDSCAN.
Re: Implementing my clusterer "unclustered Instances" issue in GUI

Eibe Frank-2
Administrator
In reply to this post by ivan
Assuming you are talking about density-based clustering with DBSCAN, you should not implement the DensityBasedClusterer interface. In other words, don’t implement logDensityPerClusterForInstance(), etc., and just implement the plain Clusterer interface.

The terminology is confusing here. In WEKA, DensityBasedClusterer refers to clustering algorithms that are based on estimating a probability density function for each cluster. This is quite different from the “density-based” clustering approach that DBSCAN employs, which tries to find “densely” populated areas in the instance space.
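
To illustrate the distinction, here is a minimal sketch in plain Java with no WEKA dependency. The class name NearestCorePointAssigner, its fields, and the methods addCorePoint() and clusterOf() are hypothetical stand-ins, not part of WEKA. The sketch assumes numeric vectors and Euclidean distance, and it shows only what a plain clusterer's clusterInstance() could do after DBSCAN-style training: return a cluster index, with no priors or densities, and throw an exception for noise points.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a plain clusterer that only returns cluster indices. */
public class NearestCorePointAssigner {
    private final List<double[]> corePoints = new ArrayList<>();
    private final List<Integer> coreLabels = new ArrayList<>();
    private final double eps;

    public NearestCorePointAssigner(double eps) {
        this.eps = eps;
    }

    /** Registers a core point found during training together with its cluster index. */
    public void addCorePoint(double[] point, int cluster) {
        corePoints.add(point.clone());
        coreLabels.add(cluster);
    }

    /** Returns the cluster of the nearest core point within eps; throws for noise. */
    public int clusterOf(double[] point) throws Exception {
        int bestIndex = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < corePoints.size(); i++) {
            double d = distance(corePoints.get(i), point);
            if (d <= eps && d < bestDist) {
                bestDist = d;
                bestIndex = i;
            }
        }
        if (bestIndex < 0) {
            throw new Exception("noise: no core point within eps");
        }
        return coreLabels.get(bestIndex);
    }

    /** Euclidean distance between two vectors of equal length. */
    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) throws Exception {
        NearestCorePointAssigner assigner = new NearestCorePointAssigner(1.0);
        assigner.addCorePoint(new double[] {0.0, 0.0}, 0);
        assigner.addCorePoint(new double[] {10.0, 10.0}, 1);
        System.out.println(assigner.clusterOf(new double[] {0.5, 0.0}));
    }
}
```

Because the assignment is computed from the attribute values, it also works for instances that were not seen during training. For string or date attributes, the distance function would have to be replaced by something suitable for those types.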

Cheers,
Eibe

Re: Implementing my clusterer "unclustered Instances" issue in GUI

ivan
Thanks for the answer.

I think I know where the problem comes from: the String attribute. It seems
that when the Instances are loaded from my ARFF file, the String attribute
value of every instance is equal to 0.0.

Do I need to do some kind of preprocessing?

Regards,

--
Ivan



Re: Implementing my clusterer "unclustered Instances" issue in GUI

ivan
In reply to this post by Eibe Frank-2
Sorry, there isn't a problem with the String attribute.
The problem occurs when ClusterEvaluation calls printClusterStats(): the
attribute values of the Instance "inst" below are strange; they are always
the same and never change. Is the DataSource class not working correctly?

private static String printClusterStats(Clusterer clusterer, String fileName)
{

...

if (fileName.length() != 0) {
      DataSource source = new DataSource(fileName);
      Instances structure = source.getStructure();
      Instances forBatchPredictors =
        (clusterer instanceof BatchPredictor && ((BatchPredictor) clusterer)
          .implementsMoreEfficientBatchPrediction()) ? new Instances(
          source.getStructure(), 0) : null;

      Instance inst;
      while (source.hasMoreElements(structure)) {
        // <-- this always generates a very strange Instance with the same
        //     wrong values
        inst = source.nextElement(structure);
        if (forBatchPredictors != null) {
          forBatchPredictors.add(inst);
        } else {
          try {
            // <-- so here clusterInstance() can't recognise the Instance
            //     and return its cluster
            cnum = clusterer.clusterInstance(inst);

            if (clusterer instanceof DensityBasedClusterer) {
              loglk +=
                ((DensityBasedClusterer) clusterer).logDensityForInstance(inst);
              // temp = Utils.sum(dist);
            }
            instanceStats[cnum]++;
          } catch (Exception e) {
            unclusteredInstances++;
          }
          i++;
        }
      }
...
}
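
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the loop above reads every instance again from the file, so the objects passed to clusterInstance() are not the objects that were stored as map keys in buildClusterer(). If Instance does not override equals() and hashCode() (not verified here), a HashMap<Instance, Integer> lookup returns null for them, unboxing that null to an int throws, and the catch block counts the instance as unclustered. The sketch below reproduces the effect in plain Java; the class Row and the helper methods are hypothetical stand-ins, not WEKA classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InstanceKeyDemo {
    /** Hypothetical stand-in for an instance class without equals()/hashCode(). */
    static final class Row {
        final double[] values;

        Row(double[] values) {
            this.values = values.clone();
        }
    }

    /** Builds a key that compares by attribute values, not by object identity. */
    static List<Double> valueKey(Row row) {
        List<Double> key = new ArrayList<>();
        for (double v : row.values) {
            key.add(v);
        }
        return key;
    }

    /** Looks up by object identity; returns -1 if this exact object was never stored. */
    static int lookupByIdentity(Map<Row, Integer> map, Row row) {
        Integer cluster = map.get(row);
        return cluster == null ? -1 : cluster;
    }

    /** Looks up by attribute values; returns -1 if no row with these values was stored. */
    static int lookupByValue(Map<List<Double>, Integer> map, Row row) {
        Integer cluster = map.get(valueKey(row));
        return cluster == null ? -1 : cluster;
    }

    public static void main(String[] args) {
        Row trained = new Row(new double[] {1.0, 2.0});
        Row reread = new Row(new double[] {1.0, 2.0}); // same values, new object

        Map<Row, Integer> byIdentity = new HashMap<>();
        byIdentity.put(trained, 3);
        Map<List<Double>, Integer> byValue = new HashMap<>();
        byValue.put(valueKey(trained), 3);

        System.out.println(lookupByIdentity(byIdentity, reread)); // prints -1
        System.out.println(lookupByValue(byValue, reread));       // prints 3
    }
}
```

Keying by values only helps for instances that were seen during training. Any other instance still has no entry, so computing the assignment from the attribute values inside clusterInstance() is the more general approach.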


