numDistinctValues

JP de Vooght
Hello,

I am looking at code from HMMWeka and noticed the following.

For a nominal attribute, e.g. when attr.relation().attribute(0) returns
'{output_0,output_1}',

a call to attr.relation().numDistinctValues(0) returns 0 instead of 2.

What am I missing here?

TIA

JP

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: numDistinctValues

Eibe Frank-2
Administrator
The method numDistinctValues() returns the number of distinct values that actually occur in the dataset (excluding missing values), not the number of values in the definition of the attribute type.
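
To make the distinction concrete, here is a minimal plain-Java sketch of the two counts. It is only a model of the behaviour described above, not Weka's implementation: the class and method names are hypothetical, and `null` stands in for a missing value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model with hypothetical names; it does not use or reproduce Weka code.
class DistinctVsDeclared {

    // Counts the distinct values that actually occur in a column.
    // A null entry models a missing value and is skipped.
    static int numDistinctObserved(List<String> column) {
        Set<String> seen = new HashSet<>();
        for (String value : column) {
            if (value != null) {
                seen.add(value);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Values listed in the attribute definition, e.g. {output_0,output_1}.
        List<String> declared = Arrays.asList("output_0", "output_1");

        // A header-only dataset has no rows, so nothing has been observed.
        List<String> noRows = new ArrayList<>();

        // A dataset with rows, one of them missing.
        List<String> rows = Arrays.asList("output_0", null, "output_1", "output_1");

        System.out.println(declared.size());             // 2 (declared values)
        System.out.println(numDistinctObserved(noRows)); // 0 (no data, nothing observed)
        System.out.println(numDistinctObserved(rows));   // 2 (missing entry ignored)
    }
}
```

The declared count depends only on the attribute definition, while the observed count depends on the rows, so an empty dataset gives 0 regardless of how many values are declared.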

Cheers,
Eibe

> On 3 Jun 2017, at 22:55, JP de Vooght <[hidden email]> wrote:
>
> Hello,
>
> I am looking at code from HMMWeka and noticed the following.
>
> For a nominal attribute, e.g. when attr.relation().attribute(0) returns
> '{output_0,output_1}',
>
> a call to attr.relation().numDistinctValues(0) returns 0 instead of 2.
>
> What am I missing here?
>
> TIA
>
> JP

Re: numDistinctValues

JP de Vooght
Thank you, Frank!
The data looks like this:

seq_29,class_0,'output_0\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0'

Shouldn't it return 2 then?
I am using Weka 3.8

Is there a method to obtain this information from the definition?

Thanks!


On 04.06.2017 02:13, Eibe Frank wrote:

> The method numDistinctValues() returns the number of distinct values
> that actually occur in the dataset (excluding missing values), not the
> number of values in the definition of the attribute type.
>
> Cheers,
> Eibe

Re: numDistinctValues

Eibe Frank-2
Administrator
The relation() method of the Attribute class returns an Instances object without any instances, i.e., just the relevant header information. Use the relationalValue(int) method in the Instance class to get the data. If you run the numDistinctValues(int) method on that, you should get the desired result.

To get the number of values of a nominal attribute, you can use the numValues() method in the Attribute class.
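
The three calls mentioned above can be compared side by side in a small sketch. It assumes weka.jar (3.8) is on the classpath, and it simplifies the data to a single relation-valued attribute holding one nominal attribute; the names "bags", "sequence" and "output" and the three sample rows are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

// Illustrative sketch; dataset layout and names are assumptions, not the poster's data.
class RelationalDistinctValues {

    // Builds a dataset with one relation-valued attribute and one instance (bag).
    static Instances buildBagData() {
        // Inner header: a single nominal attribute {output_0,output_1}.
        ArrayList<Attribute> innerAtts = new ArrayList<>();
        innerAtts.add(new Attribute("output", Arrays.asList("output_0", "output_1")));
        Instances innerHeader = new Instances("sequence", innerAtts, 0);

        // Outer dataset: one relation-valued attribute using that header.
        ArrayList<Attribute> outerAtts = new ArrayList<>();
        outerAtts.add(new Attribute("sequence", innerHeader));
        Instances data = new Instances("bags", outerAtts, 0);

        // Fill a bag with three rows, both nominal values occurring.
        Instances bag = new Instances(data.attribute(0).relation(), 0);
        for (String value : new String[] {"output_0", "output_1", "output_1"}) {
            Instance row = new DenseInstance(1);
            row.setDataset(bag);
            row.setValue(0, value);
            bag.add(row);
        }

        // Store the bag as the relational value of the outer instance.
        Instance outer = new DenseInstance(1);
        outer.setDataset(data);
        outer.setValue(0, data.attribute(0).addRelation(bag));
        data.add(outer);
        return data;
    }

    public static void main(String[] args) {
        Instances data = buildBagData();
        Attribute attr = data.attribute(0);

        // Header only, no instances: prints 0.
        System.out.println(attr.relation().numDistinctValues(0));

        // Actual bag stored in the first instance: prints 2.
        System.out.println(data.instance(0).relationalValue(0).numDistinctValues(0));

        // Number of values in the nominal attribute's definition: prints 2.
        System.out.println(attr.relation().attribute(0).numValues());
    }
}
```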

Cheers,
Eibe


> On 4 Jun 2017, at 18:33, [hidden email] wrote:
>
> Thank you Frank!
> The data looks like this
>
> seq_29,class_0,'output_0\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_1\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0\noutput_0'
>
> Shouldn't it return 2 then?
> I am using Weka 3.8
>
> Is there a method to obtain this information from the definition?
>
> Thanks!