Only one ranked attribute, but selected two? InfoGain Ranker

Tom
Hi list,

I've run an InfoGain evaluation on my dataset, using a Ranker with a threshold of 0.1.

My output via the GUI says:

Search Method:
    Attribute ranking.
    Threshold for discarding attributes:   0.1   

Attribute Evaluator (supervised, Class (nominal): 23 class):
    Information Gain Ranking Filter

Ranked attributes:
 0.141    2 nr_visits

Selected attributes: 2 : 1

In my Java implementation, I do the same thing:

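import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;

// Rank attributes and discard those scoring below the threshold
Ranker ranker = new Ranker();
ranker.setGenerateRanking(true);
ranker.setThreshold(0.1);

AttributeSelection attsel = new AttributeSelection();
InfoGainAttributeEval eval = new InfoGainAttributeEval();

attsel.setEvaluator(eval);
attsel.setSearch(ranker);

// instances is my weka.core.Instances dataset
attsel.SelectAttributes(instances);

// Indices of the selected attributes, and the {index, score} ranking
int[] ranked_attr = attsel.selectedAttributes();
double[][] rawscores = attsel.rankedAttributes();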

Where I get similar output:

  • my ranked_attr is [1, 21] (1 is the nr_visits feature, 21 is some other attribute)
  • my rawscores array does NOT contain any entry for 21. It has an entry for 1, followed by another feature whose score is below my threshold.

What gives? Are there one or two selected features? Is this a bug in Weka 3.8.4?

Cheers!


_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Only one ranked attribute, but selected two? InfoGain Ranker

Eibe Frank
AFAIK, the set of indices returned by selectedAttributes() includes the index of the class attribute. I assume that attribute 22 in your data is the class attribute. There is no score for the class attribute because it is the attribute that we are trying to predict.

Cheers,
Eibe
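
If only the predictor attributes are wanted, the class index can be dropped from the returned array. The following is a minimal sketch, not part of the Weka API: the class name SelectedAttributes and the helper withoutClass are hypothetical, and the sample values [1, 21] and class index 21 are the ones reported in this thread. In real code the inputs would come from attsel.selectedAttributes() and instances.classIndex().

```java
import java.util.Arrays;

class SelectedAttributes {

    /**
     * Hypothetical helper: returns a copy of the selected indices with the
     * class index removed. All indices are zero-based, as in the Java API.
     */
    static int[] withoutClass(int[] selected, int classIndex) {
        return Arrays.stream(selected)
                     .filter(i -> i != classIndex)
                     .toArray();
    }

    public static void main(String[] args) {
        // Values taken from the thread; with Weka on the classpath these
        // would be attsel.selectedAttributes() and instances.classIndex().
        int[] selected = {1, 21};
        int classIndex = 21;

        System.out.println(Arrays.toString(withoutClass(selected, classIndex))); // prints [1]
    }
}
```

With the class index removed, the result matches the single ranked attribute that the GUI reports.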

On Sat, Feb 8, 2020 at 12:35 AM Tom <[hidden email]> wrote:
[...]


Re: Only one ranked attribute, but selected two? InfoGain Ranker

Tom
You are correct, the other number is indeed the class attribute index! I didn't notice immediately because the API is zero-based while the GUI is one-based. Debugging my code showed that it equals instances.classIndex().

It's a tad confusing that the class index is returned, but I guess it's the way it is.

Thanks for the enlightenment!
  Tom
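
To compare API output with the GUI, the zero-based indices can be shifted by one. This is only an illustrative sketch: the class name IndexBase and the helper toOneBased are hypothetical, and the input [1, 21] is the array reported earlier in this thread.

```java
import java.util.Arrays;

class IndexBase {

    /**
     * Hypothetical helper: converts zero-based indices (Java API) into the
     * one-based attribute numbers shown in the GUI output.
     */
    static int[] toOneBased(int[] zeroBased) {
        return Arrays.stream(zeroBased)
                     .map(i -> i + 1)
                     .toArray();
    }

    public static void main(String[] args) {
        // selectedAttributes() returned [1, 21] in the thread
        int[] fromApi = {1, 21};

        System.out.println(Arrays.toString(toOneBased(fromApi))); // prints [2, 22]
    }
}
```

Here 2 is nr_visits as listed under "Ranked attributes" in the GUI, and 22 is the attribute Eibe assumed to be the class.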

On Sat, Feb 8, 2020 at 5:45 AM Eibe Frank <[hidden email]> wrote:
[...]