# Question about InfoGainAttributeEval with Ranker
## Question about InfoGainAttributeEval with Ranker

Hello! I wanted to know if someone could tell me the range of the InformationGain values of an attribute. For example, if I have:

```
Ranked attributes:
 0.07231     att_1
 0.07217     att_2
 0.03963     att_3
 0.03963     att_4
```

I know that att_1 has more InformationGain than the others, but I want to know between which values it could lie. Is it high? Is it low? I don't know.

I know that InfoGain(Class, Attribute) = H(Class) - H(Class | Attribute), and there's a property of entropy: 0 <= H <= log_a(n), where (I think) a = 2 and n = number of samples. But I don't know how to use this property to calculate the range of InformationGain. If I have a classification into two groups, how could I use this property to calculate H(Class) and H(Class | Attribute)?

Could anyone help me? Thanks in advance! Bye!

-- Fernando Bugni

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
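The bound asked about here can be made concrete for the two-class case. A minimal sketch (the class and method names are illustrative, not Weka's API): H(Class) peaks at log_2(2) = 1 bit when both classes are equally likely, and since conditioning on an attribute never increases entropy, the information gain must lie in [0, H(Class)].

```java
// Illustrative sketch (not Weka API): entropy of a two-class distribution
// and the resulting bounds on information gain.
public class InfoGainBounds {

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // H(Class) in bits when P(class_1) = p and P(class_2) = 1 - p.
    static double binaryEntropy(double p) {
        if (p <= 0.0 || p >= 1.0) return 0.0;
        return -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    public static void main(String[] args) {
        // Maximal at p = 0.5: H(Class) = log2(2) = 1 bit.
        System.out.println(binaryEntropy(0.5));
        // Skewed priors give a smaller H(Class), e.g. p = 0.9 gives ~0.469 bits.
        System.out.println(binaryEntropy(0.9));
        // Since 0 <= H(Class | Attribute) <= H(Class), the information gain
        // H(Class) - H(Class | Attribute) lies in [0, H(Class)], at most 1 bit here.
    }
}
```

So for a two-group classification, every ranker score is bounded above by H(Class), which is itself at most 1 bit.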

## Re: Question about InfoGainAttributeEval with Ranker

Administrator

> On 11/01/2015, at 6:26 pm, Fernando Bugni <[hidden email]> wrote:
>
> I know that InfoGain(Class, Attribute) = H(Class) - H(Class | Attribute), and there's a property of entropy: 0 <= H <= log_a(n), where (I think) a = 2 and n = number of samples. But I don't know how to use this property to calculate the range of InformationGain. If I have a classification into two groups, how could I use this property to calculate H(Class) and H(Class | Attribute)?

The minimum information gain is zero, when H(Class) = H(Class | Attribute). The maximum is achieved when H(Class | Attribute) = 0.

Entropy is maximal when all classes are equally likely, in which case it is log_b(c), where b = 2 (if entropy is calculated in bits) and c is the *NUMBER OF CLASS VALUES*. In the two-class case, the maximum info gain is 1 bit (and occurs when both classes are equally likely a priori, before the attribute is considered). However, in most datasets, not all classes are equally likely a priori, so H(Class) will be smaller than log_b(c).

You can calculate H(Class) in WEKA, in the Classify panel, by running any classifier (e.g., ZeroR), setting "Use training set" for evaluation, and "Output entropy evaluation measures" under "More options…".
For example, running ZeroR on the iris data gives:

```
=== Summary ===

Correctly Classified Instances          50               33.3333 %
Incorrectly Classified Instances       100               66.6667 %
Kappa statistic                          0
K&B Relative Info Score                  0      %
K&B Information Score                    0      bits      0      bits/instance
Class complexity | order 0             237.7444 bits      1.585  bits/instance
Class complexity | scheme              237.7444 bits      1.585  bits/instance
Complexity improvement     (Sf)          0      bits      0      bits/instance
Mean absolute error                      0.4444
Root mean squared error                  0.4714
Relative absolute error                100      %
Root relative squared error            100      %
Coverage of cases (0.95 level)         100      %
Mean rel. region size (0.95 level)     100      %
Total Number of Instances              150
```

"Class complexity | order 0" gives you H(Class). It is 1.585 bits/instance for the iris data (rounded) because, for this data, all three classes are equally likely, so H(Class) = log_2(3).

Cheers,
Eibe
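The "Class complexity | order 0" figure above can be reproduced from the class counts alone. A minimal sketch (the helper names are illustrative, not part of Weka): for iris, with 50 instances of each of three classes, the per-instance class entropy is log_2(3) ≈ 1.585 bits, and multiplying by the 150 instances gives the ≈ 237.74 bits total in the summary.

```java
// Illustrative sketch: H(Class) in bits computed from class counts.
// For iris (50/50/50), this matches "Class complexity | order 0" above.
public class ClassEntropy {

    static double classEntropy(int[] counts) {
        double total = 0.0;
        for (int c : counts) total += c;
        double h = 0.0;
        for (int c : counts) {
            if (c == 0) continue;          // 0 * log(0) contributes nothing
            double p = c / total;
            h -= p * Math.log(p) / Math.log(2);  // convert natural log to bits
        }
        return h;
    }

    public static void main(String[] args) {
        int[] iris = {50, 50, 50};
        double perInstance = classEntropy(iris);   // log2(3), about 1.585 bits
        double total = perInstance * 150;          // about 237.74 bits
        System.out.println(perInstance + " bits/instance, " + total + " bits total");
    }
}
```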

## Re: Question about InfoGainAttributeEval with Ranker

Hi,

I was reading this old conversation because I have the same question. I read Eibe's explanation, but I don't understand why the log base is 2 instead of e. When I look in the Weka source (3.8.4), ContingencyTables line 335 (which calculates the H() measures) clearly uses base e. But log_e(3) is 1.099, not 1.585. My Ranker scores are 1.3793 and 1.2268 for the first two attributes found, so they would seem to be base 2 (since I have three class values). Is Weka using base 2 or base e?

Cheers!
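The two observations are reconciled by a change of base: an entropy computed with Math.log (base e, in nats) becomes bits after division by ln 2. Whether Weka applies that division before reporting its scores is an assumption here, but the quoted Ranker values (up to ≈ 1.585 for three class values) are consistent with base 2. A minimal sketch of the conversion:

```java
// Sketch of the nats-to-bits conversion: H_bits = H_nats / ln(2).
// Whether Weka applies this division before reporting scores is an
// assumption; the Ranker values quoted above are consistent with bits.
public class LogBase {
    public static void main(String[] args) {
        double hNats = Math.log(3.0);          // 1.0986... nats (the 1.099 above)
        double hBits = hNats / Math.log(2.0);  // 1.5849... bits = log2(3)
        System.out.println(hNats + " nats = " + hBits + " bits");
    }
}
```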