Attribute selection

valerio jus
Hi everyone, 

Are there any possible advantages of information gain over symmetric uncertainty?

Which one of them is more recommended in practice?

Thanks in advance.
Valerio

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Attribute selection

Eibe Frank-2
Administrator
Information gain is quite strongly biased towards choosing attributes with many values (i.e., it can overfit more easily). Symmetrical uncertainty is a “normalised” version of information gain. It *may* give you better results if you are comparing attributes with different numbers of values. However, it’s also just a heuristic. As (almost) always, experimentation is required.

If all your attributes have the same number of values, I’d use information gain.
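To make the comparison above concrete, here is a minimal sketch that computes both measures from their usual entropy-based definitions. It is not Weka's implementation; the function names and the toy data (a slightly noisy two-valued attribute and an ID-like attribute with one distinct value per instance) are assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def info_gain(attr, cls):
    """IG = H(class) - H(class | attr) = H(class) + H(attr) - H(attr, class)."""
    return entropy(cls) + entropy(attr) - entropy(list(zip(attr, cls)))


def symmetrical_uncertainty(attr, cls):
    """SU = 2 * IG / (H(attr) + H(class)); taken as 0 when both entropies are 0."""
    denom = entropy(attr) + entropy(cls)
    return 2.0 * info_gain(attr, cls) / denom if denom > 0 else 0.0


# Hypothetical toy data: 8 instances with a binary class.
cls = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
# Two-valued attribute that disagrees with the class on one instance.
noisy_attr = ["a", "a", "a", "b", "b", "b", "b", "b"]
# ID-like attribute: a different value for every instance.
id_attr = list(range(8))

for name, attr in [("noisy binary", noisy_attr), ("id-like", id_attr)]:
    ig = info_gain(attr, cls)
    su = symmetrical_uncertainty(attr, cls)
    print(f"{name}: IG={ig:.3f} SU={su:.3f}")
```

On this toy data, information gain ranks the ID-like attribute first (1.0 versus roughly 0.55), even though an ID cannot generalise to new instances. Symmetrical uncertainty divides by the attribute's own entropy as well, so it ranks the two-valued attribute first (roughly 0.56 versus 0.5).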

Cheers,
Eibe

> On 15/06/2017, at 8:54 AM, valerio jus <[hidden email]> wrote:
>
> Hi everyone,
>
> Are there any possible advantages of information gain over symmetric uncertainty?
>
> Which one of them is more recommended in practice?
>
> Thanks in advance.
> Valerio

Re: Attribute selection

valerio jus
Eibe, thanks a lot for the answer. One thing is still not quite clear to me, though: what do you mean by saying "it’s also just a heuristic"?

Valerio

On Fri, Jun 16, 2017 at 7:49 AM, Eibe Frank <[hidden email]> wrote:
> Information gain is quite strongly biased towards choosing attributes with many values (i.e., it can overfit more easily). Symmetrical uncertainty is a “normalised” version of information gain. It *may* give you better results if you are comparing attributes with different numbers of values. However, it’s also just a heuristic. As (almost) always, experimentation is required.
>
> If all your attributes have the same number of values, I’d use information gain.
>
> Cheers,
> Eibe

Re: Attribute selection

Eibe Frank-2
Administrator
There is no theory guaranteeing that it will yield results that are optimal in some sense. See also https://en.wikipedia.org/wiki/Heuristic.
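One way to see what "no guarantee" means in practice is the following sketch, which uses a hypothetical XOR-style dataset invented for illustration. Each attribute scored on its own has zero information gain (and therefore zero symmetrical uncertainty), yet the two attributes together determine the class exactly, so a ranking based on single-attribute scores would discard both.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def info_gain(attr, cls):
    """IG = H(class) + H(attr) - H(attr, class)."""
    return entropy(cls) + entropy(attr) - entropy(list(zip(attr, cls)))


# Hypothetical data: the class is x1 XOR x2.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
cls = [a ^ b for a, b in zip(x1, x2)]

ig_x1 = info_gain(x1, cls)                     # each attribute alone
ig_x2 = info_gain(x2, cls)
ig_pair = info_gain(list(zip(x1, x2)), cls)    # both attributes jointly

print(f"IG(x1)={ig_x1:.3f} IG(x2)={ig_x2:.3f} IG(x1,x2)={ig_pair:.3f}")
```

This is only one failure mode of scoring attributes individually; it is meant to illustrate why experimentation is still needed, not to suggest that either measure is unusable.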

Cheers,
Eibe

> On 16/06/2017, at 4:14 PM, valerio jus <[hidden email]> wrote:
>
> Eibe, thanks a lot for the answer. One thing is still not quite clear to me, though: what do you mean by saying "it’s also just a heuristic"?
>
> Valerio

Re: Attribute selection

valerio jus
This now makes sense. Thank you once again.

Cheers, 
Valerio

On Fri, Jun 16, 2017 at 12:18 PM, Eibe Frank <[hidden email]> wrote:
> There is no theory guaranteeing that it will yield results that are optimal in some sense. See also https://en.wikipedia.org/wiki/Heuristic.
>
> Cheers,
> Eibe
