Attribute selection


Attribute selection

valerio jus
Hi everyone, 

Are there any possible advantages of information gain over symmetric uncertainty?

Which one of them is more recommended in practice?

Thanks in advance.
Valerio

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Attribute selection

Eibe Frank-2
Administrator
Information gain is quite strongly biased towards choosing attributes with many values (i.e., it can overfit more easily). Symmetrical uncertainty is a “normalised” version of information gain. It *may* give you better results if you are comparing attributes with different numbers of values. However, it’s also just a heuristic. As (almost) always, experimentation is required.

If all your attributes have the same number of values, I’d use information gain.

Cheers,
Eibe
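
To make the normalisation concrete, here is a minimal Python sketch (not Weka's code). It assumes the usual definitions IG = H(class) - H(class | attribute) and SU = 2 * IG / (H(class) + H(attribute)). The function names and the toy data (`row_id`, `signal`) are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(attr, cls):
    """IG = H(class) - H(class | attribute)."""
    n = len(cls)
    groups = {}
    for a, c in zip(attr, cls):
        groups.setdefault(a, []).append(c)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(cls) - conditional


def symmetrical_uncertainty(attr, cls):
    """SU = 2 * IG / (H(class) + H(attribute)); defined as 0 if both entropies are 0."""
    denom = entropy(cls) + entropy(attr)
    return 2.0 * information_gain(attr, cls) / denom if denom > 0 else 0.0


# Hypothetical toy data: 8 instances with a balanced binary class.
cls = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
row_id = list(range(8))  # unique value per instance, no predictive use
signal = ["a", "a", "a", "b", "a", "b", "b", "b"]  # two values, mildly informative

for name, attr in [("row_id", row_id), ("signal", signal)]:
    print(name,
          round(information_gain(attr, cls), 3),
          round(symmetrical_uncertainty(attr, cls), 3))
```

On this toy data the many-valued `row_id` attribute gets the maximum information gain of 1.0 bit, while `signal` gets about 0.189. Symmetrical uncertainty halves the score of `row_id` to 0.5 and leaves `signal` at about 0.189, so the gap narrows but `row_id` still ranks first. That is consistent with the point that the normalisation may help but is not a guarantee.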

> On 15/06/2017, at 8:54 AM, valerio jus <[hidden email]> wrote:
>
> Hi everyone,
>
> Are there any possible advantages of information gain over symmetric uncertainty?
>
> Which one of them is more recommended in practice?
>
> Thanks in advance.
> Valerio

Re: Attribute selection

valerio jus
Eibe, thanks a lot for the answer. One thing is still not quite clear to me, though: what do you mean by "it’s also just a heuristic"?

Valerio

On Fri, Jun 16, 2017 at 7:49 AM, Eibe Frank <[hidden email]> wrote:
> Information gain is quite strongly biased towards choosing attributes with many values (i.e., it can overfit more easily). Symmetrical uncertainty is a “normalised” version of information gain. It *may* give you better results if you are comparing attributes with different numbers of values. However, it’s also just a heuristic. As (almost) always, experimentation is required.
>
> If all your attributes have the same number of values, I’d use information gain.
>
> Cheers,
> Eibe


Re: Attribute selection

Eibe Frank-2
Administrator
There is no theory guaranteeing that it will yield results that are optimal in some sense. See also https://en.wikipedia.org/wiki/Heuristic.

Cheers,
Eibe
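
A small example of why single-attribute scores carry no optimality guarantee, sketched in Python under the standard entropy-based definition of information gain: when the class is the XOR of two binary attributes, each attribute alone scores zero even though the pair determines the class completely. The helper names and the data are illustrative assumptions, not taken from Weka.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(attr, cls):
    """IG = H(class) - H(class | attribute)."""
    n = len(cls)
    groups = {}
    for a, c in zip(attr, cls):
        groups.setdefault(a, []).append(c)
    return entropy(cls) - sum(len(g) / n * entropy(g) for g in groups.values())


# All four combinations of two binary attributes; the class is their XOR.
rows = list(product([0, 1], repeat=2))
a = [x for x, _ in rows]
b = [y for _, y in rows]
cls = [x ^ y for x, y in rows]

print("a alone:   ", information_gain(a, cls))
print("b alone:   ", information_gain(b, cls))
print("a, b joint:", information_gain(rows, cls))  # each (a, b) pair as one value
```

Symmetrical uncertainty has information gain as its numerator, so it is zero for `a` and `b` here as well. A ranking based on either measure would discard both attributes, which is one reason experimentation is still needed.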

> On 16/06/2017, at 4:14 PM, valerio jus <[hidden email]> wrote:
>
> Eibe, thanks a lot for the answer. One thing is still not quite clear to me, though: what do you mean by "it’s also just a heuristic"?
>
> Valerio

Re: Attribute selection

valerio jus
This now makes sense. Thank you once again.

Cheers, 
Valerio

On Fri, Jun 16, 2017 at 12:18 PM, Eibe Frank <[hidden email]> wrote:
> There is no theory guaranteeing that it will yield results that are optimal in some sense. See also https://en.wikipedia.org/wiki/Heuristic.
>
> Cheers,
> Eibe
