Ridge on logistic regression

Barrendeitor

Hi,

I'm using logistic regression (weka.classifiers.functions.Logistic).

As I read here: https://list.waikato.ac.nz/pipermail/wekalist/2012-October/056279.html

The ridge can help stabilize degenerate cases and can help reduce overfitting by penalizing large coefficients.

I would like to know: what could be considered a "large coefficient"?

As I understand it, odds ratios are the exponentials of the coefficients, and they indicate the influence of each variable. I have some odds ratios of 0 and others that are very high, such as 9.7909494883895206E17 (coefficient 41.4254).
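
To make the relationship concrete, here is a minimal plain-Java sketch (it does not use the WEKA API; the class and method names are made up for illustration) that converts a coefficient into its odds ratio. The coefficient 41.4254 is the one quoted above; the negated value is only an assumed example of how an odds ratio could end up being displayed as 0.

```java
class OddsRatioDemo {
    // Odds ratio implied by a logistic regression coefficient: exp(coefficient).
    static double oddsRatio(double coefficient) {
        return Math.exp(coefficient);
    }

    public static void main(String[] args) {
        // 41.4254 comes from the post; 0.0 and -41.4254 are illustrative values.
        double[] coefficients = {41.4254, 0.0, -41.4254};
        for (double b : coefficients) {
            System.out.printf("coefficient=%9.4f  odds ratio=%.4e%n", b, oddsRatio(b));
        }
    }
}
```

A coefficient of 0 gives an odds ratio of exactly 1 (no effect), while a large negative coefficient gives an odds ratio so close to 0 that it prints as 0 at limited precision.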

Could this be considered a degenerate case or a large coefficient? Or does it depend on the dataset?
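
One common kind of degenerate case is perfect separation, where some variable splits the classes exactly; whether that applies to the data in this post is not known. The sketch below is a toy illustration in plain Java, not WEKA's implementation: the four data points, the step size, and the objective (summed negative log-likelihood plus ridge times the squared coefficient, with no intercept) are all assumptions chosen for brevity. Without a ridge the coefficient keeps growing the longer the optimizer runs; with a ridge it settles at a finite value.

```java
class SeparationSketch {
    // Toy data (assumed): x < 0 is always class 0, x > 0 is always class 1.
    static final double[] X = {-2.0, -1.0, 1.0, 2.0};
    static final int[] Y = {0, 0, 1, 1};

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Gradient descent on: sum of negative log-likelihood + ridge * b^2.
    static double fit(double ridge, int iterations) {
        double b = 0.0;
        for (int it = 0; it < iterations; it++) {
            double grad = 2.0 * ridge * b;               // gradient of the penalty
            for (int i = 0; i < X.length; i++) {
                grad += (sigmoid(b * X[i]) - Y[i]) * X[i]; // gradient of the loss
            }
            b -= 0.1 * grad;                             // fixed step size (assumed)
        }
        return b;
    }

    public static void main(String[] args) {
        for (int iterations : new int[] {2000, 20000}) {
            System.out.printf("iterations=%5d  ridge=0: b=%.4f  ridge=1: b=%.4f%n",
                    iterations, fit(0.0, iterations), fit(1.0, iterations));
        }
    }
}
```

Seen this way, a coefficient is "large" when its size reflects how long the optimizer ran rather than anything the data can pin down, which is why the answer tends to depend on the dataset.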

Thanks and regards.


_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Ridge on logistic regression

Eibe Frank-2
Administrator
The standard process in machine learning is to optimise a regularisation parameter such as this one using a hold-out set or k-fold cross-validation.

In WEKA, you can use CVParameterSelection or MultiSearch to optimise the ridge parameter in Logistic based on internal k-fold cross-validation.
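
As a rough illustration of the idea (not of the WEKA API), the following self-contained Java sketch picks a ridge value by k-fold cross-validation on synthetic data. Everything in it is an assumption made for the example: the data generator, the candidate grid, the gradient-descent fitter, the unpenalized intercept, and the use of held-out log-loss as the selection criterion. In WEKA itself, CVParameterSelection or MultiSearch would do this search around Logistic.

```java
import java.util.Random;

class RidgeCvSketch {
    // Candidate ridge values (assumed grid, for illustration only).
    static final double[] GRID = {1e-4, 1e-3, 1e-2, 1e-1, 1.0};

    // Predicted probability of class 1; the last weight is the intercept.
    static double predict(double[] w, double[] xi) {
        double z = w[xi.length];
        for (int j = 0; j < xi.length; j++) z += w[j] * xi[j];
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Gradient descent on mean negative log-likelihood + ridge * sum of squared
    // non-intercept weights, skipping rows whose index i satisfies i % k == heldOut.
    static double[] fit(double[][] x, int[] y, int k, int heldOut, double ridge) {
        int d = x[0].length;
        double[] w = new double[d + 1];
        for (int iter = 0; iter < 500; iter++) {
            double[] grad = new double[d + 1];
            int n = 0;
            for (int i = 0; i < x.length; i++) {
                if (i % k == heldOut) continue;
                double err = predict(w, x[i]) - y[i];
                for (int j = 0; j < d; j++) grad[j] += err * x[i][j];
                grad[d] += err;
                n++;
            }
            for (int j = 0; j < d; j++) w[j] -= 0.1 * (grad[j] / n + 2.0 * ridge * w[j]);
            w[d] -= 0.1 * grad[d] / n;
        }
        return w;
    }

    // Mean log-loss over all rows, each scored by a model that did not see it.
    static double cvLogLoss(double[][] x, int[] y, int k, double ridge) {
        double total = 0.0;
        for (int fold = 0; fold < k; fold++) {
            double[] w = fit(x, y, k, fold, ridge);
            for (int i = fold; i < x.length; i += k) {
                double p = Math.min(Math.max(predict(w, x[i]), 1e-12), 1.0 - 1e-12);
                total -= (y[i] == 1) ? Math.log(p) : Math.log(1.0 - p);
            }
        }
        return total / x.length;
    }

    // Return the grid value with the lowest cross-validated log-loss.
    static double selectRidge(double[][] x, int[] y, int k, double[] grid) {
        double best = grid[0];
        double bestLoss = Double.POSITIVE_INFINITY;
        for (double ridge : grid) {
            double loss = cvLogLoss(x, y, k, ridge);
            if (loss < bestLoss) { bestLoss = loss; best = ridge; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Synthetic data (assumed): 60 rows, 5 features, only the first is informative.
        Random rnd = new Random(42);
        double[][] x = new double[60][5];
        int[] y = new int[60];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < 5; j++) x[i][j] = rnd.nextGaussian();
            y[i] = (x[i][0] + 0.5 * rnd.nextGaussian() > 0) ? 1 : 0;
        }
        for (double ridge : GRID) {
            System.out.printf("ridge=%.4f  cv log-loss=%.4f%n", ridge, cvLogLoss(x, y, 5, ridge));
        }
        System.out.println("selected ridge: " + selectRidge(x, y, 5, GRID));
    }
}
```

The folds here are assigned by row index modulo k, which only makes sense if the rows are not sorted by class; real data would normally be shuffled or stratified first.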

Cheers,
Eibe

> On 4/05/2017, at 10:52 PM, Barrendeitor <[hidden email]> wrote:
>
> Hi,
>
> I'm using logistic regression (weka.classifiers.functions.Logistic).
>
> As I read here: https://list.waikato.ac.nz/pipermail/wekalist/2012-October/056279.html
>
> The ridge can help stabilize degenerate cases and can help reduce overfitting by penalizing large coefficients.
>
> I would like to know: what could be considered a "large coefficient"?
>
> As I understand it, odds ratios are the exponentials of the coefficients, and they indicate the influence of each variable. I have some odds ratios of 0 and others that are very high, such as 9.7909494883895206E17 (coefficient 41.4254).
>
> Could this be considered a degenerate case or a large coefficient? Or does it depend on the dataset?
>
> Thanks and regards.