ROC from Predictive probabilities in WEKA and Statistical software

ROC from Predictive probabilities in WEKA and Statistical software

kranthi
Hi All,

For a binary classification problem, I have used WEKA to get prediction probabilities ("Output predictions" under "More options") and the ROC curve (with AUC). I performed three different analyses, using random forest, RealAdaBoost and bagging, each on a 70% (training) / 30% (test) data split.

I then tried generating an ROC curve in a statistical software package, using the class attribute and the corresponding predicted probability for each test-set instance (as output by WEKA). However, there is a significant discrepancy between the ROC curve (and AUC value) obtained from WEKA and the one obtained from the statistical package.

Could you please suggest any likely reasons for this discrepancy?
Re: ROC from Predictive probabilities in WEKA and Statistical software

Eibe Frank-2
Administrator
They should be the same (or very similar).

By default, WEKA will output the class probability corresponding to the predicted class (i.e., the largest probability amongst the different class probabilities). The predicted class can obviously be different for different instances in the list, so that column mixes probabilities of different classes and cannot be used directly as the ROC score for a single class. You need to configure WEKA to output all class probabilities for each instance. For example, when you select "PlainText" output, make sure you set "outputDistribution" to true.
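To illustrate why this matters, here is a minimal Python sketch. It is not WEKA code: the six labels and probabilities are made-up values, and `auc` is a hypothetical helper that computes the AUC as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half. It compares the AUC obtained from the probability of the positive class with the AUC obtained from the probability of the predicted class, which for a binary problem is max(p, 1 - p).

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive instance gets the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up test set: actual class (1 = positive) and P(positive) per instance.
actual = [1, 1, 1, 0, 0, 0]
p_positive = [0.95, 0.70, 0.35, 0.60, 0.20, 0.10]

# Probability of the *predicted* class: for two classes this is the larger
# of P(positive) and P(negative), so it refers to a different class
# depending on the instance.
p_predicted = [max(p, 1.0 - p) for p in p_positive]

auc_positive = auc(actual, p_positive)    # scores all refer to one class
auc_predicted = auc(actual, p_predicted)  # scores refer to mixed classes

print(f"AUC from P(positive):        {auc_positive:.3f}")   # 0.889
print(f"AUC from P(predicted class): {auc_predicted:.3f}")  # 0.556
```

On this toy data the second AUC is much lower, because confidently predicted negatives receive high scores. If the statistical package was given the probability of the predicted class, this kind of gap would be expected.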

Cheers,
Eibe

> On 14 May 2017, at 08:57, kranthi <[hidden email]> wrote:
>
>
> --
> View this message in context: http://weka.8497.n7.nabble.com/ROC-from-Predictive-probabilities-in-WEKA-and-Statistical-software-tp40604.html
> Sent from the WEKA mailing list archive at Nabble.com.
> _______________________________________________
> Wekalist mailing list
> Send posts to: [hidden email]
> List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
> List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
