cost sensitive matrix and IBK

Ewy Mathe
Hello all,
I have one-dimensional data to classify and am using the k-nearest
neighbor classifier (IBk) to classify my instances. Because the data is
very unbalanced (one class may contain 20% of the data while the other
has the remainder), I am using a cost-sensitive matrix on top of that.
However, it is very difficult for me to obtain "incremental" TP and TN
values to draw an ROC curve, because the TP rate changes in steps of
0.4 or more. Is it inappropriate to use a cost-sensitive matrix with kNN?
Thanks for any feedback,
Ewy

_______________________________________________
Wekalist mailing list
[hidden email]
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist

Re: cost sensitive matrix and IBK

Eibe Frank
The problem with drawing an ROC curve with (default) IBk is that it
uses a single nearest neighbour, so the probability estimates you get
are very coarse (mostly 0 or 1) and you won't get many points for
your curve. You could try using more neighbours and distance weighting.
(A totally different option is to use locally weighted learning in
conjunction with naive Bayes.)
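
To make the point about coarse probability estimates concrete, here is a
minimal sketch in plain Python. It is not Weka's IBk code: the function
knn_positive_probability, the toy data and the inverse-distance weighting
scheme are all assumptions chosen for illustration. It compares how many
distinct probability values come out of 1 nearest neighbour versus 5
distance-weighted neighbours on one-dimensional data.

```python
def knn_positive_probability(train, query, k=1, distance_weighted=False):
    """Estimate P(class = 1) for `query` from its k nearest 1-D training points.

    `train` is a list of (value, label) pairs with labels 0 or 1.
    Hypothetical helper for illustration only, not part of any library.
    """
    # Sort training points by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    if distance_weighted:
        # Inverse-distance weighting; the small constant avoids division by zero.
        weights = [1.0 / (abs(value - query) + 1e-9) for value, _ in neighbours]
    else:
        weights = [1.0] * len(neighbours)
    # Weighted share of the neighbours that belong to the positive class.
    positive = sum(w for w, (_, label) in zip(weights, neighbours) if label == 1)
    return positive / sum(weights)


# Made-up toy data: (value, label), with 20% positives.
train = [(0.1, 0), (0.4, 0), (0.9, 0), (1.3, 0), (1.8, 1),
         (2.2, 0), (2.7, 0), (3.1, 1), (3.6, 0), (4.0, 0)]
queries = [0.5, 1.5, 2.0, 2.5, 3.0, 3.5]

p_1nn = [knn_positive_probability(train, q, k=1) for q in queries]
p_5nn = [knn_positive_probability(train, q, k=5, distance_weighted=True)
         for q in queries]

print(sorted(set(p_1nn)))  # with k=1 only 0.0 and 1.0 can occur
print(len(set(p_5nn)))     # number of distinct estimates with k=5, weighted
```

Every distinct probability value is a potential threshold, so the more
distinct values the classifier produces, the more points the ROC curve has.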

Using the CostSensitiveClassifier in conjunction with IBk is pointless
if you are drawing an ROC curve. The structure of the classifier
doesn't change when the instance weights are changed, so it is
sufficient to vary the threshold on the class probabilities to
get an ROC curve (which is what the Explorer does).
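
As a rough illustration of the thresholding idea, the sketch below (plain
Python, not the Explorer's actual code) sweeps a threshold over the
estimated probabilities and records one (FPR, TPR) point per distinct
value. The names roc_points and cost_threshold and the example scores are
hypothetical. The second function assumes a two-class cost matrix with
zero cost for correct predictions; under that assumption, minimising
expected cost amounts to picking a single threshold, that is, a single
point on the same curve.

```python
def roc_points(probabilities, labels):
    """Return (FPR, TPR) pairs, one per distinct threshold on P(class = 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(probabilities), reverse=True):
        predicted = [p >= threshold for p in probabilities]
        tp = sum(1 for pred, y in zip(predicted, labels) if pred and y == 1)
        fp = sum(1 for pred, y in zip(predicted, labels) if pred and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def cost_threshold(cost_fp, cost_fn):
    """Threshold on P(class = 1) that minimises expected cost.

    Assumes correct predictions cost nothing. Predicting positive costs
    (1 - p) * cost_fp in expectation, predicting negative costs p * cost_fn,
    so positive is the cheaper choice once p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


# Made-up probability estimates and true labels for eight test instances.
labels = [1, 0, 1, 0, 0, 0, 1, 0]
coarse = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # 1-NN style estimates
smooth = [0.9, 0.2, 0.7, 0.6, 0.1, 0.3, 0.4, 0.05]  # smoother estimates

print(len(roc_points(coarse, labels)))  # few points: the curve is very coarse
print(len(roc_points(smooth, labels)))  # one point per distinct score, plus the origin
print(cost_threshold(cost_fp=1.0, cost_fn=4.0))
```

Changing the costs in this sketch only moves the operating point along the
curve; it does not add new points to it.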

Cheers,
Eibe

On May 13, 2005, at 1:38 AM, Ewy Mathe wrote:

> Hello all,
> I have one-dimensional data to classify and am using the k-nearest
> neighbor classifier (IBk) to classify my instances. Because the data is
> very unbalanced (one class may contain 20% of the data while the other
> has the remainder), I am using a cost-sensitive matrix on top of that.
> However, it is very difficult for me to obtain "incremental" TP and TN
> values to draw an ROC curve, because the TP rate changes in steps of
> 0.4 or more. Is it inappropriate to use a cost-sensitive matrix with kNN?
> Thanks for any feedback,
> Ewy