> I have a dataset whose dependent variable is continuous. Is it a good idea to apply log10 to it before training? When I apply it, I get an RMSE of 0.2, while without it I get an RMSE of 16.5. That is a big difference. My question is: in which situations do we need to apply it?

You do realize that scaling the class values will result in scaling the RMSE (https://en.wikipedia.org/wiki/Root-mean-square_deviation), since it is calculated from the difference between actual and predicted class values?

For example, if I divide my numeric class by 1000, my RMSE will be 1000 times smaller. But the model is still the same, it just predicts smaller numbers...
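To make the point concrete, here is a minimal sketch in plain Python (not Weka code). The numbers and the `rmse` helper are made up purely for illustration. It shows that dividing the class by 1000 shrinks the RMSE by the same factor, that an RMSE computed on log10 values is in log units and so cannot be compared with the raw one, and that back-transforming the predictions before scoring puts both on the same scale again.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical class values and predictions, for illustration only.
actual = [120.0, 340.0, 560.0, 800.0]
predicted = [100.0, 360.0, 540.0, 830.0]

# RMSE in the original units of the class.
rmse_raw = rmse(actual, predicted)

# Same predictions, class divided by 1000: the RMSE is 1000 times smaller.
rmse_scaled = rmse([a / 1000 for a in actual], [p / 1000 for p in predicted])

# Same predictions, scored in log10 units: a much smaller number,
# but only because the unit of measurement changed.
log_predicted = [math.log10(p) for p in predicted]
rmse_log = rmse([math.log10(a) for a in actual], log_predicted)

# Back-transform the log-scale predictions before scoring to get an
# RMSE that is comparable with rmse_raw.
rmse_back = rmse(actual, [10 ** lp for lp in log_predicted])

print(rmse_raw, rmse_scaled, rmse_log, rmse_back)
```

So to judge whether the log10 transform actually helps, compare the two models on the original scale, that is, after converting the log-scale predictions back with `10 ** prediction`.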

Cheers, Peter

--

Peter Reutemann

Dept. of Computer Science

University of Waikato, NZ

+64 (7) 858-5174

http://www.cms.waikato.ac.nz/~fracpete/

http://www.data-mining.co.nz/
