Pre-Processing the data and FilteredClassifier


Pre-Processing the data and FilteredClassifier

Abdrahman0x
Hi,

I know that, to avoid any cheating with respect to the test data, we
should use the FilteredClassifier option in the Classify panel.
My question (please correct me if I am wrong): is it possible to apply
the ordinary pre-processing filters, such as normalization,
discretization, standardization, or replacing missing values, in the
Preprocess panel using the filter option, and then use the
"FilteredClassifier" in the Classify panel only for the attribute
selection and classification algorithms? That is, only the data
preparation would be done separately in the Preprocess panel, while the
remaining steps, from attribute selection to classification, would be
applied in the Classify panel under the "FilteredClassifier".

Thank you,
Abdrahman



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Pre-Processing the data and FilteredClassifier

Eibe Frank-3
Processing data in the Preprocess tab using a filter that does not look at the class attribute's data at all is generally reasonably safe when the filter does something simple like imputing missing values based on the mean/mode or standardising the data to zero mean and unit variance.

However, in the strictest possible sense, this actually means you will be evaluating a form of semi-supervised learning, where unlabeled test data is used to inform the model. My preference would be to use the FilteredClassifier even in that case (unless you are aiming to evaluate the classifier in the corresponding semi-supervised setting).

Cheers,
Eibe
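
To make the distinction concrete, here is a minimal Python sketch (not Weka code) of the two workflows for standardisation. The helper names `fit_standardizer` and `apply_standardizer` and the toy data are hypothetical, and the population standard deviation is used purely for simplicity, which may differ from what Weka's Standardize filter computes. Variant (a) takes the statistics from all rows, as happens when the filter is applied in the Preprocess tab before evaluation. Variant (b) takes them from the training rows only and reuses them on the test row, which is the behaviour the FilteredClassifier provides.

```python
from statistics import mean, pstdev


def fit_standardizer(rows):
    """Compute per-column mean and standard deviation from the given rows only."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns so we never divide by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_standardizer(rows, params):
    """Standardise rows with previously computed (means, stds)."""
    means, stds = params
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: a single numeric attribute, no class labels needed,
# because standardisation never looks at the class attribute.
train = [[1.0], [2.0], [3.0]]
test = [[10.0]]

# (a) Preprocess-tab style: statistics come from train and test together,
#     so the unlabeled test row influences the training representation.
leaky_params = fit_standardizer(train + test)
leaky_train = apply_standardizer(train, leaky_params)

# (b) FilteredClassifier style: statistics come from the training rows only,
#     and the same transformation is then applied to the test row.
clean_params = fit_standardizer(train)
clean_train = apply_standardizer(train, clean_params)
clean_test = apply_standardizer(test, clean_params)

print("mean used in (a):", leaky_params[0])
print("mean used in (b):", clean_params[0])
```

In (a) the training values shift because the test row contributed to the mean and standard deviation. In (b) the test row has no effect on what the model sees during training.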

On Thu, Aug 8, 2019 at 4:29 AM Abdrahman0x <[hidden email]> wrote:
> [...]


Re: Pre-Processing the data and FilteredClassifier

Abdrahman0x
Thank you Eibe, I got it!


