classification with WEKA


classification with WEKA

miray guler
Hi,
I am trying to classify two groups. I have 8 features. After preprocessing I obtain 6 features; then I tuned the classifier parameters and obtained a high accuracy with 5-fold CV. I want to use all 8 features for the test set, with no parameter tuning, and use the selected 6 features in the training set. Is that possible? How can I do this? Does that mean I will not use cross-validation?
Thanks in advance

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: classification with WEKA

Eibe Frank-2
Administrator

You need to include all your preprocessing and attribute selection algorithms in the actual learning process. You can do this using the FilteredClassifier and/or AttributeSelectedClassifier and/or CVParameterSelection/GridSearch/MultiSearch. You can nest these classifiers to an arbitrary depth. To apply several filters in sequence inside the FilteredClassifier, you can use the MultiFilter.

 

Note that performing attribute selection and/or parameter tuning before performing a k-fold cross-validation *based on the selected attributes and parameter values* will normally give you optimistic performance estimates. That is why we have the FilteredClassifier, etc., in WEKA: so that all preprocessing steps can be treated and evaluated as part of the learning process.
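As a minimal sketch of how this nesting looks in the Java API (this example is not from the thread; it assumes weka.jar is on the classpath, a file data.arff with the class as the last attribute, and uses CfsSubsetEval/BestFirst and J48 purely as illustrative choices), the attribute selection becomes part of the classifier itself, so it is re-run on the training folds only during cross-validation:

```java
import java.util.Random;
import weka.attributeSelection.BestFirst;
import weka.attributeSelection.CfsSubsetEval;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.AttributeSelectedClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class NestedSelectionDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("data.arff"); // raw, unreduced data
        data.setClassIndex(data.numAttributes() - 1);

        // Wrap attribute selection around the base learner so that
        // selection happens inside each training fold, not beforehand.
        AttributeSelectedClassifier asc = new AttributeSelectedClassifier();
        asc.setEvaluator(new CfsSubsetEval()); // example subset evaluator
        asc.setSearch(new BestFirst());        // example search method
        asc.setClassifier(new J48());          // example base learner

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(asc, data, 5, new Random(1)); // 5-fold CV
        System.out.println(eval.toSummaryString());
    }
}
```

The resulting accuracy estimate is unbiased by the selection step, because each fold's selected subset is chosen without seeing that fold's test instances.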

 

Cheers,

Eibe

 


Re: classification with WEKA

miray guler
Thank you for the reply. Does the FilteredClassifier filter only the training data? So I can test my model on unfiltered test data? Thanks in advance.


Re: classification with WEKA

Peter Reutemann-3

No, the filter is also applied to the test data (of course, filters that operate on rows rather than columns, like Resample, won't do anything at prediction time). That allows you to always work with the raw data, with the preprocessing pipeline being part of the model.
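In the Java API this looks roughly as follows (a minimal sketch, not from the thread; it assumes weka.jar on the classpath and raw train.arff/test.arff files with the class as the last attribute, with ReplaceMissingValues and J48 as illustrative choices). The filter is fitted on the training data inside buildClassifier() and then applied transparently to each raw test instance:

```java
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.unsupervised.attribute.ReplaceMissingValues;

public class FilteredDemo {
    public static void main(String[] args) throws Exception {
        Instances train = DataSource.read("train.arff"); // raw training data
        Instances test  = DataSource.read("test.arff");  // raw test data
        train.setClassIndex(train.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        FilteredClassifier fc = new FilteredClassifier();
        fc.setFilter(new ReplaceMissingValues()); // example filter
        fc.setClassifier(new J48());              // example base learner
        fc.buildClassifier(train); // filter is fitted here, on training data only

        // At prediction time the fitted filter is applied automatically
        // to each raw test instance before the base learner sees it:
        for (int i = 0; i < test.numInstances(); i++) {
            double pred = fc.classifyInstance(test.instance(i));
            System.out.println("instance " + i + " -> class index " + pred);
        }
    }
}
```

So you never filter the test set by hand; you hand raw instances to the model and the stored pipeline does the rest.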

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: classification with WEKA

miray guler

Dear Reutemann,

Thank you for the reply. I still cannot solve my problem. I have 8 features and selected the 6 of them that are discriminative for my data (with SPSS). Then I classified the 2 groups with these 6 features. The reviewer wants me to test my model without doing any preprocessing or feature selection on data from the test set. I don't know how to do this (6 features in the training set, 8 features in the test set). I tried to use the supplied-test-set option but it gave an error (training and test set are not compatible). What should I do? Thank you very much in advance.



Re: classification with WEKA

Eibe Frank-2
Administrator
Recent versions of WEKA should offer to wrap the classifier in an InputMappedClassifier to take care of the mismatched attributes. A dialog should pop up when incompatible datasets are detected.
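In the API this wrapper can be used directly (a hedged sketch, not from the thread; it assumes weka.jar on the classpath, with hypothetical file names train6.arff/test8.arff and J48 as an illustrative base learner). The InputMappedClassifier maps incoming test attributes onto the training header by name, so extra test-set attributes are simply ignored:

```java
import weka.classifiers.misc.InputMappedClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MappedDemo {
    public static void main(String[] args) throws Exception {
        Instances train = DataSource.read("train6.arff"); // the 6 selected features
        Instances test  = DataSource.read("test8.arff");  // all 8 features
        train.setClassIndex(train.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        InputMappedClassifier imc = new InputMappedClassifier();
        imc.setClassifier(new J48()); // example base learner
        imc.buildClassifier(train);

        // Each test instance is mapped to the training structure by attribute
        // name; attributes unseen during training are dropped, and attributes
        // absent from the test data are treated as missing.
        for (int i = 0; i < test.numInstances(); i++) {
            double pred = imc.classifyInstance(test.instance(i));
            System.out.println("instance " + i + " -> class index " + pred);
        }
    }
}
```

This avoids the "training and test set are not compatible" error without manually editing the test file.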

Cheers,
Eibe
