Doubt in weka.

Doubt in weka.

Vanderlei Aparecido de Lima
Good evening, how are you?

I have a question about data analysis in Weka.
I have an original data set that I need to divide into two subsets: one for training the model and another for testing.
However, I need the classes to be balanced in both subsets.
Which filter should I use for this?
Could you help me, please?

I look forward to your response.

Best Regards,

Vanderlei.


Prof. Dr. Vanderlei Aparecido de Lima
Departamento de Química
Universidade Tecnológica Federal do Paraná - UTFPR
Telefone (46) 3220 2596
Campus Pato Branco PR


_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Doubt in weka.

Joel Ratsaby-2
I think you should use a meta classifier: choose FilteredClassifier and set ClassBalancer as its filter.

--
Prof. Joel Ratsaby
Electrical and Electronics Engineering Dept.
Ariel University
ARIEL 40700
ISRAEL




Re: Doubt in weka.

Eibe Frank-2
Administrator
I agree. Note that using the FilteredClassifier will only balance the instance weights in the training data (but this is normally what you want: there is normally no reason to balance the test data).

Note that ClassBalancer itself does not do any resampling. It just changes the weights of the instances so that the sum of the weights of the instances pertaining to each class value is the same for all class values. Most WEKA classifiers implement the WeightedInstancesHandler interface (you can check their Capabilities to see this) so they will take the instance weights into account. If the base classifier you have specified for FilteredClassifier does not implement the WeightedInstancesHandler interface, i.e., it cannot exploit the information in instance weights, FilteredClassifier will automatically resample the data with replacement, using the instance weights to define sampling probabilities. Hence, even if your base classifier does not handle instance weights natively, going through FilteredClassifier will achieve what you want.
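
To make the two mechanisms described above concrete, here is a minimal plain-Java sketch. It is not WEKA's source code and does not use the WEKA API; the class name ClassBalanceSketch and both method names are hypothetical, chosen only for illustration. It assumes that reweighting keeps the overall sum of weights unchanged and splits it evenly over the classes that actually occur, and that resampling draws each instance with probability proportional to its weight.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only: hypothetical names, not WEKA's implementation.
final class ClassBalanceSketch {

    /**
     * Rescales instance weights so that every class present in the data
     * carries the same total weight. The overall sum of weights is kept.
     */
    static double[] balanceWeights(int[] labels, double[] weights, int numClasses) {
        double[] classSum = new double[numClasses];
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {
            classSum[labels[i]] += weights[i];
            total += weights[i];
        }
        int present = 0;
        for (double s : classSum) {
            if (s > 0) {
                present++;
            }
        }
        double[] out = Arrays.copyOf(weights, weights.length);
        if (present == 0) {
            return out; // nothing to balance
        }
        double target = total / present; // weight each class should end up with
        for (int i = 0; i < out.length; i++) {
            double s = classSum[labels[i]];
            if (s > 0) {
                out[i] = weights[i] * target / s;
            }
        }
        return out;
    }

    /**
     * Draws n instance indices with replacement, where the chance of picking
     * an instance is proportional to its weight.
     */
    static int[] resampleWithReplacement(double[] weights, int n, Random rng) {
        double[] cumulative = new double[weights.length];
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i];
            cumulative[i] = running;
        }
        if (running <= 0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        int[] picked = new int[n];
        for (int k = 0; k < n; k++) {
            double u = rng.nextDouble() * running;
            int idx = weights.length - 1; // fallback for rounding at the upper end
            for (int i = 0; i < cumulative.length; i++) {
                if (u < cumulative[i]) {
                    idx = i;
                    break;
                }
            }
            picked[k] = idx;
        }
        return picked;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 0, 1};          // three instances of class 0, one of class 1
        double[] weights = {1.0, 1.0, 1.0, 1.0};
        double[] balanced = balanceWeights(labels, weights, 2);
        System.out.println(Arrays.toString(balanced));
        int[] sample = resampleWithReplacement(balanced, 8, new Random(42));
        System.out.println(Arrays.toString(sample));
    }
}
```

With three instances of one class and one of the other, all starting at weight 1, each majority instance ends up with weight 2/3 and the single minority instance with weight 2, so both classes sum to 2. A classifier that handles weights can use these values directly; otherwise, drawing indices in proportion to them gives a roughly balanced training sample.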

Cheers,
Eibe
