A thousand thanks to Professor Eibe and everyone's help! I only have one final process left needing confirmation.


Hao Li

Respected Weka data mining team,


A thousand thanks to everyone for clarifying my past questions! They really help a ton!!! Again thanks to everyone for your diligent insights!


Sorry for bothering you one more time. I only have one more process that I need confirmed, and I will have no further questions after this one. I normally wouldn’t bother the same people with repeated questions; it’s just that Weka is a software package and not an algorithm, so I cannot simply ask someone else from a related field.






What I need confirmed is whether my interpretation of the Weka Experimenter results is completely correct. I have already read the Weka online appendix thoroughly and set everything up accordingly, so I should be correct, but I cannot afford mistakes in my current situation. Again, the text below does not contain any real investigative questions; these are just basic points needing a yes or no confirmation.


I want to run 5 repeated classifications on 3 different datasets (iris.arff, weather.numeric.arff, labor.arff) and need to collect the average classification performance (accuracy and kappa) over the 5 repeats for each dataset.

I ran the Weka Experimenter (simple mode) according to the manual, using 10-fold cross-validation, classification, 5 repeats and the “datasets first” option.


The algorithm used is J48 -C 0.25 -M 2.

All 3 datasets had nominal classes.

Iris.arff has 4 numeric attributes.

Weather.numeric.arff and labor.arff contain a mix of numerical and nominal attributes.


I did no data preprocessing such as scaling or converting all attributes into nominal.

Decision tree methods such as J48 and Weka's RandomForest can handle all of these input datasets without preprocessing, regardless of whether the attributes are purely numeric, purely nominal or a mix of both. Both algorithms also have intrinsic attribute selection capability, so I do not have to use an external feature selector such as CfsSubsetEval.

Am I correct on the above point???


After the Experimenter run is complete, click on the “Analyze” tab and then click on “Experiment”.

Then tick “show std deviations”.

In the “Comparison field”, select “Percent_correct”. Leave everything else at its default.

Click “Perform test”. I get the following results:



Tester:     weka.experiment.PairedCorrectedTTester -G 4,5,6 -D 1 -R 2 -S 0.05 -V -result-matrix "weka.experiment.ResultMatrixPlainText -mean-prec 2 -stddev-prec 2 -col-name-width 0 -row-name-width 25 -mean-width 0 -stddev-width 0 -sig-width 0 -count-width 5 -show-stddev -print-col-names -print-row-names -enum-col-names"
Analysing:  Percent_correct
Datasets:   3
Resultsets: 1
Confidence: 0.05 (two tailed)
Sorted by:  -
Date:       3/22/20 3:11 PM

Dataset                   (1) trees.J48 '-C 0.2
iris                      (50)   94.93( 5.48) |
weather                   (50)   64.00(42.90) |
labor-neg-data            (50)   77.13(17.29) |
(v/ /*)                                      |

(1) trees.J48 '-C 0.25 -M 2' -217733168393644444



This is how I interpret the result:


iris   (50)   94.93( 5.48)


The “iris (50)” means there were 50 runs for the iris dataset, consisting of 5 repeats * 10 folds = 50 runs.

94.93(5.48) means the classification accuracy was 94.93±5.48%, with 5.48 being the sample standard deviation (n-1 degrees of freedom).

Similarly, the weather dataset had an accuracy of 64.00±42.90%, and the labor dataset had an accuracy of 77.13±17.29%.
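For reference, here is a minimal sketch of the arithmetic described above, assuming the bracketed value really is the sample standard deviation over all 50 per-fold results. The per-fold accuracies are hypothetical placeholders, not the values behind the table.

```python
import statistics

REPEATS = 5
FOLDS = 10

# Hypothetical per-fold accuracies in percent (NOT the real Experimenter values).
pattern = [100.0, 93.33, 93.33, 86.67, 100.0, 93.33, 100.0, 93.33, 86.67, 100.0]
fold_accuracies = [
    [pattern[(r + f) % FOLDS] for f in range(FOLDS)]
    for r in range(REPEATS)
]

# Flatten 5 repeats x 10 folds into the 50 results counted in the "(50)" column.
results = [acc for repeat in fold_accuracies for acc in repeat]

mean = statistics.mean(results)
sample_std = statistics.stdev(results)       # divides by n - 1
population_std = statistics.pstdev(results)  # divides by n, shown for contrast

print(f"n = {len(results)}")
print(f"{mean:.2f}({sample_std:.2f})")
```

With n = 50 the two standard deviations differ only slightly, but the n-1 version is always the larger of the two.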

If I want to get the kappa, I select “Kappa_statistic” in the “Comparison field” and click “Perform test”. I get:


iris                      (50)   0.92(0.08) |
weather                   (50)   0.46(0.65) |
labor-neg-data            (50)   0.48(0.38) |


So iris has a kappa of 0.92±0.08…and so on. Am I correct in my understanding?

As I understand it, the results of the Experimenter should correspond to the results from the Explorer: “Percent_correct” in the Experimenter should correspond to “Correctly Classified Instances  94.6667 %” in the Explorer (see below), and “Kappa_statistic” in the Experimenter should correspond to “Kappa statistic 0.92” in the Explorer (see below).


Is my understanding correct???


WEKA explorer results of J48 classification of IRIS.arff, 10 fold cross validation, 1 repeat.


=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         142               94.6667 %
Incorrectly Classified Instances         8                5.3333 %
Kappa statistic                          0.92
Mean absolute error                      0.044
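To show how the two Explorer figures relate, here is a minimal sketch that computes percent correct and Cohen's kappa from a confusion matrix. The matrix is hypothetical: it is only chosen to have 142 of 150 instances on the diagonal with 50 instances per actual class, and it is not the matrix Weka printed for this run.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the actual and predicted class totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical iris-like confusion matrix: 142 correct out of 150.
confusion = [
    [49, 1, 0],
    [0, 46, 4],
    [0, 3, 47],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(3))
percent_correct = 100.0 * correct / total
kappa = cohen_kappa(confusion)

print(f"Correctly classified: {correct} ({percent_correct:.4f} %)")
print(f"Kappa statistic: {kappa:.2f}")
```

Because each actual class has 50 instances, chance agreement is exactly 1/3 here, so 142 correct out of 150 gives a kappa of 0.92 however the 8 errors are distributed.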


I would like to point out that repeated runs in the Experimenter show slightly different accuracy and kappa values, which is expected. In the Explorer, however, I get the same 94.6667% correct every time, even when I use a different random seed each time. Why is that?
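As background for the seed question, the run-to-run variation in repeated cross-validation comes from how instances are shuffled into folds. Here is a minimal sketch, assuming a plain seeded shuffle followed by round-robin fold assignment. This is an illustration only, not Weka's stratified implementation, and `fold_assignment` is a hypothetical helper.

```python
import random


def fold_assignment(n_instances, n_folds, seed):
    """Shuffle instance indices with the given seed and deal them into folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[f::n_folds] for f in range(n_folds)]


same_a = fold_assignment(150, 10, seed=1)
same_b = fold_assignment(150, 10, seed=1)
other = fold_assignment(150, 10, seed=2)

print(same_a == same_b)  # same seed: identical folds, hence identical results
print(same_a == other)   # different seed: different folds
```

Under this assumption, the results can only change if the seed that reaches the fold shuffle changes. A seed set elsewhere, for example on a classifier that does not use randomness, would leave the folds untouched.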


I would also like clarification on 2 quick, minor questions.

(1) I can convert CSV files into ARFF files by opening the CSV file in the Weka ArffViewer and then saving it as an ARFF file. Is this correct? The inverse can also be done, using the ArffViewer to convert ARFF files to CSV files, correct?


(2) In reply to my last question, on meta.AttributeSelectedClassifier, Professor Eibe mentioned that if the test set does not undergo feature selection, the attribute set (training set) will be ‘tuned’ to the test set. What does this mean? The exact wording was:


"you should use the AttributeSelectedClassifier for supervised attribute selection with CFS, to avoid tuning the attribute set to the test data and getting optimistic performance estimates"


Does it mean that if the AttributeSelectedClassifier is not used, Weka will add artificial features to the training set to make it the same size (same number of attributes) as the test set? This would invalidate the test process, wouldn’t it?

Also, can I manually create training and test sets with the same attributes instead of using the AttributeSelectedClassifier? E.g. I run CfsSubsetEval on a training set, and it gives me the set of attributes that was selected. I write down this set of attributes, open both the training and test set data files and strip them down to the attributes I wrote down.

Then, during classification, I no longer use the AttributeSelectedClassifier. Can I do that?
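The manual procedure described above can be sketched as follows. The key property is that the attributes are chosen from the training data only, and the same column indices are then applied to both sets. A simple correlation filter stands in for CfsSubsetEval here; it is not CFS. The helper names and toy data are hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_attributes(rows, labels, threshold):
    """Indices of columns whose |correlation| with the label exceeds the threshold."""
    n_cols = len(rows[0])
    return [
        j for j in range(n_cols)
        if abs(pearson([row[j] for row in rows], labels)) > threshold
    ]


def project(rows, keep):
    """Keep only the selected columns, in the same order for every row."""
    return [[row[j] for j in keep] for row in rows]


# Hypothetical toy data: column 0 tracks the class, column 1 is uninformative.
train_rows = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 5.0], [1.5, 4.0], [8.5, 3.0]]
train_labels = [0, 0, 1, 1, 0, 1]
test_rows = [[2.5, 4.5], [7.5, 3.5]]

# Selection sees the training data only; the test data is never consulted.
keep = select_attributes(train_rows, train_labels, threshold=0.5)
train_reduced = project(train_rows, keep)
test_reduced = project(test_rows, keep)

print(keep)
print(test_reduced)
```

This covers a single fixed train/test split. Under cross-validation the selection step would have to be repeated inside every fold on that fold's training portion, which is the bookkeeping the AttributeSelectedClassifier takes care of.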




Finally a big THANK YOU for everyone’s help!!! I could not have come this far without your support!!
