WekaScript - Reproducibility of the experiments

juanantonio
Hello,

I am running a Keras model with pyscript. Each execution of the algorithm
produces a different output. Inside my .py file I set a seed, as everybody
recommends for reproducibility; however, the outputs still differ from one
execution of the model to the next.

Is there a way to obtain the same output on every execution?

Thanks.



--
Sent from: http://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: WekaScript - Reproducibility of the experiments

Eibe Frank-2
Administrator
The data that is received by your Python script should only change if you explicitly change the seed for the random number generator in WEKA when you do a cross-validation, etc. Are you saying that this is not the case? If this is not the case, we should fix it. Otherwise, the question is probably more suitable for a Keras support forum.
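
For the Python side of the question, here is a minimal sketch of what seeding typically involves. It is an assumption about the poster's setup, not code taken from their script: the helper name `seed_everything` is hypothetical, and the NumPy and TensorFlow calls are only made if those libraries happen to be installed.

```python
import os
import random


def seed_everything(seed):
    """Hypothetical helper: seed the RNGs a Keras script commonly draws from."""
    # Only inherited by child processes; hash randomization of the
    # current interpreter is already fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # TensorFlow 2.x call
    except (ImportError, AttributeError):
        pass  # TensorFlow missing, or a 1.x release without this call


# Seeding twice with the same value replays the same random draws.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)  # True
```

Even with every seed fixed, multi-threaded execution and some GPU operations can still introduce run-to-run differences, so identical seeds are a necessary step rather than a guarantee.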

Cheers,
Eibe

Re: WekaScript - Reproducibility of the experiments

Mark.Paris
Hi Eibe,
I've got a similar problem in that I'm not getting reproducible results. I'm
invoking Weka using Python running in a Jupyter notebook.

Here's my code:

>>> Start of code

# Import libraries
import weka.core.jvm as jvm
import weka.core.converters as conv
from weka.classifiers import Evaluation, Classifier
from weka.core.classes import Random

# Start the JVM
jvm.start(packages=True)

# Load iris.arff
data = conv.load_any_file("iris.arff")

# Specify that the class attribute is the last attribute.
data.class_is_last()

# Classify using the OneR rule classifier.
evl = Evaluation(data)
cls = Classifier(classname="weka.classifiers.rules.OneR")
evl.evaluate_train_test_split(cls, data, 66, Random(1))
print(evl.summary())

# Classify again using the OneR rule classifier.
evl = Evaluation(data)
cls = Classifier(classname="weka.classifiers.rules.OneR")
evl.evaluate_train_test_split(cls, data, 66, Random(1))
print(evl.summary())

# Stop the JVM
jvm.stop()

>>> End of code

... and here's my output:

>>> Start of output

Correctly Classified Instances          49               96.0784 %
Incorrectly Classified Instances         2                3.9216 %
Kappa statistic                          0.9408
Mean absolute error                      0.0261
Root mean squared error                  0.1617
Relative absolute error                  5.8824 %
Root relative squared error             34.2997 %
Total Number of Instances               51    


Correctly Classified Instances          47               92.1569 %
Incorrectly Classified Instances         4                7.8431 %
Kappa statistic                          0.8806
Mean absolute error                      0.0523
Root mean squared error                  0.2287
Relative absolute error                 11.7647 %
Root relative squared error             48.5071 %
Total Number of Instances               51  

>>> End of output

Given that I passed "Random(1)" to both calls of
"evaluate_train_test_split", I was expecting to get the same results for
both calls, but this wasn't the case.

Regards,
Mark



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: WekaScript - Reproducibility of the experiments

Peter Reutemann

The train_test_split method randomized the data in place (i.e., the
dataset itself) before generating the split whenever a random number
generator was provided. The second call therefore randomized the
already shuffled data further, resulting in a different split with (of
course) different results. That is not obvious from either the code or
the documentation, I agree.
I just released new versions of the python-weka-wrapper and
python-weka-wrapper3 libraries, which now create a copy of the data
before applying randomization. With these versions, the above code
snippet produces the same statistics for both calls. Thanks for reporting!
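
The effect can be illustrated without Weka at all. The following is a plain-Python sketch, not the wrapper's actual implementation: the function names `split_in_place` and `split_on_copy` are hypothetical, and a list of 150 integers stands in for the 150 rows of iris.arff.

```python
import random


def split_in_place(rows, percentage, seed):
    """Shuffle `rows` itself, then split (mimics the behaviour described above)."""
    random.Random(seed).shuffle(rows)  # mutates the caller's list
    cut = round(len(rows) * percentage / 100)
    return rows[:cut], rows[cut:]


def split_on_copy(rows, percentage, seed):
    """Shuffle a copy, leaving `rows` untouched (mimics the fix)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * percentage / 100)
    return shuffled[:cut], shuffled[cut:]


# In-place variant: same seed, but the second call starts from
# rows that the first call already shuffled.
data = list(range(150))
a = split_in_place(data, 66, 1)
b = split_in_place(data, 66, 1)
print(a == b)

# Copy variant: both calls start from the same, unmodified rows.
data = list(range(150))
c = split_on_copy(data, 66, 1)
d = split_on_copy(data, 66, 1)
print(c == d)  # True
print(len(c[1]))  # 51 test rows out of 150
```

With the in-place variant, the same seed only guarantees the same sequence of swaps, not the same starting order, which is why the two splits differ. With older library versions, a workaround along the same lines would be to reload or copy the dataset before each evaluation.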

BTW pww has its own mailing list:
https://groups.google.com/forum/#!forum/python-weka-wrapper

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/