Using a trained classifier in a Java Program

sachiz
Hello,

I trained a logistic regression classifier and saved the trained model as
a .model file using the WEKA GUI.
I want to use the trained model to make a prediction whenever new data
arrives in my Java application.

I was referring to the WEKA Wiki at
https://waikato.github.io/weka-wiki/serialization/ to learn how to do
it.
From what I understand, the incoming new data has to be saved in an
ARFF file before the deserialized model can use it to make
predictions.

Is my understanding correct?
Is it not possible for the de-serialized model to read incoming new data
without having to write it to an arff file first?

Thank you for your help.

I really appreciate the work you do in this forum. You have saved me from
many hours of looking through obsolete/incorrect WEKA tutorials on random
parts of the Internet. Thank you again!



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Using a trained classifier in a Java Program

Peter Reutemann
> I trained a logistic regression classifier and saved the trained model as
> a .model file using the WEKA GUI.
> I want to use the trained model to make a prediction whenever new data
> arrives in my Java application.
>
> I was referring to the WEKA Wiki at
> https://waikato.github.io/weka-wiki/serialization/ to learn how to do
> it.
> From what I understand, the incoming new data has to be saved in an
> ARFF file before the deserialized model can use it to make
> predictions.
>
> Is my understanding correct?
> Is it not possible for the de-serialized model to read incoming new data
> without having to write it to an arff file first?

You don't have to read data from an ARFF file. All you need is to
generate a new weka.core.Instance object for making predictions with
your model.

When saving a model from the Explorer or the command line, the header
(i.e., the dataset structure) of the dataset is stored alongside the model
as the second object in the serialized data stream. You can use the "readAll"
method of the weka.core.SerializationHelper class to read all
objects into an array:
  https://weka.sourceforge.io/doc.dev/weka/core/SerializationHelper.html#readAll-java.lang.String-
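
A minimal sketch of that loading step, assuming the file was saved from the
Explorer so that the header follows the model. The class name ModelLoader,
the LoadedModel holder and the path passed on the command line are made up
for illustration; readAll is the only Weka call doing the work.

```java
import weka.classifiers.Classifier;
import weka.core.Instances;
import weka.core.SerializationHelper;

public class ModelLoader {

    /** Hypothetical holder for the objects read from a .model file. */
    public static final class LoadedModel {
        public final Classifier classifier;
        public final Instances header; // null if the file holds only the model

        public LoadedModel(Classifier classifier, Instances header) {
            this.classifier = classifier;
            this.header = header;
        }
    }

    /** Reads every object in the file: the model first, the header second. */
    public static LoadedModel load(String path) throws Exception {
        Object[] objects = SerializationHelper.readAll(path);
        Classifier classifier = (Classifier) objects[0];
        Instances header = null;
        if (objects.length > 1 && objects[1] instanceof Instances) {
            header = (Instances) objects[1];
            // Assumption: the class is the last attribute if no index was stored.
            if (header.classIndex() < 0) {
                header.setClassIndex(header.numAttributes() - 1);
            }
        }
        return new LoadedModel(classifier, header);
    }

    public static void main(String[] args) throws Exception {
        LoadedModel loaded = load(args[0]); // path to the .model file
        System.out.println(loaded.classifier.getClass().getName());
        if (loaded.header != null) {
            System.out.println(loaded.header); // dataset structure in ARFF syntax
        }
    }
}
```

The header carries the attribute names, types and nominal labels, so new
rows can be built against it without touching an ARFF file.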

In terms of generating Instance objects on the fly, have a look at
this article (you already have the dataset structure, you only need to
add data rows):
  https://waikato.github.io/weka-wiki/formats_and_processing/creating_arff_file/
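
As a rough sketch of adding one data row against an existing header: the
helper below turns an array of strings into a DenseInstance. The class name
RowBuilder, the method toInstance and the three-attribute header in main are
hypothetical stand-ins (a real header would come from the model file), and
only nominal and numeric attributes are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

public class RowBuilder {

    /** Converts one row of string values into an Instance tied to the header. */
    public static Instance toInstance(Instances header, String[] values) {
        if (values.length != header.numAttributes()) {
            throw new IllegalArgumentException("Expected "
                + header.numAttributes() + " values, got " + values.length);
        }
        // This constructor starts with every value missing.
        Instance inst = new DenseInstance(header.numAttributes());
        // The dataset reference is needed to resolve nominal labels.
        inst.setDataset(header);
        for (int i = 0; i < values.length; i++) {
            String v = values[i].trim();
            if (v.equals("?")) {
                inst.setMissing(i);
            } else if (header.attribute(i).isNominal()) {
                inst.setValue(i, v); // label must be declared in the header
            } else {
                inst.setValue(i, Double.parseDouble(v));
            }
        }
        return inst;
    }

    public static void main(String[] args) {
        // Hypothetical header, built in code so the sketch runs on its own.
        ArrayList<Attribute> atts = new ArrayList<>();
        atts.add(new Attribute("group", Arrays.asList("F", "M")));
        atts.add(new Attribute("score"));
        atts.add(new Attribute("class", Arrays.asList("Yes", "No")));
        Instances header = new Instances("demo", atts, 0);
        header.setClassIndex(header.numAttributes() - 1);

        Instance inst = toInstance(header, new String[] {"F", "1.12", "?"});
        System.out.println(inst);
    }
}
```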

HTH

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: Using a trained classifier in a Java Program

sachiz
Hi Peter,

Thank you for getting back so quickly. This was very helpful. I got it to
work correctly.
The example you cited for creating Instance objects constructs a new Instance
for every row of data. When I called the Instance constructor, it said that
Instance was abstract and cannot be instantiated. Therefore I used
the DenseInstance class instead.

A quick follow-up question.
Because I want the classifier to make a prediction, my input data looks like
the following:
{F,0,0,0,0,4,4,6,5,1.12,0.64,8,?} The last value (?) is the class label.

I am using "?" for it, and I set the class index on the test data like this:
test.setClassIndex(test.numAttributes()-1);

Does it matter if I give "?" in the input data, or does it have to be
one of the classes in the training data set?
In my case the class labels are Yes/No.

Thank you for your help




Re: Using a trained classifier in a Java Program

Peter Reutemann
> Thank you for getting back so quickly. This was very helpful. I got it to
> work correctly.
> The example you cited for creating Instance objects constructs a new Instance
> for every row of data. When I called the Instance constructor, it said that
> Instance was abstract and cannot be instantiated. Therefore I used
> the DenseInstance class instead.

Thanks for reporting that. This example dates back to the days of Weka
3.6, where Instance was the equivalent of DenseInstance. I've updated
that.
The examples linked at the bottom (which are also part of every Weka
download) already used DenseInstance.

> A quick follow-up question.
> Because I want the classifier to make a prediction, my input data looks like
> the following:
> {F,0,0,0,0,4,4,6,5,1.12,0.64,8,?} The last value (?) is the class label.
>
> I am using "?" for it, and I set the class index on the test data like this:
> test.setClassIndex(test.numAttributes()-1);
>
> Does it matter if I give "?" in the input data, or does it have to be
> one of the classes in the training data set?
> In my case the class labels are Yes/No.

You're doing it absolutely right. You should always use a missing
class value (?) for making predictions to avoid information leaking
into the prediction.
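
For illustration, a small sketch of predicting with the class value left
missing and mapping the result back to a label. Everything named here is an
assumption made for the example: the class PredictDemo, the helper
predictLabel, and the toy dataset. A Logistic model is trained in code only
so the sketch runs without a .model file; in an application the classifier
would be the deserialized one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import weka.classifiers.Classifier;
import weka.classifiers.functions.Logistic;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

public class PredictDemo {

    /** Classifies a copy of the row with its class set to missing. */
    public static String predictLabel(Classifier model, Instance inst)
            throws Exception {
        Instance unlabeled = (Instance) inst.copy();
        unlabeled.setClassMissing(); // same effect as "?" in the input row
        // classifyInstance returns the index of the predicted class label.
        double index = model.classifyInstance(unlabeled);
        return inst.classAttribute().value((int) index);
    }

    /** Toy data: one numeric attribute and a Yes/No class (made up). */
    public static Instances toyData() {
        ArrayList<Attribute> atts = new ArrayList<>();
        atts.add(new Attribute("x"));
        atts.add(new Attribute("class", Arrays.asList("Yes", "No")));
        Instances data = new Instances("toy", atts, 8);
        data.setClassIndex(data.numAttributes() - 1);
        // Second column is the label index: 0 = Yes, 1 = No.
        double[][] rows = {{1, 1}, {2, 1}, {3, 0}, {4, 1},
                           {6, 0}, {7, 1}, {8, 0}, {9, 0}};
        for (double[] row : rows) {
            data.add(new DenseInstance(1.0, row));
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        Instances data = toyData();
        Logistic model = new Logistic(); // stand-in for the loaded model
        model.buildClassifier(data);

        Instance query = new DenseInstance(data.numAttributes()); // all missing
        query.setDataset(data);
        query.setValue(0, 10.0);

        System.out.println("Predicted: " + predictLabel(model, query));
        // Class probabilities, in the order the labels are declared.
        double[] dist = model.distributionForInstance(query);
        for (int i = 0; i < dist.length; i++) {
            System.out.println(data.classAttribute().value(i) + ": " + dist[i]);
        }
    }
}
```

distributionForInstance is useful when the application needs a confidence
for the Yes/No decision rather than just the label.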

Cheers, Peter

Re: Using a trained classifier in a Java Program

sachiz
Thanks Peter.
This helped me a great deal.


