Prediction (j48)

Paul m
Hello,
I'm new to the list and to Weka, so I may be missing something easy, but
I'm having trouble predicting with Weka. Here is what I've done so
far. I'm using the command line with weka-3-4-4 on Windows XP.

I've trained a J48 model on my data set (36 attributes plus one class
attribute). The class attribute is nominal, with 6 values such as pear,
apples, etc. I've saved the model to a file, which I want to predict
from. I'm using the following command line:

java -cp weka.jar weka.classifiers.trees.J48 -l j48_model_output -T prediction-dis.arff -p 1-3

j48_model_output = model generated by the J48 classifier
prediction-dis.arff = data set of test instances (the ones I want
predicted values for). The test data set consists of new instances,
each with 36 attributes (I've removed the class attribute, since this
is the predicted value I'm after).

The output of the above looks like this:
0 0.0 85.0 (56,87,104)
1 0.0 74.0 (53,79,96)
2 0.0 96.0 (72,106,115)

I've read http://www.ofai.at/~alexander.seewald/WEKA/ and I understand
that the first column is the instance number, the second is the
prediction, and the third is the confidence level. The bracketed
values match attributes 1-3 of my test instances. As you can see,
the predicted value is 0.0, which does not match any of the nominal
class values in the generated model.

Any pointers or thoughts about what I'm doing wrong would be appreciated.

/paul
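
For anyone who wants to inspect this kind of output programmatically, here is a minimal Python sketch that splits each line into the columns as they are read in the post above (instance number, prediction, a third column, then the bracketed attribute values). The names `PredictionRow`, `parse_row` and `is_probability` are made up for this illustration and are not part of Weka. One hedged observation: a confidence for a nominal class is a probability, so values such as 85.0 suggest the third column is not a confidence in this particular output.

```python
import re
from typing import NamedTuple, Tuple


class PredictionRow(NamedTuple):
    instance: int                # first column: instance number
    predicted: str               # second column: the prediction
    third: str                   # third column (read as "confidence" in the post)
    attributes: Tuple[str, ...]  # bracketed values requested with -p


# Instance number, two whitespace-separated fields, then "(v1,v2,...)".
ROW_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+\((.*)\)\s*$")


def parse_row(line: str) -> PredictionRow:
    """Split one prediction line into its columns."""
    match = ROW_RE.match(line)
    if match is None:
        raise ValueError("unrecognised prediction line: %r" % line)
    index, predicted, third, attrs = match.groups()
    values = tuple(v.strip() for v in attrs.split(","))
    return PredictionRow(int(index), predicted, third, values)


def is_probability(text: str) -> bool:
    """True if the text is a number in [0, 1]; False for e.g. 'missing'."""
    try:
        return 0.0 <= float(text) <= 1.0
    except ValueError:
        return False


# The three output lines quoted in the post.
sample = """\
0 0.0 85.0 (56,87,104)
1 0.0 74.0 (53,79,96)
2 0.0 96.0 (72,106,115)"""

rows = [parse_row(line) for line in sample.splitlines()]
for row in rows:
    print(row.instance, row.predicted, row.third, is_probability(row.third))
```

None of the three sample rows passes the `is_probability` check, which is a quick way to notice that something is off before looking at the data files themselves.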

_______________________________________________
Wekalist mailing list
[hidden email]
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist

Re: Prediction (j48)

Eibe Frank
Don't delete the class attribute from the test data. It needs to be
there. You can set all class values in the test data to "missing".

You should really be getting an exception that your test data is not
compatible with the training data because it has one attribute less.
I'll take a look at whether we can fix that.

Cheers,
Eibe
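
A rough Python sketch of the advice above, for readers who have already stripped the class column from a test file: it re-adds a nominal class attribute to the ARFF header and appends "?" (missing) to every data row. Everything named here is an assumption for illustration: the function `add_missing_class`, the attribute names `a1` to `a3`, the class name `fruit` and its two values. It also assumes a dense (non-sparse) ARFF file with the class as the last attribute. The class declaration should match the one in the training file, including the order of the nominal values.

```python
def add_missing_class(arff_text, class_name, class_values):
    """Append a nominal class attribute and a '?' class value to each row.

    Assumes dense ARFF rows (no sparse '{...}' rows) and that the class
    belongs at the end of the attribute list.
    """
    out = []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not in_data and stripped.lower() == "@data":
            # Declare the class just before the data section starts.
            out.append("@attribute %s {%s}" % (class_name, ",".join(class_values)))
            out.append(line)
            in_data = True
        elif in_data and stripped and not stripped.startswith("%"):
            # Data row: mark the class value as missing.
            out.append(line.rstrip() + ",?")
        else:
            # Header lines, comments and blank lines pass through unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical three-attribute test file without a class attribute.
test_arff = """\
@relation prediction-dis
@attribute a1 numeric
@attribute a2 numeric
@attribute a3 numeric
@data
56,87,104
53,79,96
"""

result = add_missing_class(test_arff, "fruit", ["apples", "pear"])
print(result)
```

The rewritten file then has the same number of attributes as the training data, with every class value left for the model to predict.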


On Jun 28, 2005, at 10:04 AM, Paul wrote:

> [...]



Re: Prediction (j48)

Paul m
I've tried putting '?' in the class attribute of the test file, and also
leaving it blank, and now get the following:

0 0.0 missing (attribute list)
1 0.0 missing (attribute list)
2 0.0 missing (attribute list)

I figure that when classifying new instances you are not supposed to
know the resulting class, therefore it should either be blank or
unknown.

I've also tried putting some 'initial guess' values in the class
attribute, but the output just mirrors what was entered.

E.g., I entered 'apples' as the class for every test instance. The
result was that all the guesses were 'apples' with a probability just
under 1.
Then I entered 'pineapples' as the class for every test instance. The
result was that all the guesses were 'pineapples' with a probability
just under 1.
etc.

Could it be that you're only supposed to have one instance in the 'test' file?

Am I missing something here?

/paul
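
Since the earlier reply points at incompatible test data, one cheap thing to rule out at this stage is a remaining mismatch between the two ARFF headers. The sketch below compares the `@attribute` declarations of a training and a test file and lists any differences. It is only an illustration under stated assumptions: `read_attributes` and `header_differences` are invented helper names, the sample headers are hypothetical, and attribute names containing spaces (quoted names) are not handled.

```python
def read_attributes(arff_text):
    """Return (name, type) pairs from the @attribute lines of an ARFF header."""
    attrs = []
    for line in arff_text.splitlines():
        stripped = line.strip()
        if stripped.lower() == "@data":
            break  # header ends where the data section begins
        if stripped.lower().startswith("@attribute"):
            parts = stripped.split(None, 2)
            if len(parts) == 3:
                # Drop spaces so "{apples, pear}" equals "{apples,pear}".
                attrs.append((parts[1], parts[2].replace(" ", "")))
    return attrs


def header_differences(train_text, test_text):
    """List human-readable differences between two ARFF headers."""
    train = read_attributes(train_text)
    test = read_attributes(test_text)
    problems = []
    if len(train) != len(test):
        problems.append(
            "attribute count differs: train has %d, test has %d"
            % (len(train), len(test))
        )
    for position, (a, b) in enumerate(zip(train, test), start=1):
        if a != b:
            problems.append("attribute %d differs: %s vs %s" % (position, a, b))
    return problems


# Hypothetical headers: the test file lacks the class attribute.
train_header = """\
@relation fruit-train
@attribute a1 numeric
@attribute a2 numeric
@attribute a3 numeric
@attribute fruit {apples,pear}
@data
"""

test_header = """\
@relation prediction-dis
@attribute a1 numeric
@attribute a2 numeric
@attribute a3 numeric
@data
"""

differences = header_differences(train_header, test_header)
for problem in differences:
    print(problem)
```

An empty result means the attribute declarations line up; a class attribute declared with a different type or different nominal values in the test file would show up as a per-attribute difference.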

On 6/28/05, Eibe Frank <[hidden email]> wrote:

> [...]
