help with basics


Manoj Agrawal

Hello Weka experts,


I just completed the More Data Mining with Weka course and started the post-course evaluation, and I am stuck on the first question. I have been using Weka for a while, and it is embarrassing to get stuck on something as basic as this, but I am just very confused.


This is the question:


In the Explorer, run OneR on the weather.nominal data, evaluated using 10-fold cross-validation, and output the predictions on the test data along with the value of the first attribute (outlook). Number these predictions from 1 to 14 in the order that they are output. Which ones do not agree with the classifier model for the full training set?

 2, 5, 7, 9, 10
 1, 3, 5, 7, 9
 4, 8, 11, 12
 6, 8, 9, 12
 2, 5, 6, 7, 8, 10, 12

and these are the results:

=== Run information ===

Scheme:       weka.classifiers.rules.OneR -B 6
Relation:     weather.symbolic
Instances:    14
Attributes:   5
              outlook
              temperature
              humidity
              windy
              play
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

outlook:
sunny -> no
overcast -> yes
rainy -> yes
(10/14 instances correct)


Time taken to build model: 0 seconds

=== Predictions on test data ===

    inst#     actual  predicted error prediction
        1       2:no      1:yes   +   1 
        2      1:yes       2:no   +   1 
        1       2:no      1:yes   +   1 
        2      1:yes      1:yes       1 
        1       2:no       2:no       1 
        2      1:yes       2:no   +   1 
        1       2:no       2:no       1 
        2      1:yes       2:no   +   1 
        1       2:no      1:yes   +   1 
        1      1:yes       2:no   +   1 
        1      1:yes      1:yes       1 
        1      1:yes       2:no   +   1 
        1      1:yes      1:yes       1 
        1      1:yes      1:yes       1 

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances           6               42.8571 %
Incorrectly Classified Instances         8               57.1429 %
Kappa statistic                         -0.1429
Mean absolute error                      0.5714
Root mean squared error                  0.7559
Relative absolute error                120      %
Root relative squared error            153.2194 %
Total Number of Instances               14     

=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.444    0.600    0.571      0.444    0.500      -0.149   0.422     0.611     yes
                 0.400    0.556    0.286      0.400    0.333      -0.149   0.422     0.329     no
Weighted Avg.    0.429    0.584    0.469      0.429    0.440      -0.149   0.422     0.510     

=== Confusion Matrix ===

 a b   <-- classified as
 4 5 | a = yes
 3 2 | b = no
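
The summary figures above can be cross-checked against the confusion matrix by hand. The following is a minimal sketch, assuming that the reported "Kappa statistic" is the standard Cohen's kappa; the helper names `accuracy` and `kappa` are made up for this illustration and are not Weka functions.

```python
# Confusion matrix from the output above: rows = actual, columns = predicted.
matrix = [
    [4, 5],  # actual yes: 4 classified as yes, 5 classified as no
    [3, 2],  # actual no:  3 classified as yes, 2 classified as no
]


def accuracy(m):
    """Fraction of instances on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total


def kappa(m):
    """Cohen's kappa: agreement beyond what the marginals give by chance."""
    total = sum(sum(row) for row in m)
    observed = accuracy(m)
    row_totals = [sum(row) for row in m]
    col_totals = [sum(col) for col in zip(*m)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


print(f"accuracy = {accuracy(matrix):.4%}")  # accuracy = 42.8571%
print(f"kappa    = {kappa(matrix):.4f}")     # kappa    = -0.1429
```

Both values agree with the summary: 6 of 14 instances lie on the diagonal, and the chance agreement from the row and column totals is 0.5, which makes kappa slightly negative.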

The answer, it seems, is 6, 8, 9 and 12, but I can't understand how. I tried adding IDs to identify the instances in both the cross-validation run and the full-training-set run, but the predictions on instances 6, 8 and 9 are the same in both.

Can someone please help me understand this question and the solution?

thanks,

Manoj Agrawal

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: help with basics

Eibe Frank-2
Administrator
In the output of the predictions, you need to output the values of the first attribute as well, i.e., the values of the outlook attribute.

You can then check where the classifications obtained in the cross-validation are inconsistent with the model built from the full training set. As you can see in the output, this model is based on the outlook attribute.
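
The check described above can be sketched in a few lines of Python. This is only an illustration and not part of Weka: the names `FULL_MODEL`, `CV_PREDICTIONS` and `disagreements` are made up here, and the (outlook, predicted) pairs are copied from the cross-validation output with the outlook column that appears in the follow-up message of this thread. Each cross-validation prediction comes from a model trained without that instance's fold, so it can differ from what the full-training-set rule says for the same outlook value.

```python
# Rule from "Classifier model (full training set)" in the original post.
FULL_MODEL = {"sunny": "no", "overcast": "yes", "rainy": "yes"}

# (outlook, predicted) for the 14 cross-validation predictions, in the order
# they are output, as reported in the follow-up message of this thread.
CV_PREDICTIONS = [
    ("rainy", "yes"), ("sunny", "no"), ("rainy", "yes"), ("overcast", "yes"),
    ("sunny", "no"), ("rainy", "no"), ("sunny", "no"), ("overcast", "no"),
    ("sunny", "yes"), ("sunny", "no"), ("overcast", "yes"), ("overcast", "no"),
    ("rainy", "yes"), ("rainy", "yes"),
]


def disagreements(predictions, model):
    """Return the 1-based output positions whose prediction differs from the rule."""
    return [
        position
        for position, (outlook, predicted) in enumerate(predictions, start=1)
        if model[outlook] != predicted
    ]


print(disagreements(CV_PREDICTIONS, FULL_MODEL))  # [6, 8, 9, 12]
```

Note that the comparison uses the predicted column, not the actual class, and that the positions are counted in output order rather than by instance ID.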

Cheers,
Eibe

> On 14/02/2017, at 2:55 PM, Manoj Agrawal <[hidden email]> wrote:
>
> Hello Weka experts,
>
> I just completed More Data Mining with Weka course and started to take the post course evaluation and I am stuck at the first question. I have been using Weka for a while and it is embarrassing to get stuck on something as basic as this but I am just very confused
>
> This is the question
>
> In the Explorer, run OneR on the weather.nominal data, evaluated using 10-fold cross-validation, and output the predictions on the test data along with the value of the first attribute (outlook). Number these predictions from 1 to 14 in the order that they are output. Which ones do not agree with the classifier model for the full training set?
>
>  2, 5, 7, 9, 10
>  1, 3, 5, 7, 9
>  4, 8, 11, 12
>  6, 8, 9, 12
>  2, 5, 6, 7, 8, 10, 12
>
> The answer it seems is 6,8,9 and 12 but I can't seem to understand how. I tried adding IDs to get the correct instances for both cross-folds and full training set but predictions on instances 6, 8 and 9 match with both options.
>
> Can someone please help me understand this question and the solution?
>
> thanks,
>
> Manoj Agrawal
>

Re: help with basics

Manoj Agrawal

Hi Eibe,


Thanks for the reply. 


I added an ID attribute and did two runs, one evaluated on the full training set and one with 10-fold cross-validation (removing the ID with a filtered classifier during the run, of course). These are the results I get:


Full training set


Classifier Model
outlook:
sunny -> no
overcast -> yes
rainy -> yes
(10/14 instances correct)


Time taken to build model: 0 seconds

=== Predictions on training set ===

    inst#     actual  predicted error prediction (ID,outlook)
        1       2:no       2:no       1 (1,sunny)
        2       2:no       2:no       1 (2,sunny)
        3      1:yes      1:yes       1 (3,overcast)
        4      1:yes      1:yes       1 (4,rainy)
        5      1:yes      1:yes       1 (5,rainy)
        6       2:no      1:yes   +   1 (6,rainy)
        7      1:yes      1:yes       1 (7,overcast)
        8       2:no       2:no       1 (8,sunny)
        9      1:yes       2:no   +   1 (9,sunny)
       10      1:yes      1:yes       1 (10,rainy)
       11      1:yes       2:no   +   1 (11,sunny)
       12      1:yes      1:yes       1 (12,overcast)
       13      1:yes      1:yes       1 (13,overcast)
       14       2:no      1:yes   +   1 (14,rainy)

=== Evaluation on training set ===

Time taken to test model on training data: 0.01 seconds

=== Summary ===

Correctly Classified Instances          10               71.4286 %
Incorrectly Classified Instances         4               28.5714 %
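
The rule and its 10/14 training accuracy can be reproduced from the (outlook, actual) pairs in the training-set predictions above. The sketch below is a simplification under stated assumptions: it only takes the majority class per outlook value, whereas Weka's OneR also compares the candidate attributes against each other and has a bucket-size option (`-B`) for numeric attributes. The function name `one_attribute_rule` is hypothetical.

```python
from collections import Counter, defaultdict

# (outlook, actual play) for instances 1..14, from the training-set output above.
TRAINING = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"),
    ("rainy", "yes"), ("rainy", "no"), ("overcast", "yes"), ("sunny", "no"),
    ("sunny", "yes"), ("rainy", "yes"), ("sunny", "yes"), ("overcast", "yes"),
    ("overcast", "yes"), ("rainy", "no"),
]


def one_attribute_rule(pairs):
    """Map each attribute value to its majority class and count training hits."""
    counts = defaultdict(Counter)
    for value, label in pairs:
        counts[value][label] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    correct = sum(c[rule[value]] for value, c in counts.items())
    return rule, correct


rule, correct = one_attribute_rule(TRAINING)
print(rule)
print(f"({correct}/{len(TRAINING)} instances correct)")  # (10/14 instances correct)
```

The four training errors are the minority cases: two sunny days with play = yes and two rainy days with play = no.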

with Cross-folds


Classifier Model
outlook:
sunny -> no
overcast -> yes
rainy -> yes
(10/14 instances correct)


Time taken to build model: 0 seconds

=== Predictions on test data ===

    inst#     actual  predicted error prediction (ID,outlook)
        1       2:no      1:yes   +   1 (6,rainy)
        2      1:yes       2:no   +   1 (9,sunny)
        1       2:no      1:yes   +   1 (14,rainy)
        2      1:yes      1:yes       1 (13,overcast)
        1       2:no       2:no       1 (2,sunny)
        2      1:yes       2:no   +   1 (4,rainy)
        1       2:no       2:no       1 (8,sunny)
        2      1:yes       2:no   +   1 (12,overcast)
        1       2:no      1:yes   +   1 (1,sunny)
        1      1:yes       2:no   +   1 (11,sunny)
        1      1:yes      1:yes       1 (7,overcast)
        1      1:yes       2:no   +   1 (3,overcast)
        1      1:yes      1:yes       1 (10,rainy)
        1      1:yes      1:yes       1 (5,rainy)

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances           6               42.8571 %
Incorrectly Classified Instances         8               57.1429 %

I can see that instance 6 is incorrectly predicted as 'yes' by both models, 8 is correctly predicted by both, and 9 is incorrectly predicted by both. From what I see, it is instances 1, 3, 4 and 12 that are predicted differently by the two models.


Not sure what I am missing.


regards,


Manoj Agrawal


From: [hidden email] <[hidden email]> on behalf of Eibe Frank <[hidden email]>
Sent: Wednesday, February 15, 2017 5:53:07 PM
To: Weka machine learning workbench list.
Subject: Re: [Wekalist] help with basics
 
In the output of the predictions, you need to output the values of the first attribute as well, i.e., the values of the outlook attribute.

You can then check where the classifications obtained in the cross-validation are inconsistent with the model built from the full training set. As you can see in the output, this model is based on the outlook attribute.

Cheers,
Eibe


Re: help with basics

Eibe Frank-2
Administrator
1: ...
2: ...
...
6: 2      1:yes       2:no   +   1 (4,rainy)
...
8: 2      1:yes       2:no   +   1 (12,overcast)
...
9: 1       2:no      1:yes   +   1 (1,sunny)
...
12: 1      1:yes       2:no   +   1 (3,overcast)
...

Cheers,
Eibe

> On 16/02/2017, at 12:37 PM, Manoj Agrawal <[hidden email]> wrote:
>
> Hi Eibe,
>
> Thanks for the reply.
>
> I added ID and took two runs one using full training set and one with cross-folds (removing ID using filtered classifier in the run of course). These are the results I get
>
> I can see that instance 6 is incorrectly predicted as 'Yes' in both the models, 8 is correctly predicted in both and 9 is incorrectly predicted in both. From what I see, it is instance 1, 3, 4 and 12 which are differently predicted in both models.
>
> Not sure what am i missing.
>
> regards,
>
> Manoj Agrawal

Re: help with basics

Manoj Agrawal

Ah, it was that simple. I was trying to compare the actual instance numbers. Thanks Eibe!
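
The mix-up between instance IDs and output positions can be made explicit with a small lookup. This is a sketch for illustration only; the list of IDs is copied from the "(ID,outlook)" column of the cross-validation output earlier in this thread, and the variable names are made up.

```python
# ID attribute of each cross-validation prediction, in the order it is output.
IDS_IN_OUTPUT_ORDER = [6, 9, 14, 13, 2, 4, 8, 12, 1, 11, 7, 3, 10, 5]

# The quiz answer counts positions in the output, numbered 1 to 14.
ANSWER_POSITIONS = [6, 8, 9, 12]

# Translate each output position into the ID of the instance printed there.
ids = [IDS_IN_OUTPUT_ORDER[position - 1] for position in ANSWER_POSITIONS]

print(ids)          # [4, 12, 1, 3]
print(sorted(ids))  # [1, 3, 4, 12]
```

So the instances found earlier by ID (1, 3, 4 and 12) are the same four predictions as answer 6, 8, 9, 12; only the numbering differs.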


regards,


Manoj Agrawal


From: [hidden email] <[hidden email]> on behalf of Eibe Frank <[hidden email]>
Sent: Wednesday, February 15, 2017 7:20:56 PM
To: Weka machine learning workbench list.
Subject: Re: [Wekalist] help with basics
 
1: ...
2: ...
...
6: 2      1:yes       2:no   +   1 (4,rainy)
...
8: 2      1:yes       2:no   +   1 (12,overcast)
...
9: 1       2:no      1:yes   +   1 (1,sunny)
...
12: 1      1:yes       2:no   +   1 (3,overcast)
...

Cheers,
Eibe
