arguments to a classifier

Jenna
Hello. I used Auto-Weka. The results are shown below. The arguments are
[-E, -K, 14, -X, -I]
and the search arguments are
 [-B, -R]

If I look at lazy.IBk through the GUI I don't really see any place to put such arguments.
I see KNN, batchSize, crossValidate, etc.

1. Can someone explain how to use these arguments to build a model?
2. Can I confirm that the way to use the results of Auto-Weka is as follows: check what it finds to be the best model, take my training data (the same data I gave to Auto-Weka), build a model with that algorithm and the aforementioned arguments, save the model, then "load that model" [how do I load the model?], and finally choose "Supplied test set" from Test options and press "Start"?
Thank you.

=== Run information ===

Scheme:       weka.classifiers.meta.CostSensitiveClassifier -cost-matrix "[0.0 1.0; 14.0 0.0]" -S 1 -W weka.classifiers.meta.AutoWEKAClassifier -- -seed 123 -timeLimit 15 -memLimit 1024 -nBestConfigs 1
Relation:     R_data_frame-weka.filters.unsupervised.attribute.StringToNominal-Rfirst-last-weka.filters.unsupervised.attribute.Remove-R698-701-weka.filters.unsupervised.attribute.Remove-V-R4,22,33,40,114,698
Instances:    10683
Attributes:   6
              *
              *
              *
              *
              *

Test mode:    evaluate on training data

=== Classifier model (full training set) ===

CostSensitiveClassifier using reweighted training instances

weka.classifiers.meta.AutoWEKAClassifier -seed 123 -timeLimit 15 -memLimit 1024 -nBestConfigs 1

Classifier Model
best classifier: weka.classifiers.lazy.IBk
arguments: [-E, -K, 14, -X, -I]
attribute search: weka.attributeSelection.GreedyStepwise
attribute search arguments: [-B, -R]
attribute evaluation: weka.attributeSelection.CfsSubsetEval
attribute evaluation arguments: []

Re: arguments to a classifier

Eibe Frank-2
Administrator
Those expressions in square brackets are command-line option specifications. You can use them in WEKA’s graphical user interfaces by right-clicking on the scheme specification string stated in the user interface (e.g., given to the right of the “Choose” button in the Classify panel in the Explorer) and selecting “Enter configuration…”. Delete the commas and square brackets from Auto-WEKA’s output and prepend the learning scheme's name. In your case, ignoring the attribute selection part, you’d use the configuration string

  IBk -E -K 14 -X -I

You did not say which attribute selection search method Auto-WEKA used and which evaluator and evaluator arguments it used. This information should be included in Auto-WEKA’s output. If you have all that info, you can configure an AttributeSelectedClassifier by entering these command-line options, prepending the corresponding scheme names, into the appropriate scheme specification fields in the GenericObjectEditor for the AttributeSelectedClassifier.
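
If it helps, the same options can also be set programmatically. Here is a minimal sketch using WEKA's Java API (the class name ConfigureIBk is just a placeholder; the option meanings are taken from IBk's documentation):

  import weka.classifiers.lazy.IBk;
  import weka.core.Utils;

  public class ConfigureIBk {
      public static void main(String[] args) throws Exception {
          // -K 14: use 14 nearest neighbours
          // -X:    select the number of neighbours by hold-one-out cross-validation
          // -I:    weight neighbours by the inverse of their distance
          // -E:    minimise mean squared (rather than absolute) error when cross-validating
          IBk knn = new IBk();
          knn.setOptions(Utils.splitOptions("-E -K 14 -X -I"));
          System.out.println(Utils.joinOptions(knn.getOptions()));
      }
  }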

Cheers,
Eibe


Re: arguments to a classifier

Jenna
Thank you. Below is another example. 

1. In this case, are you saying that I have to enter arguments into the fields under the Select attributes tab (the tab to the right of the Associate tab and to the left of the Visualize tab)? When I have used Weka before, I thought that each tab was independent, but if I also have to set the Select attributes tab options then they obviously are not.
2. Do I have to enter the estimated error anywhere?
3. The learning method is weka.classifiers.rules.PART, so for the Classifier configuration I should enter 
PART -M 1
and I don't need the weka.classifiers.rules prefix. Is this correct?



Auto-WEKA result:
best classifier: weka.classifiers.rules.PART
arguments: [-M, 1]
attribute search: weka.attributeSelection.GreedyStepwise
attribute search arguments: [-C, -B, -R]
attribute evaluation: weka.attributeSelection.CfsSubsetEval
attribute evaluation arguments: [-M, -L]
estimated error: 5.805069154347308


Correctly Classified Instances      652786               94.1951 %
Incorrectly Classified Instances     40229                5.8049 %
Kappa statistic                          0.8739
Mean absolute error                      0.0821
Root mean squared error                  0.2026
Relative absolute error                 17.9999 %
Root relative squared error             42.4263 %
Total Number of Instances           693015     

=== Confusion Matrix ===

      a      b   <-- classified as
 423976  25071 |      a = NO
  15158 228810 |      b = YES

=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.944    0.062    0.965      0.944    0.955      0.874    0.987     0.993     NO
                 0.938    0.056    0.901      0.938    0.919      0.874    0.987     0.975     YES
Weighted Avg.    0.942    0.060    0.943      0.942    0.942      0.874    0.987     0.987     


For better performance, try giving Auto-WEKA more time.

Re: arguments to a classifier

Eibe Frank-2
Administrator
Go to the Classify tab, and “Choose” the “AttributeSelectedClassifier” from the “meta” package. Then

a) Right-click on the field specifying the “classifier” configuration and select “Enter configuration…”. Paste

  PART -M 1

into the text box and press “OK”.

b) Right-click on the field specifying the “evaluator” configuration and select “Enter configuration…”. Paste

  CfsSubsetEval -M -L

into the text box and press “OK”.

c) Right-click on the field specifying the “search” configuration and select “Enter configuration…”. Paste

  weka.attributeSelection.GreedyStepwise -B -C -R

into the text box and press “OK”.
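
For reference, here is a minimal Java sketch of the equivalent programmatic setup (class and file names are placeholders; it uses the standard WEKA API for AttributeSelectedClassifier, PART, CfsSubsetEval and GreedyStepwise):

  import weka.attributeSelection.CfsSubsetEval;
  import weka.attributeSelection.GreedyStepwise;
  import weka.classifiers.meta.AttributeSelectedClassifier;
  import weka.classifiers.rules.PART;
  import weka.core.Instances;
  import weka.core.Utils;
  import weka.core.converters.ConverterUtils.DataSource;

  public class BuildFromAutoWekaConfig {
      public static void main(String[] args) throws Exception {
          // "training.arff" is a placeholder for your training file.
          Instances data = DataSource.read("training.arff");
          data.setClassIndex(data.numAttributes() - 1);

          // a) Base classifier found by Auto-WEKA: PART -M 1
          PART part = new PART();
          part.setOptions(Utils.splitOptions("-M 1"));

          // b) Attribute evaluator: CfsSubsetEval -M -L
          CfsSubsetEval eval = new CfsSubsetEval();
          eval.setOptions(Utils.splitOptions("-M -L"));

          // c) Search method: GreedyStepwise -B -C -R
          GreedyStepwise search = new GreedyStepwise();
          search.setOptions(Utils.splitOptions("-B -C -R"));

          AttributeSelectedClassifier asc = new AttributeSelectedClassifier();
          asc.setClassifier(part);
          asc.setEvaluator(eval);
          asc.setSearch(search);

          asc.buildClassifier(data);
          System.out.println(asc);
      }
  }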

Yes, the attribute selection tab is independent. Note that it’s normally best to use the AttributeSelectedClassifier instead, which will ensure that attributes are selected based on the information in the *training* data only. This is important to prevent optimistic performance estimates.

You don’t have to enter the estimated error anywhere.

Cheers,
Eibe

Re: arguments to a classifier

Jenna
In reply to this post by Eibe Frank-2
Hello.
I tried the arguments as indicated.

weka.classifiers.rules.PART -M 1 -C 0.25 -Q 1
weka.attributeSelection.CfsSubsetEval -M -L -P 1 -E 1
weka.attributeSelection.GreedyStepwise -B -C -R -T -1.7976931348623157E308 -N -1 -num-slots 1


Using the same data set with cross-validation (set to 10 folds), I get a lower ROC Area than Auto-Weka reported. Auto-Weka reports
ROC Area = 0.987
but when I run the specified classifier and configuration it shows
ROC Area = 0.772


Here are the results:
=== Run information ===

Scheme:       weka.classifiers.rules.PART -M 1 -C 0.25 -Q 1
Relation:     R_data_frame-weka.filters.unsupervised.attribute.StringToNominal-Rfirst-last-weka.filters.unsupervised.attribute.Remove-R698-701,703-weka.filters.unsupervised.attribute.StringToNominal-Rfirst-last-weka.filters.unsupervised.attribute.Remove-V-R4,22,33,38,59,64,90,111,698
Instances:    19002
Attributes:   9

Number of Rules  : 642


Time taken to build model: 1.8 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances       18518               97.4529 %
Incorrectly Classified Instances       484                2.5471 %
Kappa statistic                          0.3503
Mean absolute error                      0.0314
Root mean squared error                  0.1495
Relative absolute error                 72.9409 %
Root relative squared error            101.9429 %
Total Number of Instances            19002    

=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.989    0.670    0.985      0.989    0.987      0.352    0.772     0.989     NO
                 0.330    0.011    0.404      0.330    0.363      0.352    0.772     0.242     YES
Weighted Avg.    0.975    0.655    0.972      0.975    0.973      0.352    0.772     0.973    

=== Confusion Matrix ===

     a     b   <-- classified as
 18380   204 |     a = NO
   280   138 |     b = YES



I also tried Auto-Weka again running for a longer period of time. I

Re: arguments to a classifier

Jenna
I apologize. I applied SMOTE to the data first, and now I get results similar to Auto-Weka's.
ROC Area is now 0.982.

Does a model built using SMOTE accurately work on new data that does not have SMOTE applied to it?

Re: arguments to a classifier

Eibe Frank-3
SMOTE should be applied in conjunction with the FilteredClassifier so that *only the training data is modified* and the test data is left unchanged.
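
A minimal sketch of that setup in Java (it assumes the SMOTE package is installed; PART is used here only because it is the classifier from earlier in the thread):

  import weka.classifiers.meta.FilteredClassifier;
  import weka.classifiers.rules.PART;
  import weka.filters.supervised.instance.SMOTE;

  public class SmoteInsideCrossValidation {
      public static void main(String[] args) throws Exception {
          // The filter is applied inside the FilteredClassifier, so during
          // cross-validation only the training folds are oversampled;
          // the test folds (and any new data) are left unchanged.
          FilteredClassifier fc = new FilteredClassifier();
          fc.setFilter(new SMOTE());
          fc.setClassifier(new PART());
          // fc can now be evaluated with cross-validation, or built on the full
          // training set and applied to new, un-SMOTEd data.
      }
  }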

Cheers,
Eibe
