Grid and random search do not perform well

Grid and random search do not perform well

asadbtk
Hello

I have used MultiSearch and tried grid and random searches for hyperparameter tuning of several ML models, but in all experiments both performed exactly the same as the default parameters. How is this possible? Am I doing it the wrong way? I just select MultiSearch and then select either random search or the default search, which is like a grid search.

Regards 

_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Grid and random search do not perform well

Eibe Frank
Most likely, the specification of the option (i.e., parameter) name is incorrect. Unfortunately, MultiSearch currently does not pop up a dialog when the user tries to optimise a non-existent parameter. It simply skips that parameter and proceeds with evaluation. However, if you run WEKA with a console window open, you will see error messages if this happens.

As an example, here is a configuration for optimising the -M (i.e., minNumObj) and -C (i.e., confidenceFactor) parameters of J48 using the default search method:

weka.classifiers.meta.MultiSearch -E ACC -search "weka.core.setupgenerator.MathParameter -property minNumObj -min 1.0 -max 10.0 -step 1.0 -base 10.0 -expression I" -search "weka.core.setupgenerator.MathParameter -property confidenceFactor -min -10.0 -max -1.0 -step 1.0 -base 2.0 -expression pow(BASE,I)" -class-label 1 -algorithm "weka.classifiers.meta.multisearch.DefaultSearch -sample-size 100.0 -initial-folds 2 -subsequent-folds 10 -initial-test-set . -subsequent-test-set . -num-slots 1" -log-file "C:\\Program Files\\Weka-3-8-4" -S 1 -W weka.classifiers.trees.J48 -- -C 0.5 -M 10

Running this on the diabetes data that comes with WEKA, you will see that the result is different from standard J48 (the tree is *much* smaller, with similar accuracy). Switching to random search gives a similar but not identical result.
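
To make the search space in the configuration above concrete, here is a small sketch that enumerates the values the two MathParameter entries would cover. It assumes that the expression is evaluated once for each I from -min to -max in increments of -step, with BASE set to -base. The class and method names are made up for illustration and are not part of WEKA or the MultiSearch package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

// Illustrative only: mimics the assumed MathParameter semantics, not WEKA code.
class MathParameterGrid {

    // Evaluate expr at I = min, min + step, ..., max and collect the results.
    static List<Double> values(double min, double max, double step,
                               DoubleUnaryOperator expr) {
        List<Double> out = new ArrayList<>();
        int steps = (int) Math.round((max - min) / step);
        for (int k = 0; k <= steps; k++) {
            out.add(expr.applyAsDouble(min + k * step));
        }
        return out;
    }

    public static void main(String[] args) {
        // -property minNumObj -min 1 -max 10 -step 1 -expression I
        List<Double> minNumObj = values(1.0, 10.0, 1.0, i -> i);
        // -property confidenceFactor -min -10 -max -1 -step 1 -base 2
        // -expression pow(BASE,I)
        List<Double> confidenceFactor =
            values(-10.0, -1.0, 1.0, i -> Math.pow(2.0, i));

        System.out.println("minNumObj:        " + minNumObj);
        System.out.println("confidenceFactor: " + confidenceFactor);
        System.out.println("combinations:     "
            + minNumObj.size() * confidenceFactor.size());
    }
}
```

Under that assumption, each parameter takes ten values (minNumObj from 1 to 10, confidenceFactor from 2^-10 to 2^-1 = 0.5), so the grid has 100 combinations.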

Cheers,
Eibe

In reply to javed khan's message of Tue, Dec 31, 2019, 1:07 PM.

Re: Grid and random search do not perform well

asadbtk
Hello Eibe, 

I changed the maximum and minimum values in the MathParameter settings, but it still did not increase the accuracy.

In R, the parameters of SVR are cost and sigma, and RF has the parameter mtry. What are the equivalents of these parameters in WEKA for RandomForest and SMOreg? I didn't understand what is provided in the WEKA Explorer.

Best regards 

In reply to Eibe Frank's message of Tuesday, December 31, 2019.

Re: Grid and random search do not perform well

Eibe Frank-3
Use

  numFeatures

in RandomForest instead of mtry. In SMOreg, the

  c

parameter is the "cost" parameter.

To use and tune an RBF kernel, first set the "kernel" parameter in SMOreg to "RBFKernel". Then, in MultiSearch (or GridSearch), use

  kernel.gamma

as the parameter name. Gamma is a "nested" parameter, and we need to tell MultiSearch that this parameter is nested inside the kernel parameter.
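
As a rough illustration of what "nested" means here, the sketch below resolves a dotted name such as kernel.gamma by calling the getter for each leading part and then the setter for the last part. ToyKernel, ToyLearner and setPath are hypothetical stand-ins written for this example. They are not WEKA classes, and this is not how MultiSearch is actually implemented.

```java
import java.lang.reflect.Method;

// Illustrative only: toy classes, not WEKA's SMOreg or RBFKernel.
class NestedPropertyDemo {

    public static class ToyKernel {
        private double gamma = 0.01;
        public double getGamma() { return gamma; }
        public void setGamma(double g) { gamma = g; }
    }

    public static class ToyLearner {
        private final ToyKernel kernel = new ToyKernel();
        private double c = 1.0;
        public ToyKernel getKernel() { return kernel; }
        public double getC() { return c; }
        public void setC(double v) { c = v; }
    }

    static String capitalise(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Follow getters for every part but the last, then call the final setter.
    static void setPath(Object root, String path, double value) {
        String[] parts = path.split("\\.");
        try {
            Object target = root;
            for (int i = 0; i < parts.length - 1; i++) {
                Method getter =
                    target.getClass().getMethod("get" + capitalise(parts[i]));
                target = getter.invoke(target);
            }
            Method setter = target.getClass().getMethod(
                "set" + capitalise(parts[parts.length - 1]), double.class);
            setter.invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No such property: " + path, e);
        }
    }

    public static void main(String[] args) {
        ToyLearner learner = new ToyLearner();
        setPath(learner, "c", 10.0);            // top-level property
        setPath(learner, "kernel.gamma", 0.5);  // nested inside the kernel
        System.out.println("c = " + learner.getC()
            + ", gamma = " + learner.getKernel().getGamma());
        try {
            setPath(learner, "gamma", 0.5);     // wrong: gamma is not top-level
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The toy version throws when the name does not resolve. A mis-specified name such as plain gamma is therefore reported immediately here.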

Cheers,
Eibe

In reply to javed khan's message of Wed, Jan 1, 2020, 8:31 AM.

Re: Grid and random search do not perform well

asadbtk
Thanks a lot Eibe. 

In reply to Eibe Frank's message of Wednesday, January 1, 2020.

Re: Grid and random search do not perform well

asadbtk
In reply to this post by Eibe Frank-3
Hello Eibe 

You mentioned that mtry is represented by numFeatures in WEKA. The arguments that Auto-WEKA generates for RF are: -I 10, -K 0, -depth 0

Which of them represents mtry? Moreover, its default value is 0, which is strange because mtry is usually set to the number of columns in the dataset.

Best regards 

Re: Grid and random search do not perform well

Eibe Frank
The numFeatures parameter in the GUI corresponds to the -K command-line parameter. You can verify this by changing the value of numFeatures in the GUI.

The value 0 forces RandomForest to use its built-in heuristic to compute the number of attributes/features. The relevant code in RandomTree is

m_KValue = (int) Utils.log2(data.numAttributes() - 1) + 1;

Note that data.numAttributes() - 1 corresponds to the number of predictor attributes in the data.
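
Here is a small standalone sketch of that heuristic, for anyone who wants to see what -K 0 works out to for a given number of attributes. It assumes that Utils.log2 is a base-2 logarithm computed from natural logs. The class and method names are made up for this example.

```java
// Illustrative only: reproduces the quoted formula outside of WEKA.
class DefaultKHeuristic {

    // Base-2 logarithm; assumed to match what Utils.log2 returns.
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // numAttributes counts all attributes, including the class attribute.
    static int defaultK(int numAttributes) {
        return (int) log2(numAttributes - 1) + 1;
    }

    public static void main(String[] args) {
        for (int numAttributes : new int[] {11, 21, 101}) {
            System.out.println((numAttributes - 1) + " predictors -> K = "
                + defaultK(numAttributes));
        }
    }
}
```

With 10, 20 and 100 predictor attributes this gives K = 4, 5 and 7 respectively, so the default grows only logarithmically with the number of predictors.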

Cheers,
Eibe

In reply to javed khan's message of Mon, Jan 6, 2020, 6:37 AM.