Merit of CFS subset selector

Merit of CFS subset selector

Philipp Marx
Hi,

I have a question regarding the merit of the CFS subset selector. I know that it prints a merit between 0 and 1 for each subset, based on correlation, but is there more information about what the merit means? I.e., what is a "good" value?

The same question applies to InfoGain attribute selection: is there more information about the range and the meaning of the output?

Thanks :)

Cheers,
Philipp

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Merit of CFS subset selector

G. MULLER - Work
Hi

In the docs for this "selector", I see this document as the reference:

M. A. Hall (1998). Correlation-based Feature Subset Selection for
Machine Learning. Hamilton, New Zealand.

Doesn't it mention the merit?

Cheers

GM

Re: Merit of CFS subset selector

Philipp Marx
In reply to this post by Philipp Marx
Thanks, I have already seen the thesis, and it does indeed mention the merit, but unfortunately it doesn't state what a "good" merit is.

The only relevant information I found was a "general article about merits in statistics", which basically states that a merit is "good" if it is > 0.8. But in Weka, merit is used with many different scales, and the JavaDoc states that it is up to the evaluator to define "merit". So I wonder if there is more information for CFS and InfoGain.


Cheers,
Philipp
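
For reference, the merit heuristic given in Hall's thesis for a subset S of k features is Merit_S = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation. Below is a minimal Java sketch of that formula only. The class and method names are hypothetical, and this is not Weka's own code, which may differ in detail (for example in how the correlations are measured).

```java
public class CfsMeritSketch {

    /**
     * Merit of a subset of k features, following the heuristic in Hall's thesis.
     * meanFeatureClassCorr is the average feature-class correlation (r_cf),
     * meanFeatureFeatureCorr is the average feature-feature correlation (r_ff).
     * Hypothetical helper for illustration; not part of the Weka API.
     */
    public static double merit(int k, double meanFeatureClassCorr, double meanFeatureFeatureCorr) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        double numerator = k * meanFeatureClassCorr;
        double denominator = Math.sqrt(k + (double) k * (k - 1) * meanFeatureFeatureCorr);
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Four features, each moderately correlated with the class, none with each other.
        System.out.println(merit(4, 0.5, 0.0)); // prints 1.0
        // Same four features, but fully redundant with each other.
        System.out.println(merit(4, 0.5, 1.0)); // prints 0.5
    }
}
```

With fully redundant features the merit falls back to the single-feature value, so the number reflects a trade-off between relevance and redundancy rather than a quantity with a fixed "good" threshold.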



Re: Merit of CFS subset selector

Eibe Frank-2
Administrator
As this merit is estimated on the training data only, it is not a particularly trustworthy measure anyway (e.g., it does not take the size of the training set into account). To get a more meaningful estimate of the worth of the output generated by an attribute selection scheme, I would run a cross-validation with a random forest (or a similarly powerful classifier), using AttributeSelectedClassifier in conjunction with the chosen attribute selection scheme, and look at the classification accuracy obtained (or AUROC, etc.).

Cheers,
Eibe


Re: Merit of CFS subset selector

pikunimohanty@gmail.com
This post has NOT been accepted by the mailing list yet.
Hi

I am trying to display the "Merit of best subset found" using my own code. The toResultsString() method displays everything, but I only need the merit. Please find the code below.

    import weka.attributeSelection.AttributeSelection;
    import weka.attributeSelection.CfsSubsetEval;
    import weka.attributeSelection.GreedyStepwise;
    import weka.core.Utils;

    // "data" is assumed to be an Instances object that is already loaded, with its class index set.
    System.out.println("\n3. Low-level");
    AttributeSelection attsel = new AttributeSelection();
    CfsSubsetEval eval = new CfsSubsetEval();
    GreedyStepwise search = new GreedyStepwise();
    search.setSearchBackwards(true);   // backward elimination
    attsel.setEvaluator(eval);
    attsel.setSearch(search);
    attsel.SelectAttributes(data);     // run the selection
    int[] indices = attsel.selectedAttributes();
    System.out.println("selected attribute indices (starting with 0):\n" + Utils.arrayToString(indices));
    // Prints the full report, including the merit line.
    System.out.println("Results =  " + attsel.toResultsString());

Waiting for your reply
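
One possible workaround is to pull the number out of the text that toResultsString() returns. The sketch below assumes the report contains a line of the form "Merit of best subset found: <number>", as the wording in the post suggests. The class MeritParser and its method are hypothetical helpers, not part of Weka. It may also be worth checking the Javadoc for CfsSubsetEval's evaluateSubset(BitSet) as a more direct route.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MeritParser {

    // Assumed report wording; adjust the pattern if the actual output differs.
    private static final Pattern MERIT_LINE =
            Pattern.compile("Merit of best subset found:\\s*([-+]?[0-9]*\\.?[0-9]+)");

    /** Returns the merit found in the report text, or empty if no such line exists. */
    public static OptionalDouble parseMerit(String resultsText) {
        Matcher m = MERIT_LINE.matcher(resultsText);
        if (m.find()) {
            return OptionalDouble.of(Double.parseDouble(m.group(1)));
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // Stand-in text; in the post's code this would be attsel.toResultsString().
        String report = "Search Method:\n\tMerit of best subset found:    0.887\n";
        parseMerit(report).ifPresent(merit -> System.out.println("Merit = " + merit));
    }
}
```

Parsing the report is brittle if the output format changes or a locale-specific decimal separator is used, so treat it as a stopgap.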