
Suggesting Attributes for J48 Classification


Suggesting Attributes for J48 Classification

Cerin
I've trained a J48 decision tree classifier on a large corpus that
contains several thousand attributes. For my domain, I want to
classify a sample submitted by a user. However, this sample is being
submitted in real-time, with the user manually entering the value for
each attribute (subject to some minor validation via a web form).
Therefore, to prevent the user from going insane, I only want to
require that they enter the top N attributes most likely to reduce
entropy enough to make an accurate classification. Is this possible to
do with J48? Given a sparse sample, how would I query the next best
attribute from a J48 model? I've written my own basic decision tree
algorithm that can calculate the next attribute to ask the user after
each submission, but I'd like to make use of Weka's more mature code.

Regards,
Chris

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Suggesting Attributes for J48 Classification

mhall
Administrator
On 9/30/10 8:12 AM, Chris Spencer wrote:

> I've trained a J48 decision tree classifier on a large corpus that
> contains several thousand attributes. For my domain, I want to
> classify a sample submitted by a user. However, this sample is being
> submitted in real-time, with the user manually entering the value for
> each attribute (subject to some minor validation via a web form).
> Therefore, to prevent the user from going insane, I only want to
> require that they enter the top N attributes most likely to reduce
> entropy enough to make an accurate classification. Is this possible to
> do with J48? Given a sparse sample, how would I query the next best
> attribute from a J48 model? I've written my own basic decision tree
> algorithm that can calculate the next attribute to ask the user after
> each submission, but I'd like to make use of Weka's more mature code.

Can't you just rank your attributes according to information gain/gain
ratio with respect to the class, choose the top N (where N is not large
enough to drive users insane) and then just build your J48 model using
these N attributes?

Cheers,
Mark.
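
For reference, a minimal sketch of this approach with Weka's Java API: rank the
attributes by information gain with respect to the class, keep the top N, then
train J48 on the reduced set. The ARFF file name and N = 20 are placeholders
rather than values from the thread:

import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.supervised.attribute.AttributeSelection;

public class TopNThenJ48 {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("corpus.arff");   // placeholder file name
        data.setClassIndex(data.numAttributes() - 1);

        // Rank every attribute by information gain with respect to the class
        // and keep only the N highest-ranked ones.
        AttributeSelection select = new AttributeSelection();
        Ranker ranker = new Ranker();
        ranker.setNumToSelect(20);                          // N = 20, an arbitrary choice
        select.setEvaluator(new InfoGainAttributeEval());
        select.setSearch(ranker);
        select.setInputFormat(data);
        Instances reduced = Filter.useFilter(data, select);

        // Train J48 on just those N attributes (plus the class).
        J48 tree = new J48();
        tree.buildClassifier(reduced);
        System.out.println(tree);
    }
}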


Re: Suggesting Attributes for J48 Classification

Cerin
On Thu, Sep 30, 2010 at 4:03 AM, Mark Hall <[hidden email]> wrote:

> Can't you just rank your attributes according to information gain/gain ratio
> with respect to the class, choose the top N (where N is not large enough to
> drive users insane) and then just build your J48 model using these N
> attributes?
>
> Cheers,
> Mark.

I see your point. This would certainly be an easy solution. However, a
hard-coded selection of attributes would never be comprehensive enough
for my usage. My data set covers a wide range of topics that would be
difficult to whittle down to a handful of attributes. For example,
think of it like a biological taxonomy. If 90% of my attributes
describe plants (e.g. leaf shape, bark color, seasonality, etc), and I
pick the top N attributes, these would likely fail to classify the samples
covered by the 5% of attributes that describe mammals (e.g. diet, fur color, etc.).
That's why I'd prefer to do this dynamically, so I don't lose any
accuracy. Also, in cases where the user doesn't mind entering
additional attributes, I want to be able to dynamically handle them,
which I wouldn't be able to do if I train it on a fixed subset of
attributes.

Is this technically difficult to do with the current J48
implementation? I realize it's not how decision tree classifiers are
typically used, but it's analogous to manually walking the decision
tree.

Regards,
Chris


Re: Suggesting Attributes for J48 Classification

mhall
Administrator
On 1/10/10 2:03 AM, Chris Spencer wrote:

> I see your point. This would certainly be an easy solution. However, a
> hard-coded selection of attributes would never be comprehensive enough
> for my usage. My data set covers a wide range of topics that would be
> difficult to whittle down to a handful of attributes. For example,
> think of it like a biological taxonomy. If 90% of my attributes
> describe plants (e.g. leaf shape, bark color, seasonality, etc), and I
> pick the top N attributes, these would likely fail to classify the samples
> covered by the 5% of attributes that describe mammals (e.g. diet, fur color, etc.).
> That's why I'd prefer to do this dynamically, so I don't lose any
> accuracy. Also, in cases where the user doesn't mind entering
> additional attributes, I want to be able to dynamically handle them,
> which I wouldn't be able to do if I train it on a fixed subset of
> attributes.

If you have a highly skewed class distribution, then performance on the
minority classes is likely to be poor with a decision tree.

>
> Is this technically difficult to do with the current J48
> implementation? I realize it's not how decision tree classifiers are
> typically used, but it's analogous to manually walking the decision
> tree.

To classify a new example using a tree learned by J48, you have to
traverse the tree from the root to a leaf. The set of attributes tested
on this path is fixed - you can't change it dynamically.

If you are going to use trees, I'd suggest building a set of trees - one
for each class (i.e. one-against-the-rest) - and balancing the class
distribution for each via under/over sampling. Next, decide how many
attributes in total the user could handle entering values for. Build
decision trees to various depths for each class (using REPTree, since it
has a parameter to control maximum depth) until the union of all
attributes tested across all trees is approximately equal in size to the
maximum number of values that a user will be asked to enter.

At testing time, ask the user to enter a value for the attribute that is
tested at the root of each tree. When constructing the test instance to
pump through each tree, set the value entered for the attribute that
corresponds to the test at the root of that particular tree and set all
other values to missing. Get a predicted class distribution from each
tree. The highest probability from each tree gives an overall likelihood
of the respective class at this point in the process. You can then repeat
the process by asking the user to enter values for the attributes tested
at the second level of each tree, and so on. At some point, it may become
clear that one class is much more likely than any of the others. You
could set a threshold on this to terminate the process of querying the
user for more values and generate a final prediction.

Cheers,
Mark.
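
A rough sketch of the one-against-the-rest part of this recipe, assuming
Weka's MakeIndicator and SpreadSubsample filters handle the class recoding
and the balancing; the class and method names are made up for illustration,
and the maximum depth is left as a parameter to be tuned as described above:

import weka.classifiers.trees.REPTree;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.supervised.instance.SpreadSubsample;
import weka.filters.unsupervised.attribute.MakeIndicator;

public class PerClassTrees {
    // One depth-limited REPTree per class value, each trained on a balanced
    // "this class vs. the rest" version of the data.
    public static REPTree[] build(Instances data, int maxDepth) throws Exception {
        REPTree[] trees = new REPTree[data.classAttribute().numValues()];
        for (int c = 0; c < trees.length; c++) {
            // Recode the class attribute as "value c" vs. "everything else".
            MakeIndicator oneVsRest = new MakeIndicator();
            oneVsRest.setAttributeIndex("" + (data.classIndex() + 1));
            oneVsRest.setValueIndices("" + (c + 1));        // 1-based value index
            oneVsRest.setNumeric(false);
            oneVsRest.setInputFormat(data);
            Instances binary = Filter.useFilter(data, oneVsRest);

            // Undersample so the two class values occur equally often.
            SpreadSubsample balance = new SpreadSubsample();
            balance.setDistributionSpread(1.0);
            balance.setInputFormat(binary);
            Instances balanced = Filter.useFilter(binary, balance);

            REPTree tree = new REPTree();
            tree.setMaxDepth(maxDepth);   // limit depth so only a few attributes get tested
            tree.buildClassifier(balanced);
            trees[c] = tree;
        }
        return trees;
    }
}

At query time, calling distributionForInstance() on each tree with a
mostly-missing test instance would give the per-class likelihoods described
above.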


Re: Suggesting Attributes for J48 Classification

Cerin
On Thu, Sep 30, 2010 at 5:06 PM, Mark Hall <[hidden email]> wrote:

> To classify a new example using a tree learned by J48, you have to traverse
> the tree from the root to a leaf. The set of attributes tested on this path
> is fixed - you can't change it dynamically.
>
> If you are going to use trees, I'd suggest building a set of trees - one for
> each class (i.e. one-against-the-rest) and balance the class distribution
> for each via under/over sampling. Next, decide on how many attributes in
> total that the user could handle entering values for. Build decision trees
> to various depths for each class (using REPTree since it has a parameter to
> control max depth) until the union of all attributes tested in all trees is
> approximately of size equal to the maximum number of values that a user will
> be asked to enter. At testing time, ask the user to enter a value for the
> attribute that is tested at the root of each tree. When constructing the
> test instance to pump through each tree, set the value entered for the
> attribute that corresponds to the test at the root of that particular tree
> and then set all other values to missing. Get a predicted class distribution
> from each tree. The highest probability from each tree will give an overall
> likelihood of the respective class for this point in the process. You can
> then repeat this process by asking the user to enter values for those
> attributes that are tested at the second level of each tree, and so on. At
> some point, it might become clear that one class is much more likely than
> any of the others. You could set a threshold on this which could be used to
> terminate the process of querying the user for more values and then generate
> a final prediction.

I apologize, because I don't think I'm explaining myself clearly. I
don't really understand your solution, but it seems immensely
overcomplicated for what I'm trying to achieve. All I want to do is
find the root node of the decision tree, ask the user for the value of
the attribute tested there, then, based on that answer, find the next
node and ask them for its value, repeating this process until a
classification occurs. At first glance, this task seems trivial, but
it appears J48 isn't explicitly designed to do this, so I'll probably
have to hack the code to allow me to query the order of decision tree
nodes. I see there's a command line option to dump the tree as some
sort of XML graph, so I may be able to find the data I need there.

Thanks for your help.

Regards,
Chris
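
A sketch of that interactive walk over a parsed tree. Nothing below is Weka
API: Node, its fields, and the branch labels are hypothetical stand-ins for
whatever structure gets built from the dumped tree:

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

class Node {
    String attribute;                               // attribute tested here; null at a leaf
    String predictedClass;                          // class label, meaningful only at a leaf
    Map<String, Node> children = new HashMap<>();   // branch label (e.g. "<= 1") -> subtree
}

class InteractiveWalk {
    // Ask the user only for the attributes on the path actually taken.
    static String classify(Node root, Scanner in) {
        Node node = root;
        while (node.attribute != null) {
            System.out.println("Value for '" + node.attribute + "'? Options: " + node.children.keySet());
            Node next = node.children.get(in.nextLine().trim());
            if (next == null) continue;             // unrecognised answer: ask again
            node = next;
        }
        return node.predictedClass;
    }
}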


Re: Suggesting Attributes for J48 Classification

Vishal Belsare-2
Chris,

Would the toString() or dumpTree() methods work for you? AFAIK they work
for J48. Look at this:

http://www.lucsorel.com/index.php?page=downloads#wekatext2xml

Having a parsable XML will help you 'walk' the decision tree.


Best,
Vishal Belsare





--
We agnostics often envy the True Believer, who thus acquires so easily
that sense of security which is forever denied to us.
~ E. T. Jaynes
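
A small sketch of getting both text forms out of a trained J48 model so an
external parser (such as the wekatext2xml tool linked above) can work from
them; the ARFF and output file names are placeholders:

import java.io.FileWriter;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class DumpJ48Tree {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("test.arff");
        data.setClassIndex(data.numAttributes() - 1);

        J48 tree = new J48();
        tree.setUnpruned(true);          // the same as the -U command-line option
        tree.buildClassifier(data);

        // The indented text form of the tree, as shown in Weka's training output.
        try (FileWriter txt = new FileWriter("j48-tree.txt")) {
            txt.write(tree.toString());
        }
        // The GraphViz source from the Drawable interface, another walkable form.
        try (FileWriter dot = new FileWriter("j48-tree.dot")) {
            dot.write(tree.graph());
        }
    }
}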


Re: Suggesting Attributes for J48 Classification

Cerin
On Sat, Oct 2, 2010 at 7:30 PM, Vishal Belsare <[hidden email]> wrote:
> Chris,
>
> Would the toString() or dumpTree() methods work for you? AFAIK they work
> for J48. Look at this:
>
> http://www.lucsorel.com/index.php?page=downloads#wekatext2xml
>
> Having a parsable XML will help you 'walk' the decision tree.

Thanks. But how do you save a J48 tree as "text-syntax"? The default
seems to be binary, which that project doesn't seem to support. I see
something like that syntax included in the training output when the -U
option is given, but it doesn't look like a complete tree. To use my
earlier example data set:

@relation test

%@attribute a numeric
@attribute a {1,2,3,4,5,6,7,8,9}
@attribute b numeric
@attribute c numeric
@attribute class {BOB,SUE,JON,LUE,MOE,VAL,JIM,ZOE,XYZ}

@data
{0 1, 1 1, 2 1, 3 BOB}
{0 2, 1 1, 2 1, 3 SUE}
{0 3, 1 2, 2 1, 3 JON}
{0 4, 1 2, 2 1, 3 LUE}
{0 5, 1 3, 2 2, 3 MOE}
{0 6, 1 3, 2 2, 3 VAL}
{0 7, 1 4, 2 2, 3 JIM}
{0 8, 1 4, 2 2, 3 ZOE}
{0 9, 1 5, 2 3, 3 XYZ}

I'm seeing the tree output:

c <= 1
|   b <= 1: BOB (2.0/1.0)
|   b > 1: JON (2.0/1.0)
c > 1
|   b <= 3: MOE (2.0/1.0)
|   b > 3: JIM (3.0/2.0)

Why is J48 ignoring the "a" attribute, reducing accuracy by 50%? How
do I fix this to include all attributes? Is there an easier way to get
this representation than having to search for it in the output from
training?

Regards,
Chris


Re: Suggesting Attributes for J48 Classification

Winton Davies-3
Because J48 enforces a minimum number of instances per leaf (the -M
option); in this case, the default is 2 :)

W
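
A minimal sketch of relaxing that minimum through Weka's Java API, equivalent
to passing -M 1 on the command line (together with -U, which the example above
was already using); the ARFF file name is a placeholder:

import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class FullerTree {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("test.arff");
        data.setClassIndex(data.numAttributes() - 1);

        J48 tree = new J48();
        tree.setMinNumObj(1);      // -M 1: allow leaves that cover a single instance
        tree.setUnpruned(true);    // -U: skip pruning so small branches are kept
        tree.buildClassifier(data);
        System.out.println(tree);
    }
}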





Re: Suggesting Attributes for J48 Classification

Winton Davies-3
Also, almost any algorithm, J4.8 included, will try to find the
shortest route to classification -- for example, in the case above, if
you set the minimum number of instances per leaf to 1 (and maybe play
with the confidence factor, or duplicate the training data), you will
get a depth-1 tree that fans out to 9 terminal nodes. However, in
general I suspect you'll be disappointed to find that it won't adhere
to the taxonomy that you want.

W





Re: Suggesting Attributes for J48 Classification

Cerin
Thanks. I had already played around with the -M option, but was only
able to get a one-level or two-level tree, like you said. I was hoping
to find a tree that included everything, but I guess I can work with
it assuming it's only leaving out the least useful attributes.

Chris

