Request confirmation of understanding of the J48 algorithm


Hao Li


Dear Weka data mining team,

While waiting for answers to my previous questions, I have three short and simple questions I would like confirmed.

(Q1) The Weka J48 classification tree, which is the Weka name for the C4.5 algorithm, does NOT require the input data to be normalized (e.g. subtract the mean, divide by the standard deviation). Is this correct?

As I understand it, decision trees of any kind, including ensembles of them such as random forests, do not require scaling of the input data. They can even handle datasets of mixed continuous and discrete features. For example, I can have one feature (molecular weight, ranging from 10 to 1000 daltons) and another (pH, ranging from 1 to 14).

Is what I said above correct?


(Q2) The Weka J48 classification tree is capable of feature selection by itself, so I don't have to use a feature selection algorithm to reduce the dataset before feeding it to J48. Is this correct?

When trying out the J48 algorithm on a dataset of 144 samples and 257 features, J48 was able to achieve 85% accuracy with a tree of just 5 nodes. So my understanding is that I don't have to manually perform feature selection before feeding the 257-feature dataset to J48.

Is what I said above correct?


(Q3) What authoritative source can I cite in support of the above two points? I went through Weka's user manual and related books but found no passage that I can directly reference to back up my understanding.

I'm pretty sure I know the answers to the questions above, but my situation demands that I be very formal, and I require an authoritative reference for pretty much everything I do.




_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Request confirmation of understanding of the J48 algorithm

Peter Reutemann
> While waiting for answers to my previous questions, I have three short and simple questions I would like confirmed.

Eibe is busy teaching several papers.

> (Q1) The Weka J48 classification tree, which is the Weka name for the C4.5 algorithm, does NOT require the input data to be normalized (e.g. subtract the mean, divide by the standard deviation). Is this correct?
>
> As I understand it, decision trees of any kind, including ensembles of them such as random forests, do not require scaling of the input data. They can even handle datasets of mixed continuous and discrete features. For example, I can have one feature (molecular weight, ranging from 10 to 1000 daltons) and another (pH, ranging from 1 to 14).
>
> Is what I said above correct?

Yes. However, you can always check whether normalizing/standardizing
helps improve the results: simply wrap J48 in a FilteredClassifier
and select the appropriate filter for pre-processing the data.
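[Editor's note: the reason scaling makes no difference to the tree itself is that standardization is a monotonic transformation, so it preserves the ordering of values, and a threshold split on one attribute sends exactly the same instances down each branch. A quick plain-Python illustration with made-up feature values (not Weka code):]

```python
# Standardizing (subtract mean, divide by std) is monotonic increasing,
# so the ranking of values -- all a threshold split depends on -- is unchanged.
from statistics import mean, pstdev

mol_weight = [10.0, 55.0, 120.0, 400.0, 1000.0]  # hypothetical feature values

mu, sigma = mean(mol_weight), pstdev(mol_weight)
standardized = [(x - mu) / sigma for x in mol_weight]

# Sort order is identical before and after standardization...
assert sorted(range(5), key=lambda i: mol_weight[i]) == \
       sorted(range(5), key=lambda i: standardized[i])

# ...so a split "x <= t" on the raw scale partitions the data exactly like
# the split "(x - mu)/sigma <= (t - mu)/sigma" on the standardized scale.
raw_split = [x <= 120.0 for x in mol_weight]
std_split = [z <= (120.0 - mu) / sigma for z in standardized]
assert raw_split == std_split
```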

> (Q2) The Weka J48 classification tree is capable of feature selection by itself, so I don't have to use a feature selection algorithm to reduce the dataset before feeding it to J48. Is this correct?
>
> When trying out the J48 algorithm on a dataset of 144 samples and 257 features, J48 was able to achieve 85% accuracy with a tree of just 5 nodes. So my understanding is that I don't have to manually perform feature selection before feeding the 257-feature dataset to J48.
>
> Is what I said above correct?

If I'm not mistaken (it's been many years since I looked at J48 in
detail), it uses information gain (the gain ratio, to be precise) to
select the attribute best suited for splitting the data. This
implicitly ranks the attributes by their merit in separating the data;
in other words, it performs implicit attribute selection.
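[Editor's note: for intuition, here is a minimal sketch of information gain on nominal attributes, using toy data; J48 itself uses the closely related gain ratio and also handles numeric splits. An attribute that separates the classes scores high, an uninformative one scores zero, which is why irrelevant features are effectively ignored:]

```python
# Information gain = class entropy minus the weighted entropy remaining
# after splitting on an attribute. C4.5-style trees pick the attribute
# maximizing this (or the normalized gain ratio), so attributes with no
# predictive value are never chosen: implicit attribute selection.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels):
    n = len(labels)
    gain = entropy(labels)
    for v in set(values):
        subset = [l for x, l in zip(values, labels) if x == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Toy data: 'relevant' predicts the class perfectly, 'noise' does not.
labels   = ['yes', 'yes', 'no', 'no']
relevant = ['a', 'a', 'b', 'b']
noise    = ['a', 'b', 'a', 'b']

assert info_gain(relevant, labels) == 1.0  # perfect separation: 1 bit gained
assert info_gain(noise, labels) == 0.0     # uninformative: zero gain
```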

> (Q3) What authoritative source can I cite in support of the above 2 points? I went through Weka’s user manual and related books but found no line of text that I can directly reference to back up my understanding.

Sorry, I can't help with that. You can always try looking through
Quinlan's publication on C4.5 (the basis for J48), referenced here:
https://weka.sourceforge.io/doc.dev/weka/classifiers/trees/J48.html

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/