Quantcast

J48 - Algorithm - Complexity

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

J48 - Algorithm - Complexity

Desai Ankit
Dear All,

Which algorithm is implemented as J48 in weka? I mean there are many decision tree variants available viz. C4.5. 

I need to calculate the theoretical and practical complexity of J48 implementation. 

Request for help.

Sincerely,
--
Ankit Desai

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: J48 - Algorithm - Complexity

Eibe Frank-2
Administrator
J48 implements C4.5 Release 8.

A note on something that is not well-known about the algorithm: When you have numeric attributes, the most expensive part is potentially the search for a split point equalling a numeric value that actually occurs in the training set. With WEKA’s implementation (and the original C4.5 implementation) this requires a scan through the entire training set for each node of the tree that exhibits a split on a numeric attribute. Depending on the data source, the number of nodes in the tree can be O(n), where n is the number of training instances, making the time complexity for this part O(n^2). That’s why recent versions of J48 have the "-doNotMakeSplitPointActualValue” flag that prevents this expensive search (which does not appear to have a strong raison d’être in the first place).

Cheers,
Eibe

> On 19/03/2016, at 11:17 PM, Desai Ankit <[hidden email]> wrote:
>
> Dear All,
>
> Which algorithm is implemented as J48 in weka? I mean there are many decision tree variants available viz. C4.5.
>
> I need to calculate the theoretical and practical complexity of J48 implementation.
>
> Request for help.
>
> Sincerely,
> --
> Ankit Desai
> desaiankitb.tk
> _______________________________________________
> Wekalist mailing list
> Send posts to: [hidden email]
> List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
> List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: J48 - Algorithm - Complexity

maytalsaar
This post has NOT been accepted by the mailing list yet.
Eibe,

For a given data set I found that WEKA's j48 results in a different attribute as the first split (at the root) when J48 is running without pruning, than what it selects as the first split when pruning is turned off.

In one case it selected a numeric attribute (when pruning was turned off), hence I was wondering if there is any random element to the search of split point and thus the infoGain computed was might be different.

Otherwise, is there another explanation?

Thanks!
Maytal.
Loading...