

Essentially, for each internal node of the tree, error estimates for three scenarios are compared : a) leaving the tree as is, b) performing subtree replacement, and c) performing subtree raising (by raising the subtree attached to the largest branch). The best option is chosen.
The error estimate is computed in the same way in each case, by taking the upper limit of a confidence interval computed from the error on the training data (per leaf node).
Note that, if subtree raising occurs, the whole procedure, including all three options, is performed again on the resulting pruned tree.
Cheers, Eibe
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html


Dear Eibe,
Thanks a lot for the answers they were very enlightening. I have still a few
question, but i do not know if i should create a new topic.
In page 214 of the book or 247 for pdf it writes for sub tree raising: Of
course, if we perform this raising operation, it is necessary to reclassify
the examples at the nodes marked 4 and 5 into the new subtree headed by C.
This is why the daughters of that node are marked with primes: 1',2', and
3'—to indicate that they are not the same as the original daughters 1, 2,
and 3 but differ by the inclusion of the examples originally covered by 4
and 5. "
1 ) The two examples of the internal node "B" will be reclassified to the
internal node 3 the way they are portrayed in the image i uploaded?
< http://weka.8497.n7.nabble.com/file/t6270/example.png>
2) As far as the training set is considered In
http://weka.8497.n7.nabble.com/Usetrainingsettd39248.html you answered
that: "Use training set" computes what is also known as the resubstitution
estimate. This estimate is often hopelessly optimistic but it can be useful
for diagnostic purposes."
Does that mean that "training set" makes its own model and then it test it
on the same data? ( I got a bit confused)
3) In "percentage Split" data is first randomized by default . Does the
same happen with K  fold cross validation?

Sent from: http://weka.8497.n7.nabble.com/_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Administrator

 I think you understand the basic mechanism correctly, but the redistributed instances may not necessarily end up in the 2^{nd} and 3^{rd} leaf node respectively. It depends on their values for attribute C.
 The Explorer’s Classify panel will always build a tree for the full dataset as loaded into the Preprocess panel. This is the tree that is shown at the start of the output, before performance statistics are shown. If you perform evaluation on the training data, no other tree will be built and the tree that is shown will simply be evaluated on the whole dataset as loaded into the Preprocess panel. The performance statistics are then based on this evaluation.
 Yes, the data is randomized when you perform kfold crossvalidation in WEKA.
Cheers, Eibe From: [hidden email] Sent: Saturday, 2 September 2017 6:31 AM To: [hidden email] Subject: Re: [Wekalist] C45  pruning method  raising subtree or replacementsubtree Dear Eibe, Thanks a lot for the answers they were very enlightening. I have still a few question, but i do not know if i should create a new topic. In page 214 of the book or 247 for pdf it writes for sub tree raising: Of course, if we perform this raising operation, it is necessary to reclassify the examples at the nodes marked 4 and 5 into the new subtree headed by C. This is why the daughters of that node are marked with primes: 1',2', and 3'—to indicate that they are not the same as the original daughters 1, 2, and 3 but differ by the inclusion of the examples originally covered by 4 and 5. " 1 ) The two examples of the internal node "B" will be reclassified to the internal node 3 the way they are portrayed in the image i uploaded? <http://weka.8497.n7.nabble.com/file/t6270/example.png> 2) As far as the training set is considered In http://weka.8497.n7.nabble.com/Usetrainingsettd39248.html you answered that: "Use training set" computes what is also known as the resubstitution estimate. This estimate is often hopelessly optimistic but it can be useful for diagnostic purposes." Does that mean that "training set" makes its own model and then it test it on the same data? ( I got a bit confused) 3) In "percentage Split" data is first randomized by default . Does the same happen with K  fold cross validation?  Sent from: http://weka.8497.n7.nabble.com/ _______________________________________________ Wekalist mailing list Send posts to: [hidden email] List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html _______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html


Dear Eibe,
I am still confused with what is the purpose of the "Use training set" in
test options. ( I understood how the "supplied test set", "K  fold Cross
validation" and "Percentage Split" works.
If i understood correctly the "Use training set" is not suggested because
1) Does the " Use training set" splits the instances?
2) How exactly works the "Use training set"?
3) Why the "Use training set" is not suggested for learning purposes?
As far as the third question is concerned from my research everyone answers
because of the "resubstitution error".
4) With "resubstitution error" we include all errors? ( Mean absolute error,
Root mean square error, Relative absolute error etc..)
Thanks in advance,
John

Sent from: http://weka.8497.n7.nabble.com/_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Administrator

 No.
 The instances that are used for training the model are also used for testing it.
 You mean for “testing” purposes? Because the model has been fitted to this data.
 Yes. It applies whenever the training data is resubstituted into the model for evaluating the model.
It’s trivial to build a classifier that achieves high accuracy on the training data (i.e., low resubstitution error). For example, a completely unpruned decision tree will achieve this, or a onenearestneighour classifier. However, this doesn’t mean the classifier will be any good on new data drawn from the same domain: it may just be overfitting the training data. Thus, performance estimates on the training data will not give you a reliable indication of performance on new data. Cheers, Eibe From: [hidden email] Sent: Saturday, 2 September 2017 9:31 PM To: [hidden email] Subject: Re: [Wekalist] C45  pruning method  raising subtree orreplacementsubtree Dear Eibe, I am still confused with what is the purpose of the "Use training set" in test options. ( I understood how the "supplied test set", "K  fold Cross validation" and "Percentage Split" works. If i understood correctly the "Use training set" is not suggested because 1) Does the " Use training set" splits the instances? 2) How exactly works the "Use training set"? 3) Why the "Use training set" is not suggested for learning purposes? As far as the third question is concerned from my research everyone answers because of the "resubstitution error". 4) With "resubstitution error" we include all errors? ( Mean absolute error, Root mean square error, Relative absolute error etc..) Thanks in advance, John  Sent from: http://weka.8497.n7.nabble.com/ _______________________________________________ Wekalist mailing list Send posts to: [hidden email] List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html _______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Administrator

> On 4/09/2017, at 1:33 AM, Driver007 < [hidden email]> wrote:
>
> 1) As far as J48 algorithm is considered could you explain me what is " φ
> "??
> ( I could not find any pseudocode similar to this in the book.
>
> Algorithm [1] J48:
> INPUT:
> D //Training data
> OUTPUT
> T //Decision tree
> DTBUILD (*D)
> {
> T=φ;
> T= Create root node and label with splitting attribute;
> T= Add arc to root node for each split predicate and
> label;
> For each arc do
> D= Database created by applying splitting
> predicate to D;
> If stopping point reached for this path, then
> T’= create leaf node and label with
> appropriate class;
> Else
> T’= DTBUILD(D);
> T= add T’ to arc;
> }
>
It probably refers to an empty tree?
> 2)Could you explain me the quadratic loss function (about the error measures
> of nominal attributes)?
> I have read
> http://weka.8497.n7.nabble.com/errormeasuresofnominalattributestd503.html#a509> but i am struggling to understand the concept.
It is the multiclass version of the Brier score, i.e., it’s the score as originally defined by Brier. See
https://en.wikipedia.org/wiki/Brier_scorefor more information.
WEKA’s evaluation module takes the average of this score across all test instances, divides it by the number of class values, and then takes the square root to compute the root mean squared error for nominal class prediction problems.
Cheers,
Eibe
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html


This post was updated on .
Dear Eibe,
Thank you for the answers.
If i understood correctly from the link you sent me Brier's score calculates
the mean squared error. The mean squared error is not the same with root mean squared error.
1) How is calculated the mean absolute error for nominal values in j48?
2) How is calculated the relative absolute error for nominal values in j48?
3) How is calculated the root relative squared error for nominal values in
j48?
Thanks in advance,
John

Sent from: http://weka.8497.n7.nabble.com/_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@list.waikato.ac.nz
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html


1) Whenever the square of a difference is used in the squared error, the absolute value of the difference is used instead in the absolute error.
2) The absolute error of the learning scheme (e.g., J48) is divided by the absolute error of ZeroR, which takes the proportions of the classes in the training set as the class probability estimates.
3) The same as 2) but using squared error values rather than absolute error values.
Cheers, Eibe
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalistList etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

