Weka Test

Weka Test

houda mejri
Good evening, I have an ARFF file with a discrete attribute. When I generated a test file, it did not work. I have attached the two files; can you help me, please?


_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Attachments: cao_rs_1.arff (80K), cao_rs_1_test.arff (2K)

Re: Weka Test

Eibe Frank-2
Administrator
Here are the attribute declarations from your test file:

@attribute Seq {36}
@attribute op1 {0}
@attribute op2 {0}
@attribute op3 {0.2}
@attribute op4 {0}
@attribute op5 {0}
@attribute op6 {0}
@attribute op7 {1}
@attribute op8 {0}
@attribute op9 {0}
@attribute op10 {0}
@attribute op11 {0}
@attribute op12 {1}
@attribute op13 {0}
@attribute op14 {1}
@attribute sp1 {0}
@attribute sp2 {1.864543}
@attribute sp3 {0}
@attribute sp4 {1.864543}
@attribute sp5 {0}
@attribute sp6 {1.864543}
@attribute cao_ro_1 {3.780653}
@attribute cao_ro_2 {0.059275}
@attribute cao_ro_3 {43}
@attribute cao_ro_4 {29624}
@attribute cao_ro_5 {1344}
@attribute cao_ro_6 {0.168685}
@attribute cao_ro_7 {63.781314}
@attribute cao_ro_8 {5.142415}
@attribute cao_ro_9 {0.351395}
@attribute cao_ro_10 {0.275494}
@attribute cao_ro_11 {22.041667}
@attribute cao_ro_12 {-1044.238037}
@attribute cao_ro_13 {403.954651}
@attribute cao_ro_14 {0.784}
@attribute cao_ro_15 {0.588667}
@attribute cao_ro_16 {0.21516}
@attribute cao_rs_1 {}
@attribute cao_adio_1 {0.735854}
@attribute cao_adio_2 {0.00509}
@attribute cao_adio_3 {10.910081}
@attribute cao_adio_4 {0.29058}
@attribute cao_adio_5 {0.0304}
@attribute cao_adio_6 {0.028072}
@attribute cao_adio_7 {0.082094}
@attribute cao_adio_8 {0.397668}
@attribute interaction_1 {476}
@attribute interaction_7 {-50.681992}

And here are the ones from your training file:

@attribute Seq {0,1,2,3,4,5,7,8,10,12,13,16,21,22,33,35,36}
@attribute op1 {0,0.2,0.5,1}
@attribute op2 {0,0.2,1}
@attribute op3 {0,0.2,1}
@attribute op4 {0,0.2,0.5,1}
@attribute op5 {0,0.2,1}
@attribute op6 {0,0.2,1}
@attribute op7 {0,0.2,1}
@attribute op8 {0,0.2,1}
@attribute op9 {0,0.2,1}
@attribute op10 {0,0.2,1}
@attribute op11 {0,1}
@attribute op12 {0,1}
@attribute op13 {-1,0,1}
@attribute op14 {0,1}
@attribute sp1 {0,1.573,1.849085,2.926557,3.068244,3.082028,3.196038,3.223366,3.4869,3.508749,3.763936,3.768612,3.78328}
@attribute sp2 {0,1.286089,1.453926,1.460415,1.532163,1.751619,1.828829,1.864543,1.867842,1.945673,1.9563,2.563851,2.767545,2.962402,2.970765,3.00725,3.044012,3.293013,3.693662,3.752181,3.763936,3.766284,3.768612,3.91485,4.000985,5.355813,6.568187,7.894129,11.036739,11.037706,11.989897,11.996739,12.158096,14.348267,14.645771,24.365977,29.260126}
@attribute sp3 {0,1.460415,1.462245,1.573,1.646104,1.646507,1.864543,2.742849,2.759426,2.999185,3.012335,3.075175,7.850895,8.765296}
@attribute sp4 {0,1.457355,1.646104,1.646154,1.646507,1.864543,1.867842,1.919808,2.49442,2.742849,2.742852,2.759185,2.759426,2.806408,3.170736,3.196038,3.545,5.0822,8.307273,12.158096,12.182988,13.068802}
@attribute sp5 {0,1.062311,1.356,1.619215,1.849085,1.955142,2.009768,2.028154,2.096458,2.574618,2.948256,3.068244,3.196038,3.223366,3.223807,3.231397,3.349287,3.4869,4.332188,4.663671,5.0822,7.939148,11.754461,13.0688,14.348267,33.82,68.136422,90.481342}
@attribute sp6 {0,1.062311,1.849085,1.864543,1.867842,1.955142,2.574618,2.742852,3.170736,3.196038,3.223366,3.223807,3.349287,11.754461,12.158096,12.182988,90.481342}
@attribute cao_ro_1 {1.062311,1.458089,1.619215,1.733616,1.944548,2.028154,2.188105,2.574618,2.830939,3.012277,3.088699,3.349287,3.780653,3.842233,4.663671,5.0822,7.939148,13.218242,14.363975,26.669014,30,33.82,68.136422,90.481342}
@attribute cao_ro_2 {0.009716,0.014829,0.015455,0.015771,0.022992,0.024203,0.027562,0.028199,0.0408,0.055862,0.059275,0.077649,0.092871,0.105083,0.106264,0.114424,0.118999,0.126596,0.225722,0.297288,0.352469,0.66,1.343017,1.775151}
@attribute cao_ro_3 {1,2,3,10,11,16,20,43}
@attribute cao_ro_4 {12,24,36,60,92,236,268,288,420,1084,1304,1450,1920,2660,2984,4072,4246,4896,5296,6572,20400,21570,29624,77912}
@attribute cao_ro_5 {6,10,12,16,18,20,26,28,32,46,52,58,65,72,77,156,162,186,220,320,724,1053,1344}
@attribute cao_ro_6 {0.0126,0.02145,0.035458,0.065082,0.0717,0.0936,0.099803,0.106407,0.124,0.126016,0.1281,0.135188,0.168685,0.323734,0.33,0.68,0.758245,0.924352,0.938408,0.938477,0.938838,0.989806,2.128,3.943114,10.8,133.036982}
@attribute cao_ro_7 {0.000003,0.000004,0.000006,0.000011,0.000012,0.000013,0.00008,0.000085,0.000138,0.000634,0.00083,0.001078,0.001864,0.003307,50.677326,63.781314,75.400174,104.413137,109.531906,125.823543,136.69201,233.071348,259.093994}
@attribute cao_ro_8 {1.387995,1.6802,2.016791,2.200554,2.407491,4.001622,4.005916,4.433518,4.702435,4.794244,5.142415,6.480822,6.893951,8.867357,9.019821,9.210759,9.225577,10.841196,11.574642,12.761979,13.162812,13.338372,13.926828,23.397931}
@attribute cao_ro_9 {0.013343,0.027527,0.030179,0.036349,0.056437,0.109696,0.111929,0.111972,0.11198,0.216079,0.239856,0.258583,0.274013,0.351395,0.381315,0.393291,0.626899,0.631117,0.684007,0.720493,0.856965,0.908015,0.930549,0.970588,0.988256,1.251697}
@attribute cao_ro_10 {0.08258,0.106528,0.110289,0.133433,0.275494,0.294603,0.481855,0.541511,0.564933,0.585954,0.58618,0.586223,0.752408,0.81924,1.055592,1.457771,2.585833,2.628563,2.656592,3.746562,4.767081,5.188344,6.269022,6.311169,9.305491,9.705882}
@attribute cao_ro_11 {2,3.333333,3.538462,3.630769,4.62069,4.927273,5.833333,14.4,16.043011,20.48433,22.041667,31.384615,32.691358,41.73913,45.3125,46.571429,51.153846,55.142857,63.75,107.61326,254.5,657.2}
@attribute cao_ro_12 {-1044.238037,-255.974731,-252.913101,-231.482162,-206.127426,-156.632202,-137.702438,-117.594917,-111.019638,-102.031059,-73.647285,-62.401493,-50.014229,-38.787102,-25.204201,-25.018787,-25.012617,-5.041495,-4.431469,-2.919193,-0.755427,0,2.662531}
@attribute cao_ro_13 {-4.92243,-3.335678,-2.365589,-0.94641,-0.600925,-0.157695,0,0.212338,2.823443,5.272197,6.380524,6.485836,15.368589,23.555862,49.942123,50.009041,114.80999,206.652435,241.362976,304.828125,403.954651}
@attribute cao_ro_14 {0.784,1.1,2.1,2.23,2.684,3,3.1,5.15,5.2,5.22,5.235076,5.25,1420.115234,2930.684267,3415.554199,3654.490116,5320.080566,10000,10000.04531}
@attribute cao_ro_15 {0.588667,0.8,0.806667,0.899,1.093304,1.147667,1.166667,1.336677,1.625533,1.796667,1.825,1.864643,2.02391,2.030333,2.162649,2.162676,2.162816,2.163333,2.266667,2.642,3.399934,3.424667,3.466627,3.913333,5.12,5.940232}
@attribute cao_ro_16 {0.004125,0.006,0.006508,0.006665,0.00936,0.00998,0.024003,0.0244,0.02625,0.026714,0.04,0.047716,0.068,0.094782,0.179254,0.179267,0.179336,0.189618,0.2128,0.21516,0.258726,0.3,0.650899,1.078978,3.6,13.303698}
@attribute cao_rs_1 {1.062311,1.281926,1.286089,1.356,1.453926,1.457355,1.458089,1.460415,1.462245,1.532163,1.573,1.619215,1.646104,1.646154,1.646507,1.683771,1.733616,1.751619,1.828829,1.846831,1.849085,1.864543,1.867842,1.919808,1.944548,1.945673,1.955142,1.9563,1.957425,2.009768,2.028154,2.096458,2.188105,2.315623,2.49442,2.574618,2.677907,2.723035,2.742849,2.742852,2.759185,2.759426,2.767545,2.806408,2.830939,2.926557,2.948256,2.962402,2.970765,2.997474,2.999185,3.00725,3.012277,3.012335,3.044012,3.068244,3.075175,3.082028,3.088699,3.170736,3.196038,3.207249,3.223366,3.223807,3.231397,3.349287,3.4869,3.508749,3.545,3.752181,3.763936,3.766284,3.768612,3.780653,3.78328,3.842233,4.000985,4.332188,4.479986,4.663671,5.0822,6.568187,7.826256,7.850895,7.894129,7.939148,8.307273,8.765296,8.788898,11.754461,12.158096,12.182988,13.0688,13.068802,13.218242,14.348267,14.363975,14.630063,14.645771,26.669014,30,33.82,34.8076,68.136422,69.778422,90.481342,121.796071}
@attribute cao_adio_1 {0.206765,0.283798,0.315159,0.337425,0.37848,0.394753,0.425886,0.501115,0.551005,0.5863,0.601174,0.651894,0.735854,0.74784,0.907721,0.989183,1.545249,2.572755,2.795756,5.190768,5.8391,6.582612,13.261847,17.610988}
@attribute cao_adio_2 {0.000834,0.001273,0.001327,0.001354,0.001974,0.002078,0.002367,0.002421,0.003503,0.004797,0.00509,0.006668,0.007975,0.009023,0.009125,0.009826,0.010218,0.010871,0.019383,0.025528,0.030266,0.056674,0.115324,0.152431}
@attribute cao_adio_3 {0.13806,0.403447,0.4229,0.583324,0.594564,0.602398,0.628536,0.686745,1.153607,1.607588,2.145031,2.208411,2.240762,2.345007,2.564202,2.599198,2.802014,2.824404,3.004571,3.070688,3.071084,3.07116,3.820686,4.706682,5.8391,10.910081}
@attribute cao_adio_4 {0.000603,0.000979,0.001499,0.002363,0.00244,0.002665,0.006205,0.006674,0.008469,0.010006,0.010387,0.010389,0.011013,0.01154,0.012504,0.014354,0.018078,0.018655,0.025528,0.030046,0.031645,0.066918,0.26971,0.29058,0.440864}
@attribute cao_adio_5 {0,0.004426,0.006228,0.007507,0.009866,0.010801,0.01259,0.016913,0.019779,0.021807,0.022662,0.025576,0.0304,0.031089,0.040275,0.044956,0.076906,0.135943,0.148757,0.286367,0.323619,0.366339,0.75011,1}
@attribute cao_adio_6 {0,0.002897,0.003251,0.00343,0.00752,0.008206,0.010108,0.010469,0.017607,0.026139,0.028072,0.03848,0.047102,0.054019,0.054688,0.05931,0.061902,0.066204,0.122353,0.16289,0.194146,0.368342,0.755225,1}
@attribute cao_adio_7 {0,0.006228,0.008459,0.010801,0.016913,0.025576,0.025577,0.040275,0.044007,0.044956,0.054344,0.054453,0.076906,0.082094,0.084563,0.366339,0.460575,0.506523,0.75011,1}
@attribute cao_adio_8 {0,0.005593,0.008206,0.010108,0.017607,0.03848,0.047102,0.061902,0.073034,0.119621,0.194146,0.321446,0.367129,0.368342,0.397668,0.745786,0.755225,0.807519,0.8911,1}
@attribute interaction_1 {0,10,20,24,32,64,92.244,152,280,298.778,434.855,446,460,476,491.313,500.441,580,626,656,892,920,963,1277,1278,1620,1689,1870,1908,3053}
@attribute interaction_7 {-50.681992,-50.594737,-48.285601,-30.303388,-26.958928,-17.268367,-12.081795,-11.796264,-8.944336,-8.944233,-8.402016,-8.393993,-8.124473,-7.769908,-7.191981,-7.00155,-5.249546,-5.047962,-5.045061,-5.02473,-2.343749,-2.125313,-1.655702,-1.421961,-1.378085,-1.272308,-1.111618,-0.662276,-0.567054,-0.491407,-0.442187,-0.434622,-0.380064,-0.318507,-0.285531,-0.215974,-0.166869,-0.050358,0,0.000009,0.001947,0.057854,0.069486,0.159527,0.285044,0.604332,0.662186,1.038454,1.053547,2.409871,2.656035,3.986866,4.131746,4.359989,4.374276,4.619995,4.636829,5.080245,5.49212,6.530564,12.751804,14.173968,20.929771,22.989681,24.571874,34.609047,48.073427,51.624381,72.470495,72.934913,156.331292}

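They are clearly different: each declaration in the test file lists only the single value that occurs in that file, and cao_rs_1 has an empty value list. The declarations need to be identical in both files, otherwise WEKA will complain.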
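
A minimal sketch of how such a mismatch could be spotted without reading both headers by hand. It is plain Python and does not use WEKA. The function names, the simplified regular expression (unquoted attribute names only), and the `@relation` names are assumptions made for illustration. The comparison is purely textual, so it is stricter than WEKA's own check: a difference in spacing alone would also be reported.

```python
import re

# Matches "@attribute <name> <type-or-value-list>"; this simplified pattern
# assumes unquoted attribute names without spaces, as in the files above.
ATTRIBUTE_RE = re.compile(r"^@attribute\s+(\S+)\s+(.+?)\s*$", re.IGNORECASE)


def read_declarations(arff_text):
    """Return the (name, declared type) pairs of an ARFF header, in file order."""
    declarations = []
    for line in arff_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("@data"):
            break  # the header ends where the data section begins
        match = ATTRIBUTE_RE.match(stripped)
        if match:
            declarations.append((match.group(1), match.group(2)))
    return declarations


def header_mismatches(train_text, test_text):
    """List the attributes whose declarations differ between the two headers."""
    train = read_declarations(train_text)
    test = read_declarations(test_text)
    problems = []
    if len(train) != len(test):
        problems.append(f"attribute count differs: {len(train)} vs {len(test)}")
    for (train_name, train_type), (test_name, test_type) in zip(train, test):
        if train_name != test_name:
            problems.append(f"name differs: {train_name} vs {test_name}")
        elif train_type != test_type:
            problems.append(f"{train_name}: {train_type} vs {test_type}")
    return problems


# Shortened excerpts of the two headers; the relation names are hypothetical.
train_header = """@relation train
@attribute op11 {0,1}
@attribute op13 {-1,0,1}
@attribute cao_ro_3 {1,2,3,10,11,16,20,43}
@data
"""
test_header = """@relation test
@attribute op11 {0}
@attribute op13 {0}
@attribute cao_ro_3 {43}
@data
"""

for problem in header_mismatches(train_header, test_header):
    print(problem)
```

For these excerpts every attribute is reported, because the test header declares only one value per attribute.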

Cheers,
Eibe

> On 27 May 2017, at 10:15, houda mejri <[hidden email]> wrote:
>
> Good evening, I have an ARFF file with a discrete attribute. When I generated a test file, it did not work. I have attached the two files; can you help me, please?

Re: Weka Test

houda mejri
They contain the same number of attributes, and the test file contains a question mark for the value to be determined. Can you please explain this to me?
Thanks

2017-05-27 7:40 GMT+02:00 Eibe Frank <[hidden email]>:

They are clearly different. They need to be identical, otherwise WEKA will complain.

Re: Weka Test

Marc Stein
In reply to this post by houda mejri
I'm curious about something (or maybe I'm just confused).

I have a set of 100k records. I divide it into a 90k set and reserve a 10k set for testing. 

I build a model using the 90k set with an 80/20 split.

I then use 1k of the 10k reserved set as a supplied test set. Shouldn't the results on the 20% test split from the training phase be similar to the results on the 1k external test set?

What accounts for the differences, assuming that all sets are created randomly and have similar distributions?

Many thanks in advance.

Best,

Marc
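
Part of such a difference is expected from sampling alone. An accuracy measured on n instances has a binomial standard error of sqrt(p(1-p)/n), so an estimate from 1k instances fluctuates more than one from the 18k instances held out by an 80/20 split of 90k. The sketch below is a rough illustration rather than an answer given in the thread; the 90% accuracy is a hypothetical value, not a figure from the post.

```python
import math


def accuracy_standard_error(accuracy, n_instances):
    """Binomial standard error of an accuracy estimated on n_instances."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_instances)


assumed_accuracy = 0.90  # hypothetical true accuracy, not a figure from the post

for label, n in [("20% of 90k", 18_000), ("1k external", 1_000)]:
    se = accuracy_standard_error(assumed_accuracy, n)
    # Roughly 95% of estimates fall within 1.96 standard errors of the true value.
    print(f"{label}: n={n}, +/- {1.96 * se:.4f}")
```

Under that assumption the approximate 95% interval is about 0.4 percentage points either way for 18k instances and about 1.9 for 1k, so the two results can differ by a point or two without any real difference between the sets.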

On Fri, May 26, 2017 at 6:15 PM, houda mejri <[hidden email]> wrote:
Good evening, I have an ARFF file with a discrete attribute. When I generated a test file, it did not work. I have attached the two files; can you help me, please?


Re: Weka Test

Eibe Frank-2
Administrator
In reply to this post by houda mejri
The attribute definitions starting with "@attribute", including all the values listed, must be the same for the training file and the test file.
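
One way to meet this requirement, sketched under stated assumptions: copy the header of the training file unchanged and append only the new rows, writing `?` for the value to be predicted. The code is plain Python and does not use WEKA. The helper names, the relation name `cao`, the shortened `cao_rs_1` value list, and the data rows are made up for illustration.

```python
def header_of(arff_text):
    """Return every line of an ARFF file up to and including the @data line."""
    header = []
    for line in arff_text.splitlines():
        header.append(line)
        if line.strip().lower().startswith("@data"):
            return header
    raise ValueError("no @data line found")


def build_test_arff(train_text, test_rows):
    """Combine the training header with new rows; '?' marks a missing value."""
    lines = header_of(train_text)
    lines.extend(",".join(row) for row in test_rows)
    return "\n".join(lines) + "\n"


# Hypothetical, heavily shortened training file (three attributes only).
train_text = """@relation cao
@attribute op11 {0,1}
@attribute cao_ro_3 {1,2,3,10,11,16,20,43}
@attribute cao_rs_1 {1.062311,1.281926,1.286089}
@data
0,1,1.062311
1,43,1.286089
"""

# One unlabeled instance: the value to be predicted is written as '?'.
print(build_test_arff(train_text, [["1", "43", "?"]]))
```

The resulting test file keeps the full value lists from the training header, so both files declare identical attributes even though the test file holds a single row.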

Cheers,
Eibe

> On 28 May 2017, at 00:03, houda mejri <[hidden email]> wrote:
>
> They contain the same number of attributes, and the test file contains a question mark for the value to be determined. Can you please explain this to me?
> Thanks
>
> They are clearly different. They need to be identical, otherwise WEKA will complain.
>
> Cheers,
> Eibe
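
To make the point about the headers concrete, here is a minimal sketch that compares the attribute declarations of a training header and a test header and reports where they differ. It is plain Python and does not use WEKA itself. The function names, the regular expression, and the two shortened inline headers are illustrative assumptions. The `op3` values in the training header are made up, and the parser assumes unquoted attribute names without spaces.

```python
import re

# Simplified matcher for "@attribute <name> <type-or-{values}>" lines.
# Assumption: attribute names are unquoted and contain no whitespace.
ATTRIBUTE_RE = re.compile(r"^\s*@attribute\s+(\S+)\s+(.+?)\s*$", re.IGNORECASE)


def read_attribute_declarations(arff_text):
    """Return a list of (name, declaration) pairs from the ARFF header."""
    declarations = []
    for line in arff_text.splitlines():
        if line.strip().lower().startswith("@data"):
            break  # the header ends where the data section starts
        match = ATTRIBUTE_RE.match(line)
        if match:
            declarations.append((match.group(1), match.group(2)))
    return declarations


def header_mismatches(train_text, test_text):
    """List human-readable differences between two ARFF headers."""
    train = read_attribute_declarations(train_text)
    test = read_attribute_declarations(test_text)
    problems = []
    if len(train) != len(test):
        problems.append(f"attribute count differs: {len(train)} vs {len(test)}")
    for position, (a, b) in enumerate(zip(train, test), start=1):
        if a != b:
            problems.append(f"attribute {position}: train has {a}, test has {b}")
    return problems


# Shortened, partly made-up headers in the spirit of the two attached files.
TRAIN = """@relation demo
@attribute op3 {0,0.2,0.5}
@attribute cao_rs_1 {1.062311,1.864543}
@data
0.2,1.864543
"""

TEST = """@relation demo
@attribute op3 {0.2}
@attribute cao_rs_1 {}
@data
0.2,?
"""

for problem in header_mismatches(TRAIN, TEST):
    print(problem)
```

If a check like this reports any difference, the simplest fix is usually to copy the attribute declarations from the training file into the test file unchanged, so that every nominal attribute lists the same values in the same order in both files.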


Re: Weka Test

Eibe Frank-2
Administrator
In reply to this post by Marc Stein
Yes, given your assumptions, the results should be fairly similar. However, the model evaluated on the 20% split of the 90k instances is built from only 0.8*90k = 72k instances, whereas the model you evaluate on the 1k test set is built from the full 90k records (assuming you used the standard Explorer set-up, which applies the model built from the full training set loaded into the Preprocess panel to the specified test set). This can cause some difference because the two training sets differ in size, by 0.2*90k = 18k instances.

There is also some variance in the estimates because only a limited amount of data is used for testing. In particular, the 1k test set estimate will have relatively high variance. For the classification error rate, our book describes a way to compute a confidence interval for the error estimate given the number of test instances (Section 5.2 in the fourth edition; see also Slide 13 in http://www.cs.waikato.ac.nz/ml/weka/slides/Chapter5.pptx).
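
As a rough illustration of how much the test set size matters, the sketch below computes a Wilson score interval for an observed error rate. This is one common way to put a confidence interval on a proportion, and it is not guaranteed to match the presentation in the book or the slides. The function name, the 95% level (z = 1.96), and the observed error rate of 0.10 are assumptions made for the example.

```python
import math


def wilson_interval(error_rate, n_test, z=1.96):
    """Wilson score interval for a proportion observed on n_test instances."""
    if n_test <= 0:
        raise ValueError("n_test must be positive")
    z2 = z * z
    centre = error_rate + z2 / (2 * n_test)
    spread = z * math.sqrt(
        error_rate * (1 - error_rate) / n_test + z2 / (4 * n_test * n_test)
    )
    denominator = 1 + z2 / n_test
    return (centre - spread) / denominator, (centre + spread) / denominator


# Training-set sizes from the thread: 80% of 90k versus the full 90k.
full_training = 90_000
split_training = full_training * 80 // 100
print("training sizes:", split_training, "vs", full_training,
      "difference:", full_training - split_training)

# Hypothetical observed error rate, evaluated on the two test-set sizes
# discussed in the thread (1k supplied test set, 18k percentage split).
observed_error = 0.10
for n_test in (1_000, 18_000):
    low, high = wilson_interval(observed_error, n_test)
    print(f"n_test={n_test}: [{low:.4f}, {high:.4f}]")
```

With these made-up numbers, the interval for 1k test instances spans roughly 0.08 to 0.12, and the interval for 18k test instances is noticeably narrower, so a gap of a percentage point or two between the two estimates would not be surprising.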

Cheers,
Eibe

> On 28 May 2017, at 15:13, Marc Stein <[hidden email]> wrote:
>
> I'm curious about something (or maybe I'm just confused).
>
> I have a set of 100k records. I divide it into a 90k set and reserve a 10k set for testing.
>
> I build a model using the 90k set with a 80/20 split.
>
> I then use 1k of the 10k test set as a supplied test set. Shouldn't the 20% test set in the training phase be similar to the 1k external test set in outcomes?
>
> What accounts for the variances, assuming that all sets are created randomly and have similar distributions?
>
> Many thanks in advance.
>
> Best,
>
> Marc


Re: Weka Test

Marc Stein
Thanks Eibe!

Is there a best-practice approach for sizing the training split relative to the holdout set?

Best,

Marc



Re: Weka Test

Eibe Frank-2
Administrator
If you have enough time, it’s probably best to try different sizes to create a learning curve (and compute confidence intervals, if possible). Ideally, you’d also use repetition, i.e., the repeated hold-out method.

If the learning curve flattens out early, you can get away with using less training data. If it remains steep, you will probably want to use as much training data as possible.
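
The idea of a learning curve estimated with repeated hold-out can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions and not how WEKA does it: the data is synthetic (two overlapping one-dimensional Gaussian classes), the "learner" is a toy threshold classifier, and all function names are hypothetical.

```python
import random
import statistics


def make_data(n, rng):
    """Two overlapping 1-D Gaussian classes (synthetic, for illustration only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(1.0 if label else -1.0, 1.5), label))
    return data


def train_threshold(train):
    """Toy learner: put the decision threshold midway between the class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    if not xs0 or not xs1:
        return 0.0  # degenerate sample: fall back to a fixed threshold
    return (statistics.fmean(xs0) + statistics.fmean(xs1)) / 2


def error_rate(threshold, test):
    """Fraction of test instances where the threshold rule is wrong."""
    wrong = sum((x > threshold) != (y == 1) for x, y in test)
    return wrong / len(test)


def learning_curve(data, train_sizes, n_test, repeats, seed=1):
    """Repeated hold-out: reshuffle, hold out n_test instances, train on a prefix."""
    rng = random.Random(seed)
    curve = []
    for size in train_sizes:
        errors = []
        for _ in range(repeats):
            shuffled = data[:]
            rng.shuffle(shuffled)
            test, pool = shuffled[:n_test], shuffled[n_test:]
            errors.append(error_rate(train_threshold(pool[:size]), test))
        curve.append((size, statistics.fmean(errors), statistics.stdev(errors)))
    return curve


data = make_data(2000, random.Random(0))
for size, mean_err, sd_err in learning_curve(data, [10, 100, 1000], n_test=500, repeats=10):
    print(f"{size:5d} training instances: error {mean_err:.3f} +/- {sd_err:.3f}")
```

Reading the output from the smallest to the largest training size gives the learning curve. The standard deviation across repetitions gives a rough idea of how much of the difference between two sizes is just noise from the particular split.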

Cheers,
Eibe

