
Replacing missing values


Replacing missing values

Mohamad El Abed
Hello everyone,

I want to use the filter "ReplaceMissingValues" in order to have a data
set without missing values. I found that after applying this filter
(using WEKA Explorer), not all the missing values are replaced. Can
anyone help me to replace all the missing values existing in the dataset?


Thanks in advance.



_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Replacing missing values

Harri Saarikoski-2


2009/10/28 Mohamad El Abed <[hidden email]>
> Hello everyone,
>
> I want to use the filter "ReplaceMissingValues" in order to have a data set
> without missing values. I found that after applying this filter (using WEKA
> Explorer), not all the missing values are replaced. Can anyone help me to
> replace all the missing values existing in the dataset?

Which kinds of features were not replaced? The filter should work with both
numeric and nominal features. Or perhaps the value slot was not empty, so the
filter didn't interpret it as missing?
 





--
-----------------
Harri M.T. Saarikoski
M.A, PhD graduate student
Helsinki University
Finland


Re: Replacing missing values

Mohamad El Abed
I have attached the data set I used. It is the "weather.arff" data set,
which I modified to introduce missing values.


--
Mohamad EL Abed
PhD Student in computer science
GREYC - ENSICAEN - France
http://www.ecole.ensicaen.fr/~elabed/


@relation weather

@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE,FALSE}
@attribute play {yes,no}

@data
?,85,85,FALSE,?
?,?,90,TRUE,?
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no

Re: Replacing missing values

Harri Saarikoski-2
Isn't a missing value supposed to be expressed as null, rather than as a question mark?

I'm afraid this is a question that Peter and the others are better placed to answer.





--
-----------------
Harri M.T. Saarikoski
M.A, PhD graduate student
Helsinki University
Finland


Re: Replacing missing values

Peter Reutemann-3
In reply to this post by Mohamad El Abed
> I want to use the filter "ReplaceMissingValues" in order to have a data set
> without missing values. I found that after applying this filter (using WEKA
> Explorer), not all the missing values are replaced. Can anyone help me to
> replace all the missing values existing in the dataset?

You have to provide a few more details, such as which particular missing
values didn't get replaced. It would also help to know which dataset you
were using, so that the behavior can be replicated (you can attach a
trimmed-down version of the dataset to your post).

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174


Re: Replacing missing values

Peter Reutemann-3
In reply to this post by Mohamad El Abed
> I have attached the data set I used. It is the "weather.arff" data set,
> which I modified to introduce missing values.

[...]

> @relation weather
>
> @attribute outlook {sunny,overcast,rainy}
> @attribute temperature numeric
> @attribute humidity numeric
> @attribute windy {TRUE,FALSE}
> @attribute play {yes,no}
>
> @data
> ?,85,85,FALSE,?
> ?,?,90,TRUE,?
> overcast,83,86,FALSE,yes
> rainy,70,96,FALSE,yes
> rainy,68,80,FALSE,yes
> rainy,65,70,TRUE,no
> overcast,64,65,TRUE,yes
> sunny,72,95,FALSE,no
> sunny,69,70,FALSE,yes
> rainy,75,80,FALSE,yes
> sunny,75,70,TRUE,yes
> overcast,72,90,TRUE,yes
> overcast,81,75,FALSE,yes
> rainy,71,91,TRUE,no

The ReplaceMissingValues filter *doesn't touch* the class attribute
when set. Since you were using the Explorer, you need to *unset* the
class attribute first before applying the filter. By default, the
Explorer automatically uses the last attribute as class attribute.
Just select "No class" in the combo box on the preprocess panel.
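
To make the effect of the class attribute concrete, here is a minimal plain-Java sketch of mean/mode replacement that skips one column. It does not use Weka: the class name MeanModeSketch, the String[][] table with "?" as the missing marker, and the classIndex parameter are all assumptions made for illustration, and the real filter may differ in details such as how ties for the mode are broken.

```java
import java.util.HashMap;
import java.util.Map;

class MeanModeSketch {

    static final String MISSING = "?";

    /**
     * Returns a copy of rows in which each "?" is replaced by the column mean
     * (numeric columns) or the most frequent label (nominal columns).
     * The column at classIndex is left untouched; pass -1 for "no class".
     */
    static String[][] replaceMissing(String[][] rows, boolean[] numeric, int classIndex) {
        int cols = numeric.length;
        String[] fill = new String[cols];
        for (int c = 0; c < cols; c++) {
            if (c == classIndex) {
                continue; // the class column is skipped, as described above
            }
            fill[c] = numeric[c] ? mean(rows, c) : mode(rows, c);
        }
        String[][] out = new String[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            out[r] = rows[r].clone();
            for (int c = 0; c < cols; c++) {
                // fill[c] is null for the class column and for all-missing columns
                if (fill[c] != null && MISSING.equals(out[r][c])) {
                    out[r][c] = fill[c];
                }
            }
        }
        return out;
    }

    /** Mean of the non-missing values in column c, or null if there are none. */
    private static String mean(String[][] rows, int c) {
        double sum = 0;
        int n = 0;
        for (String[] row : rows) {
            if (!MISSING.equals(row[c])) {
                sum += Double.parseDouble(row[c]);
                n++;
            }
        }
        return n == 0 ? null : String.valueOf(sum / n);
    }

    /** Most frequent non-missing label in column c (first to reach the top count wins). */
    private static String mode(String[][] rows, int c) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        for (String[] row : rows) {
            if (MISSING.equals(row[c])) {
                continue;
            }
            int k = counts.merge(row[c], 1, Integer::sum);
            if (best == null || k > counts.get(best)) {
                best = row[c];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"?",     "70", "yes"},
            {"rainy", "?",  "?"},
            {"rainy", "80", "no"},
            {"sunny", "75", "no"}
        };
        boolean[] numeric = {false, true, false};
        // Last column set as class: its "?" is kept.
        System.out.println(String.join(",", replaceMissing(rows, numeric, 2)[1]));
        // No class set: every "?" is replaced.
        System.out.println(String.join(",", replaceMissing(rows, numeric, -1)[1]));
    }
}
```

With classIndex = 2 the "?" in the last column survives, which matches what was seen in the Explorer; with classIndex = -1 (the "No class" case) every column is filled.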

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174


Re: Replacing missing values

Peter Reutemann-3
In reply to this post by Harri Saarikoski-2
> isn't a missing value to be expressed as null, not a question mark?

No, a missing value has always been represented by a question mark
in ARFF files. When data is retrieved from a database, NULL (the
database NULL, not the string "NULL") is interpreted as a missing value.
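
As a small illustration of the "?" convention, the following plain-Java sketch counts the question-mark fields per column in ARFF data lines. It is a naive comma split written for the weather data in this thread (no quoted values, no sparse format), and the class and method names are made up for the example; it is not how Weka itself parses ARFF.

```java
import java.util.Arrays;

class ArffMissingCount {

    /** Counts, per column, the fields that are exactly "?" (ignoring surrounding spaces). */
    static int[] countMissing(String[] dataLines, int numAttributes) {
        int[] counts = new int[numAttributes];
        for (String line : dataLines) {
            String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
            for (int c = 0; c < numAttributes && c < fields.length; c++) {
                if (fields[c].trim().equals("?")) {
                    counts[c]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // First three data rows of the modified weather data posted in this thread.
        String[] data = {
            "?,85,85,FALSE,?",
            "?,?,90,TRUE,?",
            "overcast,83,86,FALSE,yes"
        };
        // Columns: outlook, temperature, humidity, windy, play
        System.out.println(Arrays.toString(countMissing(data, 5)));
    }
}
```

For the three rows shown this prints [2, 1, 0, 0, 2]: two missing values each in outlook and play, and one in temperature. An empty field or the literal text NULL is not counted by this sketch.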

[...]

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174


Re: Replacing missing values

Mohamad El Abed
In reply to this post by Peter Reutemann-3
OK, got it. I just have one simple question, please, Peter. After creating my
attributes (numeric and nominal), I am assigning the attribute values from
an Excel file (.xls).
This is part of my code:

......
Instances dataset;
Instance element = new Instance(sheet.getColumns()); // creating an instance

.....

// When the cell holds a numeric value;
// nc.getValue() returns the numeric value in the Excel cell a[j]
nc = (NumberCell) a[j];
element.setValue(att_num[indice_num], nc.getValue());

// When the cell holds a nominal value;
// lc.getString() returns the nominal value in the Excel cell a[j]
lc = (LabelCell) a[j];
element.setValue(att_nom[indice_nom], lc.getString());

// When the cell holds a missing value... HERE IS MY QUESTION:
// what should I put as the second parameter of this function?
element.setValue(att_nom[indice_nom], ----)

element.setDataset(dataset);
dataset.add(element);

Thanks a lot for helping me, Peter.



--
Mohamad EL Abed
PhD Student in computer science
GREYC - ENSICAEN - France
http://www.ecole.ensicaen.fr/~elabed/



Re: Replacing missing values

Peter Reutemann-3
Please don't top-post; see the mailing list etiquette for the reasons
(http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html).

> OK, got it. I just have one simple question, please, Peter. After creating my
> attributes (numeric and nominal), I am assigning the attribute values from an
> Excel file (.xls).

[...]

> // When the cell holds a missing value... HERE IS MY QUESTION:
> // what should I put as the second parameter of this function?
> element.setValue(att_nom[indice_nom], ----)
>
> element.setDataset(dataset);
> dataset.add(element);

When reading the Javadoc of the weka.core.Instance class (or
interface, if you're working with the most current code of the
developer version), you will come across the following methods:
  setMissing(int)
  setMissing(Attribute)
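
The branch for an empty cell can be sketched in plain Java as below. This is only a model of the idea, not Weka code: it assumes that a missing value is stored internally as Double.NaN and a nominal value as the index of its label, which is how I understand Weka's double-based instances to work. In the Weka versions I know, the corresponding call on a real instance is spelled element.setMissing(j); please check the Javadoc of your version. CellToValueSketch, toValues and the labels list are names invented for this example.

```java
import java.util.Arrays;
import java.util.List;

class CellToValueSketch {

    /** Assumption: NaN marks a missing value, as Weka does internally (to my knowledge). */
    static final double MISSING = Double.NaN;

    /**
     * Converts one spreadsheet row into the double[] an instance would hold.
     * cells[j] is a Number (numeric cell), a String (nominal label) or null (empty cell).
     * labels.get(j) lists the allowed labels of a nominal column and is null for a numeric one.
     */
    static double[] toValues(Object[] cells, List<List<String>> labels) {
        double[] values = new double[cells.length];
        for (int j = 0; j < cells.length; j++) {
            Object cell = cells[j];
            if (cell == null || cell.toString().trim().isEmpty()) {
                // Empty cell: mark as missing instead of calling setValue(...).
                values[j] = MISSING;
            } else if (cell instanceof Number) {
                values[j] = ((Number) cell).doubleValue();
            } else {
                List<String> allowed = labels.get(j);
                int index = (allowed == null) ? -1 : allowed.indexOf(cell.toString());
                if (index < 0) {
                    throw new IllegalArgumentException(
                        "Unexpected label '" + cell + "' in column " + j);
                }
                values[j] = index; // nominal value stored as the index of its label
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Columns: outlook (nominal), temperature (numeric), play (nominal)
        List<List<String>> labels = Arrays.asList(
            Arrays.asList("sunny", "overcast", "rainy"),
            null,
            Arrays.asList("yes", "no"));
        Object[] row = {"rainy", 85, null}; // the last cell is empty
        double[] values = toValues(row, labels);
        System.out.println(Arrays.toString(values));
        System.out.println("play missing? " + Double.isNaN(values[2]));
    }
}
```

The point of the sketch is that a missing cell gets its own branch: no placeholder string or number is passed as the second parameter of setValue.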

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174


Re: Replacing missing values

Harri Saarikoski-2
In reply to this post by Peter Reutemann-3


2009/10/28 Peter Reutemann <[hidden email]>
>> isn't a missing value to be expressed as null, not a question mark?
>
> No, a missing value has (always) been represented by a question mark
> in ARFF files. NULL (and not the string "NULL"), when data is
> retrieved from a database, is interpreted as missing value.

What I actually meant was a slot with no value at all (empty = null), not a string.
But it is not really my place to answer these questions.
 



--
-----------------
Harri M.T. Saarikoski
M.A, PhD graduate student
Helsinki University
Finland
