Modify an instance by iterating through attributes

Modify an instance by iterating through attributes

imagiko
I would like to perform some custom modifications/pre-processing on an
instance. The instance is created by reading a CSV file. I don't want to
modify the source dataset, so I would like to iterate through each element
in the row and modify it. This is my first time using Weka/Java in general,
so I'm a bit lost reading the documentation. Right now, I have an instance
created using DataSource.read(). I can access the index of the field using
the .attribute() function. How can I iterate through all the values of that
attribute? It is a numeric field.



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Modify an instance by iterating through attributes

Peter Reutemann
> I would like to perform some custom modifications/pre-processing on an
> instance. The instance is created by reading a CSV file. I don't want to
> modify the source dataset, so I would like to iterate through each element
> in the row and modify it. This is my first time using Weka/Java in general,
> so I'm a bit lost reading the documentation. Right now, I have an instance
> created using DataSource.read(). I can access the index of the field using
> the .attribute() function. How can I iterate through all the values of that
> attribute? It is a numeric field.

From what I remember, R datasets are column-based. However, Weka uses
a row-based approach.
You have to iterate through all the Instance objects (= rows) of your
Instances object (= dataset).
Use the Instance.value(int), Instance.stringValue(int) or
Instance.relationalValue(int) method to retrieve a specific column from
that row, depending on its type.
Instances.attribute(int) gives you the information for the column as
defined by the dataset (Instances).
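
A minimal sketch of the row-based loop described above, assuming Weka 3.7
or later (where DenseInstance exists) is on the classpath. The class name,
the method names, the tiny in-memory dataset and the doubling step are all
placeholders for illustration; with a real file the dataset would come from
DataSource.read(path) as in the question. Working on a copy created with
new Instances(source) is one way to leave the loaded dataset untouched.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

public class ModifyColumnSketch {

    /** Returns a copy of source in which every non-missing value of one numeric column is doubled. */
    static Instances doubleColumn(Instances source, String attributeName) {
        Instances copy = new Instances(source);          // copies the rows; source stays unchanged
        int col = copy.attribute(attributeName).index(); // column position as an int
        for (int i = 0; i < copy.numInstances(); i++) {  // one Instance per row
            Instance row = copy.instance(i);
            if (!row.isMissing(col)) {
                row.setValue(col, row.value(col) * 2.0); // placeholder modification
            }
        }
        return copy;
    }

    /** Builds a small placeholder dataset standing in for the CSV file. */
    static Instances demoData() {
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("A"));              // single numeric attribute
        Instances data = new Instances("demo", attributes, 3);
        for (double v : new double[] {1.0, 2.5, 4.0}) {
            data.add(new DenseInstance(1.0, new double[] {v}));
        }
        return data;
    }

    public static void main(String[] args) {
        Instances original = demoData();
        Instances modified = doubleColumn(original, "A");
        for (int i = 0; i < modified.numInstances(); i++) {
            System.out.println(original.instance(i).value(0)
                    + " -> " + modified.instance(i).value(0));
        }
    }
}
```

Note that attribute(String) returns null when no attribute has that name,
so real code would probably want to check for that before calling index().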

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/
Re: Modify an instance by iterating through attributes

imagiko
Thank you for the answer. I was able to iterate through the instance. Just
putting it down here for ref:

                Instances dataset = DataSource.read(path);
                Attribute A = dataset.attribute("A");
                for (int i = 1; i < dataset.numInstances(); i++) {
                        Instance row = dataset.instance(i);
                        System.out.println(row.value(A));
                }



Re: Modify an instance by iterating through attributes

Peter Reutemann
> Thank you for the answer. I was able to iterate through the instance. Just
> putting it down here for ref:
>
>                 Instances dataset = DataSource.read(path);
>                 Attribute A = dataset.attribute("A");
>                 for (int i = 1; i < dataset.numInstances(); i++) {
>                         Instance row= dataset.instance(i);
>                         System.out.println(row.value(A));
>                 }

Starting from 1, is that on purpose? Java indices start at 0.

Also, I believe using the int index of an attribute will be faster for
accessing the value than the Attribute reference object. You can
retrieve the index via the Attribute.index() method.
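
A sketch of the loop with both suggestions applied: the counter starts at 0
and the column is addressed by the int returned from Attribute.index(). It
assumes Weka 3.7 or later on the classpath; the class name, the method name
and the in-memory dataset are placeholders for illustration, standing in
for the result of DataSource.read(path).

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;

public class ReadColumnSketch {

    /** Collects the values of one numeric column, visiting every row from index 0. */
    static double[] columnValues(Instances dataset, String attributeName) {
        int col = dataset.attribute(attributeName).index(); // look the index up once
        double[] values = new double[dataset.numInstances()];
        for (int i = 0; i < dataset.numInstances(); i++) {  // start at 0 so the first row is included
            values[i] = dataset.instance(i).value(col);
        }
        return values;
    }

    public static void main(String[] args) {
        // Placeholder data; a real run would load the CSV file instead.
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("A"));
        Instances dataset = new Instances("demo", attributes, 3);
        for (double v : new double[] {10.0, 20.0, 30.0}) {
            dataset.add(new DenseInstance(1.0, new double[] {v}));
        }
        for (double v : columnValues(dataset, "A")) {
            System.out.println(v);
        }
    }
}
```

If only the raw numbers are needed, Instances.attributeToDoubleArray(int)
should return the same column in a single call.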

Cheers, Peter