Combining different data sets

Geet
Hi,

I am currently working with two data sets (a tabular data set and an image
data set) that share the same class attribute. The idea is to increase the
accuracy of class prediction by using different types of data for the same
class, e.g., using both the iris flower tabular data set and an iris flower
image data set to classify the iris species.

How can I combine the two data sets to predict the class? Can I take the
prediction results from each data set and create a third model that will
help in classification? Or can I extract features from the tabular data and
the image data, use these features to create another data set (of features),
and use that for classification? If so, how can I create an ARFF file with
the extracted features?

Please advise.



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Combining different data sets

Michael Hall


Predict both ways, probably with different classifiers. Save the predicted class probabilities, average them across the two methods, and pick the class with the highest average probability. Sort of a voting ensemble.
That might be one way.
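
The averaging step described above can be sketched in a few lines of Python. This is only an illustration, not WEKA code: the function name and the probability values are made up, and it assumes both models have scored the same specimen and report probabilities for the same class labels.

```python
def average_vote(prob_tabular, prob_image):
    """Average two class-probability dicts and return (best_class, averaged)."""
    classes = sorted(set(prob_tabular) | set(prob_image))
    averaged = {
        # A class missing from one model counts as probability 0 there.
        c: (prob_tabular.get(c, 0.0) + prob_image.get(c, 0.0)) / 2.0
        for c in classes
    }
    best = max(classes, key=lambda c: averaged[c])
    return best, averaged


# Hypothetical probabilities for one flower, one dict per model.
tabular = {"setosa": 0.25, "versicolor": 0.5, "virginica": 0.25}
image = {"setosa": 0.125, "versicolor": 0.25, "virginica": 0.625}

best, averaged = average_vote(tabular, image)
print(best, averaged[best])
```

Here the tabular model alone would pick versicolor, but the image model is confident enough about virginica to win the average.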

Re: Combining different data sets

Eibe Frank-2
Administrator
In reply to this post by Geet
You can use Vote as a meta classifier (or maybe Stacking) and use it to combine multiple FilteredClassifier objects. In each FilteredClassifier, use the Remove filter inside a MultiFilter to remove the attributes that you want to exclude from a particular view on the data. Subsequent filters in the MultiFilter can then be used to further process each view, e.g., filters from the imageFilters package.

Cheers,
Eibe
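
To make the idea of "views" concrete, here is a rough Python sketch of the arrangement suggested above: each wrapper keeps only its own columns of a shared table, trains its own model, and the per-view probabilities are averaged. It is not WEKA code. The class names, the toy nearest-centroid model (standing in for something like J48), and the data are all hypothetical, and the image features are assumed to be numeric columns already rather than being computed from an image file name by a filter.

```python
import math


class CentroidClassifier:
    """Toy nearest-centroid model standing in for a real classifier."""

    def fit(self, rows, labels):
        sums, counts = {}, {}
        for row, label in zip(rows, labels):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc]
            for label, acc in sums.items()
        }
        return self

    def predict_proba(self, row):
        # Softmax over negative distances to each class centroid.
        scores = {
            label: math.exp(-math.dist(row, centroid))
            for label, centroid in self.centroids.items()
        }
        total = sum(scores.values())
        return {label: s / total for label, s in scores.items()}


class ViewClassifier:
    """Keeps only the given columns, then delegates to a base model."""

    def __init__(self, columns):
        self.columns = columns
        self.model = CentroidClassifier()

    def _view(self, row):
        return [row[i] for i in self.columns]

    def fit(self, rows, labels):
        self.model.fit([self._view(r) for r in rows], labels)
        return self

    def predict_proba(self, row):
        return self.model.predict_proba(self._view(row))


def vote(views, row):
    """Average the per-view probabilities and return the most probable class."""
    probs = [v.predict_proba(row) for v in views]
    averaged = {c: sum(p[c] for p in probs) / len(probs) for c in probs[0]}
    return max(averaged, key=averaged.get)


# Hypothetical table: columns 0-1 are measurements, 2-3 are image features.
rows = [[1.4, 0.2, 0.9, 0.1], [1.5, 0.3, 0.8, 0.2],
        [4.7, 1.4, 0.2, 0.7], [4.5, 1.5, 0.3, 0.8]]
labels = ["setosa", "setosa", "versicolor", "versicolor"]

views = [ViewClassifier([0, 1]).fit(rows, labels),
         ViewClassifier([2, 3]).fit(rows, labels)]
prediction = vote(views, [1.3, 0.2, 0.85, 0.15])
print(prediction)
```

Note that every row carries both kinds of attributes, so each view is trained and queried on the same instances. Averaging is just one possible combination rule.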


Re: Combining different data sets

Geet
Hi Eibe,

Thank you for the response. I tried this method and ran J48 on the iris
flower data set (sepal length, sepal width, petal length, and petal width).
I also ran J48 on iris flower images (after generating several features
using an image filter) and saved both models. But when I try to run Vote
using the pre-built filtered classifiers (model 1 and model 2), the message
displayed is that model 1 was trained with data that has a different
structure from the incoming training data. How can this be resolved? Is
there any other way of combining classifiers trained on different types of
data sets?




Re: Combining different data sets

Eibe Frank-2
Administrator
My assumption was that you have both images and other attributes for the *same* instance. It sounds like you are using the Iris image classification dataset from the WEKA project site (https://downloads.sourceforge.net/weka/iris_reloaded.zip) and Fisher’s iris data to build the two models? There is no direct correspondence between the specimens in the two datasets so the approach I suggested cannot be applied. Basically, what I suggested still assumes that you have a single table of information, where each row has numeric attributes as well as a string attribute giving the name of the corresponding image file.
The one thing you could do is use missing values to make a single table that has the above form:

?,5.1,3.5,1.4,0.2,Iris-setosa
...
iris-setosa/42777301.600x600.png,?,?,?,?,iris-setosa
...

However, I can see no benefit in doing this unless you also have some instances where the rows of data are complete.
Cheers,
Eibe
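
For the earlier question about creating an ARFF file, the single-table layout shown above could be written out roughly as follows. This is a minimal Python sketch under several assumptions: the function name, relation name, and attribute names are illustrative, the class label uses one spelling for both sources on the assumption that they denote the same class, and values containing spaces or commas (which ARFF requires to be quoted) are not handled.

```python
def to_arff(relation, rows, class_values):
    """Render rows of (filename, 4 measurements, label) as ARFF text."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute filename string",
        "@attribute sepallength numeric",
        "@attribute sepalwidth numeric",
        "@attribute petallength numeric",
        "@attribute petalwidth numeric",
        "@attribute class {" + ",".join(class_values) + "}",
        "",
        "@data",
    ]
    for row in rows:
        # None marks a missing value, which ARFF writes as '?'.
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


rows = [
    # A row from the tabular data: no image available.
    (None, 5.1, 3.5, 1.4, 0.2, "iris-setosa"),
    # A row from the image data: no measurements available.
    ("iris-setosa/42777301.600x600.png", None, None, None, None, "iris-setosa"),
]
arff = to_arff("iris_combined", rows,
               ["iris-setosa", "iris-versicolor", "iris-virginica"])
print(arff)
```

As noted above, a table like this only helps if at least some rows have both the image and the measurements filled in.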


Re: Combining different data sets

M Peter Jurkat
got it - thanks - p

Martin Peter Jurkat
Lecturer II

Anderson School of Management

University of New Mexico
[hidden email]


