Epitope prediction

Epitope prediction

Kintan_r29
Hello, I'm working on epitope prediction models. My input file stores the
data as string attributes, so very few classifiers work with it: most of them
do not accept strings and require nominal or numeric input. I therefore
converted my input file to nominal format, and most classifiers that accept
nominal input now work. The problem is that when I use those models to
predict from a protein sequence, they do not work, because the models were
trained on nominal input and I am providing string input at test time. Can
anyone suggest what I should do to run those models on my protein sequence,
or suggest a better way to use classifiers for epitope prediction?



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Epitope prediction

Gabriel Del Rio
Hi,

It is convenient to transform protein sequences into numeric attributes. There are different ways to do that, for instance, counting the frequency of each amino acid, or of each pair or triplet of amino acids, etc. I prefer to use chemical descriptors. ProtDCal is a good implementation for that purpose (https://protdcal.zmb.uni-due.de/). I believe the manual explains how to transform ProtDCal output so that it can be used as input for Weka.
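The simplest encoding mentioned above, counting how often each amino acid (or each pair or triplet) occurs, can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `kmer_frequencies`, the alphabet constant and the example peptide are made up for this sketch and are not part of ProtDCal or Weka.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids are used as attributes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=1):
    """Return the relative frequency of every possible k-mer.

    The keys are always the same 20**k k-mers in the same order, so every
    sequence is mapped to the same set of numeric attributes.
    """
    sequence = sequence.upper()
    windows = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    features = {}
    for letters in product(AMINO_ACIDS, repeat=k):
        kmer = "".join(letters)
        # Windows containing non-standard letters count towards the total
        # but do not get an attribute of their own.
        features[kmer] = counts[kmer] / total if total else 0.0
    return features


# Hypothetical toy peptide, used only to show the call.
single = kmer_frequencies("ACDAA", k=1)   # 20 attributes
pairs = kmer_frequencies("ACDAA", k=2)    # 400 attributes
print(single["A"], pairs["AC"])
```

Because the attribute set is fixed in advance, a sequence seen at prediction time yields exactly the same columns as the training sequences.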

I hope that helps.

Best,

Gabriel

In reply to the message from Kintan_r29 (<[hidden email]>) of Fri, 26 Mar 2021, 15:30.

Re: Epitope prediction

Eibe Frank
It may be possible to do something similar directly from WEKA by using the FilteredClassifier in conjunction with the StringToWordVector filter, with the CharacterNGramTokenizer as the string tokenizer in the StringToWordVector filter. Note that StringToWordVector also has a flag forcing it to output counts ("word" counts).

The FilteredClassifier can be used to apply a filter to both training and test data in a consistent way. It is generally best to use the FilteredClassifier when applying filters in WEKA to classification and regression problems.
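The idea behind this combination can be illustrated outside WEKA with a small plain-Python sketch. It is not WEKA code and does not reproduce WEKA's implementation: the class `NGramCountFilter`, the helper `char_ngrams` and the toy sequences are hypothetical names chosen for this example. The point it shows is that the n-gram vocabulary is learned from the training strings once and then reused unchanged for any string seen at prediction time.

```python
from collections import Counter


def char_ngrams(text, n_min, n_max):
    """Yield every character n-gram of length n_min..n_max in text."""
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]


class NGramCountFilter:
    """Turns strings into n-gram count vectors with a fixed vocabulary."""

    def __init__(self, n_min=1, n_max=2):
        self.n_min = n_min
        self.n_max = n_max
        self.vocabulary = []

    def fit(self, sequences):
        # The vocabulary (the list of attributes) comes from training data only.
        vocab = set()
        for seq in sequences:
            vocab.update(char_ngrams(seq, self.n_min, self.n_max))
        self.vocabulary = sorted(vocab)
        return self

    def transform(self, sequences):
        # n-grams that were never seen in training are silently ignored,
        # so every row has exactly len(self.vocabulary) columns.
        rows = []
        for seq in sequences:
            counts = Counter(char_ngrams(seq, self.n_min, self.n_max))
            rows.append([counts[gram] for gram in self.vocabulary])
        return rows


# Hypothetical toy data.
ngram_filter = NGramCountFilter().fit(["ACD", "CDA"])
train_rows = ngram_filter.transform(["ACD", "CDA"])
test_rows = ngram_filter.transform(["DAW"])
print(ngram_filter.vocabulary)
print(test_rows)
```

The test string contains "W", which never occurred in training, yet its row still has the same columns as the training rows. That is the consistency a mismatch between training and test data lacks.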

Cheers,
Eibe

In reply to the message from Gabriel Del Rio <[hidden email]> of Sat, Mar 27, 2021, 2:20 PM.

Re: Epitope prediction

Kintan_r29
Hello, thanks for your reply. As I mentioned in my previous query, I was
able to build models for my data. I'm using EpiT, an epitope toolkit for
epitope prediction. It has two parts: a model builder, which is the same as
WEKA's model builder, and a predictor, which takes raw data and a model and
then tries to use that model to predict epitopes. The model that I built does
not work in the predictor; it shows an error like "THE TRAINING DATA AND TEST
DATA DOESN'T MATCH". I converted my amino acid sequences to numeric form and
then built a model, but if I convert my input data in the predictor, it gives
me numbers that are not comprehensible. I would like a solution to this
problem. Thanks.




Re: Epitope prediction

Eibe Frank-3
As I said, the FilteredClassifier is normally the solution to this problem: it makes sure that the training data and the data used for prediction are in the same format before they are passed to the actual machine learning algorithm (for training and prediction, respectively). The EpiT tutorial also mentions the FilteredClassifier (in the context of processing data with string attributes using naive Bayes).

Make sure that all your preprocessing steps are defined using filters that are specified as arguments to FilteredClassifier (you can use nested FilteredClassifier objects or the MultiFilter to apply multiple filters).
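As a rough illustration of why all preprocessing should live inside the wrapper, here is a plain-Python sketch of the pattern. It is not WEKA or EpiT code. The classes `FilteredModel`, `Uppercase`, `ResidueCounts` and `OneNearestNeighbour`, as well as the toy sequences and labels, are hypothetical and exist only for this example.

```python
class Uppercase:
    """Stateless filter: normalises sequences to upper case."""

    def fit(self, sequences):
        return self

    def transform(self, sequences):
        return [s.upper() for s in sequences]


class ResidueCounts:
    """Stateful filter: learns its alphabet from the training data only."""

    def fit(self, sequences):
        self.alphabet = sorted(set("".join(sequences)))
        return self

    def transform(self, sequences):
        return [[s.count(ch) for ch in self.alphabet] for s in sequences]


class OneNearestNeighbour:
    """Toy classifier: returns the label of the closest training row."""

    def fit(self, rows, labels):
        self.rows, self.labels = rows, labels
        return self

    def predict(self, rows):
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            self.labels[min(range(len(self.rows)),
                            key=lambda i: dist(self.rows[i], row))]
            for row in rows
        ]


class FilteredModel:
    """Chain of filters plus a classifier, trained and applied as one unit."""

    def __init__(self, filters, classifier):
        self.filters = filters
        self.classifier = classifier

    def fit(self, raw_inputs, labels):
        data = raw_inputs
        for f in self.filters:
            data = f.fit(data).transform(data)  # filters are fitted here only
        self.classifier.fit(data, labels)
        return self

    def predict(self, raw_inputs):
        data = raw_inputs
        for f in self.filters:
            data = f.transform(data)  # fitted filters are reused, never refitted
        return self.classifier.predict(data)


# Hypothetical toy data: raw strings go in at both training and prediction time.
model = FilteredModel([Uppercase(), ResidueCounts()], OneNearestNeighbour())
model.fit(["aaac", "ccca"], ["epitope", "non-epitope"])
print(model.predict(["AACA", "cccc"]))
```

The caller only ever hands raw strings to the wrapped model, so there is no separate conversion step that could differ between training and prediction.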

Cheers,
Eibe


In reply to the message from Kintan_r29 <[hidden email]> of Mon, Apr 5, 2021, 1:44 PM.