Forecasting with WEKA

Forecasting with WEKA

JC
I would like to know how I can process data in the Forecast panel.

I have the following dataset:

@relation csv1000filas

@attribute IDPatient numeric
@attribute Date DATE "dd-MM-yyyy HH:mm:ss"
@attribute Temp numeric
@attribute SPO2min numeric
@attribute SPO2max numeric
@attribute BPMmin numeric
@attribute BPMmax numeric
@attribute BPMavg numeric
@attribute Sys numeric
@attribute Dia numeric
@attribute EDAmin numeric
@attribute EDAmax numeric
@attribute EDAavg numeric
@attribute Disease numeric

@data
1,'13-12-2019 17:10:00',34.27,95.5,98.02,69,74,71.42,141,91,4.67,4.94,4.78,1
1,'25-12-2019 16:54:00',34.22,96.45,100,69,73,71.76,159,85,1.31,1.39,1.36,1
1,'24-01-2020 20:30:00',35.1,94.26,99.68,64,68,66.88,171,86,1.73,1.82,1.78,1
1,'25-01-2020 18:44:00',34.93,93.98,96.75,72,76,74.24,228,124,3.25,3.43,3.35,1
1,'29-01-2020 20:58:00',34.94,94.96,97.9,74,77,75.7,169,84,1.77,1.85,1.81,1
1,'11-02-2020 17:33:00',35.3,94.16,98.13,69,72,71.5,168,79,9.31,9.91,9.5,1
1,'12-02-2020 16:00:00',34.88,94.39,99.84,73,77,75.72,165,72,5.82,6.64,6.08,1
1,'13-02-2020 17:08:00',35.48,94.95,97.43,69,77,73.79,125,73,5.7,6.4,6.02,1

For example, these cases refer to patient 1.
*What I am trying to do is make predictions on test data, that is, predict the
"Disease" attribute of a new instance.* To do this, a new instance is
introduced and compared with the evolution over time of the other instances.
How could I do this? I have tried it with *SVO* or *LinearRegression*, but the
*graph and evolution are not shown*.

I am doing this in the WEKA Explorer, not in code.

Thanks in advance!



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Forecasting with WEKA

Eibe Frank-3
To predict "Disease", you first need to tick this attribute under "Target Selection". Then, if you click "Start", a model will be built from the entire dataset by creating lag attributes based on the target attribute, adding remapped attributes based on the date, and adding combinations of lag variables and the remapped date.
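
As a rough illustration of what "creating lag attributes" means, the following Java sketch turns a single target series into rows of the form [lag 1, ..., lag k, target]. It is not the code the Forecast panel runs: the class name LagFeatures, the method lagged, and the choice of two lags are assumptions made for this example, and the Sys values from the posted data serve only as sample input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LagFeatures {

    /**
     * Builds one row per time step t >= maxLag.
     * Row layout: [y(t-1), y(t-2), ..., y(t-maxLag), y(t)].
     */
    static List<double[]> lagged(double[] series, int maxLag) {
        if (maxLag < 1) {
            throw new IllegalArgumentException("maxLag must be >= 1");
        }
        List<double[]> rows = new ArrayList<>();
        for (int t = maxLag; t < series.length; t++) {
            double[] row = new double[maxLag + 1];
            for (int lag = 1; lag <= maxLag; lag++) {
                row[lag - 1] = series[t - lag]; // value observed `lag` steps earlier
            }
            row[maxLag] = series[t];            // value to be predicted
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sys column of the eight rows posted for patient 1.
        double[] sys = {141, 159, 171, 228, 169, 168, 165, 125};
        for (double[] row : lagged(sys, 2)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The first maxLag time steps have no complete lag history, so the lagged table has fewer rows than the original series. With only eight observations, very little data is left for fitting a model.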

The default model is a linear regression model. Note that LinearRegression in WEKA, if you do not reconfigure it, performs attribute selection and tries to eliminate highly correlated attributes. Thus, you may end up with a "null model" that simply predicts the mean target value, which happens with the data you have posted.

The "Output" tab will show you the model built from the entire training set. It will also show you the "predictions" of the target attribute for the training set and a prediction obtained by moving one time step into the future. This latter value is the only actual prediction (in the default mode of the Forecast panel). It is marked with an asterisk. The other values are not meaningful (as predictions) because the actual target data for those time steps was used for training the model!

If you select the "Train future pred." tab, it will show you a plot of the "predicted" values. The actual prediction (i.e., the last point) will be plotted as a little circle. The predictions on the training data will be plotted as little squares.

To predict further into the future, you can increment the "Number of time units to forecast".
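
A model built on lag attributes can only look more than one step ahead by reusing its own earlier forecasts as lag values. The Java sketch below shows that general idea under stated assumptions: it is not a description of the Forecast panel's internals, the class RecursiveForecast and its forecast method are hypothetical names, and the "model" is a placeholder that averages the last two values.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

final class RecursiveForecast {

    /**
     * Forecasts `steps` future values. `oneStep` receives the most recent
     * `window` values (oldest first) and returns the next value. Each forecast
     * is appended to the series so that later steps can use it as a lag value.
     */
    static double[] forecast(double[] history, int window, int steps,
                             ToDoubleFunction<double[]> oneStep) {
        if (window < 1 || history.length < window) {
            throw new IllegalArgumentException("need at least `window` past values");
        }
        double[] extended = Arrays.copyOf(history, history.length + steps);
        for (int i = 0; i < steps; i++) {
            int end = history.length + i;
            double[] recent = Arrays.copyOfRange(extended, end - window, end);
            extended[end] = oneStep.applyAsDouble(recent);
        }
        return Arrays.copyOfRange(extended, history.length, extended.length);
    }

    public static void main(String[] args) {
        // Sys column of the eight rows posted for patient 1.
        double[] sys = {141, 159, 171, 228, 169, 168, 165, 125};
        // Placeholder one-step model (an assumption): mean of the window.
        ToDoubleFunction<double[]> meanOfWindow =
                w -> Arrays.stream(w).average().orElse(Double.NaN);
        System.out.println(Arrays.toString(forecast(sys, 2, 3, meanOfWindow)));
    }
}
```

Because every step after the first depends on forecasts rather than observations, errors can accumulate as the horizon grows.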

Cheers,
Eibe

On Sun, Dec 22, 2019 at 12:06 AM JC <[hidden email]> wrote:
Re: Forecasting with WEKA

JC
Thank you Eibe for your response!

I am trying it, but I can't get the future predictions. This is what I get:

Date                   Disease
28-09-2018 08:25:56    2
29-09-2018 07:25:51    2
30-09-2018 06:25:46    2
01-10-2018 05:25:41    3
02-10-2018 04:25:35    2
03-10-2018 03:25:30    3
04-10-2018 02:25:25    3
05-10-2018 01:25:20    3
06-10-2018 00:25:14    3
06-10-2018 23:25:09    3
07-10-2018 22:25:04    3
08-10-2018 21:24:59    2


But my last question is this: how can I introduce a new patient, for example
with IDPatient 95, who has, say, 7 instances? That is, how can I compare
these 7 instances with the model as if they were a single instance?

*In the end, what I am trying to do is compare, for example, the 7 instances
of one patient with all the instances of the different patients in the model
and try to predict which disease the new patient may develop*.

Thanks in advance!



Re: Forecasting with WEKA

Eibe Frank
It seems you want to do time series classification, i.e., assign a label to an entire time series. The forecasting facility in WEKA is not the right tool for that.

There are basically three options to perform time series classification in WEKA. Regardless of which option you choose, you will need to represent your individual time series as values of a relation-valued attribute. The easiest way to do this is to apply the PropositionalToMultiInstance filter from the multiInstanceFilters package. It expects the bag identifier (i.e., the time series identifier) to be a nominal attribute. Assuming your data (in the format you gave in your message) is stored in a file called temp.arff, the following command line will transform it into relation-valued format:

java -cp weka-3-8-4/weka.jar weka.Run .MultiFilter -i temp.arff -F ".NumericToNominal -R first,last" -F ".PropositionalToMultiInstance -no-weights" -o temp.rel.arff

In the resulting temp.rel.arff file, each row (i.e., instance) will contain an entire time series. (Note that you will need at least two distinct class values for this command to work.)
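
Conceptually, the transformation groups all rows that share a series identifier into one "bag" with a single class label. The Java sketch below illustrates only that grouping idea and is not the filter's implementation: the class SeriesBags, the nested Bag type, and the reduced row layout [seriesId, Temp, Sys, label] are assumptions, the values for patient 95 are invented for illustration, and rejecting conflicting labels is a choice made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SeriesBags {

    /** All measurement rows of one time series plus its single class label. */
    static final class Bag {
        final int seriesId;
        final int label;
        final List<double[]> rows = new ArrayList<>();

        Bag(int seriesId, int label) {
            this.seriesId = seriesId;
            this.label = label;
        }
    }

    /**
     * Groups flat rows of the form [seriesId, feature..., label] by seriesId.
     * Insertion order of the series is preserved.
     */
    static Map<Integer, Bag> group(double[][] flat) {
        Map<Integer, Bag> bags = new LinkedHashMap<>();
        for (double[] r : flat) {
            int id = (int) r[0];
            int label = (int) r[r.length - 1];
            Bag bag = bags.computeIfAbsent(id, k -> new Bag(k, label));
            if (bag.label != label) {
                throw new IllegalArgumentException("series " + id + " has conflicting labels");
            }
            // Keep only the measurements; id and label live on the bag itself.
            bag.rows.add(Arrays.copyOfRange(r, 1, r.length - 1));
        }
        return bags;
    }

    public static void main(String[] args) {
        double[][] flat = {
            {1, 34.27, 141, 1},   // patient 1, from the posted data (Temp, Sys)
            {1, 34.22, 159, 1},
            {95, 36.0, 120, 2},   // hypothetical patient 95, invented values
        };
        for (Bag b : group(flat).values()) {
            System.out.println("series " + b.seriesId + ": " + b.rows.size()
                    + " rows, class " + b.label);
        }
    }
}
```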

The hidden Markov model from the HMM package can be applied to this data for time series classification in principle (but it is a bit numerically fragile). That is the first option:

java -cp weka-3-8-4/weka.jar weka.Run .HMM -t temp.rel.arff

The second option is to use a recurrent neural network by installing the wekaDeeplearning4j package and applying the network after deletion of the sequence ID attribute:

java -cp weka-3-8-4/weka.jar weka.Run .FilteredClassifier -t temp.rel.arff -F ".Remove -R 1" -W .RnnSequenceClassifier -- -iterator ".RelationalInstanceIterator"

You will need to configure the layers of the network and other hyperparameters to get reasonable results with this approach. This command line only shows the absolutely essential bits to get a network going.

The last option to process time series for classification (or regression) is to use the SAXTransformer filter in the timeSeriesFilters package, followed by StringToWordVector (or similar). However, this method currently only works for univariate time series. Your time series are multivariate because you have multiple attributes describing each point in time.
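
For readers unfamiliar with SAX (symbolic aggregate approximation), the Java sketch below shows the textbook idea on one univariate series: z-normalize, average over equal-width segments, then map each average to a letter. It is not the SAXTransformer filter's code and its options may differ; the class SaxSketch, the method toWord, and the fixed four-letter alphabet are assumptions made for this example, and the BPMavg values from the posted data serve only as sample input.

```java
final class SaxSketch {

    // Cut points that split a standard normal distribution into four
    // regions of equal probability (alphabet a, b, c, d).
    private static final double[] BREAKPOINTS = {-0.6745, 0.0, 0.6745};

    /** Converts a series into a SAX word with one letter per segment. */
    static String toWord(double[] series, int segments) {
        int n = series.length;
        if (segments < 1 || segments > n) {
            throw new IllegalArgumentException("segments must be in [1, series.length]");
        }
        double mean = 0;
        for (double v : series) mean += v;
        mean /= n;
        double var = 0;
        for (double v : series) var += (v - mean) * (v - mean);
        double sd = Math.sqrt(var / n);

        StringBuilder word = new StringBuilder();
        for (int s = 0; s < segments; s++) {
            int from = s * n / segments;
            int to = (s + 1) * n / segments; // exclusive
            double sum = 0;
            for (int i = from; i < to; i++) {
                // A constant series has sd == 0; treat every point as 0.
                sum += sd == 0 ? 0 : (series[i] - mean) / sd;
            }
            double avg = sum / (to - from);
            int symbol = 0;
            while (symbol < BREAKPOINTS.length && avg > BREAKPOINTS[symbol]) {
                symbol++;
            }
            word.append((char) ('a' + symbol));
        }
        return word.toString();
    }

    public static void main(String[] args) {
        // BPMavg column of the eight rows posted for patient 1.
        double[] bpmAvg = {71.42, 71.76, 66.88, 74.24, 75.7, 71.5, 75.72, 73.79};
        System.out.println(toWord(bpmAvg, 4));
    }
}
```

The resulting words are what a bag-of-words step such as StringToWordVector would then count. The sketch handles a single attribute only, which is why it does not directly apply to multivariate data.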

Cheers,
Eibe

On Mon, Dec 23, 2019 at 12:15 AM JC <[hidden email]> wrote: