Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Hello, I have been having problems loading a .csv file into Weka.
The file has 7 columns and 469579 rows.
The column headers are: TARGET ID (number), TITLE (text), DETAIL (text), DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS (nominal).
I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the line breaks. When I try to load the file into Weka I get the error: wrong number of values. Read 2, expected 7, read Token[EOL], line 789.
I have tried everything with no success. What I'm doing is opening the CSV file, going to line 789, deleting that row, saving the file, and trying again. Then I get the same error, but now at line 1200, for example.
I'm sure I can solve this problem by following this logic, but it is going to take a while. Do you have any idea what I'm doing wrong? I just want to put out there that I'm a newbie in all of this, so please bear with me. Thanks in advance.
Regards.
Ivan.

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
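
Deleting one offending row per load attempt is slow, so it can help to list every suspicious line in a single scan first. The following is a rough sketch, not Weka's own parsing logic. The function name list_suspect_lines and the file name data.csv are made up for this example, and the check is deliberately crude: it flags lines with an odd number of double quotes and, for lines without any quotes, lines whose comma count does not give the expected number of fields. Lines with balanced quotes are skipped, because a plain comma count would be misleading there.

```shell
#!/bin/sh
# list_suspect_lines FILE [EXPECTED_FIELDS]
# Hypothetical helper: prints "LINE: reason" for each line that looks broken.
# EXPECTED_FIELDS defaults to 7, the column count mentioned in the post.
list_suspect_lines() {
    awk -v expected="${2:-7}" '
    {
        line = $0
        # gsub returns the number of replacements, i.e. the number of quotes
        quotes = gsub(/"/, "&", line)
        if (quotes % 2 == 1) {
            printf "%d: unbalanced double quote\n", NR
        } else if (quotes == 0) {
            # No quoted fields on this line, so counting commas is safe
            fields = split($0, parts, ",")
            if (fields != expected)
                printf "%d: %d fields, expected %d\n", NR, fields, expected
        }
    }' "$1"
}

# Example call (data.csv is a placeholder name):
#   list_suspect_lines data.csv 7
```

A record whose quoted field runs over several physical lines shows up as an unbalanced quote on both its first and its last line, which makes such records easy to spot in the report.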

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Peter Reutemann
Instead of using comma as column separator, try using tab. That tends
to solve a lot of problems with non-escaped single/double quotes and
commas.

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Hello, and thanks for your reply, Peter. I changed the file to tab-delimited and I keep getting the same error: wrong number of values at line xxxx. When I delete that line, I get the same error at a different line. I don't understand.
Regards.
Ivan.

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-2
Administrator
Can you share one of the lines of data that give you problems?

Cheers,
Eibe
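
For a file this large, opening it in an editor just to copy one line is awkward. A small sketch for pulling out a single line by number follows. It assumes the line number in the error message corresponds to the physical line in the file, and the names show_line and data.csv are made up for this example.

```shell
#!/bin/sh
# show_line FILE N
# Hypothetical helper: print only line N of FILE, then stop reading.
show_line() {
    sed -n "${2}{p;q;}" "$1"
}

# Example calls (data.csv is a placeholder name):
#   show_line data.csv 789
#   show_line data.csv 789 | od -c    # make tabs, CRs and odd bytes visible
```

Piping the line through od -c is useful here because stray carriage returns or non-printable bytes do not show up when the line is simply printed to a terminal.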

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.InvocationTargetException
When I change the file to tab-delimited, some lines get split in two, which gives me the wrong number of values error.
When I have the file as CSV and look at the lines with problems, sometimes they are missing commas and sometimes they look like this:

line 6077

517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX

line 28105

4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX

So my strategy is to go to each offending line, delete it, and try again, over and over, and it's taking so much work. Thanks in advance.
Regards.
Ivan.

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-2
Administrator
I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace

  , "

by

  ,"


Not sure whether this is by design or due to a bug in the CSV loader.

Cheers,
Eibe
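
The replacement described above can be applied to a whole file in one go. This is a minimal sketch, assuming that every comma followed by spaces and a double quote marks the start of a quoted field. The function name strip_space_before_quote and the file names are made up for this example. A comma, space and quote sequence inside a field's text would be changed as well, so the result is worth checking.

```shell
#!/bin/sh
# strip_space_before_quote [FILE...]
# Hypothetical helper: turn `, "` (comma, one or more spaces, quote) into `,"`.
# Reads the named files, or standard input when no file is given.
strip_space_before_quote() {
    sed 's/,  *"/,"/g' "$@"
}

# Example call (file names are placeholders):
#   strip_space_before_quote data.csv > data_fixed.csv
```

Spaces that follow a comma but are not followed by a quote are left alone, so unquoted values keep their leading spaces.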

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Well, here is the full file (buiigcjoin1.csv). If you have time to look at it, I'd appreciate it. In the meantime I will go back to deleting lines one by one. Thanks for everything.
Regards.
Ivan.

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-2
Administrator
Out of interest, and to test the CSVReader, I have gone through the process of converting your file.

There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:

cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv

Here is some info on what the components of this command-line do:

LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)

tr -d '\r': removes part of DOS line terminator so that only \n remains

gsed '/^$/d': deletes empty lines

gsed "s/\"\"/'/g": replaces "" by '

gsed "s/\"$//g": removes " at end of line

gsed "s/^\"//g": removes " at start of line

gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way)
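
The join step is the least obvious part of the command, so here is a small sketch of it with one sed command per line. It is a variant, not the exact command above: the original substitution s/\n[^0-9]// also consumes the first character of the continuation line, whereas this version keeps that character and puts a space where the line break was. The function name join_pass is made up for this example, and GNU sed is assumed.

```shell
#!/bin/sh
# join_pass [FILE...]
# Hypothetical helper: one pass that joins a line with the following line
# when the following line does not start with a digit.
join_pass() {
    sed '$!{
        # Append the next input line to the pattern space
        N
        # If the appended line starts with a non-digit, replace the line
        # break with a space and keep that first character
        s/\n\([^0-9]\)/ \1/
        # If the substitution happened, jump to label y (joined line is printed)
        ty
        # Otherwise print the first line, delete it, and restart the cycle
        # with the second line still in the pattern space
        P
        D
        :y
    }' "$@"
}

# Example call (file names are placeholders):
#   join_pass < in.csv | join_pass > out.csv
```

Each pass joins a line with at most one continuation line, so a record spread over three physical lines needs two passes. That is presumably why the command above repeats the step three times.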

I've also copied the processed CSV file to

  http://www.cs.waikato.ac.nz/~eibe/temp.csv

Please let me know when you have downloaded it so that I can delete it.

Cheers,
Eibe

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Thanks, I downloaded it. I installed Weka 3.8.1, and when I try to choose a classifier algorithm there are none. I uninstalled it and went back to 3.8, and all the algorithms are there. It's weird. Also, how can I do what you did to other CSV files? Do I just copy and paste your code in OS X? Thanks very much for everything, I really appreciate it, and sorry for not understanding anything.
Regards. 
Ivan Ruiz. 
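
One way to reuse the command from this thread on other files is to wrap it in a small shell function that takes the input and output file names. This is only a sketch under assumptions: the names clean_csv and join_pass are made up here, GNU sed is assumed (gsed on OS X as in the thread, plain sed elsewhere), and the steps only make sense for files where every record starts with a digit and unterminated "" pairs have already been fixed by hand, as they were for the file in this thread.

```shell
#!/bin/sh
# Use gsed if present (GNU sed on OS X), otherwise assume sed is GNU sed.
SED=$(command -v gsed || command -v sed)

# One join pass: append the next line when it does not start with a digit.
# As in the thread's command, the first character of that line is consumed.
join_pass() {
    "$SED" '$!{
        N
        s/\n[^0-9]//
        ty
        P
        D
        :y
    }'
}

# clean_csv INPUT OUTPUT
# Stages, in order, mirroring the command from the thread:
#   1. keep only tab, LF, CR and printable ASCII
#   2. drop CR so that only LF line endings remain
#   3. delete empty lines
#   4. replace "" by a single quote character
#   5. remove a double quote at the end of a line
#   6. remove a double quote at the start of a line
#   7. three join passes for records spread over several lines
clean_csv() {
    LANG=C tr -cd '\11\12\15\40-\176' < "$1" |
        tr -d '\r' |
        "$SED" '/^$/d' |
        "$SED" "s/\"\"/'/g" |
        "$SED" 's/"$//' |
        "$SED" 's/^"//' |
        join_pass | join_pass | join_pass > "$2"
}

# Example call (file names are placeholders):
#   clean_csv other.csv other_clean.csv
```

Saving this in a file and sourcing it in a terminal makes clean_csv available for any input file. The output is worth checking before loading it into WEKA, because the cleaning steps were chosen for one particular file.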

On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
Out of interest, and to test the CSVReader, I have gone through the process of converting your file.

There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:

cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv

Here is some info on what the components of this command-line do:

LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)

tr -d '\r': removes part of DOS line terminator so that only \n remains

gsed '/^$/d': deletes empty lines

gsed "s/\"\"/'/g": replaces "" by '

gsed "s/\"$//g": removes " at end of line

gsed "s/^\"//g": removes " at start of line

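gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way). Note that the substitution also deletes the first character of B, and that one pass joins a line with at most one following line, which is why this step appears three times in the command.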
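
For applying the same clean-up to other CSV files, the steps listed above could be wrapped in a small shell function. This is only a sketch under assumptions: the name clean_csv is made up, plain sed and awk stand in for gsed so that it also runs without GNU sed installed, and a single awk step replaces the three repeated gsed join passes (it appends every line that does not start with a digit to the row before it, in one pass and without dropping a character). It still relies on every real row starting with a digit, which held for this file but may not hold for others.

```shell
#!/bin/sh
# clean_csv (hypothetical name): reads a messy CSV on stdin and writes the
# cleaned CSV to stdout. Steps, in pipeline order:
#   1. keep only tab, LF, CR and printable ASCII (octal 40-176)
#   2. drop CR so that DOS line endings become plain LF
#   3. delete empty lines, turn "" into ', strip a quote at line end and start
#   4. start a new output row only when a line begins with a digit; any other
#      line is appended to the row before it
clean_csv() {
  LC_ALL=C tr -cd '\11\12\15\40-\176' |
    tr -d '\r' |
    sed -e '/^$/d' -e "s/\"\"/'/g" -e 's/"$//' -e 's/^"//' |
    awk 'NR > 1 && /^[0-9]/ { printf "\n" }
         { printf "%s", $0 }
         END { if (NR > 0) printf "\n" }'
}
```

It would be called as clean_csv < buiigcjoin1.csv > temp.csv. Unterminated quotes still have to be fixed by hand first, as described above.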

I've also copied the processed CSV file to

  http://www.cs.waikato.ac.nz/~eibe/temp.csv

Please let me know when you have downloaded it so that I can delete it.

Cheers,
Eibe

> On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
>
> Well, here is the full file. If you have time to look at it, I'd appreciate it. Meanwhile I will go back to deleting the lines one by one. Thanks for everything.
> Regards.
> Ivan.
>  buiigcjoin1.csv
>
> 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
> I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
>
>   , "
>
> by
>
>   ,"
>
>
> Not sure whether this is by design or due to a bug in the CSV loader.
>
> Cheers,
> Eibe
>
> > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
> >
> > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.InvocationTargetException
> > When I change the file to tab-delimited, some lines get split in two, which gives me the wrong number of values error.
> > When I keep the file as CSV and look at the lines with problems, sometimes they are missing commas and sometimes they look like this:
> >
> > line 6077
> >
> > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
> >
> > line 28105
> >
> > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
> >
> > So my strategy is to go to each problem line, delete it and try again, over and over, and it's taking so much work. Thanks in advance.
> > Regards.
> > Ivan.
> >
> > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > Can you share one of the lines of data that give you problems?
> >
> > Cheers,
> > Eibe
> >
> > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
> > >
> > > Hello and thanks for your reply, Peter. I changed the file to tab-delimited and I keep getting the same error: wrong number of values at line xxxx. When I go and delete that line, I get the same error at a different line. I don't understand.
> > > Regards.
> > > Ivan.
> > >
> > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
> > > Instead of using comma as column separator, try using tab. That tends
> > > to solve a lot of problems with non-escaped single/double quotes and
> > > commas.
> > >
> > > Cheers, Peter
> > > --
> > > Peter Reutemann
> > > Dept. of Computer Science
> > > University of Waikato, NZ
> > > +64 (7) 858-5174
> > > http://www.cms.waikato.ac.nz/~fracpete/
> > > http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-2
Administrator
Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.

If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.

Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.
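
A minimal sketch of the R route, run from a shell. It uses read.csv together with write.arff from R's foreign package rather than rio (a swapped-in choice, since rio was not tried here), and the file names temp.csv and temp.arff are placeholders. Whether read.csv copes with a badly quoted file is not guaranteed, so this assumes the CSV has already been cleaned.

```shell
# Assumptions: R with the "foreign" package is installed, and temp.csv is an
# already cleaned CSV file with a header row. Both file names are placeholders.
if command -v Rscript >/dev/null 2>&1 && [ -f temp.csv ]; then
  # read.csv parses the CSV; foreign::write.arff writes the data frame as ARFF.
  Rscript -e 'd <- read.csv("temp.csv"); foreign::write.arff(d, file = "temp.arff")'
else
  echo "Rscript or temp.csv not found; nothing converted." >&2
fi
```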

Cheers,
Eibe

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Thanks for everything. Now I have new problems. First I filter my data from nominal to string and then from string to word vector. After that I try to classify it using random forests. It started working, and building the model took two and a half days, but when it was supposed to do the 10-fold cross-validation it just stopped. The bird stopped moving, and WEKA doesn't give me any message. What am I doing wrong? Thanks in advance for your time.
Regards.
Ivan Ruiz.

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-3
There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
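
As a rough sketch of the heap-space suggestion: the value 4g below is only a placeholder (an assumption, to be chosen below the machine's physical RAM), and the path to weka.jar in the comment is hypothetical.

```shell
# Assumption: 4g is a placeholder; pick a value that fits your machine.
export _JAVA_OPTIONS="-Xmx4g"
echo "JVM options for this shell: $_JAVA_OPTIONS"

# Start WEKA from this same shell so that it inherits the setting, e.g.
#   java -jar /path/to/weka.jar        (path is hypothetical)
# If a run stalls again, the end of the log may show what happened:
#   tail -n 50 "$HOME/wekafiles/weka.log"
```

A WEKA instance started from the Finder or a desktop shortcut would not necessarily see a variable exported in a terminal, which is why the sketch launches it from the same shell.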

Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.

The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.
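
To get such an initial result without the GUI, something along these lines might work from a shell. It is a sketch under assumptions: weka.jar and temp.arff are placeholder paths, the ARFF file is assumed to hold the text as string attributes with the nominal class as the last attribute, and -t (training file) and -x (number of cross-validation folds) are WEKA's general evaluation options.

```shell
# Placeholder paths (assumptions); override them via the environment if needed.
WEKA_JAR="${WEKA_JAR:-weka.jar}"
DATA="${DATA:-temp.arff}"

if [ -f "$WEKA_JAR" ] && [ -f "$DATA" ]; then
  # 10-fold cross-validation with the classifier that works on string attributes.
  java -cp "$WEKA_JAR" weka.classifiers.bayes.NaiveBayesMultinomialText -t "$DATA" -x 10
else
  echo "Set WEKA_JAR and DATA to existing files first." >&2
fi
```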

Cheers,
Eibe

On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
Thanks for everything. Now I have new problems. First I filter my data to make it nominal to string and then string to word vector. After that I try to classify it using random forests it started working and building the model it took 2 and a half days and when it's supposed to do the 10 fold cross validation it just stops. The bird stopped moving and everything and weka doesn't give me any message or anything. What am I doing wrong? Thanks in advance for your time. 
Regards. 
Ivan Ruiz. 

On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.

If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.

Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.

Cheers,
Eibe

> On 21 Dec 2016, at 21:02, Ivan Ruiz <[hidden email]> wrote:
>
> Thanks I downloaded it. I installed weka 3.8.1 and when I try to choose a classifier algorithm there is none. I uninstalled and go back to 3.8 and all the algorithms are there. It's weird. Also how can I do what you did to other csv files? Just copy paste your code in os x? Thanks very much for everything I really appreciated it and sorry for not understand anything.
> Regards.
> Ivan Ruiz.
>
> On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
> Out of interest, and to test the CSVReader, I have gone through the process of converting your file.
>
> There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:
>
> cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv
>
> Here is some info on what the components of this command-line do:
>
> LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)
>
> tr -d '\r': removes part of DOS line terminator so that only \n remains
>
> gsed '/^$/d': deletes empty lines
>
> gsed "s/\"\"/'/g": replaces "" by '
>
> gsed "s/\"$//g": removes " at end of line
>
> gsed "s/^\"//g": removes " at start of line
>
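> gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way). Note that the substitution also removes the first character of B, and that each pass merges a line only with the line immediately following it, which is presumably why the command is applied three times in the pipeline.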
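
The joining step can also be sketched with awk in a single pass. This is an alternative under the same assumption (every instance starts with a digit), not the command used above; the function name join_records and the choice to join with a single space are illustrative assumptions.

```shell
# join_records: read lines on stdin and append every line that does not
# start with a digit to the previous line, separated by one space, so that
# each output line holds one complete instance.
join_records() {
  awk '
    # A line starting with a digit begins a new instance: flush the old one.
    /^[0-9]/ { if (NR > 1) print rec; rec = $0; next }
    # Any other line is a continuation (or a header if it is line 1).
    { rec = (NR == 1) ? $0 : rec " " $0 }
    # Flush the last instance, unless the input was empty.
    END { if (NR > 0) print rec }
  '
}
```

Possible usage: tr -d '\r' < buiigcjoin1.csv | join_records > joined.csv (joined.csv is a hypothetical output name). Unlike the sed version, this keeps the first character of each continuation line and handles any number of continuation lines per instance in one pass.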
>
> I've also copied the processed CSV file to
>
>   http://www.cs.waikato.ac.nz/~eibe/temp.csv
>
> Please let me know when you have downloaded it so that I can delete it.
>
> Cheers,
> Eibe
>
> > On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
> >
> > Well, here is the full file. If you have time to look at it I'd appreciate it. Meanwhile I will go back to deleting the lines one by one. Thanks for everything.
> > Regards.
> > Ivan.
> >  buiigcjoin1.csv
> >
> >
> > 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
> >
> >   , "
> >
> > by
> >
> >   ,"
> >
> >
> > Not sure whether this is by design or due to a bug in the CSV loader.
> >
> > Cheers,
> > Eibe
> >
> > > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
> > >
> > > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.InvocationTargetException
> > > When I change the file to tab delimited, some lines get divided in two, which gives me the wrong number of values error.
> > > When I have the file in csv and look at the lines with the problems, sometimes they are missing commas and sometimes they look like this:
> > >
> > > line 6077
> > >
> > > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
> > >
> > > line 28105
> > >
> > > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
> > >
> > > So my strategy is to go to each reported line, delete it, and try again, over and over, and it's taking so much work. Thanks in advance.
> > > Regards.
> > > Ivan.
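
Instead of fixing one reported line at a time, all suspicious lines can be listed in one go by counting fields. A minimal sketch, assuming a plain comma separator; bad_lines is a hypothetical helper name, and the count is naive in that commas inside quoted fields are counted as separators too.

```shell
# bad_lines EXPECTED: read a CSV file on stdin and print
# "line_number field_count" for every line whose comma-separated
# field count differs from EXPECTED.
bad_lines() {
  awk -F',' -v n="$1" 'NF != n { print NR, NF }'
}
```

Possible usage: bad_lines 7 < buiigcjoin1.csv | head. Lines that are only part of an instance (because the instance was split across several lines) show up here with too few fields, which matches the "Read 2, expected 7" kind of error.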
> > >
> > > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > Can you share one of the lines of data that give you problems?
> > >
> > > Cheers,
> > > Eibe
> > >
> > > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
> > > >
> > > > Hello, and thanks for your reply, Peter. I changed the file to tab delimited and I keep getting the same error: wrong number of values at line xxxx. When I go and delete that line I get the same error, but at a different line. I don't understand.
> > > > Regards.
> > > > Ivan.
> > > >
> > > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
> > > > > Hello, I have been having problems when loading a .csv file into weka.
> > > > > It's a file with 7 columns and 469579 rows.
> > > > > The column headers are: TARGET ID (number), TITLE (text), DETAIL (text),
> > > > > DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS
> > > > > (nominal).
> > > > > I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the enter
> > > > > spaces. When I try to load the file into weka I get the error: wrong number
> > > > > of values. Read 2, expected 7, readToken[EOL], line 789.
> > > > > I tried everything with no success. What I'm doing is: open the csv file, go
> > > > > to line 789 and delete that row, save it, and then try again. I get the
> > > > > same error, but now at line 1200, for example.
> > > > > I'm sure I can solve this problem following this logic, but it is going to
> > > > > take a while. Do you have any idea what I'm doing wrong? I just want to put
> > > > > out there that I'm a newbie in all of this, so please bear with me. Thanks
> > > > > in advance.
> > > >
> > > > Instead of using comma as column separator, try using tab. That tends
> > > > to solve a lot of problems with non-escaped single/double quotes and
> > > > commas.
> > > >
> > > > Cheers, Peter
> > > > --
> > > > Peter Reutemann
> > > > Dept. of Computer Science
> > > > University of Waikato, NZ
> > > > +64 (7) 858-5174
> > > > http://www.cms.waikato.ac.nz/~fracpete/
> > > > http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Thank you, Eibe. I really appreciate it. I don't get the accuracies I want with NaiveBayesMultinomial, and using LibSVM takes too long. I tried to install LibLINEAR as you suggested but I get the following error:

java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(Unknown Source)
at java.util.zip.ZipFile.<init>(Unknown Source)
at java.util.zip.ZipFile.<init>(Unknown Source)
at weka.core.packageManagement.DefaultPackageManager.getPackageArchiveInfo(DefaultPackageManager.java:354)
at weka.core.packageManagement.DefaultPackageManager.installPackageFromArchive(DefaultPackageManager.java:501)
at weka.core.packageManagement.DefaultPackageManager.installPackageFromURL(DefaultPackageManager.java:769)
at weka.core.packageManagement.DefaultPackageManager.installPackageFromRepository(DefaultPackageManager.java:753)
at weka.core.WekaPackageManager.installPackageFromRepository(WekaPackageManager.java:1938)
at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:1308)
at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:863)
at javax.swing.SwingWorker$1.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at javax.swing.SwingWorker.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

This happens when I try to install it through the GUI. I don't know what I'm doing wrong. Any other tips on how to achieve better accuracy with different algorithms would be really appreciated too. Thanks in advance for everything.
Regards.
Ivan Ruiz.


2017-01-17 18:06 GMT+08:00 Eibe Frank <[hidden email]>:
There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
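
As a sketch of the _JAVA_OPTIONS approach: the JVM reads this variable at startup, so it has to be set before WEKA is launched. The 4g value and the weka.jar location mentioned in the comment are assumptions, not recommendations.

```shell
# Assumption: 4g is only an example value; choose one that fits the
# amount of RAM in your machine.
export _JAVA_OPTIONS="-Xmx4g"

# Any JVM started from this shell afterwards picks the option up, e.g.
# (assuming weka.jar is in the current directory):
#   java -jar weka.jar
echo "$_JAVA_OPTIONS"
```

When the variable is in effect, the JVM prints a "Picked up _JAVA_OPTIONS" message at startup, which is a quick way to check that the setting reached WEKA.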

Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.

The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.

Cheers,
Eibe

On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
Thanks for everything. Now I have new problems. First I filter my data to convert it from nominal to string, and then from string to word vector. After that I try to classify it using random forests. It started working and building the model, which took two and a half days, and when it was supposed to do the 10-fold cross-validation it just stopped. The bird stopped moving and weka doesn't give me any message. What am I doing wrong? Thanks in advance for your time.
Regards.
Ivan Ruiz.

On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.

If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.

Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.

Cheers,
Eibe


Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-2
Administrator
You’ll probably have to update to WEKA 3.8.1. This exception is due to the issue with downloading package .zip files from SourceForge that got fixed in 3.8.1/3.9.1.

Cheers,
Eibe

> On 24/01/2017, at 2:30 AM, Ivan Ruiz <[hidden email]> wrote:
>
> Thank you, Eibe. I really appreciate it. I don't get the accuracies I want with NaiveBayesMultinomial, and using LibSVM takes too long. I tried to install LibLINEAR as you suggested but I get the following error:
>
> java.util.zip.ZipException: error in opening zip file
> at java.util.zip.ZipFile.open(Native Method)
> at java.util.zip.ZipFile.<init>(Unknown Source)
> at java.util.zip.ZipFile.<init>(Unknown Source)
> at java.util.zip.ZipFile.<init>(Unknown Source)
> at weka.core.packageManagement.DefaultPackageManager.getPackageArchiveInfo(DefaultPackageManager.java:354)
> at weka.core.packageManagement.DefaultPackageManager.installPackageFromArchive(DefaultPackageManager.java:501)
> at weka.core.packageManagement.DefaultPackageManager.installPackageFromURL(DefaultPackageManager.java:769)
> at weka.core.packageManagement.DefaultPackageManager.installPackageFromRepository(DefaultPackageManager.java:753)
> at weka.core.WekaPackageManager.installPackageFromRepository(WekaPackageManager.java:1938)
> at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:1308)
> at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:863)
> at javax.swing.SwingWorker$1.call(Unknown Source)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at javax.swing.SwingWorker.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
>
> this is trying to install it through the GUI interface. I don't know what I'm doing wrong. Any other tip of how to achieve better accuracies with different algorithms will be really appreciated it too. Thanks in advance for everything.
> Regards.
> Ivan Ruiz.
>
>
> 2017-01-17 18:06 GMT+08:00 Eibe Frank <[hidden email]>:
> There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
>
> Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.
>
> The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.
>
> Cheers,
> Eibe
>
> On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
> Thanks for everything. Now I have new problems. First I filter my data to make it nominal to string and then string to word vector. After that I try to classify it using random forests it started working and building the model it took 2 and a half days and when it's supposed to do the 10 fold cross validation it just stops. The bird stopped moving and everything and weka doesn't give me any message or anything. What am I doing wrong? Thanks in advance for your time.
> Regards.
> Ivan Ruiz.
>
> On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
> Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.
>
> If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.
>
> Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.
>
> Cheers,
> Eibe
>
> > On 21 Dec 2016, at 21:02, Ivan Ruiz <[hidden email]> wrote:
> >
> > Thanks I downloaded it. I installed weka 3.8.1 and when I try to choose a classifier algorithm there is none. I uninstalled and go back to 3.8 and all the algorithms are there. It's weird. Also how can I do what you did to other csv files? Just copy paste your code in os x? Thanks very much for everything I really appreciated it and sorry for not understand anything.
> > Regards.
> > Ivan Ruiz.
> >
> > On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
> > Out of interest, and to test the CSVReader, I have gone through the process of converting your file.
> >
> > There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:
> >
> > cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv
> >
> > Here is some info on what the components of this command-line do:
> >
> > LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)
> >
> > tr -d '\r': removes part of DOS line terminator so that only \n remains
> >
> > gsed '/^$/d': deletes empty lines
> >
> > gsed "s/\"\"/'/g": replaces "" by '
> >
> > gsed "s/\"$//g": removes " at end of line
> >
> > gsed "s/^\"//g": removes " at start of line
> >
> > gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way)
> >
> > I've also copied the processed CSV file to
> >
> >   http://www.cs.waikato.ac.nz/~eibe/temp.csv
> >
> > Please let me know when you have downloaded it so that I can delete it.
> >
> > Cheers,
> > Eibe
> >
> > > On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
> > >
> > > Well here is the full file if you have time I'll appreciate it, I will go back of deleting one by one. Thanks for everything.
> > > Regards.
> > > Ivan.
> > >  buiigcjoin1.csv
> > > ​
> > >
> > > 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
> > >
> > >   , "
> > >
> > > by
> > >
> > >   ,"
> > >
> > >
> > > Not sure whether this is by design or due to a bug in the CSV loader.
> > >
> > > Cheers,
> > > Eibe
> > >
> > > > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
> > > >
> > > > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.invocationTargetException
> > > > When I change the file to tab delimited some lines get divided in two thus giving me the wrong number of values error.
> > > > When I have the file in csv and look at the line with the problems sometimes they are missing commas and sometimes is like this:
> > > >
> > > > line 6077
> > > >
> > > > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
> > > >
> > > > line 28105
> > > >
> > > > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
> > > >
> > > > So my stategy is I go to every line and just delete it try again and try again and again and it's taking so much work. Thanks in advance.
> > > > Regards.
> > > > Ivan.
> > > >
> > > > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > > Can you share one of the lines of data that give you problems?
> > > >
> > > > Cheers,
> > > > Eibe
> > > >
> > > > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
> > > > >
> > > > > Hello and thanks for your reply Peter. I change the file to tab delimited and I keep getting the same errors: wrong number of values at line xxxx, when I go and delete it I get the same error but in a different line. I don't understand.
> > > > > Regards.
> > > > > Ivan.
> > > > >
> > > > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
> > > > > > Hello, I have been having problems loading a .csv file into WEKA.
> > > > > > It is a file with 7 columns and 469579 rows.
> > > > > > The column headers are: TARGET ID (number), TITLE (text), DETAIL (text),
> > > > > > DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS
> > > > > > (nominal).
> > > > > > I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the line
> > > > > > breaks. When I try to load the file into WEKA I get the error: Wrong number
> > > > > > of values. Read 2, expected 7, readToken[EOL], line 789.
> > > > > > I tried everything with no success. What I'm doing is opening the CSV file,
> > > > > > going to line 789, deleting that row, saving, and trying again. I get the
> > > > > > same error, but now at line 1200, for example.
> > > > > > I'm sure I can solve this problem following this logic, but it is going to
> > > > > > take a while. Do you have any idea what I'm doing wrong? I just want to put
> > > > > > out there that I'm a newbie in all of this, so please bear with me. Thanks
> > > > > > in advance.
> > > > >
> > > > > Instead of using comma as column separator, try using tab. That tends
> > > > > to solve a lot of problems with non-escaped single/double quotes and
> > > > > commas.
> > > > >
> > > > > Cheers, Peter
> > > > > --
> > > > > Peter Reutemann
> > > > > Dept. of Computer Science
> > > > > University of Waikato, NZ
> > > > > +64 (7) 858-5174
> > > > > http://www.cms.waikato.ac.nz/~fracpete/
> > > > > http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
When I download 3.8.1 I can install LibLINEAR with no problems. However, when I click the "Choose" button in the "Classify" section, nothing happens. The package manager shows a lot of packages already installed and loaded. The menus work in the other sections, like "Cluster" and the rest, but not in "Classify". If I go back to WEKA 3.8, everything is fine. Sorry for sucking so much at this.
Regards.
Ivan Ruiz.

2017-01-24 5:44 GMT+08:00 Eibe Frank <[hidden email]>:
You’ll probably have to update to WEKA 3.8.1. This exception is due to the issue with downloading package .zip files from SourceForge that got fixed in 3.8.1/3.9.1.

Cheers,
Eibe

> On 24/01/2017, at 2:30 AM, Ivan Ruiz <[hidden email]> wrote:
>
> Thank you, Eibe. I really appreciate it. I don't get the accuracies I want with NaiveBayesMultinomial, and using LibSVM takes too long. I tried to install LibLINEAR as you suggested but I get the following error:
>
> java.util.zip.ZipException: error in opening zip file
>       at java.util.zip.ZipFile.open(Native Method)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at weka.core.packageManagement.DefaultPackageManager.getPackageArchiveInfo(DefaultPackageManager.java:354)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromArchive(DefaultPackageManager.java:501)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromURL(DefaultPackageManager.java:769)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromRepository(DefaultPackageManager.java:753)
>       at weka.core.WekaPackageManager.installPackageFromRepository(WekaPackageManager.java:1938)
>       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:1308)
>       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:863)
>       at javax.swing.SwingWorker$1.call(Unknown Source)
>       at java.util.concurrent.FutureTask.run(Unknown Source)
>       at javax.swing.SwingWorker.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>       at java.lang.Thread.run(Unknown Source)
>
> This happens when I try to install it through the GUI. I don't know what I'm doing wrong. Any other tips on how to achieve better accuracy with different algorithms would be really appreciated too. Thanks in advance for everything.
> Regards.
> Ivan Ruiz.
>
>
> 2017-01-17 18:06 GMT+08:00 Eibe Frank <[hidden email]>:
> There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
>
> Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.
>
> The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.
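To make the idea behind a multinomial naive Bayes text classifier concrete, here is a minimal Python sketch. It is not WEKA's NaiveBayesMultinomial: the function names `train_multinomial_nb` and `predict` are hypothetical, documents are assumed to be already split into tokens, add-one smoothing is an assumption, and tokens never seen in training are simply ignored.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels):
    """docs: list of token lists; labels: parallel list of class labels."""
    class_counts = Counter(labels)          # number of documents per class
    word_counts = defaultdict(Counter)      # per-class word frequencies
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the class with the highest log-probability for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)            # class prior
        for tok in tokens:
            if tok in vocab:                             # skip unseen tokens
                # add-one (Laplace) smoothed word likelihood
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Training is a single counting pass over the data, which is why this family of classifiers is fast on large text collections.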
>
> Cheers,
> Eibe
>
> On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
> Thanks for everything. Now I have new problems. First I filter my data with nominal-to-string and then string-to-word-vector. After that I try to classify it using random forests. It started working, and building the model took two and a half days, but when it is supposed to do the 10-fold cross-validation it just stops. The bird stopped moving, and WEKA doesn't give me any message. What am I doing wrong? Thanks in advance for your time.
> Regards.
> Ivan Ruiz.
>
> On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
> Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.
>
> If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.
>
> Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.
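The same CSV-to-ARFF idea can be sketched without R. The Python below is an illustration only, not one of the R packages mentioned above: `arff_quote` and `rows_to_arff` are hypothetical names, the attribute types must be supplied by hand, and the quoting (single quotes with backslash escapes, `?` for an empty cell) is an assumption about what WEKA's ARFF reader accepts that should be checked against the ARFF documentation.

```python
def arff_quote(value: str) -> str:
    """Single-quote a value, backslash-escaping characters that would break the token."""
    escaped = (value.replace("\\", "\\\\")
                    .replace("'", "\\'")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    return "'" + escaped + "'"

def rows_to_arff(relation, attributes, rows):
    """attributes: list of (name, kind); kind is "numeric", "string",
    or a list of nominal labels. rows: lists of strings, one per attribute."""
    out = ["@relation " + arff_quote(relation), ""]
    for name, kind in attributes:
        if isinstance(kind, list):
            spec = "{" + ",".join(arff_quote(label) for label in kind) + "}"
        else:
            spec = kind
        out.append("@attribute " + arff_quote(name) + " " + spec)
    out += ["", "@data"]
    for row in rows:
        cells = []
        for (name, kind), value in zip(attributes, row):
            if value == "":
                cells.append("?")                  # missing value
            elif kind == "numeric":
                cells.append(value)                # numbers stay unquoted
            else:
                cells.append(arff_quote(value))    # string and nominal values
        out.append(",".join(cells))
    return "\n".join(out) + "\n"
```

The rows could come from Python's `csv.reader`, so that quoted fields containing commas are already split correctly before they are written out.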
>
> Cheers,
> Eibe
>
> > On 21 Dec 2016, at 21:02, Ivan Ruiz <[hidden email]> wrote:
> >
> > Thanks, I downloaded it. I installed WEKA 3.8.1, and when I try to choose a classifier algorithm there are none. I uninstalled it and went back to 3.8, and all the algorithms are there. It's weird. Also, how can I do what you did to other CSV files? Do I just copy and paste your command in OS X? Thanks very much for everything, I really appreciate it, and sorry for not understanding anything.
> > Regards.
> > Ivan Ruiz.
> >
> > On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
> > Out of interest, and to test the CSVReader, I have gone through the process of converting your file.
> >
> > There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:
> >
> > cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv
> >
> > Here is some info on what the components of this command-line do:
> >
> > LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)
> >
> > tr -d '\r': removes part of DOS line terminator so that only \n remains
> >
> > gsed '/^$/d': deletes empty lines
> >
> > gsed "s/\"\"/'/g": replaces "" by '
> >
> > gsed "s/\"$//g": removes " at end of line
> >
> > gsed "s/^\"//g": removes " at start of line
> >
> > gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way)
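For readers without gsed, the steps listed above can be approximated in Python. This is a sketch and not the command itself: the function name `clean_csv_text` is hypothetical. It differs from the sed expression in two ways: as written, the sed substitution also deletes the first character of each joined line, whereas this version keeps it, and it joins any number of continuation lines in a single pass instead of repeating the join step three times.

```python
import re

def clean_csv_text(raw: bytes) -> str:
    """Approximate the tr/gsed clean-up on the raw bytes of a CSV file."""
    # Keep tab, LF, and printable ASCII (space through ~). Dropping CR here
    # covers both tr steps at once.
    kept = bytes(b for b in raw if b in (9, 10) or 32 <= b <= 126)
    text = kept.decode("ascii")

    records = []
    for line in text.split("\n"):
        if line == "":
            continue                        # delete empty lines
        line = line.replace('""', "'")      # replace "" by '
        line = re.sub(r'"$', "", line)      # remove " at end of line
        line = re.sub(r'^"', "", line)      # remove " at start of line
        if records and not line[:1].isdigit():
            records[-1] += line             # continuation of the previous record
        else:
            records.append(line)            # starts with a digit: new record
    return "\n".join(records) + "\n"
```

Like the original command, this relies on every instance starting with a digit in its first attribute value, so it would not transfer to files where that does not hold.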
> >
> > I've also copied the processed CSV file to
> >
> >   http://www.cs.waikato.ac.nz/~eibe/temp.csv
> >
> > Please let me know when you have downloaded it so that I can delete it.
> >
> > Cheers,
> > Eibe
> >
> > > On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
> > >
> > > Well, here is the full file. If you have time, I'd appreciate it. Meanwhile I will go back to deleting lines one by one. Thanks for everything.
> > > Regards.
> > > Ivan.
> > >  buiigcjoin1.csv
> > >
> > >
> > > 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
> > >
> > >   , "
> > >
> > > by
> > >
> > >   ,"
> > >
> > >
> > > Not sure whether this is by design or due to a bug in the CSV loader.
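The effect of a space before an opening quote can be reproduced with Python's `csv` module, which is used here only as a stand-in and says nothing about how WEKA's CSV loader is implemented. The sketch below also reports every row with the wrong number of fields in one pass, instead of fixing one line per load attempt. The function name `find_bad_rows` and the sample rows are made up for illustration.

```python
import csv

def find_bad_rows(lines, expected_fields, skipinitialspace=False):
    """Return (line_number, field_count) for every row with the wrong field count."""
    reader = csv.reader(lines, skipinitialspace=skipinitialspace)
    bad = []
    for row in reader:
        if len(row) != expected_fields:
            # line_num is the last physical line consumed for this row
            bad.append((reader.line_num, len(row)))
    return bad

sample = [
    'ID,TITLE,CLASS',
    '1,"hello, world",x',     # quote directly after the comma
    '2, "hello, world", x',   # space before the opening quote
]
print(find_bad_rows(sample, 3))                          # default parsing
print(find_bad_rows(sample, 3, skipinitialspace=True))   # ignore space after comma
```

With default settings the quote in the third sample line is not at the start of the field, so it is treated as ordinary text and the comma inside it splits the field. With `skipinitialspace=True` the space is skipped and the quoted field is read whole.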
> > >
> > > Cheers,
> > > Eibe
> > >
> > > > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
> > > >
> > > > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.InvocationTargetException
> > > > When I change the file to tab-delimited, some lines get split in two, which gives me the wrong number of values error.
> > > > When I keep the file as CSV and look at the lines with problems, sometimes they are missing commas and sometimes they look like this:
> > > >
> > > > line 6077
> > > >
> > > > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
> > > >
> > > > line 28105
> > > >
> > > > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
> > > >
> > > > So my strategy is to go to each failing line, delete it, and try again, over and over, and it's taking a lot of work. Thanks in advance.
> > > > Regards.
> > > > Ivan.
> > > >
> > > > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > > Can you share one of the lines of data that give you problems?
> > > >
> > > > Cheers,
> > > > Eibe
> > > >
> > > > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
> > > > >
> > > > > Hello, and thanks for your reply, Peter. I changed the file to tab-delimited and I keep getting the same error: wrong number of values at line xxxx. When I delete that line, I get the same error at a different line. I don't understand.
> > > > > Regards.
> > > > > Ivan.
> > > > >
> > > > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
> > > > > > Hello, I have been having problems loading a .csv file into WEKA.
> > > > > > It is a file with 7 columns and 469579 rows.
> > > > > > The column headers are: TARGET ID (number), TITLE (text), DETAIL (text),
> > > > > > DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS
> > > > > > (nominal).
> > > > > > I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the line
> > > > > > breaks. When I try to load the file into WEKA I get the error: Wrong number
> > > > > > of values. Read 2, expected 7, readToken[EOL], line 789.
> > > > > > I tried everything with no success. What I'm doing is opening the CSV file,
> > > > > > going to line 789, deleting that row, saving, and trying again. I get the
> > > > > > same error, but now at line 1200, for example.
> > > > > > I'm sure I can solve this problem following this logic, but it is going to
> > > > > > take a while. Do you have any idea what I'm doing wrong? I just want to put
> > > > > > out there that I'm a newbie in all of this, so please bear with me. Thanks
> > > > > > in advance.
> > > > >
> > > > > Instead of using comma as column separator, try using tab. That tends
> > > > > to solve a lot of problems with non-escaped single/double quotes and
> > > > > commas.
> > > > >
> > > > > Cheers, Peter
> > > > > --
> > > > > Peter Reutemann
> > > > > Dept. of Computer Science
> > > > > University of Waikato, NZ
> > > > > > +64 (7) 858-5174
> > > > > http://www.cms.waikato.ac.nz/~fracpete/
> > > > > http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-3
This is possibly due to the new class loading mechanism in WEKA 3.8.1/3.9.1, which broke some packages, and very likely not your fault at all.

If you have the following packages installed, make sure you have their latest versions:

DTNB
wekaPython
distributedWekaBase
all distributedWekaHadoop packages
all netlibNative packages
kfGroovy
massiveOnlineAnalysis
percentageErrorMetrics
predictiveApriori
scriptingClassifiers
J48graft
wekaServer

Alternatively, start with a fresh wekafiles folder and install the packages you need again.

Cheers,
Eibe

On Tue, Jan 24, 2017 at 4:28 PM, Ivan Ruiz <[hidden email]> wrote:
When I download 3.8.1 I can install LibLINEAR with no problems, however, when I click on the "Choose" button on "classify" section nothing happens. If I check on the package manager it shows a lot of packages already installed and loaded. The menus works on the other sections like "cluster" and the rest, but not on the "classify". If I go back to weka 3.8 everything is fine. Sorry for sucking so much at this.
Regards.
Ivan Ruiz.

2017-01-24 5:44 GMT+08:00 Eibe Frank <[hidden email]>:
You’ll probably have to update to WEKA 3.8.1. This exception is due to the issue with downloading package .zip files from SourceForge that got fixed in 3.8.1/3.9.1.

Cheers,
Eibe

> On 24/01/2017, at 2:30 AM, Ivan Ruiz <[hidden email]> wrote:
>
> Thank you, Eibe. I really appreciate it. I don't get the accuracies I want with NaiveBayesMultinomial, and using LibSVM takes too long. I tried to install LibLINEAR as you suggested but I get the following error:
>
> java.util.zip.ZipException: error in opening zip file
>       at java.util.zip.ZipFile.open(Native Method)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at weka.core.packageManagement.DefaultPackageManager.getPackageArchiveInfo(DefaultPackageManager.java:354)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromArchive(DefaultPackageManager.java:501)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromURL(DefaultPackageManager.java:769)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromRepository(DefaultPackageManager.java:753)
>       at weka.core.WekaPackageManager.installPackageFromRepository(WekaPackageManager.java:1938)
>       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:1308)
>       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:863)
>       at javax.swing.SwingWorker$1.call(Unknown Source)
>       at java.util.concurrent.FutureTask.run(Unknown Source)
>       at javax.swing.SwingWorker.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>       at java.lang.Thread.run(Unknown Source)
>
> this is trying to install it through the GUI interface. I don't know what I'm doing wrong. Any other tip of how to achieve better accuracies with different algorithms will be really appreciated it too. Thanks in advance for everything.
> Regards.
> Ivan Ruiz.
>
>
> 2017-01-17 18:06 GMT+08:00 Eibe Frank <[hidden email]>:
> There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
>
> Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.
>
> The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.
>
> Cheers,
> Eibe
>
> On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
> Thanks for everything. Now I have new problems. First I filter my data to make it nominal to string and then string to word vector. After that I try to classify it using random forests it started working and building the model it took 2 and a half days and when it's supposed to do the 10 fold cross validation it just stops. The bird stopped moving and everything and weka doesn't give me any message or anything. What am I doing wrong? Thanks in advance for your time.
> Regards.
> Ivan Ruiz.
>
> On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
> Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.
>
> If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.
>
> Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.
>
> Cheers,
> Eibe
>
> > On 21 Dec 2016, at 21:02, Ivan Ruiz <[hidden email]> wrote:
> >
> > Thanks I downloaded it. I installed weka 3.8.1 and when I try to choose a classifier algorithm there is none. I uninstalled and go back to 3.8 and all the algorithms are there. It's weird. Also how can I do what you did to other csv files? Just copy paste your code in os x? Thanks very much for everything I really appreciated it and sorry for not understand anything.
> > Regards.
> > Ivan Ruiz.
> >
> > On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
> > Out of interest, and to test the CSVReader, I have gone through the process of converting your file.
> >
> > There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:
> >
> > cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv
> >
> > Here is some info on what the components of this command-line do:
> >
> > LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)
> >
> > tr -d '\r': removes part of DOS line terminator so that only \n remains
> >
> > gsed '/^$/d': deletes empty lines
> >
> > gsed "s/\"\"/'/g": replaces "" by '
> >
> > gsed "s/\"$//g": removes " at end of line
> >
> > gsed "s/^\"//g": removes " at start of line
> >
> > gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way)
> >
> > I've also copied the processed CSV file to
> >
> >   http://www.cs.waikato.ac.nz/~eibe/temp.csv
> >
> > Please let me know when you have downloaded it so that I can delete it.
> >
> > Cheers,
> > Eibe
> >
> > > On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
> > >
> > > Well here is the full file if you have time I'll appreciate it, I will go back of deleting one by one. Thanks for everything.
> > > Regards.
> > > Ivan.
> > >  buiigcjoin1.csv
> > > ​
> > >
> > > 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
> > >
> > >   , "
> > >
> > > by
> > >
> > >   ,"
> > >
> > >
> > > Not sure whether this is by design or due to a bug in the CSV loader.
> > >
> > > Cheers,
> > > Eibe
> > >
> > > > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
> > > >
> > > > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.invocationTargetException
> > > > When I change the file to tab delimited some lines get divided in two thus giving me the wrong number of values error.
> > > > When I have the file in csv and look at the line with the problems sometimes they are missing commas and sometimes is like this:
> > > >
> > > > line 6077
> > > >
> > > > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
> > > >
> > > > line 28105
> > > >
> > > > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
> > > >
> > > > So my stategy is I go to every line and just delete it try again and try again and again and it's taking so much work. Thanks in advance.
> > > > Regards.
> > > > Ivan.
> > > >
> > > > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > > Can you share one of the lines of data that give you problems?
> > > >
> > > > Cheers,
> > > > Eibe
> > > >
> > > > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
> > > > >
> > > > > Hello and thanks for your reply Peter. I change the file to tab delimited and I keep getting the same errors: wrong number of values at line xxxx, when I go and delete it I get the same error but in a different line. I don't understand.
> > > > > Regards.
> > > > > Ivan.
> > > > >
> > > > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
> > > > > > Hello, I have been having problems loading a .csv file into Weka.
> > > > > > It's a file with 7 columns and 469579 rows.
> > > > > > The column headers are: TARGET ID (number), TITLE (text), DETAIL (text),
> > > > > > DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS
> > > > > > (nominal).
> > > > > > I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the line
> > > > > > breaks. When I try to load the file into Weka I get the error: wrong number
> > > > > > of values. Read 2, expected 7, read Token[EOL], line 789.
> > > > > > I tried everything with no success. What I'm doing is opening the CSV file, going
> > > > > > to line 789, deleting that row, saving it, and trying again. I then get the
> > > > > > same error, but now at line 1200, for example.
> > > > > > I'm sure I can solve this problem following this logic, but it is going to
> > > > > > take a while. Do you have any idea what I'm doing wrong? I just want to put
> > > > > > out there that I'm a newbie in all of this, so please bear with me. Thanks
> > > > > > in advance.
> > > > >
> > > > > Instead of using comma as column separator, try using tab. That tends
> > > > > to solve a lot of problems with non-escaped single/double quotes and
> > > > > commas.
> > > > >
> > > > > Cheers, Peter
> > > > > --
> > > > > Peter Reutemann
> > > > > Dept. of Computer Science
> > > > > University of Waikato, NZ
> > > > > > +64 (7) 858-5174
> > > > > http://www.cms.waikato.ac.nz/~fracpete/
> > > > > http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
Thanks for all your help.
Now I'm running out of memory. I remember that with the previous version I just changed maxheap to 6 gigabytes in the .ini file and it worked fine. In this version I see that it changed to maxstack. I changed that to 4096m, but when I check the system info in the Simple CLI it shows an initial memory of 256m and a maximum memory of 3616m. It doesn't matter whether I change it to something bigger or smaller. What am I doing wrong?
Another question. I used to run a classification on the training set and then run the same one on the test set. After that, I would open the classifier errors view and save the ARFF file to see the predicted class. Now when I save it, the ARFF file is 0 bytes. I don't know what the problem is there either.
Thanks in advance.
Regards.
Ivan Ruiz.

2017-01-24 11:49 GMT+08:00 Eibe Frank <[hidden email]>:
This is possibly due to the new class loading mechanism in WEKA 3.8.1/3.9.1, which broke some packages, and very likely not your fault at all.

If you have the following packages installed, make sure you have their latest versions:

DTNB
wekaPython
distributedWekaBase
all distributedWekaHadoop packages
all netlibNative packages
kfGroovy
massiveOnlineAnalysis
percentageErrorMetrics
predictiveApriori
scriptingClassifiers
J48graft
wekaServer

Alternatively, start with a fresh wekafiles folder and install the packages you need again.

Cheers,
Eibe


On Tue, Jan 24, 2017 at 4:28 PM, Ivan Ruiz <[hidden email]> wrote:
When I download 3.8.1 I can install LibLINEAR with no problems. However, when I click the "Choose" button in the "Classify" section, nothing happens. The package manager shows a lot of packages already installed and loaded. The menus work in the other sections, like "Cluster" and the rest, but not in "Classify". If I go back to Weka 3.8 everything is fine. Sorry for sucking so much at this.
Regards.
Ivan Ruiz.

2017-01-24 5:44 GMT+08:00 Eibe Frank <[hidden email]>:
You’ll probably have to update to WEKA 3.8.1. This exception is due to the issue with downloading package .zip files from SourceForge that got fixed in 3.8.1/3.9.1.

Cheers,
Eibe

> On 24/01/2017, at 2:30 AM, Ivan Ruiz <[hidden email]> wrote:
>
> Thank you, Eibe. I really appreciate it. I don't get the accuracies I want with NaiveBayesMultinomial, and using LibSVM takes too long. I tried to install LibLINEAR as you suggested but I get the following error:
>
> java.util.zip.ZipException: error in opening zip file
>       at java.util.zip.ZipFile.open(Native Method)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at java.util.zip.ZipFile.<init>(Unknown Source)
>       at weka.core.packageManagement.DefaultPackageManager.getPackageArchiveInfo(DefaultPackageManager.java:354)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromArchive(DefaultPackageManager.java:501)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromURL(DefaultPackageManager.java:769)
>       at weka.core.packageManagement.DefaultPackageManager.installPackageFromRepository(DefaultPackageManager.java:753)
>       at weka.core.WekaPackageManager.installPackageFromRepository(WekaPackageManager.java:1938)
>       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:1308)
>       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:863)
>       at javax.swing.SwingWorker$1.call(Unknown Source)
>       at java.util.concurrent.FutureTask.run(Unknown Source)
>       at javax.swing.SwingWorker.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>       at java.lang.Thread.run(Unknown Source)
>
> This happens when I try to install it through the GUI. I don't know what I'm doing wrong. Any other tips on how to achieve better accuracies with different algorithms would be really appreciated too. Thanks in advance for everything.
> Regards.
> Ivan Ruiz.
>
>
> 2017-01-17 18:06 GMT+08:00 Eibe Frank <[hidden email]>:
> There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
>
> Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.
>
> The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.
>
> Cheers,
> Eibe
>
> On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
> Thanks for everything. Now I have new problems. First I filter my data with nominal to string and then string to word vector. After that I try to classify it using random forests. It started working and building the model, which took two and a half days, and when it's supposed to do the 10-fold cross-validation it just stops. The bird stopped moving and everything, and Weka doesn't give me any message. What am I doing wrong? Thanks in advance for your time.
> Regards.
> Ivan Ruiz.
>
> On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
> Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.
>
> If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.
>
> Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.
>
> Cheers,
> Eibe
>
> > On 21 Dec 2016, at 21:02, Ivan Ruiz <[hidden email]> wrote:
> >
> > Thanks, I downloaded it. I installed Weka 3.8.1 and when I try to choose a classifier algorithm there are none. I uninstalled it and went back to 3.8, and all the algorithms are there. It's weird. Also, how can I do what you did to other CSV files? Just copy and paste your code in OS X? Thanks very much for everything, I really appreciate it, and sorry for not understanding anything.
> > Regards.
> > Ivan Ruiz.
> >
> > On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
> > Out of interest, and to test the CSVReader, I have gone through the process of converting your file.
> >
> > There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:
> >
> > cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv
> >
> > Here is some info on what the components of this command-line do:
> >
> > LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)
> >
> > tr -d '\r': removes part of DOS line terminator so that only \n remains
> >
> > gsed '/^$/d': deletes empty lines
> >
> > gsed "s/\"\"/'/g": replaces "" by '
> >
> > gsed "s/\"$//g": removes " at end of line
> >
> > gsed "s/^\"//g": removes " at start of line
> >
> > gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way)
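
[Editorial sketch] The cleaning steps described above can be approximated in Python for readers without gsed. This is a rough sketch of the same idea, not a port of the pipeline: it omits the quote-rewriting steps, it keeps every character of a continuation line, and the function names `clean_lines` and `wrong_field_counts` are made up for illustration. It assumes, as in this thread, that every real record starts with a digit.

```python
import csv
import re


def clean_lines(raw: bytes) -> list[str]:
    """Drop non-printable bytes, DOS line ends and empty lines, then join continuation lines."""
    # Keep tab, LF, CR and printable ASCII only (octal 11, 12, 15, 40-176).
    text = re.sub(rb"[^\t\n\r\x20-\x7e]", b"", raw).decode("ascii")
    # Remove the CR of DOS line terminators so that only LF remains.
    text = text.replace("\r", "")
    records: list[str] = []
    for line in text.split("\n"):
        if not line:
            continue  # skip empty lines
        if records and not line[0].isdigit():
            # Assumption: a line that does not start with a digit continues the previous record.
            records[-1] += line
        else:
            records.append(line)
    return records


def wrong_field_counts(records: list[str], expected: int) -> list[tuple[int, int]]:
    """Return (record_number, field_count) for every record without `expected` fields."""
    bad = []
    for number, record in enumerate(records, start=1):
        fields = next(csv.reader([record]))
        if len(fields) != expected:
            bad.append((number, len(fields)))
    return bad


if __name__ == "__main__":
    demo = b'1,a,b\r\n\r\n2,"x\r\n y",c\r\n3,a,b,c\r\n'
    records = clean_lines(demo)
    print(records)
    print(wrong_field_counts(records, expected=3))
```

For the file discussed here one would call `wrong_field_counts(records, expected=7)`; this lists all suspicious records in one pass instead of deleting them one load attempt at a time. A header row, if present, counts as record 1.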
> >
> > I've also copied the processed CSV file to
> >
> >   http://www.cs.waikato.ac.nz/~eibe/temp.csv
> >
> > Please let me know when you have downloaded it so that I can delete it.
> >
> > Cheers,
> > Eibe
> >
> > > On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
> > >
> > > Well, here is the full file. If you have time I'd appreciate it; meanwhile I will go back to deleting lines one by one. Thanks for everything.
> > > Regards.
> > > Ivan.
> > >  buiigcjoin1.csv
> > >
> > >
> > > 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
> > >
> > >   , "
> > >
> > > by
> > >
> > >   ,"
> > >
> > >
> > > Not sure whether this is by design or due to a bug in the CSV loader.
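
[Editorial sketch] The effect of a space between the comma and the opening quote can be reproduced with Python's standard `csv` module, which by default also stops treating `"` as a quote character when it does not come directly after the delimiter. This only illustrates the general pitfall and says nothing about how WEKA's CSV loader is implemented. The helper name `normalize_quoted_fields` is hypothetical; it re-emits the data so that quoted fields start right after the comma.

```python
import csv
import io


def normalize_quoted_fields(text: str) -> str:
    """Re-write CSV text so that no space separates a delimiter from an opening quote."""
    # skipinitialspace=True ignores whitespace directly after each delimiter,
    # so `, "a, b"` is read as one quoted field.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    sample = '1, "a, b", c\n'
    # Default settings: the quote is kept as literal text and the inner comma splits the field.
    print(next(csv.reader([sample.rstrip("\n")])))
    # After normalization the quoted field directly follows the comma.
    print(normalize_quoted_fields(sample), end="")
```

Note that only leading whitespace is dropped; trailing spaces such as those in ` 2.02E+13 ` on line 6077 are left as they are.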
> > >
> > > Cheers,
> > > Eibe
> > >
> > > > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
> > > >
> > > > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.InvocationTargetException
> > > > When I change the file to tab-delimited, some lines get split in two, which gives me the wrong number of values error.
> > > > When I have the file in CSV and look at the lines with problems, sometimes they are missing commas and sometimes they look like this:
> > > >
> > > > line 6077
> > > >
> > > > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
> > > >
> > > > line 28105
> > > >
> > > > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
> > > >
> > > > So my strategy is to go to every failing line, delete it, and try again, over and over, and it's taking so much work. Thanks in advance.
> > > > Regards.
> > > > Ivan.
> > > >
> > > > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
> > > > Can you share one of the lines of data that give you problems?
> > > >
> > > > Cheers,
> > > > Eibe
> > > >
> > > > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
> > > > >
> > > > > Hello and thanks for your reply, Peter. I changed the file to tab-delimited and I keep getting the same error: wrong number of values at line xxxx. When I go and delete that line, I get the same error but at a different line. I don't understand.
> > > > > Regards.
> > > > > Ivan.
> > > > >
> > > > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
> > > > > > Hello, I have been having problems loading a .csv file into Weka.
> > > > > > It's a file with 7 columns and 469579 rows.
> > > > > > The column headers are: TARGET ID (number), TITLE (text), DETAIL (text),
> > > > > > DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS
> > > > > > (nominal).
> > > > > > I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the line
> > > > > > breaks. When I try to load the file into Weka I get the error: wrong number
> > > > > > of values. Read 2, expected 7, read Token[EOL], line 789.
> > > > > > I tried everything with no success. What I'm doing is opening the CSV file, going
> > > > > > to line 789, deleting that row, saving it, and trying again. I then get the
> > > > > > same error, but now at line 1200, for example.
> > > > > > I'm sure I can solve this problem following this logic, but it is going to
> > > > > > take a while. Do you have any idea what I'm doing wrong? I just want to put
> > > > > > out there that I'm a newbie in all of this, so please bear with me. Thanks
> > > > > > in advance.
> > > > >
> > > > > Instead of using comma as column separator, try using tab. That tends
> > > > > to solve a lot of problems with non-escaped single/double quotes and
> > > > > commas.
> > > > >
> > > > > Cheers, Peter
> > > > > --
> > > > > Peter Reutemann
> > > > > Dept. of Computer Science
> > > > > University of Waikato, NZ
> > > > > > +64 (7) 858-5174
> > > > > http://www.cms.waikato.ac.nz/~fracpete/
> > > > > http://www.data-mining.co.nz/

Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Mark Hall
maxstack sets the stack size, not the heap size. The default of 20m will be sufficient most of the time. The maxheap setting (and its default of 1 GB) was removed so that Java sets its own heap size based on your available memory. You can force a specific heap size by setting the JAVA_OPTS environment variable, which the command that launches the GUI in RunWeka.ini uses. For example, you could set it to:

JAVA_OPTS=-Xmx4096m

Unfortunately, there is a bug in the Explorer when re-evaluating a loaded model on a separate test set. This prevents the visualization of ROC curves and classifier errors etc. in this case. I've just committed a fix for trunk and stable-3-8 that fixes the problem. You can get the fix in the next nightly (NZ time) snapshot of Weka.

Cheers,
Mark.

On 7/02/17, 7:39 PM, "Ivan Ruiz" <[hidden email] on behalf of [hidden email]> wrote:

    Thanks for all your help.
   
    Now I'm running out of memory. I remember that with the previous version I just changed maxheap to 6 gigabytes in the .ini file and it worked fine. In this version I see that it changed to maxstack. I changed that to 4096m, but when I check the system info in the Simple CLI it shows an initial memory of 256m and a maximum memory of 3616m. It doesn't matter whether I change it to something bigger or smaller. What am I doing wrong?
   
    Another question. I used to run a classification on the training set and then run the same one on the test set. After that, I would open the classifier errors view and save the ARFF file to see the predicted class. Now when I save it, the ARFF file is 0 bytes. I don't know what the problem is there either.
   
    Thanks in advance.
   
    Regards.
   
    Ivan Ruiz.
   
   
    2017-01-24 11:49 GMT+08:00 Eibe Frank <[hidden email]>:
   
    This is possibly due to the new class loading mechanism in WEKA 3.8.1/3.9.1, which broke some packages, and very likely not your fault at all.
   
   
    If you have the following packages installed, make sure you have their latest versions:
   
    DTNB
    wekaPython
    distributedWekaBase
    all distributedWekaHadoop packages
    all netlibNative packages
    kfGroovy
    massiveOnlineAnalysis
    percentageErrorMetrics
    predictiveApriori
    scriptingClassifiers
    J48graft
    wekaServer
   
   
    Alternatively, start with a fresh wekafiles folder and install the packages you need again.
   
   
    Cheers,
   
    Eibe
   
    On Tue, Jan 24, 2017 at 4:28 PM, Ivan Ruiz <[hidden email]> wrote:
   
    When I download 3.8.1 I can install LibLINEAR with no problems. However, when I click the "Choose" button in the "Classify" section, nothing happens. The package manager shows a lot of packages already installed and loaded. The menus work in the other sections, like "Cluster" and the rest, but not in "Classify". If I go back to Weka 3.8 everything is fine. Sorry for sucking so much at this.
   
    Regards.
   
    Ivan Ruiz.
   
   
    2017-01-24 5:44 GMT+08:00 Eibe Frank <[hidden email]>:
   
    You’ll probably have to update to WEKA 3.8.1. This exception is due to the issue with downloading package .zip files from SourceForge that got fixed in 3.8.1/3.9.1.
   
    Cheers,
    Eibe
   
    > On 24/01/2017, at 2:30 AM, Ivan Ruiz <[hidden email]> wrote:
    >
    > Thank you, Eibe. I really appreciate it. I don't get the accuracies I want with NaiveBayesMultinomial, and using LibSVM takes too long. I tried to install LibLINEAR as you suggested but I get the following error:
    >
    > java.util.zip.ZipException: error in opening zip file
    >       at java.util.zip.ZipFile.open(Native Method)
    >       at java.util.zip.ZipFile.<init>(Unknown Source)
    >       at java.util.zip.ZipFile.<init>(Unknown Source)
    >       at java.util.zip.ZipFile.<init>(Unknown Source)
    >       at weka.core.packageManagement.DefaultPackageManager.getPackageArchiveInfo(DefaultPackageManager.java:354)
    >       at weka.core.packageManagement.DefaultPackageManager.installPackageFromArchive(DefaultPackageManager.java:501)
    >       at weka.core.packageManagement.DefaultPackageManager.installPackageFromURL(DefaultPackageManager.java:769)
    >       at weka.core.packageManagement.DefaultPackageManager.installPackageFromRepository(DefaultPackageManager.java:753)
    >       at weka.core.WekaPackageManager.installPackageFromRepository(WekaPackageManager.java:1938)
    >       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:1308)
    >       at weka.gui.PackageManager$InstallTask.doInBackground(PackageManager.java:863)
    >       at javax.swing.SwingWorker$1.call(Unknown Source)
    >       at java.util.concurrent.FutureTask.run(Unknown Source)
    >       at javax.swing.SwingWorker.run(Unknown Source)
    >       at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    >       at java.lang.Thread.run(Unknown Source)
    >
    > This happens when I try to install it through the GUI. I don't know what I'm doing wrong. Any other tips on how to achieve better accuracies with different algorithms would be really appreciated too. Thanks in advance for everything.
    > Regards.
    > Ivan Ruiz.
    >
    >
    > 2017-01-17 18:06 GMT+08:00 Eibe Frank <[hidden email]>:
    > There might be some information in the WEKA log file in $HOME/wekafiles/weka.log. WEKA probably ran out of memory, i.e., heap space. You could try to increase the heap space, e.g., using the _JAVA_OPTIONS environment variable.
    >
    > Linear classifiers are quite popular for text classification problems. Perhaps try LibLINEAR in WEKA. It is able to process quite large datasets without consuming as much memory as RandomForest. It also often works better than RandomForest on text data.
    >
    > The fastest algorithm for text data, which often gives pretty decent results, is NaiveBayesMultinomial. There is also a version that includes the StringToWordVector functionality: NaiveBayesMultinomialText. You should probably try NaiveBayesMultinomial before you try any other classifier to get an initial result.
    >
    > Cheers,
    > Eibe
    >
    > On Mon, Jan 16, 2017 at 7:56 AM, Ivan Ruiz <[hidden email]> wrote:
    > Thanks for everything. Now I have new problems. First I filter my data with nominal to string and then string to word vector. After that I try to classify it using random forests. It started working and building the model, which took two and a half days, and when it's supposed to do the 10-fold cross-validation it just stops. The bird stopped moving and everything, and Weka doesn't give me any message. What am I doing wrong? Thanks in advance for your time.
    > Regards.
    > Ivan Ruiz.
    >
    > On Dec 21, 2016 2:38 AM, "Eibe Frank" <[hidden email]> wrote:
    > Unfortunately some of the old WEKA packages aren't compatible with 3.8.1. This might be causing the issue with your classifier menu. Mark has posted a list of those affected packages. If you don't have that many packages, it's probably best to simply remove your wekafiles folder (just search for "wekafiles" and delete that folder) and reinstall the packages you want from scratch.
    >
    > If you have access to a spreadsheet program that is able to read your CSV file and that can export in Excel format, you can use the WekaExcel package to read the data in Excel format into WEKA. It provides loaders and savers for .xls and .xlsx files. (Make sure you have a header row in your spreadsheet for the attribute names.) There is also a WekaODF package so that files in open document format can be read.
    >
    > Another option is to use R to read the CSV file and export it as an ARFF file. There are several packages in R for this. The rio package (https://cran.r-project.org/web/packages/rio/vignettes/rio.html) might be a good option, but I haven't tried it yet. If you have the RPlugin for WEKA installed (it's a bit tricky to set up) you can issue the relevant R commands from the R Console in WEKA.
    >
    > Cheers,
    > Eibe
    >
    > > On 21 Dec 2016, at 21:02, Ivan Ruiz <[hidden email]> wrote:
    > >
    > > Thanks, I downloaded it. I installed WEKA 3.8.1 and when I try to choose a classifier algorithm there are none. I uninstalled it and went back to 3.8, and all the algorithms are there. It's weird. Also, how can I do what you did to other CSV files? Do I just copy and paste your command in OS X? Thanks very much for everything, I really appreciate it, and sorry for not understanding anything.
    > > Regards.
    > > Ivan Ruiz.
    > >
    > > On Dec 21, 2016 10:44 AM, "Eibe Frank" <[hidden email]> wrote:
    > > Out of interest, and to test the CSVReader, I have gone through the process of converting your file.
    > >
    > > There were some problems in your file that even Excel could not cope with, more specifically, fields with "" that were not terminated by another "". Once I had fixed those issues manually, I ran the following command on OS X (with gsed installed) to create a new temp.csv file that can be loaded into WEKA:
    > >
    > > cat buiigcjoin1.csv | LANG=C tr -cd '\11\12\15\40-\176' | tr -d '\r' | gsed '/^$/d' | gsed "s/\"\"/'/g" | gsed "s/\"$//g" | gsed "s/^\"//g" | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' | gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}' > temp.csv
    > >
    > > Here is some info on what the components of this command-line do:
    > >
    > > LANG=C tr -cd '\11\12\15\40-\176': keeps only printable standard ASCII characters (there were some unusual non-printable characters in your file)
    > >
    > > tr -d '\r': removes part of DOS line terminator so that only \n remains
    > >
    > > gsed '/^$/d': deletes empty lines
    > >
    > > gsed "s/\"\"/'/g": replaces "" by '
    > >
    > > gsed "s/\"$//g": removes " at end of line
    > >
    > > gsed "s/^\"//g": removes " at start of line
    > >
    > > gsed '/$/{$!{N;s/\n[^0-9]//;ty;P;D;:y}}': joins line A and the next line B unless B starts with a digit (fortunately, each instance in your file starts with a digit in the first attribute value, so instances extending across multiple lines could be joined this way)
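The steps listed above can also be sketched in Python for readers without gsed. This is a rough equivalent under stated assumptions, not an exact port of the pipeline: it joins any number of continuation lines in one pass and puts a single space at each join. The function name clean_csv_text is hypothetical.

```python
import re


def clean_csv_text(raw: bytes) -> str:
    """Clean raw CSV bytes with steps similar to the shell pipeline above."""
    # Keep only tab, newline and printable ASCII (0x20-0x7E); this also drops '\r'.
    text = "".join(chr(b) for b in raw if b in (9, 10) or 32 <= b <= 126)
    records = []
    for line in text.split("\n"):
        if not line:
            continue  # delete empty lines
        line = line.replace('""', "'")     # replace "" by '
        line = re.sub(r'^"|"$', "", line)  # remove " at start and end of line
        if records and not line[:1].isdigit():
            # Continuation line: append it to the previous record.
            records[-1] += " " + line
        else:
            # First line (header) or a line starting with a digit: new record.
            records.append(line)
    return "\n".join(records) + "\n"
```

It could be applied to a file by reading it in binary mode, for example passing the result of open("buiigcjoin1.csv", "rb").read(), and writing the returned string to temp.csv. As in the original command, the join step relies on every instance starting with a digit.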
    > >
    > > I've also copied the processed CSV file to
    > >
    > >   http://www.cs.waikato.ac.nz/~eibe/temp.csv
    > >
    > > Please let me know when you have downloaded it so that I can delete it.
    > >
    > > Cheers,
    > > Eibe
    > >
    > > > On 16 Dec 2016, at 20:48, Ivan Ruiz <[hidden email]> wrote:
    > > >
    > > > Well, here is the full file. If you have time to look at it I'd appreciate it; meanwhile I will go back to deleting lines one by one. Thanks for everything.
    > > > Regards.
    > > > Ivan.
    > > > buiigcjoin1.csv
    > > >
    > > >
    > > > 2016-12-16 15:37 GMT+08:00 Eibe Frank <[hidden email]>:
    > > > I don't have any problems loading your line 28105 if I put it into a CSV file (with WEKA 3.9.1-SNAPSHOT). Line 6077 loads if I replace
    > > >
    > > >   , "
    > > >
    > > > by
    > > >
    > > >   ,"
    > > >
    > > >
    > > > Not sure whether this is by design or due to a bug in the CSV loader.
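To illustrate why a space before an opening quote can change how a line is split, here is how Python's standard csv module treats the two forms. This shows Python's behaviour only and says nothing about how WEKA's CSV loader is implemented; split_fields is a hypothetical helper and the sample lines are made up.

```python
import csv


def split_fields(line, skip_space=False):
    """Parse one CSV line and return its fields as a list of strings."""
    return next(csv.reader([line], skipinitialspace=skip_space))


# With the space, the quote is not at the start of the field, so it is kept
# as a literal character and the embedded comma splits the field in two.
with_space = split_fields('517, "a,b",c')

# Without the space, the quoted field is read as a single value.
without_space = split_fields('517,"a,b",c')

# Asking the parser to skip the space after the delimiter gives the same result.
tolerant = split_fields('517, "a,b",c', skip_space=True)

print(len(with_space), len(without_space), len(tolerant))
```

A parser that behaves like the first case reports one field too many for such a line, which is the kind of mismatch behind a "wrong number of values" error.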
    > > >
    > > > Cheers,
    > > > Eibe
    > > >
    > > > > On 16 Dec 2016, at 19:51, Ivan Ruiz <[hidden email]> wrote:
    > > > >
    > > > > When I try to load the xlsx file I get: Problem setting base instances: java.lang.reflect.InvocationTargetException
    > > > > When I change the file to tab-delimited, some lines get split in two, which gives me the wrong number of values error.
    > > > > When I have the file in CSV and look at the lines with problems, sometimes they are missing commas and sometimes they look like this:
    > > > >
    > > > > line 6077
    > > > >
    > > > > 517, "    COCO DEALgreen  pink    B Watsapp     fleurfleurhk",   uucuubccueuubfueuuecuabuba  fleurfleurhk    fleurfleurhk  profile picture httpphotosg ak instagram comhphotosakx  a, 2.02E+13 ,,, COMRESMIX
    > > > >
    > > > > line 28105
    > > > >
    > > > > 4733,"careUKker  USker noun PROTECTION U the process of protecting and looking after someone or something      The standard of care at our local hospital is excellent      Miras going to be very weak for a long time after the operation  so shell ne",       ernkam  profile picture httpphotosg ak instagram comhphotosakxftt sx   a jpg  id   full name,2.02E+13,,,COMRESMIX
    > > > >
    > > > > So my strategy is to go to each offending line, delete it, and try again, over and over, and it's taking so much work. Thanks in advance.
    > > > > Regards.
    > > > > Ivan.
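The delete-and-retry loop described above can be replaced by a single pass that lists every suspect line at once. Below is a minimal sketch in Python. It assumes the standard csv module parses the file closely enough to WEKA's loader to flag the same rows, which is not guaranteed. The name find_bad_rows is hypothetical, and the default of 7 is the column count given in the original post.

```python
import csv
import io


def find_bad_rows(lines, expected=7):
    """Return (line_number, field_count) for each row with an unexpected field count.

    lines is any iterable of text lines, such as an open file. The line number
    is that of the last physical line of the row, since quoted fields may span
    several lines. Blank lines are reported with a field count of 0.
    """
    bad = []
    reader = csv.reader(lines)
    for row in reader:
        if len(row) != expected:
            bad.append((reader.line_num, len(row)))
    return bad


# Small made-up sample with three expected columns; the last row is short.
sample = io.StringIO('1,a,b\n2,"x,y",c\n3,only\n')
report = find_bad_rows(sample, expected=3)
```

For a real file, the input could be opened with open("buiigcjoin1.csv", newline="") and passed in directly; the resulting list shows all problem lines in one go instead of one per load attempt.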
    > > > >
    > > > > 2016-12-16 14:37 GMT+08:00 Eibe Frank <[hidden email]>:
    > > > > Can you share one of the lines of data that give you problems?
    > > > >
    > > > > Cheers,
    > > > > Eibe
    > > > >
    > > > > > On 16 Dec 2016, at 19:24, Ivan Ruiz <[hidden email]> wrote:
    > > > > >
    > > > > > Hello, and thanks for your reply, Peter. I changed the file to tab-delimited and I keep getting the same error: wrong number of values at line xxxx. When I delete that line I get the same error at a different line. I don't understand.
    > > > > > Regards.
    > > > > > Ivan.
    > > > > >
    > > > > > 2016-12-16 3:58 GMT+08:00 Peter Reutemann <[hidden email]>:
    > > > > > > Hello, I have been having problems loading a .csv file into WEKA.
    > > > > > > It's a file with 7 columns and 469579 rows.
    > > > > > > The column headers are: TARGET ID (number), TITLE (text), DETAIL (text),
    > > > > > > DATE (numeric), TRAINING (nominal), DETAIL TRAINING (nominal), CLASS
    > > > > > > (nominal).
    > > > > > > I deleted all weird symbols like (!@#$%^&*()_-+=,.<>:;"'[]\|), even the
    > > > > > > line breaks. When I try to load the file into WEKA I get the error: wrong number
    > > > > > > of values. Read 2, expected 7, read Token[EOL], line 789.
    > > > > > > I tried everything with no success. What I'm doing is opening the CSV file, going
    > > > > > > to line 789, deleting that row, saving, and trying again. I get the
    > > > > > > same error, but now at line 1200, for example.
    > > > > > > I'm sure I can solve this problem following this logic, but it is going to
    > > > > > > take a while. Do you have any idea what I'm doing wrong? I just want to put
    > > > > > > out there that I'm a newbie in all of this, so please bear with me. Thanks
    > > > > > > in advance.
    > > > > >
    > > > > > Instead of using comma as column separator, try using tab. That tends
    > > > > > to solve a lot of problems with non-escaped single/double quotes and
    > > > > > commas.
    > > > > >
    > > > > > Cheers, Peter
    > > > > > --
    > > > > > Peter Reutemann
    > > > > > Dept. of Computer Science
    > > > > > University of Waikato, NZ
    > > > > > +64 (7) 858-5174
    > > > > > http://www.cms.waikato.ac.nz/~fracpete/
    > > > > > http://www.data-mining.co.nz/
   



Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

ivanfrustrado
I just have to ask: is this you guys?

https://goo.gl/photos/7cEMcpxDgR43oXr96

2017-02-23 16:34 GMT+08:00 Ivan Ruiz <[hidden email]>:
I just have to ask: is this you guys?





Re: Wrong number of values. Read 2, expected 7, readToken[EOL] line xxxx

Eibe Frank-2
Administrator
Yes, Mark and I are co-authors of the book. Glad to see you got yourself a copy! :-)

Cheers,
Eibe

> On 23/02/2017, at 9:53 PM, Ivan Ruiz <[hidden email]> wrote:
>
> I just have to ask: is this you guys?
>
> https://goo.gl/photos/7cEMcpxDgR43oXr96
>
> 2017-02-23 16:34 GMT+08:00 Ivan Ruiz <[hidden email]>:
> I just have to ask: is this you guys?
>
>
>
