Getting an Error while opening a large .CSV file

Getting an Error while opening a large .CSV file

Wilger
Hello everyone!

I am a high school student new to Weka and Big Data in general, and I am
having a problem loading a .csv file into Weka. I get the error: "37 Problem
encountered on line:2"

I think the problem is the large number of symbols in some of the
attribute columns, but I am unsure how to get rid of them. Also, since it
says it encountered 37 problems (and I have exactly 37 columns), there seems
to be something wrong in each column.

I tried uploading the dataset, but it exceeds the 5 MB limit. Here is the link
to the datasets (the .csv files): https://webrobots.io/kickstarter-datasets/

Any response is greatly appreciated!

Nicholas Wilger



--
Sent from: http://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Getting an Error while opening a large .CSV file

Eibe Frank-2
Administrator
WEKA’s CSV reader is fairly non-standard. There are (at least) two reasons why it cannot process Kickstarter.csv from https://s3.amazonaws.com/weruns/forfun/Kickstarter/Kickstarter_2018-12-13T03_20_05_701Z.zip:

- Nested quotes in WEKA’s CSV files need to be escaped with backslash characters, e.g., "Here is some ""quoted text""" would have to become "Here is some \"quoted text\"".

- Newlines are not permitted inside a value, even if the value is quoted. This is the bigger problem. They need to be replaced by \n.
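
The two fixes above could be applied with a small preprocessing script before loading the file into WEKA. What follows is only a sketch in Python, not part of WEKA: the names weka_field and convert are made up for illustration, and it assumes the input is an ordinary CSV file that Python's csv module can parse. It also assumes that existing backslashes should be doubled and that only fields containing special characters need quoting; neither assumption comes from the reply itself.

```python
import csv
import io

# Characters that make a field need quoting in the output (an assumption).
SPECIAL = set(',"\'\\\r\n')


def weka_field(value: str) -> str:
    """Rewrite one parsed field using the two rules described above."""
    if not any(ch in SPECIAL for ch in value):
        return value  # plain values (numbers, simple words) pass through
    # Assumption: double existing backslashes so they are not read as escapes.
    value = value.replace("\\", "\\\\")
    # Rule 1: escape nested double quotes with a backslash.
    value = value.replace('"', '\\"')
    # Rule 2: replace real line breaks with the two characters \ and n.
    value = value.replace("\r\n", "\\n").replace("\r", "\\n").replace("\n", "\\n")
    return '"' + value + '"'


def convert(src, dst) -> None:
    """Read standard CSV from src and write the rewritten rows to dst."""
    for row in csv.reader(src):
        dst.write(",".join(weka_field(field) for field in row) + "\n")
```

To run it on real files, open both with newline="" (as the csv module documentation recommends) and an explicit encoding such as utf-8, then call convert(infile, outfile). Whether the result loads cleanly still depends on the rest of the data, so it is worth trying it on the first few rows before converting the whole file.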

Cheers,
Eibe

> On 13/01/2019, at 12:33 PM, Wilger <[hidden email]> wrote:
>
> Hello everyone!
>
> I am a high school student new to Weka and Big Data in general, and I am
> having a problem loading a .csv file into Weka. I get the error: "37 Problem
> encountered on line:2"
>
> I'm thinking the problem is the large number of symbols in some of the
> attribute columns, but I am unsure how to get rid of them. As well, as it
> says it encountered 37 problems (I only have 37 columns) so there seems to
> be something wrong in each.
>
> I tried uploading the dataset but it exceeds the 5mb limit. Here is the link
> to the datasets (the .csv files): https://webrobots.io/kickstarter-datasets/
>
> Any response is greatly appreciated!
>
> Nicholas Wilger
