TextDirectoryLoader not working

Horsten_T
Hi,

I went through the forum and am still not sure what I am doing wrong. From a set of txt
files stored in the directory c:\neg I want to create one ARFF file (the neg and
pos classified texts come from
http://www.cs.cornell.edu/people/pabo/movie-review-data/).

I use: Windows, Weka 3.8.4

I tried:

/1/ SimpleCLI
java weka.core.converters.TextDirectoryLoader –dir c:\neg\ > A2_neg.arff
java weka.core.converters.TextDirectoryLoader –dir c:/neg/ > A2_neg.arff

I also tried with and without the C: prefix, the slashes (/) and the backslashes (\).

each time I receive:
java.io.IOException: Directory '' not found         //WHY?
weka.core.converters.TextDirectoryLoader.setSource(TextDirectoryLoader.java:391)
weka.core.converters.TextDirectoryLoader.setDirectory(TextDirectoryLoader.java:360)
weka.core.converters.TextDirectoryLoader.setOptions(TextDirectoryLoader.java:212)
weka.core.converters.TextDirectoryLoader.run(TextDirectoryLoader.java:682)
weka.core.converters.TextDirectoryLoader.main(TextDirectoryLoader.java:649)
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
java.base/java.lang.reflect.Method.invoke(Unknown Source)
weka.gui.SimpleCLIPanel$ClassRunner.run(SimpleCLIPanel.java:328)

/2/ Preprocess => Open file... => File name c:/neg/ => Choose =>
weka.core.converters.TextDirectoryLoader

Results:
Instances = 0         //? WHY
Attributes
1 text                     string
2 @@class@@        nominal

Could anyone help?
Thanks
Horsten



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to: To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit
https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: TextDirectoryLoader not working

Peter Reutemann-3
On December 30, 2019 9:49:16 AM GMT+13:00, Horsten_T <[hidden email]> wrote:

>I tried:
>
>/1/ SimpleCLI
>java weka.core.converters.TextDirectoryLoader –dir c:\neg\ > A2_neg.arff
>java weka.core.converters.TextDirectoryLoader –dir c:/neg/ > A2_neg.arff
>
>I tried also with/without signs C: / \
>
>each time I receive:
>java.io.IOException: Directory '' not found         //WHY?

You have to supply the parent directory to the -dir option, i.e. the directory that contains all the subdirectories (= your classes), each holding the respective text files.
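
As a rough illustration of the layout described above, here is a small Python sketch (not part of Weka; the function name `summarize_class_dirs` is made up for this example). It lists the immediate subdirectories of a parent directory and counts the files in each. It assumes each subdirectory stands for one class. If it reports no subdirectories for the directory passed to -dir, that would be consistent with ending up with zero instances.

```python
from pathlib import Path


def summarize_class_dirs(parent):
    """Return {subdirectory name: number of regular files directly inside it}.

    Each immediate subdirectory of `parent` is treated as one class,
    mirroring the layout described for the -dir option. Files lying
    directly in `parent` are ignored.
    """
    parent = Path(parent)
    if not parent.is_dir():
        raise FileNotFoundError(f"Directory '{parent}' not found")
    summary = {}
    for child in sorted(parent.iterdir()):
        if child.is_dir():
            # Count only regular files, not nested directories.
            summary[child.name] = sum(1 for f in child.iterdir() if f.is_file())
    return summary
```

For example, `summarize_class_dirs("C:/wekadata")` should return one entry for "neg" and one for "pos" when the text files sit in those two subdirectories, and an empty dict when it is pointed at a directory such as c:/neg that holds only text files.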

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Re: TextDirectoryLoader not working

Horsten_T
to: Peter Reutemann

Hi Peter,

Thanks for your feedback. I followed your advice and created two
subdirectories, "neg" and "pos", in the main directory (java
weka.core.converters.TextDirectoryLoader –dir C:/wekadata/ >
C:/wekadata/A2_neg.arff). This command creates the file.

Problem: A2_neg.arff is empty, and I get the errors "java.io.IOException: premature
end of file, read Token[EOF] line 1" and "not recognised as ARFF file".

Maybe the problem is the txt files I want to convert; please see the attached example.

cv999_14636.txt <https://weka.8497.n7.nabble.com/file/t7032/cv999_14636.txt>  

Thanks for help.
H.

//In my final exercise, I will be working with hundreds of transcribed
dialogues, not sure yet whether in txt or csv. Due to the scale I will not be able to
work on single files. Thus I am trying to learn this on the available text files
and bulk-convert them into a single ARFF file.//




Re: TextDirectoryLoader not working

Peter Reutemann
> Thanks for your feedback. I used your advice, and created in the main
> directory, two subdirectories "neg" and "pos" (java
> weka.core.converters.TextDirectoryLoader –dir C:/wekadata/ >
> C:/wekadata/A2_neg.arff). This command creates file.
>
> Problem: A2_neg.arff is empty with an error "java.io.IOException: premature
> end of file, read Token[EOF] line 1" "not recognised as ARFF file".
>
> Maybe the problem is txt files I want to convert - pls see attached example.
>
> cv999_14636.txt <https://weka.8497.n7.nabble.com/file/t7032/cv999_14636.txt>

I don't have any problems generating an ARFF file from the attached
example data, which includes your text file, using the following
command line (Weka 3.9.3, Java 10 and 11):
java -cp weka.jar weka.core.converters.TextDirectoryLoader -dir /some/where/test3/ > /somewhere/else/blah.arff

NB: I run this straight from a terminal, not from the SimpleCLI.
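
A side observation, offered as a guess rather than a diagnosis: in this archive the earlier commands show the option as `–dir` with an en dash (U+2013), whereas the command above uses a plain `-dir`. That may just be how the web archive rendered the post, but if the en dash was actually typed or pasted, the option would presumably not be recognised. The Python sketch below (helper names are invented for illustration and are not part of Weka) flags arguments that start with a non-ASCII dash and builds the same argument list with an ASCII hyphen.

```python
import unicodedata


def find_non_ascii_dashes(args):
    """Return the arguments that begin with a dash-like character other than '-'.

    Unicode category 'Pd' (dash punctuation) covers the en dash (U+2013) and
    em dash (U+2014) as well as the ASCII hyphen-minus, which is excluded here.
    """
    return [a for a in args
            if a and a[0] != "-" and unicodedata.category(a[0]) == "Pd"]


def build_loader_command(weka_jar, parent_dir):
    """Build the argument list for the command shown above, using an ASCII '-dir'.

    `weka_jar` and `parent_dir` are placeholders for your own paths.
    """
    return ["java", "-cp", weka_jar,
            "weka.core.converters.TextDirectoryLoader", "-dir", parent_dir]
```

Calling `find_non_ascii_dashes` on the words of a pasted command is a quick way to rule this out; an empty list means every option starts with a plain hyphen.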

Cheers, Peter


Attachment: test3.zip (158K)