slow command line vs Explorer

slow command line vs Explorer

Francesco Sigona
Hi all,

I'm getting started with Weka and I apologize if my question has been answered in the past.

My question is about the speed of the command line interface (CLI) compared with the Explorer: the CLI seems to be much slower than the Explorer (I use Weka 3.8 on Windows 7 and the standard DOS prompt as the CLI).

My train file (.arff) has 1 nominal attribute with 2 values, i.e. 2 classes, 42 real attributes (single precision real numbers), 256908 instances.
My test file (.arff) has the same header, 1209 instances, and just the first class is present.
I'm trying the following classifier: weka.classifiers.lazy.IBk -K 1 -W 0 -A "weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\"", i.e. IBk with default values.
In Explorer it takes 0.12 seconds to build the model and 64.97 seconds to test.

In CLI (DOS) I use the following line from the WEKA installation directory:
java -cp .\weka.jar weka.classifiers.lazy.IBk -K 1 -W 0 -A "weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\"" -t c:\users\fs\Desktop\train_001.arff -c 1 -T c:\users\fs\Desktop\test_001.arff
In this case, the execution never ends (or takes a very long time) and I have to interrupt it.

So I tried another classifier: ZeroR. In Explorer it takes 0.06 seconds to build the model and 0.07 seconds to test, while in DOS it takes 0.31 seconds to build the model and 2.29 seconds to test, i.e. approximately 5x the time to build the model and 32x the time to test!

Maybe I'm misusing the CLI syntax? Or is the CLI really this slow? How can I work around this?

Many thanks!

Francesco


--
Francesco SIGONA
Electronics engineer

Piazza Filippo Muratore
73100 - Lecce - Italy
tel.: +39 0832 335006
fax.: +39 0832 335007
============================================================
Center for Interdisciplinary Research on Language (CRIL) &
Cognitive Neuroscience of Language and Speech Sciences Lab (CNLSS)
Dipartimento di Studi umanistici
Università del Salento
============================================================
Laboratorio Diffuso di Ricerca Interdisciplinare Applicata alla Medicina
(DReAM)

_______________________________________________
Wekalist mailing list
Send posts to: [hidden email]
List info and subscription status: https://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: slow command line vs Explorer

Eibe Frank-2
Administrator
This seems very peculiar. Are you sure you are using the same datasets? Is it perhaps using a different, very slow Java virtual machine when you are running it from the CLI? What information do you get when you run

  java -version

in the CLI?

Cheers,
Eibe
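
As a complement to `java -version`, here is a minimal sketch of a check that can be run from the same DOS prompt to see which JVM the `java` command actually resolves to. The class name `JvmInfo` is made up for illustration; the property keys are standard Java system properties. The output can then be compared with the SystemInfo window in the Weka GUI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Prints the properties that identify the JVM running this class. */
final class JvmInfo {

    /** Standard system property keys that identify the runtime. */
    static final String[] KEYS = {
        "java.version", "java.vm.name", "java.vm.version", "java.home", "os.arch"
    };

    /** Collects the properties in a fixed order so two runs are easy to compare. */
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<>();
        for (String key : KEYS) {
            info.put(key, System.getProperty(key));
        }
        return info;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If `java.home` or `java.vm.name` differs from what the GUI reports, the two environments are probably not using the same JVM.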


Re: slow command line vs Explorer

Francesco Sigona
Dear Eibe,

thank you for your answer.

I'm sure I am using the same datasets. Also, I get the same results in both environments.

This is the output of java -version in DOS CLI:
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

I get the same Java version when I look at the "SystemInfo" window provided by the WEKA Launcher.

The original train file is 39.9 MB when zipped; I can share it if you like.

Kind regards.

Francesco


Re: slow command line vs Explorer

Eibe Frank-2
Administrator
Perhaps it is due to different Java heap sizes. Try to increase the heap size using -Xmx when you run the java command. You could also try to monitor the behaviour in the two cases (GUI vs. CLI) using the jvisualvm tool (http://docs.oracle.com/javase/7/docs/technotes/tools/share/jvisualvm.html).

Cheers,
Eibe
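
To check the heap-size hypothesis, something like the following could be run from the same prompt, with and without `-Xmx`. This is only a sketch: the class name `HeapReport` is hypothetical, and it reports what the JVM itself says about its limits through the standard `Runtime` API, not anything specific to Weka.

```java
/** Reports the heap limits of the JVM running this class. */
final class HeapReport {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Maximum heap the JVM will attempt to use; this is the limit that -Xmx sets. */
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):   " + maxHeapMiB());
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Running `java HeapReport` and then `java -Xmx4g HeapReport` should show different maximum values (the `4g` figure is just an illustrative choice, not a recommendation). In the Weka command line, the `-Xmx` option goes before `-cp`, since JVM options must precede the class name.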
