Help with FilteredClassifier in Java


trubinsh
Hello!
I am building a binary classifier for web page classification and have trained
a FilteredClassifier with an SVM classifier and a StringToWordVector filter
(see the first code listing in my follow-up message), but now I am not able to
classify new instances; I get the first error shown there.
I have also tried building just the SVM model and passing it to a
FilteredClassifier along with a StringToWordVector (following this thread:
https://weka.8497.n7.nabble.com/Classifying-New-Data-from-Java-IDF-Transform-td11221.html),
as in the second code listing, but then I get the second error.

I could just train the model and do all the tokenization on my own, but I need
TF-IDF scores, and I don't know how to get IDF scores from Weka. Can someone
please help me understand how to get FilteredClassifier working, or at least
how to get the IDF scores out of the training data? The rest I could do on my own.
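
To make the fallback concrete, here is a minimal sketch in plain Java (no Weka classes) of computing IDF scores by hand from the training documents and applying them to a new document. It assumes the formulas described in the StringToWordVector documentation as I understand them: IDF transform fij*log(N / number of documents containing word i) and TF transform log(1 + fij). The class name IdfSketch, its methods, the whitespace tokenizer, and the sample documents are all hypothetical; it ignores Weka options such as wordsToKeep, n-grams, and normalization, so the values may not match Weka's output exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper (not part of Weka): IDF and TF-IDF computed by hand. */
final class IdfSketch {

    /** Assumption: lower-cased, whitespace-separated unigrams; swap in your own tokenizer. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** idf(term) = ln(N / df(term)); N = number of training docs, df = docs containing the term. */
    static Map<String, Double> idf(List<String> trainingDocs) {
        Map<String, Integer> docFreq = new TreeMap<>();
        for (String doc : trainingDocs) {
            // A set, so each term is counted at most once per document.
            Set<String> seen = new HashSet<>(tokenize(doc));
            for (String term : seen) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        int n = trainingDocs.size();
        Map<String, Double> idf = new TreeMap<>();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            idf.put(e.getKey(), Math.log((double) n / e.getValue()));
        }
        return idf;
    }

    /** TF-IDF for a new document: ln(1 + count) * idf; terms unseen in training are dropped. */
    static Map<String, Double> tfIdf(String doc, Map<String, Double> idf) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String term : tokenize(doc)) {
            if (idf.containsKey(term)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        Map<String, Double> vector = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            vector.put(e.getKey(), Math.log(1.0 + e.getValue()) * idf.get(e.getKey()));
        }
        return vector;
    }

    public static void main(String[] args) {
        // Made-up training documents, for illustration only.
        List<String> train = Arrays.asList(
                "cheap pills online", "online news today", "news about pills");
        Map<String, Double> idf = idf(train);
        System.out.println(idf);
        System.out.println(tfIdf("cheap cheap news", idf));
    }
}
```

The IDF map is computed once from the training set and then reused unchanged for every new document, so new instances are weighted by training-set statistics, not their own.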



--
Sent from: https://weka.8497.n7.nabble.com/
_______________________________________________
Wekalist mailing list -- [hidden email]
Send posts to [hidden email]
To unsubscribe send an email to [hidden email]
To subscribe, unsubscribe, etc., visit https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: Help with FilteredClassifier in Java

trubinsh
Sorry, the code didn't copy.
Code:
        FilteredClassifier classifier = (FilteredClassifier) SerializationHelper.read(
                String.format("%s%sbinary_filter.model", "D:\\resources", File.separator));
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("class", Arrays.asList("Positive", "Negative")));
        attributes.add(new Attribute("text", (ArrayList)null));
        Instances instances = new Instances("ClassifyInstances", attributes, 1);
        double[] inst = new double[2];
        inst[0] = 0;
        inst[1] = instances.attribute(1).addStringValue("some text to classify");
        instances.add(new DenseInstance(1, inst));
        classifier.classifyInstance(instances.get(0));
Error:
Exception in thread "main" java.lang.IllegalArgumentException: Src and Dest differ in # of attributes: 2 != 739

2nd code:
        FilteredClassifier classifier = new FilteredClassifier();
        LibSVM sclassifier = (LibSVM) SerializationHelper.read(
                String.format("%s%sbinary.model", "D:\\resources", File.separator));
        classifier.setClassifier(sclassifier);
        StringToWordVector vectorer = new StringToWordVector();
        vectorer.setTFTransform(true);
        vectorer.setIDFTransform(true);
        NGramTokenizer n = new NGramTokenizer();
        n.setNGramMaxSize(2);
        vectorer.setTokenizer(n);
        vectorer.setLowerCaseTokens(true);
        classifier.setFilter(vectorer);
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("class", Arrays.asList("Positive", "Negative")));
        attributes.add(new Attribute("text", (ArrayList)null));
        Instances instances = new Instances("ClassifyInstances", attributes, 1);
        double[] inst = new double[2];
        inst[0] = 0;
        inst[1] = instances.attribute(1).addStringValue("some text to classify");
        instances.add(new DenseInstance(1, inst));
        classifier.classifyInstance(instances.get(0));
2nd error:
Exception in thread "main" java.lang.NullPointerException: No output instance format defined



