2017-02-03 3 views

Stanford NLP sentiment: ambiguous result

I am using Stanford NLP v3.6 (Java) to compute the sentiment of English sentences.

Stanford NLP scores the polarity of a sentence on a scale from 0 to 4.

  • 0 very negative
  • 1 negative
  • 2 neutral
  • 3 positive
  • 4 very positive
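The scale above can be captured in a small lookup helper when printing results. This is only a sketch: `SentimentLabels` and `label` are hypothetical names chosen for illustration and are not part of CoreNLP.

```java
final class SentimentLabels {
    // Index in this array = sentiment class on the 0..4 scale listed above.
    private static final String[] LABELS = {
        "very negative", "negative", "neutral", "positive", "very positive"
    };

    private SentimentLabels() {}

    /** Maps a sentiment class (0..4) to its human-readable label. */
    static String label(int sentimentClass) {
        if (sentimentClass < 0 || sentimentClass >= LABELS.length) {
            throw new IllegalArgumentException(
                "Sentiment class must be in 0..4, got " + sentimentClass);
        }
        return LABELS[sentimentClass];
    }

    public static void main(String[] args) {
        for (int i = 0; i < LABELS.length; i++) {
            System.out.println(i + " = " + label(i));
        }
    }
}
```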

I ran a few very simple test cases, but I got very strange results.

Example:

  1. Text = "Jhon is a good person", sentiment = 3 (i.e. positive)
  2. Text = "David is a good person", sentiment = 2 (i.e. neutral)

In the example above, the sentences are identical apart from the name (David vs. Jhon), yet the sentiment values differ. Isn't this an ambiguity?

I used this Java code to compute the sentiment:

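    import java.util.Properties;

    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
    import edu.stanford.nlp.trees.Tree;
    import edu.stanford.nlp.util.CoreMap;

    public static float calSentiment(String text) {

        // The pipeline must be initialized before proceeding further.
        // The sentiment annotator works on the trees produced by the parse annotator.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Note: null or empty text falls through with mainSentiment = 0 (very negative).
        int mainSentiment = 0;
        if (text != null && text.length() > 0) {
            int longest = 0;
            Annotation annotation = pipeline.process(text);

            // Keep the predicted class of the longest sentence in the text.
            for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
                Tree tree = sentence.get(SentimentCoreAnnotations.SentimentAnnotatedTree.class);
                int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
                String partText = sentence.toString();

                if (partText.length() > longest) {
                    mainSentiment = sentiment;
                    longest = partText.length();
                }
            }
        }
        // Sentinel value for a class outside the valid 0..4 range.
        if (mainSentiment > 4 || mainSentiment < 0) {
            return -9999;
        }
        return mainSentiment;
    }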

Am I missing something in the Java code? I also got a negative sentiment (i.e. less than 2) when the sentence was positive, and vice versa.

Thanks.

Here are the results I got with simple English sentences:

Sentence: Tendulkar is a great batsman 
Sentiment: 3 
Sentence: David is a great batsman 
Sentiment: 3 
Sentence: Tendulkar is not a great batsman 
Sentiment: 1 
Sentence: David is not a great batsman 
Sentiment: 2 
Sentence: Shyam is not a great batsman 
Sentiment: 1 
Sentence: Dhoni loves playing football 
Sentiment: 3 
Sentence: John, Julia loves playing football 
Sentiment: 3 
Sentence: Drake loves playing football 
Sentiment: 3 
Sentence: David loves playing football 
Sentiment: 2 
Sentence: Virat is a good boy 
Sentiment: 2 
Sentence: David is a good boy 
Sentiment: 2 
Sentence: Virat is not a good boy 
Sentiment: 1 
Sentence: David is not a good boy 
Sentiment: 2 
Sentence: I love every moment of life 
Sentiment: 3 
Sentence: I hate every moment of life 
Sentiment: 2 
Sentence: I like dancing and listening to music 
Sentiment: 3 
Sentence: Messi does not like to play cricket 
Sentiment: 1 
Sentence: This was the worst movie I have ever seen 
Sentiment: 0 
Sentence: I really appreciated the movie 
Sentiment: 1 
Sentence: I really appreciate the movie 
Sentiment: 3 
Sentence: Varun talks in a condescending way 
Sentiment: 2 
Sentence: Ram is angry he did not win the tournament 
Sentiment: 1 
Sentence: Today's dinner was awful 
Sentiment: 1 
Sentence: Johny is always complaining 
Sentiment: 3 
Sentence: Modi's demonetisation has been very controversial and confusing 
Sentiment: 1 
Sentence: People are left devastated by floods and droughts 
Sentiment: 2 
Sentence: Chahal did a fantastic job by getting the 6 wickets 
Sentiment: 3 
Sentence: England played terribly bad 
Sentiment: 1 
Sentence: Rahul Gandhi is a funny man 
Sentiment: 3 
Sentence: Always be grateful to those who are generous towards you 
Sentiment: 3 
Sentence: A friend in need is a friend indeed 
Sentiment: 3 
Sentence: Mary is a jubilant girl 
Sentiment: 2 
Sentence: There is so much of love and hatred in this world 
Sentiment: 3 
Sentence: Always be positive 
Sentiment: 3 
Sentence: Always be negative 
Sentiment: 1 
Sentence: Never be negative 
Sentiment: 1 
Sentence: Stop complaining and start doing something 
Sentiment: 2 
Sentence: He is a awesome thief 
Sentiment: 3 
Sentence: Ram did unbelievably well in this year's exams 
Sentiment: 2 
Sentence: This product is well designed and easy to use 
Sentiment: 3 
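The comparisons above can be run systematically by filling one sentence template with different names and checking whether the score changes. The following is a minimal sketch, not part of CoreNLP: `NameSensitivityCheck`, `scoreByName`, and `isConsistent` are hypothetical names, and the scorer is passed in as a function so that `calSentiment` (or any other scorer) can be plugged in. The `main` method uses a stand-in scorer whose hard-coded values mimic the results reported above, only so the example runs without CoreNLP on the classpath.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

final class NameSensitivityCheck {
    private NameSensitivityCheck() {}

    /** Fills the template (one %s placeholder) with each name and scores the result. */
    static Map<String, Integer> scoreByName(String template, List<String> names,
                                            ToIntFunction<String> scorer) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (String name : names) {
            scores.put(name, scorer.applyAsInt(String.format(template, name)));
        }
        return scores;
    }

    /** True when every name received the same score. */
    static boolean isConsistent(Map<String, Integer> scores) {
        return scores.values().stream().distinct().count() <= 1;
    }

    public static void main(String[] args) {
        // Stand-in scorer (assumption): in real use, pass
        // text -> (int) calSentiment(text) instead.
        ToIntFunction<String> standIn = text -> text.startsWith("David") ? 2 : 1;

        Map<String, Integer> scores = scoreByName(
            "%s is not a great batsman",
            Arrays.asList("Tendulkar", "David", "Shyam"),
            standIn);

        System.out.println(scores + " consistent=" + isConsistent(scores));
    }
}
```

Passing the scorer as a `ToIntFunction<String>` keeps the check independent of how the pipeline is built, so the same harness can compare CoreNLP versions or models.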
+1

I get equally absurd results with version 3.7.0 and Python. I think this is a bug. – sds

+0

See https://github.com/stanfordnlp/CoreNLP/issues/351 – sds

Answer


The sentiment decisions are made by a trained neural network. Unfortunately, it behaves oddly depending on which name you put into an otherwise identical sentence, but that is to be expected. As discussed on GitHub, one likely factor is that many names do not appear often in the training data.
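A rough illustration of how a single word can flip the label: assuming the reported class is simply the highest-scoring of five class probabilities (the usual way a classifier's output is read), a small shift between two nearly tied classes changes the result. The probability values below are invented for illustration and are not actual CoreNLP output; `ArgmaxDemo` and `predictedClass` are hypothetical names.

```java
final class ArgmaxDemo {
    private ArgmaxDemo() {}

    /** Returns the index of the largest probability (the first one wins on ties). */
    static int predictedClass(double[] probabilities) {
        if (probabilities.length == 0) {
            throw new IllegalArgumentException("At least one class is required");
        }
        int best = 0;
        for (int i = 1; i < probabilities.length; i++) {
            if (probabilities[i] > probabilities[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Invented distributions over classes 0..4; only classes 2 and 3 differ slightly.
        double[] withOneName = {0.02, 0.08, 0.42, 0.44, 0.04};
        double[] withAnotherName = {0.02, 0.08, 0.45, 0.41, 0.04};

        System.out.println(predictedClass(withOneName));     // prints 3 (positive)
        System.out.println(predictedClass(withAnotherName)); // prints 2 (neutral)
    }
}
```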