Sentiment Analysis at Different Structural Levels in Text
Sentiment can be measured at many structural levels in text, ranging from individual words to entire documents. The sentiment analysis provided by TheySay’s PreCeive API supports measuring sentiment at the level of documents, sentences, entities (cf. Noun Phrases, keywords, key phrases), entity relations, and more. Choosing the level of analysis that matches your analytical requirements and use cases greatly enhances the coverage and accuracy of automated sentiment analysis.
The default – and often baseline – analysis targets entire documents. A sentiment score is calculated for the whole piece of text submitted for analysis. A document may contain three paragraphs or just a single sentence. Since a single paragraph can express both positive and negative sentiment concurrently, document-level sentiment analysis is often too holistic and may not tell the full story.
In a slightly more detailed analysis, a sentiment score is generated for each sentence. In our analysis, this is not simply an average of the entities or words in each sentence but rather an aggregation of every entity in the sentence, derived through exhaustive grammatical analysis. This provides greater accuracy and far greater coverage than naïve methods.
Entity-level analysis is an even more granular and thorough examination of the deeper sentiment contexts and signals in the text. Entities are words or phrases such as “doctors”, “nurses”, “broken bones”, and “second-degree burns”. In our analysis, all entities are analysed and a fine-grained sentiment score is assigned to each.
For example, consider the following text:
On Saturday night, I had a car accident and was severely hurt. I was rushed to Accident and Emergency where I was treated for many injuries including several broken bones. The doctors and nurses who treated me were fantastic, their care was amazing. I would like to thank all staff at the Hospital for their wonderful care.
Document-level Sentiment: Overall, the document is mildly negative because it mixes negative and positive sentiments.
Sentence-level Sentiment: There are four sentences. The first two are negative, whereas the last two express highly positive sentiment.
Entity-level Sentiment: The Noun Phrases “broken bones” and “injuries” are detected as negative entities, whereas “doctors”, “nurses”, and “wonderful care” are strongly positive.
Therefore, if I were interested in the sentiment regarding my care by medical staff, holistic document-level analysis would deliver a very unclear answer.
If I instead used either sentence- or entity-level analysis, the answer would be unambiguously positive and hence more accurate and relevant to my use case.
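The comparison above can be sketched in a few lines of Python. This is only an illustration: the labels are transcribed by hand from the worked example rather than returned by the PreCeive API, and the `Labelled` class, the `labels_for` helper, and the `care_terms` list are hypothetical names introduced here for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labelled:
    """A piece of text with a hand-assigned sentiment label (hypothetical type)."""
    text: str
    label: str  # "positive" or "negative"


# Labels copied from the worked example; these are NOT real API output.
sentences = [
    Labelled("On Saturday night, I had a car accident and was severely hurt.", "negative"),
    Labelled("I was rushed to Accident and Emergency where I was treated for "
             "many injuries including several broken bones.", "negative"),
    Labelled("The doctors and nurses who treated me were fantastic, "
             "their care was amazing.", "positive"),
    Labelled("I would like to thank all staff at the Hospital for "
             "their wonderful care.", "positive"),
]

entities = [
    Labelled("broken bones", "negative"),
    Labelled("injuries", "negative"),
    Labelled("doctors", "positive"),
    Labelled("nurses", "positive"),
    Labelled("wonderful care", "positive"),
]

# The document as a whole was described as mildly negative.
document = Labelled(" ".join(s.text for s in sentences), "negative")


def labels_for(items, terms):
    """Return the labels of items whose text mentions any of the given terms."""
    lowered = [t.lower() for t in terms]
    return [item.label for item in items
            if any(t in item.text.lower() for t in lowered)]


# Question of interest: how did the author feel about the care received?
care_terms = ["doctors", "nurses", "care"]
sentence_view = labels_for(sentences, care_terms)
entity_view = labels_for(entities, care_terms)

print("document:", document.label)
print("sentences about care:", sentence_view)
print("entities about care:", entity_view)
```

The document-level label says nothing specific about the care received, while both the sentence and entity views restricted to care-related terms contain only positive labels. The substring matching used here is deliberately crude and stands in for whatever filtering suits your own data.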
Which structural level is appropriate depends entirely on your requirements and the type of text you are processing. For tweets or very short texts, document-level sentiment analysis is often enough. However, if you are profiling sentiment around a specific aspect or feature of a brand, person, product or organisation, and your text is more than a few sentences long, then sentence- or entity-level analysis is often a far better choice.
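As a rough illustration of this rule of thumb, the following Python sketch suggests a level from two inputs: whether you are profiling a specific aspect, and how many sentences the text contains. The function name `suggest_level`, the threshold of three sentences standing in for "a few", and the regex-based sentence splitter are all assumptions made for this sketch; none of them are part of the PreCeive API.

```python
import re


def suggest_level(text: str, aspect_focused: bool, few_sentences: int = 3) -> str:
    """Suggest an analysis level using the rule of thumb described above.

    few_sentences is an assumed cut-off for "a few sentences"; tune it
    for your own data.
    """
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    parts = [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]
    if aspect_focused and len(parts) > few_sentences:
        return "sentence-or-entity"
    return "document"


tweet = "Loving the new phone!"
review = (
    "On Saturday night, I had a car accident and was severely hurt. "
    "I was rushed to Accident and Emergency. "
    "The doctors and nurses were fantastic. "
    "I would like to thank all staff at the Hospital."
)

print(suggest_level(tweet, aspect_focused=False))
print(suggest_level(review, aspect_focused=True))
```

Short texts and texts analysed without a specific aspect in mind fall through to document-level analysis, while longer, aspect-focused texts are routed to the finer-grained levels.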