Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis

Orestes Appel, Francisco Chiclana, Jenny Carter, Hamido Fujita

Research output: Contribution to journal (Article)

10 Citations (Scopus)

Abstract

Lexicon-based methods for Sentiment Analysis (SA) sometimes fail to generate a classification output for specific instances of a given dataset. Most often, the reason is that terms required for the classification are absent from the sentiment lexicon. In such cases, there have been only two possible paths to follow: (1) add terms to the lexicon by human intervention (an off-line process), which guarantees that no noise is introduced into the lexicon but prevents the classification system from providing an immediate answer; or (2) use the services of a word-frequency dictionary (an on-line process), which is computationally costly to build. This paper investigates an alternative approach to compensate for the inability of a lexicon-based method to produce a classification output. The method is based on combining the classification outputs of non-lexicon-based tools. First, the outcome values of applying two or more non-lexicon classification methods are obtained. Second, these non-lexicon outcomes are fused using a uninorm-based approach, which has been proved to have the compensation properties required in the SA context, to generate the classification output that the lexicon-based approach is unable to achieve. Experimental results are presented based on the execution of two well-known supervised machine learning algorithms, namely Naïve Bayes and Maximum Entropy, and the application of a cross-ratio uninorm operator. Performance indices associated with options (1) and (2) above are compared against the results obtained using the proposed approach for two different datasets. Additionally, the performance of the proposed cross-ratio uninorm approach is compared with that obtained when the arithmetic mean is used as the aggregation operator instead.
The results show that combining non-lexicon-based classification methods with specific uninorm operators improves the classification performance of lexicon-based methods and offers an alternative solution to the SA classification problem when needed. The proposed aggregation method could also replace the ensemble averaging techniques commonly applied when combining the outputs of several machine learning classifiers.
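
The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes the standard cross-ratio uninorm with identity element 0.5, U(x, y) = xy / (xy + (1 - x)(1 - y)), and assumes each classifier emits a positive-class score in [0, 1]. The function names and the example scores are hypothetical; the abstract does not state the exact operator parameters or how the classifier outputs are normalised, so this is not necessarily the authors' implementation.

```python
from functools import reduce


def cross_ratio_uninorm(x: float, y: float) -> float:
    """Cross-ratio uninorm on [0, 1] with identity element 0.5 (assumed form)."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("inputs must lie in [0, 1]")
    # (0, 1) and (1, 0) give 0/0; the conjunctive convention returns 0.
    if {x, y} == {0.0, 1.0}:
        return 0.0
    num = x * y
    return num / (num + (1.0 - x) * (1.0 - y))


def fuse(scores):
    """Fuse two or more scores by folding the (associative) uninorm over them."""
    return reduce(cross_ratio_uninorm, scores)


def arithmetic_mean(scores):
    """Baseline aggregation used for comparison."""
    return sum(scores) / len(scores)


# Hypothetical positive-class scores from two classifiers
# (for example Naive Bayes and Maximum Entropy).
scores = [0.8, 0.7]
print("uninorm:", round(fuse(scores), 4))
print("mean:   ", round(arithmetic_mean(scores), 4))
```

When both scores lie above 0.5 the uninorm returns a value higher than either input, when both lie below 0.5 it returns a value lower than either, and a score of exactly 0.5 leaves the other input unchanged. The arithmetic mean, by contrast, always stays between its inputs, which is the behavioural difference behind the comparison mentioned in the abstract.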
Language: English
Pages: 16-22
Number of pages: 7
Journal: Knowledge-Based Systems
Volume: 124
Early online date: 28 Feb 2017
DOI: 10.1016/j.knosys.2017.02.028
Publication status: Published - 15 May 2017
Externally published: Yes

Fingerprint

Agglomeration
Learning systems
Sentiment analysis
Glossaries
Learning algorithms
Classifiers
Entropy
Operator

Cite this

Appel, Orestes ; Chiclana, Francisco ; Carter, Jenny ; Fujita, Hamido. / Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis. In: Knowledge-Based Systems. 2017 ; Vol. 124. pp. 16-22.
@article{addd9fc7440b4492a3ba3ec28691abe4,
title = "Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis",
abstract = "Lexicon-based methods for Sentiment Analysis (SA) sometimes fail to generate a classification output for specific instances of a given dataset. Most often, the reason is that terms required for the classification are absent from the sentiment lexicon. In such cases, there have been only two possible paths to follow: (1) add terms to the lexicon by human intervention (an off-line process), which guarantees that no noise is introduced into the lexicon but prevents the classification system from providing an immediate answer; or (2) use the services of a word-frequency dictionary (an on-line process), which is computationally costly to build. This paper investigates an alternative approach to compensate for the inability of a lexicon-based method to produce a classification output. The method is based on combining the classification outputs of non-lexicon-based tools. First, the outcome values of applying two or more non-lexicon classification methods are obtained. Second, these non-lexicon outcomes are fused using a uninorm-based approach, which has been proved to have the compensation properties required in the SA context, to generate the classification output that the lexicon-based approach is unable to achieve. Experimental results are presented based on the execution of two well-known supervised machine learning algorithms, namely Na{\"i}ve Bayes and Maximum Entropy, and the application of a cross-ratio uninorm operator. Performance indices associated with options (1) and (2) above are compared against the results obtained using the proposed approach for two different datasets. Additionally, the performance of the proposed cross-ratio uninorm approach is compared with that obtained when the arithmetic mean is used as the aggregation operator instead.
The results show that combining non-lexicon-based classification methods with specific uninorm operators improves the classification performance of lexicon-based methods and offers an alternative solution to the SA classification problem when needed. The proposed aggregation method could also replace the ensemble averaging techniques commonly applied when combining the outputs of several machine learning classifiers.",
keywords = "Cross-ratio uninorms, Hybrid sentiment analysis, Maximum entropy, Na{\"i}ve Bayes, Semantic orientation aggregation, Supervised machine learning",
author = "Orestes Appel and Francisco Chiclana and Jenny Carter and Hamido Fujita",
year = "2017",
month = "5",
day = "15",
doi = "10.1016/j.knosys.2017.02.028",
language = "English",
volume = "124",
pages = "16--22",
journal = "Knowledge-Based Systems",
issn = "0950-7051",
publisher = "Elsevier",

}

Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis. / Appel, Orestes; Chiclana, Francisco; Carter, Jenny; Fujita, Hamido.

In: Knowledge-Based Systems, Vol. 124, 15.05.2017, p. 16-22.

TY - JOUR

T1 - Cross-ratio uninorms as an effective aggregation mechanism in sentiment analysis

AU - Appel, Orestes

AU - Chiclana, Francisco

AU - Carter, Jenny

AU - Fujita, Hamido

PY - 2017/5/15

Y1 - 2017/5/15

N2 - Lexicon-based methods for Sentiment Analysis (SA) sometimes fail to generate a classification output for specific instances of a given dataset. Most often, the reason is that terms required for the classification are absent from the sentiment lexicon. In such cases, there have been only two possible paths to follow: (1) add terms to the lexicon by human intervention (an off-line process), which guarantees that no noise is introduced into the lexicon but prevents the classification system from providing an immediate answer; or (2) use the services of a word-frequency dictionary (an on-line process), which is computationally costly to build. This paper investigates an alternative approach to compensate for the inability of a lexicon-based method to produce a classification output. The method is based on combining the classification outputs of non-lexicon-based tools. First, the outcome values of applying two or more non-lexicon classification methods are obtained. Second, these non-lexicon outcomes are fused using a uninorm-based approach, which has been proved to have the compensation properties required in the SA context, to generate the classification output that the lexicon-based approach is unable to achieve. Experimental results are presented based on the execution of two well-known supervised machine learning algorithms, namely Naïve Bayes and Maximum Entropy, and the application of a cross-ratio uninorm operator. Performance indices associated with options (1) and (2) above are compared against the results obtained using the proposed approach for two different datasets. Additionally, the performance of the proposed cross-ratio uninorm approach is compared with that obtained when the arithmetic mean is used as the aggregation operator instead.
The results show that combining non-lexicon-based classification methods with specific uninorm operators improves the classification performance of lexicon-based methods and offers an alternative solution to the SA classification problem when needed. The proposed aggregation method could also replace the ensemble averaging techniques commonly applied when combining the outputs of several machine learning classifiers.

KW - Cross-ratio uninorms

KW - Hybrid sentiment analysis

KW - Maximum entropy

KW - Naïve Bayes

KW - Semantic orientation aggregation

KW - Supervised machine learning

U2 - 10.1016/j.knosys.2017.02.028

DO - 10.1016/j.knosys.2017.02.028

M3 - Article

VL - 124

SP - 16

EP - 22

JO - Knowledge-Based Systems

T2 - Knowledge-Based Systems

JF - Knowledge-Based Systems

SN - 0950-7051

ER -