Techniques for Sentiment Analysis on Twitter: Supervised Machine Learning and SentiStrength

Tomás Baviera

doi:10.7203/rd.v1i3.74

Authors

Tomás Baviera, Universidad Internacional de Valencia

Abstract

Sentiment analysis of messages posted on Twitter offers valuable opportunities for assessing the currents of opinion spread through this medium. The enormous volumes of text call for tools that can process these messages automatically without sacrificing reliability. This article describes two types of techniques for addressing this problem. The first strategy is based on Supervised Machine Learning. Applying it requires integrating several Natural Language Processing tools and starting from a labeled corpus. The second approach is based on polarity dictionaries. SentiStrength belongs to this line of work and is increasingly applied to studies of English-language Twitter. The article evaluates the most advanced studies that use each of these approaches to analyze tweets in Spanish. Finally, it outlines the advantages and limitations of each approach for political communication research. Although supervised machine learning can take context into account, the researcher needs data-analyst skills in order to fine-tune the process. SentiStrength, by contrast, is oriented more toward the semantic content of the terms in the message, and calls instead for linguistic expertise on the researcher's part. The main conclusion is that neither automatic method can dispense with rigorous manual coding if it is to be used reliably in research.
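The contrast between the two families of techniques can be illustrated with a minimal Python sketch. It is not the article's method and not SentiStrength's algorithm: the polarity dictionary, its weights, the tiny labeled corpus, and the choice of Naive Bayes as the supervised classifier are all assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy polarity dictionary (illustrative weights, not SentiStrength's lexicon).
POLARITY = {"good": 2, "great": 3, "love": 3, "bad": -2, "terrible": -3, "hate": -3}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def lexicon_label(text):
    """Dictionary approach: sum the term weights and take the sign."""
    score = sum(POLARITY.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neu"

class NaiveBayes:
    """Supervised approach: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][tok]
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

# Hypothetical hand-labeled corpus standing in for manually coded tweets.
TRAIN = [
    ("great speech by the candidate", "pos"),
    ("love this proposal", "pos"),
    ("terrible debate performance", "neg"),
    ("hate these budget cuts", "neg"),
]
model = NaiveBayes().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
```

In this toy setup, `lexicon_label("budget cuts")` finds no dictionary terms and falls back to neutral, while `model.predict("budget cuts")` leans negative because those words appeared only in negatively coded examples. Both behaviors depend entirely on human input: the dictionary entries in one case, the manually coded corpus in the other.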

Author biography

Tomás Baviera, Universidad Internacional de Valencia

Associate Professor in the Master's in Social Communication of Scientific Research, Universidad Internacional de Valencia (VIU).

Researcher in the Mediaflows Group, Universidad de Valencia.

Researcher at the Institute for Ethics in Communication and Organizations (IECO).
