ECU Libraries Catalog

Comparison of topic modeling methods for analyzing tweets on COVID-19 vaccine / by Zeinab Khanjarinezhadjooneghani.

Author/creator Khanjarinezhadjooneghani, Zeinab, author.
Other author/creator Tabrizi, M. H. N., degree supervisor.
Other author/creator East Carolina University. Department of Computer Science.
Format Theses and dissertations, Electronic, and Book
Publication Info [Greenville, N.C.] : [East Carolina University], 2021.
Description 1 online resource (100 pages) : color illustrations
Supplemental Content Access via ScholarShip
Subject(s)
Summary Twitter is a microblogging site and a popular social media platform for sharing thoughts on current world events. The dynamic nature of Twitter discussions makes it a valuable data source for mining people's opinions and emotions about world events, and for analyzing how opinion and sentiment toward specific targets shift over time. The COVID-19 outbreak is a recent event that has affected people's lives worldwide over the last two years, and many people share their feelings and experiences of the pandemic through social media. COVID-19-related tweets have recently been the subject of some research. This thesis analyzes tweets related to the COVID-19 vaccine. Its main objective is to mine human concerns about the COVID-19 vaccine using Twitter data. The thesis applies three topic modeling methods to discover the subjects discussed about the COVID-19 vaccine and to analyze the topics' dynamics over a specific period. The models are Latent Dirichlet Allocation (LDA), LDA with Gibbs sampling (LDA-Mallet), Nonnegative Matrix Factorization (NMF), and Top2Vec. Furthermore, the thesis compares these three topic modeling methods based on human judgment, coherence value, and topic uniqueness. The results show that both LDA models outperformed NMF in terms of Jaccard score. In addition, LDA-Mallet outperformed LDA and NMF in terms of coherence score. For some experiments it is difficult to determine whether NMF or LDA provided the better score, but overall NMF performed better than LDA in terms of coherence score. Top2Vec returned 255 topics for this case study, which is not desirable for the purpose of this study. The three other methods outperform Top2Vec in terms of Jaccard score and coherence value.
General note Presented to the Department of Computer Science
General note Advisor: Nasseh Tabrizi
General note Title from PDF t.p. (viewed July 19, 2022).
Dissertation note M.S. East Carolina University 2021
Bibliography note Includes bibliographical references.
Technical details System requirements: Adobe Reader.
Technical details Mode of access: World Wide Web.
Genre/form Academic theses.
Genre/form Thèses et écrits académiques.

Available Items

Library Electronic Resources
Location Access Content Online
Status ✔ Available