Text Analysis of ETDs in ProQuest Dissertations and Theses (PQDT) Global (2016-2018)

By 👩‍🔬Manika Lamba in Conference Paper 2019

November 6, 2019

Abstract

The rapid growth of electronic theses and dissertations (ETDs) makes it challenging to manage them and to extract appropriate knowledge for decision making. The present study addresses this problem by applying topic mining and prediction modeling tools to 263 full-text ETDs submitted to the PQDT Global database during 2016-18 in the field of library science. The study was divided into two phases. The first phase determined the core topics of the ETDs using Topic-Modeling-Tool (TMT), which is based on latent Dirichlet allocation (LDA); the second phase used prediction analysis on the RapidMiner platform to annotate future research articles on the basis of the modeled topics. The core topics (tags) for the studied period were found to be book history, school librarian, public library, communicative ecology, and informatics. Text network and trend analysis were then carried out on the high-probability co-occurring words. Lastly, a prediction model using the Support Vector Machine (SVM) classifier was created to predict the placement of future ETDs submitted to PQDT Global under the five modeled topics (a to e). When the test dataset was evaluated against the training dataset, the predictive model performed perfectly.
