Mapping of ETDs in ProQuest Dissertations and Theses (PQDT) Global database (2014-2018)

By 👩‍🔬Manika Lamba, Margam Madhusudhan in Article 2020

January 6, 2020



The information explosion in the form of electronic theses and dissertations (ETDs) makes it difficult for information practitioners to manage them and extract appropriate knowledge for decision making. This study addresses the problem by applying topic mining and prediction modeling, on the RapidMiner platform, to 441 full-text library science ETDs extracted from the PQDT Global database for 2014-2018. The study was divided into three phases. In the first phase, metadata analysis of the retrieved ETDs was performed to identify how entities such as universities, departments, types of degrees, and geographical areas are associated with the ETDs. In the second phase, eight core topics were determined using latent Dirichlet allocation (LDA): children's literature, academic library, information retrieval, archival science, user study, digital library, library leadership, and digital communication. Each ETD was then annotated with its modeled topic. Lastly, a prediction model based on a Support Vector Machine (SVM) was created to classify untagged ETDs that will be submitted to the database under the eight modeled topics (a to h).
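The second and third phases (model topics with LDA, annotate each document with its dominant topic, then train an SVM on those annotations to tag new documents) can be illustrated with a small, dependency-free Python sketch. This is not the authors' RapidMiner workflow. The collapsed Gibbs sampler, the Pegasos-style one-vs-rest linear SVM, the four-document toy corpus, the use of two topics instead of eight, and every function name below are assumptions made purely for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns (vocab, per-document topic proportions)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    n_dk = [[0] * n_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * len(vocab) for _ in range(n_topics)]  # word counts per topic
    n_k = [0] * n_topics                                # total tokens per topic

    def bump(d, k, v, delta):
        n_dk[d][k] += delta
        n_kw[k][v] += delta
        n_k[k] += delta

    # Start from a random topic assignment for every token.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            bump(d, k, wid[w], +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = wid[w]
                bump(d, z[d][i], v, -1)  # remove the token's current assignment
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                           / (n_k[t] + len(vocab) * beta) for t in range(n_topics)]
                z[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                bump(d, z[d][i], v, +1)  # record the resampled assignment

    theta = [[(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha)
              for t in range(n_topics)] for d, doc in enumerate(docs)]
    return vocab, theta


def bag_of_words(doc, vocab):
    """Count vector over a fixed vocabulary, plus a constant bias feature."""
    return [float(doc.count(w)) for w in vocab] + [1.0]


def train_svm(X, y, n_classes, epochs=50, lam=0.01, seed=0):
    """One-vs-rest linear SVM trained with Pegasos-style hinge-loss updates."""
    rng = random.Random(seed)
    W = [[0.0] * len(X[0]) for _ in range(n_classes)]
    order = list(range(len(X)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)
            for c in range(n_classes):
                target = 1.0 if y[i] == c else -1.0
                margin = target * sum(w * x for w, x in zip(W[c], X[i]))
                W[c] = [w * (1.0 - eta * lam) for w in W[c]]  # regularization shrink
                if margin < 1.0:                              # margin violated
                    W[c] = [w + eta * target * x for w, x in zip(W[c], X[i])]
    return W


def predict(W, x):
    """Return the index of the class whose linear score is highest."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return scores.index(max(scores))


# Hypothetical toy corpus standing in for tokenized ETD full texts.
docs = [
    "library catalog reference librarian collection".split(),
    "academic library collection librarian service".split(),
    "archive records preservation manuscript archivist".split(),
    "archive preservation records digitization manuscript".split(),
]
n_topics = 2
vocab, theta = fit_lda(docs, n_topics)
labels = [row.index(max(row)) for row in theta]  # annotate with dominant topic
X = [bag_of_words(doc, vocab) for doc in docs]
W = train_svm(X, labels, n_topics)

new_doc = "records preservation archive".split()  # an untagged submission
predicted = predict(W, bag_of_words(new_doc, vocab))
print(labels, predicted)
```

Topic indices produced by LDA are arbitrary, so in practice each index would still need a human-assigned name (such as "archival science") after inspecting its top words. Words in a new document that are absent from the training vocabulary are simply ignored by `bag_of_words`.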
