Exploring OCR Errors in Full-Text Large Documents: A Study of LIS Theses and Dissertations

Abstract The accuracy of OCR output for text mining and NLP analyses of large text documents can be impacted by errors that occur during the OCR process. The methodology involves retrieving electronic theses and dissertations (ETDs) in the LIS discipline from the ProQuest Dissertations and Theses Global database and manually reviewing the full-text ETDs for OCR problems associated with the conversion of PDF files into plain text format.

Topic Modelling and its Application in Libraries: a review of specialized literature

Abstract Text mining is one of the most popular and highly researched areas in the social sciences. To date, library professionals’ knowledge of text mining tools and practice is limited; as a result, the library community poorly understands the full range of issues related to text mining. This article provides information on applying a text mining approach called topic modelling in the library and information science domain. Topic modelling is a text mining approach that determines a generative model for documents.
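To make the phrase "generative model for documents" concrete, here is a minimal sketch of the generative story behind latent Dirichlet allocation, one common topic model. The abstract does not say which model the review covers, so the choice of LDA, the two toy topics, the `alpha` value, and all function names are assumptions made for illustration only.

```python
import random

# Hypothetical toy topics: each topic is a probability distribution over words.
TOPICS = [
    {"library": 0.5, "catalogue": 0.3, "user": 0.2},
    {"citation": 0.5, "journal": 0.3, "author": 0.2},
]

def sample_dirichlet(alphas, rng):
    """Draw one sample from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, length, rng):
    """Generate one document: draw a topic mixture, then draw each word."""
    # Per-document topic proportions (theta).
    theta = sample_dirichlet([alpha] * len(topics), rng)
    words = []
    for _ in range(length):
        # Pick a topic for this word position according to theta.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        vocab = list(topics[k])
        # Pick a word from the chosen topic's word distribution.
        word = rng.choices(vocab, weights=[topics[k][v] for v in vocab])[0]
        words.append(word)
    return theta, words

if __name__ == "__main__":
    theta, words = generate_document(TOPICS, alpha=0.5, length=10, rng=random.Random(0))
    print(theta)
    print(" ".join(words))
```

Fitting a topic model runs this story in reverse: given only the words, it estimates the topics and the per-document mixtures that most plausibly produced them.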

Gender Research in Political Science Journals: A Dataset

Abstract Research on gender and politics is becoming increasingly mainstreamed within political science. To document this process, we introduce a comprehensive dataset of articles published in 37 political science journals through 2019 that can be considered “gender and politics” research. While a recent related literature has explored the descriptive representation of women in political science by examining authorship and citation patterns, we argue that the identification of publications substantively focused on gender and politics not only illuminates trends but can contribute to broader conversations about substantive representation and methodological diversity in the discipline.

Barriers to Scholarly Publishing Among Library and Information Science Researchers: International Perspectives

Abstract The 21 authors of this study, 19 of whom are non-native English speakers, reflect on the barriers to publishing academic journal articles in top international journals. Each author responded to the same set of questions pertaining to educational (PhD) opportunities for emerging scholars, financial conditions for researchers, and challenges in publishing their work. Limited English language skills, lack of research funding, and different research topics were identified as the most significant barriers to publishing in these journals.

Research Evaluation of Computer Science Departments: A Cohort Study of Indian Central Universities

Abstract Purpose: Social interaction applications and reference tools are actively used by researchers to share and manage their research publications. Thus, this paper aims to determine the scholarly impact of selected Indian central universities. Design/methodology/approach: This study analyzed 669 articles having both Dimensions citations and Altmetric attention scores published by 35 Indian central universities for 4 subfields of Computer Science using Altmetric Explorer. This paper determined each university’s contribution in the studied subfields of Computer Science and the correlation among Altmetric attention score (aggregated and individual), Dimensions citation, and Mendeley readership counts for all 669 articles and for stratified percentile sets of the top 25% and top 50% of the overall number of articles.
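The correlation step can be sketched as follows. The abstract does not name the coefficient used, so Spearman's rank correlation is assumed here because citation and attention counts are typically skewed. The function names and the sample numbers are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

if __name__ == "__main__":
    # Hypothetical per-article counts, for illustration only.
    attention_scores = [12, 3, 45, 7, 7]
    citations = [30, 2, 51, 9, 11]
    print(round(spearman(attention_scores, citations), 3))
```

The same function can be applied to a subset, such as the top quarter of articles, by filtering the paired lists before calling `spearman`.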

Bibliometric Analysis of papers published during 1992 to 2019 in DESIDOC Journal of Library and Information Technology

Abstract The study analyses papers published in DESIDOC Journal of Library and Information Technology (DJLIT) using bibliometric techniques for the period of 1992-2019 (28 years) and citations received by these papers until 20th March 2020, as reflected by Google Scholar. The study examined the pattern of growth and the geographical distribution of the articles; identified the prolific authors and institutions, and their output; and examined the pattern of citations of the papers and identified the most cited authors.

Research productivity of health care policy faculty: a cohort study of Harvard Medical School

Abstract In today’s publish or perish environment, the scholarly impact of a research article holds great importance. The present study examined 2343 articles having both altmetric attention scores and citations published by 22 core health care policy faculty members at Harvard Medical School. Web of Science was used to retrieve the citations, whereas Altmetric Explorer was used to determine the altmetric attention score. The evaluation metrics in this study were focused on article-level information to determine each faculty member’s contribution to the health care policy department collected in November 2018.

Mapping of ETDs in ProQuest dissertations and theses (PQDT) global database (2014-2018)

Abstract The information explosion in the form of ETDs poses the challenge of management and extraction of appropriate knowledge for decision making by information practitioners. This study presents a solution to the problem by applying topic mining and prediction modeling to 441 full-text ETDs extracted from the PQDT Global database during 2014-2018 in the field of library science using the RapidMiner platform. This study was divided into three phases.

Metadata Tagging and Prediction Modeling: Case Study of DESIDOC Journal of Library and Information Technology (2008–17)

Abstract The present paper describes the importance and usage of metadata tagging and prediction modeling tools for researchers and librarians. 387 articles were downloaded from DESIDOC Journal of Library and Information Technology (DJLIT) for the period 2008-17, excluding guest editorials and special editions. This study was divided into two phases. The first phase determined the core topics from the research articles using Topic-Modeling-Tool (TMT), which was based on latent Dirichlet allocation (LDA), whereas the second phase employed prediction analysis using the RapidMiner toolbox to annotate future research articles on the basis of the modeled topics.

Mapping of topics in DESIDOC Journal of Library and Information Technology, India: a study

Abstract This study analyzed 928 full-text research articles retrieved from DESIDOC Journal of Library and Information Technology for the period of 1981–2018 using Latent Dirichlet Allocation. The study further tagged the articles with the modeled topics. 50 core topics were identified throughout the period of 38 years whereas only 26 topics were unique in nature. Bibliometrics, ICT, information retrieval, and user studies were highly researched areas in India for the epoch.
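Tagging articles with modeled topics can be sketched as below, assuming each article already has a vector of topic weights from a fitted LDA model and is tagged with its single highest-weight topic. The abstract does not describe the tagging rule, so this rule, the function name, the article identifiers, and the weights are all hypothetical.

```python
from collections import Counter

def tag_articles(doc_topic_weights, topic_labels):
    """Tag each article with the label of its highest-weight topic."""
    tags = {}
    for article_id, weights in doc_topic_weights.items():
        # Index of the topic with the largest weight for this article.
        best = max(range(len(weights)), key=weights.__getitem__)
        tags[article_id] = topic_labels[best]
    return tags

if __name__ == "__main__":
    # Hypothetical labels and weights, for illustration only.
    labels = ["Bibliometrics", "Information retrieval", "User studies"]
    weights = {
        "article-1": [0.1, 0.7, 0.2],
        "article-2": [0.6, 0.3, 0.1],
        "article-3": [0.2, 0.5, 0.3],
    }
    tags = tag_articles(weights, labels)
    print(tags)
    # Count how many articles fall under each topic label.
    print(Counter(tags.values()))
```

Counting the resulting tags gives a simple view of which topics dominate a given period.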

Author-Topic Modeling of DESIDOC Journal of Library and Information Technology (2008-2017), India

Abstract This study presents a method to analyze textual data and applies it to the field of Library and Information Science. This paper subsumes a special case of Latent Dirichlet Allocation and Author-Topic models where each article has one unique author and each author has one unique topic. Topic Modeling Toolkit is used to perform the author-topic modeling. The study further considers topics and their changes over time by taking into account both the word co-occurrence pattern and time.

Marketing of academic health libraries 2.0: a case study

Abstract The advent of Web 2.0 in libraries persuades the librarians to adopt new ways to communicate, determine, and satisfy the needs of the users. The paper aims to discuss this issue. A 30-question questionnaire was given to 30 undergraduate medical students of Vardhman Mahavir Medical College and a 10-question questionnaire was given to the librarian, to find out the marketing and promotional strategies employed by the library; determine the awareness and satisfaction level of the users; prepare library profile, customer profile and market profile; and perform SWOT analysis.

Application of sentiment analysis in libraries to provide temporal information service: a case study on various facets of productivity

Abstract With the advent of social media, people have found new ways through which they can express their views, opinions, and beliefs. This study presents interdisciplinary research in which sentiment analysis is applied to the economics discipline of productivity as an experimental study to introduce a new service for library users. First, data were retrieved from Twitter on 20 different queries related to productivity using the RapidMiner platform, and then sentiment analysis was performed employing AYLIEN Text Analysis Software.