ANALYSIS OF KNOWLEDGE GRAPHS FOR COVID-19 LITERATURE USING NLP

Authors

  • Habib Ul Rehman
  • Muhammad Javed Iqbal

Keywords:

Semantic Search, Knowledge Graphs, NLP, CORD-19, SPECTER

Abstract

Artificial intelligence and natural language processing have rapidly transformed information retrieval and semantic search. This thesis presents a framework for document-level semantic search and sentence-level analysis built on state-of-the-art embedding models and vector database technologies. It addresses complex research queries in critical domains such as healthcare, using the CORD-19 dataset as a case study. The proposed methodology uses SPECTER embeddings for document-level representation and performs semantic search with the Qdrant vector database. After retrieval, Doc2Vec and Word2Vec models are used to extract contextually relevant sentences from the retrieved documents. The framework is evaluated on five research queries, which illustrate its ability to retrieve and analyze domain-specific knowledge. To enhance interpretability, knowledge graphs are generated to visualize relationships among extracted entities, offering insight into the semantic structure of the retrieved information. This research provides a scalable and modular solution for semantic search, with potential applications in academic research, healthcare, and other domains that require precise information retrieval. Future directions include extending the framework to incorporate real-time data streams and exploring more advanced embedding models to improve performance.
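
The three stages described in the abstract (document retrieval, sentence extraction, and graph construction) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for SPECTER, Doc2Vec, and Word2Vec embeddings, the in-memory sort stands in for a Qdrant query, and the sample documents, entity list, and all function names are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_documents(query, docs, top_k=1):
    """Stage 1: rank documents by similarity to the query (vector-search stand-in)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]


def rank_sentences(query, doc, top_k=1):
    """Stage 2: split a retrieved document into sentences and keep the closest ones."""
    q = embed(query)
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    return sorted(sentences, key=lambda s: cosine(q, embed(s)), reverse=True)[:top_k]


def cooccurrence_edges(sentences, entities):
    """Stage 3: link entities that appear in the same sentence (simple graph edges)."""
    edges = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        present = sorted(e for e in entities if e in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical sample data, for illustration only.
docs = [
    "Masks reduce transmission of the virus. Ventilation also helps indoors.",
    "Vaccines lower the risk of severe disease. Boosters restore protection.",
    "Graph databases store nodes and edges.",
]
query = "do masks reduce virus transmission"

hits = search_documents(query, docs, top_k=1)
best = rank_sentences(query, hits[0], top_k=1)
edges = cooccurrence_edges(best, {"masks", "virus", "transmission"})
print(best)
print(dict(edges))
```

In a real pipeline, each stage would be swapped for the corresponding component named in the abstract, while the retrieve, extract, and link structure would stay the same.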

Published

2025-04-17

How to Cite

Habib Ul Rehman, & Muhammad Javed Iqbal. (2025). ANALYSIS OF KNOWLEDGE GRAPHS FOR COVID-19 LITERATURE USING NLP. Spectrum of Engineering Sciences, 3(4), 537–552. Retrieved from https://sesjournal.com/index.php/1/article/view/277