Comparative Analysis of FinBERT and DistilRoBERTa for NLP-Based Financial Insights in Pakistan's Stock Market
Abstract
The rapid growth of financial data and news coverage has heightened the need for advanced natural language processing (NLP) techniques that can extract meaningful insights. This study evaluates two state-of-the-art NLP models, FinBERT and DistilRoBERTa, for sentiment analysis of Pakistan Stock Exchange (PSX) data and Dawn News articles. FinBERT is a domain-specific model fine-tuned on financial text, while DistilRoBERTa offers a lightweight, efficient alternative. We collected stock market data and news articles through web scraping and used them to assess both models' performance. On news headlines, DistilRoBERTa achieved 100% accuracy, outperforming FinBERT (70% accuracy). On the Kaggle stock market dataset, the two models agreed on 90% of predictions, with DistilRoBERTa showing greater consistency in borderline cases. DistilRoBERTa's efficiency and adaptability make it suitable for real-time applications, while FinBERT excels in domain-specific tasks. This study highlights the potential of NLP models in emerging markets and suggests future research directions, including hybrid models and improved interpretability, to enhance financial sentiment analysis.
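The two headline metrics reported above, per-model accuracy against reference labels and the rate at which the two models agree with each other, can be illustrated with a minimal Python sketch. The function names and the five toy labels below are hypothetical and chosen only for illustration; they are not the study's data or its evaluation code.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the reference (gold) labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty label set")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def agreement_rate(preds_a: Sequence[str], preds_b: Sequence[str]) -> float:
    """Fraction of items on which two models output the same label.

    No gold labels are needed, so this also works on unlabeled text.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("both prediction lists must have the same length")
    if not preds_a:
        raise ValueError("cannot compute agreement on empty predictions")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Toy labels for illustration only (hypothetical, not the study's data).
gold = ["positive", "negative", "neutral", "positive", "negative"]
model_a = ["positive", "negative", "neutral", "positive", "positive"]
model_b = ["positive", "negative", "positive", "positive", "positive"]

print(f"model A accuracy: {accuracy(model_a, gold):.2f}")
print(f"model B accuracy: {accuracy(model_b, gold):.2f}")
print(f"A/B agreement:    {agreement_rate(model_a, model_b):.2f}")
```

Agreement is measured between the models rather than against ground truth, so a high agreement rate does not by itself imply that either model is accurate.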