Enhancing News Tweets Classification Through Pre-Processing Techniques

Authors

  • Rabia Latif, Department of Computer Science, HITEC University, Taxila Cantt, Taxila, Pakistan
  • Muhammad Khalid, Department of Computer Science, HITEC University, Taxila Cantt, Taxila, Pakistan
  • Samrin Fatima, Department of Computer Science, HITEC University, Taxila Cantt, Taxila, Pakistan
  • Dr. Saima Shaheen, Department of Computer Science, HITEC University, Taxila Cantt, Taxila, Pakistan
  • Abdullah Asif, Department of Computer Science, HITEC University, Taxila Cantt, Taxila, Pakistan

Abstract

In today's technology-driven era, social media platforms have reshaped the dissemination of news, and Twitter has emerged as a main source of real-time news updates. Because a large volume of news tweets is generated every second, there is a need for a system that accurately classifies news content for better real-time media monitoring. This research introduces a machine-learning-based approach that enhances the classification of news tweets through preprocessing techniques. A combination of preprocessing steps is applied to Wall Street Journal news tweets from Twitter. This preprocessing, designed specifically for Twitter, includes removing URLs, mentions, and emoticons, along with basic text preprocessing. The pre-processed text corpus is evaluated with different machine-learning models. Support Vector Machine (SVM) outperforms the others with an accuracy of 95%.
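
The Twitter-specific cleaning steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact pipeline, so everything below is an assumption: the regular expressions, the helper name `clean_tweet`, and the reading of "basic text preprocessing" as lowercasing plus punctuation stripping are hypothetical choices made for illustration only.

```python
import re

# Assumed patterns; the paper's actual rules are not given in the abstract.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# ASCII emoticons such as :) ;-( =D, matched only when set off by whitespace
# so that ordinary tokens like "10:30" are left untouched.
EMOTICON_RE = re.compile(r"(?<!\S)[:;=][\-^']?[)(\]\[DPp/\\|](?!\S)")
# A rough range covering common Unicode emoji and pictographs.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def clean_tweet(text: str) -> str:
    """Remove URLs, mentions, and emoticons, then apply basic normalization."""
    text = URL_RE.sub(" ", text)        # remove URLs
    text = MENTION_RE.sub(" ", text)    # remove @mentions
    text = EMOTICON_RE.sub(" ", text)   # remove ASCII emoticons
    text = EMOJI_RE.sub(" ", text)      # remove Unicode emoji
    text = text.lower()                 # assumed "basic" step: lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # assumed "basic" step: strip punctuation
    return " ".join(text.split())       # collapse repeated whitespace


if __name__ == "__main__":
    print(clean_tweet("Stocks rally :) via @WSJ https://t.co/abc123 #Markets"))
```

The cleaned strings would then be vectorized and passed to a classifier such as an SVM; that stage is omitted here because the abstract does not specify the features or hyperparameters used.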

Keywords: Wall Street Journal; Tokenization; Vectorization; Machine Learning Models; Deep Learning Models; Text Classification; Twitter; Preprocessing; Data Mining; Feature Engineering

Published

2025-02-11

How to Cite

Rabia Latif, Muhammad Khalid, Samrin Fatima, Dr. Saima Shaheen, & Abdullah Asif. (2025). Enhancing News Tweets Classification Through Pre-Processing Techniques. Spectrum of Engineering Sciences, 3(2), 226–248. Retrieved from https://sesjournal.com/index.php/1/article/view/153