SemanticCuetSync at CheckThat! 2024: Pre-trained Transformer-based Approach to Detect Check-Worthy Tweets
Published in CLEF 2024: Conference and Labs of the Evaluation Forum, September 09–12, 2024, Grenoble, France.
This paper presents an intelligent technique for classifying English, Arabic, and Dutch texts as check-worthy, leveraging pre-trained transformer-based models. The study explores ten baseline models, including LR, MNB, SVM, CNN+LSTM, CNN+BiLSTM, BERT-Base-Uncased, RoBERTa, AraBERTv2, Dutch-RoBERTa, and Dutch-BERT, to address the shared task. It also investigates a few-shot learning approach, SetFit, to identify check-worthy tweets or texts. Evaluation results demonstrate the superiority of transformer-based models: RoBERTa achieved the highest F1 score of 75.82% for English tweets, Dehate-BERT scored 52.55% for Arabic texts, and Dutch-BERT obtained a maximum score of 58.42% for Dutch texts. Our team ranked 6th overall for English, 5th for Arabic, and 16th for Dutch in the shared task challenge.
Recommended citation: Shohan, S. H., Hossain, M. S., Paran, A. I., Hossain, J., Ahsan, S., & Hoque, M. M. (2024). SemanticCuetSync at CheckThat! 2024: Pre-trained Transformer-based Approach to Detect Check-Worthy Tweets.
Download Paper