Leveraging Large Language Models to Estimate the Check-Worthiness of Multilingual Tweets

Published in the 27th International Conference on Computer and Information Technology (ICCIT), 20-22 December 2024, Cox's Bazar, Bangladesh.

Social media has significantly altered communication patterns, leading to an exponential increase in textual data online. This surge has, in turn, heightened the risk of spreading misinformation across platforms, making fact-checking and the identification of check-worthy content more crucial than ever. This paper presents an automated approach for detecting check-worthy content in multilingual tweets, leveraging open-source large language models (LLMs) such as Llama-3, Gemma-2, Phi-3, and Qwen-2 on a benchmark dataset. The study investigates the effectiveness of transformer-based LLMs in identifying check-worthy tweets in English, Arabic, and Dutch. We applied Low-Rank Adaptation (LoRA) to improve the computational efficiency of fine-tuning these models. The evaluation results show that Qwen2-7B outperforms the other models in English and Arabic, with F1 scores of 85.11% and 58.33%, respectively, while Gemma-2-9B achieves the highest F1 score for Dutch (66.75%). The findings indicate that the proposed Qwen2-7B models for English and Arabic surpass previous work, whereas the Gemma-2-9B model for Dutch still requires improvement.
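
The abstract describes the approach only at a high level. The sketch below illustrates one plausible way to fine-tune an open-source LLM with LoRA for binary check-worthiness classification, using the Hugging Face transformers, peft, and datasets libraries. The checkpoint name, LoRA hyperparameters, attention target modules, and the toy dataset are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch (not the authors' exact pipeline): LoRA fine-tuning of an
# open-source LLM for binary check-worthiness classification.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)
from peft import LoraConfig, get_peft_model, TaskType
from datasets import Dataset

model_name = "Qwen/Qwen2-7B"  # assumed checkpoint; the paper also evaluates Llama-3, Gemma-2, Phi-3

tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2, torch_dtype=torch.bfloat16)
model.config.pad_token_id = tokenizer.pad_token_id

# Low-Rank Adaptation: only small rank-r update matrices are trained,
# which keeps the memory footprint of fine-tuning a 7B model manageable.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Toy stand-in for the multilingual check-worthiness data (label 1 = check-worthy).
train_data = Dataset.from_dict({
    "text": ["The new law cuts taxes by 40%.", "Good morning everyone!"],
    "labels": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ckpt", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=2e-4, logging_steps=1),
    train_dataset=train_data,
)
trainer.train()
```

In this setup only the low-rank adapter weights and the classification head are updated, which is what makes fine-tuning 7B-9B parameter models feasible on modest hardware; the same script could be pointed at any of the other checkpoints listed above by changing `model_name`.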

Recommended citation: Paran, A. I., Hossain, M. S., Shohan, S. H., Hossain, J., Ahsan, S., & Hoque, M. M. (2024). Leveraging Large Language Models to Estimate the Check-Worthiness of Multilingual Tweets. In 27th International Conference on Computer and Information Technology (ICCIT), 20-22 December 2024, Cox's Bazar, Bangladesh.
Download Paper