Graph Sanitation to boost Bot-Detection Performance

Document Type

Master Thesis

License

CC-BY-NC-ND

Abstract

Bots on social media platforms such as X (formerly Twitter) are widely exploited by malicious actors to further personal and financial goals through coordinated efforts, and their activity is associated with decreased social trust and cohesion. Studying this phenomenon is an important and ongoing challenge, as bots actively adapt to circumvent existing detection mechanisms. Recent literature jointly leverages network and user data from large relational datasets to facilitate classification. In this paper, we propose RL-Sanitize, a modular reinforcement-learning framework for graph sanitation that pre-processes graph data with the aim of improving the performance of downstream node classification models beyond their original capacity. To the best of our knowledge, this is the first method to apply graph sanitation to the bot-detection task. We analyze and refine the framework through a series of small-scale experiments and evaluate its final performance on the TwiBot-20 dataset.
