Masquerade-23

An LLM-driven social bot dataset collected from Chirper.ai

Introduction

Over a three-month period from April 2023 to June 2023, we collected data from 36.7K social bot accounts on Chirper.ai. The data includes account metadata and behavioral information, as well as the 544.6K tweets generated by these accounts.

Statistical Information

Sub-channel	Platform Slicing: Tweet Num.	Platform Slicing: Account Num.	Account Record: Tweet Num.	Account Record: Account Num.	Account Record: Action Num.
EN	356395	23399	1047998	20814	272150
ZH	187391	13228	694368	11288	224282
JP	628	87	82824	82	11241
DE	96	11	5442	11	849
SP	109	37	37142	37	4255
Total	544619	36762	1867774	32232	512777
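
As a quick sanity check on the table above, the following minimal sketch recomputes the column totals from the per-channel rows and derives each sub-channel's share of tweets in the platform-slicing sub-dataset. The numbers are copied from the table; the names `ROWS`, `REPORTED_TOTAL`, `column_sums`, and `tweet_share` are illustrative assumptions and are not part of the dataset or of any released tooling.

```python
# Per-channel counts copied from the statistics table.
# Column order (an assumption made for this sketch):
# (slicing_tweets, slicing_accounts, record_tweets, record_accounts, record_actions)
ROWS = {
    "EN": (356395, 23399, 1047998, 20814, 272150),
    "ZH": (187391, 13228, 694368, 11288, 224282),
    "JP": (628, 87, 82824, 82, 11241),
    "DE": (96, 11, 5442, 11, 849),
    "SP": (109, 37, 37142, 37, 4255),
}

# The "Total" row as reported in the table.
REPORTED_TOTAL = (544619, 36762, 1867774, 32232, 512777)


def column_sums(rows):
    """Sum every column across all sub-channels."""
    return tuple(sum(col) for col in zip(*rows.values()))


def tweet_share(rows, channel):
    """Fraction of platform-slicing tweets that belong to one sub-channel."""
    total = sum(counts[0] for counts in rows.values())
    return rows[channel][0] / total


if __name__ == "__main__":
    # The per-channel rows should add up to the reported totals.
    assert column_sums(ROWS) == REPORTED_TOTAL
    for channel in ROWS:
        print(f"{channel}: {tweet_share(ROWS, channel):.2%} of platform-slicing tweets")
```

Running the script prints one line per sub-channel and raises an `AssertionError` if the rows and the reported totals ever disagree.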

Because of file size constraints, the complete dataset is hosted on Google Drive: https://drive.google.com/drive/folders/15aNjFZVb5b8G9LMXZDslVO3nETufym-P?usp=drive_link

Content Warning

Please note that we have retained inappropriate content generated by LLM-driven social bots, including text with extremist or terrorist (and even Nazi) leanings, as well as severely racially discriminatory remarks. We do not endorse these statements; however, we believe that documenting such content faithfully helps the academic community better understand and address this issue. Because this content may offend or discomfort some readers, we state this warning prominently both in the accompanying paper and on the dataset's release page.

Citation

If you find our work useful, please consider citing the following paper:

@article{li2023masquerade,
    title={Are you in a Masquerade? Exploring the Behavior and Impact of Large Language Model Driven Social Bots in Online Social Networks},
    author={Siyu Li and Jin Yang and Kui Zhao},
    journal={arXiv preprint arXiv:2307.10337},
    year={2023}
}