AraGenEval: Arabic Authorship Style Transfer and AI Generated Text Detection Shared Task

Hosted with the Arabic Natural Language Processing Conference (ArabicNLP 2025)

1. Overview of the Shared Task

The rapid expansion of user-generated content across social media, digital news platforms, and online communication has created a growing demand for sophisticated Natural Language Processing (NLP) techniques to analyze and manipulate writing styles. Unlike general text style analysis [1], which focuses on broad linguistic features, Authorship Style Transfer (AST) aims to transform a given text to match the distinctive writing style of a specific author while preserving its original meaning [2]. This contrasts with traditional stylistic analysis, where the goal is to identify and characterize an author’s style rather than actively modify text to conform to it.
In addition, recent advances in Arabic-based large language models have made it increasingly difficult to distinguish between human-written and AI-generated Arabic content [3]. We believe that Arabic style identification can help detect such content.
This shared task seeks to promote research in Arabic AST, which remains underdeveloped relative to AST research in other languages. Participants will develop models for one or more of the following subtasks:

  1. Authorship Style Transfer (Text Generation)
  2. Authorship Identification (Multiclass Classification)
  3. AI-Generated Text Detection (Binary Classification)

2. Motivation

Authorship style transfer and AI-generated text detection have applications in domains including education, cultural adaptation, and social media content generation. This shared task is motivated by the growing volume of Arabic-language discussion of socio-political and technological topics. Although authorship style transfer [4, 5] has been explored in NLP, the Arabic domain presents distinct challenges.

3. Data Collection and Creation

3.1 Authorship Style Transfer (Subtasks 1 & 2)

Dataset Statistics

Id    Author                Train   Test    Val
(1)   Ahmed Amin             2892    594    246
(2)   Ahmed Taymour Pasha     804    142     53
(3)   Ahmed Shawqi            596     46     58
(4)   Ameen Rihani           1557    624    142
(5)   Tharwat Abaza           755    191     90
(6)   Gibran Khalil Gibran    748    240     30
(7)   Jurji Zaydan           2762    562    326
(8)   Hassan Hanafi          3735   1002    548
(9)   Robert Barr            2680    512     82
(10)  Salama Moussa           984    282    119
(11)  Taha Hussein           2371    534    253
(12)  Abbas M. Al-Aqqad      1820    499    267
(13)  Abdel Ghaffar Makawi   1520    464    396
(14)  Gustave Le Bon         1515    358    150
(15)  Fouad Zakaria          1771    294    125
(16)  Kamel Kilani            399    109     25
(17)  Mohamed H. Heikal      2627    492    260
(18)  Naguib Mahfouz         1630    343    327
(19)  Nawal El Saadawi       1415    382    295
(20)  William Shakespeare    1236    358    238
(21)  Yusuf Idris            1140    349    120
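
The per-author counts above are uneven, which may matter for the multiclass setting of Subtask 2. The following is a minimal sketch, not part of the official task materials, that copies the table into a Python dictionary and derives the train-split imbalance ratio and inverse-frequency class weights. The names `SPLITS`, `train_imbalance_ratio`, and `inverse_frequency_weights` are hypothetical, and the weighting heuristic is an assumption about one way participants might handle imbalance, not a recommendation from the organizers.

```python
from typing import Dict, Tuple

# (train, test, val) counts copied from the Dataset Statistics table.
SPLITS: Dict[str, Tuple[int, int, int]] = {
    "Ahmed Amin": (2892, 594, 246),
    "Ahmed Taymour Pasha": (804, 142, 53),
    "Ahmed Shawqi": (596, 46, 58),
    "Ameen Rihani": (1557, 624, 142),
    "Tharwat Abaza": (755, 191, 90),
    "Gibran Khalil Gibran": (748, 240, 30),
    "Jurji Zaydan": (2762, 562, 326),
    "Hassan Hanafi": (3735, 1002, 548),
    "Robert Barr": (2680, 512, 82),
    "Salama Moussa": (984, 282, 119),
    "Taha Hussein": (2371, 534, 253),
    "Abbas M. Al-Aqqad": (1820, 499, 267),
    "Abdel Ghaffar Makawi": (1520, 464, 396),
    "Gustave Le Bon": (1515, 358, 150),
    "Fouad Zakaria": (1771, 294, 125),
    "Kamel Kilani": (399, 109, 25),
    "Mohamed H. Heikal": (2627, 492, 260),
    "Naguib Mahfouz": (1630, 343, 327),
    "Nawal El Saadawi": (1415, 382, 295),
    "William Shakespeare": (1236, 358, 238),
    "Yusuf Idris": (1140, 349, 120),
}


def train_imbalance_ratio(splits: Dict[str, Tuple[int, int, int]]) -> float:
    """Ratio of the largest to the smallest per-author train count."""
    sizes = [train for train, _, _ in splits.values()]
    return max(sizes) / min(sizes)


def inverse_frequency_weights(
    splits: Dict[str, Tuple[int, int, int]],
) -> Dict[str, float]:
    """Weight each author by total_train / (n_authors * author_train).

    This is a common balancing heuristic (an assumption here, not a task rule):
    authors with fewer training samples receive larger weights.
    """
    total = sum(train for train, _, _ in splits.values())
    n_authors = len(splits)
    return {
        author: total / (n_authors * train)
        for author, (train, _, _) in splits.items()
    }


ratio = train_imbalance_ratio(SPLITS)
weights = inverse_frequency_weights(SPLITS)
print(f"authors: {len(SPLITS)}, largest/smallest train ratio: {ratio:.2f}")
```

Such weights could be passed to a weighted loss when training a classifier; whether that helps on this dataset is an empirical question.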

3.2 AI-Text Detection (Subtask 3)

4. Task Description

4.1 Subtask 1: Authorship Style Transfer

4.2 Subtask 2: Authorship Identification

4.3 Subtask 3: AI-Generated Text Detection

This subtask focuses on two domains:

  1. Arabic News Text Detection (ArabicNewsGen)
    • Full-length articles and short excerpts; genres include politics, economy, technology, sports.
  2. Arabic Literature Text Detection (ArabicLitGen)
    • Literary forms, especially poetry; diverse stylistic and genre expressions.

Evaluation: F1-Score (primary), Accuracy (secondary).
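
For reference, the two metrics named for Subtask 3 can be computed as in the sketch below. It assumes binary labels where 1 marks AI-generated text and 0 marks human-written text, and it computes F1 for the positive class only. The task description does not state the label encoding or the F1 averaging mode, so both are assumptions, and the function name `accuracy_and_f1` is hypothetical rather than part of any official scorer.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(
    gold: Sequence[int], pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (accuracy, F1 of the positive class) for binary predictions."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold and pred must not be empty")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


# Toy example: 1 = AI-generated, 0 = human-written (assumed encoding).
gold = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
acc, f1 = accuracy_and_f1(gold, pred)
print(f"accuracy={acc:.4f}, f1={f1:.4f}")
```

In the toy example there are 2 true positives, 1 false positive, and 1 false negative, so accuracy and F1 differ. That difference illustrates why the two metrics are reported separately.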

5. Tentative Timeline

6. Organizers’ Details

7. Participation Guidelines

References

  1. Hu et al. “Text style transfer: A review and experimental evaluation.” ACM SIGKDD Explorations Newsletter, 24(1), 2022.
  2. Shao et al. “Authorship style transfer with inverse transfer data augmentation.” AI Open, 5, 2024.
  3. Alghamdi et al. “Distinguishing Arabic GenAI-generated Tweets and Human Tweets utilizing Machine Learning.” Engineering, Technology & Applied Science Research, 14(5), 16720-16726, 2024.
  4. Patel et al. “Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?” arXiv preprint arXiv:2212.08986, 2022.
  5. Horvitz et al. “TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings.” arXiv preprint arXiv:2406.15586, 2024.

Logistics and Support

Stay Updated

Anti-Harassment policy

We uphold the ACL Anti-Harassment Policy. Participants in this shared task are encouraged to contact any of the shared task organizers with concerns or questions.