Shared Task on Multi-Domain Detection of AI-Generated Text (M-DAIGT)

Task Overview

We invite researchers and practitioners to participate in the Multi-Domain Detection of AI-Generated Text (M-DAIGT) Shared Task, which focuses on detecting AI-generated text across multiple domains, specifically news articles and academic writing. With the growing prevalence of large language models, distinguishing human-written content from AI-generated text has become a critical challenge for information integrity and academic honesty.

Subtasks

Participants are encouraged to develop models for one or both of the following subtasks:

  1. News Article Detection (NAD)
    • Binary classification of news articles as either human-written or AI-generated
    • Evaluation on both full articles and article snippets
    • Coverage of various news genres, such as politics, technology, and sports
  2. Academic Writing Detection (AWD)
    • Binary classification of academic texts as either human-written or AI-generated
    • Includes student coursework and research papers
    • Spans multiple academic disciplines and writing styles

Dataset

The dataset for this shared task consists of:

Dataset Statistics:

Evaluation Metrics

Submissions will be evaluated using the following classification metrics:

Timeline

(All deadlines are 11:59 PM UTC-12:00, “Anywhere on Earth”)

How to Participate

Participants must:

Organizers

Resources Provided

Participants will have access to:

Expected Impact

This shared task aims to:

  1. Advance research in AI-generated text detection across multiple domains
  2. Develop real-world applications to support news organizations and academic integrity initiatives
  3. Establish a benchmark dataset for AI text detection research

Baseline Systems

To support participants, we will provide:

  1. Simple Statistical Baseline (TF-IDF + SVM)
  2. Transformer-Based Baseline (RoBERTa)
  3. Evaluation scripts and sample submission formats
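
For orientation, a TF-IDF + SVM baseline of the kind named above might look roughly like the following. This is a minimal sketch using scikit-learn, not the organizers' official baseline: the label names, the n-gram range, the SVM settings, and the toy training texts are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical label names; the official label scheme is not given here.
HUMAN, AI = "human", "ai"


def build_baseline():
    """Return a pipeline: word-level TF-IDF features fed into a linear SVM."""
    return make_pipeline(
        # Unigrams and bigrams with sublinear term-frequency scaling (assumed settings).
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
        LinearSVC(C=1.0, random_state=0),
    )


# Made-up toy texts, used only to show the fit/predict flow.
train_texts = [
    "Council members argued late into the night over the harbour budget.",
    "The striker limped off after a clumsy tackle in the second half.",
    "In conclusion, it is important to note that technology plays a vital role.",
    "Overall, this highlights the significance of innovation in modern society.",
]
train_labels = [HUMAN, HUMAN, AI, AI]

model = build_baseline()
model.fit(train_texts, train_labels)

# Classify an unseen text; the result is one label per input text.
predictions = model.predict(["It is important to note the vital role of innovation."])
print(predictions[0])
```

The same pipeline object can be trained separately for each subtask (NAD or AWD) by swapping in the corresponding training split.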

Novelty and Significance

This shared task differentiates itself from existing work by:

  1. Covering two different domains for cross-domain analysis
  2. Conducting comprehensive evaluations across various text types and lengths
  3. Using multiple AI generation sources for content diversity
  4. Addressing real-world applications in media and academia

Logistics and Support

Stay Updated

Anti-Harassment Policy

We uphold the ACL Anti-Harassment Policy. Participants in this shared task are encouraged to contact any of the organizers with concerns or questions.