How do AI Content Detectors Work?

Many tools claim to distinguish AI-generated content from human-written content, but until recently I doubted that they actually worked. Recognising AI-generated content is much harder than recognising old "spun" or plagiarised content, because most AI text is original in the sense that it is not simply copied from elsewhere on the internet.

However, as we're building an AI content detector at Ahrefs, I've been digging deeper into the topic. To understand how these tools work, I interviewed an expert who really understands the science and research behind them.

All AI content detectors work on the same basic principle: they look for patterns or anomalies in text that differ from human-written content. This requires two things: lots of examples of human- and AI-generated text, and a mathematical model for analysis.

There are three common approaches:

1. Statistical detection (traditional but effective method)

Attempts to detect machine-generated text have been around since the 2000s, and some of these older methods still work well today. Statistical detection distinguishes human-written from machine-generated text by counting writing patterns such as:

Word frequencies (how often certain words appear)
N-gram frequencies (how often given word sequences occur)
Syntactic structures (e.g. the frequency of subject-verb-object structures)
Stylistic subtleties (e.g. use of the first person, informal style, etc.)
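To make the counting concrete, here is a minimal sketch of how three of these signals could be extracted from a piece of text. The function names, the tokenising rule and the first-person word list are my own assumptions for illustration, not part of any particular detector. Syntactic structures are left out because they would need a proper parser.

```python
import re
from collections import Counter

# Assumed word list for the "use of the first person" signal.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def style_features(text):
    """Count word frequencies, bigram frequencies and first-person usage."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {
        "word_freq": Counter(tokens),
        "bigram_freq": Counter(ngrams(tokens, 2)),
        "first_person_rate": sum(t in FIRST_PERSON for t in tokens) / total,
    }

features = style_features("I think we should go. I think so.")
print(features["word_freq"].most_common(3))
print(features["first_person_rate"])
```

A real detector would compute features like these over large collections of human and AI text and then compare the two distributions. The counts on a single sentence say very little on their own.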

2. Neural networks (modern deep learning methods)

Neural networks are computer systems that loosely mimic the way the human brain works. Trained on labelled examples, they can learn the features that distinguish AI-generated text from human writing.

These methods can work well even with relatively small models, provided there is enough training data (a few thousand examples may be enough).
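As a rough illustration of learning from labelled examples, the sketch below trains the smallest possible neural network, a single sigmoid neuron over bag-of-words features. Everything in it is an assumption made for the example: the four training sentences are invented, the labels are mine, and real detectors use far larger models and thousands of examples.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples, epochs=200, lr=0.5):
    """Train one sigmoid neuron with stochastic gradient descent.

    examples: list of (text, label) pairs, where 1 = AI and 0 = human.
    Returns (weights, bias); weights maps each word to a learned score.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            tokens = set(tokenize(text))
            z = bias + sum(weights.get(t, 0.0) for t in tokens)
            pred = 1.0 / (1.0 + math.exp(-z))
            error = pred - label  # gradient of the log loss with respect to z
            bias -= lr * error
            for t in tokens:
                weights[t] = weights.get(t, 0.0) - lr * error
    return weights, bias

def predict(model, text):
    """Return the neuron's estimated probability that the text is AI-written."""
    weights, bias = model
    z = bias + sum(weights.get(t, 0.0) for t in set(tokenize(text)))
    return 1.0 / (1.0 + math.exp(-z))

# Invented toy data, for illustration only.
TOY_EXAMPLES = [
    ("In conclusion, it is important to note that tools are useful.", 1),
    ("Furthermore, it is important to note the following benefits.", 1),
    ("honestly i had no clue what i was doing lol", 0),
    ("my cat knocked the mug off my desk again, ugh", 0),
]

model = train(TOY_EXAMPLES)
print(predict(model, "It is important to note the benefits."))
```

With so little data the neuron simply memorises which words appeared under which label, which is exactly why the amount and variety of training data matters more than the size of the model.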

3. Watermarking (hidden signals in generated texts)

Watermarking embeds hidden signals in AI-generated text so that it can later be identified as machine-generated. It works much like the UV ink on banknotes, which distinguishes genuine money from counterfeits.

Watermarking can be applied in three ways:

Add watermarks to the datasets you release.
Include watermarks when generating text.
Add watermarks after text generation.
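To show what a watermark added during generation might look like, here is a sketch of a "green list" scheme of the kind described in published watermarking research. It is not taken from any specific product. The secret key, the vocabulary and the function names are all hypothetical, and the "model" is a stand-in that just picks the first green word it finds.

```python
import hashlib
import math

def is_green(prev_token, token, key="demo-key"):
    """Pseudo-randomly put about half of all tokens on the 'green list'.

    The split depends on the previous token and a secret key, so it looks
    random to anyone who does not know the key.
    """
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def watermarked_choice(prev_token, candidates, key="demo-key"):
    """Stand-in for a model's sampling step: prefer a green candidate."""
    for candidate in candidates:
        if is_green(prev_token, candidate, key):
            return candidate
    return candidates[0]  # no green candidate available, fall back

def green_z_score(tokens, key="demo-key"):
    """Compare the green-token count with the 50% expected by chance."""
    n = len(tokens) - 1  # number of (previous, current) token pairs
    if n < 1:
        return 0.0
    green = sum(is_green(a, b, key) for a, b in zip(tokens, tokens[1:]))
    return (green - 0.5 * n) / math.sqrt(0.25 * n)

VOCAB = [f"w{i}" for i in range(50)]  # hypothetical stand-in vocabulary
tokens = ["<start>"]
for _ in range(40):
    tokens.append(watermarked_choice(tokens[-1], VOCAB))

print(green_z_score(tokens))
```

Text written without the key should land near a z-score of zero, while watermarked text scores far above it. The detector only needs the key, not the model that produced the text.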

Summary

AI content detectors can be useful tools, but they have limitations. To get reliable results, it is important to understand what these tools can and cannot do, and to treat their output critically.
