Who wrote this? Engineers discover novel method to identify AI-generated text
Raidar detects machine-generated text by calculating rewriting modifications. This illustration shows character deletions in red and character insertions in orange. Human-generated text tends to trigger more modifications than machine-generated text when an LLM is asked to rewrite it. Credit: Yang and Vondrick labs

Computer scientists at Columbia Engineering have developed a transformative method for detecting AI-generated text. Their findings promise to revolutionize how we authenticate digital content, addressing mounting concerns surrounding large language models (LLMs), digital integrity, misinformation, and trust.

Computer Science Professors Junfeng Yang and Carl Vondrick spearheaded the development of Raidar (geneRative AI Detection viA Rewriting), which introduces an innovative approach for identifying whether text has been written by a human or generated by AI or LLMs like ChatGPT, without needing access to a model’s internal workings.

The paper, which includes open-sourced code and datasets, will be presented at the International Conference on Learning Representations (ICLR) in Vienna, Austria, May 7–11, 2024. It is currently available on the arXiv preprint server.

The researchers leveraged a unique characteristic of LLMs that they term “stubbornness”: LLMs tend to alter human-written text more readily than AI-generated text. This occurs because LLMs often regard AI-generated text as already optimal and thus make minimal changes.

The new approach, Raidar, uses a language model to rephrase or alter a given text and then measures how many edits the model makes to it. Raidar receives a piece of text, such as a social media post, product review, or blog post, and then prompts an LLM to rewrite it. The LLM replies with the rewritten text, and Raidar compares the original text with the rewritten text to measure modifications. Many edits mean the text is likely written by humans, while fewer modifications mean the text is likely machine-generated.
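For readers who want a concrete picture, the rewrite-and-compare idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' released code: the function names, the `rewrite` callable (a stand-in for whatever LLM is prompted to rewrite the text), the normalized edit ratio, and the fixed `threshold` of 0.2 are all hypothetical choices made for this example. The published system may measure and combine edits differently.

```python
from difflib import SequenceMatcher
from typing import Callable


def count_edits(original: str, rewritten: str) -> tuple[int, int]:
    """Return (deleted, inserted) character counts between two strings."""
    deleted = inserted = 0
    matcher = SequenceMatcher(None, original, rewritten, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" removes characters from the original and adds new ones.
        if tag in ("delete", "replace"):
            deleted += i2 - i1
        if tag in ("insert", "replace"):
            inserted += j2 - j1
    return deleted, inserted


def edit_ratio(original: str, rewritten: str) -> float:
    """Share of characters changed, from 0.0 (identical) to 1.0 (nothing shared)."""
    total = len(original) + len(rewritten)
    if total == 0:
        return 0.0
    deleted, inserted = count_edits(original, rewritten)
    return (deleted + inserted) / total


def looks_machine_generated(
    text: str,
    rewrite: Callable[[str], str],
    threshold: float = 0.2,  # hypothetical cut-off, chosen only for illustration
) -> bool:
    """Flag text as likely machine-generated when the rewriter changes little.

    `rewrite` is an assumed callable that sends the text to an LLM with a
    rewriting prompt and returns the LLM's rewritten version.
    """
    rewritten = rewrite(text)
    return edit_ratio(text, rewritten) < threshold
```

In practice, `rewrite` would wrap a call to an LLM, and the cut-off would need to be calibrated on labeled examples rather than fixed by hand.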

Credit: Columbia University School of Engineering and Applied Science

Raidar surpasses previous methods by up to 29%. This leap in performance is achieved by using state-of-the-art LLMs to rewrite the input, without needing access to the AI’s architecture, algorithms, or training data, a first in the field of AI-generated text detection.

Raidar is also highly accurate on short texts or snippets. This is a significant breakthrough, as prior techniques have required long texts to achieve good accuracy. Assessing accuracy and detecting misinformation are especially crucial in today’s online environment, where brief messages, such as social media posts or internet comments, play a pivotal role in information dissemination and can have a profound impact on public opinion and discourse.

Authenticating digital content

In an era when AI’s capabilities continue to expand, the ability to distinguish between human and machine-generated content is critical for upholding integrity and trust across digital platforms. From social media to news articles, academic essays to online reviews, Raidar promises to be a powerful tool in combating the spread of misinformation and ensuring the credibility of digital information.

“Our method’s ability to accurately detect AI-generated content fills a crucial gap in current technology,” said the paper’s lead author Chengzhi Mao, who is a former Ph.D. student at Columbia Engineering and current postdoc of Yang and Vondrick. “It’s not just exciting; it’s essential for anyone who values the integrity of digital content and the societal implications of AI’s expanding capabilities.”

The team plans to broaden its investigation to encompass more text domains, including multilingual content and programming languages. They are also exploring the detection of machine-generated images, videos, and audio, aiming to develop comprehensive tools for identifying AI-generated content across multiple media types.


