Sabotage tool takes on AI image scrapers
Example images generated by the clean (unpoisoned) and poisoned SD-XL models with different numbers of poison samples. The attack effect is apparent with 1000 poisoning samples, but not at 500 samples. Credit: arXiv (2023). DOI: 10.48550/arxiv.2310.13828

Artists who have stood by helplessly as AI web-scraping operations harvested their online works without authorization can finally fight back.

Researchers at the University of Chicago announced the development of a tool that “poisons” graphics appropriated by AI companies to train image-generating models. The tool, Nightshade, manipulates image pixels in ways that alter a model’s output once it has been trained on them. The alterations are not visible to the naked eye prior to processing.

Ben Zhao, an author of the paper “Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models,” said Nightshade can sabotage data so that images of dogs, for instance, would be converted to cats at training time. In other instances, car images were transformed into cows, and hats converted to cakes. The work is published on the arXiv preprint server.

“A moderate number of Nightshade attacks can destabilize general features in a text-to-image generative model, effectively disabling its ability to generate meaningful images,” Zhao said.

He termed his team’s creation “a last defense for content creators against web scrapers that ignore opt-out/do-not-crawl directives.”

Artists have long worried about companies such as Google, OpenAI, Stability AI and Meta that collect billions of images online for use in training datasets for lucrative image-generating tools while failing to provide compensation to creators.

Eva Toorenent, an adviser for the European Guild for Artificial Intelligence Regulation in the Netherlands, said such practices “have sucked the creative juices of millions of artists.”

“It is absolutely horrifying,” she said in a recent interview.

Zhao’s team demonstrated that, despite the common belief that disrupting scraping operations would require uploading massive amounts of altered images, they were able to achieve disruption with fewer than 100 “poisoned” samples. They did so with prompt-specific poisoning attacks, which require far fewer samples than the model’s training dataset contains.

Zhao sees Nightshade as a useful tool not only for individual artists but for large companies as well, such as movie studios and game developers.

“For example, Disney might apply Nightshade to its print images of ‘Cinderella,’ while coordinating with others on poison concepts for ‘Mermaid,’ ” Zhao said.

Nightshade can also alter art styles. For instance, a prompt requesting an image be created in Baroque style may yield Cubist style imagery instead.

The tool emerges in the midst of rising opposition to AI companies appropriating web content under what the companies say is allowed by fair-use rules. Lawsuits were filed against Google and Microsoft-backed OpenAI last summer accusing the companies of improperly using copyrighted materials to train their AI systems.

“Google does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online,” said the plaintiffs’ attorney, Ryan Clarkson. If found liable, the companies could face billions in damages.

Google seeks a dismissal of the lawsuit, stating in court papers, “Using publicly available information to learn is not stealing, nor is it an invasion of privacy, conversion, negligence, unfair competition, or copyright infringement.”

According to Toorenent, Nightshade “is going to make [AI companies] think twice, because they have the possibility of destroying their entire model by taking our work without our consent.”
