Computer scientists show the way: AI models need not be SO power hungry
Each dot in this figure is a convolutional neural network model, with energy consumption on the horizontal axis and performance on the vertical axis. Conventionally, models are selected based only on their performance, without taking their energy consumption into account, resulting in the models in the red ellipse. This work enables practitioners to select models from the green ellipse, which give a good trade-off between effectiveness and efficiency. Credit: Figure from the scientific article (ieeexplore.ieee.org/document/10448303)

The fact that colossal amounts of energy are needed to run a Google search, talk to Siri, ask ChatGPT to get something done, or use AI in any other sense has gradually become common knowledge.

One study estimates that by 2027, AI servers will consume as much energy as Argentina or Sweden. Indeed, a single ChatGPT prompt is estimated to consume, on average, as much energy as forty mobile phone charges. Yet the research community and the industry have yet to make energy-efficient, and thus more climate-friendly, AI models a priority, computer science researchers at the University of Copenhagen point out.

“Today, developers are narrowly focused on building AI models that are effective in terms of the accuracy of their results. It’s like saying that a car is effective because it gets you to your destination quickly, without considering the amount of fuel it uses. As a result, AI models are often inefficient in terms of energy consumption,” says Assistant Professor Raghavendra Selvan from the Department of Computer Science, whose research looks into possibilities for reducing AI’s carbon footprint.

But a new study, of which he and computer science student Pedram Bakhtiarifard are two of the authors, demonstrates that a great deal of CO2e can be curbed without compromising the precision of an AI model. Doing so requires keeping climate costs in mind from the design and training phases of AI models onward. The study will be presented at the International Conference on Acoustics, Speech and Signal Processing (ICASSP-2024).

“If you put together a model that is energy efficient from the get-go, you reduce the carbon footprint in each phase of the model’s ‘life cycle.’ This applies both to the model’s training, which is a particularly energy-intensive process that often takes weeks or months, and to its application,” says Selvan.

Recipe book for the AI industry

In their study, the researchers calculated how much energy it takes to train more than 400,000 AI models of the convolutional neural network type, without actually training all of them. Among other things, convolutional neural networks are used to analyze medical imagery, translate languages, and recognize objects and faces, a function you may know from the camera app on your smartphone.

Based on these calculations, the researchers present a benchmark collection of AI models that use less energy to solve a given task but perform at approximately the same level. The study shows that by opting for other types of models, or by adjusting models, energy savings of 70–80% can be achieved during the training and deployment phases, with a decrease in performance of 1% or less. And according to the researchers, this is a conservative estimate.
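To make the selection idea concrete, here is a minimal sketch (our illustration, not the authors’ released code) of how a practitioner might filter such a benchmark for Pareto-optimal models, i.e., models that no other candidate beats on both energy and accuracy at once. All names and numbers are hypothetical:

```python
# Minimal sketch of selecting models on the energy/accuracy Pareto front.
# All names and numbers below are hypothetical, not from the paper's code.

def pareto_front(models):
    """Keep models that no other model beats: at least as good on both
    energy and accuracy, and strictly better on at least one of them."""
    front = []
    for m in models:
        dominated = any(
            o["energy_kwh"] <= m["energy_kwh"]
            and o["accuracy"] >= m["accuracy"]
            and (o["energy_kwh"] < m["energy_kwh"] or o["accuracy"] > m["accuracy"])
            for o in models
        )
        if not dominated:
            front.append(m)
    return sorted(front, key=lambda m: m["energy_kwh"])

# Hypothetical candidates: estimated training energy vs. test accuracy.
candidates = [
    {"name": "model_a", "energy_kwh": 1.8, "accuracy": 0.92},
    {"name": "model_b", "energy_kwh": 0.4, "accuracy": 0.91},  # good trade-off
    {"name": "model_c", "energy_kwh": 1.9, "accuracy": 0.90},  # dominated by model_a
]

for m in pareto_front(candidates):
    print(f"{m['name']}: {m['energy_kwh']} kWh at {m['accuracy']:.0%} accuracy")
```

On this toy data, model_c drops out because model_a is better on both axes, while model_b stays: it trades one percentage point of accuracy for less than a quarter of the energy.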

“Consider our results as a recipe book for AI professionals. The recipes don’t just describe the performance of different algorithms, but also how energy efficient they are, and they show that by swapping one ingredient for another in the design of a model, one can often achieve the same result. So now, practitioners can choose a model based on both performance and energy consumption, without needing to train each model first,” says Pedram Bakhtiarifard.

“Oftentimes, many models are trained before finding the one that is suspected of being the most suitable for solving a particular task. This makes the development of AI extremely energy-intensive. Therefore, it would be more climate-friendly to choose the right model from the outset, while choosing one that does not consume too much power during the training phase.”

The researchers stress that in some fields, like self-driving cars or certain areas of medicine, model precision can be critical for safety. Here, it is important not to compromise on performance. However, this shouldn’t deter the pursuit of high energy efficiency in other domains.

“AI has amazing potential. But if we are to ensure sustainable and responsible AI development, we need a more holistic approach that not only has model performance in mind, but also climate impact. Here, we show that it is possible to find a better trade-off. When AI models are developed for different tasks, energy efficiency ought to be a fixed criterion, just as it is standard in many other industries,” concludes Raghavendra Selvan.

The “recipe book” put together in this work is available as an open-source dataset for other researchers to experiment with. The information about all of these more than 400,000 architectures is published on GitHub, where AI practitioners can access it using simple Python scripts.
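As an illustration of what such a simple Python script could look like, the snippet below assumes the benchmark has been exported to a CSV file with per-architecture accuracy and energy columns; the file name and column names are our assumptions, not the actual layout of the published dataset:

```python
# Illustrative only: the file name and column names below are assumptions,
# not the actual layout of the published dataset.
import pandas as pd

df = pd.read_csv("architectures.csv")  # hypothetical export of the benchmark

# Keep architectures within one percentage point of the best accuracy,
# then pick the one with the lowest estimated training energy.
best_acc = df["accuracy"].max()
efficient = df[df["accuracy"] >= best_acc - 0.01]
choice = efficient.nsmallest(1, "energy_kwh")
print(choice[["architecture", "accuracy", "energy_kwh"]])
```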

The UCPH researchers estimated how much energy it takes to train the 429,000 convolutional neural network models in this dataset. Among other things, these models are used for object detection, language translation and medical image analysis.

It is estimated that the training alone of the 429,000 neural networks the study looked at would require 263,000 kWh. This equals the amount of energy that an average Danish citizen consumes over 46 years, and it would take a single computer about 100 years to do the training. The authors did not actually train these models themselves, but estimated their energy consumption using another AI model, thus saving 99% of the energy it would have taken.
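A quick back-of-the-envelope check of these figures (our own arithmetic, not from the article) shows what they imply per model, per year of Danish per-capita consumption, and in average power draw:

```python
# Back-of-the-envelope check of the figures quoted above
# (our own arithmetic, not the article's).
TOTAL_KWH = 263_000        # estimated energy to train all models
N_MODELS = 429_000         # number of architectures in the study
CITIZEN_YEARS = 46         # years of average Danish consumption
TRAIN_YEARS = 100          # single-computer training time
HOURS_PER_YEAR = 8_766     # average hours per year, incl. leap years

print(f"Energy per model:   {TOTAL_KWH / N_MODELS:.2f} kWh")           # ~0.61 kWh
print(f"Implied per capita: {TOTAL_KWH / CITIZEN_YEARS:,.0f} kWh/yr")  # ~5,717 kWh/yr
print(f"Avg. power draw:    {TOTAL_KWH / (TRAIN_YEARS * HOURS_PER_YEAR):.2f} kW")  # ~0.30 kW
```

The numbers are mutually consistent: roughly 0.6 kWh per model, an implied Danish per-capita consumption of about 5,700 kWh per year, and an average draw of about 0.3 kW for a single machine running for a century.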

Why is AI’s carbon footprint so big?

Training AI models consumes a lot of energy, and thereby emits a lot of CO2e. This is due to the intensive computations performed while training a model, typically run on powerful computers.

This is especially true for large models, like the language model behind ChatGPT. AI tasks are often processed in data centers, which demand significant amounts of power to keep computers running and cool. The energy source for these centers, which may rely on fossil fuels, influences their carbon footprint.
