How do you know when AI is powerful enough to be dangerous? Regulators try to do the math

FILE - President Joe Biden signs an executive order on artificial intelligence in the East Room of the White House, Oct. 30, 2023, in Washington. Vice President Kamala Harris looks on at right. (AP Photo/Evan Vucci, File)

How do you know if an artificial intelligence system is so powerful that it poses a security danger and shouldn't be unleashed without careful oversight?

For regulators trying to put guardrails on AI, it's mostly about the arithmetic. Specifically, an AI model trained on 10 to the 26th floating-point operations must now be reported to the U.S. government and could soon trigger even stricter requirements in California.

Say what? Well, if you're counting the zeroes, that's 100,000,000,000,000,000,000,000,000, or 100 septillion, calculations to train AI systems on huge troves of data.
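For readers who want to check the zeroes themselves, here is a minimal Python sketch of that arithmetic. The threshold value is the one described in the reporting rule above; the training-compute figure is a made-up placeholder, not any real model's number.

# A minimal sketch of the arithmetic behind the reporting threshold.
# The threshold comes from the rule described in the article; the training
# compute figure below is a hypothetical placeholder, not a real model's number.

US_REPORTING_THRESHOLD_FLOP = 10**26  # 100 septillion floating-point operations

# Written out, that is a 1 followed by 26 zeros:
print(f"{US_REPORTING_THRESHOLD_FLOP:,}")  # 100,000,000,000,000,000,000,000,000

# Hypothetical cumulative training compute for some future model (placeholder value)
hypothetical_training_flop = 3.2e26

if hypothetical_training_flop >= US_REPORTING_THRESHOLD_FLOP:
    print("Over the threshold: the training run would have to be reported to the U.S. government.")
else:
    print("Under the threshold: no federal reporting requirement under the executive order.")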

What it signals to some lawmakers and AI safety advocates is a level of computing power that might enable rapidly advancing AI technology to create or proliferate weapons of mass destruction, or conduct catastrophic cyberattacks.

Those who've crafted such regulations acknowledge they are an imperfect starting point to distinguish today's highest-performing generative AI systems, largely made by California-based companies like Anthropic, Google, Meta Platforms and ChatGPT-maker OpenAI, from the next generation that could be even more powerful.

Critics have pounced on the thresholds as arbitrary, calling them an attempt by governments to regulate math. Adding to the confusion is that some rules set a speed-based computing threshold (how many floating-point operations per second, known as flops) while others are based on the cumulative number of calculations, no matter how long they take.
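The two measures are related but not interchangeable, as a rough sketch makes clear. Every hardware and duration figure below is an illustrative assumption, not a description of any actual training run.

# A rough sketch of the difference between the two kinds of thresholds:
# a speed-based figure (floating-point operations per second) versus a
# cumulative count of operations used to train a model.
# All hardware and duration numbers here are illustrative assumptions.

chip_flops_per_second = 1e15      # assumed sustained throughput of one accelerator
num_chips = 20_000                # assumed size of the training cluster
training_days = 90                # assumed length of the training run

seconds = training_days * 24 * 60 * 60

# Speed-based view: how fast the cluster computes at any instant
cluster_flops_per_second = chip_flops_per_second * num_chips

# Cumulative view: total operations over the whole run, regardless of duration
total_training_flop = cluster_flops_per_second * seconds

print(f"Cluster speed: {cluster_flops_per_second:.2e} FLOP/s")
print(f"Cumulative training compute: {total_training_flop:.2e} FLOP")
print("Crosses 1e26?", total_training_flop >= 1e26)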

"Ten to the 26th flops," said venture capitalist Ben Horowitz on a podcast this summer. "Well, what if that's the size of the model you need to, like, cure cancer?"

The executive order on artificial intelligence signed by President Joe Biden last year relies on a 10 to the 26th threshold. So does California's newly passed AI safety legislation, which Gov. Gavin Newsom has until Sept. 30 to sign into law or veto. California adds a second metric to the equation: regulated AI models must also cost at least $100 million to build.

Following in Biden's footsteps, the European Union's AI Act also measures floating-point operations, but sets the bar 10 times lower, at 10 to the 25th power. That covers some AI systems already in operation. China's government has also looked at measuring computing power to determine which AI systems need safeguards.

No publicly available models meet the higher California threshold, though it's likely that some companies have already started to build them. If so, they're supposed to be sharing certain details and safety precautions with the U.S. government. Biden employed a Korean War-era law to compel tech companies to alert the U.S. Commerce Department if they're building such AI models.

AI researchers are still debating how best to evaluate the capabilities of the latest generative AI technology and how it compares to human intelligence. There are tests that judge AI on solving puzzles, logical reasoning or how swiftly and accurately it predicts what text will answer a person鈥檚 chatbot query. Those measurements help assess an AI tool鈥檚 usefulness for a given task, but there鈥檚 no easy way of knowing which one is so widely capable that it poses a danger to humanity.

"This computation, this flop number, by general consensus is sort of the best thing we have along those lines," said physicist Anthony Aguirre, executive director of the Future of Life Institute, which has advocated for the passage of California's Senate Bill 1047 and other AI safety rules around the world.

Floating-point arithmetic might sound fancy, "but it's really just numbers that are being added or multiplied together," making it one of the simplest ways to assess an AI model's capability and risk, Aguirre said.

"Most of what these things are doing is just multiplying big tables of numbers together," he said. "You can just think of typing in a couple of numbers into your calculator and adding or multiplying them. And that's what it's doing, ten trillion times or a hundred trillion times."
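To make Aguirre's point concrete, here is a small Python sketch of the kind of counting involved. The matrix sizes are arbitrary examples chosen for illustration, not the dimensions of any particular model.

# A small illustration of the point above: the work being counted is mostly
# multiplying tables (matrices) of numbers. Multiplying an (m x k) table by a
# (k x n) table takes roughly 2*m*k*n additions and multiplications, and the
# flop total for training is that kind of count repeated an enormous number
# of times. The matrix sizes below are arbitrary examples.

def matmul_flop_count(m: int, k: int, n: int) -> int:
    """Approximate operations to multiply an m-by-k matrix by a k-by-n matrix:
    each of the m*n output entries needs k multiplications and about k additions."""
    return 2 * m * k * n

# One layer-sized multiply in a large model might look something like this:
ops_per_multiply = matmul_flop_count(4096, 4096, 4096)
print(f"One 4096 x 4096 x 4096 multiply: about {ops_per_multiply:.2e} operations")

# Repeating multiplies of that size over and over during a training run is
# what pushes the cumulative total toward figures like 10**26.
multiplies_needed = 10**26 // ops_per_multiply
print(f"Multiplies of that size needed to reach 1e26 FLOP: about {multiplies_needed:.2e}")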

For some tech leaders, however, it's too simple and hard-coded a metric. There's "no clear scientific support" for using such metrics as a proxy for risk, argued computer scientist Sara Hooker, who leads AI company Cohere's nonprofit research division, in a July paper.

"Compute thresholds as currently implemented are shortsighted and likely to fail to mitigate risk," she wrote.

Venture capitalist Horowitz and his business partner Marc Andreessen, founders of the influential Silicon Valley investment firm Andreessen Horowitz, have attacked Biden's executive order as well as California lawmakers for AI regulations they argue could snuff out an emerging AI startup industry.

For Horowitz, putting limits on "how much math you're allowed to do" reflects a mistaken belief there will only be a handful of big companies making the most capable models and you can put "flaming hoops in front of them and they'll jump through them and it's fine."

In response to the criticism, the sponsor of California's legislation sent a letter to Andreessen Horowitz this summer defending the bill, including its regulatory thresholds.

Regulating at over 10 to the 26th is "a clear way to exclude from safety testing requirements many models that we know, based on current evidence, lack the ability to cause critical harm," wrote state Sen. Scott Wiener of San Francisco. Existing publicly released models "have been tested for highly hazardous capabilities and would not be covered by the bill," Wiener said.

Both Wiener and the Biden executive order treat the metric as a temporary one that could be adjusted later.

Yacine Jernite, who works on policy research at the AI company Hugging Face, said the computing metric emerged in "good faith" ahead of last year's Biden order but is already starting to grow obsolete. AI developers are doing more with smaller models requiring less computing power, while the potential harms of more widely used AI products won't trigger California's proposed scrutiny.

"Some models are going to have a drastically larger impact on society, and those should be held to a higher standard, whereas some others are more exploratory and it might not make sense to have the same kind of process to certify them," Jernite said.

Aguirre said it makes sense for regulators to be nimble, but he characterizes some opposition to the threshold as an attempt to avoid any regulation of AI systems as they grow more capable.

"This is all happening very fast," Aguirre said. "I think there's a legitimate criticism that these thresholds are not capturing exactly what we want them to capture. But I think it's a poor argument to go from that to, 'Well, we just shouldn't do anything and just cross our fingers and hope for the best.'"

Matt O'Brien, The Associated Press