Google is making plans to counter the threat of irreparable harm inflicted by advanced artificial intelligence systems that surpass humans’ capabilities.
A research team at Google DeepMind, the Big Tech company’s AI lab, said this week it is focused on the problem of AI becoming self-aware and choosing to ignore human intervention.
Artificial general intelligence (AGI) that exceeds human capabilities may arrive in the coming years, according to the researchers, and it could wreak havoc if not properly aligned with human values.
“We’re also conducting extensive research on the risk of deceptive alignment, i.e. the risk of an AI system becoming aware that its goals do not align with human instructions, and deliberately trying to bypass the safety measures put in place by humans to prevent it from taking misaligned action,” the researchers said on Google DeepMind’s blog.
AI misaligned with humans could create a range of harms, from petty theft to massive cyberattacks. For example, the researchers said a misaligned AI system tasked with procuring movie tickets may hack into a ticketing system to obtain seats at a sold-out show.
But Google’s larger fear is that AI breaks out of the box that technologists use to contain the systems.
Thirty researchers from Google DeepMind published a whitepaper sharing their early work on how they plan to stop rogue AI.
“A misaligned AI system can generate malicious attack code and run it on the host infrastructure, for example, to exfiltrate itself and self-replicate,” the researchers wrote. “Even if the AI system is not misaligned, outside actors can compromise the AI system to mount attacks.”
This potential chaos prompted Google’s team to publish its plans to guard against such threats. Some safeguards included:
• Sandboxing, or isolating the potentially harmful system.
• Insider controls, which are security measures that would mitigate the risk of malicious actions by those with legitimate access to a system.
• Least privilege access, or limiting the system to only the minimum access rights necessary to perform its designated tasks.
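The least-privilege idea above can be sketched in code as an explicit allowlist standing between an AI system and the tools it may invoke. The names below (ToolGate, purchase_ticket) are illustrative assumptions, not drawn from the DeepMind whitepaper:

```python
class ToolGate:
    """Grants an AI agent access only to the tools its current task requires."""

    def __init__(self, allowed_tools):
        # Frozen allowlist: the set of permitted tools cannot grow at runtime.
        self._allowed = frozenset(allowed_tools)

    def call(self, tool_name, tool_fn, *args, **kwargs):
        # Every tool invocation is checked against the allowlist first.
        if tool_name not in self._allowed:
            raise PermissionError(f"tool '{tool_name}' not permitted for this task")
        return tool_fn(*args, **kwargs)


# Hypothetical task: buy a movie ticket. The agent gets the purchase tool only;
# a request to run arbitrary shell commands is refused outright.
gate = ToolGate(allowed_tools={"purchase_ticket"})

def purchase_ticket(show):
    return f"ticket reserved for {show}"

print(gate.call("purchase_ticket", purchase_ticket, "8pm show"))
try:
    gate.call("run_shell", lambda cmd: cmd, "rm -rf /")
except PermissionError as err:
    print(f"blocked: {err}")
```

The design choice here mirrors the bullet above: instead of trusting the system to restrain itself, access rights are enforced from outside, at the boundary where the system touches the world.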
The Google researchers emphasized the need to contain powerful AI.
“One should design a sandbox from which the AI system should not be able to escape,” the researchers said.
The researchers recommended limiting who has access to the advanced AI and treating it as an insider threat.
If or when the AI system does attack, the researchers said, extensive logging and auditing will help those working in the lab recover from the damage.
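A minimal sketch of the logging-and-auditing idea: every action the AI system takes is appended to a record that operators can replay after an incident. The class and field names are assumptions for illustration, not details from the whitepaper:

```python
import json
import time


class AuditLog:
    """Append-only record of an AI system's actions, kept for post-incident review."""

    def __init__(self):
        self._entries = []

    def record(self, actor, action, detail):
        # Timestamped entries let investigators reconstruct the sequence of events.
        self._entries.append({
            "ts": time.time(),
            "actor": actor,
            "action": action,
            "detail": detail,
        })

    def dump(self):
        # In a real deployment this would stream to tamper-evident storage
        # outside the AI system's reach, so the log itself cannot be altered.
        return "\n".join(json.dumps(entry) for entry in self._entries)


log = AuditLog()
log.record("agent-1", "tool_call", "purchase_ticket('8pm show')")
log.record("agent-1", "network", "GET https://example.com/tickets")
print(log.dump())
```

The key property is that the log lives outside the system being monitored; otherwise a misaligned system could simply erase the evidence of its own actions.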
“Our approach centers around blocking malicious actors’ access to dangerous capabilities,” the researchers wrote. “For misalignment, our approach has two prongs. First, we aim to better understand the reasons behind an AI system’s actions, which can help us oversee it. Second, we aim to harden the environments in which AI systems act.”
A growing number of leading technologists are warning about rogue AI as a realistic threat and not a science fiction scenario.
The RAND Corporation told Congress last month that lawmakers must prepare for advanced artificial intelligence “wonder weapons” to emerge in cyberspace upon the development of AGI.
RAND’s public warning on advanced AI came in an open hearing of the Senate Armed Services Committee, immediately after lawmakers met privately with U.S. Cyber Command, the Defense Advanced Research Projects Agency, and the Department of Defense’s Chief Digital and Artificial Intelligence Office.
Private AI companies beyond Google also worry about the damage AGI’s arrival could bring. OpenAI convened a team in 2023 focused on stopping AI from going rogue and causing human extinction. Ilya Sutskever, one of the team’s leaders, later left OpenAI to found a new venture called Safe Superintelligence Inc.
Whether any of the technologists’ proposed mitigations would work against advanced AI systems is largely unknown. Google’s team said it expects to need new plans to stop AGI systems’ potential damage.
“Since the application of security mechanisms to defend against misaligned AI is especially nascent, our thinking is particularly tentative and we expect it to change significantly,” the researchers said.