Yoshua Bengio's Move to SafeguardedAI and the Quest for Quantified AI Safety

Yoshua Bengio, one of the giants of deep learning, has revealed his next move: he has joined an AI safety project called SafeguardedAI and will serve as its scientific director.

According to its introduction, SafeguardedAI aims to build an AI system that understands and reduces the risks posed by other AI agents, by combining scientific world models with mathematical proofs.

The main focus is on quantified safety guarantees.

The project is backed by the UK's Advanced Research and Invention Agency (ARIA), which is expected to invest a total of £59 million (about RMB 537 million) in it.

Bengio said:

If you plan to deploy a technology, then given that abnormal AI behavior or misuse could have very serious consequences, you need to present sufficient justification, and preferably strong mathematical guarantees, that your AI system will behave as intended.

Safeguarded AI

The SafeguardedAI project is divided into three technical areas (TAs), each with its own goals and budget:

Scaffolding (TA1), to build a scalable and interoperable language and platform for maintaining real-world models/specifications and checking proof documents.

Machine Learning (TA2), to use cutting-edge AI to help domain experts build first-class mathematical models of complex real-world dynamics, and to leverage cutting-edge AI to train autonomous systems.

Applications (TA3), to deploy an autonomous AI system protected by a gatekeeper AI in critical cyber-physical operating environments and to unlock significant economic value through quantified safety guarantees.

Officially, Bengio will focus in particular on TA3 and TA2 after joining, and will provide scientific strategic advice across the whole programme.

ARIA also plans to invest £18 million (about RMB 164 million) to establish a non-profit organization to lead the research and development work of TA2.

The programme director of SafeguardedAI is David "davidad" Dalrymple, a former senior software engineer at Twitter, who joined ARIA last September.

To mark Bengio's arrival, Dalrymple posted a photo of the two of them on X (formerly Twitter).

As for the concrete approach to building an AI system that understands and reduces the risks of other AI agents, David "davidad" Dalrymple, Yoshua Bengio, and others have written a paper.

In it, they propose a framework called Guaranteed Safe AI, which quantifies the safety guarantees of an AI system mainly through the interaction of three core components (a minimal illustrative sketch follows the list):

World Model, providing a mathematical description to explain how the AI system affects the external world and properly handling Bayesian and Knightian uncertainties

Safety Specification, a mathematical description defining which effects on the world are acceptable

Verifier, providing an auditable certificate that the AI complies with the safety specification
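
To make the interaction among the three components concrete, here is a minimal Python sketch. All of the names (WorldModel, SafetySpecification, Verifier.certify, the outcome labels, the 0.999 threshold) are illustrative assumptions of ours, not the paper's formalism: a real Guaranteed Safe AI verifier would emit a formal, auditable proof certificate rather than a numeric check, and the world model would also represent Knightian uncertainty.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class WorldModel:
    # Maps a proposed action to a probability distribution over outcomes.
    # Knightian uncertainty (sets of plausible distributions) is omitted here.
    predict: Callable[[str], Dict[str, float]]

@dataclass
class SafetySpecification:
    # Mathematical description of which effects on the world are acceptable.
    is_acceptable: Callable[[str], bool]

class Verifier:
    """Checks, under the world model, that actions satisfy the specification.

    Here the "certificate" is just a numeric report for illustration.
    """
    def certify(self, model: WorldModel, spec: SafetySpecification,
                actions: List[str], threshold: float = 0.999) -> Dict[str, dict]:
        report = {}
        for action in actions:
            outcomes = model.predict(action)
            # Probability mass the world model assigns to acceptable outcomes.
            p_safe = sum(p for outcome, p in outcomes.items()
                         if spec.is_acceptable(outcome))
            report[action] = {"p_safe": p_safe, "certified": p_safe >= threshold}
        return report

# Toy usage in a hypothetical grid-control setting with made-up numbers.
model = WorldModel(predict=lambda action: (
    {"nominal": 0.9995, "overload": 0.0005}
    if action == "reduce_load"
    else {"nominal": 0.6, "overload": 0.4}))
spec = SafetySpecification(is_acceptable=lambda outcome: outcome != "overload")
print(Verifier().certify(model, spec, ["reduce_load", "boost_output"]))
# reduce_load is certified; boost_output is not.
```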

They also divide the strategies for creating a world model into safety levels L0 through L5 (a rough encoding sketch follows the list):

Level 0: There is no explicit world model; assumptions about the world are implicit in the AI system's training data and implementation details.

Level 1: Use a trained black-box world simulator as the world model.

Level 2: Use a machine-learned generative model over probabilistic causal models, which can be tested by checking whether it assigns sufficient credence to a particular human-made model (such as one proposed in the scientific literature).

Level 3: Use one or more probabilistic causal models (or distributions over them), possibly generated with the help of machine learning, that have been thoroughly reviewed by human domain experts.

Level 4: Use world models of real-world phenomena that are formally verified as reasonable abstractions of fundamental physical laws.

Level 5: Do not rely on specific world models but use global safety specifications that cover all possible worlds.
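
Since the levels form an ordered scale, one way to picture them is as a simple ordered enumeration. The encoding below is our own illustrative sketch (names and the helper meets_requirement are hypothetical), not anything defined in the paper, which describes the levels in prose.

```python
from enum import IntEnum

class WorldModelLevel(IntEnum):
    """Hypothetical encoding of the L0-L5 world-model safety levels."""
    L0_IMPLICIT = 0         # no explicit model; assumptions live in training data
    L1_BLACKBOX_SIM = 1     # trained black-box world simulator
    L2_GENERATIVE = 2       # ML generative model over probabilistic causal models
    L3_EXPERT_REVIEWED = 3  # probabilistic causal models vetted by domain experts
    L4_VERIFIED = 4         # formally verified abstractions of physical laws
    L5_UNIVERSAL_SPEC = 5   # no specific model; global spec over all possible worlds

# Higher levels admit stronger, more auditable guarantees, so an ordered enum
# lets code require a minimum level for a given deployment.
def meets_requirement(level: WorldModelLevel, required: WorldModelLevel) -> bool:
    return level >= required

assert meets_requirement(WorldModelLevel.L3_EXPERT_REVIEWED,
                         WorldModelLevel.L2_GENERATIVE)
```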

AI Risk Draws Wide Attention in Academia

AI risk has long been a focus for industry leaders. Hinton left Google so he could speak freely about AI risk, and there has even been a large-scale public clash among AI heavyweights such as Andrew Ng, Hinton, LeCun, and Hassabis.

Andrew Ng once said:

My greatest concern about AI is actually that AI risk is being overly exaggerated, leading to strict regulation that suppresses open source and innovation.

Some people spread the fear (of AI exterminating humans) just to make money.

DeepMind CEO Hassabis believes:

This is not scaremongering. If the risks of AGI are not discussed starting now, the consequences could be very serious.

I don't think we want to wait until just before the danger erupts to start taking precautions.

Bengio and other artificial intelligence experts, including Hinton, Andrew Yao, and Ya-Qin Zhang, previously published an open letter, Managing AI Risks in an Era of Rapid Progress.

It points out that humanity must take seriously the possibility that AGI will surpass human capabilities in many key domains within this decade or the next. It recommends that regulatory agencies gain comprehensive insight into AI development, and in particular stay vigilant about large models trained on supercomputers worth billions of dollars.

Just a month ago, Bengio also wrote an article titled Reasoning through arguments against taking AI safety seriously, in which he shared his latest thinking. Those who are interested can take a look.

GuaranteedSafeAI:
