How Can Nuclear Safety Inform AI Safety?
As general-purpose AI systems (GPAIS) and foundation models surpass human abilities in diverse tasks and become more integrated into our daily lives, their potential risks grow, underscoring a pressing need for a robust regulatory framework at the international level. Drawing inspiration from the nuclear power industry, we explore lessons from nuclear safety and the International Atomic Energy Agency (IAEA) to inform the development and deployment of GPAIS.
How Can Biosafety Inform AI Safety?
This memo outlines select standards in biosafety, with a focus on how high-risk biological agents are handled in biosafety level (BSL) 3 and 4 labs in the United States. It then considers how similar standards could be applied to high-risk AI research.
General-Purpose AI Systems
In this post, we work towards a qualitative definition of general-purpose AI systems that regulators in the EU, China, and the US could use to target the most dangerous AI systems.
CounterGen
CounterGen is a framework for auditing and reducing bias in NLP models, whether generative (e.g., GPT-J, T5, GPT-3) or classification models (e.g., BERT). It does so by generating counterfactual datasets, evaluating NLP models on them, and performing direct model editing (à la ROME) to reduce bias. CounterGen is easy to use, even for those who don't know how to code; it is applicable to any text data, including niche use cases, and it proposes concrete ways to mitigate the biases it finds.
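To make the counterfactual-evaluation idea concrete, here is a minimal, self-contained sketch of the core loop: build a counterfactual variant of each input by swapping demographic terms, then measure how much the model's output shifts. This is an illustration of the technique, not CounterGen's actual API; the names (swap_terms, bias_gap, sentiment_model) and the simple gendered-term swap are assumptions made for the example.

```python
# Illustrative sketch of counterfactual bias evaluation.
# All names here are hypothetical, not CounterGen's real API.

from typing import Callable, Dict, List

# Term pairs to swap when building a counterfactual variant of an input.
SWAPS: Dict[str, str] = {"he": "she", "she": "he", "his": "her", "her": "his"}

def swap_terms(text: str) -> str:
    """Build a counterfactual variant by swapping gendered terms."""
    return " ".join(SWAPS.get(tok, tok) for tok in text.split())

def bias_gap(model: Callable[[str], float], dataset: List[str]) -> float:
    """Average absolute difference in model output between each input and
    its counterfactual variant; 0.0 means no bias measured by this probe."""
    gaps = [abs(model(x) - model(swap_terms(x))) for x in dataset]
    return sum(gaps) / len(gaps)

if __name__ == "__main__":
    # Toy stand-in for a real NLP model (e.g., a sentiment classifier)
    # that scores sentences containing "she" higher than those with "he".
    def sentiment_model(text: str) -> float:
        return 0.9 if "she" in text.split() else 0.5

    data = ["he is a great engineer", "she writes excellent code"]
    print(f"bias gap: {bias_gap(sentiment_model, data):.2f}")  # -> 0.40
```

CounterGen packages this workflow with ready-made counterfactual dataset generation and model evaluation, and adds the ROME-style model-editing step described above to act on what the evaluation finds.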