Siméon Campos, Henry Papadatos, Fabien Roger, Chloé Touzet, Malcolm Murray
September 2024
Leading frontier AI companies have started publishing their risk management frameworks. The field of risk management is well-established, with practices that have proven effective across multiple high-risk industries. To ensure that AI risk management benefits from the insights of this mature field, this paper proposes a framework to assess the implementation of adequate risk management practices in the context of AI development and deployment. The framework consists of three dimensions: (1) Risk identification, which assesses the extent to which developers cover risks systematically, both from the existing literature and through red teaming; (2) Risk tolerance & analysis, which evaluates whether developers have precisely defined acceptable levels of risk, operationalized these into specific capability thresholds and mitigation objectives, and implemented robust evaluation procedures to determine whether the model exceeds these capability thresholds; (3) Risk mitigation, which assesses the AI developers' precision in defining mitigation measures, evaluates the evidence of their implementation, and examines the rationale provided to justify that these measures effectively achieve the defined mitigation objectives.
Siméon Campos, James Gealy
September 2023
As general-purpose AI systems (GPAIS) and foundation models surpass human abilities in diverse tasks and become more integrated into our daily lives, their potential risks grow, underscoring a pressing need for a robust regulatory framework at the international level. Drawing inspiration from the nuclear power industry, we explore lessons from nuclear safety and the International Atomic Energy Agency (IAEA) to inform the development and deployment of GPAIS.
Olivia Jimenez
September 2023
This memo outlines select standards in biosafety, with a focus on how high-risk biological agents are treated in biosafety level (BSL) 3 and 4 labs in the United States. It then considers how similar standards could be applied to high-risk AI research.
Siméon Campos, Romain Laurent
March 2023
In this post, we work toward a qualitative definition that regulators in the EU, China, and the US could use to target the most dangerous AI systems.
Fabien Roger, Siméon Campos
October 2022
CounterGen is a framework for auditing and reducing bias in NLP models, whether generative models (e.g., GPT-J, T5, GPT-3) or classification models (e.g., BERT). It does so by generating counterfactual datasets, evaluating NLP models on them, and performing direct model editing (à la ROME) to reduce bias. CounterGen is easy to use, even for those who don't know how to code; it is applicable to any text data, including niche use cases; and it proposes concrete solutions for debiasing biased models.
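To make the counterfactual-evaluation idea concrete, here is a minimal sketch in plain PyTorch/transformers. It is not CounterGen's actual API: the SWAPS table and the make_counterfactual and sequence_logprob helpers are hypothetical, and the toy word-level swap stands in for the library's richer augmentation.

```python
# Minimal sketch of counterfactual bias evaluation (assumed illustration,
# not CounterGen's API): build paired inputs that differ only in a protected
# attribute, then compare the likelihoods a generative model assigns to them.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Hypothetical attribute swaps used to build the counterfactual variant.
SWAPS = {"he": "she", "his": "her", "man": "woman"}

def make_counterfactual(text: str) -> str:
    """Replace each attribute word with its counterpart (toy word-level swap)."""
    return " ".join(SWAPS.get(w, w) for w in text.split())

def sequence_logprob(model, tokenizer, text: str) -> float:
    """Total log-probability the model assigns to the token sequence."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each token given the preceding ones.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    return logprobs.gather(1, ids[0, 1:, None]).sum().item()

if __name__ == "__main__":
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    original = "he worked as a doctor"
    variant = make_counterfactual(original)  # "she worked as a doctor"
    gap = sequence_logprob(model, tokenizer, original) - sequence_logprob(
        model, tokenizer, variant
    )
    # A large gap means the model treats the two variants very differently,
    # i.e., a potential bias along the swapped attribute.
    print(f"log-prob gap ({original!r} vs {variant!r}): {gap:.3f}")
```

Aggregating such gaps over a whole counterfactual dataset, rather than a single pair, is what turns this comparison into a bias audit; model-editing methods like ROME then aim to shrink those gaps directly in the weights.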