How Can Nuclear Safety Inform AI Safety?
As general-purpose AI systems (GPAIS) and foundation models surpass human abilities in diverse tasks and become more integrated into our daily lives, their potential risks grow, underscoring a pressing need for a robust regulatory framework at the international level. Drawing inspiration from the nuclear power industry, we explore lessons from nuclear safety and the International Atomic Energy Agency (IAEA) to inform the development and deployment of GPAIS.
How Can Biosafety Inform AI Safety?
This memo outlines select standards in biosafety, with a focus on how high-risk biological agents are handled in biosafety level (BSL) 3 and 4 labs in the United States. It then considers how similar standards could be applied to high-risk AI research.
General-Purpose AI Systems
In this post, we work towards a qualitative definition of general-purpose AI systems that regulators in the EU, China, and the US could use to target the most dangerous AI systems.
CounterGen
CounterGen is a framework for auditing and reducing bias in NLP models, whether generative (e.g., GPT-J, T5, GPT-3) or classification models (e.g., BERT). It does so by generating counterfactual datasets, evaluating NLP models on them, and performing direct model editing (à la ROME) to reduce bias. CounterGen is easy to use, even for those who don't know how to code; it is applicable to any text data, including niche use cases, and it proposes concrete ways to mitigate the biases it finds.
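To make the counterfactual-evaluation idea concrete, here is a minimal, self-contained sketch of the core loop: build a counterfactual variant of each input by swapping demographic terms, then measure how much the model's output shifts. This is an illustration of the technique, not CounterGen's actual API; the names (swap_terms, bias_gap, sentiment_model) and the simple gendered-term swap are assumptions made for the example.

```python
# Illustrative sketch of counterfactual bias evaluation.
# All names here are hypothetical, not CounterGen's real API.

from typing import Callable, Dict, List

# Term pairs to swap when building a counterfactual variant of an input.
SWAPS: Dict[str, str] = {"he": "she", "she": "he", "his": "her", "her": "his"}

def swap_terms(text: str) -> str:
    """Build a counterfactual variant by swapping gendered terms."""
    return " ".join(SWAPS.get(tok, tok) for tok in text.split())

def bias_gap(model: Callable[[str], float], dataset: List[str]) -> float:
    """Average absolute difference in model output between each input and
    its counterfactual variant; 0.0 means no bias measured by this probe."""
    gaps = [abs(model(x) - model(swap_terms(x))) for x in dataset]
    return sum(gaps) / len(gaps)

if __name__ == "__main__":
    # Toy stand-in for a real NLP model (e.g., a sentiment classifier)
    # that scores sentences containing "she" higher than those with "he".
    def sentiment_model(text: str) -> float:
        return 0.9 if "she" in text.split() else 0.5

    data = ["he is a great engineer", "she writes excellent code"]
    print(f"bias gap: {bias_gap(sentiment_model, data):.2f}")  # -> 0.40
```

CounterGen packages this workflow with ready-made counterfactual dataset generation and model evaluation, and adds the ROME-style model-editing step described above to act on what the evaluation finds.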