Increasing risks from advanced AI demand effective risk management systems tailored to this rapidly changing technology. One key part of risk management is establishing risk tiers. Risk tiers are categories based on expected harm that specify in advance which mitigations and responses will be applied to systems of different risk levels. Risk tiers force AI companies to identify potential risks from their systems and plan appropriate responses. They also provide public transparency regarding the risk level society is accepting from AI and how those risks are being managed.
Risk management, including risk tiering, has received attention from both policymakers and industry, but different organizations have taken divergent and sometimes incompatible approaches. This diversity has facilitated innovation and experimentation in adapting risk management to the challenges of advanced AI. However, it has also made it difficult to understand the overall risk picture and how each system and developer contributes to it, and to compare the effectiveness of different risk estimation and mitigation practices. As such, a more standardized approach to risk tiering is needed: one that achieves the benefits of effective aggregation, comparison, and consistent scientific grounding while preserving space for innovation.
To explore such standardization, the Oxford Martin AI Governance Initiative (AIGI) convened experts from government, industry, academia, and civil society to lay the foundation for a gold standard for advanced AI risk tiers. A complete gold standard will require further work. However, the convening provided insights for how risk tiers might be adapted to advanced AI while also establishing a framework for broader standardization efforts.
Insights from the convening included the following:
- Quantitative risk tiers clarify the relationship between hazardous capabilities and expected harm; systematic qualitative modeling should apply where quantitative approaches fail. Quantitative risk modeling provides a basis for risk-informed decision-making and represents best practice in safety-critical industries like nuclear safety and aviation. Such modeling helps risk managers and the public understand what risks a system actually poses, rather than simply whether a harmful capability exists. Quantitative modeling also facilitates mitigations by providing clarity on how they reduce risk, for example, whether they reduce the likelihood or severity of harm (a minimal sketch of such a tier assignment follows this list). For those AI risks where modeling is possible, risk managers should apply quantitative estimates. However, for some risks, quantitative estimates come with unacceptably large error bars. In those cases, systematic scenario- or source-based modeling should be used to identify possible harms transparently while clearly conveying the risk level.
- AI systems should be classified into risk tiers at defined points throughout their lifecycle. Risk assessment, tier classification based on that assessment, and risk treatment should occur repeatedly, from before pretraining (based on capability estimates) to after deployment. Mitigations should map onto each risk tier at each step.
- Benefits from AI releases should be considered alongside the risks, but more measurement work is needed before risks and benefits can be compared effectively. In some cases, advanced AI’s benefits may outweigh certain risk increases. However, benefits remain difficult to estimate, and further work is needed to enable reasonable benefit-risk comparisons. Quantitative risk tiers could facilitate these comparisons by providing a shared comparative framework.
- Standardized risk management practices would likely enable risk managers, auditors, and regulators to better oversee risks, their interactions, and the responses to them. Current scholarly discussions of frontier AI labs’ risk management often focus on individual company processes, but society is accepting risk from all companies collectively. Determining the overall risk level from advanced AI, which systems contribute specific parts of that risk, and how different systems might interact to cause unforeseen harms is challenging. Innovation in risk management should be balanced with standardization efforts that allow governments to understand the overall risk landscape. Frameworks for integrating risk assessments, analyzing organizational risk contributions, and verifying best-practice adoption would aid this understanding.
- Risk tier modeling should go beyond capability assessments to include how capabilities might become threats when deployed. User capabilities and the characteristics of the overall risk environment determine the threat levels presented by key AI risk sources. To underpin risk tiers accurately, evaluations should account for these factors. AI companies may need to work with other actors better situated to provide certain informational inputs. For example, organizations like AI Safety Institutes (AISIs) should provide risk landscape inputs for risk management processes, supplementing the capability evaluations performed by AI companies (a sketch of how such inputs could combine is given below).
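To make the first insight concrete, here is a minimal sketch of how a quantitative tier assignment could work. All tier names, thresholds, and numbers are hypothetical illustrations, not values proposed by the convening; the point is only that expressing risk as likelihood times severity makes the tier boundaries, and the effect of a mitigation, explicit.

```python
from dataclasses import dataclass

@dataclass
class RiskEstimate:
    """Quantitative estimate for one risk source (e.g., misuse of a capability)."""
    annual_likelihood: float  # estimated probability of the harm occurring per year
    severity: float           # estimated harm if it occurs, in some common unit

    @property
    def expected_harm(self) -> float:
        # Expected harm per year = likelihood x severity.
        return self.annual_likelihood * self.severity

# Hypothetical tier boundaries on expected harm per year (illustrative only).
TIERS = [
    (10.0, "Tier 3: halt deployment until risk is reduced"),
    (1.0,  "Tier 2: deploy only with strong mitigations"),
    (0.1,  "Tier 1: deploy with standard mitigations"),
]

def assign_tier(estimate: RiskEstimate) -> str:
    """Map an expected-harm estimate to a pre-committed tier and response."""
    for threshold, tier in TIERS:
        if estimate.expected_harm >= threshold:
            return tier
    return "Tier 0: routine monitoring"

# A mitigation that cuts likelihood tenfold visibly moves the system down a tier.
before = RiskEstimate(annual_likelihood=0.02, severity=100.0)   # expected harm: 2.0
after  = RiskEstimate(annual_likelihood=0.002, severity=100.0)  # expected harm: 0.2
print(assign_tier(before))  # Tier 2: deploy only with strong mitigations
print(assign_tier(after))   # Tier 1: deploy with standard mitigations
```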
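Similarly, the final insight implies that a capability evaluation alone does not determine a threat level; it must be combined with landscape inputs that AI companies are poorly placed to gather themselves. The sketch below uses an invented combination rule and made-up parameters purely to show where inputs from bodies like AISIs would enter the calculation.

```python
def expected_successful_attempts(capability_uplift: float,
                                 capable_actors: int,
                                 attempts_per_actor: float,
                                 environment_exposure: float) -> float:
    """Combine a capability evaluation with risk-landscape inputs (illustrative rule).

    capability_uplift: probability an attempt succeeds given the model's help
                       (from an AI company's capability evaluations)
    capable_actors: number of actors positioned to exploit the capability
    attempts_per_actor: expected attempts per actor per year
    environment_exposure: fraction of attempts the surrounding environment
                          fails to detect or block (a landscape input that
                          bodies like AISIs could supply)
    """
    return capable_actors * attempts_per_actor * capability_uplift * environment_exposure

# The same capability score implies very different threat levels in
# different risk environments (all numbers hypothetical).
uplift = 0.05
print(expected_successful_attempts(uplift, capable_actors=200,
                                   attempts_per_actor=0.1, environment_exposure=0.5))
# -> 0.5 expected successful attempts per year
print(expected_successful_attempts(uplift, capable_actors=20,
                                   attempts_per_actor=0.1, environment_exposure=0.05))
# -> 0.005 expected successful attempts per year
```

A threat estimate of this kind, rather than the raw capability score, would then feed a tier assignment like the one sketched above.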
Risk tiers clarify the harms AI might present and identify the measures being taken to prevent them. Establishing a gold standard for risk tiers will help create consensus on existing best practices and where more work is needed.