From November 2024 to March 2025, a variety of stakeholders from industry, civil society and academia participated in the drafting of the first EU General-Purpose AI Code of Practice. The process was guided by chairs and vice-chairs selected for their deep understanding of general-purpose AI systems and their independent perspective. As one of the participants, we have been actively engaged throughout the drafting process: we provided detailed written feedback on all three drafts and presented our analyses at working group meetings as well as at the workshop for civil society organisations hosted by the chairs and vice-chairs.
The Code of Practice is intended to serve as a voluntary compliance tool for providers of general-purpose AI models (including those with systemic risk) under the EU AI Act until harmonized standards are published. It will detail the AI Act's rules for providers of general-purpose AI models in the following areas: transparency and copyright (Working Group 1), risk assessment (Working Group 2), technical risk mitigation (Working Group 3) and risk governance (Working Group 4).
Before the drafting of the Code of Practice began, we had already developed a comprehensive risk management framework that builds on established risk management principles from other industries while addressing the unique challenges posed by advanced AI systems. Its focus on structured risk identification, analysis and mitigation, and on the governance needed to support these processes, provided a foundation that naturally aligned with the Code's structure.
Leveraging this background and our expertise in standardization (including work within CEN/CENELEC and ISO/IEC), we provided feedback on the Code. Our primary objective was to improve its risk management provisions, ensuring that the Code effectively supports the EU AI Act's critical requirement that providers “assess and mitigate possible systemic risks at Union level, including their sources, that may stem from the development, the placing on the market, or the use of general-purpose AI models with systemic risk.”
Our feedback centered on the systemic risk identification process and systemic risk tiers.
On systemic risk identification, we recommended utilising information that is already gathered under other Commitments and Measures of the Code. This includes information on incidents and near-misses involving the provider’s models (pursuant to Commitment II.12 and Measure II.4.14), model-independent information gathered for the provider’s models (pursuant to Measure II.4.3), such as “forecasting of general trends” and “expert interviews and/or panels”, as well as information on serious incidents. The Code does not require every serious incident to be reported publicly; only reporting to the AI Office and, as appropriate, to national competent authorities is obligatory. We therefore propose that the AI Office notify providers of systemic risks it has identified based on the serious incident reports available to it.
Furthermore, we propose an identification methodology that combines exploratory red-teaming with the analysis of lightweight scenarios.
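To make the shape of this identification process more concrete, the sketch below shows one possible way to consolidate the inputs mentioned above (incident and near-miss records, model-independent signals such as trend forecasts and expert panels, and serious incident notifications from the AI Office) together with exploratory red-teaming findings into a list of candidate systemic risks for further analysis. It is a minimal illustrative mock-up under our own assumptions; all class, field and function names are hypothetical and are not terminology from the Code.

```python
from dataclasses import dataclass, field


@dataclass
class IdentificationInput:
    """One piece of evidence feeding systemic risk identification (illustrative only)."""
    source: str                           # e.g., "incident report", "near miss", "trend forecast",
                                          # "expert panel", "AI Office notification"
    description: str
    related_provision: str | None = None  # e.g., "Commitment II.12", "Measure II.4.3"


@dataclass
class CandidateSystemicRisk:
    """A candidate risk produced by identification, to be analysed and assessed further."""
    name: str
    supporting_evidence: list[IdentificationInput] = field(default_factory=list)


def identify_candidate_risks(inputs: list[IdentificationInput],
                             red_team_findings: list[str]) -> list[CandidateSystemicRisk]:
    """Group existing evidence and exploratory red-teaming findings into candidate risks.

    A real process would rely on expert judgement and lightweight scenario
    analysis; here each distinct red-team finding simply seeds one candidate
    and collects any evidence whose description mentions it.
    """
    candidates = []
    for finding in red_team_findings:
        evidence = [i for i in inputs if finding.lower() in i.description.lower()]
        candidates.append(CandidateSystemicRisk(name=finding, supporting_evidence=evidence))
    return candidates
```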
As with most of our feedback, we proposed concrete text in the language of the Code so that it is directly actionable. For systemic risk identification, we proposed three alternative texts with varying levels of detail and effectiveness, to clarify how we prioritize the different elements.
Another focus of our feedback was the definition of risk tiers, for which we recommended a hierarchical methodology consisting of three approaches: harm-based definitions (e.g., >1% chance/year of injury), scenario-based definitions (e.g., >1% chance/year that the model helps create a novel dangerous biological agent) and risk-source-based definitions (e.g., a specific level of Autonomous AI R&D capabilities). The three approaches should be tried in the listed order, proceeding to the next approach only if the current one is demonstrated not to be feasible. This ordering prioritizes approaches that define tiers in terms closest to tangible harm. Furthermore, building on our frontier AI risk management framework, we recommend mapping measurable risk indicators to each tier. These systemic risk indicators serve as early warning signs and provide clarity on when specific risk tiers are approached or breached, creating accountability and allowing for timely intervention. We explain our approach in two-and-a-half pages in this memo.
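As a rough illustration of how the hierarchy of approaches and the indicator mapping could fit together, consider the sketch below. It is a minimal mock-up under our own assumptions, not an implementation of the Code or of our framework; all names and threshold values are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class TierApproach(Enum):
    """Approaches for defining a systemic risk tier, in priority order."""
    HARM_BASED = 1         # e.g., >1% chance/year of injury
    SCENARIO_BASED = 2     # e.g., >1% chance/year the model helps create a novel dangerous biological agent
    RISK_SOURCE_BASED = 3  # e.g., a specific level of autonomous AI R&D capability


@dataclass
class RiskIndicator:
    """A measurable early-warning signal mapped to a tier (thresholds are hypothetical)."""
    name: str
    warning_threshold: float  # reading at which the tier counts as "approached"
    breach_threshold: float   # reading at which the tier counts as "breached"


@dataclass
class RiskTier:
    name: str
    approach: TierApproach
    definition: str
    indicators: list[RiskIndicator] = field(default_factory=list)


def define_tier(name: str,
                harm_based: str | None = None,
                scenario_based: str | None = None,
                risk_source_based: str | None = None) -> RiskTier:
    """Pick the highest-priority definition that is feasible.

    Feasibility is reduced here to 'a definition was supplied'; in practice it
    would have to be demonstrated (or refuted) through risk analysis.
    """
    if harm_based is not None:
        return RiskTier(name, TierApproach.HARM_BASED, harm_based)
    if scenario_based is not None:
        return RiskTier(name, TierApproach.SCENARIO_BASED, scenario_based)
    if risk_source_based is not None:
        return RiskTier(name, TierApproach.RISK_SOURCE_BASED, risk_source_based)
    raise ValueError(f"No feasible definition supplied for tier {name!r}")


def tier_status(tier: RiskTier, readings: dict[str, float]) -> str:
    """Classify a tier as 'breached', 'approached' or 'clear' from indicator readings."""
    pairs = [(ind, readings.get(ind.name)) for ind in tier.indicators]
    if any(v is not None and v >= ind.breach_threshold for ind, v in pairs):
        return "breached"
    if any(v is not None and v >= ind.warning_threshold for ind, v in pairs):
        return "approached"
    return "clear"


# Hypothetical example: no feasible harm-based definition, so fall back to a scenario-based one.
bio_tier = define_tier(
    "bio-uplift",
    scenario_based=">1% chance/year that the model helps create a novel dangerous biological agent",
)
bio_tier.indicators.append(
    RiskIndicator("biorisk_eval_uplift_score", warning_threshold=0.5, breach_threshold=0.8)
)
print(tier_status(bio_tier, {"biorisk_eval_uplift_score": 0.6}))  # -> approached
```

The fallback order in `define_tier` mirrors the priority we recommend: a lower-priority approach is used only when the higher-priority ones are shown to be infeasible, and the indicator thresholds give an operational meaning to a tier being "approached" or "breached".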
The Code’s effectiveness will depend on its future-proofness as well as on continued refinement in response to technological developments, the evolution of our understanding of the AI risk landscape, and progress on AI risk management in general.
We believe that our recommendations on systemic risk identification and systemic risk tiers increase the future-proofness of the Code. For example, one area of GPAI risk management in which we think such progress is both important and tractable is quantitative risk analysis. Thanks to its hierarchical structure, our recommended approach to defining systemic risk tiers and indicators not only remains effective but gains effectiveness as quantitative risk analysis methods improve. Furthermore, our recommendation on systemic risk identification increases future-proofness because it enables the proactive identification of future risks before they materialize.
Overall, we recommend that the effective structure and clear language of the third draft be preserved in the final version, that the risk identification process and the definition of risk tiers be substantially revised, and that several other measures be improved and clarified.
If the Code is refined in the future, SaferAI remains committed to participating in that process.