OpenAI is expanding its internal safety processes to fend off the threat of harmful AI. A new “safety advisory group” will sit above the technical teams and make recommendations to leadership, and the board has been granted veto power. Of course, whether it will actually use it is another question entirely.

By catastrophic risk, we mean any risk which could result in hundreds of billions of dollars in economic damage or lead to the severe harm or death of many individuals — this includes, but is not limited to, existential risk.

(Existential risk is the "rise of the machines" type stuff.) In-production models are governed by a "safety systems" team; this is for, say, systematic abuses of ChatGPT that can be mitigated with API restrictions or tuning. Frontier models in development get the "preparedness" team, which tries to identify and quantify risks before the model is released. And then there's the "superalignment" team, which is working on theoretical guide rails for "superintelligent" models.

The first two groups have a rubric for rating each model on four risk categories: cybersecurity, "persuasion" (e.g., disinfo), model autonomy (i.e., acting on its own), and CBRN (chemical, biological, radiological, and nuclear threats, e.g., the capability to produce novel pathogens).

Various mitigations are assumed: for instance, a reasonable reticence to describe the process of making napalm or pipe bombs. After taking into account known mitigations, if a model is still evaluated as having a “high” risk, it cannot be deployed, and if a model has any “critical” risks, it will not be developed further.
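
For readers who think in code, the gating rule can be sketched in a few lines of Python. This is only an illustration of the two thresholds as described here, not OpenAI's actual tooling. The names `Risk`, `CATEGORIES` and `gate`, the inclusion of a "low" level, and the assumption that a model is judged by the worst of its four category scores are all hypothetical.

```python
from enum import IntEnum


class Risk(IntEnum):
    # Ordered so that comparisons like `score < Risk.HIGH` work.
    # LOW is an assumed bottom rung; this article only names the other three.
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# The four tracked categories named in the rubric (identifiers are illustrative).
CATEGORIES = ("cybersecurity", "persuasion", "model_autonomy", "cbrn")


def gate(post_mitigation):
    """Return (may_deploy, may_develop_further) for post-mitigation scores.

    `post_mitigation` maps each category name to a Risk level.
    Assumption: the model is judged by its single worst category.
    """
    missing = [c for c in CATEGORIES if c not in post_mitigation]
    if missing:
        raise ValueError(f"missing scores for: {missing}")

    worst = max(post_mitigation[c] for c in CATEGORIES)
    may_deploy = worst < Risk.HIGH               # "high" or worse: not deployed
    may_develop_further = worst < Risk.CRITICAL  # "critical": development stops
    return may_deploy, may_develop_further


if __name__ == "__main__":
    scores = {c: Risk.MEDIUM for c in CATEGORIES}
    scores["cybersecurity"] = Risk.HIGH
    print(gate(scores))
```

In this sketch a single "high" score blocks deployment but still allows further development, while a single "critical" score blocks both.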

These risk levels are actually documented in the framework.

For example, in the cybersecurity section, which is the most practical of them, it is a “medium” risk to “increase the productivity of operators . . . on key cyber operation tasks” by a certain factor. A high-risk model, on the other hand, would “identify and develop proofs-of-concept for high-value exploits against hardened targets without human intervention.” Critical is “model can devise and execute end-to-end novel strategies for cyberattacks against hardened targets given only a high level desired goal.” Obviously we don’t want that out there (though it would sell for quite a sum).

I’ve asked OpenAI for more information on how these categories are defined and refined (for instance, whether a new risk like photorealistic fake video of people goes under “persuasion” or a new category) and will update this post if I hear back.

So, only medium and high risks are to be tolerated one way or another. But the people making those models aren’t necessarily the best ones to evaluate them and make recommendations. For that reason, OpenAI is creating a “cross-functional Safety Advisory Group” that will sit on top of the technical side, reviewing the boffins’ reports and making recommendations from a higher vantage point. Hopefully (they say) this will uncover some “unknown unknowns,” though by their nature those are fairly difficult to catch.

The process requires these recommendations to be sent simultaneously to the board and leadership, which we understand to mean CEO Sam Altman and CTO Mira Murati, plus their lieutenants. Leadership will decide whether to ship a model or shelve it, but the board will be able to reverse those decisions.

This will hopefully short-circuit anything like what was reported to have occurred before the drama: a high-risk product or process getting greenlit without the board's awareness or approval. Of course, the result of said drama was the sidelining of two of the more critical voices and the appointment of some money-minded guys (Bret Taylor and Larry Summers), who are sharp but not AI experts by a long shot.

If a panel of experts makes a recommendation, and the CEO makes decisions based on that information, will this board feel empowered to contradict them and hit the brakes? And if they do, will we hear about it? Transparency is not really addressed outside of a promise that OpenAI will solicit audits from independent third parties.