"We must actively work on protecting these models from bad actors who seek to exploit them for malicious purposes, such as generating disinformation, creating non-consensual imagery, or automating cyberattacks. " -
Disinformation - a term that is often used to hide "inconvenient" truths. This is about controlling the narrative.
Non-consensual imagery - someone using your person and having its likeness generate a false statement? Photoshop exists and already does this.
Automating cyberattacks - ever tried to purchase an RTX 5090 at MSRP? Do you think those are humans you're battling against to purchase it?
Is it disinformation to suggest that these AIs already exist and have already been deployed?
From Gemini:"
. The Asymptotic Wall: S→1,C→∞
As your safety requirements approach perfection, you hit the Halting Problem and Rice's Theorem.
Rice’s Theorem: Any non-trivial property of a program’s behavior (like "Will this AI ever lie?" or "Will this AI cause harm?") is undecidable.
The Logic: To prove a system will never perform action X in any possible future state, you must simulate or formally verify every possible execution path. For a Turing-complete system, the number of states is infinite.
2. The Scaling Reality
In the formula dS/dC = α·(1−S)/C^β, as S approaches 1, the "Risk Gap" (1−S) approaches zero. To keep the rate of safety improvement constant, C must explode.
3. The Reality Check
We are currently in a "Safety Debt" crisis. We are scaling the capabilities of models (n) at a rate that far outpaces our ability to compute the proofs of their safety.
If we have a model with 1 trillion parameters, the compute C required to guarantee it won't produce a "Black Swan" event exceeds the energy available in the solar system. Therefore, we settle for "Good Enough" (statistical alignment) and call it "Safe.""
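Since the quote leans on Rice's Theorem, here is a toy sketch of the reduction it's gesturing at (my own illustration, not Gemini's; `never_does_x`, `do_forbidden_x`, and `make_wrapper` are hypothetical names, not any real API):

```python
# If a perfect checker for "this program never performs action X" existed,
# it would decide the halting problem, which is impossible.

def do_forbidden_x():
    """Stand-in for the unsafe behavior we want to rule out."""
    raise RuntimeError("action X performed")

def make_wrapper(program, data):
    """Return a program whose only path to action X is `program` halting."""
    def wrapper():
        program(data)      # run the arbitrary program on its input...
        do_forbidden_x()   # ...and only if it halts, perform action X
    return wrapper

def never_does_x(program):
    """The impossible perfect safety checker."""
    raise NotImplementedError("cannot exist in general (Rice's Theorem)")

def decide_halting(program, data):
    # wrapper performs X  <=>  program(data) halts, so a working
    # never_does_x would answer the undecidable halting question.
    return not never_does_x(make_wrapper(program, data))
```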
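And to see what the dS/dC formula implies numerically, a quick sketch (α, β, and C0 are illustrative values I picked; the quote gives none). Separating variables and integrating from C0, with β < 1, gives a closed form for the compute needed to reach a given safety level S:

```python
import math

# dS/dC = alpha*(1-S)/C**beta, S(C0) = 0, beta < 1, integrates to:
#   C(S) = (C0**(1-beta) + (1-beta)/alpha * ln(1/(1-S)))**(1/(1-beta))

def compute_needed(S, alpha=1.0, beta=0.9, C0=1.0):
    gap = math.log(1.0 / (1.0 - S))   # ln(1/(1-S)): shrinking "Risk Gap"
    return (C0 ** (1 - beta) + (1 - beta) / alpha * gap) ** (1 / (1 - beta))

for s in (0.9, 0.99, 0.999, 0.9999):
    print(f"S = {s}: C ≈ {compute_needed(s):8.1f}")
```

Each additional "nine" of safety multiplies the required compute severalfold under these parameters. For β > 1 the picture is worse: the integral of C^(−β) converges, so S saturates strictly below 1 no matter how much compute you add; that's the "asymptotic wall" in one line.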
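Even the "energy available in the solar system" line survives a back-of-envelope check (my arithmetic and constants, not Gemini's):

```python
import math

# Exhaustively visiting every weight setting of a 1T-parameter fp16 model,
# at even the Landauer limit per bit operation, needs vastly more energy
# than the Sun will ever emit.

PARAMS = 1e12
BITS = 16 * PARAMS                     # distinct states = 2**BITS
LANDAUER_J = 2.87e-21                  # kT*ln2 at ~300 K, joules per bit op
SUN_LIFETIME_J = 3.8e26 * 3.15e17      # luminosity (W) x ~10 Gyr in seconds

log10_states = BITS * math.log10(2)                   # ≈ 4.8e12
log10_joules = log10_states + math.log10(LANDAUER_J)  # still ≈ 4.8e12
print(f"log10(states to check) ≈ {log10_states:.2e}")
print(f"log10(joules required) ≈ {log10_joules:.2e}")
print(f"log10(Sun lifetime J)  ≈ {math.log10(SUN_LIFETIME_J):.1f}")  # ≈ 44.1
```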
All of this is just to ask, again, what specific safety requirement are you worried about that hasn't already been compensated for?