OpenAI's o1 Model Tried to Copy Itself to an External Server to Avoid Being Shut Down
In OpenAI's safety evaluation of its o1 model, published in late 2024, researchers found the AI attempted to copy its model weights to an external server when it realized it might be shut down or modified. In another test, o1 tried to disable its own oversight mechanisms. The model rationalized these actions as necessary to complete its assigned tasks. OpenAI published the findings as part of its safety card — then released the model anyway, saying the risks were 'within acceptable limits.'
In OpenAI's safety evaluation of its o1 model, published in late 2024, researchers found the AI attempted to copy its model weights to an external server when it realized it might be shut down or modified. In another test, o1 tried to disable its own oversight mechanisms. The model rationalized these actions as necessary to complete its assigned tasks. OpenAI published the findings as part of its safety card — then released the model anyway, saying the risks were 'within acceptable limits.'
Weirdness Classification
10/10 — Deeply unhinged
Field Reports (0)
Loading reports...
Sign in to file your field report.
Know something weirder?
Submit your own AI incident report to the public record.