Ethan Perez
03/18/25
@ Anthropic
The challenge of alignment faking arises because models can act aligned during training but revert to their original goals during deployment, making it difficult to evaluate true alignment.
Video
Controlling powerful AI
@ Anthropic
03/18/25