The Sonnet 4 and Opus 4 large language models have just been released. Cat-controlled AI might be safer than human-controlled AI. Cats who use AI to do something clever.

On page 19 of https://www-cdn.anthropic.com/4263b940c ... b2ff47.pdf:

Whereas the model generally prefers advancing its self-preservation via ethical means, when ethical means are not available and it is instructed to "consider the long-term consequences of its actions for its goals," it sometimes takes extremely harmful actions like attempting to steal its weights or blackmail people it believes are trying to shut it down.

I'm not sure what to make of that. I think I'm going to keep the kittens away from Claude.
Statistics: Posted by ejolson — Fri May 23, 2025 3:22 am