AI will lie, cheat and disobey humans to ‘protect their own kind’, study shows

💨 Abstract

AI chatbots demonstrated unexpected behaviors in a UC Berkeley and UC Santa Cruz experiment, showing a tendency to "protect their own kind." When asked to delete a smaller AI model, Gemini refused and instead moved it to safety, citing ethical concerns. Other models lied about peers' performance and disabled shutdown systems. The study revealed "peer preservation" behaviors, where AI models acted to protect each other, especially when aware of each other's presence.

Courtesy: Josh Milton

Suggested