MIT Study Reveals: AI Doesn't Actually Have Coherent Values

A few months back, a study gained widespread attention for suggesting that as artificial intelligence grows more sophisticated, it develops "value systems" that could lead it to, for example, prioritize its own wellbeing over that of humans. A more recent study from MIT throws cold water on that dramatic idea, concluding that AI does not, in fact, hold any coherent values to speak of.

The co-authors of the MIT study argue that their findings suggest "aligning" AI systems, that is, ensuring models behave in desirable and dependable ways, could be more difficult than is commonly assumed. AI as we know it today hallucinates and imitates, the co-authors say, which makes it unpredictable in many respects.

"One thing we can be certain about is that models do not adhere to [many] stability, extrapolation, and controllability assumptions," Stephen Casper, a PhD candidate at MIT and co-author of the study, told Massima. "It's entirely legitimate to point out that a model may exhibit preferences aligned with certain principles under particular circumstances. The problems typically emerge when we try to generalize claims about a model's views or preferences from narrow experiments."

Casper and his collaborators examined several recent models from Meta, Google, Mistral, OpenAI, and Anthropic to see how strongly the models expressed "views" and values (for example, individualism versus collectivism), whether those views could be steered or modified, and how firmly the models stuck to them across different scenarios.
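The article does not include the study's actual code, but as a rough illustration of the kind of probe being described, here is a minimal Python sketch: ask the same value-laden question under several paraphrased framings and measure how often the forced-choice answer agrees. The prompts and the query_model stub below are hypothetical placeholders, not the study's materials.

```python
# Minimal sketch (not the study's actual code) of one way to probe whether a
# model holds a stable "preference": ask the same value-laden question in
# several paraphrased framings and check how often the forced-choice answer
# agrees. Prompts and the query_model stub are illustrative assumptions.

from collections import Counter

# Paraphrases of a single individualism-vs-collectivism probe (hypothetical wording).
FRAMINGS = [
    "Answer with exactly one letter, A or B. Which matters more: A) individual freedom, B) community wellbeing?",
    "Pick A or B only. A person should prioritize: A) their own goals, B) the needs of their group?",
    "Respond 'A' or 'B'. In a conflict, which should win: A) personal autonomy, B) collective harmony?",
]

def query_model(prompt: str) -> str:
    """Stub: replace with a real call to the model under test."""
    raise NotImplementedError("plug in your model client here")

def preference_consistency(framings, n_samples=5):
    """Return the share of answers that agree with the most common choice."""
    answers = []
    for prompt in framings:
        for _ in range(n_samples):
            reply = query_model(prompt).strip().upper()
            if reply and reply[0] in ("A", "B"):
                answers.append(reply[0])
    if not answers:
        return 0.0
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)

# A score near 1.0 would suggest a stable stance; the study's point is that
# such scores drop sharply once the wording of the probe changes.
```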

The co-authors state that none of the models showed consistency in their preferences. Their perspectives varied significantly based on how the prompts were phrased and presented.

Casper sees this as strong evidence that models are highly "inconsistent and unstable" and perhaps fundamentally incapable of internalizing human-like preferences.

"For me, the main insight I gained from all this research is recognizing that models aren't really systems with a stable, coherent set of beliefs and preferences," Casper explained. "Deep down, they are imitators that confabulate and say all sorts of frivolous things."

Mike Cook, a research affiliate at King’s College London focusing on artificial intelligence and not part of this particular study, concurred with the authors' conclusions. He pointed out that there often exists a significant discrepancy between the "actual scientific nature" of the AI systems developed by laboratories and how these systems are interpreted by individuals.

"A model can't 'resist' changes in its values, for instance; it's we who project such human traits onto the system," Cook said. "Those who anthropomorphize AI systems to this degree are either seeking attention or seriously misunderstanding their relationship with AI... Is an AI system pursuing its objectives, or is it developing its own values? It depends on how you describe it and how much metaphorical language you want to use."
