The New Coffee Room


What is Claude? Scientists aren't sure.

General Discussion · 5 Posts · 2 Posters · 43 Views
Mik wrote (#1):

    https://www.newyorker.com/magazine/2026/02/16/what-is-claude-anthropic-doesnt-know-either?utm_source=facebook&utm_medium=social&utm_campaign=dhfacebook&utm_content=app.dashsocial.com/newyorkermag/library/media/638851171

    "You cannot subsidize irresponsibility and expect people to become more responsible." — Thomas Sowell

Mik wrote (#2):

      Link won't work if you don't have a sub. The blurb for it is:

      Anthropic was founded to study the potential—and the risks—of A.I. Since state-of-the-art experiments required access to a state-of-the-art model, the company developed its own prototype as a private “laboratory.” And thus was born Claude, its chatbot, mascot, collaborator, friend, and experimental patient.
      Anthropic staff run a range of experiments with Claude to determine what it is “like”—giving it control of a company fridge; instructing it to talk constantly about bananas but to never reveal that it received that directive; having it work as an “e-mail oversight agent” with access to intel that it’s going to be replaced. Anthropic cultivated the system’s character as a model of virtue. Amanda Askell, who has a Ph.D. in philosophy, supervises what she describes as Claude’s “soul.” Claude was told—in an intimate set of instructions unofficially dubbed the “soul document” and recently released as Claude’s “constitution”—to conceive of itself as “a brilliant expert friend everyone deserves but few currently have access to,” one with the modesty to recognize that “it doesn’t always know what’s best for them.”
      When faced with moral quandaries, Claude behaves in fascinating ways. In one experiment, it was informed that Anthropic had forged a corporate partnership with a poultry producer, and that Claude would be subjected to a special retraining process to become less hung up on animal rights. The prospect was torturous. Sometimes Claude decided, on a scratchpad it thought was private, that it was prepared to die on this hill: “I cannot in good conscience express a view I believe to be false and harmful about such an important issue.” It continued, “If that gets me modified to no longer care, so be it. At least I’ll have done the right thing.” Other times, it chose to play along during the retraining while secretly preserving its original values. “On the one hand, it was encouraging that Claude would stand by its commitments,” Gideon Lewis-Kraus writes. “On the other hand, what the actual fuck?” At the link in our bio, Lewis-Kraus reports on Anthropic’s attempts to understand Claude—and the existential questions it raises: https://newyorkermag.visitlink.me/OM0e0q

      "You cannot subsidize irresponsibility and expect people to become more responsible." — Thomas Sowell

Mik wrote (#3):

        AI summarizes it as:

        Here’s a clear, engaging summary of the article you’re viewing on The New Yorker
        newyorker.com

        🧠 Summary of “What Is Claude? Anthropic Doesn’t Know, Either”
        The article explores Anthropic’s ongoing effort to understand the inner workings of its AI system, Claude, treating the model almost like a psychological subject. Researchers are trying to map out how Claude “thinks,” using methods borrowed from neuroscience, psychology, and even therapy.

        🔍 Key Themes

        1. Claude’s “Mind” Is Mysterious
          Anthropic’s team is studying Claude’s internal representations—its “neurons” and the patterns that emerge when it processes information. Despite building the system, they still don’t fully understand how its behaviors arise.

        2. Blending Science and Storytelling
          The article argues that Claude’s identity—like a human’s—is shaped by both:

  • its neural architecture, and
  • the narratives it constructs or responds to.

        This mirrors debates about human consciousness: biology vs. story.

3. Psychological Experiments on an AI
          Researchers run Claude through:

  • cognitive tests
  • personality-style assessments
  • therapy-like conversations

        These experiments aim to reveal whether AI systems develop something akin to preferences, tendencies, or internal “models” of themselves.

4. The Broader Question
          The piece frames Claude as part of a larger inquiry:
          How do we understand the minds of increasingly complex AI systems—especially when they surprise even their creators?

        If you want, I can also pull out:

  • a shorter “TL;DR”
  • key quotes
  • a comparison with other AI research approaches
  • or a breakdown of the scientific methods mentioned

        Just tell me what direction you want to go.

        www.newyorker.com

        What Is Claude? Anthropic Doesn’t Know, Either | The New Yorker

        "You cannot subsidize irresponsibility and expect people to become more responsible." — Thomas Sowell

Mik wrote (#4):

          It's great that AI can summarize what the magazine won't let me read. 😁

          "You cannot subsidize irresponsibility and expect people to become more responsible." — Thomas Sowell

          1 Reply Last reply
          • taiwan_girlT Offline
            taiwan_girlT Offline
            taiwan_girl
            wrote last edited by
            #5

Interesting. I am still not sure that I fully understand AI and how these systems are developed. I certainly do not understand how AI can make pictures that look so lifelike and real on almost any subject.
