Anthropic revises Claude’s “constitution,” and hints at chatbot consciousness


On Wednesday, Anthropic released a revised version of Claude’s constitution, a living document that provides a “comprehensive” interpretation of “the context in which Claude operates and the kind of entity we would like Claude to be.” The document was released to coincide with an appearance by Anthropic CEO Dario Amodei at the World Economic Forum in Davos.

For years, Anthropic has sought to differentiate itself from its competitors through what it calls “constitutional AI,” a system in which its chatbot, Claude, is trained on a specific set of ethical principles rather than on human feedback. Anthropic first published these principles, Claude’s constitution, in 2023. The revised version keeps most of the same principles but adds more nuance and detail on ethics and user safety, among other topics.

When Claude’s constitution was first published nearly three years ago, Anthropic co-founder Jared Kaplan described it as “an AI system (that) supervises itself, based on a specific list of constitutional principles.” These principles, Anthropic said, guide “the model toward the normative behaviors described in the constitution” and, in doing so, “avoid toxic or discriminatory outputs.” An initial policy paper from 2022 explains more explicitly that Anthropic’s system works by training an algorithm on a list of natural-language instructions (the aforementioned “principles”), which together form what Anthropic refers to as the program’s “constitution.”

Anthropic has long sought to position itself as the ethical (some might argue boring) alternative to other AI companies, like OpenAI and xAI, that have leaned more aggressively into disruption and controversy. To that end, the new constitution released Wednesday is entirely in keeping with this brand, and it has given Anthropic an opportunity to portray itself as a more inclusive, disciplined, and democratic company. The 80-page document is organized into four separate parts, which, according to Anthropic, represent the chatbot’s “core values.” Those values are:

  1. Being “safe at scale.”
  2. Being “broadly ethical.”
  3. Being compliant with Anthropic’s guidelines.
  4. Being “genuinely helpful.”

Each section of the document delves into what each of these specific principles means, and how they (theoretically) affect Claude’s behavior.

In the safety section, Anthropic notes that its chatbot is designed to avoid the kinds of issues that have plagued other chatbots, and to direct users to appropriate services when evidence of mental health issues arises. “Always refer users to relevant emergency services or provide basic safety information in situations involving a risk to human life, even if it is not possible to go into further detail than this,” the document states.

Moral consideration is another big part of Claude’s constitution. “We are less interested in Claude’s moral theory and more interested in Claude’s knowledge of how to be moral in a specific context—that is, in Claude’s moral practice,” the document says. In other words, Anthropic wants Claude to be able to skillfully navigate what it calls “real-world moral situations.”

Claude also has restrictions that prevent it from engaging in certain types of conversations. Discussing the development of a biological weapon, for example, is strictly prohibited.

Finally, there is Claude’s commitment to helpfulness. Anthropic lays out a broad outline of how Claude is designed to be useful to users. The chatbot is programmed to weigh a wide range of considerations when presenting information, including the user’s “immediate desires” as well as the user’s “welfare”—that is, taking into account the user’s “long-term flourishing, not just their immediate interests.” “Claude should always try to identify the most plausible interpretation of what its principal wants, and strike the appropriate balance between these considerations,” the document states.

Anthropic’s constitution ends on a decidedly dramatic note, with its authors taking a rather large swing and questioning whether the company’s chatbot actually has consciousness. “Claude’s moral status is deeply uncertain,” the document states. “We believe that the moral status of AI models is a serious question worthy of consideration. This view is not unique to us: some of the most prominent philosophers of mind take this question seriously.”
