Comment by nonethewiser
17 hours ago
>We use the constitution at various stages of the training process. This has grown out of training techniques we’ve been using since 2023, when we first began training Claude models using Constitutional AI. Our approach has evolved significantly since then, and the new constitution plays an even more central role in training.
>Claude itself also uses the constitution to construct many kinds of synthetic training data, including data that helps it learn and understand the constitution, conversations where the constitution might be relevant, responses that are in line with its values, and rankings of possible responses. All of these can be used to train future versions of Claude to become the kind of entity the constitution describes. This practical function has shaped how we’ve written the constitution: it needs to work both as a statement of abstract ideals and a useful artifact for training.
The linked paper on Constitutional AI: https://arxiv.org/abs/2212.08073
Ah I see, the paper is much more helpful in understanding how this is actually used. Where did you find that linked? Maybe I'm grepping for the wrong thing but I don't see it linked from either the link posted here or the full constitution doc.
In addition to that, the blog post lays out pretty clearly that it’s for training:
> We use the constitution at various stages of the training process. This has grown out of training techniques we’ve been using since 2023, when we first began training Claude models using Constitutional AI. Our approach has evolved significantly since then, and the new constitution plays an even more central role in training.
> Claude itself also uses the constitution to construct many kinds of synthetic training data, including data that helps it learn and understand the constitution, conversations where the constitution might be relevant, responses that are in line with its values, and rankings of possible responses. All of these can be used to train future versions of Claude to become the kind of entity the constitution describes. This practical function has shaped how we’ve written the constitution: it needs to work both as a statement of abstract ideals and a useful artifact for training.
As for why it’s more impactful in training, that’s by design of their training pipeline. There’s only so much you can do with a better prompt versus the model actually learning something. In training, the model can be taught to reject prompts that violate its training, which a prompt alone can’t really do, since prompt injection attacks trivially thwart prompt-level techniques.
It's worth understanding the history of Anthropic. There's a lot of implied background that helps it make sense.
To quote:
> Founded by engineers who quit OpenAI due to tension over ethical and safety concerns, Anthropic has developed its own method to train and deploy “Constitutional AI”, or large language models (LLMs) with embedded values that can be controlled by humans.
https://research.contrary.com/company/anthropic
And
> Anthropic incorporated itself as a Delaware public-benefit corporation (PBC), which enables directors to balance stockholders' financial interests with its public benefit purpose.
> Anthropic's "Long-Term Benefit Trust" is a purpose trust for "the responsible development and maintenance of advanced AI for the long-term benefit of humanity". It holds Class T shares in the PBC, which allow it to elect directors to Anthropic's board.
https://en.wikipedia.org/wiki/Anthropic
TL;DR: The idea of a constitution and related techniques is something that Anthropic takes very seriously.
This article -> article on Constitutional AI -> The paper
It's not linked directly; you have to click into their `Constitutional AI` blog post and then click through to the linked paper.
I agree that the paper is just much more useful context than any descriptions they make in the OP blogpost.