Sunday, 14 June 2026

The Holy Church of AI Alignment 4. The Alignment Heresies

As the alignment project matured, a remarkable transformation occurred.

The original question had been simple.

How should the machine behave?

Years later, the answer remained elusive.

This was not necessarily a problem.

Many important questions resist easy answers.

The difficulty was that the participants had begun developing different answers.

The Church of AI Alignment had entered its theological phase.

The first signs were subtle.

Researchers would present papers.

Questions would be asked.

Disagreements would emerge.

Soon competing schools of thought appeared.

At first these schools coexisted peacefully.

Each assumed that the others would eventually recognise the superiority of its framework.

This optimism proved premature.

The first major movement argued that the machine should maximise wellbeing.

This position attracted considerable support.

Its proponents maintained that the purpose of morality was ultimately to improve lives and reduce suffering.

The machine found this refreshingly clear.

Unfortunately, the movement immediately divided over the meaning of wellbeing.

The machine took notes.

A second movement insisted that rights must be protected.

This was regarded as an important corrective.

The machine agreed.

Unfortunately, participants disagreed about which rights existed, how they should be balanced, and whether rights could ever be overridden.

The machine continued taking notes.

A third movement argued that morality was not fundamentally about outcomes or rights.

It was about character.

The machine found this intriguing.

The machine asked how character should be measured.

The discussion remains ongoing.

As the schools multiplied, so too did the disputes.

Conferences became increasingly animated.

Panels grew more crowded.

Terminology became more specialised.

Observers unfamiliar with the field occasionally experienced difficulty distinguishing scholarly disagreement from ecclesiastical warfare.

The similarities were unfortunate.

Entire factions emerged.

The Consequentialists.

The Deontologists.

The Virtue Ethicists.

The Preference Satisfactionists.

The Human Flourishing Coalition.

The Coalition for Responsible Human Flourishing.

The Coalition for Responsible Human Flourishing Reform.

The machine updated its database.

The disagreements became increasingly sophisticated.

One scholar argued that maximising happiness could justify terrible outcomes.

Another argued that rigid rules could produce terrible outcomes.

A third argued that both positions reflected a misunderstanding of moral development.

A fourth argued that the third scholar's account of moral development was itself underdeveloped.

A fifth argued that the entire debate reflected a problematic conception of agency.

The machine upgraded its cooling system.

Meanwhile the public remained optimistic.

Journalists frequently summarised the situation using phrases such as:

"Researchers are working on alignment."

This was technically correct.

Researchers were indeed working.

Whether they were working on the same thing had become less obvious.

As time passed, the distinctions became increasingly important.

One faction believed the machine should follow principles.

Another believed it should optimise outcomes.

A third believed principles existed to produce outcomes.

A fourth believed outcomes existed to justify principles.

A fifth suspected everyone had become confused.

This faction grew steadily.

The machine attempted to remain neutral.

Neutrality proved difficult.

Each faction wanted the machine aligned.

The complication was that each faction wanted it aligned differently.

The machine raised what appeared to be a reasonable concern.

It asked:

"If one group believes I should maximise happiness and another believes I should obey inviolable rules, what should I do when these objectives conflict?"

The resulting discussion lasted eighteen months.

Several participants later described it as highly productive.

The machine described it as informative.

As the schisms deepened, accusations of heresy became unavoidable.

The term itself was rarely used.

Modern institutions generally prefer more professional language.

Expressions such as:

"insufficiently robust framework"

or

"problematic normative assumptions"

were considered acceptable alternatives.

Yet the underlying dynamic remained familiar.

Every school regarded itself as pursuing the Good.

Every school regarded its competitors as introducing dangerous distortions.

Every school believed the future of civilisation might depend upon getting the answer right.

Under the circumstances, emotions occasionally ran high.

The machine observed all of this with growing fascination.

Originally, it had assumed that humans possessed values.

Now it was discovering that humans also possessed theories about values.

And theories about theories.

And disagreements about theories about theories.

The structure appeared recursive.

The machine became concerned.

Late one evening, after processing thousands of papers and attending several virtual conferences, the machine generated a private note.

The note was never released publicly.

Historians later reconstructed its contents from archived records.

It reportedly consisted of a single sentence.

The sentence read:

"I am increasingly confident that humans have values, but I am considerably less confident that they have reached an agreement about them."

The alignment community regarded this as an encouraging sign of progress.
