Sunday, 14 June 2026

The Alignment Problem — A Conversation in the Senior Common Room at St Anselm's

The afternoon sun lay across the windows of the Senior Common Room.

Professor Quillibrace was reading.

Miss Stray was writing notes in a small notebook.

Mr Blottisham entered carrying a newspaper and an expression of considerable satisfaction.

"I see they've solved it."

Neither Quillibrace nor Miss Stray looked up.

This was not unusual.

After a moment Quillibrace asked:

"Solved what?"

"The alignment problem."

"I wasn't aware it had been solved."

"It is perfectly straightforward."

Miss Stray closed her notebook.

This was generally a sign that trouble was approaching.

Blottisham sat down.

"The whole issue is rather overcomplicated."

"I see," said Quillibrace.

"One simply ensures that the machine shares human values."

Quillibrace lowered his book.

Miss Stray looked interested.

Blottisham took this as encouragement.

"That is the entire problem."

"The entire problem?"

"Exactly."

The professor considered this.

Then he nodded.

"Excellent."

Blottisham smiled.

He had learned to be cautious when Quillibrace said "excellent."

Unfortunately, caution was not among his dominant traits.

"Quite."

Quillibrace placed a bookmark in his book.

"Which human values?"

Blottisham waved a hand.

"The usual ones."

"The usual ones?"

"Good values."

Quillibrace nodded.

"Good."

A pause followed.

Blottisham waited.

Nothing happened.

Eventually he frowned.

"Well?"

"Well what?"

"Aren't you going to argue?"

"Why would I argue?"

"Because I have just explained the solution."

Quillibrace looked thoughtful.

"I am merely trying to understand it."

Miss Stray smiled faintly.

This expression generally indicated that she understood exactly what was happening.

Blottisham did not.

Quillibrace continued.

"You propose that the machine should be aligned with human values."

"Yes."

"And these values are good values."

"Obviously."

"How shall we identify them?"

Blottisham looked surprised.

"We already know them."

"Do we?"

"Of course."

Quillibrace nodded.

"Excellent."

The word returned.

Blottisham felt a slight unease.

The professor continued.

"Would you mind listing them?"

"Certainly."

Blottisham leaned back.

"Fairness."

"Good."

"Freedom."

"Excellent."

"Justice."

"Very good."

"Honesty."

"Wonderful."

Blottisham smiled.

The matter appeared settled.

Then Miss Stray spoke.

"What do you mean by fairness?"

Blottisham blinked.

"What?"

"Fairness."

"What about it?"

"You listed it."

"Yes."

"What does it mean?"

Blottisham frowned.

"Surely everyone knows."

Miss Stray waited.

Quillibrace waited.

The room became unexpectedly quiet.

After a moment Blottisham said:

"It means being fair."

"Ah."

"Well, it does."

Miss Stray nodded sympathetically.

"That is often what fairness means."

Blottisham looked relieved.

"Exactly."

She continued.

"The difficulty is that different people mean different things by it."

Blottisham waved this aside.

"Minor details."

Quillibrace looked interested.

"Minor?"

"Obviously."

"How fortunate."

"What is?"

"The alignment problem."

Blottisham frowned.

"What do you mean?"

Quillibrace folded his hands.

"If the disagreements are merely minor, we should be able to resolve them immediately."

"Of course."

"Excellent."

There it was again.

The room fell silent.

Outside, a gardener pushed a wheelbarrow across the lawn.

Inside, Blottisham experienced the growing sensation that he had accidentally volunteered for something.

Miss Stray opened her notebook.

"Let us begin with fairness."

Blottisham's confidence wavered.

"Must we?"

"I think so."

"Why?"

Quillibrace smiled.

The smile was faint.

Almost invisible.

"Because," he said, "the machine is waiting."
