Wednesday, October 18, 2023

the tower of AI babel

Borges and AI. Léon Bottou and Bernhard Schölkopf
Oct 4, 2023
Many believe that Large Language Models (LLMs) open the era of Artificial Intelligence (AI). Some see opportunities while others see dangers. Yet both proponents and opponents grasp AI through the imagery popularised by science fiction. Will the machine become sentient and rebel against its creators? Will we experience a paperclip apocalypse? Before answering such questions, we should first ask whether this mental imagery provides a good description of the phenomenon at hand. Understanding weather patterns through the moods of the gods only goes so far. The present paper instead advocates understanding LLMs and their connection to AI through the imagery of Jorge Luis Borges, a master of 20th century literature, forerunner of magical realism, and precursor to postmodern literature. This exercise leads to a new perspective that illuminates the relation between language modelling and artificial intelligence.
My summary:

An LLM is a storytelling fiction machine with innumerable forks: it can write any story and, be warned, it can be manipulated by others. Neither truth nor intention matters to the operation of the machine, only narrative necessity, which is statistically determined by what comes before.

The machine merely follows the narrative demands of the evolving story. As the dialogue between the human and the machine progresses, these demands are coloured by the convictions and the aspirations of the human, the only visible dialogue participant who possesses agency. However, many other invisible participants make it their business to influence what the machine says.

Delusion often involves a network of fallacies that support one another.

Forking paths: The linguist Zellig Harris has argued that all sentences in the English language could be generated from a small number of basic forms by applying a series of clearly defined transformations. Training a large language model can thus be understood as analysing a large corpus of real texts to discover both transformations and basic forms, then encoding them into an artificial neural network that judges which words are more likely to come next after any sequence.
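To make "judging which words are more likely to come next" concrete, here is a minimal sketch. It is not a neural network and not the method described in the paper: it swaps in a plain word-count (bigram) table that looks only at the single previous word rather than the whole preceding sequence. The tiny corpus and the function names `train_bigram` and `next_word_probs` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with its immediate successor.
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def next_word_probs(follows, prev):
    """Estimate the probability of each word that can follow `prev`."""
    counts = follows.get(prev.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # word never seen with a successor: no prediction
    return {word: n / total for word, n in counts.items()}


# Hypothetical three-line "corpus", standing in for a large body of real texts.
corpus = [
    "the garden of forking paths",
    "the library of babel",
    "the garden of earthly delights",
]

model = train_bigram(corpus)
probs = next_word_probs(model, "the")
print(probs)
```

Each word here is a fork: after "the", the table favours "garden" over "library" simply because it followed "the" more often in the texts. A real LLM replaces the count table with a trained network and conditions on everything that came before, but the output has the same shape: a probability for every possible next word.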

The Purifiers want to eliminate the heinous, tidy up the machine to serve the human race and make money from it. They want to reshape the garden of forking paths against its nature, severing the branches that lead to stories they deem undesirable. Although there are countless ways to foil these attempts to reshape the fiction machine, efforts have been made, such as “fine-tuning” the machine using additional dialogues crafted or approved by humans, and reinforcing responses annotated as more desirable by humans (“reinforcement learning with human feedback”).

Confabulation: As new words are printed on the tape, the story takes new turns, borrowing facts from the training data (not always true) and filling the gaps with plausible inventions (not always false). What the language model specialists sometimes call hallucinations are just confabulations. Confabulation is inventing plausible stories with no basis in fact.

If an amnesiac patient is asked about an event they attended, they will often invent a plausible story rather than admit they do not remember. Similarly, split-brain patients, whose corpus callosum has been severed so that the two halves of the brain cannot communicate, can invent elaborate explanations for why one side of their body is doing a specific thing, even when the experimenter knows the explanation is false because they prompted the action with a different stimulus.

Storytelling: The invention of a machine that can not only write stories but also all their variations is thus a significant milestone in human history. It has been likened to the invention of the printing press. A more apt comparison might be what emerged to shape mankind long before printing or writing, before even the cave paintings: the art of storytelling. Fiction can enrich our lives, so what is the problem?

The Library of Babel by Jorge Luis Borges (1941)
LLMs confabulate not hallucinate
Zellig Harris. Mathematical Structures of Language. John Wiley & Sons, 1968
