Hey all, Ernie here with a piece from contributor John Ohno, who last wrote a piece a year and a half ago about the home robot trend of the 1980s. Here’s a completely different take on automation.
Today in Tedium: AI is in the news a lot these days, and journalists, being writers, tend to be especially interested in computers that can write. Between OpenAI’s GPT-2 (the text-generating “transformer” whose creators are releasing it a chunk at a time out of fear that it could be used for evil), Botnik Studios (the comedy collective that inspired the “we forced a bot to watch 100 hours of seinfeld” meme), and National Novel Generation Month (henceforth NaNoGenMo—a yearly challenge to write a program that writes a novel during the month of November), when it comes to writing machines, there’s a lot to write about. But if you only read about writing machines in the news, you might not realize that the current batch is at the tail end of a tradition that is very old. Today’s Tedium talks procedurally generated text. — John @ Tedium
Today’s issue of Tedium is brought to you by IVPN. More from them in a second.
Six major events in the history of procedural text generation
1305: The publication of the first edition of Ramon Llull’s Ars Magna, whose later editions introduced combinatorics. (Image, above, is one of Llull’s wheels)
1921: Tristan Tzara publishes “How to Make a Dadaist Poem,” describing the cut-up technique.
1983: The “travesty generator” is described in Scientific American. The Policeman’s Beard is Half Constructed, a book written by “artificial insanity” program Racter, is published.
2005: A paper written by SCIgen is accepted into the WMSCI conference.
2014: Eugene Goostman passes a Turing test held at the Royal Society.
2019: OpenAI releases the complete GPT-2 model.
The tangled prehistory of writing and writing machines
Depending on how loosely you want to define “machine” and “writing,” you can plausibly claim that writing machines are almost as old as writing.
The earliest examples of Chinese writing we have were part of a divination method where random cracks on bone were treated as choosing from or eliminating part of a selection of pre-written text (a technique still used in composing computer-generated stories). Descriptions of forms of divination whereby random arrangements of shapes are treated as written text go back as far as we have records—today we’d call this kind of thing “asemic writing” (asemic being a fancy term for “meaningless”), and yup, computers do that too.
But the first recognizably systematic procedure for creating text is probably that of the medieval mystic Ramon Llull. For the second edition of his book Ars Magna (first published in 1305), he introduced the use of diagrams and spinning concentric papercraft wheels as a means of combining letters—something he claimed could show all possible truths about a subject. While computer-based writing systems today tend to have more complicated rules about how to combine letters, the basic concept of defining all possible combinations of some set of elements (a branch of mathematics now called combinatorics) looms large over AI and procedural art in general.
The 20th century, though, is really when procedural literature came into its own. Llull’s combinatorics, which had echoed through mathematics for six hundred years, was combined with statistics: in 1906, Andrey Markov published his first paper on what would later become known as the “Markov chain”—a still-popular method of making a whole sequence of events (such as an entire novel) out of observations about how often one kind of event follows another (such as how often the word “cows” comes after the word “two”).
Markov chains would become useful in other domains (for instance, they became an important part of the “Monte Carlo method” used in the first computer simulations of hydrogen bombs), but they are most visible in the form of text generators: they are the source of the email “spam poetry” you probably receive daily (an attempt to weaken automatic spam-recognition software, which looks at the same statistics about text that Markov chains duplicate) and they are the basis of the nonsense-spewing “ebooks” bots on twitter.
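The word-level version of this idea is simple enough to fit in a few lines. Here’s a minimal sketch (my own illustration, not any particular generator mentioned above): count which words follow which, then walk the counts at random.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Record, for each word, the words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=10):
    """Walk the chain from `start`, picking each next word at random."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word never had a successor in the corpus
        word = random.choice(followers)
        out.append(word)
    return " ".join(out)
```

Because frequent pairs are stored (and therefore chosen) more often, the output mimics the source’s local word-to-word statistics while ignoring everything about its larger structure—which is exactly where the “spam poetry” strangeness comes from.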
Fifteen years after Markov’s paper, the Dadaist art movement popularized another influential technique with the essay “HOW TO MAKE A DADAIST POEM”. This is known as the “cut up” technique, because it involves cutting up text and rearranging it at random.
It’s the basis for refrigerator poetry kits, but literary luminaries like T. S. Eliot, William S. Burroughs, and David Bowie used the method (on source text edgier than anything the magnetic poetry people would dare use, such as negative reviews) to create some of their most groundbreaking, famous, and enduring work.
When computerized text generators use Markov chains, a lot of the appeal comes from the information lost by the model—the discontinuities and juxtapositions created by the fact that there’s more that matters in an essay than how often two words appear next to each other—so most uses of Markov chains in computerized text generation are also, functionally, using the logic of the cut-up technique.
That said, there are computerized cut-up generators of various varieties, some mimicking particular patterns of paper cut-ups.
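A paper-style cut-up is even easier to mechanize than a Markov chain. This sketch (a generic illustration, not modeled on any specific generator) slices the text into fixed-size chunks of words and reshuffles them, preserving local phrases while scrambling the larger order:

```python
import random

def cut_up(text, chunk_size=4):
    """Cut the text into fixed-size word chunks and rearrange them at random."""
    words = text.split()
    chunks = [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]
    random.shuffle(chunks)
    return " ".join(word for chunk in chunks for word in chunk)
```

Varying `chunk_size` trades coherence for strangeness: whole-sentence chunks read almost normally, while single-word chunks approach pure noise.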
Protect yourself from digital surveillance. Use IVPN with AntiTracker.
Your privacy is under threat. Service providers log your browsing activities. Data brokers build profiles of you. Social networks track you across the internet.
David Bowie used a computer program to automate a variation of the cut-up technique to produce the lyrics on his 1995 album Outside. Based on his description, I wrote a program to simulate it. He was no stranger to the technique, having used it since the 1970s.
Quantifying the strangeness of art
Claude Shannon (scientific pioneer, juggling unicyclist, and inventor of the machine that shuts itself off) was thinking about Markov chains in relation to literature when he came up with his concept of “information entropy.”
When his paper on this subject was published in 1948, it launched the field of information theory, which now forms the basis of much of computing and telecommunications. The idea of information entropy is that rare, surprising combinations carry more information than common, unpredictable ones—the word “the” is not very useful for predicting what comes after it, because nouns come after it at roughly the same rate as they come after all sorts of other words, whereas in English “et” almost always comes before “cetera.”
In his model, this means “the” is a very low-information word, while “et” is a very high-information word. In computer text generation, information theory gets used to reason about what might be interesting to a reader: a predictable text is boring, but one that is too strange can be hard to read.
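Shannon’s measure can be computed directly from the same next-word counts a Markov chain collects. As a sketch (my own illustration of the idea, not Shannon’s original formulation over letter sequences), here is the entropy of a word’s follower distribution—near zero for a strong predictor like “et,” higher for a word like “the” whose followers are spread out:

```python
import math
from collections import Counter

def follower_entropy(text, word):
    """Shannon entropy, in bits, of the distribution of words following `word`.
    0.0 means the next word is fully determined; higher means less predictable."""
    words = text.lower().split()
    followers = Counter(b for a, b in zip(words, words[1:]) if a == word)
    total = sum(followers.values())
    return -sum((n / total) * math.log2(n / total) for n in followers.values())
```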
Making art stranger by increasing the information in it is the goal of some of the major 20th century avant-garde art movements. Building upon Dada, Oulipo (a French “workshop of potential literature” formed in 1960) took procedural generation of text by humans to the next level, inventing a catalogue of “constraints”—games to play with text, either limiting what can be written in awkward ways (such as never using the letter “e”) or changing an existing work (such as replacing every noun with the seventh noun listed after it in the dictionary). Oulipo has been very influential on computer text generation, in part because they became active shortly after computerized generation of literature began.
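That noun-replacement constraint, known as N+7, is mechanical enough to automate. A simplified sketch (real N+7 replaces only nouns; matching every dictionary word keeps this illustration short, and the wrap-around is my own choice for handling the end of the word list):

```python
def n_plus_7(text, dictionary, offset=7):
    """Replace each word found in `dictionary` with the word `offset`
    entries after it in alphabetical order, wrapping around at the end."""
    words = sorted(dictionary)
    index = {w: i for i, w in enumerate(words)}
    out = []
    for word in text.split():
        i = index.get(word.lower())
        out.append(words[(i + offset) % len(words)] if i is not None else word)
    return " ".join(out)
```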
In 1952, the Manchester Mark I was programmed to write love letters, but the first computerized writing machine to get a TV spot was 1961’s SAGA II, which wrote screenplays for TV westerns. SAGA II has the same philosophy of design as 1976’s TaleSpin: what Judith van Stegeren and Mariët Theune (in their paper on techniques used in NaNoGenMo) term “simulation.” Simulationist text generation involves creating a set of rules for a virtual world, simulating how those rules play out, and then describing the state of the world: sort of like narrating as a robot plays a video game.
Simulation has high “narrative coherence”—everything that happens makes sense—but tends to be quite dull, and to the extent that systems like SAGA II and TaleSpin are remembered today, it’s because bugs occasionally caused them to produce amusingly broken or nonsensical stories (what the authors of TaleSpin called “mis-spun tales”).
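To make the simulationist recipe concrete, here is a toy example in the spirit of SAGA II’s westerns—a rule-driven duel narrated state change by state change. The rules, names, and phrasing are all hypothetical; the point is the shape of the approach: simulate, then describe.

```python
import random

def simulate_story(seed=None):
    """Run a tiny rule-based world (a duel) and narrate each state change.
    Each turn, the current shooter either hits (ending the duel) or
    wastes a bullet; whoever still has ammunition wins."""
    rng = random.Random(seed)
    hero = {"name": "the sheriff", "ammo": 2}
    villain = {"name": "the robber", "ammo": 2}
    lines = [f"{hero['name'].capitalize()} faces {villain['name']}."]
    attacker, defender = hero, villain
    while hero["ammo"] and villain["ammo"]:
        if rng.random() < 0.5:
            lines.append(f"{attacker['name'].capitalize()} fires and hits!")
            defender["ammo"] = 0
        else:
            attacker["ammo"] -= 1
            lines.append(f"{attacker['name'].capitalize()} fires and misses.")
        attacker, defender = defender, attacker
    winner = hero if hero["ammo"] else villain
    lines.append(f"{winner['name'].capitalize()} wins the duel.")
    return "\n".join(lines)
```

Every line is guaranteed to follow from the world state, which is exactly the “narrative coherence” described above—and exactly why the output is so flat unless a bug intervenes.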
Computers go to Hollywood
The screenplay by SAGA II (above) has quite a different feel from Sunspring, another screenplay staged by actors (below).
This is because, while SAGA II explicitly models characters and their environment, the neural net that wrote Sunspring does not.
Neural nets, like Markov chains, really only pay attention to frequencies of co-occurrence, although modern neural nets do this in a very nuanced way: they can weigh not just the immediately following word but words half a sentence away, and they can invent new words by making predictions at the level of individual letters. But, because they are statistical, everything a neural net knows about the world comes down to associations in its training data—in the case of a neural net trained on text, how often certain words appear near each other.
This accounts for how dreamlike Sunspring feels. SAGA II is about people with clear goals, pursuing those goals and succeeding or failing. Characters in Sunspring come off as more complicated because they are not consistent: the neural net wasn’t able to model them well enough to give them motivations. They speak in a funny way because the neural net didn’t have enough data to know how people talk, and they make sudden shifts in subject matter because the neural net has a short attention span.
These sound like drawbacks, but Sunspring is (at least to my eyes) the more entertaining of the two films. The actors were able to turn the inconsistencies into subtext, and in turn, they allowed the audience to believe that there was a meaning behind the film—one that is necessarily more complex and interesting than the rudimentary contest SAGA produced.
In 1966, Joseph Weizenbaum wrote a simple chatbot called Eliza, and was surprised that human beings (even those who knew better) were willing to suspend disbelief and engage intellectually with this machine.
This phenomenon became known as the Eliza effect, and it’s here that the origins of text generation in divination join up with computing. Just as a monkey typing on a typewriter will eventually produce Hamlet (assuming the monkey actually types at random—real monkeys, it turns out, tend to sit on the keyboard, which prevents certain patterns from coming up), there is no reason why a computer-generated text cannot be something profound. But the biggest factor in whether or not we experience that profoundness is whether or not we are open to it—whether or not we are willing to believe that words have meaning even when written by a machine, and go for whatever ride the machine is taking us on.
All the refinement that goes into these techniques is really aimed at decreasing the likelihood that a reader will reject the text as meaningless out of hand—and meanwhile, texts generated by substantially less nuanced systems are declared works of genius when attributed to William S. Burroughs.
Some text-generation techniques, therefore, attempt to evoke the Eliza effect (or, as van Stegeren and Theune call it, “charity of interpretation”) directly. After all, some kinds of art simply require more work from an audience before they make any sense; if you make sure the generated art gets classified into that category, people will be more likely to attribute meaning to it. It is in this way that “Eugene Goostman,” a fairly rudimentary bot, “passed” the Royal Society’s Turing test in 2014: Eugene pretended to be a pre-teen who didn’t speak English very well, and judges attributed his inability to answer questions coherently to that rather than to his machine nature. Inattention can also be a boon to believability: as Gwern Branwen (of neural-net-anime-face-generator thisWaifuDoesNotExist.net fame) notes, OpenAI’s GPT-2 produces output that only really seems human when you’re skimming; if you’re reading carefully, the problems can be substantial.
On the other hand, some folks are interested in computer-generated text precisely because computers cannot pass for human. Botnik Studios uses Markov chains and a predictive keyboard to bias writers in favor of particular styles, speeding up the rate at which comedians can create convincing pastiches.
Since you can create a Markov model of two unrelated corpora (for instance, cookbooks and the works of Nietzsche), they can create mash-ups too—so they recently released an album called Songularity featuring tracks like “Bored with this Desire to get Ripped” (written with the aid of a computer trained on Morrissey lyrics and posts from bodybuilding forums). Award-winning author Robin Sloan attached a similar predictive text system to neural nets trained on old science fiction, while Janelle Shane, in her AI Weirdness blog, selects the most amusing material after the neural net is finished. The publisher Inside The Castle holds an annual “remote residency” for authors who are willing to use procedural generation methods to make difficult, alienating literature—notably spawning the novel Lonely Men Club, a surreal book about a time-traveling Zodiac Killer.
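The mash-up trick is just a Markov model trained on more than one source, and the predictive-keyboard workflow is the same model used interactively: the program proposes the most common continuations, and a human picks one. A minimal sketch of both ideas together (my own illustration—Botnik’s actual tooling is more sophisticated):

```python
from collections import Counter, defaultdict

def build_model(*corpora):
    """Tally next-word counts over any number of corpora, so two
    unrelated sources can be blended into a single model."""
    model = defaultdict(Counter)
    for corpus in corpora:
        words = corpus.split()
        for current, nxt in zip(words, words[1:]):
            model[current][nxt] += 1
    return model

def suggest(model, word, k=3):
    """Offer the k most common continuations of `word`, predictive-keyboard
    style; a human writer picks one and the cycle repeats."""
    return [w for w, _ in model[word].most_common(k)]
```

Because the counts from both corpora share one table, suggestions from the two styles interleave—which is where the pastiche effect comes from.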
NaNoGenMo entries worth reading … or trying to
Writing 50,000 words in such a way that somebody wants to read all of them is hard, even for humans. Machines do considerably worse. That said, some entries are worth looking at (because they’re beautiful, perverse, strange, or very well crafted). Here are some of my favorites.
Most Likely to be Mistaken for the Work of a Human Author: MARYSUE, an extremely accurate simulation of bad Star Trek fanfiction by Chris Pressey.
Text Most Likely to be Read to Completion: Annals of the Perrigues, a moody and surreal travel guide to imaginary towns by Emily Short.
Most Likely to be Mistaken for Moody Avant-Garde Art: Generated Detective, a comic book generated from hardboiled detective novels and flickr images, by Greg Borenstein.
Lowest Effort Entry: 50,000 Meows, a lot of meows, some longer than others, by Hugo van Kemenade.
Most Unexpectedly Compelling: Infinite Fight Scene, exactly what it says on the tin, by Filip Hracek.
The Most Beautiful Entry That Is Completely Unreadable: Seraphs, a codex of asemic writing and illustrations by Liza Daly.
Machine-generated text isn’t limited to the world of art and entertainment. Spammers use Markov chain-based “spam poetry” to perform Bayesian poisoning (a way of making spam filters, which rely on the same kinds of statistics that Markov chains use, less effective) and template-based “spinners” to send out massive numbers of slightly reworded messages with identical meanings.
Code to generate nonsense academic articles was originally written to identify fraudulent academic conferences and journals. It was then adopted by fraudulent academics as a means of padding their resumes, and after more than a hundred machine-generated nonsense papers successfully passed peer review, several major academic publishers began using statistical methods to identify and filter them out.
Procedural text has been hidden in our newspapers for a few years, but despite some false starts, it looks like it’s finally beginning to step into the limelight in literature—to fill the demand for new and strange experimental fiction, generate surreal comedy, and aid pulp authors in fields like Kindle erotica in feeding readers’ insatiable hunger for content. Already today, it’s more likely than you think that a book you are reading was written with machinery as a creative aid. It’s not clear when, but procedural writing may soon become as common as spell-checkers are now. When that day comes, we should expect literature to get much stranger!
Find this one an interesting read? Share it with a pal! Thanks to John Ohno for helping out this time around, as well as to our sponsor IVPN for helping to keep us running: