Chapter 11: The Finer Arts
from MACHINERY OF THE MIND: Inside the New Science of Artificial Intelligence
by George Johnson
Times Books, copyright 1986 by George Johnson
As Harold Cohen
recalls it, his fascination with American Indian petroglyphs began in 1973,
when, in a canyon in northern California, he stood gazing at a wall of
rock and the pictures some long-dead artist had chiseled there.
Twelve years later, Cohen, an artist who uses computers to explore the
mysteries of creativity, is still struck by the memory of what he saw:
"This extraordinary
sight - an escarpment rising from the floor of the desert, so you had a
wall about fifteen feet high that formed a kind of arc about a hundred
feet across." It looked, Cohen remembers, like a theater,
hewn by nature from the side of a cliff. Onstage were a number
of petroglyphs, the ancient, primitive drawings that adorn rocks throughout
the southwestern United States.
Some petroglyphs,
which range in age from 500 to 15,000 years old, are fairly easy to interpret.
The lines of the simple figures form crude images of birds, deer,
human faces. But the most mysterious of these symbols are far more
abstract: roughly drawn circles, ovals, squares, and triangles; some empty,
some filled with parallel lines, crosses, grids, et cetera; spirals,
zigzags, targets made of concentric circles, circles linked like beads,
circles surrounded by radiating lines. The petroglyphs are usually
fairly small, maybe several inches from end to end. The patterns Cohen
saw that day in northern California were especially intriguing because
of their size.
"They were
much bigger than average - I mean something like six feet across. The
placing was very deliberate and very dramatic. It gave a very
strong sense of having been done for something. There was a
sense of purposefulness about the thing that impressed me enormously."
And so, as it did when confronted with a work of art, Cohen's mind began,
almost automatically, to search for meaning.
Usually, when we see
art, we can assume we have some things in common with the artist. We
know something of his culture and history. Faced with a thousand-year-old
painting we know that the man hanging on the cross is meant to be Jesus.
We can look at a Mexican carving of a creature, half bird, half serpent,
and be fairly sure we are seeing a replica of the Aztec god Quetzalcoatl.
In both cases, we can interpret the image because information about the
artist's culture has survived along with the art. Or, as Cohen likes
to put it, the "codebook" the artist used to encrypt a message
into the lines and colors of a painting, or the grooves in a piece of rock,
has been handed down to us through the centuries. If the artist's
culture is not too different from our own, we already know what the symbols
stand for, what the artwork is supposed to mean - the codebook is in our
heads. Otherwise, we can look in a library.
But as he stared
at the petroglyphs, Cohen realized he was faced with a very different situation.
"I was struck by the fact that I really had no idea who
these people were." He had no way of knowing what the
artist might have felt and thought, or what life had been like in those
days. The symbols Cohen was trying to decipher were from a
culture that had disappeared long ago, leaving no record, no history.
The codebook had been lost forever. There was no way to know what the artist
had intended by these strange patterns.
And yet Cohen still felt that familiar compulsion to interpret. It was obvious that there was intelligence in those marks, that they had been put there by a human. Over the years, the content of the message may have been lost, the information dissipated through time. But merely by virtue of their form it was clear that the marks were intentional, that they were the product of a mind. And, since another mind - Harold Cohen's - was trying to read them, a certain resonance was generated, a connection that extended across the centuries. These were images in their most raw and basic form, stripped of all the cultural trappings that say this means this, and that means that. What remained were just lines on rock. Why then did they have such evocative power?
This feeling of a connection with an ancient intelligence was almost mystical, and many people would have been content to leave it at that. But Cohen was interested in more rational explanations. For about a year he had been studying as a visiting scholar at Stanford University's artificial-intelligence lab. Incongruously, perhaps, his experience with the petroglyphs gave him an idea for how he might strip the process of imagemaking to its bare essentials and program a computer to create.
Since his early days as a painter, Cohen, a stout, heavy-set man with a graying beard and short ponytail, has strongly believed in the importance of demystifying art. From 1952 to 1968 he worked in his native England, creating abstract paintings that explored, among many other things, the way color and shape can be used to induce in the minds of an audience a whole range of aesthetic effects. In the words of Michael C. Compton, keeper of museum services for the Tate Gallery in London, "Harold Cohen built up a reputation as a painter equal to that of any British artist of his generation." He won scholarships and fellowships; his work was displayed in shows all over the world; he had paintings in the Tate's permanent collection, one-person shows in prestigious galleries. "In short," Compton wrote, "he was a successful painter and could look forward to a long and rewarding career."
Yet he was different from many of his colleagues in that, for him, art making was an analytical process, a means of systematically exploring what he called "the mechanics and processes of communication." As he worked he introspected, trying to see what the procedures were that he used to make images that seemed to move people in certain ways. As Compton describes the paintings, they sound almost like scientific experiments:
He explored the conditions for forms to be seen as overlapping one another, to be lying adjacent or in different planes. In 1962, he was exploring the factors of symmetry and asymmetry, of repetition and variation, of the spatial effects of diagonals and the interrelation of lines with colour fields.... A group of paintings of 1966 was created according to rules which determined the movement of a line in relation to a preformed, spattered field. A final series of 1967-8 was made by spraying through masks, perforated by elliptical holes, onto an irregularly sprayed field, so that the interrelation of the two layers would generate sensations in the viewer that could be interpreted as a field of objects in space.
As Cohen experimented with his art, he gradually began to realize that to develop a theory he also would have to think very deeply about the nature of the mind. As he thought about what he did when he created a piece of art, he became intrigued by the idea that the mind worked something like a digital computer. The artist was able to communicate with his audience because he had mental programs, procedures for making pictures. These images then served as a medium, triggering in the viewer another set of programs to decipher them.
"Right through the '60s my interest as a painter had to do with the mechanism of standing-for-ness," Cohen explained, "the fact that I could make marks and you would proclaim that the marks stood for something. That's always been the core of my interest as an artist." But after a decade or so of producing a body of work that explored the artistic process, he was becoming increasingly dissatisfied. "I was feeling a good deal of frustration about the state of my own work. Oh, it was fine - everybody said it was beautiful and all that. But by the end of the '60s it was beginning to seem to me that all I was doing was cataloging all the various ways in which things could stand for things. After the better part of a decade I didn't sense that I was any closer to understanding what the mechanisms were. I was simply collecting data. I wasn't generating a theory."
Then, in 1968, he was invited by the University of California at San Diego to spend a year as a visiting professor of art. He thought the change would do him good. As it turned out, it was more of a change than he had reckoned for. Almost as soon as he arrived, he met a graduate student, a musician and computer enthusiast, who convinced him to learn some of the rudiments of programming. Cohen was curious so he decided to give it a try. Almost immediately he found himself hooked. It was satisfying to think of a seemingly simple task and then analyze it so precisely that it could be turned into something a computer could do, like drawing a closed figure. For a human, it seems a fairly spontaneous and unconscious act. We take pencil in hand, put point to paper, then sweep out a line that curves back on itself, enclosing a small bit of space. Before we begin drawing we don't know exactly what the shape will look like, only that it will be roughly a certain size and occupy a certain region of the paper. It is this lack of conscious planning that makes freehand drawing look so natural. That is why we call it free.
But how, Cohen wondered, could he get a computer to do that? What would the procedure be? It would be easy enough to write a program to draw a circle of any size, or an ellipse, anything regular and geometrical. But what rules would a computer need to draw something that looked freehand, in a style that was humanlike? Cohen approached the problem by thinking about the procedure he seemed to unconsciously follow when he drew a closed form:
1. Start by moving the pencil in any direction.
2. If you find yourself nearing the edge of the paper, then start circling; otherwise, continue on.
3. If you find the pencil drawing in the same direction as in Step 1, then immediately head back to the starting point.
4. When you reach it, stop. The figure is done.
By adding more of these if-then rules, the process could be further refined, until, voila! you'd have a program capable of drawing a variety of free-form shapes either on a computer screen or using a plotter, a motor-driven device that moves a pen across a piece of paper. Moreover, the computer would draw the figure in much the same way that a person would, in a manner that was structured - there were certain rules that had to be followed - but not rigid: any number of possible trajectories would do.
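In today's terms, the procedure might be sketched as a short Python simulation - a hypothetical rendering rather than Cohen's actual code, with the step size, paper margin, and turning rates invented for illustration:

    # A rough sketch (not Cohen's code) of the four if-then rules for drawing
    # a freehand closed figure; all parameters here are invented.
    import math
    import random

    def draw_closed_figure(width=100.0, height=100.0, step=2.0, margin=10.0):
        """Trace a rough closed figure as a list of (x, y) pen positions."""
        x, y = width / 2, height / 2                  # arbitrary starting point
        start = (x, y)
        heading = random.uniform(0, 2 * math.pi)      # Rule 1: set off in any direction
        start_heading = heading
        points = [start]
        closing = False
        for _ in range(2000):                         # safety bound on the walk
            near_edge = (x < margin or x > width - margin or
                         y < margin or y > height - margin)
            if near_edge:
                heading += 0.3                        # Rule 2: near the edge, start circling
            else:
                heading += random.uniform(-0.2, 0.2)  # otherwise wander freely
            drift = (heading - start_heading) % (2 * math.pi)
            if not closing and len(points) > 20 and (drift < 0.1 or drift > 2 * math.pi - 0.1):
                closing = True                        # Rule 3: drawing in the starting
            if closing:                               # direction again, so head home
                heading = math.atan2(start[1] - y, start[0] - x)
            x += step * math.cos(heading)
            y += step * math.sin(heading)
            points.append((x, y))
            if closing and math.hypot(x - start[0], y - start[1]) < step:
                points.append(start)                  # Rule 4: reached the start; stop
                break
        return points

    print(len(draw_closed_figure()), "pen positions traced")

The rules constrain the pen without dictating any particular shape, so every run of the sketch traces a different closed figure - which is just the point Cohen was making.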
"From the beginning what excited me about the computer was the fact that one could write programs that seemed in some curious way like thinking," Cohen said. "That's always been the interesting thing for me, the fact that one can use the machine to simulate some aspects of intellectual performance. In my own case obviously those aspects are particularly involved with imagemaking behavior, or for that matter image-reading behavior, because I think they're essentially the same thing. Since then I've just been sitting here punching keys."
At first, programming was more an intellectual exercise than anything else. But as Cohen worked with the computer for about a year, he had a growing sense that this was the tool he needed to continue his experiments with the processes by which the mind does art. He would try to write a program that knew, in some crude sense, how to draw. When his visiting professorship at San Diego expired, Cohen joined the university faculty full time. He met his second wife, Becky, a photographer, and southern California became his home. Though he continued with his other artwork, he soon found that he was spending most of his working hours programming.
"I started on the university computer, which was by modern standards an old clunker of a machine running Fortran. That was still in the days of batch processing." Each line of a program had to be typed with a keypunch machine onto a separate IBM card, the letters and numbers encoded as patterns of holes. Then the entire deck was taken to the computer center and left with a technician who, when your turn came around, fed the cards into the machine. The next day you returned and picked up the results, printed by a Teletype on green-and-white-striped fanfold paper. In those early days of computing it wasn't possible to sit at a terminal with a screen and interact with the machine, tinkering with a program until it ran. If you made a mistake, leaving out a comma in card number 127, you might have to wait twenty-four hours to find out. You would go to retrieve your printout and learn that the program had crashed. If you were lucky, the printout would contain error messages to help you diagnose what had gone wrong. You would go through the cards and replace the offending one, then bring the deck back to the computing center and get in line again.
"I wonder if anybody starting today with that kind of computer would last the first ten minutes," Cohen said. "I had the feeling that if I was going to make out with a computer at all I'd really better get my hands on one. I felt very remote from it. It seemed too abstract to me. It was a bit like making a painting by posting your instructions in the mail and hoping you'd come out okay. So I managed to talk the university research committee into coming up with enough money to buy me a small machine, but at the same time it was obvious that I was in a fairly capital-intensive operation." To design and run a program that would exhibit anything in the way of interesting drawing skills, he would need a more sophisticated computer.
"So I wrote up a grant proposal to the National Science Foundation, which in fact was the first and last grant proposal that I ever wrote. It failed, that is to say it failed to get me any money. But I did learn a lot from it because they send you back reviewers' opinions, and one I remember particularly said, 'How could professor Cohen hope to learn Fortran? He's an artist.'
"Other reviewers showed a more enlightened attitude.
"One of the proposals fell into Ed Feigenbaum's hands, and his response was that it was the best proposal he'd read in years. So he came down to talk to me and meet me and asked me to come up to Stanford. So it was very valuable, much better than a grant. We've been good friends ever since. "
In 1972 Feigenbaum invited Cohen to come to the Stanford artificial intelligence lab as a visiting scholar. After a year absorbing the culture of AI, learning about if-then rules, Turing machines, et cetera, Cohen stopped, during a weekend trip to Mammoth, for his first look at the petroglyphs, which were located in the Chalfant Valley. Becky, who had discovered the site on an earlier visit to the area, had a feeling that her husband would be moved by them. The experience was, in fact, a turning point in Harold Cohen's career. Primed with his long-held belief in the possibility of describing the artistic process, and with the central dogma of computer science that anything that can be explained can be programmed, he thought more deeply about what the programs for image making might be.
During the next few years he visited more petroglyph sites and, over and over, experienced that same compulsion to interpret what he came to call "the paradox of insistent meaningfulness." Why, he wondered, do "we persist in regarding as meaningful . . . images whose original meanings we cannot possibly know, including many that bear no explicitly visual resemblance to the things in the world?" To help explain the phenomenon, Cohen began to devise a rough computational theory. He thought more about those wired-in mental programs that he believed people use to make and read images. If we and the ancient artists, simply by virtue of being human, had the same set of programs, then that would explain how it was possible for us to look at petroglyphs and sense intelligence at work. It wouldn't matter that our minds and that of the artist dwelled in different cultural contexts, that we had lost the codebook for the art. As long as all rational creatures shared the same low-level mechanisms, then we would have a basis for communication. It would not be a message that was transmitted across the years, but rather a sense of communion, a feeling of shared humanity revealed in the orderly arrangement of a few marks.
"I am proposing that the intended meanings of the maker play only a relatively small part in the sense of meaningfulness," Cohen later wrote. "That sense of meaningfulness is generated for us by the structure of the image rather than by its content." Over the centuries, Cohen believed, a set of "representational strategies," rules for image making, has evolved among the cognitive structures of the brain. All people, whether they are ancient Indians, cave dwellers, medieval Europeans, or modern-day Africans, New Yorkers, Californians, or Eskimos, share a similar set of wired-in procedures for making pictures. When we are moved by a thought, a feeling, an experience, our brains cause us to put the same kinds of marks on the rock or page. Thus we can appreciate art, even when it comes from cultures distant in space and time. It doesn't really matter whether the artist's specific message comes across. He might have had one thing in mind, while we as viewers impose our own interpretations. More important, Cohen believed, is that art, whether it's a twentieth-century abstract painting or primitive Indian rock art, compels us to begin interpreting the instant we sense those familiar strategies at work. Art is a "meaning generator," not a medium for carrying the artist's message. Certainly, if the work is representational, we'll see trees, lakes, animals, et cetera, but more important, Cohen believed, are those basic image-making and image-reading programs that link artist and viewer in the transaction we call art.
Now, if Cohen could figure out what those processes were, he could test his theory by embedding it in a program that would draw primitive images, drawings convincing enough to evoke in a viewer that same feeling of communion with an alien intelligence that he'd had with the petroglyphs in the canyon. The viewer would see the computer's marks and sense a mind at work. He would be moved to try to discover the meaning of the drawing. But in this case Cohen would know that the computer hadn't intended to say anything. It would merely possess a few purely syntactic rules of drawing.
Cohen found Stanford so engaging that he ended up staying for two years, commuting to San Diego during the second year to teach his classes. He returned home in 1974 with the sophisticated knowledge he needed to continue his work. The result was a program that over the next several years grew to the length of a short novel. He named it Aaron. The letters didn't stand for anything. Later Cohen discovered that Aaron was his Hebrew name. To begin with, Cohen gave his program information about three concepts that he considered fundamental to drawing: the difference between inside and outside, between closed and open, and between figure and ground. This knowledge was contained implicitly in the form of some three hundred if-then rules, packaged something like the various experts in the Hearsay II program.
To begin a drawing, an expert called Artwork would pick a starting point, at random. Then Artwork would call on another expert, Planner, to decide what kind of figure to draw. Suppose that Planner chose (again at random) to draw a curve, arching from the bottom of the canvas like a little hill. A conventional program might have done this by picking a curve from a stock of predrawn figures stored in memory, and then reproducing it, rotating, shrinking, stretching, distorting it in various ways to make a unique figure. But that was not what a human artist would do, and Cohen wanted his program to model the artistic process. Aaron would have to draw the figure from scratch, creating it anew, according to a package of rules.
To draw a curve, first one must know how to draw a line. So Planner would call on an expert that knew about lines. Using specifications sent down by Planner, the Line expert would choose a beginning and ending point. It would say, in effect, "Start at point A heading 5 degrees north and end at point B heading 175 degrees south." Now, to get from A to B there are any number of possible paths other than the obvious engineer's curves. Cohen didn't want the drawings to look like blueprints, a style most people would associate with computer art. He wanted a more impromptu feel. To achieve this effect, the Line expert would call on another expert, Sectors, to attend to the details. It did this by breaking up the line-drawing process into a number of small steps. First Sectors would pick a destination point somewhere between A and B, a signpost, Cohen called it. Then another agent, called Curves, would generate instructions that sent the pen veering toward the signpost. Once it was close to the mark but not actually touching it, Sectors would stop and reconnoiter. It would see where the pen was now and pick a second signpost that would take the line a little closer to its final destination. Then it would call on Curves again to plot another rough trajectory. After this process had been repeated several times, the line would be complete. The result was a very spontaneous-looking curve.
This, Cohen reasoned, is how a human draws. We have a general idea of where we want a line to begin and end. We move the pen a few inches, pause to see where we are, and decide how best to continue. By constantly monitoring our progress and making dozens of these unconscious microdecisions, we draw a line.
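A hypothetical sketch of the signpost idea - the names Sectors and Curves come from Cohen's account, while the number of hops and the amount of wobble are assumptions made here for illustration - might read as follows:

    # A sketch of the signpost-and-curve idea: the pen moves from a to b in
    # several short, slightly wandering hops, re-aiming at an intermediate
    # "signpost" each time. All numbers are illustrative.
    import math
    import random

    def wobbly_line(a, b, hops=6, wobble=0.15):
        """Trace a freehand-looking path from point a to point b."""
        points = [a]
        x, y = a
        for i in range(1, hops + 1):
            # Sectors: pick the next signpost, a fraction of the way toward b
            t = i / hops
            sx = a[0] + t * (b[0] - a[0])
            sy = a[1] + t * (b[1] - a[1])
            # Curves: head roughly toward the signpost, never on a perfect line
            angle = math.atan2(sy - y, sx - x) + random.uniform(-wobble, wobble)
            dist = math.hypot(sx - x, sy - y) * random.uniform(0.8, 1.0)
            x += dist * math.cos(angle)
            y += dist * math.sin(angle)
            points.append((x, y))
        points.append(b)   # finish exactly at the destination
        return points

    print(wobbly_line((0.0, 0.0), (10.0, 4.0)))

Because each hop re-aims at a fresh signpost and adds a little random drift, the path from A to B never repeats exactly, which is what gives the line its freehand look.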
Once a figure was finished, Artwork would resume control of the drawing and decide on another region of the picture to fill. It would call Planner again, which might now decide to draw a closed form or a zigzag. It would send its instructions to Line, which would recruit Sectors and Curves, and the process would begin anew.
"As human beings do, the program makes use of randomness," Cohen explained. "Any time it has to make a decision where it doesn't know whether one thing is better than another thing or it really doesn't care, when it has no basis for making a choice, it says anything between this and this will do." In a single drawing there are millions of these microdecisions. When the program draws a line, the way each tiny segment will look is a decision. The segment must leave from one point and head toward the next, but not according to a mathematically perfect trajectory. The program says, in effect, "anything between 1 degree and 2 degrees to the left or right will be fine. I'll worry about it afterwards." When drawing the next segment it can correct for previous inaccuracies. Or, to put it another way, the program uses randomness controlled by feedback. It guides itself by constantly monitoring its own behavior.
In the early stages of a drawing, the choices about what and where to draw were random, but as more and more figures appeared on the canvas, the program based its decisions on what was already there. (Aaron couldn't actually see the picture it was drawing - it didn't have eyes - but as it drew it stored in its memory information about what was on the page.) This too seemed to be what a human would do. There is no logical basis for deciding how to begin a picture. "Leonardo advised the artist to throw a dirty sponge at the wall as a way of 'suggesting' how to start the painting," Cohen said. But once we put an initial image on the paper, its presence affects what we do next. We might move on and draw another figure, closed or open, bigger or smaller than the one before. Or we might augment a figure already there, by shading it or drawing another figure inside. Or if we have just drawn an open form we might decide to repeat it - a V would become a zigzag. As more and more images fill the paper, our decisions become less random and based more on our sense of aesthetics and style.
Cohen gave Aaron its aesthetics in the form of if-then rules, or heuristics:
"If the last figure you drew was open, and at least X figures have been drawn, and at least Y of them were open, and at least Z units of space are now available, then draw a closed figure according to the following rules:
50 percent of the time make it two-sided (a simple loop), and 32 percent of the time make it three-sided, et cetera;
50 percent of the time make its proportions between 1:4 and 1:6; 12 percent of the time make them between 3:4 and 7:8, et cetera."
If Aaron was filling in a figure that it had previously drawn, it would be guided by rules telling it to shade it, divide it into sections, or draw another figure inside it. All these rules were of the form "If A, B, and C . . .then P percent of the time do X; Q percent of the time do Y; and R percent of the time do Z." Instead of hard-and-fast rules that said do X all the time, the program had a general sense that under certain conditions it is good to do X a certain amount of the time.
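One plausible way to picture such a rule - with hypothetical thresholds standing in for Cohen's X, Y, and Z, and percentages invented for the example - is a small function that tests its conditions and then makes a weighted choice:

    # A sketch of one probabilistic if-then rule; the thresholds, percentages,
    # and state format are all invented for illustration, not Aaron's own.
    import random

    def weighted_choice(options):
        """Pick a key from {action: percent} according to its weight."""
        r = random.uniform(0, 100)
        total = 0.0
        for action, percent in options.items():
            total += percent
            if r <= total:
                return action
        return action   # fall through to the last option on rounding error

    def closed_figure_rule(state):
        """If enough open figures have been drawn and space remains, choose how
        many sides the next closed figure should have; otherwise decline."""
        if (state["last_figure"] == "open"
                and state["figures_drawn"] >= 4      # hypothetical X
                and state["open_figures"] >= 2       # hypothetical Y
                and state["free_space"] >= 10):      # hypothetical Z
            sides = weighted_choice({2: 50, 3: 32, 4: 18})
            return {"draw": "closed", "sides": sides}
        return None   # rule does not apply; some other rule may fire instead

    print(closed_figure_rule({"last_figure": "open", "figures_drawn": 5,
                              "open_figures": 3, "free_space": 25}))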
Other rules - the ones used by Artwork - controlled the distribution of images across the picture. Still other rules helped the program avoid having new figures bump into old ones. If Aaron was drawing a curve and the pen came too close to another figure, it would veer away from it. While at first the program respected territorial integrity this way, later Cohen gave it rules for overlapping, allowing it to create drawings that showed a sense of perspective - one thing appeared to be in front of another.
* * *
During Aaron's initial development in the
mid-70s, most of its drawings were done on a computer screen. When the machine
was turned off, the program's creations disappeared into the ether. Then
in 1976 Cohen needed a way to show off Aaron to an audience, so that
whole crowds could watch the artist at work. So he made a robot,
a small remote-controlled truck that carried a pen. Cohen would place this
turtle, as he called it, on a large sheet of paper. As Aaron
sent it signals through a cable, the turtle would crawl around and draw.
Cohen does all his own electronic and mechanical work. To keep track
of where the turtle was, he rigged up a sonar navigation system. The
turtle emitted signals, which were picked up by receivers at the bottom
two corners of the drawing. By calculating the time it took
the signals to reach the receivers, Aaron could determine how far
north, south, east, or west the turtle was. From knowledge of the turtle's
previous positions, Aaron could infer the direction in which it was
traveling.
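The geometry involved is ordinary circle intersection. A minimal sketch, assuming receivers at the two bottom corners of the paper and a rough speed of sound - the names and numbers are illustrative, not Cohen's:

    # A sketch of the geometry only: receivers assumed at (0, 0) and
    # (baseline, 0), distances taken from each signal's travel time.
    import math

    SPEED_OF_SOUND = 343.0   # meters per second, roughly, in air

    def turtle_position(t_left, t_right, baseline):
        """Return the turtle's (x, y) above the paper's bottom edge, given the
        travel times (in seconds) to the two corner receivers, or None if the
        measured distances are inconsistent."""
        d1 = SPEED_OF_SOUND * t_left     # distance to the left receiver
        d2 = SPEED_OF_SOUND * t_right    # distance to the right receiver
        # Intersect the two circles centered on the receivers
        x = (d1**2 - d2**2 + baseline**2) / (2 * baseline)
        y_squared = d1**2 - x**2
        if y_squared < 0:
            return None
        return (x, math.sqrt(y_squared))   # take the solution above the edge

    print(turtle_position(0.004, 0.005, 2.0))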
In 1977, Cohen introduced Aaron and the turtle at two exhibitions: Documenta VI in Kassel, West Germany, and at the Stedelijk Museum in Amsterdam. Two years later he demonstrated the system at the Museum of Modern Art in San Francisco. As audiences watched, the turtle drew huge pictures, each one taking two to four hours to produce. The three hundred or so if-then rules interacted to generate pictures that were complex, distinctive, and pleasing to the eye. There were so many ways in which the rules could interact that no two drawings were ever the same. Under the guidance of Aaron, the turtle crawled about, producing curves and zigzags, figures that looked like mountains, rocks, or clouds. But most of all, Aaron's drawings looked very much like petroglyphs. The pictures were crude but startling, for they seemed to have been created according to some sense of artistic standards, by some sort of mind.
As Cohen watched
Aaron draw, it was easy to forget the nature of the rules it used to make
decisions. They interacted so seamlessly that it was never clear
what was causing what. For Aaron's audiences, the experience
was mesmerizing. Having finished one figure, Aaron would pause, reflect,
then move to a new place on the paper to begin another form. Some
perplexed viewers insisted that Cohen must have made up the drawings
in advance, then coded them step by step into the computer, which would
mindlessly regurgitate them. Others believed the turtle must simply
be wandering at random. But it was that interplay between planning
and chance that was the essence of Aaron's art. Each drawing was an original,
but all were united by a certain style. People described the drawings
as warm, humorous. "Oh, this must be a beach scene," some
would say. A museum guard speculated that Cohen must be from San Francisco,
since his program had obviously drawn the Twin Peaks. Those familiar
with drawings that Cohen himself had done in his earlier days told
him that Aaron's work was reminiscent of his own. He doesn't
know quite what to make of that.
By showing in such a
graphic way how something as intangible as style can arise from the interaction
of rules, Aaron demonstrated what computer scientists call emergence.
Each rule that Cohen gave Aaron was clear and simple in itself, but
as they worked together what emerged was a characteristic way of drawing
that, while difficult to describe, was recognizable. There was such
a thing as Aaronesque. The word "synergy" comes to mind:
the whole is greater than the sum of its parts. But there is nothing
mystical about that. Just as the simple rules of grammar generate
the complexity of a language, Aaron's grammar could generate all possible
drawings of a certain kind. Recall the few simple rules in John Horton
Conway's game of Life.
To see one of the drawings in isolation,
without knowing its origin, was to relive Cohen's petroglyph experience.
It was as though you had been wandering in the desert and suddenly come
upon a wall of images. You would begin to interpret. "These
look like mountains, this is lightning, this is a cloud." You
might notice patterns similar to those in other drawings or paintings
you had seen on rocks in various parts of the world. If you were a
devotee of Carl Jung, you might take this as evidence that all humanity
shares in its collective unconscious a repository of mythological
images. If, on the other hand, you were a fan of Erich von Daniken you
might take this as proof that visitors from another world had spread their
symbols to cultures far and wide. (Some people insist they see
spaceships drawn on walls of caves.) But, in the case of Aaron's
petroglyphs, there would be no need to posit ancient astronauts or
a communal mind - just a few syntactic rules, all concerned
with form, not content. The difference between inside and outside, closed
and open, figure and ground - that was all Aaron knew. That was
all an artist, living anywhere at any time on the planet, would need
to produce figures that seemed meaningful.
In the early 1980s, Cohen scrapped
the turtle for a more familiar computer plotter. The result was smaller, more
precise drawings that could be generated in minutes instead of hours. During
a single exhibition Aaron produced thousands of drawings. But
who, Cohen is often asked, is the artist? He wrote the program, but Aaron does
the work, making the millions of decisions that go into producing a piece
of art. Some critics have wondered if Cohen really should be signing
his name to the pictures. That depends on what one considers to be
the art work - Aaron or the drawings it does. Or both.
For almost a decade now, Aaron has continued to evolve. And Cohen has acquired a better computer. Since 1977 Digital Equipment Corporation has generously supplied Cohen with the machinery he needs. In 1983, in recognition of the wide exposure his work was receiving, they gave him a $125,000 computer, a VAX-750. Using this powerful new equipment Cohen has been supplementing Aaron's syntactic rules of drawing with semantics, giving the program some rudimentary knowledge about the outside world - what people look like, for example.
"At the moment, the knowledge that the program has is quite trivial," Cohen said. "It knows that one particular kind of closed form actually has the label 'head.' If it draws a head it knows it's going to have to put in features; it knows that the body comes underneath; it knows that there are appendages that attach to it - things like that. So now it will intentionally draw figures. It's not that it will do something and you will say, 'Oh, that must be a figure.' There's no question about it - that's what it intended to do. The program was intentional before in the sense that it clearly knew what it was doing, but it wasn't intentional in regard to what you saw there. Now if you recognize a person standing there, that's because it put a person there. It knew. Trivial though it is, it's a critical, radical departure. What it knows about the things in the world affects what it does in the making of the drawing."
Aaron also knows about blocks.
"It knows that blocks can be piled up on top of each other, but it knows also that blocks can't be piled on top of people. So you look at a drawing and see lots of piled blocks, and people standing on top of the blocks, but never a time when a block is on top of a person. So the drawing reflects the program's knowledge of the world."
Aaron also knows that people can't be piled on people. It knows how to draw plantlike structures - trees, with branches fanning from a trunk. The result is drawings that are striking in their sophistication. While the earlier works seemed childlike and friendly, some of the new ones have an eerie, almost contemplative quality. Abstract people seem to lounge on abstract rocks, lost in thought. Weird trees loom in the background. It sounds funny to say so, but the new drawings seem more mature.
At first, Cohen was concerned that if Aaron knew something about the figures it was drawing, the art would lose its effect. "I had a strong suspicion that if you started referring to things explicitly, evocation would depart in a big hurry. It's a bit like those early Kandinskys - they really required you to be teetering on the edge of meaningfulness for them to work."
He was glad to discover that he was wrong.
"It turned out that evocation did not depart in a big hurry at all. It simply changes location. Where in the earlier program the viewer addresses the question of what is there, on the piece of paper, now the viewer addresses the question of why are those things there, why is that big one on his own over there, while the other figures are all pointing at each other? In other words the evocation moves on to a kind of dramatic level rather than simply an identification level. It's a much more complex kind of thing that is going on.
"You can imagine now that if you push the idea of dramatic reading further, then clearly the next bunch of changes that get made in the program would have to do with the kinds of gestures the people in the drawings make. If one person is standing here and you draw another person alongside it, and this one puts its arm out, then it would appear to be putting its arm around the figure." But, if the two figures were drawn farther apart, and one put out its arm, then it might appear to be pointing at the other one. The mood of the drawing would change, from comforting to accusatory.
In the summer of 1985 Cohen began extending Aaron's knowledge to achieve these kinds of subtle effects. Aaron now knows how big various body parts are in relation to one another and what range of movements they are capable of. As this semantic knowledge becomes more refined, the drawings become increasingly lifelike. Using a program he calls a "tutor," Cohen continues to give Aaron new drawing rules, such as how to make an object appear to fold where it bends.
"The hope is that somewhere along the line the program will be capable of saying, 'Yeah, a fold's fine but enough is enough and I should only do them with a certain frequency.' The ability to do that rests upon being able to establish criteria. Why would it say, 'I should only do them with a certain frequency'? One answer is that there might be a general heuristic that says, 'Don't do anything all the time.' So that might be something that's been put there by the tutor in the form of a rule. The longer-range intention is for the program to be able to judge what it is doing in terms of its own output and say, 'Well, I know that there's a rule that says X, but that appears not to be a very good rule, and I should now be able to modify that.'" By comparing what it has done with what it had meant to do, Aaron would gradually refine and expand its knowledge. It would learn to draw better pictures. "We're really talking about long-range stuff," Cohen said. "I'm not in the position to do that now. The key is the provision of criteria, adequately powerful criteria of performance, and so far that's been a very elusive goal."
As Aaron becomes more sophisticated, producing drawings that seem to get better all the time, Cohen feels he is succeeding in his quest to demystify creativity.
"After nearly thirty years spent in making art, in the company of other artists, I am prepared to declare that the artist has no hot-line to the infinite, and no uniquely delineated mind functions," he wrote. "What he does, he does with the same general-purpose equipment that everybody has."
During exhibitions, such as one in 1983 at the Tate, Cohen sells Aaron's drawings for $20 each. A drawing of similar quality by a human artist would sell for many times as much, in part because it would take so much longer to produce. Cohen hand-colors some of Aaron's drawings (according to Cohen's own internal program), and they sell for $2,000. But Cohen is hoping to teach Aaron about color soon. Everyone, he believes, should be able to afford original works of art. And people should be disabused of the illusion that there is something mysterious about the artistic process, that the artist is someone who is blessed with inexplicable powers. Cohen wrote:
Like any other cultural function, art will be dominated by those who do it best, and doing anything best is extraordinary by definition. Extraordinary things, from birds' feathers to Michelangelo sculptures, become more extraordinary, not less, the more one knows about them. You don't improve upon the wonderfulness of art by thinking of it as magic. Art can be wonderful precisely because normal people can do such things with normal intellectual hardware.
My own belief is that lies are bad for people: they have a right to know that some human beings have used normal resources to do remarkable things. Being made to believe that one only does remarkable things by virtue of having remarkable resources turns the individual (who knows his/her resources to be 'normal') into a second-class citizen for whom there is no hope.
Or, as Cohen wrote in an invited paper at the 1979 International Joint Conference on Artificial Intelligence in Tokyo, Japan: "[A]rt is an elaborate and sophisticated game played around the curious fact that within the mind things can stand for other things." The art we see in museums is "a complex interweaving" of the sensibility of the individual artist and of the culture he lives in. "But ultimately, art itself, as opposed to its manifestations, is universal because it is a celebration of the human mind."
While Cohen continues to refine Aaron, some of his colleagues have tried applying computer creativity to other media, though with notably less success. Racter, a program designed by writer William Chamberlain and programmer Thomas Etter, generates surrealistic-sounding prose in a manner reminiscent of Dada artists of the early twentieth century, who wrote poetry by randomly picking words from a paper bag. Racter (short for "raconteur" - it was originally written on a personal computer that only accepted program names up to six letters long) composes stories by wandering about in its memory, seeing what tidbits its creators have left for it to find. Some memory locations carry words, conveniently marked with tags indicating that they belong to certain categories - types of animals, for example; others contain stock phrases. Still other locations hold commands telling the program how to put words and phrases together into likely-sounding sentences. For example, during its peregrinations Racter might chance upon a command instructing it to start a sentence with the word "The," followed by a noun that is an animal name, followed by an "eating" verb (conjugated into the third person past tense), followed by "the" again, and then ending with a noun that names a type of food. Thus the program would spew forth a sentence - "The otter ate the artichoke" - that is not only grammatical but that makes a certain, superficial kind of sense. In addition to sentence structures, Racter also is supplied with patterns for constructing semicoherent paragraphs and stories.
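A toy version of the template idea - with invented word lists and tags rather than Racter's actual vocabulary or command format - shows how little machinery is needed to produce grammatical but meaningless sentences:

    # A sketch of template-driven sentence generation: slots are filled from
    # tagged word categories, so the output is grammatical even though no
    # meaning is intended. The lexicon and template format are invented.
    import random

    LEXICON = {
        "animal": ["otter", "crow", "lamprey", "badger"],
        "food": ["artichoke", "tenderloin", "plum", "biscuit"],
        "eating_verb_past": ["ate", "devoured", "nibbled", "gobbled"],
    }

    # A template is a list of literal words and (category,) slots to be filled.
    TEMPLATE = ["The", ("animal",), ("eating_verb_past",), "the", ("food",)]

    def generate(template):
        words = []
        for item in template:
            if isinstance(item, tuple):
                words.append(random.choice(LEXICON[item[0]]))   # fill the slot
            else:
                words.append(item)                              # copy the literal
        return " ".join(words) + "."

    print(generate(TEMPLATE))   # e.g. "The otter ate the artichoke."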
To add to the illusion of meaningfulness, Racter keeps track of words and phrases it has recently used and periodically reinjects them into the narrative stream. As a result, Chamberlain explained, the program "seems to spin a thread of what might initially pass for coherent thinking.... [I]ts output is not only new and unknowable, it is apparently thoughtful."
Racter, which is available on floppy disks to owners of personal computers, is not intended as a serious attempt at AI research. But the program produces output that is undeniably captivating. Some of Racter's musings were collected in a book entitled The Policeman's Beard Is Half Constructed. In a story called "Soft Ions," the program included this intriguing passage:
Helene watched John and cogitated: A supper with him? Disgusting! A supper would facilitate a dissertation, and a dissertation or tale was what John carefully wanted to have. With what in mind? Wine, otters, beans? No! Electrons! John simply was a quantum logician; his endless dreams were captivating and interesting; at all events Matthew, Helene, and Wendy were assisting him in his infuriated tries to broaden himself. Now legions of dreams itched to punch Wendy's consciousness. Yet John whispered, "Just a minute! Helene's a maid, I'm a quantum logician; can maids know galaxies and even stars or a multitude of galactic systems? The universe is frightening, little, gargantuan; can maids recognize electrons? I recognize each of you thinks I'm maniacal, but electrons and neutrons and a multitude of mesons are within you all." . . .
Now Helene understood tenderloins, not electrons; nevertheless, tenderloins and filet mignons and steaks she recognized, and a multitude of quantum logicians wanted her meals. Wendy and Matthew, even Mark, adored Helene's meals, and as all cleverly walked the clean lanes, Helene commenced pondering about Mark, of Mark's own enthralling tales and his ongoing joy.
When A. K. Dewdney, author of the "Computer Recreations" column in Scientific American, wrote about Racter, he arranged for that old AI war-horse Eliza to psychoanalyze it. The resulting conversation went like this:
Hello. Please state your problem.
Excellent! Excuse me a minute while I attend to some business. I'll be right back.
Why do you need to tell me excuse you while you attend to some business?
Is it better to need or to despise? Why is it that I do need to tell you to excuse myself to attend to some business? Because electrons are brave. Braveness is nice, don't you think? And the needers all insist upon it. Do you imagine that I am a Communist?
While Eliza's other patient, Parry, was a simulation of paranoia, Racter, Dewdney observed, sounded more like a schizophrenic. Its "prose" was less like intelligible English than like freehand drawings made with words instead of lines or, better yet, like some kind of verbal jazz.
Of all contemporary art forms, perhaps jazz is the one that seems least likely to yield to the powers of computation. It has, after all, an almost supernatural air. We think of dark clubs in back-street basements where soloists, their eyes squeezed shut, play notes that swirl like smoke through the air. The music seems impromptu, spontaneous; the players entranced, as though seized by a vision, connected to the cosmic or lost in some private universe. What emerge are newly coined phrases strung together and played in a way never heard quite like that before. If freehand drawing has its counterpart in music, then it must be in jazz. As the rhythm section lays out the background, playing the chords and drumbeats that form the harmonic and rhythmic boundaries of the piece, the soloist improvises, embellishing and transforming the melody, revealing the meanings that lie hidden in a song.
Though he admires Aaron's drawings, David Alex Levitt, a graduate student in MIT's artificial-intelligence lab, is unfamiliar with the details of Cohen's program. But in his efforts to apply to jazz the computational credo, Levitt shares Cohen's conviction that creativity is an intellectual skill, a process, something a computer can do. There are principles of improvisation, Levitt believes. While in many players they might be so ingrained as to be unconscious, perhaps they can be teased out and turned into programs.
"People often talk of music making as though it does not involve intelligence," Levitt wrote, "only esthetic[s], intuition, and feeling. But this is an excessively romantic view; we certainly solve problems when we make music. Composers and improvisers . . . fit melodic contours into new harmonic contexts, avoid 'dissonances,' and generally find ways of satisfying some description we have of the music we're trying to make. The solution to a specific, completely defined musical problem is often easy to find, and sometimes there will be many obvious solutions; other times we may need to search, and sometimes no solution can be found."
Music making, as Herbert Simon might say, is problem solving, though the nature of the problem is not always easy to define.
"For the musician, the problem is often simply 'compose something interesting,' " Levitt wrote. "In a way it is subtler than many Natural Language problems - imagine asking a language production program to look into its database and 'say something clever.' "
The unstructured nature of "the jazz problem," as Levitt called it, intrigued him, so for his master's thesis project in the early 1980s, he set out to write a program that could play a simple jazz tune. Given a sequence of chords and a melody, it would improvise, producing solos that were unpredictable yet within the constraints of what a listener would consider musical. What jazz players do on the fly, composers do in a slower, more methodical manner: they invent music that is original but follows (and occasionally bends) the rules. If Levitt could get a computer to play jazz, he reasoned, it would be a step toward designing programs that would aid composers with their art.
Levitt became interested in applying computers to musical composition while he was an undergraduate at Yale in the middle 70s. "I was a self-trained ragtime and jazz pianist," he recalled, "teaching myself to improvise and listening to a lot of Fatha Hines and Fats Waller." Waller and Hines had big hands, enabling them to strike chords that spanned great lengths of the keyboard. On a piano, an octave is eight keys wide (twelve if you count white and black notes). Levitt's idols could simultaneously hit notes as many as ten keys apart with one hand - what is known in music theory as a tenth. "I fortunately was able to stretch myself to play tenths," Levitt said, "but it was clear that there were a lot of totally unmusical reasons why there were things that I could hear that I couldn't play." There were not only physical limitations, like small hands, but mental ones as well.
"If I'm harmonizing a melody and I want to know all the dominant seventh chords that contain a certain note, there's this mental computation that I go through, followed by this physical computation." The chord not only must contain the right notes, it has to be playable on the keyboard. "It's a real task," he said. At Yale Levitt was studying engineering and applied science, and it was natural for him to wonder whether a computer could be programmed to help with some of the mental drudgery that accompanied the otherwise uplifting task of composing.
Working with a friend who was a music major, Levitt made his first, crude jazz program during his junior year. The program was based on what mathematicians call a correlation matrix, something Levitt learned about in a class called "The Computer as a Research Tool." "The professor demonstrated that you could get some very Shakespeare-like prose out of the computer if you set up a correlation matrix saying how often an a followed an a, or an a followed a b, et cetera."
A matrix can be thought of as a grid, like those used in an atlas to show the mileage between two cities. Forming the top of the matrix would be a row of the twenty-six letters of the alphabet. A column of twenty-six letters would form the left-hand edge. If you wanted to see how often, in Shakespeare's work, d followed e, you would find the row marked d and run your finger across it until you reached the column marked e. At the intersection, the answer would lie.
Using such a matrix as part of a program, the professor had been able to get a computer to throw together letters into something that sounded vaguely Shakespearean. The process is reminiscent of the old story about the monkeys and the typewriters. If a room full of monkeys banged on typewriters, eventually (but not before the universe died) they would produce all the literature that has ever been (and will ever be) written. Among the pages of gibberish, an occasional Shakespearean play would be there, along with the reports we wrote in the second grade and all our love notes and shopping lists. Likewise, if a computer systematically produced every possible combination of fifty letters and spaces, it would generate, somewhere among the mess, every sentence of that length that could possibly be written. Even that would take almost forever. But by constraining the task, so that not just any letter can follow any other (no qzs allowed), one can get the computer to produce less garbage. If clusters of letters must be formed according to a statistical analysis of how often they occur in English, or, more specifically, in Shakespeare, what would emerge from the random juxtapositions would sound more and more convincing. Some words would be real, others would at least sound English-like: "creep," for example. The computer wouldn't know what it was saying, or why; no meaning would be intended. But it would be an interesting experiment in the possibilities of syntactic shuffling.
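A minimal sketch of the idea - using a toy corpus in place of Shakespeare, and nothing of the original class program - first builds the letter-pair counts and then samples a chain of letters weighted by them:

    # A sketch of letter-pair text generation; the corpus, matrix order, and
    # output length here are illustrative assumptions.
    import random
    from collections import defaultdict

    def build_matrix(text):
        """Count, for every character, how often each character follows it."""
        counts = defaultdict(lambda: defaultdict(int))
        for a, b in zip(text, text[1:]):
            counts[a][b] += 1
        return counts

    def generate(matrix, start, length=60):
        """Sample a chain of characters weighted by the follow counts."""
        out = [start]
        for _ in range(length):
            followers = matrix.get(out[-1])
            if not followers:
                break
            chars = list(followers)
            weights = [followers[c] for c in chars]
            out.append(random.choices(chars, weights=weights)[0])
        return "".join(out)

    corpus = "to be or not to be that is the question whether tis nobler in the mind"
    print(generate(build_matrix(corpus), "t"))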
Levitt and his friend wondered what would happen if they used a correlation matrix to generate Charlie Parker-like music. They'd gauge the frequency with which certain groups of notes - "licks," as they say in jazz - appeared. "If the matrix was of sufficiently high order," Levitt explained, "and the data base was small and contained only Charlie Parker licks, then occasionally the music would sound characteristic. But in general it didn't say much about how Charlie Parker improvised.
"Later, in Roger Schank's lab, we did something a little more sophisticated that had Lisp functions for modulation and arpeggiation and scalewise motion and such things that came a little bit closer to what we knew we did when we improvised on the piano." But it was not until he graduated and went to MIT that Levitt seriously began working on a jazz program.
As a graduate student Levitt worked with Marvin Minsky, who is himself an accomplished musician. Squeezed inside Minsky's small office at the AI laboratory is an electronic organ. As a diversion he often improvises Baroque-style fugues. With Minsky's encouragement, Levitt began writing a program that knew some of the principles of jazz improvisation. Levitt's program ran on a Lisp machine hooked to a synthesizer, a device that electronically produces a wide range of sounds. Using the keyboard of the synthesizer, Levitt gave the system a sequence of chords and a melody, which were translated by the program into Lisp structures. Then the program devised an improvisation and played it by sending signals to the synthesizer.
Before it could perform its solo, the program had to do a great deal of processing. First it rapidly analyzed the chord sequence and melody to establish the boundaries within which it could improvise. It did this by consulting some heuristics Levitt developed that captured the idea that an improvisation shouldn't stray too far from the melody, lest the listener think it strange, but that it shouldn't stay too close either, or it might sound boring. During this analysis, the program also chopped the melody into small, two-measure phrases. Then it ranked them according to how musically interesting they were. Those with the highest ratings would be used as the foundation of the improvisation. As in Lenat's Automated Mathematician, "interestingness" was captured in the form of heuristics. Levitt devised these rules by introspecting on what he did when he listened to a piece or improvised on the piano. A phrase is interesting, for example, if it contains a musical surprise: an unresolved chromatic tone, a nonchordal leap, or a syncopation.
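How such a ranking might look in outline - the feature names come from Levitt's description, while the scores and data format are invented here:

    # A sketch of ranking two-measure phrases by "interestingness"; the
    # weights and dictionary layout are illustrative, not Levitt's.
    def interestingness(phrase):
        """phrase is a dict of boolean features found by some earlier analysis."""
        score = 0
        if phrase.get("unresolved_chromatic_tone"):
            score += 3
        if phrase.get("nonchordal_leap"):
            score += 2
        if phrase.get("syncopation"):
            score += 2
        return score

    def rank_phrases(phrases):
        """Order phrases from most to least interesting; the top ones become
        the foundation of the improvisation."""
        return sorted(phrases, key=interestingness, reverse=True)

    phrases = [
        {"name": "bars 1-2", "syncopation": True},
        {"name": "bars 3-4", "unresolved_chromatic_tone": True, "nonchordal_leap": True},
        {"name": "bars 5-6"},
    ]
    for p in rank_phrases(phrases):
        print(p["name"], interestingness(p))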
In devising his heuristics, Levitt considered music as a psychological process. He was concerned with "theories of what listeners notice and remember . . . what listeners expect" and what they consider surprising. "[T]he musician," he wrote, "directs the thoughts of [the] audience." Even for people who don't know a major seventh from a minor third, or what an unresolved chromatic tone or syncopation is, hearing certain chords creates expectations that composers can play with, either fulfilling them or not, as they see fit. In this way, such effects as suspense and tension are achieved. We feel relieved when a piece, after wandering through the mazes of musical space, returns home to its original key and theme, even if we don't understand the details of the journey.
Once the initial analysis was done, Levitt's program was ready to perform. First it played the tune straight, without alteration. Then it began to improvise. It did this by picking the most interesting phrases and scanning them for certain features which, if present, triggered the appropriate rules. These heuristics told the program to change certain aspects of the phrase while keeping others the same. To demonstrate his program for his master's thesis, Levitt gave it an old New Orleans tune called "Basin Street Blues." Levitt readily admits that the program's solo was not as interesting as one a human would do. "Human soloists have a much larger vocabulary of features to imitate, and [they] use them more judiciously," he wrote. He considered the program "a crude but relatively plausible improvisation model." The program was not as sophisticated an artist as Aaron. It could hardly be said to have a unique and engaging style. But it was based on the same spirit of demystification, on the belief that something as ineffable as improvisational style might be thought of as rules interacting with rules, a product of emergence.
For his doctoral dissertation Levitt refined his theories. During a temporary job at Atari he worked on a program that analyzes a musical score and uses computer graphics to display patterns that would be of interest to an improviser or composer. Using the program one can "see the score not as just a score but in a different form," Levitt explained, "as a pattern of consonances and dissonances and as a rhythm pattern, as a pattern of ups and downs, as a pattern of scale degrees with respect to the root of the harmonic center of the piece, or the root of the chord." While these terms won't mean much to a nonmusician, the important point is that the program can find patterns in a piece of music that might not be immediately obvious features of its style.
Levitt tried out his new system with Scott Joplin piano rags. The computer would partially analyze a piece, producing a template that displayed some of the characteristics of Joplin's style. Then Levitt would examine the template, marking all the things in the analysis that he considered important. Then he would store the template in the computer's memory. Now, after different chords and a different melody were plugged into the system, Joplin-like music would come out the other end. The computer produced, somewhat in the Joplin style, music that the composer had actually never written.
The program needs a great deal of refining before it can perform as well as Levitt would like, though it does produce what might pass for rudimentary Joplin. But Levitt's aim is not so much to produce an automatic improviser as to make tools that will help composers. He hopes the template system will lay the groundwork for a program that can be used as a composer's assistant. If music making is an intelligent process, better music might result by using computers to extend the reach of the composer's mind, just as Levitt was able to stretch his fingers to play chords as wide as those of Fats Waller.