Excerpts from Alan Turing's "Computing Machinery and Intelligence"
Alan Turing (1912-1954) was a
British mathematician. During World War II, Turing's work with the
other mathematicians and analysts at Bletchley Park resulted in the
construction of early computers which were able to decipher German
coded messages sent using their supposedly unbreakable Enigma
encryption machines. Turing is also widely regarded as one of the
pioneering thinkers of digital computing and his "Turing Test,"
described below, remains influential as a way of thinking about the
idea of artificial intelligence. Despite his contributions to the
victorious British war effort, Turing was persecuted for violating
laws against male homosexuality in Britain, and ultimately
ended his own life as a result.
1. The Imitation Game
I propose to consider the question, "Can machines think?" This
should begin with definitions of the meaning of the terms "machine"
and "think." The definitions might be framed so as to reflect so far
as possible the normal use of the words, but this attitude is
dangerous. If the meaning of the words "machine" and "think" are to
be found by examining how they are commonly used it is difficult to
escape the conclusion that the meaning and the answer to the
question, "Can machines think?" is to be sought in a statistical
survey such as a Gallup poll. But this is absurd. Instead of
attempting such a definition I shall replace the question by
another, which is closely related to it and is expressed in
relatively unambiguous words.
The new form of the problem can be described in terms of a game
which we call the "imitation game." It is played with three people,
a man (A), a woman (B), and an interrogator (C) who may be of either
sex. The interrogator stays in a room apart from the other two. The
object of the game for the interrogator is to determine which of the
other two is the man and which is the woman. He knows them by labels
X and Y, and at the end of the game he says either "X is A and Y is
B" or "X is B and Y is A." The interrogator is allowed to put
questions to A and B thus:
C: Will X please tell me the length of his or her hair?
Now suppose X is actually A, then A must answer. It is A's object in
the game to try and cause C to make the wrong identification. His
answer might therefore be:
"My hair is shingled, and the longest strands are about nine inches
long."
In order that tones of voice may not help the interrogator the
answers should be written, or better still, typewritten. The ideal
arrangement is to have a teleprinter communicating between the two
rooms. Alternatively the question and answers can be repeated by an
intermediary. The object of the game for the third player (B) is to
help the interrogator. The best strategy for her is probably to give
truthful answers. She can add such things as "I am the woman, don't
listen to him!" to her answers, but it will avail nothing as the man
can make similar remarks.
We now ask the question, "What will happen when a machine takes the
part of A in this game?" Will the interrogator decide wrongly as
often when the game is played like this as he does when the game is
played between a man and a woman? These questions replace our
original, "Can machines think?"
2. Critique of the New Problem
As well as asking, "What is the answer to this new form of the
question," one may ask,
"Is this new question a worthy one to investigate?" This latter
question we investigate
without further ado, thereby cutting short an infinite regress.
The new problem has the advantage of drawing a fairly sharp line
between the physical
and the intellectual capacities of a man. No engineer or chemist
claims to be able to
produce a material which is indistinguishable from the human skin.
It is possible that at
some time this might be done, but even supposing this invention
available we should feel
there was little point in trying to make a "thinking machine" more
human by dressing it
up in such artificial flesh. The form in which we have set the
problem reflects this fact in
the condition which prevents the interrogator from seeing or
touching the other
competitors, or hearing their voices. Some other advantages of the
proposed criterion
may be shown up by specimen questions and answers. Thus:
Q: Please write me a sonnet on the subject of the Forth Bridge.
A: Count me out on this one. I never could write poetry.
Q: Add 34957 to 70764.
A: (Pause about 30 seconds and then give as answer) 105621.
Q: Do you play chess?
A: Yes.
Q: I have K at my K1, and no other pieces. You have only K at K6 and
R at R1. It is your move. What do you play?
A: (After a pause of 15 seconds) R-R8 mate.
The question and answer method seems to be suitable for introducing
almost any one of the fields of human endeavour that we wish to
include. We do not wish to penalise the machine for its inability to
shine in beauty competitions, nor to penalise a man for losing in a
race against an aeroplane. The conditions of our game make these
disabilities irrelevant. The "witnesses" can brag, if they consider
it advisable, as much as they please about their charms, strength or
heroism, but the interrogator cannot demand practical
demonstrations.
The game may perhaps be criticised on the ground that the odds are
weighted too heavily against the machine. If the man were to try and
pretend to be the machine he would clearly make a very poor showing.
He would be given away at once by slowness and inaccuracy in
arithmetic. May not machines carry out something which ought to be
described as thinking but which is very different from what a man
does? This objection is a very strong one, but at least we can say
that if, nevertheless, a machine can be constructed to play the
imitation game satisfactorily, we need not be troubled by this
objection.
It might be urged that when playing the "imitation game" the best
strategy for the machine may possibly be something other than
imitation of the behaviour of a man. This may be, but I think it is
unlikely that there is any great effect of this kind. In any case
there is no intention to investigate here the theory of the game,
and it will be assumed that the best strategy is to try to provide
answers that would naturally be given by a man.
***
4. Digital Computers
The idea behind digital computers may be explained by saying that
these machines are intended to carry out any operations which could
be done by a human computer. The human computer is supposed to be
following fixed rules; he has no authority to deviate from them in
any detail. We may suppose that these rules are supplied in a book,
which is altered whenever he is put on to a new job. He has also an
unlimited supply of paper on which he does his calculations. He may
also do his multiplications and additions on a "desk machine," but
this is not important.
If we use the above explanation as a definition we shall be in
danger of circularity of argument. We avoid this by giving an
outline of the means by which the desired effect is achieved. A
digital computer can usually be regarded as consisting of three
parts:
(i) Store.
(ii) Executive unit.
(iii) Control.
The store is a store of information, and corresponds to the human
computer's paper, whether this is the paper on which he does his
calculations or that on which his book of rules is printed. In so
far as the human computer does calculations in his head a part of
the store will correspond to his memory.
The executive unit is the part which carries out the various
individual operations involved in a calculation. What these
individual operations are will vary from machine to machine. Usually
fairly lengthy operations can be done such as "Multiply 3540675445
by 7076345687" but in some machines only very simple ones such as
"Write down 0" are possible.
We have mentioned that the "book of rules" supplied to the computer
is replaced in the machine by a part of the store. It is then called
the "table of instructions." It is the duty of the control to see
that these instructions are obeyed correctly and in the right order.
The control is so constructed that this necessarily happens.
The information in the store is usually broken up into packets of
moderately small size. In one machine, for instance, a packet might
consist of ten decimal digits. Numbers are assigned to the parts of
the store in which the various packets of information are stored, in
some systematic manner. A typical instruction might say:
"Add the number stored in position 6809 to that in 4302 and put the
result back into the latter storage position."
Needless to say it would not occur in the machine expressed in
English. It would more likely be coded in a form such as 6809430217.
Here 17 says which of various possible operations is to be performed
on the two numbers. In this case the operation is that described
above, viz., "Add the number. . . ." It will be noticed that the
instruction takes up 10 digits and so forms one packet of
information, very conveniently. The control will normally take the
instructions to be obeyed in the order of the positions in which
they are stored, but occasionally an instruction such as
"Now obey the instruction stored in position 5606, and continue from
there"
may be encountered, or again
"If position 4505 contains 0 obey next the instruction stored in
6707, otherwise continue straight on."
Instructions of these latter types are very important because they
make it possible for a sequence of operations to be replaced over
and over again until some condition is fulfilled, but in doing so to
obey, not fresh instructions on each repetition, but the same ones
over and over again. To take a domestic analogy. Suppose Mother
wants Tommy to call at the cobbler's every morning on his way to
school to see if her shoes are done, she can ask him afresh every
morning. Alternatively she can stick up a notice once and for all in
the hall which he will see when he leaves for school and which tells
him to call for the shoes, and also to destroy the notice when he
comes back if he has the shoes with him.
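The three-part scheme (store, executive unit, control) and the decimal instruction coding described above can be sketched as a toy interpreter. The add operation coded 17 and the two jump forms follow the text; the jump opcodes 20 and 21, and the sample store contents, are assumptions made here purely for illustration.

```python
# A minimal sketch of Turing's decimal instruction coding. Opcode 17
# (add position A into position B) follows the text; opcodes 20 (jump)
# and 21 (conditional jump) are illustrative assumptions.

def decode(word):
    """Split a 10-digit packet like 6809430217 into (A, B, opcode)."""
    s = f"{word:010d}"
    return int(s[0:4]), int(s[4:8]), int(s[8:10])

def run(store, start):
    """Obey instructions in order of position, as the control would."""
    pc = start
    while pc in store:
        a, b, op = decode(store[pc])
        if op == 17:                      # add store[a] to store[b]
            store[b] = store[a] + store[b]
            pc += 1
        elif op == 20:                    # "obey the instruction in position a"
            pc = a
        elif op == 21:                    # "if position a contains 0, jump to b"
            pc = b if store[a] == 0 else pc + 1
        else:                             # halt on an unknown opcode
            break
    return store

# Turing's example: add the number in position 6809 to that in 4302.
store = {6809: 5, 4302: 7, 0: 6809430217}
run(store, start=0)
print(store[4302])   # 12
```

Note that the instruction itself sits in the store alongside the data, exactly as the text describes the "table of instructions" replacing the book of rules.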
The reader must accept it as a fact that digital computers can be
constructed, and indeed have been constructed, according to the
principles we have described, and that they can in fact mimic the
actions of a human computer very closely.
The book of rules which we have described our human computer as
using is of course a convenient fiction. Actual human computers
really remember what they have got to do. If one wants to make a
machine mimic the behaviour of the human computer in some complex
operation one has to ask him how it is done, and then translate the
answer into the form of an instruction table. Constructing
instruction tables is usually described as "programming." To
"programme a machine to carry out the operation A" means to put the
appropriate instruction table into the machine so that it will do A.
An interesting variant on the idea of a digital computer is a
"digital computer with a random element." These have instructions
involving the throwing of a die or some equivalent electronic
process; one such instruction might for instance be, "Throw the die
and put the resulting number into store 1000." Sometimes such a
machine is described as having free will (though I would not use
this phrase myself). It is not normally possible to determine from
observing a machine whether it has a random element, for a similar
effect can be produced by such devices as making the choices depend
on the digits of the decimal for π.
Most actual digital computers have only a finite store. There is no
theoretical difficulty in the idea of a computer with an unlimited
store. Of course only a finite part can have been used at any one
time. Likewise only a finite amount can have been constructed, but
we can imagine more and more being added as required. Such computers
have special theoretical interest and will be called infinite
capacity computers.
The idea of a digital computer is an old one. Charles Babbage,
Lucasian Professor of Mathematics at Cambridge from 1828 to 1839,
planned such a machine, called the Analytical Engine, but it was
never completed. Although Babbage had all the essential ideas, his
machine was not at that time such a very attractive prospect. The
speed which would have been available would be definitely faster
than a human computer but something like 100 times slower than the
Manchester machine, itself one of the slower of the modern machines.
The storage was to be purely mechanical, using wheels and cards.
The fact that Babbage's Analytical Engine was to be entirely
mechanical will help us to rid ourselves of a superstition.
Importance is often attached to the fact that modern digital
computers are electrical, and that the nervous system also is
electrical. Since Babbage's machine was not electrical, and since
all digital computers are in a sense equivalent, we see that this
use of electricity cannot be of theoretical importance. Of course
electricity usually comes in where fast signalling is concerned, so
that it is not surprising that we find it in both these connections.
In the nervous system chemical phenomena are at least as important
as electrical. In certain computers the storage system is mainly
acoustic. The feature of using electricity is thus seen to be only a
very superficial similarity. If we wish to find such similarities we
should look rather for mathematical analogies of function.
***
6. Contrary Views on the Main Question
We may now consider the ground to have been cleared and we are ready
to proceed to the debate on our question, "Can machines think?" and
the variant of it quoted at the end of the last section. We cannot
altogether abandon the original form of the problem, for opinions
will differ as to the appropriateness of the substitution and we
must at least listen to what has to be said in this connexion.
It will simplify matters for the reader if I explain first my own
beliefs in the matter. Consider first the more accurate form of the
question. I believe that in about fifty years' time it will be
possible to programme computers, with a storage capacity of about
10⁹, to make them play the imitation game so well that an average
interrogator will not have more than 70 per cent chance of making
the right identification after five minutes of questioning. The
original question, "Can machines think?" I believe to be too
meaningless to deserve discussion. Nevertheless I believe that at
the end of the century the use of words and general educated opinion
will have altered so much that one will be able to speak of machines
thinking without expecting to be contradicted. I believe further
that no useful purpose is served by concealing these beliefs. The
popular view that scientists proceed inexorably from
well-established fact to well-established fact, never being
influenced by any improved conjecture, is quite mistaken. Provided
it is made clear which are proved facts and which are conjectures,
no harm can result. Conjectures are of great importance since they
suggest useful lines of research.
***
7. Learning Machines
The reader will have anticipated that I have no very convincing
arguments of a positive nature to support my views. If I had I
should not have taken such pains to point out the fallacies in
contrary views. Such evidence as I have I shall now give.
Let us return for a moment to Lady Lovelace's objection, which
stated that the machine can only do what we tell it to do. One could
say that a man can "inject" an idea into the machine, and that it
will respond to a certain extent and then drop into quiescence, like
a piano string struck by a hammer. Another simile would be an atomic
pile of less than critical size: an injected idea is to correspond
to a neutron entering the pile from without. Each such neutron will
cause a certain disturbance which eventually dies away. If, however,
the size of the pile is sufficiently increased, the disturbance
caused by such an incoming neutron will very likely go on and on
increasing until the whole pile is destroyed. Is there a
corresponding phenomenon for minds, and is there one for machines?
There does seem to be one for the human mind. The majority of them
seem to be "subcritical," i.e., to correspond in this analogy to
piles of subcritical size. An idea presented to such a mind will on
average give rise to less than one idea in reply. A smallish
proportion are supercritical. An idea presented to such a mind may
give rise to a whole "theory" consisting of secondary, tertiary
and more remote ideas. Animal minds seem to be very definitely
subcritical. Adhering to this analogy we ask, "Can a machine be made
to be supercritical?"
The "skin-of-an-onion" analogy is also helpful. In considering the
functions of the mind or the brain we find certain operations which
we can explain in purely mechanical terms. This we say does not
correspond to the real mind: it is a sort of skin which we must
strip off if we are to find the real mind. But then in what remains
we find a further skin to be stripped off, and so on. Proceeding in
this way do we ever come to the "real" mind, or do we eventually
come to the skin which has nothing in it? In the latter case the
whole mind is mechanical. (It would not be a discrete-state machine
however. We have discussed this.)
These last two paragraphs do not claim to be convincing arguments.
They should rather be described as "recitations tending to produce
belief."
The only really satisfactory support that can be given for the view
expressed at the beginning of §6, will be that provided by waiting
for the end of the century and then doing the experiment described.
But what can we say in the meantime? What steps should be taken now
if the experiment is to be successful?
As I have explained, the problem is mainly one of programming.
Advances in engineering will have to be made too, but it seems
unlikely that these will not be adequate for the requirements.
Estimates of the storage capacity of the brain vary from 10¹⁰ to
10¹⁵ binary digits. I incline to the lower values and believe that
only a very small fraction is used for the higher types of thinking.
Most of it is probably used for the retention of visual impressions.
I should be surprised if more than 10⁹ was required for satisfactory
playing of the imitation game, at any rate against a blind man.
(Note: The capacity of the Encyclopaedia Britannica, 11th edition,
is 2 × 10⁹.) A storage capacity of 10⁷ would be a very practicable
possibility even by present techniques. It is probably not necessary
to increase the speed of operations of the machines at all. Parts of
modern machines which can be regarded as analogs of nerve cells work
about a thousand times faster than the latter. This should provide a
"margin of safety" which could cover losses of speed arising in many
ways. Our problem then is to find out how to programme these
machines to play the game. At my present rate of working I produce
about a thousand digits of programme a day, so that about sixty
workers, working steadily through the fifty years might accomplish
the job, if nothing went into the wastepaper basket. Some more
expeditious method seems desirable.
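The arithmetic behind this estimate can be checked directly, assuming (as the text does) steady work with nothing discarded, and taking a year as 365 working days:

```python
# Sixty workers, each producing a thousand digits of programme a day,
# working steadily for fifty years (365 days a year assumed here).
workers, digits_per_day, years, days_per_year = 60, 1000, 50, 365
total = workers * digits_per_day * years * days_per_year
print(total)   # 1095000000
```

This comes to just over the 10⁹ digits Turing expects to need, which is why the margin is so thin that "some more expeditious method seems desirable."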
In the process of trying to imitate an adult human mind we are bound
to think a good deal about the process which has brought it to the
state that it is in. We may notice three components.
(a) The initial state of the mind, say at birth,
(b) The education to which it has been subjected,
(c) Other experience, not to be described as education, to which it
has been subjected.
Instead of trying to produce a programme to simulate the adult mind,
why not rather try to produce one which simulates the child's? If
this were then subjected to an appropriate course of education one
would obtain the adult brain. Presumably the child brain is
something like a notebook as one buys it from the stationer's.
Rather little mechanism, and lots of blank sheets. (Mechanism and
writing are from our point of view almost synonymous.) Our hope is
that there is so little mechanism in the child brain that something
like it can be easily programmed. The amount of work in the
education we can assume, as a first approximation, to be much the
same as for the human child.
We have thus divided our problem into two parts. The child programme
and the education process. These two remain very closely connected.
We cannot expect to find a good child machine at the first attempt.
One must experiment with teaching one such machine and see how well
it learns. One can then try another and see if it is better or
worse. There is an obvious connection between this process and
evolution, by the identifications
Structure of the child machine = hereditary material
Changes of the child machine = mutation,
Natural selection = judgment of the experimenter
One may hope, however, that this process will be more expeditious
than evolution. The survival of the fittest is a slow method for
measuring advantages. The experimenter, by the exercise of
intelligence, should be able to speed it up. Equally important is
the fact that he is not restricted to random mutations. If he can
trace a cause for some weakness he can probably think of the kind of
mutation which will improve it.
It will not be possible to apply exactly the same teaching process
to the machine as to a normal child. It will not, for instance, be
provided with legs, so that it could not be asked to go out and fill
the coal scuttle. Possibly it might not have eyes. But however well
these deficiencies might be overcome by clever engineering, one
could not send the creature to school without the other children
making excessive fun of it. It must be given some tuition. We need
not be too concerned about the legs, eyes, etc. The example of Miss
Helen Keller shows that education can take place provided that
communication in both directions between teacher and pupil can take
place by some means or other.
We normally associate punishments and rewards with the teaching
process. Some simple child machines can be constructed or programmed
on this sort of principle. The machine has to be so constructed that
events which shortly preceded the occurrence of a punishment signal
are unlikely to be repeated, whereas a reward signal increased the
probability of repetition of the events which led up to it. These
definitions do not presuppose any feelings on the part of the
machine. I have done some experiments with one such child machine,
and succeeded in teaching it a few things, but the teaching method
was too unorthodox for the experiment to be considered really
successful.
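The punishment and reward principle stated here can be sketched as a toy program. The two-action machine, the multiplicative weight updates, and the teaching loop below are all illustrative assumptions; the text does not describe how Turing's own child machine was constructed.

```python
import random

# A toy "child machine" built on the stated principle: an event that
# preceded a reward signal becomes more likely to be repeated, one
# that preceded a punishment signal less likely. The action set and
# the update factors are assumptions made for illustration.
class ChildMachine:
    def __init__(self, actions):
        self.weights = {a: 1.0 for a in actions}
        self.last = None

    def act(self):
        # Choose an action with probability proportional to its weight.
        self.last = random.choices(
            list(self.weights), weights=list(self.weights.values()))[0]
        return self.last

    def reward(self):                 # reward: repeat this more often
        self.weights[self.last] *= 2.0

    def punish(self):                 # punishment: repeat this less often
        self.weights[self.last] *= 0.5

# "Teach" the machine to answer "yes" by rewarding it whenever it does.
random.seed(0)
machine = ChildMachine(["yes", "no"])
for _ in range(30):
    machine.reward() if machine.act() == "yes" else machine.punish()
print(machine.weights["yes"] > machine.weights["no"])   # True
```

As the definitions in the text note, nothing here presupposes any feelings on the part of the machine; the signals merely shift probabilities.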
The use of punishments and rewards can at best be a part of the
teaching process. Roughly speaking, if the teacher has no other
means of communicating to the pupil, the amount of information which
can reach him does not exceed the total number of rewards and
punishments applied. By the time a child has learnt to repeat
"Casabianca" he would probably feel very sore indeed, if the text
could only be discovered by a "Twenty Questions" technique, every
"NO" taking the form of a blow. It is necessary therefore to have
some other "unemotional" channels of communication. If these are
available it is possible to teach a machine by punishments and
rewards to obey orders given in some language, e.g., a symbolic
language. These orders are to be transmitted through the
"unemotional" channels. The use of this language will diminish
greatly the number of punishments and rewards required.
Opinions may vary as to the complexity which is suitable in the
child machine. One might try to make it as simple as possible
consistently with the general principles. Alternatively one might
have a complete system of logical inference "built in." In the
latter case the store would be largely occupied with definitions and
propositions. The propositions would have various kinds of status,
e.g., well-established facts, conjectures, mathematically proved
theorems, statements given by an authority, expressions having the
logical form of proposition but not belief-value. Certain
propositions may be described as "imperatives." The machine should
be so constructed that as soon as an imperative is classed as "well
established" the appropriate action automatically takes place. To
illustrate this, suppose the teacher says to the machine, "Do your
homework now." This may cause "Teacher says 'Do your homework now' "
to be included amongst the well-established facts. Another such fact
might be, "Everything that teacher says is true." Combining these
may eventually lead to the imperative, "Do your homework now," being
included amongst the well-established facts, and this, by the
construction of the machine, will mean that the homework actually
gets started, but the effect is very satisfactory. The processes of
inference used by the machine need not be such as would satisfy the
most exacting logicians. There might for instance be no hierarchy of
types. But this need not mean that type fallacies will occur, any
more than we are bound to fall over unfenced cliffs. Suitable
imperatives (expressed within the systems, not forming part of the
rules of the system) such as "Do not use a class unless it is a
subclass of one which has been mentioned by teacher" can have a
similar effect to "Do not go too near the edge."
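The homework example can be sketched as a tiny store of well-established facts with one built-in rule. The string representation of facts and imperatives below is an assumption made purely for illustration, not Turing's machinery.

```python
# A toy sketch of the homework example: a store of well-established
# facts, one rule ("everything that teacher says is true"), and the
# constructional guarantee that an established imperative is acted
# on at once. The fact encoding is an illustrative assumption.

facts = set()
actions_taken = []

def establish(fact):
    """Class a fact as well established and draw the consequences."""
    if fact in facts:
        return
    facts.add(fact)
    # Rule: if teacher said P, and teacher is truthful, then P itself
    # becomes well established.
    if fact.startswith("Teacher says: ") and "Teacher is truthful" in facts:
        establish(fact[len("Teacher says: "):])
    # By the construction of the machine, an established imperative
    # is obeyed automatically.
    if fact.startswith("Do "):
        actions_taken.append(fact)

establish("Teacher is truthful")
establish("Teacher says: Do your homework now")
print(actions_taken)   # ['Do your homework now']
```

Combining the two facts leads, just as in the text, to the imperative itself being classed as well established, at which point the homework "actually gets started."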
The imperatives that can be obeyed by a machine that has no limbs
are bound to be of a rather intellectual character, as in the
example (doing homework) given above. Important amongst such
imperatives will be ones which regulate the order in which the rules
of the logical system concerned are to be applied. For at each stage
when one is using a logical system, there is a very large number of
alternative steps, any of which one is permitted to apply, so far as
obedience to the rules of the logical system is concerned. These
choices make the difference between a brilliant and a footling
reasoner, not the difference between a sound and a fallacious one.
Propositions leading to imperatives of this kind might be "When
Socrates is mentioned, use the syllogism in Barbara" or "If one
method has been proved to be quicker than another, do not use the
slower method." Some of these may be "given by authority," but
others may be produced by the machine itself, e.g. by scientific
induction.
The idea of a learning machine may appear paradoxical to some
readers. How can the rules of operation of the machine change? They
should describe completely how the machine will react whatever its
history might be, whatever changes it might undergo. The rules are
thus quite time-invariant. This is quite true. The explanation of
the paradox is that the rules which get changed in the learning
process are of a rather less pretentious kind, claiming only an
ephemeral validity. The reader may draw a parallel with the
Constitution of the United States.
An important feature of a learning machine is that its teacher will
often be very largely ignorant of quite what is going on inside,
although he may still be able to some extent to predict his pupil's
behavior. This should apply most strongly to the later education of
a machine arising from a child machine of well-tried design (or
programme). This is in clear contrast with normal procedure when
using a machine to do computations: one's object is then to have a
clear mental picture of the state of the machine at each moment in
the computation. This object can only be achieved with a struggle.
The view that "the machine can only do what we know how to order it
to do," appears strange in face of this. Most of the programmes
which we can put into the machine will result in its doing something
that we cannot make sense of at all, or which we regard as
completely random behaviour. Intelligent behaviour presumably
consists in a departure from the completely disciplined behaviour
involved in computation, but a rather slight one, which does not
give rise to random behaviour, or to pointless repetitive loops.
Another important result of preparing our machine for its part in
the imitation game by a process of teaching and learning is that
"human fallibility" is likely to be omitted in a rather natural way,
i.e., without special "coaching." (The reader should reconcile this
with the point of view on pages 23 and 24.) Processes that are
learnt do not produce a hundred per cent certainty of result; if
they did they could not be unlearnt.
It is probably wise to include a random element in a learning
machine. A random element is rather useful when we are searching for
a solution of some problem. Suppose for instance we wanted to find a
number between 50 and 200 which was equal to the square of the sum
of its digits, we might start at 51 then try 52 and go on until we
got a number that worked. Alternatively we might choose numbers at
random until we got a good one. This method has the advantage that
it is unnecessary to keep track of the values that have been tried,
but the disadvantage that one may try the same one twice, but this
is not very important if there are several solutions. The systematic
method has the disadvantage that there may be an enormous block
without any solutions in the region which has to be investigated
first. Now the learning process may be regarded as a search for a
form of behaviour which will satisfy the teacher (or some other
criterion). Since there is probably a very large number of
satisfactory solutions the random method seems to be better than the
systematic. It should be noticed that it is used in the analogous
process of evolution. But there the systematic method is not
possible. How could one keep track of the different genetical
combinations that had been tried, so as to avoid trying them again?
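Turing's example problem, a number between 50 and 200 equal to the square of the sum of its digits, can be run both ways. (For this particular problem the search space happens to contain a single solution, 81, so his caveat about duplicate random tries costs little here.)

```python
import random

def digit_sum_square(n):
    """True if n equals the square of the sum of its digits."""
    return n == sum(int(d) for d in str(n)) ** 2

# Systematic method: start at 51 and try each number in turn.
systematic = next(n for n in range(51, 200) if digit_sum_square(n))

# Random method: choose numbers at random until one works,
# possibly retrying values already seen, as Turing notes.
random.seed(0)
while True:
    n = random.randint(51, 199)
    if digit_sum_square(n):
        break

print(systematic, n)   # 81 81
```

The systematic search needs no memory of past tries but must wade through any solution-free block at the start of the range; the random search trades that risk for occasional repeated guesses, which matters little when satisfactory solutions are plentiful.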
We may hope that machines will eventually compete with men in all
purely intellectual fields. But which are the best ones to start
with? Even this is a difficult decision. Many people think that a
very abstract activity, like the playing of chess, would be best. It
can also be maintained that it is best to provide the machine with
the best sense organs that money can buy, and then teach it to
understand and speak English. This process could follow the normal
teaching of a child. Things would be pointed out and named, etc.
Again I do not know what the right answer is, but I think both
approaches should be tried.
We can only see a short distance ahead, but we can see plenty there
that needs to be done.
Source:
https://courses.cs.umbc.edu/471/papers/turing.pdf