Proposal to
THE ADVANCED RESEARCH PROJECTS AGENCY
for support of
THE STANFORD ARTIFICIAL INTELLIGENCE PROJECT
John McCarthy, Professor of Computer Science
Principal Investigator
THE HEURISTIC PROGRAMMING PROJECT
Edward Feigenbaum, Professor of Computer Science
Co-Principal Investigator
Joshua Lederberg, Professor of Genetics
Co-Principal Investigator
and
THE NETWORK PROTOCOL DEVELOPMENT PROJECT
Vinton Cerf, Assistant Professor of Computer Science and
Electrical Engineering, Principal Investigator
January 1973
STANFORD UNIVERSITY
ABSTRACT
Research projects in artificial intelligence, heuristic programming,
and network protocol development are proposed. These projects will
be administratively distinct within the Computer Science Department
and are described below in separate sections with separate budgets.
The total cost is $3 million over a period of two years and two weeks
(15 June 1973 through 30 June 1975).
Artificial Intelligence
The Stanford Artificial Intelligence Project proposes to advance
existing research programs in robotics, representation theory,
mathematical theory of computation, and automatic deduction. These
activities will cost a total of $1.25 million per year.
A principal objective of the robotics work will be to lay foundations
in computer vision and manipulation that will permit major advances
in industrial automation.
The work on representation theory will enlarge the set of problems
that can be represented and, hopefully, solved by computers. This
field is fundamental to language understanding, learning from
experience, and advanced problem solving.
Mathematical theory of computation is aimed at formally proving the
correctness of programs. This will provide a much more powerful
basis for certifying programs than our current imperfect "debugging"
techniques.
The automatic deduction work will extend heuristic programming
techniques for theorem proving to a wider range of problems than can
now be handled.
Heuristic Programming
The theme of the Heuristic Programming Project is the formulation and
development of intelligent programs for assisting scientific work.
The aim is toward high levels of performance by the programs in these
tasks. Five project tasks are proposed:
1. Completion of the Meta-DENDRAL Program, a theory-formation
assistant for mass spectrometry.
2. Development of systems for intelligent control of scientific
data-collection instruments.
3. Application of Heuristic Programming to the task of computer
programming ("automatic programming").
4. Application of Heuristic Programming to protein structure
determination from electron density maps derived from x-ray
crystallographic data.
5. Application of Heuristic Programming to climatology, to develop
programs to assist climatologists in theory formulation.
The activities of the Project have received substantial coordinate
support from other agencies, such as NIH and NASA, and it is expected
that this pattern will continue. The amount of support requested
from ARPA is $200,000 per year.
Network Protocol Development
The Network Protocol Development Project has two parts costing a
total of about $50,000 per year.
The first part is to provide for the orderly design, development,
implementation, and review of protocols for the ARPA network. A key
element in this task is to supply theoretical bases for the design of
protocols and to make predictions of performance to be verified by
observation and experiment (e.g. through the Network Measurement
Center).
The second part involves the coordination of an effort to specify
guidelines for subnet and Host-Host protocol design so that it will
be technically feasible for national networks to be interconnected.
The essential problem here is to isolate and raise the critical
issues which must be resolved by the protocols, and to effectively
stimulate and lead the working group's efforts. The cost of these
activities will be about $50,000 per year.
TABLE OF CONTENTS

1. ARTIFICIAL INTELLIGENCE PROJECT
   1.1 REPRESENTATION THEORY
   1.2 ROBOTICS
       1.2.1 The Pilot Assembly Station
       1.2.2 Research in Vision
       1.2.3 Manipulator Development
       1.2.4 System Requirements for Advanced Automation
       1.2.5 References
   1.3 MATHEMATICAL THEORY OF COMPUTATION
       1.3.1 Recent Accomplishments
       1.3.2 Future Work
       1.3.3 References
   1.4 AUTOMATIC DEDUCTION
       1.4.1 Theorem Proving
       1.4.2 Verification of Computer Programs
       1.4.3 Automatic Program Generation
       1.4.4 References
   1.5 FACILITIES
   1.6 PROJECT PUBLICATIONS
       1.6.1 Recent Articles and Books
       1.6.2 Theses
       1.6.3 Films
       1.6.4 Abstracts of Recent Reports
   1.7 BUDGET

2. HEURISTIC PROGRAMMING PROJECT
   2.1 THEORY FORMATION
       2.1.1 Meta-DENDRAL Problem Domain
       2.1.2 Meta-DENDRAL Background
       2.1.3 Relation of Theory Formation to Automatic Programming
       2.1.4 Research Trajectory
       2.1.5 Progress
       2.1.6 Use of Multiple Explanatory Paradigms
   2.2 INTELLIGENT CONTROL OF INSTRUMENTS: THE SELF-DIRECTING ANALYST (SDA)
       2.2.1 Application to Scientific Work
   2.3 APPLICATION OF HEURISTIC PROGRAMMING TO THE TASK OF COMPUTER PROGRAMMING
       2.3.1 Automatic Programming
       2.3.2 Meta-DENDRAL, Theta-DENDRAL, and Computer Operating Systems
       2.3.3 List-Processing
       2.3.4 Automatic Debugging
       2.3.5 Automatic Program Formation in Operating Systems
   2.4 APPLICATION OF HEURISTIC PROGRAMMING TO PROTEIN STRUCTURE DETERMINATION
       2.4.1 Problem Description
       2.4.2 Objectives
       2.4.3 Program Construction
       2.4.4 Fundamental Research Problems
       2.4.5 Work Plan
   2.5 APPLICATION OF HEURISTIC PROGRAMMING TO THEORY FORMATION IN CLIMATOLOGY
       2.5.1 Problem Formulation
       2.5.2 The Knowledge Base
       2.5.3 Problem Difficulty
       2.5.4 Resources
       2.5.5 Future Plans
   2.6 APPENDIX: THE NATURE OF AN APPLICATION OF HEURISTIC PROGRAMMING TECHNIQUES
       2.6.1 Coordinate Support
   2.7 DENDRAL PUBLICATIONS
   2.8 BUDGET

3. NETWORK PROTOCOL DEVELOPMENT PROJECT
   3.1 OBJECTIVES
   3.2 PRESENT STATE OF AFFAIRS
   3.3 TYPICAL ANALYSIS
   3.4 WORK STATEMENT
   3.5 PERSONNEL
   3.6 REFERENCES
   3.7 BUDGET

4. BUDGET SUMMARY

5. COGNIZANT PERSONNEL
1. ARTIFICIAL INTELLIGENCE PROJECT
Artificial Intelligence is the experimental and theoretical study of
perceptual and intellectual processes using computers. Its ultimate
goal is to understand these processes well enough to make a computer
perceive, understand and act in ways now only possible for humans.
The Stanford Artificial Intelligence Project will soon have been in
existence ten years, although the present location and the computer
facilities are a few years newer. During this time, its work has
been basic and applied research in artificial intelligence and
closely related fields, including computer vision, problem solving,
and control of an artificial arm and a vehicle. This work has been fully
described in our Artificial Intelligence Memoranda and in published
papers. The bibliography of this proposal gives references [Section
1.6]. Here is a short list of what we consider to have been our main
accomplishments.
Robotics
Development of vision programs for finding, identifying and
describing various kinds of objects in three dimensional
scenes. The scenes include objects with flat faces and also
curved objects.
The development of programs for manipulation and assembly of
objects from parts. The latest result is the complete
assembly of an automobile water pump from parts in known
locations.
Speech Recognition
Development of a system for recognition of continuous speech,
later transferred to Carnegie-Mellon University and now being
expanded upon elsewhere. Additional research in speech
recognition has been done here and will continue if support
can be found.
Heuristic Programming
Our support of Hearn's work on symbolic computation led to
the development of REDUCE, now being extended at the
University of Utah and widely used elsewhere.
Work in heuristic programming resulting in Luckham's
resolution theorem prover. This is currently about the best
theorem prover in existence, and it puts us in a position to
test the limitations of current ideas about heuristics so we
can go beyond them.
Representation Theory
Work in the theory of how to represent information in a
computer is fundamental for heuristic programming, for
language understanding by computers and for programs that can
learn from experience. Stanford has been the leader in this
field.
Mathematical Theory of Computation
Our work in mathematical theory of computation is aimed at
replacing debugging by computer checking of proofs that
programs meet their specifications. McCarthy, Milner, Manna,
Floyd, Igarashi, and Luckham have been among the leaders in
developing the relevant mathematical theory, and the
laboratory has developed the first actual proof-checking
programs that can check proofs that actual programs meet
their specifications. In particular, Robin Milner's LCF is a
practical proof checker for a revised version of Dana Scott's
logic of computable functions.
Research Facilities
Development of a laboratory with very good computer and
program facilities and special instrumentation for the above
areas. At present, we have the largest general purpose
display-oriented timesharing system in the world.
Unfortunately, it also appears to be the most heavily loaded
timesharing system in the world.
The development of a mechanical arm well suited to
manipulation research. It is being copied and used by other
laboratories.
In the course of developing our facilities, we have improved
LISP, developed an extended Algol compiler called SAIL,
written editors, loaders, and other utility programs for the
PDP-10 and made numerous improvements to time-sharing
systems. Many of these programs, particularly LISP and SAIL,
are used directly in dozens of other computer centers.
We have designed an advanced central processor that is about
10 times as fast as our PDP-10. We plan to construct this
processor and make it operational within the early part of
this proposal period.
In the next period, we propose to continue our work in these areas
with the following expected results:
The artificial intelligence problem still won't be
conclusively solved. As a scientific subject, the study of
the mechanisms of intelligence is as difficult as physics,
chemistry, or biology, and decisive results will be slow in
coming and require genius.
Our identification of intellectual mechanisms and our ability
to make machines carry them out will advance. The precise
results cannot be predicted.
In short term problems and in applications, we can make more definite
predictions:
We will lay the groundwork for industrial implementation of
automatic assembly and inspection stations, employing
computer vision and general purpose manipulators.
We will be able to represent in the computer the full
information required for computer solution of problems such
as the travel problem given in Section 1.1, below.
We will be able to check proofs of the correctness of actual
though simple compilers into PDP-10 machine language.
The proposed structure of the Project is given in Figure 1.
Figure 1. Structure of the Stanford Artificial
Intelligence Project
Principal Investigator: John McCarthy
|
______________________|______________________
| | | |
| | | |
_______|_______ | | _______|_______
| | | | | Mathematical |
| Robotics | | | | Theory of |
| | | | | Computation |
|_______________| | | |_______________|
Feldman,Binford, | | McCarthy,Weyhrauch,
Paul, Quam | | Luckham
| |
| |
_______|_______ _______|_______
| | | |
|Representation | | Automatic |
| Theory | | Deduction |
|_______________| |_______________|
McCarthy Luckham,Green,Allen
1.1 REPRESENTATION THEORY
Current research staff: John McCarthy
An intelligent computer program that understands the physical,
psychological and social world well enough to intelligently solve
problems in it must be able to represent facts about the world in
general, particular situations, the purposes of others, and the laws
that determine the effects of actions. In our opinion, this is one
of the main problems that has to be solved before artificial
intelligence is a reality. John McCarthy has been working on the
representation problem for a number of years and will continue this
work. Present and proposed efforts are based on the use of sentences
in first order logic and set theory as the primary means of
representing facts. The meaning of other data structures and of
programs is described by such sentences. The work is facilitated by
the existence of a reasonably convenient proof checking program for
this logic and substantial bases of axioms in machinable form. From
various points of view, Luckham, Feldman, Weyhrauch and Newey are
also concerned with the representation problem.
To make this concrete, consider the problem of travelling to a
foreign city. The following kinds of information have to be
represented. (1) The general facts about airplanes, cars, and
walking and the effects of actions using them. (2) The location and
format of information of various kinds, e.g. airline schedules in the
airline guide and in the knowledge of travel agents. (3) The facts
about what other people must do or refrain from doing in order to
travel, e.g. the ticket agent must give you a ticket, which he will do
in exchange for money, in order that the gate agent will refrain from
preventing you from getting on the plane. (4) There is also
information about what need not be known in
advance because it will be available when needed, such as what gate a
plane at the second stage of the journey will leave from and how to
get to the gate. The test of whether this information has been
successfully put into the computer can be made by having a proof
checker check a proof that a given strategy for making a journey will
work barring accidents, strikes, and other acts of God or government.
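
To give the flavor of such a representation, here is a minimal situation-calculus-style sketch of fact (1); it is our illustration, not part of the Project's axiom base, and the predicate and function names (at, flight, hasticket, board, result) are invented for the example:

    \forall p \, \forall f \, \forall x \, \forall y \, \forall s \;
      [\, at(p, x, s) \land flight(f, x, y) \land hasticket(p, f, s)
          \;\supset\; at(p, y, result(board(p, f), s)) \,]

Here result(a, s) denotes the situation that follows performing action a in situation s; a proof that a travel strategy works is then a proof, from axioms of this kind, that the traveller is at the destination in the resulting situation.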
In our opinion, understanding the kinds of information required to
achieve a class of goals is necessary before it is reasonable to have
question answerers or natural-language-understanders solve such
problems or converse in natural language about them.
In the next contract period the Stanford Artificial Intelligence
Laboratory proposes to increase our ability to represent information
in computers. In particular, we plan to use the above-mentioned
travel problem as the basis for our work.
1.2 ROBOTICS
Current research staff: Jerome Feldman, Tom Binford, Lou Paul, Lynn
Quam, Bruce Anderson, Craig Cook, Randy Davis, Kicha Ganapathy,
Gunnar Grape, Ram Nevatia, Richard Orban, Karl Pingle, Vic Scheinman,
Yoram Yakimovsky.
During the past several years our vision effort has shifted from
simple scenes of polyhedra to complex objects in richer context. Not
all problems with towers of polyhedra have been solved, but there are
a host of reasons for moving on. Manipulation has also proceeded
from manipulation of blocks to assembly operations for many of the
same reasons. Minsky and Papert [Minsky 1972] have described the
central issues and the intellectual position of robotics in
artificial intelligence. We refer to that discussion as background to
our motivation to direct research to real world scenes and objects.
An important motivation is to develop applications for manipulation
and visual feedback capabilities which have been developed here. The
time is ripe for the first transfer of robotics technology from
laboratory to prototype assembly stations. We discuss robotics
applications more fully in another section.
Another major motivation is to face new issues; systems have moved a
few steps from hierarchy toward heterarchy, from a
succession of homogeneous passes with compounding errors to a system
structure where high and low levels interact closely. One aspect of
such a system is the use of many functionally independent techniques;
this has meant experimentation with color and texture,
depth information, motion parallax, and image prediction. The choice
of problems as paradigms is central to the way in which perceptual
systems evolve. A great deal can be done with simple scenes of
polyhedra using a single image in one color, ignoring all other
perceptual abilities. Since most perceptual problems can be resolved
within a single image, one approach is to exhaust those
possibilities. Important areas are missed; for example, visual
feedback is clumsy without depth information. For our plans, visual
feedback for manipulation is primary.
There are other ways in which the tower of blocks paradigm is
inadequate for our purposes. Low level feature description operators
are tailored for this case. Adequate region and edge-oriented
descriptors must be evolved. With polyhedra, only simple preliminary
grouping mechanisms have been programmed. More intermediate level
descriptors must be implemented.
Although real world scenes are much more complicated, they may be
simpler to perceive in that they have a much richer context. There
is only very general knowledge available in the usual blocks task;
usually color, size, and support are unspecified. In an assembly
task, sizes are known, color is used to code, and parts fit in
special ways. Often recognition can be based on obvious special
features. These tasks are good exercise in developing goal-oriented
visual systems. Although development and utilization of intelligent
robots will progress gradually, the information sciences will impact
industry enormously over a shorter term. An immediate effect is to
streamline information flow in the manufacturing process. Simple
systems can keep track of delayed effects of design decisions, lower
cost of preparation and transmission, and increase accessibility.
Computer controlled scheduling, resource allocation, and
transportation routing are important to most of industrial production
[Anderson 1972]. We see a major contribution from techniques being
pioneered in artificial intelligence laboratories, and we expect such
research to pay off throughout the entire development of
intelligent systems. Enormous efficiency increases will come from
analysis of assembly and inspection in terms of some "primitive
operations". In the past, computer cost and lack of sophistication
has limited the ability for such analysis, and the slow pace of
automation has not pressed hard for analysis. Now it will be
possible to find common fabrication and assembly operations
throughout a large company, to discover commonality in design and
components. The success of that analysis depends upon representation
of fabrication operations and shape. Representation is a current
concern in robotics labs and may have early applications. This
orientation to the manufacturing system is already beginning to have
effect with the "group technology" approach, which attempts to
describe parts and operations, and group them together in one part of
the factory. Until such analysis and commonality of operations are
established, automation will remain much more
difficult.
Considerations are similar for programming automation devices such as
the assembly station we describe. Even if programming is difficult,
there will be considerable application for dexterous assembly
systems. For example, for the military it is preferable to stockpile
programs rather than parts or systems for contingencies. In batch
production, tapes can be saved to cut stockpiling. But to achieve
full effect on industry, robot systems must be easy to program for
new tasks. Our current work is concerned with strategy generation
and automating visual feedback. Time-sharing systems will be
necessary to perform scheduling, resource allocation, routing
operations, etc. Thus there is a natural division of labor between
the time-sharing system and small machines. The strategy generation,
feedback automation, and interactive system for computer-aided
mechanical design are expected to reside in the central system.
A major part of our effort is directed toward the representation of
complex shapes, for use in vision and strategies. While it might
seem that this effort has only distant applications, for the reasons
discussed above, early applications will be for convenient
man/machine interaction for programming robots and for
systematization. We can expect that in general, much of the work in
symbolic representation of the real world will have early payoff.
Present non-intelligent robots are programmed by specifying points
along a path to be repeated. With computer control, optional
trajectories can be executed in case of contingency. The engineer
must then specify what normal function is, how to monitor it, and how
to recognize recoverable situations. To do so, he can specify this
symbolically in interaction with a program in which he specifies
normal and abnormal conditions, shape of parts (usually some variant
of standard parts), the mating of parts for assembly, and schematic
descriptions of operations for insertion and inspection. The
significance of representations in general is to aid these processes.
Our representation provides a natural way of speaking about the shape
of objects, in terms of a structured description of part/whole with
primitives which resemble many fabrication operations. The better
the representation, the more it provides intuition for programs as to
the intent of operations and the function of parts.
A representation aids communication in both directions. We contend
that our representation is important for computer graphics. This
allows display of the machine's concepts to the user, and allows
greater use of predictive mechanisms based on internal models. Input
could be made visually with a depth measuring system such as we have
been using for the last two years. Since the operation would be done
on a central facility, the computer would be adequate for this task.
We expect that the representation will be of considerable use in
monocular vision also. We hope to substantiate these predictions.
Another application is automating design of numerically machined
parts. That will be simplified to the extent that the machine and
the user speak the same language of subparts and their descriptions.
Benefits would come from ease of programming and interaction,
variable tolerances for critical parts, and better monitoring of
operations.
1.2.1 The Pilot Assembly Station
We seek other funds to partially support application of robotics
techniques to industrial assembly and inspection. We will implement
over the next few years a prototype assembly station and a support
system for programming the assembly station for its tasks.
The assembly station is assumed to be normally connected to a large
time-shared computer facility. Not only does this appear to be a good
design for our problem, but it is also necessary for an integrated
programmed automation facility. The only emphasized statement in the
ARPA study report [Anderson 1972] is: "The maximum impact of
computers in manufacturing will come from complete, real-time
computer cognizance and control of all processes and resources,
allowing the precise scheduling and allocation of these resources."
Each assembly station will consist of a small digital computer, one
or more manipulators and cameras. Manipulators and cameras will be
combined as modules in systems to suit a given task, and will be
digitally controlled by the station computer.
The program in the station computer will be flexible as to the number
and position of manipulators and cameras. It will control input from
the cameras, provide for some pre-processing of the picture and the
application of simple tests to the visual data. In the case of
manipulators it will provide for simultaneous servo control. It will
also process other inputs, such as from touch sensors on the
manipulators, from photocell interrupt devices, limit switches, etc.
Both analog and binary outputs will be available for the control of
other devices.
Each station computer will have access to a task program specifying
the actions and tests it should perform to carry out its task. This
task program will be compiled by the central computer. Compilation
will consist of the analysis of predicted scenes in order to specify
simple tests that the station computer can carry out, the calculation
of collision-free trajectories for the manipulators, the sequence of
operations, and simple contingency plans, etc. We are planning for
tasks whose normal execution does not require detailed scene
analysis. If scene analysis becomes necessary (e.g. a dropped part)
we envision the station sending an image to the central computer for
processing.
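
The division of labor just described can be made concrete with a small sketch. The following is purely illustrative modern code (Python); the names Step, run_task, and interrupt_central are ours and do not describe the software actually proposed:

    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class Step:
        action: Callable[[], None]            # e.g. run a precomputed, collision-free trajectory
        test: Callable[[], bool]              # simple visual or touch test the station can carry out
        contingency: Optional["Step"] = None  # precompiled contingency plan, if any

    def run_task(steps, interrupt_central):
        """Execute a compiled task program on the station computer."""
        for step in steps:
            step.action()
            if step.test():
                continue                      # normal execution needs no scene analysis
            if step.contingency is not None:
                step.contingency.action()     # try the precompiled contingency plan
                if step.contingency.test():
                    continue
            interrupt_central(step)           # e.g. send an image to the central computer

The point of the sketch is the split of responsibilities: the central computer compiles the steps, tests, and contingency plans; the station computer merely executes them and escalates anything it cannot handle.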
The central computer will perform overall control of the individual
assembly stations and the relative rates of production. If at any
station a planned test fails for which no contingency plan
exists, or unexpected sensor signals are received, the central
computer will also be interrupted and appropriate actions taken.
Most of the hardware required for the assembly station has been
developed for the laboratory. Manipulators and cameras will be
discussed in detail below. The major additional problem is to refine
the existing designs to meet the demands of an industrial
environment.
The most difficult part of the assembly station, and the heart of
this proposal, is a demonstration of the feasibility of the station.
We plan to do this by developing programs which actually assemble and
inspect a variety of fairly complex manufactured items.
A typical example is our current project, assembly of an automobile
water pump. The major parts are to be located; the screws are
supplied by a feeder, as is standard. Locating the body and cover of
the pump requires a gross localization, then a fine localization of
the screw holes. Touch provides final accuracy and independent
coordination of the hand. The placing of alignment pins within screw
holes could be monitored visually, although in this case the previous
actions provide adequate accuracy. Placing of the cover over
alignment pins can be carried out without vision. Improved force
sensing now under development will be useful for that operation.
As we move to more difficult tasks, the sophistication required goes
up quite rapidly. The current manipulation routines seem to be
expandable to cover anticipated problems, especially since we intend
to employ special purpose manipulators (tools) for some tasks. The
main part of our effort will be vision for assembly; an assembly task
will be performed using vision, touch, and two hands. The vision
effort will be integrated with other abilities, and while no great
generality will be stressed, new modules will be incorporated into
the system, and a set of techniques will evolve. These will depend
heavily on techniques similar to those of our recent studies in
visual feedback and hand/eye coordination [Gill 1972]. The internal
model of the world will be used to predict visual appearance of
selected areas of the scene, and a set of verification procedures
will confirm or contradict the predictions. Many of the operations
will be connected with supplying feedback to manipulation, for
alignment and insertion. These visual procedures will have models for
the parts and approximate locations.
This and other visual tasks connected with assembly will initially be
performed as special case problems. In addition, to make these
techniques available, we will pursue development of strategy
programs. This step is essential if intelligent assembly devices are
to be conveniently programmed for complex tasks. Within the division
of labor between large and small machines, this facility must reside
in the central time-sharing facility which has large memory, file
storage, and problem-solving languages. The system must have a model
of the parts for assembly, and a sequence of instructions for
assembly. It will be necessary to have a representation of shape and
models for usual assembly operations such as insertion and alignment,
and the ability to solve simple problems in connection with these
operations, all this to enable convenient communication with the
user. When these facilities exist, new features and
operations can be incorporated smoothly, so that users can call on a
growing library of planning modules. A planning effort
is a cycle of machine planning and man-machine interaction which
results in a plan network for the minicomputer that controls
individual machines.
1.2.2 Research in Vision
Our past work on visual feedback [Gill 1972] in stacking objects and
inserting objects in holes is directly relevant to vision for
assembly. The system provides feedback for picking up an object; then, at the
final position, with the hand still holding the block and in view, it
predicts the appearance of edges and corners within a small window
around the faces to be aligned, and uses a corner/edge finder to
determine alignment error. Incremental motion capability of the arm
is used for error correction. That work incorporates self-
calibration to coordinate hand and eye coordinate systems and
maintain reliability (by maintaining calibration), as well as to
limit search by accurately predicting expected features. Two well-
developed systems exist to aid future work in visual feedback.
GEOMED [Baumgart 1972a] is a geometric editor which allows the input
of geometric models and manipulation of these models. OCCULT
[Baumgart 1972b] is a hidden line elimination program which gives a
symbolic output, valuable for vision, in a few seconds.
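
The predict-verify-correct cycle described above can be summarized in a short sketch. This is our reconstruction in modern code, not the Laboratory's programs; the helper routines passed as parameters (predict_window, predict_edges, find_edges, alignment_error, correction_from) and the arm and camera interfaces are assumed for illustration:

    def visual_feedback_align(model, arm, camera,
                              predict_window, predict_edges, find_edges,
                              alignment_error, correction_from,
                              tolerance=0.5, max_iterations=10):
        """Align a held part with its target using predicted edges and
        incremental arm motions."""
        for _ in range(max_iterations):
            window = predict_window(model, arm.pose())      # small window around the faces to align
            expected = predict_edges(model, arm.pose(), window)
            observed = find_edges(camera.image(), window)   # corner/edge finder on the window
            dx, dy = alignment_error(expected, observed)    # error in image coordinates
            if abs(dx) < tolerance and abs(dy) < tolerance:
                return True                                 # aligned within tolerance
            arm.incremental_move(correction_from(dx, dy))   # small corrective motion of the arm
        return False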
We have a set of visual operations which provide confirmation of
predictions and direct interaction with the visual image. For
outdoor scenes, color is an important basis for region forming, and
for preliminary scene organization. Several preliminary color region
analysis programs have been developed [Bajcsy 1972]. These will be
extended as described below. An edge operator [Hueckel 1969,1972] and
a region-based corner/edge finder [Gill 1972] both operate on small
windows of the scene, as directed by a higher level. An edge
verifier is used to confirm or reject edges predicted from context
[Tenenbaum 1970]. Color identification has long been demonstrated.
Recently, a program has been implemented based on Land's Retinex
model of color constancy in vision. This involves taking intensity
measurements with the camera through red, green, and blue filters, and
setting up a calibration scheme for the Goethe color triangle in the
absence of any external color standard (white). This is achieved by
normalizing the intensities of regions within the scene through each
filter, and defining a color triangle derived from this ordering of
the data, with the center of the triangle designated as white. The
regions may then be assigned color names in the normal way. The
program as it stands can cope reasonably well with table top scenes
under varied lighting conditions - sunlight, quartz-halogen, and
fluorescent - and therefore shows the same kind of color constancy as
do humans.
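
A minimal sketch of the normalization step, in modern terms, is given below; it is our reconstruction of the idea (numpy arrays of per-region mean intensities through the three filters), not the program itself:

    import numpy as np

    def color_triangle_coordinates(region_intensities):
        """region_intensities: shape (n_regions, 3), mean intensity of each
        region measured through the red, green, and blue filters."""
        # Scale each filter channel by its maximum over all regions, so that
        # no external white standard is needed.
        scaled = region_intensities / region_intensities.max(axis=0)
        # Chromaticity coordinates: each region's position in the color
        # triangle; regions near the center (1/3, 1/3, 1/3) are called white.
        return scaled / scaled.sum(axis=1, keepdims=True)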
We have directed part of our work to analysis of outdoor scenes.
Outdoor scenes were analyzed with a combination of color and texture
region analysis [Bajcsy 1972]; depth cues were taken from the
gradient of texture. Symbolic representations of texture and of the
world were adopted; descriptors of texture elements were obtained as
analytic expressions from the Fourier spectrum. A considerable
improvement could be made by combining these techniques with several
techniques which obtain depth information from correlation of images.
Motion parallax has been used with a succession of images. The
successive images differ little from one another and allow local
correlation without ambiguity. In a sense, this combines the
advantages of narrow and wide angle stereo, since stereo ambiguity is
avoided, while large baselines can be accumulated by tracking regions
over several images. Another system incorporates several facilities
in wide angle stereo [Quam 1972]; local statistics are used to limit
search for correspondence of small areas in two images. An
interesting feature is self-orientation: given two cameras whose
orientation is unknown, the system uses guided search to establish a
few correspondences, then determines the relative orientation of the
two cameras. Such self-orientation and self-calibration facilities
have considerable significance for industrial systems, which can thus
be kept in calibration. These are complementary depth sensing
techniques; motion parallax can be used for planning paths and
characterizing large scale terrain features. Two camera stereo can
be used for local manipulation and disaster avoidance. Visual
feedback with stereo vision is potentially more powerful than
without. We have a program which has demonstrated stereo visual
feedback.
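
All of these depth techniques rest on finding corresponding small areas in two images. A minimal sketch of one such match by normalized cross-correlation follows; it is our illustration with assumed numpy image arrays, not the programs of [Quam 1972]:

    import numpy as np

    def best_match(patch, strip, search_columns):
        """Find the column of strip (same height as patch) whose window
        correlates best with patch, by normalized cross-correlation."""
        w = patch.shape[1]
        a = patch - patch.mean()
        best_col, best_score = None, -np.inf
        for x in search_columns:
            window = strip[:, x:x + w]
            if window.shape != patch.shape:
                continue                      # window runs off the edge of the strip
            b = window - window.mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum()) + 1e-9
            score = float((a * b).sum() / denom)
            if score > best_score:
                best_col, best_score = x, score
        return best_col, best_score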
A thesis by Agin [Agin 1972] showed considerable progress in
representation and description of complex objects. He implemented a
laser ranging device [Binford 1970] which was operational two years
ago. Using depth data and a representation of shape [Binford 1971]
which we have developed, we have been able to describe the parts of
objects like a doll, snake, and glove. The representation is based
on segmenting objects into parts connected at joints. Parts are
described by a volume representation with arbitrary cross section
defined along a space curve. The data were analyzed by techniques
suggested by the representation. Preliminary segmentation was made
into parts suggested by some primitives of the representation,
smoothly varying, locally cylindrical parts. An iterative procedure
defined an approximation to the axis of the segment, fit circular
cross sections normal to the axis, then obtained an estimate of a
global cross section function which was then used to extend the axis
and cross section as far as consistent.
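
A hypothetical data-structure sketch of this part/whole representation is shown below; the field names are ours, chosen only to make the structure explicit, and do not describe Agin's programs:

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class CrossSection:
        arc_length: float                       # position along the axis space curve
        radius: float                           # circular cross section fit normal to the axis

    @dataclass
    class Part:
        axis: List[Tuple[float, float, float]]  # sampled 3-D points of the axis
        sections: List[CrossSection]            # cross sections swept along the axis

    @dataclass
    class ObjectModel:
        parts: List[Part]                       # smoothly varying, locally cylindrical parts
        joints: List[Tuple[int, int]]           # indices of parts connected at joints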
A study of visual and depth ranging techniques has been carried out
which provides a background for choosing sensors and interface, and
making cost estimates [Earnest 1967, Binford 1972]. An interesting
result is that the cost of a laser ranging system such as ours is
less than $1k given a TV system. We have maintained a calibrated hand
and vision system for several years [Sobel 1970], and have developed
self-calibration techniques with one and two eyes [Quam 1972, Gill
1972].
A system for understanding scenes of blocks from single views has
been implemented [Falk 1971]; it segments objects, matches against
internal models, predicts appearance from its assumptions, and
searches for verification and contradiction based on the predicted
image. Support and spatial relations are also obtained. There has
been past work on programs to obtain line drawings as input to
recognition systems. An earlier system was based on limited context
of expecting continuation of edges at vertices and predicting missing
edges in certain expected configurations [Grape 1970]. A more recent
approach has been to use models within the process of generating line
drawings, so that the most reasonable next guess can be made and
context used intrinsically in the formation of line drawings.
VISUAL DEVELOPMENT FOR ASSEMBLY
In a later section, we describe assembly of a pump carried out with
the manipulator using touch, without vision. We are implementing
vision programs to aid the assembly. In the pump assembly, visual
tasks consist of gross location of large parts, then using the
information to predict location of images of holes, and monitoring of
insertion into holes. These tasks are typical of the simple visual
operations which we will implement or adapt for assembly. We will
greatly expand our efforts with visual feedback. At present, visual
feedback techniques are limited to polyhedra. We will improve image
prediction techniques to predict images of objects with curved edges.
We will develop a simple region analysis program which assumes
adequate contrast to locate the curved edges predicted within a small
window. Color will be used when necessary. We will implement two-
dimensional matching of boundary curves with predicted curves. These
will be combined with our existing edge operator and corner/line
operator. We will maintain a calibrated system and improve
calibration updating so that reliability is maintained at a high
level.
PROPOSED RESEARCH
We will improve the laser hardware so that it can be used widely and
extend our work with representation of complex objects using depth
data. We are able to make complete descriptions of objects which
consist of a single primitive segment, such as a torus, a snake or a
cone. The programs can segment dolls and toy horses in intuitively
natural ways [Agin 1972]. It is necessary to improve our segmentation
process in order to proceed to joining primitive segments at joints
and make complete descriptions of complex objects. When we have
these descriptions, the computer will have symbolic internal models
of objects. No recognition has yet been done, although we are in a
position to recognize those objects which have only one primitive
segment, for which we can make complete descriptions. For recognition
of more complex objects, once we have complete descriptions, we will
encode the basic properties of the representation within a visual
memory scheme which allows access of hypothesized similar three-
dimensional shapes, with associated surface information, to compare
more closely with extensive verification. We intend to apply large
parts of our programs for description and recognition of curved
objects using only monocular intensity information. This is a much
harder problem, but within our representation, shares much of the
same code, and relies on the same conceptual basis of local three-
dimensional models.
We will continue work on color region analysis which is based on
proximity rather than contiguity. This was the plan outlined in
previous proposals, and has been shown necessary by our work with textured
scenes [Bajcsy 1972]. Another effort combines region analysis with a
decision theoretic approach based on a world model. We anticipate
that these modules will aid the integrated system to function in more
complex environments than have been attempted in the past. There is
a growing effort on navigation of vehicles in outdoor scenes. While
this is not directly relevant to the industrial environment, it does
reflect the growing involvement in complex visual tasks, and will
lead to further development of an integrated visual system.
A strategy project is underway dealing with the world of blocks. The
project begins from the idea that robot vision cannot be studied
except as a part of intelligent behavior in general. Visual ability
is FOR something, and without reasonable expectations about the
world, the task of seeing is bound to be extremely difficult. We
plan to implement a system which can do simple manipulation tasks
("pick up the red block", "put all the cubes in the box") in an
environment of plane-faced objects, and operating in a dialogue mode.
The system will have a good but imperfect world model all the time,
and the essence of its ability will be the constant use of this model
to understand situations and in particular to recover from errors
("Is that block no longer there because the arm knocked it off?").
The main point of the project is to find a good organization for
knowledge about the simple world in question: we expect that when
this is done the complex subroutines of current vision programs
(line finders, etc.) will be found to be unnecessarily complex in that
they embody, in an inefficient way at a low level, knowledge which our
system will contain at a higher level. In part, this will be done by
expanding the facilities available. A stereo vision module is being
programmed, using line drawings and integrating stereo with the
segmentation and recognition processes.
We are at work on a program which exploits new features of SAIL
[Feldman 1972] for dealing with multi-process tasks in a problem-
solving environment. The aim is to construct a driver program to
accomplish a simple assembly task under somewhat uncertain
conditions. Making use of the new language features, one can arrange
for such a program to cope with errors in the execution of tasks -
e.g. to recover from dropping or breaking things, and further to
try several different but possibly cooperative methods for solving a
problem. It is intended that this program will have its own vision
procedures specially adapted for finding tools, e.g. wrenches,
screwdrivers, etc., and parts, e.g. nuts and bolts. An interpreter is
being developed to allow flexible description of the task by the
user.
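
A hedged sketch of that control structure, in modern code rather than SAIL, is given below; the exception names and the recovery interface are invented for illustration:

    class DroppedPart(Exception): pass
    class ToolNotFound(Exception): pass

    def attempt(methods, recover):
        """Try several different but possibly cooperative methods in turn,
        calling a recovery routine (e.g. visually re-locating a dropped
        part) after each failure."""
        last_failure = None
        for method in methods:
            try:
                return method()
            except (DroppedPart, ToolNotFound) as failure:
                recover(failure)
                last_failure = failure
        raise RuntimeError("all methods failed") from last_failure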
1.2.3 Manipulator Development
Manipulator research began at Stanford Artificial Intelligence
Laboratory in 1965 with the acquisition and programming of a
prototype Rancho Orthotic Arm. A later version of this arm was used
in vision directed block stacking programs.
Our present arm, the result of a continuing design project, replaced
the Rancho Arm in 1970. This arm, like its predecessor, was controlled
by a digital servo program running on the time-shared computer.
The new arm was used in the "Instant Insanity" demonstration in 1971.
A second arm of the same type, with only minor modifications, is
currently being interfaced to the computer. Both arms will share a
common work space and will be able to be used individually, or
together for two handed tasks.
Manipulator research moves ahead simultaneously on three fronts:
control, primitives and devices. Tasks are translated into an
existing hand language and where the task cannot be accomplished in
terms of existing primitives, new primitives are formulated in a
consistent manner. When a new primitive is formulated, real time
routines are developed to perform the action. This also leads, in
many cases, to the development of new hardware. Also as new
developments occur in hardware design, existing real time routines
are either re-written or modified.
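
Purely for illustration, a screw-insertion step expressed as a sequence of such primitives might read as follows; the primitive names and the arm interface are invented, not taken from the Laboratory's hand language:

    def insert_screw(arm, feeder_position, hole_position):
        """Fetch a screw from the feeder and drive it into a located hole."""
        arm.open_hand()
        arm.move_to(feeder_position)              # screws are supplied by a feeder
        arm.close_hand()
        arm.move_above(hole_position)
        arm.touch_down(hole_position)             # touch gives the final accuracy
        arm.turn(revolutions=4, max_torque=0.5)   # force-controlled screwing primitive
        arm.open_hand()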
Research is now concentrating on the following areas:
Strain gauge instrumentation of the wrist, in order to allow for more
sensitive interaction of the hand with a task. High resolution, high
sensitivity touch sensors, to increase tactile recognition and
control abilities. These developments will lead in turn to new
control concepts.
Development of tools for the hand. These tools can be instrumented
such that the computer can feel and/or see at the end of the tool.
This should be a very important area of research as it goes beyond
present human capabilities.
Two handed arm control, the planning and execution of tasks requiring
the coordinated motion of both arms. This will require an extension
of the existing servo program and the analysis of synchronization
problems. The development of collision avoiders, so that arms will
not collide with other objects or with each other.
Modification of the existing arm design to provide for better
incremental motions, more degrees of freedom and a more rugged design
suitable for industrial use. Development of arm and eye computer
interfaces. These contain synchronized analog/digital converters for
input and current controlled drivers for output.
The development of new hands is expected as more complex tasks are
attempted. At present, with only one hand, we have been performing
the positioning type of hand actions. When the second arm is
operational we will be able to perform holding and pulling actions as
well.
DEMONSTRATIONS OF MANIPULATION
Operations controlled by force have been demonstrated by turning a
crank and screwing a nut onto a bolt. The hand language and
manipulation facilities are shown in the assembly of a small pump.
The positions of the parts are known approximately; they are located
on a pallet as expected in industrial practice. By touch, the hand
locates position and orientation of the pump body more accurately.
Aligning pins are inserted into the holes to guide the gasket and the
pump cover. Screws are obtained from a feeder, as is standard. All
operations have been performed in the assembly, but the total
reliability is still sufficiently low that one error or another
occurs. Vision has not been used in the assembly.
1.2.4 System Requirements for Advanced Automation
A key element in the development of any large programming effort is
the software support. There has been continual effort along these
lines at the Stanford Artificial Intelligence Laboratory and many
useful ideas have been developed. We propose to extend the research
along two separate lines: support for research and system software
for operating automated systems. We will briefly describe our past
efforts in this area and what we propose to undertake.
The Stanford Artificial Intelligence Laboratory has pioneered in
several areas of system programming, including display-oriented
timesharing systems, interactive graphics, and programming languages.
A great deal of additional effort has gone into direct support of
hand-eye research. Very early, we established a set of standard
formats and library routines [Pingle 1972a]. A programming language,
SAIL, was implemented in 1968. The language incorporates special
features for large real-world-oriented problem solving systems and is
currently in use at a number of Artificial Intelligence and other
laboratories.
As we began to assemble larger and more sophisticated collections of
programs, we felt a need for flexible mechanisms for cooperating
sequential processes. The resulting monitor and language features
[Feldman & Sproull 1971] have worked well for us and have been
influential on other projects. Recently, we have added an additional
set of features to SAIL [Feldman 1972] to facilitate non-
deterministic and event-directed programming. The early reactions to
this work, both within the laboratory and without, have been very
encouraging.
A major share of the programming research effort will be concentrated
on developing a programming system for automated assembly. This will
combine ideas from our past work on languages, representation,
computer graphics and arm control. We view this task as a branch of
the larger effort on automatic programming being undertaken in this
laboratory and elsewhere.
1.2.5 References
[Anderson 1972] R. Anderson, "Programmable Automation, The Future of
Computers in Manufacturing", Datamation, Dec 1972.
[Agin 1972] G. Agin, "Description and Representation of Curved
Objects", Ph.D. Thesis, Computer Science Dept., Stanford
University, September 1972.
[Bajcsy 1972] R. Bajcsy, "Computer Identification of Visual Texture",
Computer Science Department, Stanford University, 1972.
[Baumgart 1972a] B. Baumgart, "GEOMED - A Geometric Editor", SAILON-68,
Artificial Intelligence Lab, Stanford University, May 1972.
[Baumgart 1972b] B. Baumgart, "Winged Edge Polyhedron Representation",
AIM-179, Artificial Intelligence Lab, Stanford University, October 1972.
[Binford 1971] T. Binford, "Visual Perception by Computer", invited paper,
IEEE Systems Science and Cybernetics Conference, Miami, December 1971.
[Binford 1972] T. Binford, "Sensor Systems for Manipulation", Conference on
Remotely Manned Systems, JPL, Pasadena, Calif., September 1972.
[Earnest 1967] L. D. Earnest, "Choosing an Eye for a Computer", Memo
AIM-51, Artificial Intelligence Lab, Stanford University, 1967.
[Falk 1971] G. Falk, "Scene Analysis Based on Imperfect Edge Data",
Proc. 2IJCAI, Brit. Comp. Soc., Sept. 1971.
[Feldman & Sproull 1971] J. Feldman and R. Sproull, "System Support
for the Stanford Hand-eye System", Proc. 2IJCAI, Brit. Comp.
Soc., Sept. 1971.
[Feldman 1972] J. Feldman, et al., "Recent Developments in SAIL - An
Algol Based Language for Artificial Intelligence", CS 308, Stanford
University, August 1972. To appear in Proc. Fall Joint Computer
Conference, 1972.
[Gill 1972] A. Gill, "Visual Feedback and Related Problems in Computer
Controlled Hand-eye Coordination", AIM-178, Stanford AI
Laboratory, October 1972.
[Grape 1969] G. Grape, "Computer Vision through Sequential
Abstraction", Artificial Intelligence Laboratory, Internal
Report, 1969.
[Grape 1970] G. Grape, "On Predicting and Verifying Missing Elements
in Line-Drawings", Artificial Intelligence Laboratory,
Internal Report, March 1970.
[Hueckel 1969] M.H. Hueckel, "An Operator Which Locates Edges in
Digitized Pictures", Stanford Artificial Intelligence
Laboratory AIM-105, December 1969, JACM, Vol. 18, No. 1,
January 1971.
[Minsky 1972] M. Minsky, "Artificial Intelligence Progress Report",
AI Memo 252, MIT AI Lab, 1972.
[Paul 1971] R. Paul, "Trajectory Control of a Computer Arm", 2nd
International Joint Conference on Artificial Intelligence,
1971.
[Paul 1972] R. Paul, "Modelling, Trajectory Calculation and Servoing
of a Computer Controlled Arm", Ph.D. Thesis, Computer
Science Dept., Stanford University, Sept. 1972.
[Pingle 1972a] Karl K. Pingle, "Hand/Eye Library", Stanford
Artificial Intelligence Laboratory Operating Note 35.1,
January, 1972.
[Pingle 1972b] K.K. Pingle and J.M. Tenenbaum, "An Accommodating Edge
Follower", IJCAI-2. Proceedings of the 2nd. International
Symposium on Industrial Robots, May 1972, Chicago, Illinois.
[Quam 1972] L. H. Quam, S. Liebes, Jr., R. B. Tucker, M. J. Hannah,
B. G. Eross, "Computer Interactive Picture Processing",
Stanford Artificial Intelligence Laboratory AIM-166, April
1972.
[Scheinman 1969] V. D. Scheinman, Design of a Computer Manipulator,
Stanford Artificial Intelligence Project, Memo AIM-92, June
1969.
[Sobel 1970] Irwin Sobel, "Camera Models and Machine Perception",
Stanford Artificial Intelligence Project Memo AIM-121, May,
1970.
[Swinehart 1969] D. Swinehart, R. Sproull, "SAIL", Stanford
Artificial Intelligence Laboratory Operating Note 57,
November 1969.
[Tenenbaum 1970] J. M. Tenenbaum, "Accommodation in Computer Vision",
PhD thesis in Electrical Engineering, September 1970.
1.3 MATHEMATICAL THEORY OF COMPUTATION
Current research staff: John McCarthy, David Luckham, Robin Milner,
Richard Weyhrauch, Malcolm Newey, Whit Diffie.
1.3.1 Recent Accomplishments
Recent accomplishments in mathematical theory of computation have
been as follows.
1. Professor John McCarthy has developed a set of axioms for the
Scott theory of computation, expressed in first order logic and set
theory. These axioms are in a form acceptable to Diffie's first-order-logic proof
checker.
2. Dr. Jean-Marie Cadiou spent the summer further researching the
difference between the "call by value" and "call by name" methods
of evaluating recursive functions. This represents further work
in the direction of his thesis [2].
3. Dr. David Luckham, Dr. Shigeru Igarashi, and Dr. Ralph London
have together worked on the project of producing a running
program for the correct generation of verification conditions for
the language PASCAL. This work builds on an idea of C.A.R. Hoare
[3]. In addition to the verification generator (VG), Dr. Luckham
and J. Morales have used Dr. Luckham's theorem prover to prove
some of the theorems generated by the VG.
Luckham has also supervised the thesis of J. Buchanan [1] in
which the notion of a semantic frame has been used to write
programs guaranteed correct by their method of generation.
4. Dr. Shigeru Igarashi [4] has explored the question of extending
the notion of the "admissibility" of Scott's computation
induction to a wider class of sentences than considered by Scott.
This work was presented at the MTC Symposium in Novosibirsk,
U.S.S.R.
5. Dr. Robin Milner and Dr. Richard Weyhrauch completed their study
of definability in LCF. They produced two papers. In [9] LCF
was used to express the correctness of programs and a machine
checked proof of the correctness of a particular program was given.
In [8] a proof of the correctness of a compiler was presented.
Dr. Milner also presented an expository paper summarizing all
this work at the Symposium at Novosibirsk [7]. Subsequent to
this, work has been done with Dana Scott on a version of
LCF based on the type free λ-calculus. A set of axioms for such
a system has been found.
6. Dr. Robin Milner has been studying the theory of processes using
the type free λ-calculus. This work is still in a preliminary
stage.
7. Dr. Richard Weyhrauch has been looking at formal proofs in
several different languages (first order set theory, first order
logic, LCF) to see if various things learned from LCF carry over
to these languages.
8. Malcolm Newey has produced a set of theorems to be used as a
theorem base for people wanting to use LCF. This includes axioms
for arithmetic, lists and finite sets, as well as about 900
theorems and proofs in these areas. An A.I. Memo on these will
appear soon.
1.3.2 Future Work
First there are people working on clarifying the extent of Floyd's
method of inductive assertions. The main tools for this study are the
embryonic verification generator mentioned above and Dr. Luckham's
theorem prover. To this end Dr. Luckham is planning an interactive
program that uses both. The main task here is to 1) demonstrate that
the formal system suggested by C.A.R. Hoare, when extended to the
full language PASCAL, is correct and usable; 2) program the
verification generator (VG), using Hoare's ideas, to generate
sentences which, if provable, would guarantee the partial correctness
of particular programs; 3) use the theorem prover, perhaps with
interactive help, to prove the sentences generated by the VG. The
work mentioned in 3) is relevant to this project and is presently
continuing. In addition, a new student, Nori Suzuki, is working on
the addition of mechanisms to the VG to help in proving the
termination (or total correctness) of programs.
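
As a worked illustration of the kind of sentence such a verification generator produces (the example is ours, not output of the VG): from the assignment axiom

    \{ P[e/x] \} \; x := e \; \{ P \}

and the annotated fragment

    \{ x = a \} \quad x := x + 1 \quad \{ x = a + 1 \}

the generated verification condition is

    x = a \;\supset\; x + 1 = a + 1,

which the theorem prover discharges by substitution and elementary arithmetic.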
The second project concerns extending the ideas of D. Scott and C.
Strachey. A major part of this research is a computer program, LCF,
which implements the logical calculus, in which the notions of
"function" and "functional" can be discussed. Actually there are now
two such calculi, the original LCF [7,8,9] and the type free logic (5
above). The following projects are either already underway or are
contemplated in the near future. One is a new machine version of LCF.
This is needed for several reasons: 1) the usual run of features that
were left out; 2) most important, a better interface with other programs.
An examination of how to automate proving of assertions in LCF. For
example, could the proofs given in [8,9] have been automated? Drs.
Milner and Weyhrauch have several ideas along these lines and it is
likely that Steve Ness will undertake this problem as a thesis topic.
Malcolm Newey, another student, hopes to actually use LCF to prove
formally the correctness of LCOM0 and LCOM4. An informal proof of
their correctness was presented in [5]. Both of these tasks will
require consideration of how to create subsidiary deduction rules.
There is also the question (reflecting on 6 above) of the
implementation of a version of LCF for the type free logic. Whether
this will be done depends on further theoretical work to show its
necessity.
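To make concrete the kind of object these calculi talk about, the
following minimal Lisp sketch (illustrative only; it is not part of
LCF) shows a recursive definition regarded as the least fixed point
of a functional; computation induction argues about the definition
through its finite approximations.

    ;; FACT is the least fixed point of the functional TAU: applying
    ;; TAU to FACT yields a function that behaves exactly like FACT.
    (defun tau (f)
      (lambda (n) (if (zerop n) 1 (* n (funcall f (- n 1))))))

    (defun fact (n)
      (if (zerop n) 1 (* n (fact (- n 1)))))

    ;; k applications of TAU to an arbitrary starting function already
    ;; agree with FACT on the arguments 0, 1, ..., k-1, e.g.
    ;;   (funcall (tau (tau (tau #'identity))) 2)  =>  2  =  (fact 2)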
An attempt to automate either of the above ways of proving properties
of programs requires one to talk about proving theorems in some
formal language. Another project envisioned is a new first order
logic proof checker which is based on theories of natural deduction,
rather than resolution. One of the as yet unexplored discoveries of
the work of [8,9] was the effect of the physical structure of LCF on
the feasibility of carrying out proofs. Dr. Weyhrauch hopes to
examine this question in terms of natural deduction in general.
These projects are all directed at the more central question: what
is the right notion of correctness and equivalence of programs? In
addition they address themselves to examining the computational
feasibility of automating the procedures suggested by the theoretical
results. It is hoped that these studies will eventually lead to a
clear notion of correctness that lends itself to automatic (though
perhaps interactive) checking.
1.3.3 References
[1] Buchanan, J., Thesis, to appear
[2] Cadiou, J.M., "Recursive definitions of partial functions and
their computations", Stanford Artificial Intelligence Project
AIM-163, March 1972.
[3] Hoare, C.A.R., "An axiomatic basis for computer programming",
Comm. ACM, Vol. 12, pp. 576-580, 583, 1969.
[4] Igarashi, S., "Admissibility of fixed-point induction in first-
order logic of typed theories", Stanford Artificial Intelligence
Project, AIM-168, May 1972.
[5] London, R.L., "Correctness of two compilers for a LISP subset",
Stanford Artificial Intelligence Project, AIM-151, October 1971.
[6] Manna, Z., INTRODUCTION TO MATHEMATICAL THEORY OF COMPUTATION,
Stanford University Computer Science Department, May 1972.
[7] Milner, R. (paper given at Novosibirsk), 1972.
[8] Milner, R. & Weyhrauch, R., "Proving compiler correctness in a
mechanized logic", Machine Intelligence 7, Michie, D. (ed.),
Edinburgh, Edinburgh University Press, 1972.
[9] Weyhrauch, R. & Milner, R., "Program semantics and correctness
in a mechanized logic", First USA-Japan Computer Conference
Proceedings, Tokyo, Hitashi Printing Company, October 1972.
ARTIFICIAL INTELLIGENCE PROJECT
1.4 AUTOMATIC DEDUCTION
Current research staff: David Luckham, John Allen, Jack Buchanan,
Jorge Morales, and Nori Suzuki.
Research over the past year has been directed very strongly towards
developing the applications of automatic proof procedures, especially
in the areas of mathematical theorem-proving, verification of
computer programs, and automatic generation of programs. This work
has involved extensive changes and additions to the theorem-proving
program (references include previous A.I. Project reports and [1, 2,
and 3]), and the development of new programs using special features
of the languages MLISP2 [4] and MICRO-PLANNER [5]. We give some brief
details of current work and future plans below.
Most of these projects have involved the discussion of many problems
(both theoretical and practical) in the mathematical theory of
computation, and this has resulted in a great deal of cooperation
between the two groups of people. In particular, discussions with
Robin Milner and Richard Weyhrauch on their experiments with LCF
helped the program verification project, and Richard Weyhrauch's
experiments in proving theorems in various formal systems speeded the
extension of user facilities in the theorem-prover. Different
projects also turned out to have similar practical problems; e.g. the
simplification algorithms in LCF and in the theorem-prover are
similar, and the data files built up by M. Newey for carrying out
proofs in LCF have a significant overlap with the files of facts used
by J. Morales in the program verification system. As a result of
this extensive common ground of interests and problems, a more
integrated development of the various projects is planned for the
future.
1.4.1 Theorem Proving
The basis of the theorem-proving effort has been the proof procedure
for first-order logic with equality, originally developed by Allen
and Luckham [1]. This has been extensively modified by J. Allen in
recent months; the basic theorem-proving program has been speeded up
by a factor of 20, an input-output language allowing normal
mathematical notation has been added, and the program will select
search strategies automatically if the user wishes (we refer to this
as automatic mode). As a result it is possible for the program to be
used by a person who is totally unfamiliar with the theory of the
Resolution principle and its associated rules of inference and
refinements. This program has been used to obtain proofs of results
from several research announcements in the Notices of the American
Mathematical Society, for example [6, 7, and 8]. Recently (July
1972) J. Morales learned to use the program essentially by using it
to obtain proofs of the results stated in [8, June 1972] as an
exercise. In the course of doing this he was able to formulate simple
ways of using the prover to generate possible new theorems in the
same spirit as [8], and did in fact succeed in extending the results
of [8]. Furthermore, he was able to send the authors proofs of their
results before they had actually had time to write them up [R.H.
Cowen, private correspondence, August 9, 1972]. The prover is also
being used as part of the program verification system (below).
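For the reader unfamiliar with the Resolution principle, the
following toy Lisp sketch (ground clauses only, no unification, and
in no way the project's prover) shows the single inference step on
which the method rests.

    ;; Ground binary resolution: from a clause containing a literal L
    ;; and a clause containing (NOT L), infer the clause made up of the
    ;; remaining literals of both.
    (defun negate (lit)
      (if (and (consp lit) (eq (first lit) 'not))
          (second lit)
          (list 'not lit)))

    (defun resolve-on (lit clause1 clause2)
      (union (remove lit clause1 :test #'equal)
             (remove (negate lit) clause2 :test #'equal)
             :test #'equal))

    ;; Example: resolving {P(a), Q(a)} with {not P(a), R(a)} on P(a):
    ;;   (resolve-on '(p a) '((p a) (q a)) '((not (p a)) (r a)))
    ;;     =>  ((Q A) (R A))      ; the order of literals is immaterial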
At present some modifications in the prover's standard search
strategies are being made by J. Allen. A version of this program
with user documentation is being prepared for distribution to other
research groups. The addition of a language in which the user can
specify his intuition about how a proof of a given statement might
possibly be obtained is in progress. J. Allen has already programmed
a very preliminary version of this "HUNCH" language, and has
completed the systems debugging necessary to get a compiled version
of J. Sussman's CONNIVER language running here; HUNCH may be
implemented in CONNIVER, but discussions on this point are not yet
complete. Initially, it is expected that HUNCH will be
useful in continuing with more difficult applications in first-order
mathematical theories. A research report outlining the results of
experiments made over the past two years is being prepared by J.
Allen and D. Luckham. A report on J. Morales' experiments is
available.
1.4.2 Verification of Computer Programs
This project was undertaken jointly by Shigeru Igarashi, Ralph London
and David Luckham. The aim is to construct a system for verifying
automatically that a computer program is consistent with its
documentation. It was decided to restrict attention to programs
written in N. Wirth's language PASCAL [9] since this was the subject
of an axiomatic study by C.A.R. Hoare [10]. Essentially the
verification system has three main components: (i) a Verification
Condition Generator (VCG), (ii) a simplifier, and (iii) a theorem-
prover. A first version of VCG has been implemented in MLISP2; it is
a fairly straightforward encoding of a set of rules of inference that
is derivation-complete for the system given by Hoare in [10]. VCG
will accept a PASCAL program (including go to's, conditionals,
while's, procedure and function calls with certain restrictions on
the types of the actual parameters of procedure calls, and arrays),
together with its documentation, and will output a set of "lemmas" or
conditions that, if provable, ensure that relative consistency
holds. The simplifier, which is to process the conditions
output by VCG, has not yet been written. Initially, proofs of the
conditions for simple arithmetical programs and for certain sorting
programs involving array operations were obtained by D. Luckham and
J. Allen using the theorem-proving program. A more systematic series
of experiments, in which proofs of verification conditions were
obtained from standard data files using the prover in automatic mode,
has been made by J. Morales.
These experiments are continuing. It is planned to add new rules of
inference to the theorem-prover and to add some simple debugging
strategies for pointing out possible relative inconsistencies. A
more flexible version of VCG is planned which will allow the user to
specify his own rules of inference for an extension of Hoare's
system. A preliminary version of the simplifier is essential if the
whole verification process is to be made fully automatic for any
reasonable class of programs. People currently engaged in this work
include D. Luckham, J. Morales, N. Suzuki, and R. London (the latter
at ISI, University of Southern California). A research report on VCG
is being prepared by S. Igarashi, R. London, and D. Luckham.
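The role intended for the simplifier can be indicated with a minimal
Lisp sketch (illustrative only; as noted above, the actual simplifier
has not yet been written): it rewrites the conditions with elementary
identities before they reach the theorem prover.

    ;; Bottom-up application of two propositional identities,
    ;;   (AND P TRUE) => P     and     (IMPLIES TRUE P) => P,
    ;; of the sort used to shrink verification conditions.
    (defun simplify (wff)
      (if (atom wff)
          wff
          (let ((args (mapcar #'simplify (rest wff))))
            (cond ((and (eq (first wff) 'and) (member 'true args))
                   (let ((others (remove 'true args)))
                     (cond ((null others) 'true)
                           ((null (cdr others)) (first others))
                           (t (cons 'and others)))))
                  ((and (eq (first wff) 'implies) (eq (first args) 'true))
                   (second args))
                  (t (cons (first wff) args))))))

    ;; (simplify '(implies true (and (= x y) true)))  =>  (= X Y)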
1.4.3 Automatic Program Generation
We mention this work briefly here since it is an outgrowth of an
attempt to extend the applicability of the theorem-prover to problems
in Artificial Intelligence, and makes use of a particular notion of a
problem environment (called a "semantic frame") which was designed
for that original purpose. However, the system as it now stands is
best thought of as a heuristic theorem-prover for a subset of Hoare's
logical system. It has been implemented by J. Buchanan in LISP, using
the backtrack features of Micro-Planner. It accepts as input a
problem environment and a problem and, if successful, gives as output
a program for solving the problem. At the moment, the output
programs are composed of the primitive operators of the environment,
assignments, conditional branches, while's, and non-recursive
procedure calls. This system has been used to generate programs for
solving various problems in robot control and conditional planning,
and for computing arithmetical functions. A research
report outlining the logical basis for the system and its overall
capabilities is in preparation by D. Luckham and J. Buchanan. The
details of its implementation will be available in J.R. Buchanan's
Ph.D. Thesis (also in preparation).
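A purely hypothetical illustration of the input/output behaviour
described above is sketched below in Lisp; the operator table and
function names are invented for this sketch and are far simpler than
those of the actual system, which also produces conditionals and
loops.

    ;; Each entry pairs a goal pattern with the primitive step that
    ;; achieves it; a "program" is produced by selecting one step per
    ;; goal (NIL marks a goal no operator achieves).
    (defparameter *operators*
      '(((at robot box)  . (goto box))
        ((holding box)   . (grasp box))))

    (defun generate-program (goals)
      (mapcar (lambda (g) (cdr (assoc g *operators* :test #'equal)))
              goals))

    ;; (generate-program '((at robot box) (holding box)))
    ;;   =>  ((GOTO BOX) (GRASP BOX))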
1.4.4 References
[1] John Allen and David Luckham (1970), An Interactive Theorem-
Proving Program, Proc. Fifth International Machine Intelligence
Workshop, Edinburgh University Press, Edinburgh.
[2] Richard B. Kieburtz and David Luckham, "Compatibility and
Complexity of Refinements of the Resolution Principle", SIAM J.
Comput., Vol. 1, No. 4, December 1972.
[3] David Luckham and Nils J. Nilsson, "Extracting Information from
Resolution Proof Trees", Artificial Intelligence, Vol. 2, No. 1,
pp. 27-54, Spring 1971.
[4] David C. Smith, "MLISP", Computer Science Department, Artificial
Intelligence Memo No. AIM-135, October 1970.
[5] G. Sussman and T. Winograd, Micro-Planner Manual, Project MAC.
[6] Chinthayamma (1969), Sets of Independent Axioms for a Ternary
Boolean Algebra, Notices Amer. Math. Soc., 16, p. 654.
[7] E.L. Marsden, "A Note on Implicative Models", Notices Amer.
Math. Soc. No. 682-02-7, p. 89, January 1971.
[8] Robert H. Cowen, Henry Frisz, Alan Grenadir, "Some New
Axiomatizations in Group Theory", Preliminary Report, Notices
Amer. Math. Soc., No. 72T-112, p. A547, June 1972.
[9] N. Wirth, "The Programming Language PASCAL", Acta Informatica,
Vol. 1, No. 1, pp. 35-63, 1971.
[10] C.A.R. Hoare, "An Axiomatic Basis for Computer Programming",
Comm. ACM, Vol. 12, No. 10, pp. 576-580, 583, October 1969.
ARTIFICIAL INTELLIGENCE PROJECT
1.5 FACILITIES
The computer facilities of the Stanford Artificial Intelligence
Laboratory include the following equipment.
Central Processors: Digital Equipment Corporation PDP-10 and PDP-6
Primary Store: 65K words of 1.7 microsecond DEC Core
65K words of 1 microsecond Ampex Core
131K words of 1.6 microsecond Ampex Core
Swapping Store: Librascope disk (5 million words, 22 million
bits/second transfer rate)
File Store: IBM 3330 disc file, 6 spindles (leased)
Peripherals: 4 DECtape drives, 2 magnetic tape drives
(7 track), line printer, Calcomp plotter,
Xerox Graphics Printer
Terminals: 58 TV displays, 6 III displays, 3 IMLAC
displays, 1 ARDS display, 15 Teletype terminals.
Special Equipment: Audio input and output systems, hand-eye
equipment (3 TV cameras, 3 arms), remote-
controlled cart
The facility operates nearly 24 hours every day. It typically
carries a load of forty-some jobs throughout the day and seldom has
fewer than eight active users all night long.
It is expected that the new high speed processor currently being
fabricated in our laboratory will become the main timesharing
processor during the early part of the proposal period. This should
provide adequate computational support for the proposed research
program.
It will be necessary to purchase some special instrumentation and
peripheral computer equipment in support of our research in computer
vision and manipulation. Thus, $75,000 per year has been budgeted
for capital equipment, based on recent experience.
ARTIFICIAL INTELLIGENCE PROJECT
1.6 PROJECT PUBLICATIONS
1.6.1 Recent Articles and Books
Articles and books by members of the Stanford Artificial Intelligence
Laboratory are listed below.
1. E. Ashcroft and Z. Manna, "Formalization of Properties of Parallel
Programs", Machine Intelligence 6, Edinburgh Univ. Press, 1971.
2. E. Ashcroft and Z. Manna, "The Translation of `Go To' Programs to
`While' Programs", Proc. IFIP Congress 1971.
3. K. M. Colby, S. Weber, F. D. Hilf, "Artificial Paranoia", J. Art.
Int., Vol. 2, No. 1, 1971.
4. G. Falk, "Scene Analysis Based on Imperfect Edge Data", Proc.
2IJCAI, Brit. Comp. Soc., Sept. 1971.
5. J. A. Feldman and A. Biermann, "A Survey of Grammatical
Inference", Proc. International Congress on Pattern Recognition,
Honolulu, January 1971, also in S. Watanabe (ed.), FRONTIERS OF
PATTERN RECOGNITION, Academic Press, 1972.
6. J. Feldman and R. Sproull, "System Support for the Stanford Hand-
eye System", Proc. 2IJCAI, Brit. Comp. Soc., Sept. 1971.
7. J. Feldman, et al, "The Use of Vision and Manipulation to Solve
the `Instant Insanity' Puzzle", Proc. 2IJCAI, Brit. Comp. Soc.,
Sept. 1971.
8. R. W. Floyd, "Toward Interactive Design of Correct Programs",
Proc. IFIP Congress 1971, pp. 1-5.
9. A. Hearn, "Applications of Symbol Manipulation in Theoretical
Physics", Comm. ACM, August 1971.
10. F. Hilf, K. Colby, D. Smith, W. Wittner, W. Hall, "Machine-
Mediated Interviewing", J. Nervous & Mental Disease, Vol. 152,
No. 4, 1971.
11. M. Hueckel, "An Operator which Locates Edges in Digitized
Pictures", JACM, January 1971.
12. M. Kahn and B. Roth, "The Near-minimum-time Control of Open-loop
Articulated Kinematic Chains", Trans. ASME, Sept. 1971.
13. R. Kling, "A Paradigm for Reasoning by Analogy", Proc. 2IJCAI,
Brit. Comp. Soc., Sept. 1971.
14. D. Knuth, "An Empirical Study of FORTRAN Programs", Software --
Practice and Experience, Vol. 1, 105-133, 1971.
15. D. Luckham and N. Nilsson, "Extracting Information from
Resolution Proof Trees", Artificial Intelligence Journal, Vol.
2, No. 1, pp. 27-54, June 1971.
16. Z. Manna (with R. Waldinger), "Toward Automatic Program
Synthesis", Comm. ACM, March 1971.
17. Z. Manna, "Mathematical Theory of Partial Correctness", J. Comp.
& Sys. Sci., June 1971.
18. R. Milner, "An Algebraic Definition of Simulation between
Programs", Proc. 2IJCAI, Brit. Comp. Soc., Sept. 1971.
19. U. Montanari, "On the Optimal Detection of Curves in Noisy
Pictures", Comm. ACM, May 1971.
20. N. Nilsson, "Problem-solving Methods in Artificial Intelligence",
McGraw-Hill, New York, 1971.
21. R. Paul, "Trajectory Control of a Computer Arm", Proc. 2IJCAI,
Brit. Comp. Soc., Sept. 1971.
22. K. Pingle and J. Tenenbaum, "An Accommodating Edge Follower",
Proc. 2IJCAI, Brit. Comp. Soc., Sept. 1971.
23. B. Roth, "Design, Kinematics, and Control of Computer-controlled
Manipulators", Proc. 6th All Union Conference on New Problems in
Theory of Machines & Mechanics, Leningrad, Jan. 1971.
24. R. Schank, "Finding the Conceptual Content and Intention in an
Utterance in Natural Language Conversation", Proc. 2IJCAI, Brit.
Comp. Soc., 1971.
25. J. Tenenbaum, et al, "A Laboratory for Hand-eye Research", Proc.
IFIP Congress, 1971.
26. A. W. Biermann and J. A. Feldman, "On the Synthesis of Finite-
state Machines from Samples of Their Behavior", IEEE
Transactions on Computers, Vol. C-21, No. 6, pp. 592-596, June
1972.
27. A. W. Biermann, "On the Inference of Turing Machines from Sample
Computations", Artificial Intelligence J., Vol. 3, No. 3, Fall
1972.
28. J. M. Cadiou and Z. Manna, "Recursive Definitions of Partial
Functions and their Computations", ACM SIGPLAN Notices, Vol. 7,
No. 1, January 1972.
29. K. M. Colby, F. D. Hilf, S. Weber, and H. C. Kraemer, "Turing-
like Indistinguishability Tests for the Validation of a Computer
Simulation of Paranoid Processes", Artificial Intelligence J.,
Vol. 3, No. 3, Fall 1972.
30. G. Falk, "Interpretation of Imperfect Line Data as a Three-
Dimensional Scene", J. Artificial Intelligence, Vol. 3, No. 2,
1972.
31. J. A. Feldman, "Some Decidability Results on Grammatical
Inference and Complexity", Information and Control, Vol. 20, No.
3, pp. 244-262, April 1972.
32. J. A. Feldman, J. R. Low, D. C. Swinehart, and R. H. Taylor,
"Recent Developments in SAIL, an ALGOL-based language for
Artificial Intelligence", Proc. Fall Joint Computer Conference,
1972.
33. S. J. Garland and D. C. Luckham, "Translating Recursive Schemes
into Program Schemes", ACM SIGPLAN Notices, Vol. 7, No. 1,
January 1972.
34. J. Gips, "A New Reversible Figure", Perceptual & Motor Skills,
34, 306, 1972.
35. Richard B. Kieburtz and David Luckham, "Compatibility and
Complexity of Refinements of the Resolution Principle", SIAM J.
Comput., Vol. 1, No. 4, December 1972.
36. D. E. Knuth, "Ancient Babylonian Algorithms", Comm. ACM, July
1972.
37. R. L. London, "Correctness of a Compiler for a LISP Subset", ACM
SIGPLAN Notices, Vol. 7, No. 1, January 1972.
38. Z. Manna, S. Ness, and J. Vuillemin, "Inductive Methods for
Proving Properties of Programs", ACM SIGPLAN Notices, Vol. 7,
No. 1, January 1972.
39. Z. Manna and J. Vuillemin, "Fixpoint Approach to the Theory of
Computation", Comm. ACM, July 1972.
40. R. Milner, "Implementation and Application of Scott's Logic for
Computable Functions", ACM SIGPLAN NOTICES, Vol. 7, No. 1,
January 1972.
41. James A. Moorer, "Music and Computer Composition", Comm. ACM,
January 1972.
1.6.2 Theses
Theses that have been published by the Stanford Artificial
Intelligence Laboratory are listed here. Several earned degrees at
institutions other than Stanford, as noted.
AIM-43, R. Reddy, AN APPROACH TO COMPUTER SPEECH RECOGNITION BY
DIRECT ANALYSIS OF THE SPEECH WAVE, Ph.D. Thesis in Computer
Science, September 1966.
AIM-46, S. Persson, SOME SEQUENCE EXTRAPOLATING PROGRAMS: A STUDY OF
REPRESENTATION AND MODELING IN INQUIRING SYSTEMS, Ph.D.
Thesis in Computer Science, University of California,
Berkeley, September 1966.
AIM-47, B. Buchanan, LOGICS OF SCIENTIFIC DISCOVERY, Ph.D. Thesis in
Philosophy, University of California, Berkeley, December
1966.
AIM-44, J. Painter, SEMANTIC CORRECTNESS OF A COMPILER FOR AN ALGOL-
LIKE LANGUAGE, Ph.D. Thesis in Computer Science, March 1967.
AIM-56, W. Wichman, USE OF OPTICAL FEEDBACK IN THE COMPUTER CONTROL
OF AN ARM, Eng. Thesis in Electrical Engineering, August
1967.
AIM-58, M. Callero, AN ADAPTIVE COMMAND AND CONTROL SYSTEM UTILIZING
HEURISTIC LEARNING PROCESSES, Ph.D. Thesis in Operations
Research, December 1967.
AIM-60, D. Kaplan, THE FORMAL THEORETIC ANALYSIS OF STRONG
EQUIVALENCE FOR ELEMENTAL PROPERTIES, Ph.D. Thesis in
Computer Science, July 1968.
AIM-65, B. Huberman, A PROGRAM TO PLAY CHESS END GAMES, Ph.D. Thesis
in Computer Science, August 1968.
AIM-73, D. Pieper, THE KINEMATICS OF MANIPULATORS UNDER COMPUTER
CONTROL, Ph.D. Thesis in Mechanical Engineering, October
1968.
AIM-74, D. Waterman, MACHINE LEARNING OF HEURISTICS, Ph.D. Thesis in
Computer Science, December 1968.
AIM-83, R. Schank, A CONCEPTUAL DEPENDENCY REPRESENTATION FOR A
COMPUTER ORIENTED SEMANTICS, Ph.D. Thesis in Linguistics,
University of Texas, March 1969.
AIM-85, P. Vicens, ASPECTS OF SPEECH RECOGNITION BY COMPUTER, Ph.D.
Thesis in Computer Science, March 1969.
AIM-92, Victor D. Scheinman, DESIGN OF COMPUTER CONTROLLED
MANIPULATOR, Eng. Thesis in Mechanical Engineering, June
1969.
AIM-96, Claude Cordell Green, THE APPLICATION OF THEOREM PROVING TO
QUESTION-ANSWERING SYSTEMS, Ph.D. Thesis in Electrical
Engineering, August 1969.
AIM-98, James J. Horning, A STUDY OF GRAMMATICAL INFERENCE, Ph.D.
Thesis in Computer Science, August 1969.
AIM-106, Michael E. Kahn, THE NEAR-MINIMUM-TIME CONTROL OF OPEN-LOOP
ARTICULATED KINEMATIC CHAINS, Ph.D. Thesis in Mechanical
Engineering, December 1969.
AIM-121, Irwin Sobel, CAMERA MODELS AND MACHINE PERCEPTION, Ph.D.
Thesis in Electrical Engineering, May 1970.
AIM-130, Michael D. Kelly, VISUAL IDENTIFICATION OF PEOPLE BY
COMPUTER, Ph.D. Thesis in Computer Science, July 1970.
AIM-132, Gilbert Falk, COMPUTER INTERPRETATION OF IMPERFECT LINE DATA
AS A THREE-DIMENSIONAL SCENE, Ph.D. Thesis in Electrical
Engineering, August 1970.
AIM-134, Jay Martin Tenenbaum, ACCOMMODATION IN COMPUTER VISION,
Ph.D. Thesis in Electrical Engineering, September 1970.
AIM-144, Lynn H. Quam, COMPUTER COMPARISON OF PICTURES, PhD Thesis in
Computer Science, May 1971.
AIM-147, Robert E. Kling, REASONING BY ANALOGY WITH APPLICATIONS TO
HEURISTIC PROBLEM SOLVING: A CASE STUDY, PhD Thesis in
Computer Science, August 1971.
AIM-149, Rodney Albert Schmidt, Jr., A STUDY OF THE REAL-TIME CONTROL
OF A COMPUTER-DRIVEN VEHICLE, PhD Thesis in Electrical
Engineering, August 1971.
AIM-155, Jonathan Leonard Ryder, HEURISTIC ANALYSIS OF LARGE TREES AS
GENERATED IN THE GAME OF GO, PhD Thesis in Computer Science,
December 1971.
AIM-163, J.M. Cadiou, RECURSIVE DEFINITIONS OF PARTIAL FUNCTIONS AND
THEIR COMPUTATIONS, PhD Thesis in Computer Science, April
1972.
AIM-173, Gerald Jacob Agin, REPRESENTATION AND DESCRIPTION OF CURVED
OBJECTS, PhD Thesis in Computer Science, October 1972.
AIM-174, Morris, Francis Lockwood, CORRECTNESS OF TRANSLATIONS OF
PROGRAMMING LANGUAGES--AN ALGEBRAIC APPROACH, PhD Thesis in
Computer Science, August 1972.
AIM-177, Paul, Richard, MODELLING, TRAJECTORY CALCULATION AND
SERVOING OF A COMPUTER CONTROLLED ARM, PhD Thesis in Computer
Science, (forthcoming)
AIM-178, Gill, Aharon, VISUAL FEEDBACK AND RELATED PROBLEMS IN
COMPUTER CONTROLLED HAND EYE COORDINATION, PhD Thesis in
Electrical Engineering, October 1972.
AIM-180, Bajcsy, Ruzena, COMPUTER IDENTIFICATION OF TEXTURED VISUAL
SCENES, PhD Thesis in Computer Science, October 1972.
1.6.3 Films
Prints of the following film reports are available for loan to
interested groups.
1. Art Eisenson and Gary Feldman, "Ellis D. Kroptechev and Zeus, his
Marvelous Time-Sharing System", 16mm black and white with
sound, 15 minutes, March 1967.
The advantages of time-sharing over standard batch processing are
revealed through the good offices of the Zeus time-sharing system on
a PDP-1 computer. Our hero, Ellis, is saved from a fate worse than
death. Recommended for mature audiences only.
2. Gary Feldman, "Butterfinger", 16mm color with sound, 8 minutes,
March 1968.
Describes the state of the hand-eye system at the Artificial
Intelligence Project in the fall of 1967. The PDP-6 computer getting
visual information from a television camera and controlling an
electrical-mechanical arm solves simple tasks involving stacking
blocks. The techniques of recognizing the blocks and their positions
as well as controlling the arm are briefly presented. Rated "G".
3. Raj Reddy, Dave Espar and Art Eisenson, "Hear Here", 16mm color
with sound, 15 minutes, March 1969.
Describes the state of the speech recognition project as of Spring,
1969. A discussion of the problems of speech recognition is followed
by two real time demonstrations of the current system. The first
shows the computer learning to recognize phrases and the second shows
how
the hand-eye system may be controlled by voice commands. Commands as
complicated as "Pick up the small block in the lower lefthand
corner", are recognized and the tasks are carried out by the computer
controlled arm.
4. Gary Feldman and Donald Pieper, "Avoid", 16mm silent, color, 5
minutes, March 1969.
Reports on a computer program written by D. Pieper for his Ph.D.
Thesis. The problem is to move the computer controlled
electrical-mechanical arm through a space filled with one or more
known obstacles. The program uses heuristics for finding a safe
path; the film demonstrates the arm as it moves through various
cluttered environments with fairly good success.
5. Richard Paul and Karl Pingle, "Instant Insanity", 16 mm color,
silent, 6 minutes, August, 1971.
Shows the hand/eye system solving the puzzle "Instant Insanity".
Sequences include finding and recognizing cubes, color recognition
and object manipulation. This film was made to accompany a paper
presented at the 1971 International Joint Conference on Artificial
Intelligence in London. It may be hard to understand without a
narrator.
6. Suzanne Kandra, "Motion and Vision", 16 mm color, sound, 22
minutes, November 1972.
A technical presentation of three research projects completed in
1972: advanced arm control by R. P. Paul [AIM-177], visual feedback
control by A. Gill [AIM-178], and representation and description of
curved objects by G. Agin [AIM-173].
1.6.4 Abstracts of Recent Reports
Abstracts of recent Artificial Intelligence Memos are given below.
1971
AIM-140, Roger C. Schank, INTENTION, MEMORY, AND COMPUTER
UNDERSTANDING, January 1971, 59 pages.
Procedures are described for discovering the intention of a speaker
by relating the Conceptual Dependency representation of the speaker's
utterance to the computer's world model such that simple implications
can be made. These procedures function at levels higher than that of
structure of the memory. Computer understanding of natural language
is shown to consist of the following parts: assigning a conceptual
representation to an input; relating that representation to the
memory so as to extract the intention of the speaker; and selecting
the correct response type triggered by such an utterance according to
the situation.
AIM-141, Bruce G. Buchanan, Edward A. Feigenbaum, and Joshua
Lederberg, THE HEURISTIC DENDRAL PROGRAM FOR EXPLAINING
EMPIRICAL DATA, February 1971, 20 pages.
The Heuristic DENDRAL program uses an information processing model of
scientific reasoning to explain experimental data in organic
chemistry. This report summarizes the organization and results of the
program for computer scientists. The program is divided into three
main parts: planning, structure generation, and evaluation.
The planning phase infers constraints on the search space from the
empirical data input to the system. The structure generation phase
searches a tree whose termini are models of chemical molecules, using
pruning heuristics of various kinds. The evaluation phase tests the
candidate structures against the original data. Results of the
program's analyses of some tests are discussed.
AIM-142, Robin Milner, AN ALGEBRAIC DEFINITION OF SIMULATION BETWEEN
PROGRAMS, February 1971, 21 pages.
A simulation relation between programs is defined which is a quasi-
ordering. Mutual simulation is then an equivalence relation, and by
dividing out by it we abstract from a program such details as how the
sequencing is controlled and how data is represented. The
equivalence classes are approximations to the algorithms which are
realized, or expressed, by their member programs.
A technique is given and illustrated for proving simulation and
equivalence of programs; there is an analogy with Floyd's technique
for proving correctness of programs. Finally, necessary and
sufficient conditions for simulation are given.
AIM-143. John McCarthy, Arthur Samuel and Artificial Intelligence
Project staff, Edward Feigenbaum and Heuristic Dendral
Project staff, PROJECT TECHNICAL REPORT, MARCH 1971, 80
pages, (out of print).
An overview is presented of current research at Stanford in
artificial intelligence and heuristic programming. This report is
largely the text of a proposal to the Advanced Research Projects
Agency for fiscal years 1972-3.
AIM-144. Lynn H. Quam, COMPUTER COMPARISON OF PICTURES, May 1971,
120 pages.
This dissertation reports the development of digital computer
techniques for detecting changes in scenes by normalizing and
comparing pictures which were taken from different camera positions
and under different conditions of illumination. The pictures are
first geometrically normalized to a common point of view. Then they
are photometrically normalized to eliminate the differences due to
different illumination, camera characteristics, and reflectance
properties of the scene due to different sun and view angles. These
pictures are then geometrically registered by maximizing the cross
correlation between areas in them. The final normalized and
registered pictures are then differenced point by point.
The geometric normalization techniques require relatively accurate
geometric models for the camera and the scene, and static spatial
features must be present in the pictures to allow precise geometric
alignment using the technique of cross correlation maximization.
Photometric normalization also requires a relatively accurate model
for the photometric response of the camera, a reflectance model for
the scene (reflectance as a function of the illumination, view, and
phase angles) and some assumptions about the kinds of reflectance
changes which are to be detected.
These techniques have been incorporated in a system for comparing
Mariner 1971 pictures of Mars to detect variable surface phenomena as
well as color and polarization differences. The system has been
tested using Mariner 6 and 7 pictures of Mars.
Although the techniques described in this dissertation were developed
for Mars pictures, their use is not limited to this application.
Various parts of this software package, which was developed for
interactive use on the time-sharing system of the Stanford Artificial
Intelligence Project, are currently being applied to other scenery.
AIM-145, Bruce G. Buchanan, Edward A. Feigenbaum, and Joshua
Lederberg, A HEURISTIC PROGRAMMING STUDY OF THEORY FORMATION
IN SCIENCE, June 1971, 41 pages.
The Meta-DENDRAL program is a vehicle for studying problems of
theory formation in science. The general strategy of Meta-DENDRAL is
to reason from data to plausible generalizations and then to organize
the generalizations into a unified theory. Three main subproblems
are discussed: (1) explain the experimental data for each individual
chemical structure, (2) generalize the results from each structure to
all structures, and (3) organize the generalizations into a unified
theory. The program is built upon the concepts and programmed
routines already available in the Heuristic DENDRAL performance
program, but goes beyond the performance program in attempting to
formulate the theory which the performance program will use.
AIM-146, Andrei P. Ershov, PARALLEL PROGRAMMING, July 1971, 14 pages.
This report is based on lectures given at Stanford University by Dr.
Ershov in November, 1970.
AIM-147, Robert E. Kling, REASONING BY ANALOGY WITH APPLICATIONS TO
HEURISTIC PROBLEM SOLVING: A CASE STUDY, August 1971, 191
pages.
An information-processing approach to reasoning by analogy is
developed that promises to increase the efficiency of heuristic
deductive problem-solving systems. When a deductive problem-solving
system accesses a set of axioms larger than is sufficient for a
particular problem, it will often create many irrelevant deductions
that saturate the memory of the problem solver.
Here, an analogy between some previously solved problem and a new
unsolved problem is used to restrict the data base to a small set of
appropriate axioms. This procedure (ZORBA) is studied in detail for
a resolution theorem proving system. A set of algorithms (ZORBA-I)
which automatically generates an analogy between a new unproved
theorem, T↓A, and a previously proved theorem, T, is described in
detail. ZORBA-I is implemented in LISP on a PDP-10.
ZORBA-I is examined in terms of its empirical performance on parts of
analogous theorems drawn from abstract algebra. Analytical studies
are included which show that ZORBA-I can be useful to aid automatic
theorem proving in many pragmatic cases while it may be a detriment
in certain specially contrived cases.
AIM-148, Edward Ashcroft, Zohar Manna, and Amir Pnueli, DECIDABLE
PROPERTIES OF MONADIC FUNCTIONAL SCHEMAS, July 1971, 10
pages.
We define a class of (monadic) functional schemas which properly
include `Ianov' flowchart schemas. We show that the termination,
divergence and freedom problems for functional schemas are decidable.
Although it is possible to translate a large class of non-free
functional schemas into equivalent free functional schemas, we show
that this cannot be done in general. We show also that the
equivalence problem for free functional schemas is decidable. Most
of the results are obtained from well-known results in Formal
Languages and Automata Theory.
AIM-149, Rodney Albert Schmidt, Jr., A STUDY OF THE REAL-TIME CONTROL
OF A COMPUTER-DRIVEN VEHICLE, August 1971, 180 pages.
Vehicle control by the computer analysis of visual images is
investigated. The areas of guidance, navigation, and incident
avoidance are considered. A television camera is used as the prime
source of visual image data.
In the guidance system developed for an experimental vehicle, visual
data is used to gain information about the vehicle system dynamics,
as well as to guide the vehicle. This information is used in real
time to improve performance of the non-linear, time-varying vehicle
system.
A scheme for navigation by pilotage through the recognition of two
dimensional scenes is developed. A method is proposed to link this
to a computer-modelled map in order to make journeys.
Various difficulties in avoiding anomalous incidents in the automatic
control of automobiles are discussed, together with suggestions for
the application of this study to remote exploration vehicles or
industrial automation.
AIM-150, Robert W. Floyd, TOWARD INTERACTIVE DESIGN OF CORRECT
PROGRAMS, September 1971, 12 pages.
We propose an interactive system for proving the correctness of a
program, or locating errors, as the program is being designed.
AIM-151, Ralph L. London, CORRECTNESS OF TWO COMPILERS FOR A LISP
SUBSET, October 1971, 41 pages.
Using mainly structural induction, proofs of correctness of each of
two running Lisp compilers for the PDP-10 computer are given.
Included are the rationale for presenting these proofs, a discussion
of the proofs, and the changes needed to the second compiler to
complete its proof.
AIM-152, A.W. Biermann, ON THE INFERENCE OF TURING MACHINES FROM
SAMPLE COMPUTATIONS, October 1971, 31 pages.
An algorithm is presented which when given a complete description of
a set of Turing machine computations finds a Turing machine which is
capable of doing those computations. This algorithm can serve as the
basis for designing a trainable device which can be trained to
simulate any Turing machine by being led through a series of sample
computations done by that machine. A number of examples illustrate
the use of the technique and the possibility of the application to
other types of problems.
AIM-153, Patrick J. Hayes, THE FRAME PROBLEM AND RELATED PROBLEMS IN
ARTIFICIAL INTELLIGENCE, November 1971, 18 pages.
The frame problem arises in considering the logical structure of a
robot's beliefs. It has been known for some years, but only recently
has much progress been made. The problem is described and discussed.
Various suggested methods for its solution are outlined and
described in a uniform notation. Finally, brief consideration is
given to the problem of adjusting a belief system in the face of
evidence which contradicts beliefs. It is shown that a variation on
the situation notation of (McCarthy and Hayes, 1969) permits an
elegant approach, and relates this problem to the frame problem.
AIM-154, Zohar Manna, Stephen Ness and Jean Vuillemin, INDUCTIVE
METHODS FOR PROVING PROPERTIES OF PROGRAMS, November 1971, 24
pages.
We have two main purposes in this paper. First, we clarify and
extend known results about computation of recursive programs,
emphasizing the difference between the theoretical and practical
approaches. Secondly, we present and examine various known methods
for proving properties of recursive programs. We discuss in detail
two powerful inductive methods, computational induction and
structural induction, illustrating their applications by various
examples. We also briefly discuss some other related methods.
Our aim in this work is to introduce inductive methods to as wide a
class of readers as possible and to demonstrate their power as
practical techniques. We ask the forgiveness of our more
theoretical-minded colleagues for our occasional choice of clarity
over precision.
AIM-155, Jonathan Leonard Ryder, HEURISTIC ANALYSIS OF LARGE TREES AS
GENERATED IN THE GAME OF GO, December 1971, 300 pages.
The Japanese game of Go is of interest both as a problem in
mathematical representation and as a game which generates a move tree
with an extraordinarily high branching factor (100 to 300 branches
per ply). The complexity of Go (and the difficulty of Go for human
players) is thought to be considerably greater than that of chess.
The constraints of being able to play a complete game and of being
able to produce a move with a moderate amount of processing time were
placed on the solution.
The basic approach used was to find methods for isolating and
exploring several sorts of relevant subsections of the global game
tree. This process depended heavily on the ability to define and
manipulate the entities of Go as recursive functions rather than as
patterns of stones. A general machine-accessible theory of Go was
developed to provide context for program evaluations.
A program for playing Go is now available on the Stanford PDP-10
computer. It will play a complete game, taking about 10 to 30
seconds for an average move. The quality of play is better than that
of a beginner in many respects, but incompletenesses in the current
machine-representable theory of Go prevent the present program from
becoming a strong player.
AIM-156, Kenneth Mark Colby, Franklin Dennis Hilf, Sylvia Weber, and
Helena C. Kraemer, A RESEMBLANCE TEST FOR THE VALIDATION OF A
COMPUTER SIMULATION OF PARANOID PROCESSES, November 1971, 29
pages.
A computer simulation of paranoid processes in the form of a dialogue
algorithm was subjected to a validation study using an experimental
resemblance test in which judges rate degrees of paranoia present in
initial psychiatric interviews of both paranoid patients and of
versions of the paranoid model. The statistical results indicate a
satisfactory degree of resemblance between the two groups of
interviews. It is concluded that the model provides a successful
simulation of naturally occurring paranoid processes.
AIM-157, Yorick Wilks, ONE SMALL HEAD -- SOME REMARKS ON THE USE OF
`MODEL' IN LINGUISTICS, December 1971, 17 pages.
I argue that the present situation in formal linguistics, where much
new work is presented as being a "model of the brain", or of "human
language behavior", is an undesirable one. My reason for this
judgement is not the conservative (Braithwaitian) one that the
entities in question are not really models but theories. It is
rather that they are called models because they cannot be theories of
the brain at the present stage of brain research, and hence that the
use of "model" in this context is not so much aspirational as
resigned about our total ignorance of how the brain stores and
processes linguistic information. The reason such explanatory
entities cannot be theories is that this ignorance precludes any
"semantic ascent" up the theory; i.e., interpreting the items of the
theory in terms of observables. And the brain items, whatever they
may be, are not, as Chomsky has sometimes claimed, in the same
position as the "occult entities" of Physics like Gravitation; for
the brain items are not theoretically unreachable, merely unreached.
I then examine two possible alternate views of what linguistic
theories should be proffered as theories of: theories of sets of
sentences, and theories of a particular class of algorithms. I argue
for a form of the latter view, and that its acceptance would also
have the effect of making Computational Linguistics a central part of
Linguistics, rather than the poor relation it is now.
I examine a distinction among "linguistic models" proposed recently
by Mey, who was also arguing for the self-sufficiency of
Computational Linguistics, though as a "theory of performance". I
argue that his distinction is a bad one, partly for the reasons
developed above and partly because he attempts to tie it to Chomsky's
inscrutable competence-performance distinction. I conclude that the
independence and self-sufficiency of Computational Linguistics are
better supported by the arguments of this paper.
AIM-158, Zohar Manna, Ashok Chandra, PROGRAM SCHEMAS WITH EQUALITY,
December 1971, 22 pages.
We discuss the class of program schemas augmented with equality
tests, that is, tests of equality between terms. In the first part
of the paper we illustrate the "power" of equality tests. It turns
out that the class of program schemas with equality is more powerful
than the "maximal" classes of schemas suggested by other
investigators. In the second part of the paper, we discuss the
decision problems of program schemas with equality. It is shown, for
example, that while the decision problems normally considered for
schemas (such as halting, divergence, equivalence, isomorphism and
freedom) are decidable for Ianov schemas, they all become
undecidable if general equality tests are added. We suggest,
however, limited equality tests which can be added to certain
subclasses of program schemas while preserving their decidability
property.
1972
AIM-159, J.A. Feldman and Paul C. Shields, TOTAL COMPLEXITY AND
INFERENCE OF BEST PROGRAMS, April 1972.
Axioms for a total complexity measure for abstract programs are
presented. Essentially, they require that total complexity be an
unbounded increasing function of the Blum time and size measures.
Algorithms for finding the best program on a finite domain are
presented, and their limiting behavior for infinite domains
described. For total complexity, there are important senses in which
a machine can find the best program for a large class of functions.
AIM-160, J. Feldman, AUTOMATIC PROGRAMMING, February 1972, 20 pages.
The revival of interest in Automatic Programming is considered. The
research is divided into direct efforts and theoretical developments
and the successes and prospects of each are described.
AIM-161, Y. Wilks, AN ARTIFICIAL INTELLIGENCE APPROACH TO MACHINE
TRANSLATION, February 1972, 44 pages.
The paper describes a system of semantic analysis and generation,
programmed in LISP 1.5 and designed to pass from paragraph length
input in English to French via an interlingual representation. A
wide class of English input forms will be covered, but the vocabulary
will initially be restricted to one of a few hundred words. With
this subset working, and during the current year (1971-72), it is
also hoped to map the interlingual representation onto some predicate
calculus notation so as to make possible the answering of very simple
questions about the translated matter. The specification of the
translation system itself is complete, and its main points are: i) It
translates phrase by phrase--with facilities for reordering phrases
and establishing essential semantic connectivities between them--by
mapping complex semantic structures of "message" onto each phrase.
These constitute the interlingual representation to be translated.
This matching is done without the explicit use of a conventional
syntax analysis, by taking as the appropriate matched structure the
"most dense" of the alternative structures derived. This method has
been found highly successful in earlier versions of this analysis
system.
ii) The French output strings are generated without the explicit use
of a generative grammar. That is done by means of STEREOTYPES:
strings of French words, and functions evaluating to French words,
which are attached to English word senses in the dictionary and built
into the interlingual representation by the analysis routines. The
generation program thus receives an interlingual representation that
already contains both French output and implicit procedures for
assembling the output, since the stereotypes are in effect recursive
procedures specifying the content and production of the output word
strings. Thus the generation program at no time consults a word
dictionary or inventory of grammar rules.
It is claimed that the system of notation and translation described
is a convenient one for expressing and handling the items of semantic
information that are ESSENTIAL to any effective MT system. I discuss
in some detail the semantic information needed to ensure the correct
choice of output prepositions in French; a vital matter inadequately
treated by virtually all previous formalisms and projects.
AIM-162, Roger Schank, Neil Goldman, Chuck Rieger, Chris Riesbeck,
PRIMITIVE CONCEPTS UNDERLYING VERBS OF THOUGHT, April 1972,
102 pages.
In order to create conceptual structures that will uniquely and
unambiguously represent the meaning of an utterance, it is necessary
to establish `primitive' underlying actions and states into which
verbs can be mapped. This paper presents analyses of the most common
mental verbs in terms of such primitive actions and states. In order
to represent the way people speak about their mental processes, it
was necessary to add to the usual ideas of memory structure the
notion of Immediate Memory. It is then argued that there are only
three primitive mental ACTs.
AIM-163, J.M. Cadiou, RECURSIVE DEFINITIONS OF PARTIAL FUNCTIONS AND
THEIR COMPUTATIONS, April 1972, 160 pages.
A formal syntactic and semantic model is presented for `recursive
definitions' which are generalizations of those found in LISP, for
example. Such recursive definitions can have two classes of
fixpoints, the strong fixpoints and the weak fixpoints, and also
possess a class of computed partial functions.
Relations between these classes are presented: fixpoints are shown to
be extensions of computed functions. More precisely, strong
fixpoints are shown to be extensions of computed functions when the
computations may involve "call by name" substitutions; weak fixpoints
are shown to be extensions of computed functions when the computations
only involve "call by value" substitutions. The Church-Rosser
property for recursive definitions with fixpoints also follows from
these results.
Then conditions are given on the recursive definitions to ensure that
they possess least fixpoints (of both classes), and computation rules
are given for computing these two fixpoints: the "full" computation
rule, which leads to the least weak fixpoint. A general class of
computation rules, called `safe innermost', also leads to the latter
fixpoint. The "leftmost innermost" rule is a special case of those,
for the LISP recursive definitions.
AIM-164, Zohar Manna and Jean Vuillemin, FIXPOINT APPROACH TO THE
THEORY OF COMPUTATION, April 1972, 29 pages.
Substantial effort has been put into finding verification techniques
for computer programs. There are now so many verification techniques
that a choice in a practical situation may appear difficult to the
non-specialist. In particular, the most widely used methods, such as
the "inductive assertion method", can be used to prove some input-
output relation (assuming that the program terminates) while such
problems as termination or equivalence usually require a different
type of technique.
Our main purpose in this paper is to suggest that Scott's fixedpoint
approach to the semantics of programming languages frees us from that
dilemma. It allows one not only to justify all existing verification
techniques, but also to extend them to handle other properties of
computer programs, including termination and equivalence, in a
uniform manner.
AIM-165, D.A. Bochvar, TWO PAPERS ON PARTIAL PREDICATE CALCULUS,
April 1972, 50 pages.
The three-valued system to which this study is devoted is of interest
as a logical calculus for two reasons: first, it is based on
formalization of certain basic and intuitively obvious relations
satisfied by the predicates "true", "false" and "meaningless" as
applied to propositions, and as a result the system possesses a
clear-cut and intrinsically logical interpretation; second, the
system provides a solution to a specifically logical problem,
analysis of the paradoxes of classical mathematical logic, by
formally proving that certain propositions are meaningless.
The paper consists of three parts. In the first we develop the
elementary part of the system--the propositional calculus--on the
basis of intuitive considerations. In the second part we outline the
"restricted" functional calculus corresponding to the propositional
calculus. The third and last part uses a certain "extension" of the
functional calculus to analyze the paradoxes of classical
mathematical logic.
We are indebted to Professor V.I. Glivenko for much valuable advice
and criticism. In particular, he provided a more suitable definition
of the function a (see I, Section 2, subsection 1).
AIM-166, Lynn H. Quam, Sidney Liebes, Jr., Robert B. Tucker, Marsha
Jo Hannah, Botond G. Eross, COMPUTER INTERACTIVE PICTURE PROCESSING,
April 1972, 40 pages.
This report describes work done in image processing using an
interactive computer system. Techniques for image differencing are
described and examples using images returned from Mars by the Mariner
Nine spacecraft are shown. Also described are techniques for stereo
image processing. Stereo processing for both conventional camera
systems and the Viking 1975 Lander camera system is reviewed.
AIM-167, Ashok Chandra, EFFICIENT COMPILATION OF LINEAR RECURSIVE
PROGRAMS, June 1972, 43 pages.
We consider the class of linear recursive programs. A linear
recursive program is a set of procedures where each procedure can
make at most one recursive call. The conventional stack
implementation of recursion requires time and space both proportional
to n, the depth of recursion. It is shown that in order to implement
linear recursion so as to execute in time n one doesn't need space
proportional to n: n**ε for sufficiently small ε will do. It is also
known that with constant space one can implement linear recursion in
time n**2. We show that one can do much better: n**(1+ε) for
arbitrarily small ε. We also describe an algorithm that lies between
these two: it takes time n*log n and space log n.
It is shown that several problems are closely related to the linear
recursion problem, for example, the problem of reversing an input
tape given a finite automaton with several one-way heads. By casting
all these problems into canonical form, efficient solutions are
obtained simultaneously for all.
AIM-168, Shigeru Igarashi, ADMISSIBILITY OF FIXED-POINT INDUCTION IN
FIRST-ORDER LOGIC OF TYPED THEORIES, May 1972, 40 pages.
First-order logic is extended so as to deal with typed theories,
especially that of continuous functions with fixed-point induction
formalized by D. Scott. The translation of his formal system, or the
λ calculus-oriented system derived and implemented by R. Milner, into
this logic amounts to adding predicate calculus features to them.
In such a logic the fixed-point induction axioms are no longer valid,
in general, so that we characterize formulas for which Scott-type
induction is applicable, in terms of syntax which can be checked by
machines automatically.
Diskfile: Aim168.igr[aim,doc]
AIM-169, Robin Milner, LOGIC FOR COMPUTABLE FUNCTIONS. DESCRIPTION
OF A MACHINE IMPLEMENTATION, May 1972, 36 pages.
This paper is primarily a user's manual for LCF, a proof-checking
program for a logic of computable functions proposed by Dana Scott in
1969, but unpublished by him. We use the name LCF also for the logic
itself, which is presented at the start of the paper. The proof-
checking program is designed to allow the user interactively to
generate formal proofs about computable functions and functionals
over a variety of domains, including those of interest to the
computer scientist--for example, integers, lists and computer
programs and their semantics. The user's task is alleviated by two
features: a subgoaling facility and a powerful simplification
mechanism. Applications include proofs of program correctness and in
particular of compiler correctness; these applications are not
discussed herein, but are illustrated in the papers referenced in the
introduction.
Diskfile: Lcfman.rgm[aim,doc]
AIM-170, Yorick Wilks, LAKOFF ON LINGUISTICS AND NATURAL LOGIC, June
1972, 19 pages.
The paper examines and criticises Lakoff's notions of a natural logic
and of a generative semantics described in terms of logic. I argue
that the relationship of these notions to logic as normally
understood is unclear, but I suggest, in the course of the paper, a
number of possible interpretations of his thesis of generative
semantics. I argue further that on these interpretations the thesis
is a mere notational variant of Chomskyan theory. I argue, too, that
Lakoff's
work may provide a service in that it constitutes a reductio ad
absurdum of the derivational paradigm of modern linguistics; and
shows, inadvertently, that only a system with the ability to
reconsider its own inferences can do the job that Lakoff sets up for
linguistic enquiry -- that is to say, only an "artificial
intelligence" system.
Diskfile: Lakoff.Yaw[aim,doc]
AIM-171, Roger Schank, ADVERBS AND BELIEF, June 1972, 30 pages.
The treatment of a certain class of adverbs in conceptual
representation is given. Certain adverbs are shown to be
representative of complex belief structures. These adverbs serve as
pointers that explain where the sentence that they modify belongs in
a belief structure.
AIM-172, Sylvia Weber Russell, SEMANTIC CATEGORIES OF NOMINALS FOR
CONCEPTUAL DEPENDENCY ANALYSIS OF NATURAL LANGUAGE, July
1972, 64 pages.
A system for the semantic categorization of conceptual objects
(nominals) is provided. The system is intended to aid computer
understanding of natural language. Specific implementations for
"noun-pairs" and prepositional phrases are offered.
AIM-173, Gerald Jacob Agin, REPRESENTATION AND DESCRIPTION OF CURVED
OBJECTS, October 1972,
This document describes some first efforts toward the goal of
description and recognition of the general class of curved, three
dimensional objects.
Three dimensional images, similar to depth maps, are obtained with a
triangulation system using a television camera, and a deflectable
laser beam diverged into a plane by a cylindrical lens.
Complex objects are represented as structures joining parts called
generalized cylinders. These primitives are formalized in a volume
representation by an arbitrary cross section varying along a space
curve axis. Several types of joint structures are discussed.
Experimental results are shown for the description (building of
internal computer models) of a handful of complex objects, beginning
with laser range data from actual objects. Our programs have
generated complete descriptions of rings, cones, and snake-like
objects, all of which may be described by a single primitive. Complex
objects, such as dolls, have been segmented into parts, most of which
are well described by programs which implement generalized cylinder
descriptions.
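As an editorial illustration only (not part of the report itself),
the primitive described above can be rendered as a simple data
structure: an axis space curve plus a cross section that varies along
it. The class, the cone example, and the restriction to circular
cross sections below are assumptions made for brevity.

    import math

    class GeneralizedCylinder:
        """A volume primitive: a cross section swept along an axis curve.

        axis_points -- list of (x, y, z) samples of the axis space curve
        radius_at   -- function s -> radius, s in [0, 1] along the axis
        (Hypothetical names; circular cross sections only, for brevity.)
        """
        def __init__(self, axis_points, radius_at):
            self.axis_points = axis_points
            self.radius_at = radius_at

        def cross_section(self, i, n_samples=16):
            """Boundary points of the cross section at axis sample i
            (drawn in the x-y plane for simplicity; a fuller model would
            use the local frame of the axis curve)."""
            x0, y0, z0 = self.axis_points[i]
            s = i / (len(self.axis_points) - 1)
            r = self.radius_at(s)
            return [(x0 + r * math.cos(2 * math.pi * k / n_samples),
                     y0 + r * math.sin(2 * math.pi * k / n_samples),
                     z0)
                    for k in range(n_samples)]

    # A cone-like object: straight axis, radius shrinking toward the tip.
    cone = GeneralizedCylinder(
        axis_points=[(0.0, 0.0, float(z)) for z in range(10)],
        radius_at=lambda s: 5.0 * (1.0 - s))
    print(len(cone.cross_section(0)))   # 16 boundary points at the base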
AIM-174, Morris, Francis Lockwood, CORRECTNESS OF TRANSLATIONS OF
PROGRAMMING LANGUAGES -- AN ALGEBRAIC APPROACH, August 1972,
124 p.
Programming languages and their sets of meanings can be modelled by
general operator algebras; semantic functions and compiling functions
by homomorphisms of operator algebras. A restricted class of
individual programs, machines, and computations can be modelled in a
uniform manner by binary relational algebras. These two
applications of algebra to computing are compatible: the semantic
function provided by interpreting ("running") one binary relational
algebra on another is a homomorphism on an operator algebra whose
elements are binary relational algebras.
Using these mathematical tools, proofs can be provided systematically
of the correctness of compilers for fragmentary programming languages
each embodying a single language "feature". Exemplary proofs are
given for statement sequences, arithmetic expressions, Boolean
expressions, assignment statements, and while statements. Moreover,
proofs of this sort can be combined to provide (synthetic) proofs
for, in principle, many different complete programming languages. One
example of such a synthesis is given.
AIM-175, Tanaka, Hozumi, HADAMARD TRANSFORM FOR SPEECH WAVE ANALYSIS,
August 1972, 34 pages.
Two methods of speech wave analysis using the Hadamard transform are
discussed. The first method is a direct application of the Hadamard
transform for speech waves. The reason this method yields poor
results is discussed. The second method is the application of the
Hadamard transform to a log-magnitude frequency spectrum. After the
application of the Fourier transform the Hadamard transform is
applied to detect a pitch period or to get a smoothed spectrum. This
method shows some positive aspects of the Hadamard transform for the
analysis of a speech wave with regard to the reduction of processing
time required for smoothing, but at the cost of precision. A formant
tracking program for voiced speech is implemented by using this
method and an edge following technique used in scene analysis.
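The following sketch is added purely for illustration and does not
reproduce the report's programs: the second method in outline is a
log-magnitude Fourier spectrum passed through a fast Walsh-Hadamard
transform, whose coefficients can be inspected for periodicity or
truncated to give a smoothed spectrum. The sampling rate, test
signal, and truncation point are arbitrary assumptions.

    import numpy as np

    def walsh_hadamard(x):
        """Fast Walsh-Hadamard transform (length must be a power of two)."""
        x = np.array(x, dtype=float)
        h = 1
        while h < len(x):
            for i in range(0, len(x), h * 2):
                for j in range(i, i + h):
                    a, b = x[j], x[j + h]
                    x[j], x[j + h] = a + b, a - b
            h *= 2
        return x / np.sqrt(len(x))

    # Fourier transform first, then a Hadamard transform of the
    # log-magnitude spectrum; low-order coefficients give a smoothed
    # spectrum, the rest can be examined for pitch periodicity.
    fs = 8000                                      # sampling rate, Hz (assumed)
    t = np.arange(1024) / fs
    voiced = np.sign(np.sin(2 * np.pi * 120 * t))  # crude 120 Hz voiced stand-in
    spectrum = np.abs(np.fft.rfft(voiced))[:512]
    log_spectrum = np.log(spectrum + 1e-6)
    coeffs = walsh_hadamard(log_spectrum)
    smoothed = walsh_hadamard(np.where(np.arange(512) < 32, coeffs, 0.0))
    print(coeffs[:8].round(2), smoothed[:4].round(2))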
AIM-176, Feldman, J.A., Low, J.R., Swinehart, D.C., Taylor, R. H.,
RECENT DEVELOPMENTS IN SAIL. AN ALGOL-BASED LANGUAGE FOR
ARTIFICIAL INTELLIGENCE, (forthcoming).
AIM-177, Paul, Richard, MODELLING, TRAJECTORY CALCULATION AND
SERVOING OF A COMPUTER CONTROLLED ARM, (forthcoming)
AIM-178, Aharon Gill, VISUAL FEEDBACK AND RELATED PROBLEMS IN
COMPUTER CONTROLLED HAND EYE COORDINATION, October 1972, 134
pages.
A set of programs for precise manipulation of simple planar bounded
objects, by means of visual feedback, was developed for use in the
Stanford hand-eye system. The system includes a six degrees of
freedom computer controlled manipulator (arm and hand) and a fully
instrumented computer television camera.
The image of the hand and manipulated objects is acquired by the
computer through the camera. The stored image is analyzed using a
corner and line finding program developed for this purpose. The
analysis is simplified by using all the information available about
the objects and the hand, and previously measured coordination
errors. Simple touch and force sensing by the arm help the
determination of three dimensional positions from one view.
The utility of the information used to simplify the scene analysis
depends on the accuracy of the geometrical models of the camera and
arm. A set of calibration updating techniques and programs was
developed to maintain the accuracy of the camera model relative to
the arm model.
The precision obtained is better than .1 inch. It is limited by the
resolution of the imaging system and of the arm position measuring
system.
AIM-179, Baumgart, Bruce G., WINGED EDGE POLYHEDRON REPRESENTATION,
October 1972, 46 pages.
(forthcoming)
AIM-180, Bajcsy, Ruzena, COMPUTER IDENTIFICATION OF TEXTURED VISUAL
SCENES, October 1972, 156 pages.
This work deals with computer analysis of textured outdoor scenes
involving grass, trees, water and clouds. Descriptions of texture are
formalized from natural language descriptions; local descriptors are
obtained from the directional and non-directional components of the
Fourier transform power spectrum. Analytic expressions are obtained
for orientation, contrast, size, spacing, and in periodic cases, the
locations of texture elements. These local descriptors are defined
over windows of various sizes; the choice of sizes is made by a
simple higher-level program.
The process of region growing is represented by a sheaf-theoretical
model which formalizes the operation of pasting local structure (over
a window) into global structure (over a region). Programs were
implemented which form regions of similar color and similar texture
with respect to the local descriptors.
An interpretation is made of texture gradient as distance gradient in
space. A simple world model is described. An interpretation of
texture regions and texture gradient is made with a simulated
correspondence with the world model. We find a problem-solving
approach, involving hypothesis-verification, more satisfactory than
an earlier pattern recognition effort (Bajcsy 1970) and more crucial
for work with complex scenes than for scenes of polyhedra. Geometric
clues from relative sizes, texture gradients, and interposition are
important in interpretation.
AIM-181, Buchanan, Bruce G, REVIEW OF HUBERT DREYFUS' WHAT COMPUTERS
CAN'T DO: A CRITIQUE OF ARTIFICIAL REASON (Harper & Row, New
York, 1972), November 1972, 14 pp.
The recent book "What Computers Can't Do" by Hubert Dreyfus is an
attack on artificial intelligence research. This review takes the
position that the philosophical content of the book is interesting,
but that the attack on artificial intelligence is not well reasoned.
AIM-182, Colby, K.M., Hilf, F.D. CAN EXPERT JUDGES, USING TRANSCRIPTS
OF TELETYPED PSYCHIATRIC INTERVIEWS, DISTINGUISH HUMAN
PARANOID PATIENTS FROM A COMPUTER SIMULATION OF PARANOID
PROCESSES?, December 1972, 8 pp.
Expert judges, psychiatrists and computer scientists, could not
correctly distinguish a simulation model of paranoid processes from
actual paranoid patients.
ARTIFICIAL INTELLIGENCE PROJECT
1.7 BUDGET
15-JUN-73 1-JUL-74
thru thru
30-JUN-74 30-JUN-75
I. TOTAL SALARIES (detail below) $ 552,201 $ 547,669
II. STAFF BENEFITS
16% to 8-31-73
17.3% 9-1-73 to 8-31-74 94,030
18.3% 9-1-74 on 99,310
III. TRAVEL 30,700 30,700
IV. CAPITAL EQUIPMENT 75,000 75,000
V. EQUIPMENT RENTAL 19,774 19,774
VI. EQUIPMENT MAINTENANCE 40,320 40,320
VII. COMMUNICATIONS 16,200 16,200
(Telephones, dataphones, teletypes)
VIII. PUBLICATIONS COST (Past Experience) 13,500 13,500
IX. OTHER OPERATING EXPENSES 44,300 43,552
(e.g. office supplies, postage,
freight, consulting, utilities)
X. INDIRECT COSTS - 46% of NTDC 363,975 363,975
(NTDC = I+II+III+VI+VII+VIII+IX)
TOTAL ARTIFICIAL INTELLIGENCE BUDGET --- $1,250,000 $1,250,000
SALARIES 15-JUN-73 1-JUL-74
thru thru
30-JUN-74 30-JUN-75
Faculty
Feldman, Jerome $3,626 $3,626
Associate Prof.
17% Acad. Yr., 17% Summer
Green, Cordell 12,082 12,082
Assistant Prof.
50% Acad. Yr., 100% Summer
Hoare, C.A.R. 3,472 0
Visiting Associate Prof.
50% Fall Qtr.'73
McCarthy, John 19,582 19,582
Professor
50% Acad. Yr., 100% Summer
Winograd, Terry 7,002 7,002
Visiting Assistant Prof.
50% Acad. Yr., 50% Summer
TOTAL Faculty SALARIES $45,764 $42,292
Research Staff
Allen, John 15,000 15,000
Research Associate
Anderson, D. Bruce 12,000 12,000
Research Programmer
Binford, Thomas O. 18,600 18,600
Research Associate
Diffie, Whit 6,180 0
Research Programmer
Earnest, Lester D. 26,964 26,964
Research Computer Scientist
Garland, Steve 6,996 0
Research Associate (Autumn 1973)
Gorin, Ralph $12,960 $12,960
Systems Programmer
Helliwell, Richard P. 10,500 10,500
Systems Programmer
Holloway, John 15,000 15,000
Design Engineer
Luckham, David 20,280 20,280
Research Computer Scientist
Miller, Neil 13,680 13,680
Research Associate
Panofsky, Edward F. 12,960 12,960
Computer Systems Engineer
Paul, Richard P. 16,140 16,140
Research Programmer
Pingle, Karl K. 15,660 15,660
Research Programmer
Quam, Lynn 18,000 18,000
Research Associate
Samuel, Arthur L. 6,666 6,666
Senior Res. Assoc., 25% time
Scheinman, Victor 16,560 16,560
Design Engineer
Tesler, Larry 16,680 16,680
Systems Programmer
Thosar, Ravindra 3,300 0
Research Associate
Weyhrauch, Richard 13,200 13,200
Research Associate
TOTAL Research Staff SALARIES $277,326 $260,850
15-JUN-73 1-JUL-74
thru thru
Student Research Assistants 30-JUN-74 30-JUN-75
50% Acad. Yr., 100% Summer
unless noted otherwise
Baumgart, Bruce G. $ 5,538 $ 5,538
Bolles, Robert 5,205 5,304
Borning, Alan 5,004 5,205
Davis, Randall 5,304 5,304
Frost, Martin 5,070 5,070
Ganapathy, Kicha 5,304 5,304
Gennery, Donald 5,070 5,070
Hemphill, Linda 5,538 5,538
Hui, Wing 4,914 2,816
Karp, Peggy 5,070 5,070
Lenat, D. B. 5,070 5,070
Low, J. Richard 5,538 5,538
Moorer, James A. 5,538 5,538
Morales, Jorge 5,070 5,070
Nevatia, Ramakant 5,538 5,538
Newey, Malcolm C. 5,304 5,304
Poole, David 5,304 5,304
Riesbeck, Christopher 5,538 5,538
Samet, Hanan 5,304 5,304
Suzuki, Norihisa 6,435 5,070
Taylor, Russell H. 3,345 3,345
Thomas, Arthur J. $ 5,070 $ 5,070
Wagner, Todd J. 4,914 4,914
Yu, Frank S. 4,914 4,914
------- 4,914 0
TOTAL Student Research Assistant SALARIES $ 129,813 $ 121,736
Others
Barnett, Barbara Ann 8,017 8,017
Secretary, 96% time
Baur, Queenette 8,352 8,352
Secretary, 96% time
Briggs, Norman R. 13,500 13,500
Research Coordinator, 90% time
Enderlein, Trina 1,602 1,602
Secretary, 25% time
Gafford, Thomas 4,380 4,380
Electronic Tech., 50% time
Langle, Robert 840 840
Administrator, 5% time
Stuart, Elbridge 7,776 7,776
Electronic Tech.
Wilson, Charles 8,568 8,568
Electronic Tech.
Wood, Patricia 6,624 6,624
Secretary, 96% time
Zingheim, Thomas J. 10,944 10,944
Electronic Tech.
-----, 2,400 2,400
Hourly Tech.
TOTAL Others SALARIES $73,003 $73,003
SUBTOTAL ARTIFICIAL INTELLIGENCE SALARIES $525,906 $497,881
Projected Salary Increases at 5%/Yr. 26,295 49,788
TOTAL ARTIFICIAL INTELLIGENCE SALARIES $552,201 $547,669
2. HEURISTIC PROGRAMMING PROJECT
Co-Principal Investigators: E.A. Feigenbaum, J. Lederberg
Proposed Project Theme for New Contract Period:
High-Performance Intelligent Programs for Assisting Scientific Work.
Background: In previous periods, the Heuristic Programming Project
developed the Heuristic DENDRAL Program for the analysis of mass
spectra of molecules; began the development of the Meta-DENDRAL
program for the automatic inference of theory in mass spectrometry;
began the design of a closed-loop control system for intelligent
control of scientific instruments by DENDRAL-like problem-solving
programs; studied application of artificial intelligence techniques
to new problem domains and formulated work in three such application
areas; and studied basic questions in artificial intelligence
research (e.g., the representation and use of knowledge; generality
in problem solvers). Heuristic DENDRAL is one of the few examples of
a high-performance intelligent system, sometimes achieving levels of
scientific problem solving not yet achieved by the best human
experts, often achieving levels equal to that of good human
performance.
For the new period, we propose a project theme, as follows:
a. developing applications of artificial intelligence research aimed
at assisting the work of scientists in a few specific scientific
disciplines, and scanning other disciplines for possible future
applications.
b. adopting, as a subgoal, the achievement of high levels of
performance of these science-assistance programs, recognizing
that in any one application the time constants associated with
achieving high performance may be relatively long.
Summary of Work Proposed:
To pursue work in keeping with this theme, the Project has organized
a number of specific tasks. These are summarized next, and are given
a more extended treatment in the remainder of the proposal.
1. Theory formation assistant.
Complete the Meta-DENDRAL Program, work on which began in this
period, and bring it to high levels of performance on some tasks.
The domain is mass spectrometry, in which we have extensive
background, knowledge, and an expert team. The goal is to assist in
the inference of theory of fragmentation of molecules in a mass
spectrometer. Preliminary results are very promising. Basic
insights into the nature of this type of intelligent system
(induction from empirical data) are also important.
2. Intelligent Control of Scientific Instruments.
Develop programs for intelligent control of a scientific data-
collecting instrument. The program makes control decisions by
reasoning in terms of a theory of the instrumental and physical
process involved. This is a problem of importance when time, band
width, or other constraints limit the amount of data which can be
gathered for analysis. Candidate instruments are mass spectrometers
and nuclear magnetic resonance spectrometers.
3. Application of artificial intelligence to the task of computer
programming ("automatic programming")
Four subtasks will be pursued: First, the development of a problem-
solving executive (Theta-DENDRAL) to manage and optimize the
strategies and heuristics used by the Heuristic DENDRAL program in
its inference activity; second, the development of techniques for
specifying, and programs for writing, relatively simple list-
processing procedures; third, the development of an automatic
debugging program that employs an ad-hoc theory of a class of program
pathologies to debug ALGOL programs; and fourth, the development of
programs to write simple operating system procedures (e.g.
schedulers).
4. Application of Heuristic Programming to Protein Structure
Determination
Develop heuristic programs that will formulate 3-space models of the
structure of proteins in crystals using data derived from x-ray
crystallography. The derived data is the electron density map. The
expertise of protein chemists in "fitting" complex models to noisy
and incomplete maps will be extracted from the experts and used by
the program, as was done in the development of the Heuristic DENDRAL
Program.
5. Application of Heuristic Programming to Theory Formation in
Climatology.
At a relatively low level of effort, continue discussions with
climatologists aimed at formulating an application in the nature of a
theory-formation assistant to a climatologist. The assistance will
be in the direction of the formation and testing of hypotheses about
climate -- hypotheses framed in terms of concepts and variables that
relate directly to long-range phenomena (climate) rather than short-
range phenomena (weather).
The sections which follow this summary give extended discussions of
each task to be undertaken.
HEURISTIC PROGRAMMING PROJECT
2.1 THEORY FORMATION
Much of Computer Science can be viewed as a series of programming
innovations by means of which we are moving gradually from a position
of having to tell a computer precisely how we want a problem to be
solved to a position of being able to tell a problem-solving program
what we wish done by the computer. But, what the user wants done
always concerns some specific task environment -- some piece of the
real world. For the problem-solving program to interpret for itself
the what, it must have knowledge of the specific task environment and
its behavior. In other words, the program needs some kind of theory
(formal, informal, heuristic, etc.) of that environment in terms of
which to do its problem solving. For example, signals coming from a
mass spectrometer cannot be given physical-chemical explanations
except in terms of a theory of how molecules fragment in such an
instrument.
The problem of knowledge acquisition and structuring by problem
solving systems is crucial, and is perhaps the central problem of AI
research today. In the present period of the contract, we have made
it the main concern of our project. The work on automatic theory
formation is focussed on the development of a program called Meta-
DENDRAL for automatic acquisition of a theory of fragmentation
processes in mass spectrometry. For reasons that are presented in
our last proposal, we were ideally positioned to study this problem
in the context of mass spectrometry. (The reasons have to do with
the pre-existence of the high-performance Heuristic DENDRAL Program,
and the team of specialists that built it.)
Significant work has been done since the start of the present
contract period on the formulation and programming of a few different
experimental versions of Meta-DENDRAL. The preliminary results to
date have been surprising, gratifying, and motivating. We feel that
it is worth our effort, and ARPA's support, to continue this
development. We propose, therefore, two more years of support for
the Meta-DENDRAL effort, after which time we hope the project will
receive full support from the National Science Foundation (NSF) or
the National Institutes of Health (NIH) (Incidentally, Meta-DENDRAL
research is currently being supported 20% by the NIH DENDRAL Grant,
so there is initial recognition in those quarters of its importance).
2.1.1 Meta-DENDRAL Problem Domain
Mass spectrometry is the task domain for the theory formation
program, called Meta-DENDRAL, as it is for the Heuristic DENDRAL
program. It is a natural choice for us because we have developed a
large number of computer programs for manipulating molecular
structures and mass spectra in the course of Heuristic DENDRAL
research and because of the interest in mass spectrometry among
collaborative researchers already associated with the project. This
is also a good task area because it is difficult, but not impossible,
for human scientists to develop fragmentation rules to explain the
mass spectrometric behavior of a class of molecules. Mass
spectrometry has not been formalized to any great degree, and there
remain gaps in the theory, but discovering new explanatory rules and
systematizing them is taking place throughout the country, albeit
slowly.
2.1.2 Meta-DENDRAL Background
We have described the design and partial implementation of the Meta-
DENDRAL program in a paper presented at the 7th Machine Intelligence
Workshop (Edinburgh, Scotland, June, 1972). A copy of that paper has
been submitted to ARPA and should be consulted for details. It will
be published in the proceedings of the conference (Machine
Intelligence 7, B. Meltzer & D. Michie, eds., in press).
The input to the theory formation program is a large collection of
data; the output is a set of rules which explain the data. The input
data are in pairs: molecular structures and their associated mass
spectra. The output rules are explanations of the mass spectra
written in terms of processes governing the fragmentation of
molecules in the mass spectrometer.
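To make the input-output relationship concrete, the following
editorial sketch shows one possible shape for the data: structure-
spectrum pairs going in, candidate fragmentation rules coming out.
The field names and the ethanol-like example values are illustrative
assumptions, not the project's actual representations.

    from dataclasses import dataclass
    from typing import Dict, List

    @dataclass
    class StructureSpectrumPair:
        structure: str                 # e.g. a linear notation of the molecule
        spectrum: Dict[int, float]     # mass/charge -> relative intensity

    @dataclass
    class FragmentationRule:
        bond_pattern: str              # subgraph in which cleavage occurs
        mass_shift: int                # mass lost by the charged fragment
        evidence: List[str]            # structures that support the rule

    # Values below are rough stand-ins, for illustration only.
    training_data: List[StructureSpectrumPair] = [
        StructureSpectrumPair("CH3-CH2-OH", {31: 100.0, 45: 40.0, 46: 20.0}),
    ]
    candidate = FragmentationRule("C-C adjacent to OH", 15, ["CH3-CH2-OH"])
    print(candidate)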
2.1.3 Relation of Theory Formation to Automatic
Programming
Automatic program formation follows many of the same threads as
automatic theory formation. Both tasks can be viewed as inductive
inference problems from experimental data to explanatory hypotheses.
Constructing a program which generates specified outputs for given
inputs is, in a sense, explaining how to get the output from the
input. The program constitutes a theory, in these terms, with some
programs preferred over others for reasons of simplicity, generality,
elegance and the like.
2.1.4 Research Trajectory
The following subgoals have guided our research along one dimension,
although we have often been forced to consider other dimensions of
the problem.
(1) Collect a suitable set of known mass spectra together with
representations of the molecular structures from which the
spectra were derived. These are the input-output pairs to the
"black box" which we are trying to understand. In our case this
box is an analytical instrument for the chemists.
(2) Summarize and interpret the data with respect to possible
explanations of the individual data points. This re-
representation of the data is a critical step in extracting
explanatory rules, for the data points are, for the first time,
associated with possible mechanistic origins ("causes").
Ordinarily, this is such a tedious task that chemists are forced
to limit their analysis to a very few possibilities; the program, by
contrast, systematically explores the space of possible mechanisms
and collects evidence for each.
(3) Peruse the summary to make plans for intelligent rule formation.
Any of the possible mechanisms described in the summary-
interpretation phase could be incorporated in a rule of mass
spectrometry. But planning will allow the rule formation program
to start with explanatory rules which are likely to make good
reference points for the whole rule formation process. In a
planning program currently being implemented by Mr. Steven Reiss,
the computer peruses the summary looking for mechanisms with
"strong enough" evidence to call them first-order rules of mass
spectrometry. Our criteria for strong evidence may well change
as we gain more experience. For the moment, the program looks
for mechanisms which (a) appear in almost all the compounds (80%)
and (b) have no viable alternatives (where viable alternatives
are those alternative explanations which occur frequently and cannot
be disambiguated); a sketch of this selection criterion appears after
this list.
(4) Incorporate the possible mechanisms into general rules (rule
formation). By bringing more and more of the descriptive
mechanisms under rules, the rule formation program explains more
and more of the original data points. This is difficult for many
reasons, however. For instance, the rules must be general enough
to avoid writing a new rule for each data point. Yet there are
numerous ways of generalizing rules, with few prospective
guidelines to focus attention on the elegant generalizations
which explain many data points simply.
We have experimented with various AI strategies for rule
formation. We are actively programming a data-guided
construction of rules. Mr. Carl Farrell is pursuing a heuristic
search, generative approach to rule formation in his Ph.D. thesis
work. We have also tried standard statistical inference
techniques with some success.
(5) Evaluate the rules to decide retrospectively whether each
proposed rule is worth keeping or not. If so, it may be further
modified in light of more data. If not, it will be discarded in
favor of rules which are simpler, explain more data, or are
otherwise better suited for incorporation into the emerging
theory.
(6) Codify the rules into a theory. Although a set of
phenomenological rules can predict the mass spectral behavior of
the class of molecules, further codification is needed to
increase the explanatory power of the rules. This may mean
something as "simple" as collapsing rules or subsuming rules
under one another. Or, at a deeper level, it may mean finding
relationships and principles which explain why the
phenomenological rules are good predictors.
(7) Finally, it will be necessary to compare alternative theories (at
whatever level) that come out of the program in order to choose
the best one. Part of this research means experimenting with
different criteria of "best" theory. Although the philosophical
literature is full of suggested criteria, no one has ever tried
to make them precise enough for use in a program.
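The following editorial sketch illustrates the evidence-threshold
criterion of step (3): a mechanism is promoted to a first-order rule
when it appears in roughly 80% of the compounds and has no frequently
occurring, undisambiguated alternative. The thresholds, field names,
and sample mechanisms are assumptions made only for this
illustration, not the project's program.

    def first_order_rules(mechanisms, n_compounds,
                          coverage_threshold=0.80, alternative_threshold=0.50):
        """Select mechanisms with 'strong enough' evidence.

        mechanisms -- list of dicts with keys:
          'name', 'compounds' (set of compound ids in which it appears),
          'ambiguous_alternatives' (set of compound ids where an
           alternative explanation occurs and cannot be disambiguated)
        """
        selected = []
        for m in mechanisms:
            coverage = len(m['compounds']) / n_compounds
            alternative_rate = len(m['ambiguous_alternatives']) / n_compounds
            if (coverage >= coverage_threshold
                    and alternative_rate < alternative_threshold):
                selected.append(m['name'])
        return selected

    mechanisms = [
        {'name': 'alpha-cleavage', 'compounds': set(range(17)),
         'ambiguous_alternatives': {3, 9}},
        {'name': 'rare-rearrangement', 'compounds': {1, 2},
         'ambiguous_alternatives': set()},
    ]
    print(first_order_rules(mechanisms, n_compounds=20))   # ['alpha-cleavage']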
2.1.5 Progress
Meta-DENDRAL has progressed in the last year within several of the
problem areas mentioned above. The paper presented at the Machine
Intelligence 7 Workshop describes much of our progress in mapping out
a detailed strategy for attacking the problem.
The computer programs written for interpreting and summarizing large
collections of data have already proved useful. For one set of
chemical compounds, the estrogenic steroids, the programs have
selected mass spectrometry rules which are nearly the same as the
rules an expert mass spectroscopist arrived at from the same data.
An intermediate data summary produced by the program was used by the
expert for two reasons: (1) it was much more complete than any he
could have produced, and (2) it saved him several months' work. The
rules produced from this data summary by the chemist and by the
program still require further codification to become elevated to the
status of theory. However, both the chemist and the performance
programs can use knowledge in this form for routine work.
2.1.6 Use of Multiple Explanatory Paradigms
We propose to develop programs for choice among different explanatory
frameworks, or paradigms, within the context of the Meta-DENDRAL
research. For the subtask of rule formation alone our investigations
have produced several paradigms varying in sophistication from
statistical analyses and model parameter fitting to generative
heuristic techniques and model construction. The paradigms vary in
their attention to detail, choice of primitive entities, scope for
introduction of heuristic assumptions about the task, and the range
of anticipated results. Whereas none of the methods appears
sufficient to handle the problem, each method bears promise of
generating useful information.
We are going to study carefully the development of a program that
treats these paradigms as alternative methods to pursue and that
organizes its activity according to the needs of each method. Such
design problems are inextricably enmeshed in problems of
representation and information exchanges between paradigms.
HEURISTIC PROGRAMMING PROJECT
2.2 INTELLIGENT CONTROL OF INSTRUMENTS: THE SELF-DIRECTING
ANALYST (SDA)
For more than a decade, computers (usually small) have been used to
acquire and interpret signals from real-world sensors and
instruments, and to make simple process control decisions. In
general, data acquisition and control programs operate with a model
neither of the external process being sensed nor of the sensor or
instrument. They therefore have little or no capability for behaving
adaptively with respect to the data-generating process or the
data-capturing device.
For two years, we have been planning for, and taking initial steps
toward, a "second generation" of computer control, in which the
control program: reasons in terms of a model of the domain being
sensed; makes real-time (or almost real-time) analyses of data
therefrom; uses the results to guide further manipulations and
information gathering; and attempts to optimize the use of the
(necessarily limited) capabilities of the instrument with respect to
the particular data stream being observed.
The parallel between this view of "intelligent control" and the AI
research in Robotics is quite evident. Note, though, the difference
in the quality of the task. The world of (say) the Stanford robot is
a "common sense" world in which a great deal of variability is to be
expected, tolerated, and coped with; but the problems being solved
are relatively simple. The world of instruments and sensors that we
plan to deal with is more tightly constrained and parameterized; even
moderate deviations from expectations are not to be tolerated; but
very complex problem solving behavior is to be called for in near
real-time.
We propose that ARPA support part of our project's effort toward
intelligent control of gas chromatograph-mass spectrometer (GC/MS)
and nuclear magnetic resonance (NMR) instruments. This is pilot work
aimed at a future when most scientific instruments, as well as the
sensors and instruments of industrial and military users, will be
controlled automatically by intelligent programs. We have chosen to
work in the limited world of GC/MS and NMR instruments because:
a. we have developed problem-solving programs that do a creditable
job of MS and NMR analysis.
b. we are affiliated with some of the world's experts in this type
of instrumentation, and they are currently eager to take this
step.
c. NIH and NSF have already made large grants to Stanford and our
project for acquisition of GC/MS and NMR instruments of the most
modern and appropriate genre. NIH is likely to grant us a
computer of sufficient power to do the job well, and the
resources to interface the instruments.
d. we have in mind an important application of intelligent control
to biochemical analysis and health care, that will serve to focus
and motivate the work.
The spectroscopic methods (mass- and NMR spectroscopy) currently
utilized in the DENDRAL Project have one common feature; in each
case, the data are gathered from a complex electronic instrument
which is capable of performing many different, related experiments.
Frequently, one such experiment is not informative enough to solve
the analytic problem at hand, yet seldom is it practical to gather
all possible data for a particular sample. The scientist must use
his knowledge and intuition to select those experiments which will,
with a minimum of effort, specify the solution. There is no reason,
though, why a computer program could not be written for making and
carrying out these decisions.
The initial problem which arises in the construction of such a
program is the one of instrument/computer interfacing. We have, in
our Chemistry Department, an NMR Spectrometer whose major functions
are already controlled via a mini-computer which also handles some of
the data-reduction tasks associated with Fourier-transform NMR. We
have received support from NIH for some aspects of the computer
control of our mass spectrometer, and have requested funds from NIH
for additional hardware to give the interface even greater
flexibility. The primary goal of such control is a computerized gas
chromatograph/mass spectrometer (GC/MS) system, probably the most
powerful analytic tool available to the chemist.
Thus, on the one hand, we have the Heuristic DENDRAL Program, capable
of relating MS and NMR data to chemical structure, and on the other,
MS and NMR instruments which are or will be computer controlled. We
feel that the time is ripe to investigate the software link, which we
shall call the self-directing analyst (SDA), between the two. The
purpose of the SDA is to direct the overall analysis, selecting at
each stage the most promising experiment.
As we currently conceive of it, the SDA is best viewed as containing
two levels of operation. The lower level should be concerned with
the routine aspects of instrument control, aspects such as machine
maintenance, specific instrument settings required for particular
experiments, data reduction, etc. This level is necessarily quite
instrument-dependent, and would probably not require any really
innovative research from a computer science standpoint. The main
purpose of this level of operation would be to upgrade the machine to
a "super-instrument" which could accept general functional commands
and return reduced (and perhaps partially analyzed) data. Many of
the features of this level are already incorporated into the NMR and
proposed GC/MS systems.
The higher level of operation of the SDA is the one which presents
interesting and challenging computer science problems. At the
minimum, this level must contain models of both the available "super-
instruments" (including factors of cost, reliability, information
content and time-scale for each type of experiment) and of the
DENDRAL analysis programs. It should be able to accept partial
solutions from DENDRAL and, using the latter models, attempt to
ascertain whether a serious ambiguity exists. If so, it should use
these models to identify the nature of the information needed to
resolve it. Using its "super-instrument" models, it would then need
to select the most practical approach to obtaining this information.
Once such a strategy had been developed, the instructions could be
forwarded to the lower-level control.
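As an editorial illustration of the upper level just described, the
sketch below accepts a set of candidate structures from the analysis
program, treats more than one surviving candidate as a serious
ambiguity, and chooses the experiment with the best cost-to-
discrimination ratio from a table of instrument models. The
instrument names, costs, and discrimination counts are invented for
the example.

    def choose_experiment(candidate_structures, instruments):
        """Upper-level SDA step: if the partial solution is ambiguous,
        pick the cheapest experiment expected to discriminate the
        remaining candidates."""
        if len(candidate_structures) <= 1:
            return None                       # no serious ambiguity remains
        viable = [i for i in instruments
                  if i['discriminates'] >= len(candidate_structures) - 1]
        if not viable:
            viable = instruments              # fall back: gather what we can
        return min(viable,
                   key=lambda i: i['cost'] / max(i['discriminates'], 1))

    instruments = [
        {'name': 'high-resolution MS', 'cost': 5.0, 'discriminates': 2},
        {'name': 'NMR, proton',        'cost': 2.0, 'discriminates': 1},
        {'name': 'GC retention time',  'cost': 0.5, 'discriminates': 0},
    ]
    ambiguous = ['candidate-A', 'candidate-B', 'candidate-C']
    best = choose_experiment(ambiguous, instruments)
    print(best['name'] if best else 'analysis complete')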
2.2.1 Application to Scientific Work
The payoff from intelligent control for scientific and technological
applications is potentially enormous. First of all, there are
control situations in which it is difficult, inappropriate, or
actually impossible for a human controller to function properly.
Second, it is commonly the case that a human controller cannot
respond fast enough and/or with enough problem solving capability to
optimize the use of the instrument with respect to the data stream
being controlled and the decision problem at hand. For example,
often far more data are captured than are "logically" needed to solve
the immediate analysis problem (but this situation is not known until
the ex-post-facto analysis is done). The effective capability of the
instrument is thereby degraded.
The application to biochemistry and health care that is being
prepared is the control of the GC/MS analysis of compounds present in
human urine. Urine is a mixture that contains small samples of
hundreds of compounds whose identification has relevance for the
prevention, diagnosis, and treatment of disease. But the process of
making identifications is laborious and difficult by traditional
techniques, so that in practice very few of the compounds are
analyzed and identified. The high speed of GC/MS analysis, and its
ability to deal with small samples, offers great promise. But the
number of compounds present in urine makes infeasible the complete
analysis of even a small number of urines each month when the MS data
are collected in the usual way and the analyses are performed by a
human GC/MS analyst. The analysis of hundreds (later thousands) of
urines per month is the goal, and we believe that it can be done only
with automated intelligent control.
As noted above, we have much of the basic equipment needed in this
investigation. The interdisciplinary nature of this project gives us
a pool of expertise covering the fields of GC/MS, NMR, instrumental
research, chemistry and artificial intelligence upon which to draw.
Most of the research will consist of programming the upper level of
the SDA. As we have outlined the system, it is amenable to
subdivision so that small portions could be treated independently.
For example, we might work first on the "ambiguity-recognizing"
portion of the upper-level SDA, while simulating by hand the other
portions of the program. In this way, useful partial systems could
be constructed along the way, each of which accepts a greater portion
of the chemist's tasks.
HEURISTIC PROGRAMMING PROJECT
2.3 APPLICATION OF HEURISTIC PROGRAMMING TO THE TASK OF
COMPUTER PROGRAMMING
2.3.1 Automatic Programming
The search for areas of application of heuristic programming to
science has led us to focus on the area of scientific endeavor in
which we are our own experts, computer programming. The currently
fashionable term for this kind of work is "automatic programming".
We view the work as the application of heuristic programming
techniques to computer programming tasks. Our previous work on
DENDRAL orients us naturally to consider the study of automatic
programming as the integration of knowledge and expertise about
computer programming and the application of this information to real
programming tasks.
2.3.2 Meta-DENDRAL, Theta-DENDRAL, and Computer Operating
Systems
As explained in a previous section, the Meta-DENDRAL Program's
behavior can be viewed as automatic programming. (We are given
examples of the input-output (structure-spectrum) pairs from which we
wish to infer the program (theory) which produces them.) Similarly,
another set of ideas emerging from the DENDRAL research bears
strongly on the automatic programming problem. These we collectively
label Theta-DENDRAL, and we propose to push their realization.
Theta-DENDRAL is conceived as a problem-solving executive which is
knowledgeable about all of the computer program resources applicable
to a particular problem, and selects an appropriate strategy to use
these resources to solve the problem. The output of Theta-DENDRAL is
essentially a plan that organizes and controls the other task-
oriented DENDRAL programs; the executive may thus also be viewed as a
form of scheduler in a computer operating system. Theta-DENDRAL is,
in a sense, an extreme form of
real-time program optimization, where the process being optimized is
the problem-solving activity of Heuristic DENDRAL.
2.3.3 List-Processing
We propose to study the automatic generation of list-processing
programs. In the first year we hope to explore synthesis techniques
for simple programs such as dotted-pair reverse, list-reverse, list-
sort, list-flatten, permutations of a list, and possibly up to the
level of the Unification Algorithm. After the ground-work is laid in
the first year we will consider more performance-oriented problems,
possibly in the area of graph-traversal, and graph matching problems.
This application will encounter the usual problems of symbolic
problem solving, as well as certain problems that rarely have been
dealt with in other applications. Probably the most difficult
immediate problem is how to specify a programming task. Our initial
studies indicate that there is no unique, best method for program
description. Some programs seem best described by just giving the
program itself. In other cases, a complex program can be easily
specified by simple formal input-output relations. Two less rigorous
but quite useful program specifications used by people are: giving
sample input-output pairs; or, giving a trace of successive values of
variables (a partial state-description) during a computation without
giving the program itself. Graphical descriptions are easiest for
people in some cases, and English language description is sometimes
best. Initially we will take an eclectic viewpoint and allow
multiple techniques for program specifications.
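For concreteness (an editorial sketch, not a committed design), the
fragment below shows the simplest of these specification methods: a
task given only as sample input-output pairs, against which a
candidate program can be checked. The task (list reversal) and the
hand-written candidate are chosen arbitrarily.

    # Hypothetical specification of "list-reverse" by input-output pairs.
    specification = [
        ([], []),
        ([1], [1]),
        ([1, 2, 3], [3, 2, 1]),
    ]

    def candidate_program(xs):
        """A candidate a synthesizer might propose (here written by hand)."""
        out = []
        for x in xs:
            out = [x] + out
        return out

    def satisfies(program, spec):
        """Does the candidate reproduce every specified output?"""
        return all(program(list(inp)) == expected for inp, expected in spec)

    print(satisfies(candidate_program, specification))    # True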
Other problems are (1) developing sufficient knowledge about the
program-writing process to allow us to implement a system; (2)
selecting the right mix of formal and informal methods to write
satisfactory programs; (3) developing a good model of what program
the user has in mind (perhaps the hardest problem); and (4)
implementing such a complex system.
We will initially implement our system in the QA4 programming
language and will move to more efficient lower-level languages as
necessary for speed-ups.
This area of automatic programming is quite complex, and it is not at
all certain how the field will or should develop. We will gain a
better understanding of this within the first year of this project.
2.3.4 Automatic Debugging
This research will be concerned with the modeling of a programmer's
behavior during the debugging of a program. The term "debugging" is
currently used in three contexts. First, it is used to describe
aids, supplied by a programming system, which provide the programmer
clues as to his program's actual behavior. Obvious examples of such
debugging aids are dumps, traces, profiles and run-time error
messages. Second, it is used to describe the activity of the
programmer in comparing his model of the program's intended behavior
with his emerging model of its actual behavior and in resolving the
differences between the two models. Third, it is used to describe
aids, again supplied by the programming system, to alter either the
program or its behavior or both. Text editors are perhaps the
simplest of these; interactive facilities which permit the programmer
to reset variables and otherwise alter the state of the computation
are more sophisticated examples.
This research will focus mainly on debugging in the second sense. We
will try to discover strategies for finding and fixing common
programming errors. We will attempt to determine what information
concerning the program's behavior is most relevant to the debugging
task. We will implement a prototype automatic debugging system which
illustrates and partially validates these discoveries.
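The following editorial sketch gives a toy rendering of debugging in
this second sense: the programmer's intended behavior is expressed as
an expected trace of a variable, the actual trace is collected from a
deliberately buggy program, and the first point of disagreement is
reported. All names and the example bug are invented.

    def buggy_sum(xs):
        """Running sum of xs, with a deliberate bug: values equal to 3 are
        skipped."""
        total, trace = 0, []
        for x in xs:
            total = total + x if x != 3 else total
            trace.append(total)
        return total, trace

    expected_trace = [1, 3, 6, 10]       # intended running sums of [1, 2, 3, 4]
    _, actual_trace = buggy_sum([1, 2, 3, 4])

    def first_divergence(expected, actual):
        """Localize the first step where actual behavior departs from the
        programmer's model of intended behavior."""
        for step, (e, a) in enumerate(zip(expected, actual)):
            if e != a:
                return step, e, a
        return None

    print(first_divergence(expected_trace, actual_trace))   # (2, 6, 3)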
Methods used to achieve these goals will include observation of
programmers doing debugging, Gedanken experiments, and experiments
with the prototype debugger. It should be noted that a close
parallel exists between the proposed research goals and methods of
this work and those of the DENDRAL effort. In both cases, research
in the task area and research on the computer implementation of the
model being generated proceeded together. It is hoped that the
synergistic effect which such simultaneous research produced for
DENDRAL will also speed this work. In both cases, also, the paradigm
is the same--the evoked behavior of a complex "nature" is interpreted
(understood) in terms of a model of that "nature". In the case under
discussion here, the "nature" is a computer program with pathologies,
and the model is used to infer hypotheses about the nature of the
bugs.
2.3.5 Automatic Program Formation in Operating Systems
This research will focus on automatic program formation within the
context of a specific task, the programming of operating systems.
In another area of application (mass spectrometry), the success of
DENDRAL can be attributed to: 1) the careful choice of specific
problem domain; 2) the accumulation of expertise about the task and
the translation of this expertise into heuristic rules for guiding
search. The choice of operating systems (o.s.) programming as a
problem domain for this automatic programming project is based on the
same criteria of choice of a manageable problem domain and
availability of experts. Further motivations for this choice, rather
than others, come from the following observations: 1. With the
increasing complexity of operating systems, better tools for
describing and generating o.s. codes are needed. 2. The problem
domain, though complex, is small enough and specific enough so that
some solution seems possible. More specifically, an operating system
can be described in terms of a set of processing functions, together
with constraints on this processing. Since functions can be
performed by procedures, functional description is a good starting
point in tackling the general problem of transforming descriptions
into algorithms. 3. We computer scientists are our own experts in
determining what is involved in the process called programming, as
well as on the requirements of operating systems.
To be able to build operating systems we need (a) a method of
describing in detail what features are required in a particular
system; (b) problem reduction schemes which resemble the processes of
experts writing operating systems; and (c) methods of integrating (a)
and (b) resulting in codes which perform the required functions.
We plan to use hierarchy as a guiding philosophy in the building of
complex systems. However, the top-down approach most frequently used
in constructing hierarchies is not sufficient for building operating
systems. It must be supplemented by intelligent search for specific
machine codes which match the desired descriptive behavior at the
lowest level. For this purpose heuristic rules will be established
to guide the search. We will build much of our research on works
done so far on hierarchic systems by Simon and Dijkstra and on a pool
of available knowledge on the semantics of operating systems.
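As a purely illustrative sketch of moving from functional description
to code, the fragment below matches a description of one operating
system component (a scheduler), stated as required features, against
a small library of candidate procedures characterized by the features
they provide. The feature names and library entries are assumptions
for the example.

    scheduler_library = {
        'round_robin':    {'preemptive', 'fair', 'simple'},
        'priority_queue': {'preemptive', 'respects_priority'},
        'fcfs':           {'fair', 'simple'},
    }

    def select_scheduler(required_features, library):
        """Return candidates whose features cover the required description."""
        return [name for name, feats in library.items()
                if required_features <= feats]

    description = {'preemptive', 'fair'}          # the "what", not the "how"
    print(select_scheduler(description, scheduler_library))   # ['round_robin']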
HEURISTIC PROGRAMMING PROJECT
2.4 APPLICATION OF HEURISTIC PROGRAMMING TO PROTEIN STRUCTURE
DETERMINATION
During the current period we have continued to explore new
applications of AI concepts and techniques. A set of criteria was
established to guide our search for a reasonable, appropriate and
potentially fruitful application (see appendix). While keeping the
search open-ended, we have focussed attention on two task domains
which appear promising in light of our criteria. The two
applications, theory formation in climatology, and protein structure
determination, are reflections of our continuing interest in problems
of scientific inference. The remainder of this section discusses our
current thinking about these two areas and our plans for further work
on them in the next research period.
We have used current ARPA funds to identify and formulate this
application, as indicated in our previous proposal. We are now ready
to seek support from another agency more directly concerned with the
substance of the discipline. We have applied for some of the
necessary resources to the National Institutes of Health, and plan to
seek additional support from the National Science Foundation. We are
seeking continued ARPA support of this endeavor only up to the time
that the expected alternative support is received.
2.4.1 Problem Description
This project, now in its formulation stage, will be (like its cousin,
the DENDRAL project) an interdisciplinary effort involving chemistry
and artificial intelligence research. Knowledge of protein
chemistry, techniques in the art of structure modeling, and heuristic
programming techniques will be brought together to develop a computer
program that infers plausible "first guess" structures of protein
molecules in crystals, given amino acid sequences and electron
density maps (derived from x-ray crystallographic data). As with the
DENDRAL algorithm, the combinatorics of generating solution
candidates makes a brute-force generating approach infeasible.
Expert protein chemists doing this work reduce the search for a
solution to manageable proportions by bringing to bear chemical
knowledge of various kinds and many "tricks of the trade"
(heuristics) that simplify the problem.
The research problem will be:
a. To discover from expert chemist collaborators the knowledge and
the heuristics routinely employed in the solution of these problems,
and to formalize all of this as computer data structures and
heuristic procedures.
b. To discover a program organization and a set of representations
which will allow the knowledge and the heuristics to cooperate in
making the search efficient, i.e., generating plausible candidates in
a reasonable time. (This, we believe, is the central theme of
artificial intelligence research today.)
c. To implement the above in a computer program, the competence of
which will be such as to have a noticeable impact upon the discipline
of protein chemistry, much as the DENDRAL program has had an impact
upon mass spectrometry.
The "expert chemist collaborators" mentioned above will be members of
the x-ray crystallography research group of the Chemistry Department
of the University of California, San Diego. The main "knowledge
source" will be Dr. Jens Birktoft, a highly regarded protein model
builder.
2.4.2 Objectives
As presently formulated, the research has two objectives (one near-
term, one long-range) listed below in order of difficulty:
a. Small Structure Inference
Develop a computer program which infers the three dimensional
structure of a segment of a protein molecule, given x-ray
crystallographic data. Knowledge of the amino acid sequence and/or a
starting point in the electron density map, may or may not be given.
The program will employ specific knowledge of protein structure,
geometrical constraints and other special heuristics to infer the
structure.
b. Large Structure Inference
Develop a computational system which infers the three dimensional
structure of an entire protein molecule which best explains data from
x-ray crystallography and other sources. The output of the program
will be a table of atomic coordinates of the total molecule. As much
task-specific knowledge as possible will be incorporated in this
program including established rules of protein chemistry, geometrical
constraints, and the experts' rules of thumb.
A number of other groups, particularly at Columbia University,
Princeton University, and Washington University, have been applying
computer processing to assist in protein structure modeling. This
type of model may be characterized as hypothetical, in which atomic
co-ordinates are produced as a function of various structural
parameters, e.g. the dihedral angles phi and psi, these parameters
being directly controlled by the user with a view to surveying a
variety of possible conformations. In contrast to this approach, we
are proposing a type of model building which is exploratory, i.e.,
sequence information and a density map are supplied and the program
autonomously builds models which fit the data.
2.4.3 Program Construction
We anticipate that the procedure for constructing the automatic model
building program defined above will contain many similarities to the
development of the Heuristic DENDRAL program.
From the standpoint of artificial intelligence, the DENDRAL Project's
goal has been studying scientific inference, and systematizing that
process sufficiently to allow a computer to make intelligent
decisions in solving real inferential problems. Although the work to
date has been carried forward in the context of mass spectrometry, a
considerable body of general experience and specific techniques have
been developed which would facilitate the investigation of other
inferential procedures. For example, (a) the methods employed in
eliciting a theory from an expert, so vital to the construction of
the planning and predictive phases of the Heuristic DENDRAL program,
are of general utility; (b) the global organization of the Heuristic
DENDRAL program into the three areas of planning, generating
hypotheses, and testing facilitates the incorporation of new knowledge
and/or heuristic methods in the program, and establishes a
parallelism with human scientific problem solving techniques; (c)
another design principle has been the use of a unified theory in all
phases of the program, in order to allow for rapid yet consistent
changes to the program as the theory is modified; (d) recently the
concept of table-driven programs has been implemented and found to
increase the program's flexibility by separating the theory from the
routines which use it.
2.4.4 Fundamental Research Problems
a. Representation of Information
Protein chemists convey their knowledge of molecular structure to
their colleagues by using an assortment of representations: 3-D stick
models, space-filling models, schematic diagrams, stereoscopic
drawings of the main chain, a list of all atoms (excluding hydrogens)
in the molecule with their Cartesian coordinates, a list of the amino
acid sequence with values of the dihedral angles which define the
relative orientation of the amide planes and side chains, etc.
Moreover, some heuristics used in building a structure are stated
succinctly in one representation, some in another. A fundamental
research question is whether to express all knowledge of the task in
a single representation or to allow for a plurality of
representations, translations between which are performed internally
as needed by the computer program. The transformation functions
which map one representation into another may not always exist
because of inherent differences in the type of information they use.
Questions of the feasibility of using multiple representations and of
the existence and/or utility of a universal representation will be
explored.
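The representational question can be made concrete with the following
editorial sketch: a table of available converters between
representations, and a test of whether any chain of conversions
reaches a desired target. The representation names are hypothetical
and the converters are stubs, included only to show the structure of
the question.

    converters = {
        ('dihedral_angles', 'cartesian_atoms'):
            lambda model: {'atoms': '...computed from angles...'},
        ('cartesian_atoms', 'stick_model'):
            lambda model: {'sticks': '...drawn from atoms...'},
        # note: no converter back from 'stick_model' -- information is lost
    }

    def can_convert(src, dst, converters):
        """Is there any chain of available converters from src to dst?"""
        frontier, seen = [src], {src}
        while frontier:
            rep = frontier.pop()
            if rep == dst:
                return True
            for (a, b) in converters:
                if a == rep and b not in seen:
                    seen.add(b)
                    frontier.append(b)
        return False

    print(can_convert('dihedral_angles', 'stick_model', converters))   # True
    print(can_convert('stick_model', 'cartesian_atoms', converters))   # False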
b. Strategies for Employing Heuristics
The central paradigm of artificial intelligence research is heuristic
search. A tree of "tries" (aliases: subproblems, reductions,
candidates, solution attempts, alternatives-and-consequences, etc.)
is sprouted (or sproutable) by a generator. Solutions (variously
defined) exist at particular (unknown) depths along particular
(unknown) paths. To find one is a "problem". For any task regarded
as nontrivial, the search space is very large. Heuristics are
applied to direct search, to limit search, to constrain the sprouting
of the tree, etc. A fundamental problem in the design of the protein
modeling program is representing the task as a procedure in which the
given heuristics may be applied with maximum effectiveness.
For example, in the context of the proposed research, one kind of
constraint may be at the atomic level - e.g., atoms cannot overlap -
requiring the knowledge of atomic coordinates to see if the
constraint is satisfied. Another constraint may be at a more
macroscopic level - e.g., knowledge that a specific part of the
protein is an alpha-helix with a given orientation. It will be
necessary to explore alternative ways to best utilize these
heuristics. For example, one might build the molecule by aggregating
individual atoms, or by first identifying macro-structures,
successively locating smaller and smaller sub-structures and finally
determining atomic coordinates.
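An editorial sketch of this search organization follows: partial
models are extended step by step, an atomic-level constraint (no two
atoms closer than a minimum separation) prunes physically impossible
extensions, and a macro-level heuristic (preference for placements in
high electron density) orders the remainder in a best-first search.
The one-dimensional "model", the density map, and all thresholds are
stand-ins invented for the illustration.

    import heapq

    def overlaps(positions, new_pos, min_separation=1.0):
        """Atomic-level constraint: reject placements that collide."""
        return any(abs(p - new_pos) < min_separation for p in positions)

    def heuristic(positions, density_map):
        """Macro-level guidance: prefer placements near high density."""
        return -sum(density_map.get(round(p), 0.0) for p in positions)

    def build(n_atoms, candidate_positions, density_map):
        """Best-first construction of a partial model of n_atoms placements."""
        frontier = [(0.0, [])]
        while frontier:
            score, positions = heapq.heappop(frontier)
            if len(positions) == n_atoms:
                return positions
            for pos in candidate_positions:
                if not overlaps(positions, pos):
                    extended = positions + [pos]
                    heapq.heappush(
                        frontier, (heuristic(extended, density_map), extended))
        return None

    density = {0: 0.9, 2: 0.8, 5: 0.7, 7: 0.1}
    print(build(3, [0.0, 2.0, 5.0, 7.0], density))   # [0.0, 2.0, 5.0]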
2.4.5 Work Plan
The primary objectives of the first year's work are to establish the
program architecture, and to write enough of the program to permit
building simplified segments of a protein molecule which conform to
the given data and the search heuristics. The program should be able
to work in the absence of the amino acid sequence or the starting
point on the electron density map. Tasks of a more peripheral
nature, such as CRT display of molecular structures, large data
storage problems, or implementation of the program on a computing
network will probably be deferred beyond the first year. A suggested
set of steps, not necessarily sequential, toward the first year's
objective is the following:
1. First attempt. Write a simple program which can output atomic
coordinates of the atoms in a relatively small protein segment,
given the electron density distribution and amino acid sequence.
This program would serve more to illustrate the problems to be
faced than to solve them. The first attempt would be
characterized by (a) a single representation; (b) a small body of
heuristics; and (c) a convenient programming language, not
necessarily the final choice.
2. Reassessment. Each part of the simple program would be examined
critically to expose its major weaknesses and limitations. In
particular one would want to see at this point whether the design
choices made in step 1 were adequate not only for the simple test
cases but for more complex and realistic structure building tasks
(e.g., amino acid sequence not given), with their concomitant
body of heuristics.
3. Modification. Parts (and perhaps even all) of the first program
will be rewritten and tested against an expanded set of test
cases. Experiments with different representations, different
data structures, and/or a different programming language will be
made towards optimizing program design.
4. Extension. Once the principal design choices have been made, the
program will then be extended in whatever ways are required to
solve real structure building tasks, with larger molecular
segments. A considerably larger body of heuristics will be
incorporated at the level of data-specific constraints than were
used in the prototype programs, and more than one representation
of rules may be required.
HEURISTIC PROGRAMMING PROJECT
2.5 APPLICATION OF HEURISTIC PROGRAMMING TO THEORY FORMATION IN
CLIMATOLOGY
The general task is to develop a predictive theory of climate, i.e.,
to find a set of relationships among primitive climatic variables.
Given the background we have developed in this problem during the
present period, we propose to continue work toward the formulation of
(and perhaps begin the programming of) an application of heuristic
programming to this task.
Good simulation models have been developed for predicting atmospheric
circulation, but they suffer from inherent instabilities which limit
their predictive power to a relatively short time scale for climate
studies. Theory verification is still possible but requires great
quantities of processing time on the fastest available computers.
Real data exists in the form of detailed, world-wide meteorological
measurements over the past 15 or 20 years, sea surface temperature
data over a comparable time period, other instrumental observations
over scattered portions of the globe during the past three hundred
years, and indirect evidence of climatic conditions over thousands of
years from tree rings, sea and lake bed cores, ice and snow covers,
etc.
A potentially interesting task for the information processing
theorist is the systematic exploration of cause-effect relationships
which determine recent climatic fluctuations, using the above
mentioned data and general physical principles as the corpus of
knowledge from which the relationships are inferred. As there is no
theory of climate per se, it would be presumptuous to launch into a
fully automated theory formation program. A more reasonable goal for
the present is to construct a theory-forming assistant for the
climatologist which can systematically generate relationships among
climate variables and test their validity.
2.5.1 Problem Formulation
a) There is a fairly common representation in use in the field, e.g.,
climate is generally discussed in terms of average temperatures,
humidity, rainfall, ice cover, etc., during certain periods of time
at certain locations. The formal language now used for describing
entities within this representation, however, is that of partial
differential equations, which one wishes here to avoid. We would
like to apply a different formal language, namely a formalization of
the natural language of the climatologist.
b) There is at least the suspicion that a theory of climate can be
composed from a conjunction of mini-theories and/or primitive terms,
thus making the theory formation task amenable to a heuristic search
paradigm. However, the primitives and mini-theories themselves have
never been fully articulated.
2.5.2 The Knowledge Base
a) A theory of climate per se does not exist; that is the ultimate
goal of this task. A "meta-theory", i.e., a plan for building such a
theory, can be formulated with about the same degree of precision as
Meta-DENDRAL.
b) We do not have an expert right at hand; the ones we know best are
several hundred miles away. Those experts are enthusiastic about the
task and see the payoff clearly. A computer scientist with some
knowledge of the domain is available at Stanford; a climatologist
with some knowledge of AI has not, however, been identified.
c) The size of the knowledge domain, D, is, at first glance,
enormous, judging from the quantity of published material. The
relevant body of knowledge, however, is likely to be manageably
small, on the same order of complexity as mass spectrometry theory.
It is hard to say anything about the decoupling of D from other
universes of knowledge (e.g., is macro-climate decoupled from micro-
climate? from plasma physics? from bio-dynamics?). One will have to
insist on decoupling D in order to make any progress at all.
2.5.3 Problem Difficulty
The problem is clearly a very difficult one, because practicing
climatologists have made little headway themselves in developing a
theory of climate. There is some feeling, however, that substantial
progress could be made if new tools were available to analyze, in an
exhaustive and systematic fashion, the large volume of data in
conjunction with a proposer of explanatory hypotheses. It is not
clear how to structure the task so that progress can be made
incrementally, although the plan used by Meta-DENDRAL seems
applicable: at the least, we would start with a data interpretation
program. The problem is clearly ill-structured enough to be a
heuristic programming application.
2.5.4 Resources
During the early stages of investigating this area, there appeared to
be no problems with regard to personnel and technical resources. The
combination of the RAND Climate Dynamics staff, under Dr. Robert
Rapp, and the RAND Computer Science staff, under Dr. Keith Uncapher,
provided us with a source of highly motivated experts who could
supply both theoretical and software support. Recent organizational
changes, however, have split these two groups and exacerbated the
problems of collaboration. Consequently, there has been minimal
interaction between the collaborators for the past several months.
2.5.5 Future Plans
We know from past experience that a long time is required to bring to
fruition a formulation of an AI application in a scientific
discipline. Many discussions have been held during the current
period with our collaborators at RAND, but we have not yet crossed
the threshold of formulating a specific research plan. We intend to
continue to interact with climatologists at RAND, NCAR and other
research groups. More specific plans will be forthcoming during this
contract period, and will be presented to ARPA.
2.6 APPENDIX: THE NATURE OF AN APPLICATION OF HEURISTIC
PROGRAMMING TECHNIQUES
I. Problem Formulation
A. Does there exist, or can there be brought to exist, a
generally agreed-upon "best way" to represent the problem
domain, D? Is there a representation which is in common
use in human problem solving in D? Is there a relatively
formal way of describing and creating entities within this
representation, i.e., a formal language?
B. Primitives and Combinatorics: Can the generation of
potential solutions be conceived as a combinatorial
process? (Either as "moves" or "reductions"?) This must
be the case if we are to fit within the heuristic search
paradigm.
C. What is the set of primitives, i.e., solution ELEMENTS, out
of which solutions will be discovered by combination and
search?
II. The Knowledge Base: KNOWLEDGE IS POWER. To qualify as an
application, nontrivial power is necessary (almost by
definition).
A. Does there exist a model of some sort, preferably well-
understood and powerful, (a good theory, if you will), on
the basis of which pruning can take place, evaluations of
search paths can be made, solution candidates evaluated?
In other words, is there a powerful semantics behind the
symbol manipulation? If not, can such a thing be brought
into existence readily?
B. Is there in your environment at least one highly
knowledgeable, highly motivated, computer-oriented and
computationally "sensitive" expert, who can serve as an
informant--through whom the knowledge base can be acquired?
Intense interdisciplinary interaction will be required.
Does he see the payoff? Is there at least one computer
scientist around who will meet him halfway and become at
least a minor expert in D?
C. Is the universe of facts of D small? Is D almost decoupled
from all other possible universes?
III. Problem Difficulty
A. Is the application too difficult at the current state of
knowledge? Too simple? Consider, for a typical problem in
D, how long it takes a human to solve it: a minute? an
hour? a month? once in a lifetime?
B. Ill-structuredness: Is D well-defined enough so that
measurements can be made as to how well the programs are
doing and where they are deficient, so that reasonable
corrective action can be taken?
C. Can progress toward higher levels of performance be made
incrementally, or is there a big performance threshold to
be overcome?
D. Is the application ill-structured enough to be a heuristic
programming application?
IV. Resources: These can be predicted to be relatively large.
A. Ability to obtain major chunks of resources, i.e., money,
effort, motivation, knowledgeable experts?
B. Ability to sustain these resources over long periods of
time?
2.6.1 Coordinate Support
Support of this project by ARPA forms the keystone of a structure of
support from a variety of federal agencies and programs. Most
important of these are the following:
1. A three year grant from the National Institutes of Health for the
continued development of the Heuristic DENDRAL Program (including
some support for Meta-DENDRAL development and real-time
intelligent control).
2. A five year grant by NIH to Dr. Bruce Buchanan, the Associate
Investigator of the project, as a Career Development Award (i.e.,
salary support for five years).
3. Continuing NASA support to J. Lederberg for advanced R&D on
control of scientific instrumentation, particularly mass
spectrometers.
In addition, grant funds have been requested from NIH for a computing
laboratory in the Medical School which will support some of the
proposed work, particularly the work on intelligent control of
instruments. Initial discussions have been held with members of the
NSF staff toward the submission of a proposal for support of the
protein structure determination application.
2.7 DENDRAL PUBLICATIONS
1971-1972
[1] E.A. Feigenbaum, B.G. Buchanan, and J. Lederberg, "On Generality
and Problem Solving: A Case Study Using the DENDRAL Program".
In Machine Intelligence 6 (B. Meltzer and D. Michie, eds.)
Edinburgh University Press (1971). (Also Stanford Artificial
Intelligence Project Memo No. 131.)
[2] A. Buchs, A.B. Delfino, C. Djerassi, A.M. Duffield, B.G.
Buchanan, E.A. Feigenbaum, J. Lederberg, G. Schroll, and G.L.
Sutherland, "The Application of Artificial Intelligence in the
Interpretation of Low-Resolution Mass Spectra", Advances in Mass
Spectrometry, 5, 314.
[3] B.G. Buchanan and J. Lederberg, "The Heuristic DENDRAL Program
for Explaining Empirical Data". In proceedings of the IFIP
Congress 71, Ljubljana, Yugoslavia (1971). (Also Stanford
Artificial Intelligence Project Memo No. 141.)
[4] B.G. Buchanan, E.A. Feigenbaum, and J. Lederberg, "A Heuristic
Programming Study of Theory Formation in Science." In
proceedings of the Second International Joint Conference on
Artificial Intelligence, Imperial College, London (September,
1971). (Also Stanford Artificial Intelligence Project Memo No.
145.)
[5] Buchanan, B. G., Duffield, A.M., Robertson, A.V., "An
Application of Artificial Intelligence to the Interpretation of
Mass Spectra", Mass Spectrometry Techniques and Appliances,
Edited by George W. A. Milne, John Wiley & Sons, Inc., 1971, p.
121-77.
[6] D.H. Smith, B.G. Buchanan, R.S. Engelmore, A.M. Duffield, A.
Yeo, E.A. Feigenbaum, J. Lederberg, and C. Djerassi,
"Applications of Artificial Intelligence for Chemical Inference
VIII. An approach to the Computer Interpretation of the High
Resolution Mass Spectra of Complex Molecules. Structure
Elucidation of Estrogenic Steroids", Journal of the American
Chemical Society, 94, 5962-5971 (1972).
[7] B.G. Buchanan, E.A. Feigenbaum, and N.S. Sridharan, "Heuristic
Theory Formation: Data Interpretation and Rule Formation". In
Machine Intelligence 7, Edinburgh University Press (in press).
[8] Brown, H., Masinter L., Hjelmeland, L., "Constructive Graph
Labeling Using Double Cosets", Computer Science Memo 318, 1972.
1964-1970
[9] J. Lederberg, "DENDRAL-64 - A System for Computer Construction,
Enumeration and Notation of Organic Molecules as Tree Structures
and Cyclic Graphs", (technical reports to NASA, also available
from the author and summarized in (12)).
(1a) Part I. Notational algorithm for tree structures (1964)
CR.57029
(1b) Part II. Topology of cyclic graphs (1965) CR.68898
(1c) Part III. Complete chemical graphs; embedding rings in trees
(1969)
[10] J. Lederberg, "Computation of Molecular Formulas for Mass
Spectrometry", Holden-Day, Inc. (1964).
[11] J. Lederberg, "Topological Mapping of Organic Molecules", Proc.
Nat. Acad. Sci., 53:1, January 1965, pp. 134-139.
[12] J. Lederberg, "Systematics of organic molecules, graph topology
and Hamilton circuits. A general outline of the DENDRAL
system." NASA CR-48899 (1965)
[13] J. Lederberg, "Hamilton Circuits of Convex Trivalent Polyhedra
(up to 18 vertices)", Am. Math. Monthly, May 1967.
[14] G. L. Sutherland, "DENDRAL - A Computer Program for Generating
and Filtering Chemical Structures", Stanford Artificial
Intelligence Project Memo No. 49, February 1967.
[15] J. Lederberg and E. A. Feigenbaum, "Mechanization of Inductive
Inference in Organic Chemistry", in B. Kleinmuntz (ed) Formal
Representations for Human Judgment, (Wiley, 1968) (also Stanford
Artificial Intelligence Project Memo No. 54, August 1967).
[16] J. Lederberg, "Online computation of molecular formulas from
mass number." NASA CR-94977 (1968)
[17] E. A. Feigenbaum and B. G. Buchanan, "Heuristic DENDRAL: A
Program for Generating Explanatory Hypotheses in Organic
Chemistry", in Proceedings, Hawaii International Conference on
System Sciences, B. K. Kinariwala and F. F. Kuo (eds),
University of Hawaii Press, 1968.
[18] B. G. Buchanan, G. L. Sutherland, and E. A. Feigenbaum,
"Heuristic DENDRAL: A Program for Generating Explanatory
Hypotheses in Organic Chemistry". In Machine Intelligence 4 (B.
Meltzer and D. Michie, eds) Edinburgh University Press (1969),
(also Stanford Artificial Intelligence Project Memo No. 62, July
1968).
[19] E. A. Feigenbaum, "Artificial Intelligence: Themes in the Second
Decade". In Final Supplement to Proceedings of the IFIP68
International Congress, Edinburgh, August 1968 (also Stanford
Artificial Intelligence Project Memo No. 67, August 1968).
[20] J. Lederberg, "Topology of Molecules", in The Mathematical
Sciences - A Collection of Essays, (ed.) Committee on Support of
Research in the Mathematical Sciences (COSRIMS), National
Academy of Sciences - National Research Council, M.I.T. Press,
(1969), pp. 37-51.
[21] G. Sutherland, "Heuristic DENDRAL: A Family of LISP Programs",
to appear in D. Bobrow (ed), LISP Applications (also Stanford
Artificial Intelligence Project Memo No. 80, March 1969).
[22] J. Lederberg, G. L. Sutherland, B. G. Buchanan, E. A.
Feigenbaum, A. V. Robertson, A. M. Duffield, and C. Djerassi,
"Applications of Artificial Intelligence for Chemical Inference
I. The Number of Possible Organic Compounds: Acyclic Structures
Containing C, H, O and N". Journal of the American Chemical
Society, 91:11 (May 21, 1969).
[23] A. M. Duffield, A. V. Robertson, C. Djerassi, B. G. Buchanan,
G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg,
"Application of Artificial Intelligence for Chemical Inference
II. Interpretation of Low Resolution Mass Spectra of Ketones".
Journal of the American Chemical Society, 91:11 (May 21, 1969).
[24] B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, "Toward an
Understanding of Information Processes of Scientific Inference
in the Context of Organic Chemistry", in Machine Intelligence 5,
(B. Meltzer and D. Michie, eds) Edinburgh University Press
(1970), (also Stanford Artificial Intelligence Project Memo No.
99, September 1969).
[25] J. Lederberg, G. L. Sutherland, B. G. Buchanan, and E. A.
Feigenbaum, "A Heuristic Program for Solving a Scientific
Inference Problem: Summary of Motivation and Implementation",
Stanford Artificial Intelligence Project Memo No. 104, November
1969.
[26] C. W. Churchman and B. G. Buchanan, "On the Design of Inductive
Systems: Some Philosophical Problems". British Journal for the
Philosophy of Science, 20 (1969), pp. 311-323.
[27] G. Schroll, A. M. Duffield, C. Djerassi, B. G. Buchanan, G. L.
Sutherland, E. A. Feigenbaum, and J. Lederberg, "Application of
Artificial Intelligence for Chemical Inference III. Aliphatic
Ethers Diagnosed by Their Low Resolution Mass Spectra and NMR
Data". Journal of the American Chemical Society, 91:26
(December 17, 1969).
[28] A. Buchs, A. M. Duffield, G. Schroll, C. Djerassi, A. B.
Delfino, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum,
and J. Lederberg, "Applications of Artificial Intelligence For
Chemical Inference. IV. Saturated Amines Diagnosed by Their Low
Resolution Mass Spectra and Nuclear Magnetic Resonance Spectra",
Journal of the American Chemical Society, 92, 6831 (1970).
[29] Y.M. Sheikh, A. Buchs, A.B. Delfino, G. Schroll, A.M. Duffield,
C. Djerassi, B.G. Buchanan, G.L. Sutherland, E.A. Feigenbaum
and J. Lederberg, "Applications of Artificial Intelligence for
Chemical Inference V. An Approach to the Computer Generation of
Cyclic Structures. Differentiation Between All the Possible
Isomeric Ketones of Composition C6H10O", Organic Mass
Spectrometry, 4, 493 (1970).
[30] A. Buchs, A.B. Delfino, A.M. Duffield, C. Djerassi, B.G.
Buchanan, E.A. Feigenbaum and J. Lederberg, "Applications of
Artificial Intelligence for Chemical Inference VI. Approach to
a General Method of Interpreting Low Resolution Mass Spectra
with a Computer", Chem. Acta Helvetica, 53, 1394 (1970).
2.8 BUDGET
15-JUN-73 1-JUL-74
thru thru
30-JUN-74 30-JUN-75
I. TOTAL SALARIES (detail below) $ 76,413 $ 80,052
II. STAFF BENEFITS
16% to 8-31-73
17.3% 9-1-73 to 8-31-74 13,012
18.3% 9-1-74 on 14,516
III. TRAVEL 6,000 6,000
IV. COMPUTER TIME 44,888 37,379
V. EQUIPMENT RENTAL 4,896 4,896
VI. EQUIPMENT MAINTENANCE 720 720
VII. COMMUNICATIONS 3,000 3,000
(Telephones, dataphones)
VIII. PUBLICATIONS COST (Past Experience) 1,000 1,000
IX. OTHER OPERATING EXPENSES 1,200 1,200
(e.g. Office Supplies, Postage, Freight,
Consulting, Utilities)
X. INDIRECT COSTS -- 46% of NTDC 48,871 51,237
(NTDC = I+II+III+V+VI+VII+VIII+IX)
TOTAL HEURISTIC PROGRAMMING BUDGET ------ $ 200,000 $ 200,000
SALARIES 15-JUN-73 1-JUL-74
thru thru
30-JUN-74 30-JUN-75
Faculty
Feigenbaum, Edward $13,902 $13,902
Professor, 50% Acad. Yr., 50% Summer
TOTAL Faculty SALARIES $13,902 $13,902
Research Staff
Engelmore, Robert 22,500 22,500
Research Associate
White, William 6,247 6,247
Research Programmer, 45% time
TOTAL Research Staff SALARIES $28,747 $28,747
Student Research Assistants
(50% Acad. Yr., 100% Summer)
Brown, Dennis 5,538 5,538
Bryant, Penny 5,070 5,070
Masinter, Larry 4,914 4,914
TOTAL Student Res. Assts. SALARIES $15,522 $15,522
Others
Reiss, Steven 6,000 6,000
Laboratory Technician
Larson, Marianne 3,600 3,600
Secretary, 50% time
Wharton, Kathleen 5,004 5,004
Secretary, 63% time
SUBTOTAL HEURISTIC PROGRAMMING SALARIES $72,775 $72,775
Projected Salary Increases at 5% 3,638 7,277
TOTAL HEURISTIC PROGRAMMING SALARIES $76,413 $80,052
3. NETWORK PROTOCOL DEVELOPMENT PROJECT
This proposal includes two closely related but distinct areas of
endeavor. The first concerns the development and implementation of
protocols for the ARPA network and the second concerns the
coordination of an International Network Working Group (INWG) to
achieve the interconnection of national networks currently in
existence or under development.
3.1 OBJECTIVES
The successful execution of this proposal will achieve the following
goals for each of the two projects:
To provide for more orderly design, development, implementation and
review of protocols for the ARPA network, this project will:
1) determine areas in which new or further protocol development will
enhance resource sharing within the network.
2) define the design goals and constraints of each protocol,
specifically addressing critical issues such as error handling,
flow control, achievable throughput, expected delays, effect of
network size on performance, generality in the face of
interconnection to other networks, and ease of and cost of
implementation.
3) uncover conflicts between protocol designs or inefficiencies
introduced by the layering of incompatible protocols.
4) isolate poorly or wrongly performing protocol implementations.
With respect to INWG coordination, it will:
1) achieve early agreement on compatibility guidelines for
communication subnet and Host-to-Host protocols so as to make
interconnection of national networks technically feasible.
2) initiate early interconnection experiments between the ARPA
network and, for instance, the National Physical Laboratory (NPL)
network in England [8, 9, 10, 22, 23, 26], the CYCLADES network
in France [12], and/or parts of the proposed CANUNET network in
Canada [4].
3) with IFIPS support, capture the attention of the international
computer community for the purpose of influencing and encouraging
the development of national and international networking
standards.
3.2 PRESENT STATE OF AFFAIRS
Let us consider first the ARPA network protocol development theme.
The history of protocol development for the ARPA network is one of
relatively ad hoc disorder. This is hardly surprising, considering
the newness of the territory, but nevertheless the lack of
coordination has led to needless inefficiencies and inconsistencies
among the protocols developed. Issues which ought to be resolved
(but often are ignored or poorly explored) include the following:
1) Justification for creation of a protocol (i.e. Who needs it?).
Why won't existing protocols satisfy the need?
2) Compatibility with lower and upper layers of existing protocol.
If incompatibilities exist, can they be justified? On what
grounds?
3) Expected performance of the protocol. Meaningful measures of
performance need to be defined (sometimes throughput and delay
are sufficient measures). Comparison of expected performance and
theoretical or measured network capability is essential to
evaluation of the proposal. The amount of Host or network
overhead introduced by the protocol is of interest (e.g.
restricting links to carry one message at a time).
4) Completeness of the protocol specification, particularly in the
matter of error handling, timeout, retransmissions and the like.
5) Provision for error detection and recovery. An unjustified, but
often underlying assumption in some ARPA protocols is that the
communications subnet will never lose a message. Even though
this is almost true, it is a poor assumption on which to base a
protocol.
6) Ease of implementation and its cost (taking into account
variations in Host architecture and operating systems). This
issue may not be resolvable by any one person (who might need to
be omniscient regarding Hosts on the ARPA network), but a more
concerted effort than the ubiquitous RFC can yield results.
7) Extent to which the proposed protocol design satisfies the
requirements (as explicated in issue (1)).
This list is not exhaustive, but perhaps will stimulate the
imagination. The issues listed above deal primarily with the design
phase of protocol development. Once a protocol has been ordained, it
typically is implemented in a leisurely and only mildly coordinated
fashion. Ignoring the issue of speeding up implementation for a
moment, several other issues should be resolved pro forma:
1) Methodical search for variation in observed and specified
protocol behavior (i.e. debugging).
2) Comparison of observed and predicted performance. This issue is
frequently ignored, although there are some notable exceptions
(IMP-IMP protocol, File Transfer Protocol). Implementation
review should be a normal part of protocol development, but
requires strong leadership by the coordinating body.
3) Cost of implementation (space, time, dollars, introduction of
overhead on top of lower protocol levels, etc.).
Returning to the issue of slow implementation, it can be observed
that things speed up when there is a compelling reason. Sometimes
the compelling reason is a programmer who wants to use a remote
resource but cannot without a certain protocol implementation. And
sometimes the inertia can be overcome simply by periodically
reminding the implementor(s) that the task is not yet done.
We turn now to the present state of the INWG. Considerable material
from ARPA [18], CYCLADES [12], and NPL [1] has been distributed to
the working group. Background material from CCITT [5,6,7] and
CANUNET [4] will shortly be distributed, and position papers are due
soon regarding the issues which must be faced by the two working
subgroups (communication subnet protocol and Host-Host protocol).
The tasks associated with the INWG fall into two categories:
managerial and technical. The first is mostly one of providing for
communication among the members (currently adequately supplied
through the Network Information Center), arranging for periodic
meetings of the groups, and regularly stimulating the members to
remind them of their commitments to INWG. Issues to be faced must be
isolated, described, and resolved, this time in the context of
interconnection of differing networks. A partial list of important
issues includes:
1. Maximum packet length
2. Multipacket messages (and reassembly) versus single packet
messages only.
3. Congestion control across network boundaries.
4. Interconnection at the IMP level or through "gateway" Hosts.
5. Error detection and recovery between networks.
6. Process naming conventions (Hierarchical? Uniform?).
7. Interconnection via radio or ground links or both?
8. Should Host-Host protocol be line switch oriented or message
switch oriented?
9. Expected performance of proposed protocol. (Throughput and delay
analysis).
10. Should the order of arriving messages be guaranteed by the
interconnected subnets?
This effort is still in its initial stages. Review of national
network designs and critiques of existing protocols are the major
activities. Concurrent with this is the issue isolation effort. It
is expected that out of these early studies will arise a design
philosophy and detailed statement of performance objectives which
must be met by both the subnet protocol and the Host-Host protocol.
It is also reasonable to expect that a priority ordering of protocols
to be developed will be a useful byproduct.
One objective, affiliation with an existing international group of
some stature, has already been accomplished. INWG will be adopted as
a working group under IFIPS Technical Committee 6 on computer
communication (TC6). The chairman of TC6, Mr. Alex Curran of Bell-
Northern Research in Canada, has agreed to the arrangement, as have
Prof. Heinz Zemanek (IFIP's President) and Mr. Richard Tanaka of
CALCOMP, President of AFIPS.
3.3 TYPICAL ANALYSIS
As an example of the kind of analysis one could expect from the
proposed effort, let us consider the current Host-to-Host protocol
[18].
Many factors affect the throughput available to communicating Hosts
in the ARPA Network. Among these are the overhead cost of format and
control bits for the IMP-to-IMP and Host-to-Host protocols, the
overhead of IMP operation (IMP processing and queuing delays,
telephone line bandwidth taken by IMP-to-IMP control message
exchanges), and finally the effect of Host-to-Host protocols on
pipeline utilization. Message and packet formats introduce a
fundamental overhead so we consider them first.
Figures 1 and 2 show respectively the Host message format and the IMP
packet format used in the ARPA Network. Messages passed from Host to
IMP or vice-versa carry overhead and control bits which reduce useful
data bandwidth available to communicating Hosts. We derive a formula
for effective Host-IMP bandwidth utilization below (note that this is
a function of Host word length also).
Let T be the number of text bits to be transmitted from a W bit/word
Host. Then M(T,W) gives the total Host message length:
|←----------------- b1(T) -----------------→|← b2(T,W) →|← b3(T,W) →|
| leader |  m1 |  S  |  C  |  m2 |   TEXT   |  m3 = Host  |  IMP pad:   |
|   32   |  8  |  8  |  16 |  8  |  T bits  |     pad     | 1 0 0 ... 0 |
|←---------- preamble ----------→|

where m1 = m2 = m3 = 0
      S = byte size
      C = byte count
      b1, b2, b3 are explained in equation (1)

Figure 1. Host-Host message format
| SYN | SYN | DLE | STX | IMP header |   data   | CHECKSUM | DLE | ETX |
|  8  |  8  |  8  |  8  |     80     |  d bits  |    24    |  8  |  8  |

where 0 ≤ d ≤ 1008 bits

Figure 2. IMP packet format
M(T,W) = b1(T) + b2(T,W) + b3(T,W)                                   (1)

where   b1(T)   = T + 72
        b2(T,W) = (W - b1(T) mod W) mod W
        b3(T,W) = 1 + 16 - (b1(T) + b2(T,W) + 1) mod 16

In equation (1), b1(T) represents the preamble bits of the Host
message format plus the text (see Figure 1). The Host padding, which
makes text plus preamble an integer multiple of Host words, is given
by b2(T,W). Finally, the padding which makes the entire message an
integer number of 16 bit IMP words is given by b3(T,W). The
effective fraction of bandwidth available after subtracting message
format overhead is given in equation (2) as

E_m(T,W) = T/M(T,W)                                                  (2)
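As a check on the bookkeeping, the small fragment below simply
evaluates equations (1) and (2). It is an illustrative sketch of ours,
with function names chosen to mirror the symbols above, and is not part
of any proposed deliverable.

    def host_message_length(T, W):
        """M(T,W) of equation (1): total Host message length in bits for
        T text bits from a W bit/word Host."""
        b1 = T + 72                         # preamble plus text
        b2 = (W - b1 % W) % W               # Host padding to a word boundary
        b3 = 1 + 16 - (b1 + b2 + 1) % 16    # IMP padding (a one followed by zeros)
        return b1 + b2 + b3

    def message_efficiency(T, W):
        """E_m(T,W) of equation (2): fraction of the message that is text."""
        return T / host_message_length(T, W)

    # For example, message_efficiency(7992, 32) is roughly 0.99, so format
    # overhead at the Host-IMP interface is small for long messages.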
Since the IMP-Host protocol [18] specifies that
M(T,W) ≤ 8096 (3)
we can derive the range of legitimate Host text length as a function
of Host word length. The results appear in Table 1 below.
__________________________________
| | |
| W | Maximum text in bits |
|--------|-------------------------|
| 8 | 8016 |
| 16 | 8008 |
| 18 | 8010 |
| 24 | 8016 |
| 32 | 7992 |
| 36 | 8012 |
|________|_________________________|
TABLE 1. Maximum text length as a function of Host word length
Host messages are broken into packets (see figure 2) before
transmission through the network. Of the bits transmitted from the
Host to its local IMP, all but the 32 bit leader are contained in the
data bits of one or more packets. The information in the 32 bit
leader is replicated in the 80 bit header of each packet, along with
other control information for the IMP's use. If a message is broken
into n packets, each containing d_i bits of data, then we have that

d_1 + d_2 + ... + d_n = M(T,W) - 32.                                 (4)
The total number of bits in all packets required to send a Host
message of T text bits is given in equation (5):

P(T,W) = (FLOOR[(M(T,W) - 33)/1008] + 1) * 152 + M(T,W) - 32         (5)

where FLOOR[x] is the largest integer not exceeding x.

Using (5) we can derive the maximum expected bandwidth utilization
achievable between two Hosts as

E_P(T,W) = T/P(T,W).                                                 (6)
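Continuing in the same vein, the following self-contained fragment
(again our own sketch, not a specification) evaluates equations (5)
and (6), recomputing M(T,W) from equation (1) so that it can stand
alone.

    def total_packet_bits(T, W):
        """P(T,W) of equation (5): total bits in all packets needed to carry
        a message of T text bits from a W bit/word Host."""
        b1 = T + 72
        b2 = (W - b1 % W) % W
        b3 = 1 + 16 - (b1 + b2 + 1) % 16
        M = b1 + b2 + b3                    # M(T,W) of equation (1)
        n_packets = (M - 33) // 1008 + 1    # FLOOR[(M(T,W)-33)/1008] + 1
        return n_packets * 152 + (M - 32)   # 152 overhead bits per packet

    def packet_efficiency(T, W):
        """E_P(T,W) of equation (6): fraction of transmitted bits that are text."""
        return T / total_packet_bits(T, W)

    # For example, packet_efficiency(7992, 32) is roughly 0.86, while
    # packet_efficiency(100, 32) is only about 0.3, showing how short
    # messages are dominated by overhead.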
The maximum expected bandwidth for a 32 bit Host is shown in Figure
3.
FIGURE 3 GOES HERE
If it were the case that all of the 50 kilobits/sec available on lines
between IMPs were available to Hosts, we could immediately compute
the expected bandwidth of T bit messages from a 32 bit Host by
multiplying E_P(T,32) by 50 kilobits/sec. However, as we shall see,
not all of the 50 kilobit/sec telephone line bandwidth is really
available to the Hosts.
Every .655 seconds, each IMP sends to each of its neighbors a
2-packet routing message consisting of 1024 bits of data and 304 bits
of packet overhead, or a total of 1328 bits. Thus, on the average,
these routing messages take up about
1328 bits/.655 seconds = 2 kilobits/sec (7)
There are other IMP-related messages which take up available
bandwidth (e.g. trouble reports), but these tend not to take up
enough to worry about in this rough analysis. We conclude, then, that
of the 48 kilobits/sec available to the Hosts on the telephone
lines, 48 * E_P(T,W) can actually be used to transmit Host data.
The preceding analysis makes one very important assumption with
respect to current NCP implementations of the Host-Host protocol. It
assumes that the pipeline between pairs of communicating Hosts can be
kept full, on the average, during transmission. By pipeline is meant
the logical link over which data flows from source to destination.
The current NCP's do not permit transmission of a second message
until the first has been received and a Request for Next Message
(RFNM) returned from the destination IMP. This policy (which exists
because of restrictions in earlier IMP-Host protocol specifications)
produces serious degradation in the expected bandwidth utilization
for Hosts communicating over a path which is several hops long (i.e.
packets are forwarded by several intervening IMPs before arriving at
the destination).
Let us suppose that there are N full packets (N is between 1 and 8)
in a message which is transmitted over H hops. Suppose further that
it takes c seconds, on the average, to transmit one full packet. One
can easily see that it will take (H + N - 1)c seconds for the message
to arrive at the destination after the first bit leaves the source
IMP. It will take longer, of course, if packets must be
retransmitted, or if there are large queuing delays at any
intermediate IMP. Owing to the routing strategy in the IMPs, routing
tends to be fixed for .655 second periods, so the likelihood that a
message will take alternate routes is small; otherwise the
(H + N - 1)c seconds might be shorter due to parallelism in
transmission. Such parallelism would also be a function of network
topology, and we ignore this complicated subject for the present.
Thus, from the source point of view, it takes Nc seconds to dispose
of a message which is N packets long, but it takes (H + N - 1)c
seconds (or more) before the RFNM is returned to permit the next
message to be transmitted. For an N packet message sent over H hops,
a Host is able to use only

E_H(N,H) = N/(H + N - 1)                                             (8)

of the available bandwidth.
Figure 4 shows the curves for E_H(N,H) for a range of N and H. The
foregoing computations can be used to make a rough estimate of the
expected achievable bandwidth available to two Hosts communicating
over a single link, taking into account all the previously mentioned
overheads.

If we pick H, N, and W, we can compute T from equations (4) and (5),
obtaining the expected bandwidth from equation (9) below:

EB(N,H,W) = E_H(N,H) * E_P(T,W) * 48 kilobits/sec.                   (9)

In Figure 5, we see the curves for 1 and 8 packet messages assuming a
32 bit source Host.
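To tie the pieces together, the self-contained sketch below combines
equations (1), (5), (6), (8) and (9) into a single rough estimate,
using the 48 kilobit/sec usable-capacity figure from equation (7). It
is parameterized by the text length T rather than by N, purely for
convenience; the names and example numbers are ours and are intended
only as an illustration of the analysis, not as measured results.

    def expected_bandwidth_kbps(T, H, W=32):
        """Rough EB of equation (9): expected usable bandwidth, in
        kilobits/sec, for T-bit texts sent over H hops from a W bit/word
        Host under the one-outstanding-message-per-link discipline."""
        b1 = T + 72
        b2 = (W - b1 % W) % W
        b3 = 1 + 16 - (b1 + b2 + 1) % 16
        M = b1 + b2 + b3                    # equation (1)
        N = (M - 33) // 1008 + 1            # packets per message
        P = N * 152 + (M - 32)              # equation (5)
        E_p = T / P                         # equation (6)
        E_h = N / (H + N - 1)               # equation (8)
        return E_h * E_p * 48.0             # equation (9)

    # A full 8-packet message (T = 7992, W = 32) comes to roughly 41
    # kilobits/sec over one hop but only about 28 kilobits/sec over five
    # hops, which is the degradation discussed in the text.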
FIGURE 4 GOES HERE
FIGURE 5 GOES HERE
These figures are disappointing for Hosts which are widely separated,
so we are interested in improving the situation. The current IMP-Host
protocol no longer prohibits the sending of more than one message on
a given link; however, it should be clear that the current NCP's are
unable to conveniently make use of this fact. Suppose more than one
message is transmitted on a given link by a source Host. If any one
of them is lost by the subnet, the local IMP will advise the Host
that a message was lost. Even if the IMP tells the Host the leader
of the lost message, the Host cannot tell which of the two was lost.
There is no room in the leader for a Host message number, and putting
this in the text would not help. The Host could retransmit all
unRFNMed messages, but this might lead to serious problems for the
receiving Host (duplicate messages would be delivered unknowingly by
the destination IMP). Two alternatives present themselves:
1. Host retransmits all unRFNMed messages, but we add to the Host-
Host protocol a Host message numbering scheme so that the
receiving Host can sort out the duplicates and re-order the
arriving messages, since the IMP, because of Host retransmission,
can now deliver messages out of order to the receiving Host. (A
sketch of the receiving side of such a scheme appears after this
list.)
2. Expand the leader to permit Host message numbers to be included.
These would not be used by the IMPs but would permit Hosts to
retransmit only those messages which were lost, and receiving
Hosts to verify arriving message ordering.
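A minimal sketch of the receiving-Host bookkeeping that alternative 1
would require is given below. It is illustrative only: the class and
names are ours, and the sequence numbers stand for the hypothetical
Host message numbers discussed above.

    class ReceiveSequencer:
        """Sketch of receiver-side duplicate filtering and re-ordering keyed
        on a hypothetical Host message number carried in each message."""

        def __init__(self):
            self.next_expected = 0      # next message number to deliver
            self.pending = {}           # out-of-order arrivals, keyed by number

        def accept(self, number, text):
            """Record one arriving message and return the list of messages,
            in order, that can now be delivered to the local process."""
            if number < self.next_expected or number in self.pending:
                return []               # duplicate caused by retransmission
            self.pending[number] = text
            deliverable = []
            while self.next_expected in self.pending:
                deliverable.append(self.pending.pop(self.next_expected))
                self.next_expected += 1
            return deliverable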
If either alternative is used, provision must be made in the Host-
Host protocol specification for such error detection and
retransmission. This is currently ignored in the existing protocol.
There are a number of other rough spots in the current protocols,
which we briefly mention here. ICP, using the present "line-switched"
Host-Host protocol, requires an almost laughable number of message
exchanges to transmit a 32 bit socket number. A version
using Walden, Bressler, and Metcalfe's Message Switched Protocol
(MSP) [3] would undoubtedly save a great deal of effort. The TIP
program spends much of its energy sending off ALLOCATE messages to
serving Hosts. The TELNET protocol does not deal with the problem of
remote echo over satellite connections (i.e., what is needed is a
means to transfer the echoing task to the local Host or TIP whenever
possible).
3.4 WORK STATEMENT
This proposal will provide for the coordination of two major
activities of importance to the ARPA network:
1. The design, development, implementation and review of protocols
for the network.
2. The development of well-conceived guidelines for protocol design
and interconnection of national networks.
Much of the work of the first task will be accomplished by persons
supported directly by this contract, but in addition, efforts of
knowledgeable members of the ARPA community will be coordinated to
achieve timely protocol review and adequate scope and depth of
critical analysis. The key notion here is that such reviews can be
scheduled in a thorough, orderly, and timely way, rather than being
left to providence.
The second task, while also heavily technical in nature, is seen to
have extensive managerial aspects. Maintaining and insisting upon a
flow of communication among INWG members, and introducing critical
analyses, position papers, and proposed protocols, form the main
thrust of the work. Keeping track of the state of other national
networks and probing for new developments are important concomitants
to the main effort. The initial goal, of course, is to
produce protocol design guidelines by December 1973 which, if
followed by participating members, will permit the technical
interconnection of the associated networks.
3.5 PERSONNEL
Vinton G. Cerf
Born: New Haven, Connecticut, June 23, 1943
Education: B.S. Stanford University (Mathematics) 1965
M.S. University of California, Los Angeles (Computer
Science) 1970
Ph.D. University of California, Los Angeles (Computer
Science) 1972
Positions: Assistant Professor of Computer Science and
Electrical Engineering, Stanford University,
1972-
Principal Programmer, Computer Science Department,
University of California, Los Angeles, 1967-1972
Senior Programmer, Jacobi Systems Corporation,
Los Angeles, 1968-1970
Associate Systems Engineer, IBM Corporation,
Los Angeles, 1965-1967
Honors: North American Rockwell Scholarship, 1961-1965
Richard L. Zellerbach Scholarship, 1962-1965
Professional
Societies: ACM (past chairman, LASIGART)
Publications
and Reports: 1. Carr, C.S., Crocker, S., and Cerf, V., "HOST-HOST
Communication Protocol in the ARPA Network",
AFIPS Proceedings of the 1970 Spring Joint
Computer Conference, p. 587-597.
2. Cerf, V., Harslem, E., Heafner, J., Metcalfe, R.,
and White, J., "An Experimental Service for
Adaptable Data Reconfiguration", IEEE Trans-
actions on Communications, vol. COM-20, no. 3,
June 1972, p. 557-564.
3. Cerf, V., and Naylor, W., "Storage Considerations
in Store-and-Forward Message Switching", 1972
WESCON Technical Papers, Session 7, Sept. 19-22,
1972, Los Angeles.
3.6 REFERENCES
1. Barber, D.L.A., "The European Computer Network Project", Computer
Communication, Impacts and Implications, ICCC, October 24-26,
1972, Washington, D.C., p. 345-351.
2. Bartlett, K.A., "Transmission Control in a Local Data Network",
IFIPS Proceedings, August 1968, Edinburgh.
3. Bressler, R., Murphy, D., and Walden, D., "A Proposed Experiment
with a Message Switching Protocol", NWG/RFC 333 (NIC 9926), 15
May 1972.
4. Brunel, L., "Canadian University Computer Network", Universite de
Quebec Report of the CANUNET Advisory Committee, March 1972.
5. CCITT (International Telegraph and Telephone Consultative
Committee), "Question on Public Data Networks for the Period
1972-1976", Document AP V-No. 24-E, 5 May 1972.
6. CCITT "Preliminary Report on the Work of Joint Working Party on
`New Data Networks' ", Document AP V-No. 21-E, 15 June 1972.
7. CCITT, "Preliminary Report on the Work of Joint Working Party
NRD; Part III/1: Proposals for New Recommendations on Public Data
Networks (Recommendations X.1, X.20, X.21, X.50)," Document AP V-
No. 22-E, 28 May 1972.
8. Davies, D.W., "The Principles of a Data Communications Network
for Computers and Remote Peripherals", IFIPS Proceedings, August,
1968, Edinburgh.
9. Davies, D.W., "Packet Switching in a Public Data Network", IFIPS
Proceedings, August 1971, Ljubljana, p.(TA5) 69-73.
10. Davies, D.W., "The Control of Congestion in Packet-Switching
Networks", Proceedings of SIGCOM, October 1971, Palo Alto.
11. Despres, Remi F., "A Packet Switching Network with Graceful
Saturated Operation", Computer Communication, Impacts and
Implications, ICCC, October 24-26, 1972, Washington, D.C., p.
345-351.
12. Elie, M., et al., "CYCLADES HOST-HOST Protocol", (original in
French), I.R.I.A. Technical Report 502.2, October 1972.
13. Heart, F.J., et al, "The Interface Message Processor for the ARPA
Computer Network", Proceedings of the Spring Joint Computer
Conference, 1970, p. 551-567.
14. Husted, John M., "Current and Near Future Data Transmission via
Satellites of the Intelsat Network", Computer Communication,
Impacts and Implications, ICCC, October 24-26, 1972, Washington,
D.C., p. 201-209.
15. Kleinrock, L., "Analytical and Simulation Methods in Computer
Network Design", Proceedings of the Spring Joint Computer
Conference, 1970.
16. Metcalfe, R.M., "Strategies for Operating Systems in Computer
Networks", Proceedings of the ACM Annual Conference, August 1972,
Boston, p. 278-282.
17. Network Information Center, ARPA Satellite System Notes 1-29.
18. Network Information Center, Current Network Protocols.
19. Network Information Center, International Network Working Group
Notes 1 - 9.
20. Network Information Center, Network Measurement Notes 1 - 10.
21. Roberts, L.G., and Wessler, B.D., "Computer Networks to Achieve
Resource Sharing", Proceedings of the Spring Joint Computer
Conference, 1970, p. 543-549.
22. Scantlebury, R.A., et al., "The Design of a Message Switching Centre
for a Digital Communication Network", IFIPS Proceedings August
1968, Edinburgh.
23. Scantlebury, R.A. and Wilkinson, P.T., "The Design of a Switching
System to Allow Remote Access to Computer Services by Other
Computers and Terminal Devices", Proceedings of SIGCOM, October
1971, Palo Alto.
24. Smith, B.T., "Mixed Computer Networks: Benefits, Problems and
Guidelines", Computer Communication, Impacts and Implications,
ICCC, October 24-26, 1972, Washington, D.C., p. 201-209.
25. Walden, D.C.,"A System for Interprocess Communication in a
Resource Sharing Computer Network", CACM, vol. 15, no. 4, April
1972, p. 221-230.
26. Wilkinson, P.T., and Scantlebury, R.A., "The Control Functions in
a Local Network", IFIPS Proceedings, August 1968, Edinburgh.
3.7 BUDGET
15-JUN-73 1-JUL-74
thru thru
30-JUN-74 30-JUN-75
I. Salaries
A. Faculty
Vinton Cerf, Assistant Professor
Principal Investigator
50% academic year, 100% summer $ 10,417 $ 10,938
B. Others
1. (2) Student Research Assistants,
50% acad. year, 100% summer
at $4200/year 8,400 8,400
2. Secretary (25% time) 2,256 2,369
3. Support services (10% time) 1,000 1,050
-------- --------
TOTAL DIRECT SALARIES $ 22,073 $ 22,757
II. Staff Benefits (16% thru 8/31/73,
17.3% thru 8/31/74, 18.3% thru
8/31/75) 3,759 4,126
III. Expendables 1,000 1,000
IV. Travel (4 east coast trips/yr,
1 European trip, and 12 local) 4,200 4,600
V. Reports and Communications 2,530 2,449
-------- --------
TOTAL DIRECT COSTS $ 33,562 $ 34,932
VI. Indirect Costs (46% of
Total Direct Costs) 15,438 16,068
-------- --------
T O T A L $ 49,000 $ 51,000
TOTAL FOR TWO YEARS $100,000
4. BUDGET SUMMARY
The total cost of this proposal is as follows.
15-JUN-73 1-JUL-74
thru thru TOTAL
30-JUN-74 30-JUN-75
Artificial Intelligence Project $1,250,000 $1,250,000 $2,500,000
(see Page 56 for details)
Heuristic Programming Project 200,000 200,000 400,000
(see Page 94 for details)
Network Protocol Dev. Project 49,000 51,000 100,000
(see Page 115 for details)
GRAND TOTALS $1,499,000 $1,501,000 $3,000,000
5. COGNIZANT PERSONNEL
For contractual matters:
Office of the Research Administrator
Stanford University
Stanford, California 94305
Telephone: (415) 321-2300, ext. 2883
For technical and scientific matters regarding the
Artificial Intelligence Project:
Prof. John McCarthy
Computer Science Department
Telephone: (415) 321-2300, ext. 4971
For administrative matters regarding the Artificial
Intelligence Project, including questions relating
to the budget or property acquisition:
Mr. Lester D. Earnest
Mr. Norman Briggs
Computer Science Department
Telephone: (415) 321-2300, ext. 4971
For technical, scientific, and budgetary matters
regarding the Heuristic Programming Project:
Prof. Edward Feigenbaum
Computer Science Department
Telephone: (415) 321-2300, ext. 4878
Prof. Joshua Lederberg
Genetics Department
Telephone: (415) 321-2300, ext. 5052
For technical, scientific, and budgetary matters
regarding the Network Protocol Development Project:
Prof. Vinton Cerf
Stanford Electronics Laboratories
Telephone: (415) 321-2300, ext. 73458