perm filename THESIS.DOC[2,TES] blob sn#009908 filedate 1972-07-14 generic text, type T, neo UTF8

                         *** BLANK PAGE ***

       MEMO AIM-



                            Gerald Jacob Agin


       A  representation  is  proposed  in  which three-dimensional
       objects  are  represented  by  data  structures  composed of
       primitives called  generalized cylinders.   These primitives
       consist of a space  curve or "skeleton" and a  cross section
       which may vary along the length of the  skeleton.  Apparatus
       and programs are described which obtain depth information by
       scanning  objects  with  a  laser  and   television  camera.
       Results are presented from  a set of programs  which analyze
       the laser-derived depth information and segment objects into
       primitives  describable as  generalized  cylinders.  Methods
       are  proposed  whereby  a  program  may   generate  complete
       descriptions of complex curved objects.

                  *** WORKING DRAFT -- July 14, 1972 ***

       The  research reported  here was  supported in  part  by the
       Advanced Research Projects Agency.

       The  views and  conclusions contained  in this  document are
       those  of  the  author  and  should  not  be  interpreted as
       necessarily  representing  the  official   policies,  either
       expressed  or  implied, of  the  Advanced  Research Projects
       Agency or of the U. S. Government.

       Reproduced  in   the  USA.   Available  from   the  National
                                    ---------  ----   ---  --------
       Technical Information Service, Springfield, Virginia  22151.
       --------- ----------- -------  -----------  --------  -----
       Price: full size copy $3.00; microfiche copy $0.95.
       -----  ---- ---- ---- -----  ---------- ---- -----

                       PREFACE TO THE WORKING DRAFT

       This is the working draft of "Representation and Description
       of  Curved   Three-Dimensional  Objects.   Copies   of  this
       document are for review and comments only.

       Present status:

           Section  needs writing.

           Section  and Section 5 need extensive revisions.

           Update Section  to reflect new calibration procedure.

           Some figures remain to be drawn.

           Fix up some of the references.

           Some  further   research  on  fitting   cross  sections,
           describing skeletons, and describing complex objects, if
           time permits.

       Comments and criticisms are invited.
                                       Jerry Agin
                                                           Page iii

                            TABLE OF CONTENTS

       1  INTRODUCTION                                       Page 1

         REFERENCES                                          Page 5
                                                            Page iv

                          LIST OF ILLUSTRATIONS
                                                             Page 1

                             1  INTRODUCTION

       My  present interest  in representation  and  description of

       curved   objects  arose   from  a   desire  to   extend  the

       capabilities of  the Stanford  Hand-Eye System  [Feldman] to

       recognize  a  wider  class  of  objects  than  plane-bounded

       solids.   Initial  attempts  to  recognize  geometric cones,

       cylinders,  and  spheres  were  not  carried  far  enough to

       demonstrate  the   usefulness  of  existing   techniques  in

       recognizing   this  limited   addition  to   the   class  of

       recognizable   objects.   But   there  appears   to   be  no

       insurmountable barrier to doing so.

       It soon became apparent that little useful purpose  would be

       served by this limited extension.  A significant improvement

       in performance of vision  systems would come only  when they

       were  capable of  recognizing the  sort of  everyday objects

       that a  robot of  the future  might have  to deal  with.  To

       accomplish this, we make use of two new tools or techniques.

       The first of these is a representation general  and flexible

       enough  to carry  varied  information about  an  object, its

       parts,   and   their   relation   to   one   another.    The

       representation  we  propose  will  allow  several  different
       1.0  INTRODUCTION                                     Page 2

       models for primitives or parts of objects.  These models may

       include prototypes of  various sorts.  The  particular model

       we present in  this paper, generalized cylinders,  is useful

       for  describing  a  large  class  of  natural  and  man-made

       objects.  It describes in a natural and intuitive way pieces

       which possess elongation, or which have axial symmetry.  The

       generalized cylinders may be linked together in various ways

       to  form complex  objects.  The  most  significant departure

       from   previous   methods  of   description   is   that  the

       representation  is essentially  a volume  representation, as

       opposed  to a  surface  representation.  Section  2  of this

       report  describes  in detail  representation  by generalized

       cylinders, and gives some  examples of how objects  might be

       modelled with such a representation.

       The  second  new  technique  is  the  use  of  direct  depth

       measurement for recognition.  Every two-dimensional image of

       a three-dimensional scene is ambiguous, in that there are an

       infinite number of realizable physical objects (or groupings

       of  objects)  which could  give  rise to  the  image.  These

       ambiguities  may be  resolved only  by  a priori assumptions

       about  the nature  of the  scene.  The  use of  direct depth

       information  eliminates  the need  for  assumptions  such as

       squareness of  corners, requirements that  stored prototypes
       1.0  INTRODUCTION                                     Page 3

       exist for every possible object in a scene, or  knowledge of

       the  reflective  characteristics  of  surfaces.   Section  3

       describes a ranging system, which obtains  depth information

       by triangulation, using a laser and a television camera.

       The test of any vision system must be how well it deals with

       actual objects.  Sections  4 and 5 describe  an experimental

       system to  generate descriptions  of physical  objects, from

       depth data derived from the ranging system.  The  system has

       so far given good  results in describing some  simple curved

       objects, and  has been  moderately successful  in describing

       parts  of  complex objects.   Some  results may  be  seen in

       Figures  ___.   Section  6  contains  some  suggestions  for

       further research.

       So  far we  have  not achieved  our goal  of  recognition of

       complex objects; we cannot yet identify a given object, say,

       as a humanoid figure or as a hammer.  But we hope our method

       of representation and our work in description of objects has

       laid the groundwork for progress in this area.

       I would like to express my appreciation for the guidance and

       assistance  of  my  advisor,  Dr.  Thomas  O.  Binford.  His

       insight has  been helpful  on many  occasions.  Many  of the
       1.0  INTRODUCTION                                     Page 4

       original ideas  on which  this research  is based  were his,

       including  generalized  translational  invariance,  and  the

       basic configuration of a laser ranging system.

       Thanks  are  also due  to  Dr. Arthur  L.  Schawlow  for the

       generous loan of a laser, to Victor Scheinman for the design

       and assembly of  the laser deflection apparatus,  to Richard

       Underwood  for routines  for the  numerical solution  of the

       generalized eigenvalue equation,  and to Bruce  G. Baumgart,

       R.  K.  Nevatia,  and  J.  M.  Tenenbaum  for  their helpful

       comments and suggestions.
                                                             Page 5


       [Baumgart 72a]  GEOMED ...

       [Baumgart 72b]   On the  Representation of  Physical Objects
                        __ ___  ______________ __  ________ _______


       [Blum]   Harry Blum,  "A Transformation  for  Extracting New

               Descriptors  of  Shape",  Symposium  on  Models  for

               Perception  of  Speech  and  Visual   Form,  Boston,

               November 11-14, 1964.

       [Binford 70]  Thomas  O. Binford, "Triangulation  by Laser",

               December, 1970, unpublished.

       [Binford 71]   Thomas  O.  Binford,  "Visual  Perception  by

               Computer", presented at ...

       [Coons] S. A. Coons  and B. Herzog, "Surfaces  for Computer-

               Aided Aircraft  Design", J. Aircraft,  Vol 1,  No. 4
                                        __ ________

               (July-Aug, 1968), pp 402-406.

       [Courant]  Differential and Integral Calculus, Interscience,
                  ____________ ___ ________ ________

               1936, Volume 2, pp. 190-199.
       2.0  REFERENCES                                       Page 6

       [Coxeter]  H. S. M. Coxeter, Introduction to  Geometry, John
                                    ____________ __  ________

               Wiley and Sons, 1961, pp321-326.

       [DeBoor]  Carl de Boor and John R. Rice, Least Squares Cubic
                                                _____ _______ _____

               Spline  Approximation   I  -  Fixed   Knots,  Purdue
               ______  _____________   _  _  _____   _____

               University Report No. CSD TR 20, April  1968.  Least

               Squares  Cubic  Spline Approximation  II  - Variable
               _______  _____  ______ _____________  __  _ ________

               Knots, Purdue University Report No. CSD TR 21, April


       [Earnest]   Lester  D.  Earnest,  Choosing  an  Eye   for  a
                                         ________  __  ___   ___  _

               Computer,  Stanford Artificial  Intelligence Project

               Memo AIM-51, April, 1967.

       [Falk]  ...

       [Feldman]  J. Feldman, K. Pingle, T Binford, G Falk, A. Kay,

               R. Paul, R. Sproull,  and J. Tenenbaum, "The  Use of

               Vision  and  Manipulation  to  Solve   the  `Instant

               Insanity'   Puzzle",   Second   International  Joint

               Conference   on  Artificial   Intelligence,  London,

               September 1-3, 1971.

       [Horn]   Berthold Klaus  Paul  Horn, Shape  from  Shading: A
                                            _____  ____  ________ _
       2.0  REFERENCES                                       Page 7

               Method  for Finding  the  Shape of  a  Smooth Opaque
               ______  ___ _______  ___  _____ __  _  ______ ______

               Object  from One  View, Ph.D.  Thesis, Massachusetts
               ______  ____ ___  ____

               Institute of Technology, June, 1970.

       [Krakaur]  ...

       [Mott-Smith]  John Mott-Smith, unpublished ...

       [Pingle]   Karl   K.  Pingle,  Hand/Eye   Library,  Stanford
                                      ________   _______

               Artificial  Intelligence  Laboratory  Operating Note

               35.1, January, 1972.

       [Roberts 63]  L.  G. Roberts,  Machine Perception  of Three-
                                      _______ __________  __ ______

               Dimensional Solids ...
               ___________ ______

       [Roberts 65]     L.   G.    Roberts,    Homogeneous   Matrix
                                               ___________   ______

               Representation  and  Manipulation  of  N-Dimensional
               ______________  ___  ____________  __  _____________

               Constructs,  Document  MS1045,  Lincoln  Laboratory,

               Massachusetts Institute of Technology, May, 1965.

       [Shirai]  Yoshiaki  Shirai and  Motoi Suwa,  "Recognition of

               Polyhedrons   with    a   Range    Finder",   Second

               International   Joint   Conference   on   Artificial

               Intelligence, London, September 1-3, 1971.
       2.0  REFERENCES                                       Page 8

       [Smith]  Lyle B.  Smith, The Use of  Man-Machine Interaction
                                ___ ___ __  ___________ ___________

               in    Data-Fitting    Problems,    Stanford   Linear
               __    ____________    ________

               Accelerator Center Report No. 96, March, 1969.

       [Sobel]  Irwin Sobel, Camera Models and  Machine Perception,
                             ______ ______ ___  _______ __________

               Stanford Artificial  Intelligence Project  Memo AIM-

               121, May, 1970.

       [Will]  P.  M. Will  and K. S.  Pennington, "Grid  Coding: A

               Preprocessing  Technique   for  Robot   and  Machine

               Vision",  Second International  Joint  Conference on

               Artificial  Intelligence,  London,   September  1-3,