perm filename LABEL.XFR[DOC,LMM] blob sn#062951 filedate 1973-09-13 generic text, type C, neo UTF8

COMMENT ⊗ VALID 00002 PAGES C REC PAGE DESCRIPTION C00001 00001 C00005 00002 .<< NOTE THE FOLLOWING CONVENTIONS IN THIS TEXT : >> C00048 ENDMK C⊗; .<< NOTE THE FOLLOWING CONVENTIONS IN THIS TEXT : >> .<< ?? MEANS QUESTION MARK .<< ?B MEANS BEGIN BOLD FACE .<< ?I MEANS BEGIN ITALICS .<< ?N MEANS RESTORE NORMAL TEXT .<< ?U MEANS SUPERSCRIPT NEXT CHAR .<< ?D MEANS SUBSCRIPT NEXT CHAR ,<< ?P MEANS THE CHARACTER PI .<< ?( IS A LEFT BRACE .<< ?) IS A RIGHT BRACE .<< ?> IS GREATER THAN OR EQUAL SIGN .<< ?< IS LESS THAN OR EQUAL SIGN .<< ?/ IS A VERTICAL BAR .<< ?A IS AN AMPERSAND (USED FOR MAKING FRACTIONS .<< ?L IS LEFT SQUARE BRACKET .<< ?R IS RIGHT SQUARE BRACKET . << title page >> .BEGIN CENTER Applications of Artificial Intelligence for Chemical Inference XIII. Labelling Objects Having Symmetry<<ACK1:We gratefully acknowledge the support of this research by the National Institutes of Health (RR 00612-03) and the Advanced Research Projects Agency (SD-183). >?U,<<ML:For part XII, see L. Masinter, N. S. Sridharan, J. Lederberg, and D. H. Smith, ?IJ. Amer. Chem. Soc.?N, ?B00?N, 0000 (0000).> Larry Masinter and N.S. Sridharan Contribution from the Computer Science Department Stanford University Stanford, California 94305 .END .GROUP SKIP 5 .BEGIN INDENT 6,6,6 ABSTRACT. An algorithm for finding a complete set of non-equivalent labellings of a symmetric object and applications of the algorithm to problems in chemistry are presented. .END .SKIP TO COLUMN 1 The class of combinatorial problems dealing with finding a complete set of non-isomorphic objects under varying constraints and concepts of isomorphism, has wide applications in a variety of fields. The problem attacked in this paper is one of finding all distinct ways to assign a given collection of labels or colors to the parts of a symmetric object. In chemistry, one manifestation of this problem is to make all assignments of ligands (from a fixed set) to a given carbocyclic skeleton .REFERENCE! ML?). Part A of this paper may be read as a brief tutorial on the nature of the problem and an introduction to the terminology found in more technical treatments. Part B is a textual description of a method for the solution of this type of problem. Part C is a summary of the procedure in a more algorithmic form; an even more formal description and a proof of correctness is available elsewhere<<BROWN: (a) H. Brown, L. Masinter, and L. Hjelemeland, ?IDiscrete Math.?N in press; (b) Stanford Computer Science Memo 31B, Computer Science Department, Stanford University.>. Part D gives examples and applications of this algorithm in both organic and inorganic chemistry. This problem is encountered in a wide range of applications beyond chemistry-- within many areas of graph theory and combinatorics, for example. It has been known how to compute the number of solutions<<DEBRUIJN: N. G. DeBruijn, ?INedarl. Akad. Wentesh. Proc. Ser. A?N, ?B62?N, 59 (1959).>?U,<<polya: N. G. DeBruijn, in "Applied Combinatorial Mathematics," E. Beckenbach, Ed., John Wiley, New York, 1964, p. 144.>, but an efficent method of actually constructing the solutions has not previously been published<<PERLMAN:an alternative algorithm has been described to us in a private communication from D. M. Perlman, University of California, San Diego, California.>. .TITL PART A: DEFINITIONS .HD SYMMETRY AND ITS RELATIONSHIP TO LABELLING Consider the special case of the general problem: suppose all of the labels are distinct. For example, suppose that one wishes to number the faces of a cube with the numbers ?(1, 2, 3, 4, 5, 6?), or the "nodes" (atom positions in a graph) of the decalin skeleton (?B1?N) with numbers ?(1, 2, 3, 4, 5, 6, 7, 8, 9, 10?). . FIGURE 1,8,The Decalin Skeleton There are n! (n factorial) ways of labelling where n is the number of parts. If the object has no symmetry then each of these n! ways are distinct from the rest. However for the decalin skeleton (where n! = 10! = 3,628,800 ways) there is some symmetry. First one picks, arbitrarily, one of the numberings as a point of reference (e.g. ?B2a?N). . FIGURE 2,10,Three of the 10! numberings of the Decalin Skeleton. Some of the 10! ways might be viewed as different (e.g. ?B2b?N); some of them might be viewed as the same (e.g. ?B2c?N). Intuitively, ?B2a?N and ?B2c?N are equivalent because one could flip ?B2a?N over the 3-8 axis and get ?B2c?N. There is another way of determining the "sameness" of such numberings which is easier in more complicated cases and is generally more suited to computer application: .BEGIN I6 ?BDEFINITION:?N Two numberings of the nodes of a graph are ?Iequivalent?N if the connection tables with the respective numberings are identical when the node numbers are written in ascending order and each "connected to" list is in ascending order. .END Table I contains the respective "connection tables" of structures ?B2a-c?N. Note that the connection table for ?B2c?N is identical to that of ?B2a?N; that of ?B2b?N is different. .BEGIN NOFILL GROUP .SELECT 4 .SEPLIN ?BTable I.?N Connection Tables for Structures ?B2a-c?N. ?B2a?N ?B2b?N ?B2c?N node?/connected ?/ node?/connected ?/ node?/connected ?/ to ?/ ?/ to ?/ ?/ to ?/ ?/ 1 2,1O ?/ 1 8,9 ?/ 1 2,1O 2 1,3 ?/ 2 3,7 ?/ 2 1,3 3 2,4,8 ?/ 3 2,6 ?/ 3 2,4,8 4 3,5 ?/ 4 6,8,1O ?/ 4 3,5 5 4,6 ?/ 5 9,1O ?/ 5 4,6 6 5,7 ?/ 6 3,4 ?/ 6 5,7 7 6,8 ?/ 7 2,8 ?/ 7 6,8 8 3,7,9 ?/ 8 1,4,7 ?/ 8 3,7,9 9 8,1O ?/ 9 1,5 ?/ 9 8,1O 1O 1,9 ?/ 1O 4,5 ?/ 1O 1,9 .SEPLIN .END This definition means, among other things, that properties such as valency are preserved: If two numberings are equivalent and in the first, node 1 is trivalent, then in the second, node 1 is trivalent. If there are other properties of the nodes (they are already colored or labelled, for example), then this definition can be extended to include the preservation of those properties. For example, suppose there are atom names associated with (some of) the nodes of the graph. Then one can define equivalent numberings to be those which not only have identical connection tables, but also the same atom names for the corresponding nodes. Then ?B3a?N would still be equivalent to ?B3c?N but no longer to ?B3b?N since, although the connection tables of ?B3a?N and ?B3b?N are identical, node 1 in ?B3a?N is labelled with an "N" while in ?B3b?N it unlabelled. .Figure 3,10,Partially labelled graphs reduce the equivalencies. .HD PERMUTATIONS AND PERMUTATION GROUPS Given a numbering of a graph, one can use a condensed notation to write down other numberings in terms of the first. All of these are written down with respect to an original "reference" numbering. Using the numbering of ?B2a?N as the reference numbering, numberings for ?B2a-c?N are given in Table II. In the 2b case, the row of numbers means that, in sequence, the node numbered 1 in the reference numbering of ?B2a?N is now numbered 2, the node originally numbered 2 is now numbered 10, and so on. .BEGIN NOFILL GROUP SELECT 4 .SEPLIN ?BTable II?N Condensed Notation for Numberings of ?B2a-c?N ?B2a?N numbers: 1 2 3 4 5 6 7 8 9 1O ?B2b?N numbers: 2 7 8 1 9 5 1O 4 6 3 ?B2c?N numbers: 5 4 3 2 1 1O 9 8 7 6 .SEPLIN .END One can conceptualize a numbering as a transformation or as a function: The transformation ?P for ?B2c?N is ?P?D2?Dc(1)=5, ?P?D2?Dc(2)=4, ?P?D2?Dc(3)=3, ... ?P?D2?Dc(10)=6. These transformations are ?Ipermutations?N: one to one mappings from the integers ?(1,2,...,n?) to themselves. The transformation for the "reference" numbering is the identity ?P?Di(x)=x. It can be shown that whatever the graph, the set of transformations satisfying the "equivalency" requirement above satisfies the property of a group. One may then say: .BEGIN I6 The ?Isymmetry group?N of a graph is the set of all transformations which yield identical connection tables. (If there are other properties to be considered, one may include them as part of the connection table). .END; CONTINUE For the decalin skeleton there are 4 permutations in the symmetry group, given in Table III. .BEGIN NOFILL GROUP SELECT 4 .SEPLIN ?BTable III.?N The Symmetry Group of the Decalin Skeleton ?P?Di?Ua 1 2 3 4 5 6 7 8 9 1O ?P?Dv 5 4 3 2 1 1O 9 8 7 6 ?P?Dh 1O 9 8 7 6 5 4 3 2 1 ?P?D?L18O?R 6 7 8 9 1O 1 2 3 4 5 ?Ua The reference numbering corresponds to that given for ?B2a?N. .SEPLIN .END These correspond directly to the intuitive geometric symmetries ?P?Di=identity, ?P?Dh=reflection about horizontal axis, ?P?Dv=reflection about vertical axis, ?P?D?L180?R = rotation by 180 degrees. It is not too difficult for a computer program to figure out the symmetry group of a graph given its connection table. This definition deals with the symmetry of the ?Inodes?N of a graph. In many cases, one might wish to label the ?Iedges?N (interconnecting lines) of a graph. In this case, the symmetry group on the edges rather than on the nodes is required. It is very easy to calculate this group from the group on the nodes. Let the numbering for each edge in the graph be the unordered pair of numbers of the nodes that form the end-points. Then the corresponding permutations can be obtained as follows: .BEGIN NOFILL ?P?Di?(n?D1,n?D2?)=?(?P?Di(n?D1),?P?Di(n?D2)?) .END CONTINUE This is most easily shown by way of an example (Table IV). .BEGIN NOFILL GROUP SELECT 4 .SEPLIN ?BTable IV.?N Permutation Group of Decalin Skeleton Acting on the edges ?P?Di?Ua 1-2 2-3 3-8 3-4 4-5 5-6 6-7 7-8 8-9 9-10 1-10 ?P?Dv 4-5 3-4 3-8 2-3 1-2 1-10 9-10 8-9 7-8 6-7 5-6 ?P?Dh 9-10 8-9 3-8 7-8 6-7 5-6 4-5 3-4 2-3 1-2 1-10 ?P?D?L180?R 6-7 7-8 3-8 8-9 9-10 1-10 1-2 2-3 3-4 4-5 5-6 ?UaThe reference numbering corresponds to that given for ?B2a?N. .SEPLIN .END Finding the group of an object is a special kind of labelling problem. Given one way of numbering (labelling with distinct labels) the parts of the object, one finds all other ways which are equivalent. The labelling problem attacked in this paper is the converse: to find a maximal list of labellings, none of which are equivalent. In general the set of all posssible labellings can be split into subsets, such that: .BEGIN I6 1) If two labellings are in different subsets, they are not equivalent. 2) If two labellings are in the same subset, they are equivalent. .END These subsets are called ?Iequivalence classes?N of labellings; selection of one labelling from each equivalence class yields the maximal list of non-equivalent labellings. The relationship of finding the group, and of finding labellings when all the labels are distinct, can be seen as follows (see ?B4?N): view the n! possible labellings as laid out in a matrix such that each row contains one equivalence class. It can be shown that in the special case where the labels are distinct, all of the equivalence classes are of the same size. To find the group, one needs to find a row. To find the labellings, one needs to pick one element from each row (i.e. find a column). .Figure 4,15,"Equivalence classes, Groups, and Labellings" .TITL PART C: SUMMARY OF LABELLING STEPS .BEGIN SPREAD←1; PREFACE 1; INDENT 0,3,3 1) Calculate the group, if not previously calculated. 2) If there are more than two different node types, do the entire labelling process with one of the label types, and the rest "blanks"; then for each of the solutions obtained, label the "blanks" with the remaining label types using the reduced symmetry group and collect the results. 3) Otherwise, (only two label types), .BEGIN INDENT 3,6,6 a) Find the orbits. b) If more than one orbit, then .BEGIN INDENT 6,9,9 1) Find the different "cases" -- ways of distributing the labels among the orbits. 2) For each case, .BEGIN INDENT 9,12,12 a) Label the first orbit. b) Label the rest of the orbits using the reduced symmetry group obtained from a), and collect the results. .END END c) Otherwise (only 1 orbit and 2 label types): .BEGIN INDENT 6,9,9 1) If there is only one either label type, pick the "first" node and label it with that label type. This is the only distinct possibility. 2) Otherwise if there are exactly two of either label type, label the first node with that label type, calculate the reduced symmetry group and new orbits, and from each of the new orbits, pick the "first" node. The solutions consist of those possibilities (one for each new orbit). 3) Otherwise, (3 or more of each label type, and one orbit) label the "first" node, calculate the reduced symmetry group, label the rest of the nodes under the reduced group, and for each of the results, check if for every permutation ?P in the original group that .BEGIN NOFILL ?P(labelled set)?>labelled set .END .CONTINUE If this is not true of a labelled set, discard it as a solution. The result is every such labelling that satisfies this property. .END END END .TITL PART D: EXTENDED EXAMPLES .HD ?/Study of Diels-Alder Rings?N<<SIMEK:proposed by Jan Simek, .Chemistry Department, Stanford University .Research Proposal (unpublished), February, 1973.>?/ The labelling algorithm has been used to define the scope and boundaries of the Diels-Alder reaction, a well-known and commonly used synthetic reaction. The reaction is shown in ?B11?N, and is defined as the 4+2 cycloaddition of an olefin, termed the dienophile, with a conjugated diene, leading to the formation of a cyclohexene-type of ring system (Diels-Alder Ring). .FIGURE 11,5,Diels-Alder reaction The method used by the program to generate Diels Alder Rings is described below, followed by an example of the labelling procedure. The program generated 1176 Diels-Alder Rings using any combination of C, N, O and S. Other atoms such as P could have been included but were deemed not interesting to us. A comparison of the computer print-out with the Ring Index (which covers the literature through 1963) revealed that only 224 (about 19% of 1176) are "known" systems. A ring system was said to be known if the same sequence of atoms had been reported regardless of number or position of unsaturations. The complete list of Diels-Alder Rings is richly suggestive to the synthetic chemists and may serve to increase the information on the scope of the Diels-Alder reaction. .Figure 12,20,Diagram of Method of Generation .HD ?/METHOD ?N(see ?B12?N)?/ .graf Step 1 The initial pot of atoms consisted of C?D6N?D6O?D4S?D4. The number of oxygens and sulfurs was limited to four, because no Diels-Alder ring can be made with five or six bivalent atoms, owing to valence restrictions. A list of all possible 76 compositions of 6 atoms was produced using a purely combinatorial procedure. .graf Step 2 Eleven of the 76 compositions were eliminated, again owing to valence limitations. An example of the eleven compositions eliminated is O?D3S?D3. .graf Step 3 For each of the 65 remaining compositions, the Diels-Alder ring skeleton was labelled with the atoms, while ensuring valence constraints were satisfied. .graf Step 4 The results from all labelling steps were collected, without needing to check for duplicates. The labelling algorithm guarantees that the lists were irredundant. .graf Example of labelling An example of a valid composition is C?D4O?D2. Diels-Alder rings formed with this composition can only have carbons double bonded to each other (see ?B13?N). The atoms remaining to be assigned are C?D2O?D2 and the ring positions are numbered 1,2,3,4. .FIGURE 13,16,Diels-Alder Example A .HD Verification by Enumeration The results of of the labelling procedure can be verified by combinatorial counting techniques .REFERENCE! DEBRUIJN?)?U, .REFERENCE! POLYA?). The following derivation follows closely from that in Liu<<LIU:?N?NLiu, Introduction to Comb. Math, McGraw-Hill, 1968>. The generating function<<genfn:generate a reference for generating functions> of the number of labellings with two types of labels can be calculated with the Cycle Index of Group=PG = .BEGIN NOFILL SELECT 4 1/2(y?U4?A?D1+ y?U2&?D1) .END CONTINUE and the Pattern Inventory is .BEGIN NOFILL SELECT 4 1/2 (x?U1?A?D1+ x?U1?A?D2)?U4+ (x?U2?A?D1+ x?U2?A?D2)?U2 .END CONTINUE showing that there are precisely these number of labellings: .BEGIN NOFILL SELECT 4 labels #labellings C?D4 1 S?D4 1 C?D2S?D2 4 CS?D3 2 C?D3S 2 .end The third entry in the table verifies our labelling with C?D2S?D2 in four ways. .HD Labellings on the Octahedral Skeletal Framework. This example is concerned with the geometric isomers of structures with octahedral symmetry (see ?B14?N). .FIGURE 14,10,Octahedral Skeletal Framework .begin nofill .SEPLIN Labellings on Octahedral Skeletal Framework Group is: i (1 2 3 4 5 6) r?D1 (1 3 5 4 6 2) r?D2 (1 5 6 4 2 3) r?D3 (1 6 2 4 3 5) v?D1 (4 5 3 1 2 6) v?D2 (4 2 6 1 5 3) v?D3 (4 3 2 1 6 5) v?D4 (4 6 5 1 3 2) Orbits: ?(1,4?), ?(2,3,5,6?). Example with labels (A A A A B B) Number of labels A = 4 Number of labels B = 2 Partitioning Labels into Orbits: .BEGIN GROUP Case number of A's assigned to orbit: # ?(1 4?) ?(2 3 5 6?) 1 2 0 2 1 1 3 0 2 .END Case 1. To map A A --> ?(1,4?) and A A B B --> ?(2,3,5,6?) Map A A --> ?(1,4?) (trivial case) New group i, r?D1, r?D2, r?D3, v?D1, v?D2, v?D3, v?D4 New orbits ?(2 3 5 6?) To map A A B B --> ?(2 3 5 6?) Case 1.1. Assign A --> 2 New group i,r?D2 New orbits ?(5?),?(3,6?) Case 1.1.1. Assign A --> 5 Remaining B B --> ?(3,6?) 1 labelling (A A B A A B) Case 1.1.2. Assign A --> 3 Remaining B B --> ?(5,6?) 1 labelling (A A A A B B) Case 2. To map A B --> ?(1,4?) and A A A B --> ?(2,3,5,6?) Assign A --> 1 and B --> 4 New group i, r?D1, r?D2, r?D3 New orbits ?(2 3 5 6?) To map A A A B --> ?(2 3 5 6?) Case 2.1. Assign B --> 2 To map A A A --> ?(3 5 6?) 1 labelling (A B A B A A) Case 3. To map B B --> ?(1 4?) and A A A A --> ?(2 3 5 6?) Assign B --> and B --> 4 New group i, r?D1, r?D2, r?D3, v?D1, v?D2, v?D3, v?D4 New orbit ?(2 3 5 6?) To map A A A A --> ?(2 3 5 6?) 1 labelling (B A A B A A) .FIGURE 15,5,?/Four labellings of Octahedral Skeleton with Labels (A A A A B B)?/ Example with Labels A B C D E F Case 1. A --> orbit ?(1 4?) A --> 1 new group i, r?D1, r?D2, r?D3 new orbits A1 = ?(2 3 5 6?) A2 = ?(4?) Assign label B Case 1.1. B --> orbit A1 B --> 2 new group i; special case No Group Case 1.2. B --> orbit A2 B --> 4 new group i, r1, r2, r3 new orbits AA1 = ?(2 3 5 6?) Case 1.2.1 C --> orbit AB1 C --> 2 new group i; special case No group Case 2. A --> orbit ?(2 3 5 6?) A --> 2 new group i, ?????? new orbits B1 = ?(1 4?) B2 = ?(5?) B3 = ?(3 6?) Case 2.1. B --> orbit B1 B --> 1 new group i; special case No Group; 24 labellings Case 2.2. B --> orbit B2 B --> 5 new group i,?? ???????? fill in new orbits BB1 = ?(1 4?) BB2 = ?(3 6?) Case 2.2.1. C --> orbit BB1 C --> 1 new group i; special case No Group; 6 labellings Case 2.2.2. C --> orbit BB2 C --> 3 new group i ; special case No group ; 6 labellings Case 2.3. B --> B3 B --> 3 new group i ; special case No Group; 24 labellings 90 unique labelings all together. Verification: When all labels are different the total number of labellings =?U?L(# labels)!?R?A?D?L(size of group)?R?A[--------------] = ?U?L6!?R?A?D8?A?L---?R=90 labellings .end .TITL ACKNOWLEDGEMENTS We gratefully acknowledge the contributions of D. H. Smith, whose aid in the preparation of this paper was invaluable, Jan Simek, who proposed the application in Part D, Larry Hjelmeland who formulized the initial problem, Professor Harold Brown, who proved the correctness of the original algorithms, and Professor Joshua Lederburg, whose advice and guidance has always been an inspiration.