perm filename LABEL.DOC[DOC,LMM]1 blob sn#044084 filedate 1973-05-18 generic text, type T, neo UTF8

LABELLING OBJECTS HAVING SYMMETRY L. Masinter N.S. Sridharan Computer Science Department Stanford University Stanford, California May, 1973 ABSTRACT. An algorithm for finding a complete set of non- equivalent labellings of a symmetric object and applications of the algorithm to problems in chemistry are presented. ␈↓αINTRODUCTION␈↓β The class of combinatorial problem dealing with finding a complete set of non-isomorphic objects under varying constraints and concepts of isomorphism, have wide applications in a variety of fields. The problem attacked in this paper is one of finding all distinct ways to assign a given number of labels or colors to the parts of a symmetric object when it is also known how many parts get each of the labels or colors. In chemistry, one manefestation of this problem is to make all assignments of ligands (from a fixed set) to a given carbocyclic skeleton. Part A of this paper may be read as a brief tutorial on the nature of the problem and an introduction to the terminology found in more technical treatments. Part B describes a method for the solution of this type of problem; a formal description and proof of correctness is available elsewhere. Part C gives examples and applications of this algorithm in both organic and inorganic chemistry. This problem is encountered in a wide range of applications beyond chemistry-- within many areas of graph theory and combinatorics, for example. It has been known how to compute the number of solutions [1], but an efficent method of actually constructing the solutions has not previously been reported. 1 The discussion in this paper is restricted to the labelling of graphs with the "topological" symmetry groups; the algorithm, however, is immediatly applicable to other types of labellings; one could, for example, generate all distinct "permutational isomers" for a given three dimensional ring system as defined in Ruch [2]. ␈↓αPART A: DEFINITIONS␈↓β ␈↓αSYMMETRY AND ITS RELATIONSHIP TO LABELLING␈↓β Consider the special case of the general problem: suppose all of the labels are distinct. This means that, for example,one wishes to number the faces of a cube with the numbers {1,2,3,4,5,6}, or the "nodes" (atom positions in a graph) of the decalin skeleton with numbers {1,2,3,4,5,6,7,8,9,10}. There are n! (n factorial) ways of labelling where n is the number of parts. If the object has no symmetry then each of these n! ways are distinct from the rest. However for the decalin skeleton (where n! = 10! = 3,628,800 ways) there is some symmetry. Pick one of the numberings as a point of reference (Fig 2a). Some of the 10! ways are different (Fig 2b); some of them are essentially the same (Fig 2c). 2 ------------------------------------------------------------ Figure 1 The Decalin Skeleton ⊗ ⊗ / \ / \ / \ / \ ⊗ ⊗ ⊗ | | | | | | ⊗ ⊗ ⊗ \ / \ / \ / \ / ⊗ ⊗ ------------------------------------------------------------ Figure 2 Three of the 10! numberings of the Decalin Skeleton. 2 4 7 1 4 2 / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ 1 3 5 2 8 9 5 3 1 | | | | | | | | | | | | | | | | | | 10 8 6 3 4 5 6 8 10 \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / 9 7 6 10 7 9 (2a) (2b) (2c) ------------------------------------------------------------ Intuitively, Figs 2a and 2c are equivalent because one could take 2a, flip it over and get 2c. There is another way of determining the "sameness" of such numberings which is easier in the more complicated cases and is in general more suited to computer application: 3 Write down the respective "connection tables". (See Table I). Note that the connection table for Fig 2c is identical to that of Fig 2a; that of Fig 2b is different. ------------------------------------------------------------ Table I. Structure (2a) Structure (2b) Structure (2c) node|connected | node|connected | node|connected | to | | to | | to | | 1 2,10 | 1 8,9 | 1 2,10 2 1,3 | 2 7,3 | 2 1,3 3 2,4,8 | 3 2,6 | 3 2,4,8 4 3,5 | 4 6,8,10 | 4 3,5 5 4,6 | 5 9,10 | 5 4,6 6 5,7 | 6 3,4 | 6 5,7 7 6,8 | 7 2,8 | 7 6,8 8 3,7,9 | 8 1,4,7 | 8 3,7,9 9 8,10 | 9 1,5 | 9 8,10 10 1,9 | 10 4,5 | 10 1,9 ------------------------------------------------------------ DEFINITION: two numberings of the nodes of a graph are ␈↓↓equivalent␈↓β if the connection tables with the respective numberings are identical when the node numbers are written in ascending order and each "connected to" list is in ascending order. This means, among other things, that properties such as valency are preserved: If two numberings are equivalent and in the first, node 1 is trivalent then in the second, node 1 is trivalent. If there are other properties of the nodes (already colored or labelled, for example), then this definition can be extened to include the preserving of those properties. 4 For example, suppose there are atom names associated with (some of) the nodes of the graph. Then one can define equivalent numberings to be those which not only have identical connection tables, but also the same atom names for the corresponding nodes. Then Fig 3a would still be equivalent to Fig 3c but no longer to Fig 3b since, although the connection tables of 3a and 3b are identical, node 1 in Fig 3a is labelled with an "N" while while in 3b it unlabelled. ---------------------------------------------------------- Figure 3. Partially labelled graphs reduce the equivalencies. 2 4 9 7 4 2 / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ N1 3 5N N10 8 6N N 3 1N | | | | | | | | | | | | | | | | | | 10 8 6 1 3 5 6 8 10 \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / 9 7 2 4 7 9 (3a) (3b) (3c) ---------------------------------------------------------- ␈↓αPERMUTATIONS AND PERMUTATION GROUPS␈↓β Given one numbering, one can use a condensed notation to write down other numberings in terms of the first. In the 2b case (Table II), the row of numbers means that, in sequence, the node numbered 1 in the reference numbering 2a is now numbered 2, the node originally numbered 2 is now numbered 10, and so on. All of these are written 5 down with respect to the original "reference" numbering of figure 2a. ------------------------------------------------------------- Table II Condensed Notation for Numberings Figure 2a numbers: 1 2 3 4 5 6 7 8 9 10 Figure 2b numbers: 2 10 3 7 4 6 5 8 1 9 Figure 2c numbers: 5 4 3 2 1 10 9 8 7 6 ------------------------------------------------------------- One can conceptualize a numbering as a transformation or as a function: The transformation for 2c is π (1)=5, π (2)=4, π (3)=3, 2c 2c 2c ... π (10)=6. These transformations are ␈↓↓permutations␈↓β : one to one 2c mappings from the integers {1,2,...,n} to themselves. The transformation for the "reference" numbering is the identity i(x)=x. It can be shown that whatever the graph, the set of transformations satisfying the "equivalency" requirement above satisfies the property of a group. One may then say: The ␈↓↓symmetry ␈↓↓group␈↓β of a graph is the set of all transformations which yield identical connection tables. (If there are other properties to be considered, one may include them as part of the connection table). For the decalin skeleton there are 4 permutations in the group. 6 -------------------------------------------------------------------- The Symmetry Group of the Decalin Skeleton π 1 2 3 4 5 6 7 8 9 10 i π 5 4 3 2 1 10 9 8 7 6 v π 10 9 8 7 6 5 4 3 2 1 h π 6 7 8 9 10 1 2 3 4 5 180 ------------------------------------------------------------------- These correspond directly to the intuitive geometric symmetries π =identity, π =reflection about horizontal axis, π =reflection about i h v vertical axis, π = rotation 180 degrees. It is not too difficult 180 for a computer program to figure out the symmetry group of a graph given its connection table. In many cases, one might wish to label the "edges" (interconnecting lines) of a graph. In this case, the symmetry group on the edges rather than on the nodes is required. It is very easy to calculate this group. First find the group on the nodes. Then, for each edge in the graph, write down the nodes that form the end-points. Then the corresponding permutations can be obtained as follows: π {n ,n }={π (n ),π (n )} i 1 2 i 1 i 2 This is most easily shown by way of an example. (Table IV). 7 ------------------------------------------------------------------ Table IV Group of Decalin Skeleton acting on the edges π 1-2 2-3 3-8 3-4 4-5 5-6 6-7 7-8 8-9 9-10 1-10 i π 4-5 3-4 3-8 2-3 1-2 1-10 9-10 8-9 7-8 6-7 5-6 v π 9-10 8-9 3-8 7-8 6-7 5-6 4-5 3-4 2-3 1-2 1-10 h π 6-7 7-8 3-8 8-9 9-10 1-10 1-2 2-3 3-4 4-5 5-6 180 ------------------------------------------------------------------ Finding the group of an object is a special kind of labelling problem. Given one way of numbering (labelling with distinct labels) the parts of the object, one finds all other ways which are equivalent. The labelling problem attacked in this paper is the converse: to find a maximal list of labellings, none of which are equivalent to each other. In general the set of all posssible labellings can be split into subsets, such that: 1) If two labellings are in different subsets, they are not equivalent. 2) If two labellings are in the same subset, they are equivalent. These subsets are called ␈↓↓equivalence ␈↓↓classes.␈↓β The relationship of finding the group, and of finding labellings where all the labels are distinct, can be viewed as follows: Take the 8 n! possible labellings and lay them out in a matrix: each row will contain one equivalence class. (It can be shown that in this special cases where the labels are distinct, all of the equivalence classes are of the same size). To find the group, one needs to find a row. To find the labellings, one needs to pick one element from each row. ------------------------------------------------------------- Figure 4 Equivalence classes, Groups, and Labellings ------------------------------------------------------------- ␈↓αPART B. SOLUTION TO THE LABELLING PROBLEM␈↓β ␈↓αA Naive Method␈↓β An obvious method to find the distinct labellings would be to generate all n! of the possible ones, and for each one to check if an equivalent one was previously generated. Unfortunately, this method takes an exhorbitant amount of computation (proportional to n! 9 squared). Below a method is discussed which takes an amount of time roughly proportional to the number of solutions (i.e. the number of equivalence classes of labellings) and requires only knowledge of the symmetry group. Thus it is useful for labelling objects using their geometric symmetry as well as the topological symmetry defined above. ␈↓αA Better Method␈↓β ␈↓αSpecial case: distinguish 1 node.␈↓β First consider the special case where there are only two types of labels such that there is one label of the first type and all the rest of the second. (E.g., color one red, and the rest green, or label one N and the rest C.) Intuitively, for the decalin skeleton, one can see that there are three classes of symmetric nodes, or orbits, marked with "⊗", "+" and "&" in Fig. 5, and that each distinct labelling corresponds to selecting one node from each type. (see Fig 6.) ------------------------------------------------------------- Figure 5 Orbits in the Decalin Skeleton ⊗ ⊗ / \ / \ / \ / \ + & + | | | | | | + & + \ / \ / \ / \ / ⊗ ⊗ ------------------------------------------------------------- 10 ------------------------------------------------------------- Figure 6 Three Labellings of Decalin with 1 N and 9 C's. ⊗ ⊗ N ⊗ ⊗ ⊗ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ N ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ N ⊗ | | | | | | | | | | | | | | | | | | ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ------------------------------------------------------------- Thus there are three distinct labellings (the ten possible labellings split into three equivalalence classes). ␈↓αComputing orbits.␈↓β In general, the parts of an object with symmetry split into different orbits (sometimes there is only one type, as in the faces of a cube, or the nodes of the cyclohexane skeleton). To label the parts of an object such that one is distinguished, it is necessary only to find the orbits and then, for each type, pick one of the parts in that type arbitrarily. Note that if the object has no symmetry each type has exactly one part in it. It is very simple to find the different types from the table of the symmetry group: if one writes out the group, as in Table III, with 11 each permutation as a row, then the numbers in each column, taken as a set, form an orbit. The orbits of the group in Table III are: {1,5,6,10}, {2,4,7,9}, {3,8}. After finding the set of orbits, one then can do the special labelling described above (distinguishing only one node): the number of distinct labellings is the number of orbits. Each corresponds to selecting an element from an corresponding orbit and labelling it. For reasons to be explained later, the first element of each orbit should be chosen (i.e. the one with the smallest number in the reference numbering). One might note here that if one has n-1 labels and 1 "blank" it is the same as 1 label and n-1 "blanks". This special case can be applied here as well. ␈↓αThe reduced symmetry group.␈↓β Once a group has been calculated for a structure, many times one wants to know what the group is after some labels have been attached. The group of a labelled structure is always a ␈↓↓subset␈↓β of the group of the unlabelled structure. Thus one needs to know which permutations in the group must be thrown out. To do this, write the "labels" associated with each node next to the node number in the permutation table as in Table II. If in any column, there is an element which has a different label than the label in the "reference" numbering 12 (identity permutation), then that row can be discarded. That is, every permutation in the reduced symmetry group must satisfy: for every node i, label(π(i))=label(i). Exactly those permutations in the original group that satisfy this equation are the ones in the reduced symmetry group of the labelled structure. ␈↓αReduction techniques␈↓β In the general labelling problem, there are two important techniques used to reduce the problem. The * first is called LABEL RECURSION and the second ORBIT RECURSION. The idea behind LABEL RECURSION is that one can do one label at a time. The idea behind ORBIT RECURSION is that one can label one orbit at a time. These reductions are discussed in detail below. ------------------------ * A ␈↓↓recursive␈↓β technique is one which is defined in terms of itself. For example, n! can be defined by: { 1 if n=1 n! = { { n*(n-1)! otherwise The computation of 10! involves computing another factorial. It is necessary that at each step, the problem is reduced. Here the general solution of the labelling problem is described in terms of less complicated labellings. Several special cases (analogous to the n=1 case in the definition of factorial) are also defined. 13 ␈↓αLABEL RECURSION␈↓β If one is given many (more than 2) kinds of labels, say n of type 1, 1 n of type 2, ... n of type k, proceed as follows: (1) solve the 2 k labelling problem for n of one type of label and n +n +...+n of 1 2 3 k another type. (Call the second type of label "blank"). The result will be a list of partially labelled objects (along with their reduced symmetry groups). Take each of the results and label the "blank" parts with n labels of one kind, n of another, ... ,n of 2 3 k another. It can be proved that the results will be a list of all distinct solutions to the original problem. For example, to label the decalin skeleton with 1 label "N", 1 label "S" and 8 labels "C", one first labels with 1 "N" and 9 "blanks" obtaining the 3 results in figure 7a. (Fig 7a1,7a2,7a3). There are now 3 new problems to label The "blanks" if Figs 7a1-3 (under their respective reduced symmetry), with 1 "S" and 8 "C"'s. In Figs 7a1 and 7a2, there is no symmetry left, and so each of the "blanks" has its own orbit, thus there are 9 distinct labellings apiece. In Fig. 7a3, there are 5 orbits under the reduced symmetry, and thus there are 5 additional possiblities. (Fig 7b). 14 ------------------------------------------------------------- Figure 7 Labellings with 1 N, 1 S, and 8 C's. A B C ⊗ ⊗ ⊗ ⊗ / \ / \ / \ / \ N ⊗ ⊗ N ⊗ ⊗ 7a1 | | | 7a1a | | | ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ \ / \ / \ / \ / ⊗ ⊗ ⊗ ⊗ N ⊗ N ⊗ / \ / \ / \ / \ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ 7a2 | | | 7a2 | | | this diagram ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ is not complete \ / \ / \ / \ / ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ / \ / \ / \ / \ ⊗ N ⊗ ⊗ N ⊗ 7a3 | | | 7a3 | | | ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ \ / \ / \ / \ / ⊗ ⊗ ⊗ ⊗ ------------------------------------------------------------- ␈↓αORBIT RECURSION␈↓β There are 3 cases in the 1 N, 9 C on the decalin skeleton problem. (case 1) 1 N goes to orbit {1,5,6,10}. (case 2) 1 N goes to orbit {2,4,7,9}, (case 3) 1 N goes to orbit {3,8}. There is only one possibility in each of these cases. Suppose there were 3 N's. Then there would be 9 cases. (Table IV). 15 ------------------------------------------------------------ Table IV. (3 N's on a decalin) # N's going to case orbit orbit orbit number {1,5,6,10} {2,4,7,9} {3,8} 1 3 - - 2 2 1 - 3 2 - 1 4 1 2 - 5 1 1 1 6 1 - 2 7 - 3 - 8 - 2 1 9 - 1 2 ------------------------------------------------------------ In some of these cases there are more than one possibility (cases 2,3,4,5 and 8). However, every labelling fits into one of these cases, and labellings from different cases cannot be equivalent. Thus, each of these cases can be done independantly, and the results collected together. To do any one of the cases, the labellings of the orbits can be done sequentially. Take case 5. First label one of {1,5,6,10} with one N, (and the rest "blanks"). Since {1,5,6,10} is an orbit, we can chose one arbitrarily and get Fig 8. (The first one is chosen). 16 ------------------------------------------------------------ Figure 8 1 N to orbit {1,5,6,10} ⊗ ⊗ / \ / \ / \ / \ N ⊗ ⊗ | | | | | | ⊗ ⊗ ⊗ \ / \ / \ / \ / ⊗ ⊗ ------------------------------------------------------------ Second, label {2,4,7,9} with 1 N (and 3 blanks). Note that {2,4,7,9} is no longer an orbit under the reduced group. Stick to the original plan-- it is just necessary to find ␈↓↓new␈↓β orbits under the reduced group to label {2,4,7,9}. Since there is no symmetry left, each of {2,4,7,9} falls into its own orbit. The "special case" can be used directly to find the 4 solutions (Fig 9). 17 ------------------------------------------------------------ Figure 9 Second step of case 5 N ⊗ ⊗ N / \ / \ / \ / \ N ⊗ ⊗ N ⊗ ⊗ 9a | | | 9b | | | ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ \ / \ / \ / \ / ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ / \ / \ / \ / \ N ⊗ ⊗ N ⊗ ⊗ 9c | | | 9d | | | ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ \ / \ / \ / \ / ⊗ N N ⊗ ------------------------------------------------------------ Then, for each of these solutions, {3,8} must be labelled with 1 N (and 1 blank). The final result is 8 possibilities for case 5, none of which have any remaining symmetry. ␈↓αFINAL TECHNIQUE.␈↓β Now the only problem to be solved is this: suppose there are two types of labels, n of the first, and n of the second, neither n or 1 2 1 n are 1, and there is only one orbit. No more simple reductions 2 left. The solution, however, is another trick. Pick the first node and label it, calculating the reduced symmetry group and new orbits. 18 Label the rest of the nodes (under the reduced group) with n -1 1 labels of type 1 and n labels of type 2. The result will contain a 2 representative of each equivalence class of labellings; however, if n >2 then there may be some duplicates (i.e., two of the results may 1 actually be equivalent). For example, the cyclohexane skeleton has a group of order twelve (has twelve permutations), and there is only one orbit. To label it with three N's, one labels node 1 with a N, calculates the reduced group and new orbits; then finds the various cases for distributing the remaining two N's among those orbits. (Table V.) Then for each case, do each orbit sequentially. Cases 1, 3, 4 and 5 are fairly straightforward; in case 2, first label {1,2} with 1 N. (Figure 9). Now the group reduces even further, and one gets the two results depicted in Figure 9b. Note that cases 1 and 2a are equivalent as well as 2b, 3, and 5. What to do? Fortunately, there is a good way of throwing out the impostors without having to check each of the results against each of the others for equivalency. If there is a permutation π in the group, such that π(labelled set) > labelled set then the labelled set is an impostor-- throw it out. Furthermore, every impostor is detected this way. All that is necessary is that when doing these "lower level" labellings, that one is careful to 19 pick the "first" element of each orbit to label and the "first orbit" when there are a choice of orbits. ------------------------------------------------------------ Table V cases for 2 N's on {2,3,4,5,6} of cyclo- hexane case {2,6} {3,5} {4} 1 2 - - 2 1 1 - 3 1 - 1 4 - 2 - 5 - 1 1 ------------------------------------------------------------ Fortunately, this technique is rarely necessary -- usually in the course of a computation, the "special cases" catch almost everything. For example, to label the decalin skeleton, it is never needed since even when one is labelling say orbit {2,4,7,9}, there is either 1, 2, n-1 or n-2 labels to be attached. For the cyclohexane skeleton, it is only needed if there are 3 of one label and 3 of another (if there were 3,2 and 1 of three respective label types, just do the singleton first -- the group will then reduce and again this "FINAL TECHNIQUE" will not be necessary. Only in cases where there is at least 6 fold symmetry (i.e., every orbit has at least 6 elements) and there are at least 3 of each of the label types is this technique required. 20 ␈↓αC. SUMMARY OF LABELLING STEPS␈↓β 1) Calculate the group if not previously calculated 2) if more than 2 different node types, do the entire labelling process with 1 of the label types, and the rest "blanks"; then for each of the solutions obtained, label the "blanks" with the remaining label types using the reduced symmetry group. and collect the results. 3) otherwise, a) find the orbits b) if more than one orbit, then 1) find the different "cases" -- ways of distributing the labels among the orbits 2) for each case, a) label the first orbit b) label the rest of the orbits using the reduced symmetry group obtained from a). and collect the results c) otherwise (only 1 orbit and 2 label types) 1) if there is only one of one of the label types, pick the "first" node and label it with that label type. This is the only distinct possibility. 2) otherwise, if there are two of one of the label types, label the first node with that label type, calculate the reduced symmetry group and new orbits, and from each of the new orbits, pick the "first" node. The solutions consist of those possibilities (one for each new orbit). 3) otherwise, (3 or more of each label type, and one orbit) label the "first" node, calculate the reduced symmetry group, label the rest of the nodes under the reduced group, and for each of the results, check if for every permutation π in the original group that π(labelled set)≥labelled set If this is not true of a labelled set, discard it as a solution. The result is every such labelling that satisfies this property. 21 ␈↓αREFERENCES␈↓β [1] De Bruijn, N. G., Generalization of Polya's Fundamental theorem in Enumerative Combinatorial Analysis, Nedarl. Akad. Wentensh. Proc. Ser. A, 62 (1959), 59-69; also Polya's Theory of Counting, in Applied Combinatorial Mathematics, E. Beckenbach, ed, Wiley, New York, 1964, pp 144-184. [2] I. Ugi [3] Previous DENDRAL papers [4] Perlman? 22