MATH 360 PROJECT: EMPIRICAL MODELLING OF CHEMICAL REPLICATION

How do enzymes work? How do molecules of DNA and RNA replicate themselves? Chemists and molecular biologists have found that the basic process is this: the enzyme or "template" molecule has a surface with holes, each well matched to the shape of just one compound (such as an amino acid) found in the "soup" in which the template is located. When all the pieces of the molecule to be formed have been attracted to the template, the completed molecule returns to the surrounding environment and the template is unchanged, ready to attract more pieces.

One part of this procedure which is difficult to describe is just how the template manages to attract just the piece it needs to complete the molecule, since the various pieces flow past randomly. In principle, the process is the result of the strength of the electrochemical bonding forces, but these are extremely difficult to calculate in complex molecules.

In this project, we will try to determine empirically how the template "decides" where to place an incoming molecule. We will use a computer simulation of the attraction of pieces to a template. The environment will consist of a two-dimensional empty rectangle 10 units wide and 20 units high; the template will be the bottom edge. We will represent the pieces floating by as "compounds" consisting of four "atoms" (unit squares). The molecules to be formed from these will be the horizontal rows (10 units wide).

So our representation of this process is as follows. One of the 4-unit pieces will appear at the top of the environment. The template must "decide" how to move the piece left or right or rotate it so that it can fit in the space adjoining the template. Whenever a horizontal row is filled up, it is removed from the environment (and the remaining "atoms" shift down closer to the template).

    EEEEEEEEEEEE
    E          E
    E          E     E=Edge of environment
    E          E     C=New compound to be placed
    E   CCC    E     A=atoms not yet part of replicated molecule
    E     C    E     X=desirable location for new compound
    E          E     T=template
    E          E
    E          E
    E AAA      E
    EAAAAAA XXXE
    EAAAAAAAAAXE
    EAA AAAAAAAE
    EAAAAAA AAAE
    EAAAAAAAAAAE
    EAAAA AAAAAE
    EAA AAAAAAAE
     TTTTTTTTTT

Now the problem is to take this simulation of the process of molecular replication and to figure out how the template can decide where to place the incoming pieces so as to make full molecules without cluttering up the environment with pieces waiting for completion. We will say we have understood the process well when we can make the decisions for the template for a long time without having the environment fill up with incomplete molecules.

1. How many different "compounds" are there in this simulation? (Count two as the same if they are identical after shifting or rotation, but as different if they would need to be flipped out of the plane to become identical.) How many different ways can the template choose to arrange each piece before drawing it in closer? (The arranging is done with 90-degree rotations and whole-unit shifts left or right.)

In the computer lab we have left a program to carry out the simulation. You can run this program in one of several modes. Try first playing the role of Active Template: when each piece appears, you must shift it left or right, or rotate it, and then drop it into place. This will give you a sense of what constitutes good practice on the part of the computer.

2. Is it conceivable that the environment might look like the sample illustration above at some point in the process? What kinds of configurations are impossible in this simulation?

Each of the different possible arrangements of a piece will give a different resulting configuration, after the piece is dropped as low as it can go toward the template and full molecules are removed.
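As a rough illustration of the "drop the piece, then remove full molecules" step, here is a minimal Python sketch. It is not the lab program. The grid representation (a list of rows of 0/1 with row 0 at the top), the piece representation (a list of (row, column) offsets), and the names empty_grid, fits and drop are all assumptions made for this example.

```python
WIDTH, HEIGHT = 10, 20  # environment size given in the handout


def empty_grid():
    """An empty environment: HEIGHT rows of WIDTH vacant cells (0 = vacant)."""
    return [[0] * WIDTH for _ in range(HEIGHT)]


def fits(grid, cells, row, col):
    """True if the piece, placed with its origin at (row, col), stays inside
    the environment and does not overlap any existing atom."""
    for dr, dc in cells:
        r, c = row + dr, col + dc
        if r < 0 or r >= HEIGHT or c < 0 or c >= WIDTH or grid[r][c]:
            return False
    return True


def drop(grid, cells, col):
    """Drop a piece straight down from the top at horizontal offset col.

    Returns (new_grid, rows_removed). The input grid is left unchanged.
    Raises ValueError if the piece does not fit at the top.
    """
    if not fits(grid, cells, 0, col):
        raise ValueError("piece does not fit at the top of the environment")
    row = 0
    while fits(grid, cells, row + 1, col):  # slide down as far as possible
        row += 1
    new_grid = [r[:] for r in grid]
    for dr, dc in cells:
        new_grid[row + dr][col + dc] = 1
    # Remove every full row; the rows above shift down toward the template.
    kept = [r for r in new_grid if not all(r)]
    removed = HEIGHT - len(kept)
    return [[0] * WIDTH for _ in range(removed)] + kept, removed
```

For example, drop(empty_grid(), [(0, 0), (0, 1), (0, 2), (1, 2)], 7) places a piece shaped like the X cells in the illustration against the right-hand edge. Trying every rotation and every shift of a piece with a function like this would produce the list of resulting configurations from which the template has to choose.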
Some of these resulting configurations are "bad" - they leave the environment cluttered up with incomplete molecules that will be difficult to complete when more pieces show up. It is the role of the template, then, to choose the arrangement of the new piece in order to get the "best" of the possible resulting configurations. Our goal will be to advise the template how to recognize one configuration as better than another.

3. Describe characteristics of a configuration that make it worse than another. (For example, a "hole" - a vacant grid location with atoms above it - is one such bad characteristic.)

In the Adviser mode of the simulation, the template will automatically choose the arrangement that yields the best resulting configuration - but you need to tell it what "best" means. The program will list a variety of characteristics of a configuration, such as those you listed in Problem 3. Your job is to assign a penalty to each of those characteristics (e.g., you could choose to give a penalty of 10 points per hole). The program will then assess this penalty each time the characteristic appears in the configuration. The configuration with the lowest total penalty will be considered the best, and the template will arrange the given piece so as to attain that configuration in the end.

4. Suppose one set of penalties has already been assigned. What would happen if you ran the simulation again with all the penalties doubled?

5. Run the simulation in Adviser mode ten times with ten different choices of how to assign penalties (you and your teammates can each do a few of the runs). Note your success rate -- how long does each simulation run? Which appear to be the most important characteristics to penalize?

The simulation program also allows you to create a file (using a random process) which can then be used to make the pieces appear in the same sequence in more than one run of the simulation. This is called "Adviser-nonrandom" mode.
Use it when you want to compare the efficacy of more than one assignment of penalties for a given sequence of pieces. You will need to keep a diskette in the A: drive to hold the file.

6. Use your results from Prob. 5 to select a good assignment of penalties, and run the simulation in Adviser-nonrandom mode with this assignment. Then, if the results are encouraging, use the same mode, with the same file of pieces appearing, but with somewhat different penalties assigned. Do this five or ten times and note how long you were able to keep the environment clear in each case.

7. Use Minitab on your data to try to predict the "score" as a function of the various penalties you assigned. (Hint: by Prob. 4, you can and should keep one of the penalties the same in each of your runs.)

8. Using the linear function you created, find a better assignment of penalties than you used in Prob. 6. (Since the function you created in Prob. 7 is only an approximation, you should not choose an assignment which differs from the previous ones by more than they differed from each other.)

You may wish to repeat Problems 6, 7, and 8 to improve your performance even more. Keep in mind that you are optimizing the penalty assignments for one particular file determining the appearance of the pieces. Slightly different results may be obtained from a different file. You may also wish to compare your results when you advise the template in this way to the results you obtain from choosing the arrangements of the pieces yourself.

Bring your best results to class on the day of the project presentation. We will run the simulation in Adviser-nonrandom mode with the same file selecting the pieces for all five teams, and compare results.
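
For teams who want to check their understanding of the Adviser mode away from the lab, the penalty scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lab program. The grid is assumed to be a list of rows of 0/1 with row 0 at the top. A "hole" is counted as the handout defines it (a vacant cell with an atom somewhere above it in the same column). The second characteristic, stack height, is hypothetical and is included only to show how several penalties combine. The names CHARACTERISTICS, total_penalty and best_configuration are invented for this example.

```python
def count_holes(grid):
    """Number of vacant cells that have at least one atom above them
    in the same column (row 0 is the top of the environment)."""
    holes = 0
    for c in range(len(grid[0])):
        seen_atom = False
        for r in range(len(grid)):
            if grid[r][c]:
                seen_atom = True
            elif seen_atom:
                holes += 1
    return holes


def stack_height(grid):
    """Height of the tallest column of atoms (hypothetical characteristic)."""
    for r, row in enumerate(grid):
        if any(row):
            return len(grid) - r
    return 0


# Assumed list of characteristics; the lab program's own list may differ.
CHARACTERISTICS = {"hole": count_holes, "height": stack_height}


def total_penalty(grid, penalties):
    """Sum of (penalty per occurrence) x (number of occurrences)."""
    return sum(p * CHARACTERISTICS[name](grid) for name, p in penalties.items())


def best_configuration(candidates, penalties):
    """The candidate configuration with the lowest total penalty."""
    return min(candidates, key=lambda g: total_penalty(g, penalties))
```

Calling best_configuration with the list of configurations reachable from the current piece and a dictionary such as {"hole": 10, "height": 1} mimics one decision of the template. Changing the dictionary can change which candidate wins, which is exactly the effect Problems 5 through 8 ask you to measure.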