X-ray crystallography

by Anders Bárány

Nobel Prizes and X-ray crystallography
In the autumn of 1895, Alfred Nobel must have been thinking a lot about the new (and final) version of his will, which was signed in the presence of witnesses in Paris on November 27. As is well known, Nobel mentioned physics as the first prize area to be rewarded, and the physics Nobel Laureates always receive their prizes first. What Nobel could not be aware of in the autumn of 1895 was a research project carried out at the same time in Würzburg by the man who, in 1901, would receive the very first Nobel Prize in Physics and thus the very first Nobel Prize in all categories. His name was Wilhelm Conrad Röntgen, and during an investigation of cathode rays he made a remarkable discovery as early as the 8th of November. When the cathode rays, which we now know are energetic electrons, hit the wall of the evacuated glass tube that housed the electrical connections, a new kind of ray was emitted. In his first communication on the discovery, in December 1895, Röntgen named them X-rays, the “X” being used, as in mathematics, for something unknown. We now know that X-rays are energetic electromagnetic waves with a very short wavelength, today typically measured in terms of a nanometer, i.e. 10⁻⁹ meter. We all know that X-rays are very useful in practical medicine, since they penetrate most matter and allow insight into our bodies. But what will be described in this Topic Cluster is another use of them: since the nanometer scale also happens to be the typical scale for the atomic and molecular structure of matter, X-rays have become the most important tool so far for probing this structure.

Crystallography as such is a very old topic, at times involving quite a lot of mathematics. But X-ray crystallography starts in 1912 with a discovery by Max von Laue (Nobel Prize in Physics 1914) and collaborators. They found that crystals bend (diffract) X-rays into certain preferred directions. Since X-rays have the property of darkening photographic plates, putting such plates behind an irradiated crystal will produce photographs with spots. The spots form regular patterns, and the preferred directions can thus be stored on the photographic plates and analysed according to location and intensity. This diffraction effect was quickly explained as a wave phenomenon, an interference effect between X-ray waves reflected from different crystal planes, by Lawrence Bragg (Nobel Prize in Physics 1915). Together with his father William (with whom he shared the Nobel Prize) he used these so-called diffraction patterns to determine the structure of many simple crystals.
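
Bragg's explanation is usually summarised as the condition n·λ = 2·d·sin(θ): waves reflected from neighbouring crystal planes a distance d apart reinforce each other only at angles θ where their path difference is a whole number n of wavelengths λ. The following is a minimal sketch of that relation in Python. The function name and the example numbers (a wavelength of 0.154 nm and a plane spacing of 0.282 nm, roughly what one would expect for a copper X-ray tube and rock salt) are illustrative assumptions and do not come from the text above.

```python
import math

def bragg_angles(wavelength_nm, d_spacing_nm, max_order=5):
    """List (n, theta_degrees) for every order n that satisfies
    n * wavelength = 2 * d * sin(theta)."""
    angles = []
    for n in range(1, max_order + 1):
        sin_theta = n * wavelength_nm / (2.0 * d_spacing_nm)
        if sin_theta > 1.0:
            # No real angle exists: this order (and all higher ones) cannot diffract.
            break
        angles.append((n, math.degrees(math.asin(sin_theta))))
    return angles

# Illustrative values only (assumed, not taken from the article).
for order, theta in bragg_angles(0.154, 0.282):
    print(f"order {order}: theta = {theta:.2f} degrees")
```

Only a handful of angles survive for a given spacing, which is why the photographic plate shows discrete spots rather than a smear.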

Since the pioneering work led by von Laue and the two Braggs, X-rays have become the dominant tool in crystallography, even though neutrons and (sometimes) electrons are also used. For a chemist having, e.g., synthesized a new molecule, the standard method to get at its precise structure is to get crystals of it and then use X-ray diffraction. A fairly large number of Nobel Prizes in Chemistry and some in Physiology or Medicine have rewarded work done using X-ray diffraction methods to unravel the structure of biologically important molecules, “the molecules of life”. To a large extent, this development started with a young mathematically inclined scientist from William Bragg’s laboratory in London, J.D. Bernal, who eventually moved to the physics institute in Cambridge, the Cavendish Laboratory, led by Ernest Rutherford (Nobel Prize in Chemistry 1908). In the 1920’s, when the scientific community realized that proteins are individual molecules and not aggregates of smaller molecules, Bernal had the idea that also these large molecules could be crystallized and analysed using X-ray diffraction. He started a project led by Max Perutz (Nobel Prize in Chemistry together with John Kendrew 1962) as well as one led by Dorothy Crowfoot Hodgkin (Nobel Prize in Chemistry 1964). Perutz and Hodgkin both performed heroic 20+ year long searches for the structure of proteins (haemoglobin and insulin, respectively). On the way these pioneers, who of course had no computers when they started their projects, had to overcome a large number of technical and fundamental problems and had to devise their own methods of investigation. 
While the projects of Perutz, Kendrew and Crowfoot Hodgkin mark the beginning of molecular biology, the molecular genetics revolution was started by the intelligent modelling of the structure of the nucleic acid (DNA) by Francis Crick and James Watson, also at the Cavendish Laboratory, using X-ray diffraction data taken by Maurice Wilkins and Rosalind Franklin in London (Nobel Prize in Physiology or Medicine 1962 to Crick, Watson and Wilkins). What came out of these projects was not only the atomic structure of the proteins and the DNA, respectively, but also the insight that the structure could explain some of the reactions that the molecules take part in. This has become the overriding motivation for all the later studies of the structure of proteins: How do they behave “at work”?

Since the 1950’s, computers have become important tools used to disentangle the structure of more and more complex proteins with the primary data collected from X-ray diffraction. Today the X-rays are usually produced in high-energy synchrotron storage rings, where electrons or positrons emit X-rays when they are accelerated in the bending magnets or in different types of magnetic insertion devices. At the X-ray beam lines, research equipment is put up by university scientists, but also by companies performing structure determination on a commercial basis. Examples of successful investigators rewarded with Nobel Prizes are Johann Deisenhofer, Robert Huber and Hartmut Michel, who determined the structure and function of a two-dimensional so-called membrane protein, a photosynthetic reaction centre (Nobel Prize in Chemistry 1988); John Walker, who determined the structure and dynamics of the enzyme ATP synthase, which surprisingly contains a rotating part (Nobel Prize in Chemistry 1997 with Paul Boyer); Roderick MacKinnon, who unravelled the structure and dynamics of the ion channel protein, which acts as a tunnel into the nerve cells (one of the Nobel Prizes in Chemistry 2003); Roger Kornberg, who determined the molecular mechanism that copies single strands of DNA into messenger RNA (Nobel Prize in Chemistry 2006); Venkatraman Ramakrishnan, Thomas Steitz and Ada Yonath, who managed to determine the structure and function of the protein factory of the cells, the ribosome molecule that gets its information from the messenger RNA (Nobel Prize in Chemistry 2009); and Brian Kobilka and Robert Lefkowitz, who as a part of their investigation determined the structure of G protein-coupled receptors using X-ray crystallography (Nobel Prize in Chemistry 2012).

But even though the actual X-ray diffraction work has become more and more of a standard procedure, the problem of producing and keeping complicated protein crystals intact during the investigations is still something that has to be solved over and over again. As if this problem was not enough, there is also a more fundamental one, the so-called phase problem. Without going into details, the phase problem is connected with the fact that the spots on a photographic plate can give you a measure of the intensity of an electromagnetic wave, but not of its phase. Since the latter is needed for a mathematically stringent extraction of the crystal structure from the diffraction data, a number of practical methods have been used to overcome this problem. One is producing crystals with and without insertions of heavy atoms and comparing diffraction data for the two kinds of crystals. But in the 1950’s, the mathematician Herbert Hauptman and the physical chemist Jerome Karle devised a probabilistic approach to the problem and showed that by solving a certain large set of equations, the crystal structure can in fact be determined in a direct method (Nobel Prize in Chemistry 1985).
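
The loss of phase information can be illustrated with a deliberately simplified toy model. The sketch below assumes a one-dimensional "crystal" described by eight density values (an invented example, not real data) and uses a plain discrete Fourier transform as a stand-in for diffraction. It compares a density with its mirror image: the two arrangements differ, yet their intensities are identical, so intensities alone cannot tell them apart.

```python
import cmath

def dft(values):
    """Discrete Fourier transform: one complex amplitude per 'spot' index h."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * h * x / n) for x, v in enumerate(values))
        for h in range(n)
    ]

def intensities(values):
    """What a detector records: squared magnitude of each amplitude, no phase."""
    return [abs(f) ** 2 for f in dft(values)]

# Hypothetical 1D densities: the second is the mirror image of the first.
density_a = [0, 3, 0, 1, 0, 0, 0, 0]
density_b = density_a[::-1]

for h, (fa, fb) in enumerate(zip(dft(density_a), dft(density_b))):
    print(
        f"h={h}: intensity {abs(fa) ** 2:6.2f} vs {abs(fb) ** 2:6.2f}, "
        f"phase {cmath.phase(fa):+.2f} vs {cmath.phase(fb):+.2f} rad"
    )
```

In this toy case the phases are what distinguish the two densities. Heavy-atom comparisons and direct methods are different ways of recovering that missing information; the code does not implement either of them.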

Lindau Lectures and X-ray crystallography
In the Lindau Mediatheque database of lectures that have been recorded since 1951, there are today (2013) around 50 lectures by Nobel Laureates who have used X-ray crystallography as a method of investigation. Instead of following a purely historic path among the lectures, we here jump immediately to one of the more recent ones. This lecture describes how one of the most complex protein structures, the ribosome, has been revealed using this method. The ribosome is the “factory” molecule that produces proteins as ordered by the genetic code through the messenger RNA. The first snippet comes from a lecture given in 2011 by one of the 2009 Nobel Laureates in Chemistry, Ada Yonath. She describes, in particular, how she was led to the idea that one could produce crystals of such a complicated molecule as the ribosome and how she and her collaborators overcame the first set of difficulties in using these crystals to unravel the structure of the molecule. The title of the talk is “Climbing the Everest Beyond the Everest”.

Ada E. Yonath (2011) - Climbing the Everest Beyond the Everest

After yesterday I think that I really have to show you what I called yesterday the evidence. How this project started and continued. Maybe it is as important as the result. So let’s climb the Everest beyond the Everest. I think that you all know that the ribosome is the last chain, the last member in the chain of translating the genetic code into proteins. The genetic code is being expressed in messenger RNA, which is very similar to DNA but can exist as a single strand; therefore it can direct the information and it is available for translation. So ribosomes are universal. Each ribosome can translate every messenger, every sequence in a similar fashion. There is a huge number of ribosomes that function in each cell. Mammalian cells can contain millions of ribosomes. This is a piece of information I didn’t know when I started. Even bacteria, in the log period, can reach 100,000 ribosomes functioning together. In my opinion, I'm a chemist or at least I studied chemistry, ribosomes are amazing molecular machines. They act continuously, and they can make between 15 and 40 peptide bonds in one second. I needed eight hours in the lab. You can imagine what would happen to your life if (laughing). I also needed 100°C and pH of 2. And the ribosomes can do it in every cell under almost all conditions known to support life. They also hardly make mistakes. The number in the literature until recently was 1:1,000,000. Today people speak also about 1:10,000. In my opinion both are admirable. The main constellation is an L-shaped double strand with an anti-codon loop that can make codon, anti-codon interactions here like (inaudible 00:02:25). They carry their amino acid far away from here, it’s about 70 angstrom always in all cells, which is coordinated to this one. So the messenger RNA is attracted and bound to the small ribosomal subunit; all ribosomes are made of two subunits no matter from which cell they come.
And they exist like two separated subunits until they start functioning. So the cell has two different subunits, two separated subunits. They associate with each other when they have to function. The functioning is first to attract the messenger, then to design the first initial codon to be translated into the translating decoding site, which is called the P-site, tRNA site. tRNA binds there by codon/anti-codon interactions, and the proofreading machinery that checks that everything is fine is also here in the small subunit. This is why we call the small subunit the brain of the ribosome. The hands that make the bond are in the large ribosomal subunit, and this is universal, no matter which cell we are looking at. So what I just told you is shown here: the small subunit has a narrow part here. In the past, and also today, people like to call this the head and the body and the neck of the small subunit. Messenger wraps around it. The first site to be populated is called P, peptide transfer; this is tRNA. Symbolising the tRNA with the amino acid. There is another site that will be occupied afterwards, called the A-site, where the tRNA with the amino acid will come, amino acid related tRNA. And there is an exit site from where the tRNA will go out afterwards. So once we have this all made, small subunit, messenger and tRNA and initiation factors which I don’t show here, non-ribosomal initiation factors, the large subunit can come. The surface complementarity is very good but not perfect. So in order to have it perfect, first of all inter-subunit bridges are being made like those. By conformational changes, mainly of the large subunit but not only. There are thirteen of those and I symbolise them in three triangles. When everything is fine, the P-site tRNA with the amino acid is positioned so the amino acid is at the entrance of a very long tunnel that spans the large subunit. Through which the newly born protein will progress and exit. So now everything is ready for the next journey.
It will go now more or less sideways to this position and make a peptide bond here, which is the first bond for the newly born protein, like that. So now we have a dipeptide, the beginning of the newly born protein, of the nascent chain. The first tRNA is now empty, it can go out and a new one can come in. These things happen more or less at the same time, but I'm just a human being, I couldn't draw them at the same time. So that’s the story and it will continue; the translocation from A to P and to E is mainly in this direction. So many ribosomes act together, you can see it here. I must say that I am amazed by this electron microscope grid. Every time I look at it again; it's from the ‘70s, I don’t even know who did it. I found it in the literature without reference. It shows many ribosomes, these black things, black bodies, that are bound to the same messenger RNA and act continuously. You can see that each of these black things has two subunits; you can see here small large, small large, small large, small large and so on. You can see that they read the same messenger and the protein comes from the other side. The only thing that is not exactly as it happens in the cell is that the protein comes out flat. Within the cell, you surely know, it folds immediately when it comes out of the ribosome, either alone or with chaperones, with the aid of chaperones. So that’s the story. When we got the first structures we made a movie according to them. So the movie that you will see now was made by art students, based on our coordinates, together with us, with interpolation between our structure and others as it was known in 2002. So messenger comes to the small subunit, it waits for it and, pay attention, now, now it comes and it took it, it’s a big conformational change. These ones are initiation factors, the blue ones, initiation factor three and two.
The two brought the first tRNA and now we have an initiation complex; the large subunit can come and make bridges by conformational changes. And we have an active ribosome that will accommodate the tRNAs that are brought by other factors, elongation factors, on one side. And help the empty ones coming out on the other side; ribosome is really working. Peptide bond is being made here, the decoding is there. Now we take away the large subunit and you see something that I want to mention. The motion is more or less sideways. But here on the other side of the tRNA there is a conformational change. I just said I won’t be able to talk about it much today, but at least pay attention. So the process continues until there is a stop codon. And then the tRNAs are being replaced by factors, either release factors or recycling factors. This gold thing that will come in and separate between the two subunits. The tRNAs can go home, the protein can fold, the two subunits can wait for the next job to do. So what you saw in the movie was not very clear; we know now the position of every atom in the ribosome. Molecular weight of the bacterial ribosomes is 2.5 million. What you see here rotating is the bacterial small subunit. They are more or less the same in all bacteria. Changes are minute. They sediment with a sedimentation coefficient of 30, so we call it 30S. Total molecular weight of the small subunit is 0.85 megadalton. And what you see here rotating is the complex; the RNA, the ribosomal RNA, is in grey or silver. And in the small subunit of bacteria it contains about 1,600 nucleotides. The curly things, the coloured curly things mainly on the surface, are ribosomal proteins. Each of them is now shown in a different colour. There are up to 21 different proteins in this small subunit, it means between 20 and 21. And the task of the small subunit, to remind you, is decoding of the genetic code.
The large subunit, its task is to make the peptide bond, to make sure that it’s being elongated, the nascent chain is elongated. It means it’s a polymerase, and to protect the newly born protein in the tunnel that I showed earlier. Sedimentation 50S. Total molecular weight is 1.5 million in bacteria. Two chains of RNA, total 3,000 nucleotides, one of them very long, 2,800, and the other the rest. And up to 34 different proteins. When we look at the surface that will come together in the assembled ribosome like that, we see that the decoding is here in the small subunit and peptide bond formation here, and you can see here how they are connected by the tRNA: decoding here and peptide bond here, the amino acid is bound here. We can see that the regions where things are happening, the two regions here and here, are made of RNA, not proteins. Only silver, which tells us that the ribosome is actually a ribozyme. It’s an RNA machine. And this is in good correlation with the ribosome composition. In all ribosomes except in mitochondria the ratio of RNA to proteins is 2:1, it means twice as much RNA. Even in mitochondria, where part of the RNA is changed by proteins, the active sites are preserved and they are RNA made. And this was suggested by Francis Crick back in the ‘60s and was not easily digested by scientists who thought that making a bond is an enzymatic reaction and enzymes are known to be proteins. Well, Francis Crick was right! What does it show us? What does it hint at? It hints that, first of all, usually RNA machines, ribozymes, are lousy enzymes. Not efficient enzymes. There are not many known today and all of them are rather slow and sometimes not very selective. The ribosome, as I said earlier, is an extremely efficient machine. So nature found a way to improve the ribozymes, to improve the RNA machine and make it efficient. Nature can find ways when it’s needed. Second, it shows that RNA existed before proteins. I want to talk about it for a second.
We think that within this complex ribosome we identified a little, a little piece. It looks here large because we zoom in on it, but it’s actually not more than 3 or 4% of the whole RNA of the ribosome, depending on whether it’s bacteria or higher organisms. This region can make the peptide bond and it is, in our opinion, a remnant of the prebiotic machinery for bonding. That is still functioning today in the modern ribosome. We are very excited about this finding and we are trying now to understand it better and to prove it. On what do we base our thinking? I cannot go into the whole thing now, I just tell you the most important two facts. First of all, it can make peptide bonds, the whole machinery is there. Second, it has an incredible conservation. Throughout evolution, 98% conservation. The last is that when we look only at that machine from a chemical point of view, it is stable, it can make the bond right here and it has an orientation that can support itself. So that’s what we call the pocket. Unfortunately I can’t talk about it today. But from this we studied something, or we can suggest something. It’s a question that the world is busy with: what was first, the chicken or the egg? The DNA, the genetic code, or the proteins? Or the machines of life. Well, in our opinion first was the ribosome, not the genetic code, neither the proteins. Okay, the ribosome is the target for antibiotics; the natural antibiotics are weapons that bacteria of one type use for eliminating other types. And because the ribosomes are so important in producing the proteins, they are the target for many antibiotics that are of clinical use. Over 40% of the useful antibiotics target protein biosynthesis. We like to compare the ribosome and antibiotics to David and Goliath. David, this little kid, had a very small stone and he had to eliminate this huge Goliath that was mainly covered and protected. So he aimed at the forehead because the forehead was exposed and important. He didn’t aim at the hands, for instance.
He was successful. Because he touched a functional, a very important functional site. The reason that we compare or we like to discuss it like that is the molecular weights. Antibiotics are less than 1,000; ribosomes, bacterial ribosomes, which are the ribosomes that we want to hamper, they are 2.5 million. How does it happen exactly like the stone of David? The antibiotics hit the ribosome at its functional sites. It disturbs the function. So here are some ribosomal antibiotics that are bound in their binding position to the small subunit where the messenger binds or where the tRNA has to be decoded. All those that allow the motion that is associated with it, and in the large subunit again those that allow the motion, you remember in the movie there were hands going in and out. These disturb the motion, or in the active site, or at the tunnel through which the proteins go. I want to discuss only a few of them. First, what you see rotating is the large subunit, RNA in blue and all proteins in green. The three coloured things inside: the magenta is chloramphenicol, which binds to the A-site. To the lower part of the A-site. In yellow is clindamycin, which disturbs the creation of the peptide bond. And in red is erythromycin, which binds to the tunnel. Erythromycin is the first antibiotic targeting the ribosome that came into use. And I want to show where it binds. So first let’s look at the tunnel. Here is the large subunit, with A and P tRNAs in it; a bond will be made here. And the tunnel is highlighted here by modelled polyalanine. So the tunnel has an uneven shape, and near the constriction, the narrowest part, is the binding point of erythromycin, and the family that came afterwards is called macrolides. You can see here what we mean by blocking the tunnel. So protein will be made here; what you see here is a section through the ribosome, like we'd like to do with apples. Peptide bond is being made here, protein will go through the tunnel if it’s not stopped by erythromycin.
Looking into the tunnel from the top, this is the large subunit, entrance to the tunnel a little bit lower; erythromycin blocks most of it so protein cannot go through. Macrolides are made of a ring; the normal macrolide, erythromycin, has 14 members and 2 sugars. Erythromycin was found to be a very, very successful and very useful antibiotic when it came into use in the last century, in the middle. But not very stable in acid solutions, which is the stomach. And companies modified it in two positions to make the stability higher. I want to show you what happened. The first is looking at the entrance to the tunnel; what you see here is the whole RNA of the large subunit. Because the antibiotic macrolides bind only to RNA, I can show just RNA. Entrance to the tunnel, I zoom in: erythromycin binding this way. There are several more binding modes, but all of them are blocking the tunnel in more or less the same way; it shows that there is flexibility even in binding, depending on how the experiment was done. However, the binding, the anchors are the same. Addition of clarithromycin, better blocker and more stable; addition of a longer arm is roxithromycin, which blocks better and is more stable in the tunnel because it has more anchors. And maybe this provides, to my understanding for the first time for antibiotics, the structural basis for dosing. So in many cases 150mg of roxithromycin twice daily has the same effect as 500mg of erythromycin four times a day. So it’s 1:6, it’s very impressive. So I already said that all antibiotics bind to ribosomal functional sites. I also said ribosomal functional sites are highly conserved. Mandatory for clinical use is the distinction between the pathogen and the patient; we want to kill the pathogen, not the patient. So if we target ribosomes without distinction it’s a problem. How do the antibiotics differentiate between patients and pathogens? By subtle differences; I want to show you one. So here is a section through the tunnel, these are the tunnel walls.
You cut it like that. And you already saw that it has an uneven shape. So in grey is the tunnel. Inside here are all types of structures of antibiotics from the macrolide family, erythromycin family, that block the tunnel perfectly, almost perfectly, but always very well. And only one nucleotide is being highlighted, A2058, nomenclature according to E. coli. A2058 is the name of the game; it’s adenine in eubacteria. Adenine, erythromycin, very high chemical affinity here. All pathogens are eubacteria. So all pathogens can bind this way. Have a look here, this is us, all higher animals. G instead of A in this position. But when there is a G here, this is the only difference, when there is a G here there is rejection based on two short contacts and no binding. That’s all. So I showed you how the differentiation is made; now I have to talk about the problem, which is resistance. And resistance is a very, very acute problem in modern medicine. How do the pathogens acquire resistance? A prominent way is to hit the anchors. Like here, you remember 2058, now you see it from the other side. And erythromycin: these are good contacts and this is the basis for selectivity. It’s also the basis for resistance. So the bacteria, the pathogens, can change their own genome from A to G, or they can make a post-translational modification, erm modification, which is methylation in this position. Most mechanisms really involve the modification of such anchors, but there are some that are mediated by remote changes and I want to show you an example. So again you see here the tunnel. You can see it now in the whole ribosome, it is here. And you can see that the tunnel walls are made mainly of RNA, but there are two proteins whose tips are getting into the tunnel wall, here, you can see here. Macrolide binding site is more or less here, those are below it.
Yet, it was found already in the ‘70s that minute mutations in these two proteins, L22 and L4, can acquire resistance to erythromycin although L22 and L4 are not part of the binding site. And although the resistant mutants bind erythromycin. So how does it happen? We really wanted to know it. In collaboration with Dr. Janice we got this mutant in the ribosomes that we can crystallise; we have crystals now for the mutants, and what we found is that these minute changes did not really change much in the protein position. But triggered a whole chain of changes in the RNA walls. So that the tunnel can now accommodate erythromycin and the newly born protein. So this is a way that we didn’t know about until recently. What do people, companies do in order to combat resistance? I think that always it will be only partially combat, maybe controlling it. Because bacteria want to live, but what do we do? Sometimes there are new compounds with additional anchors, or synergies that have two compounds together. For instance, additional anchors or higher flexibility is shown in azithromycin by the addition of one atom. Just compare this to erythromycin, almost the same chemistry. And it’s one of the most successful antibiotics; it also works against resistant strains. But I was worried when I heard about it, that the Croatian group developed it. Because if it binds resistant strains that have G instead of A, it will also bind to the patient. So what did we get? Luckily I was wrong, and this you can see by comparing the binding of azithromycin to a model for pathogens, the green, and to a model for patients, the H. I cannot talk much about it here and now but you can see the binding happens. But sometimes it’s along and sometimes it’s across. This shows that 2058 is only showing if there is binding, but not how. Two compounds: there is one commercially available drug called Synercid, it has two compounds, one in the active site and one in the tunnel.
You want to see how it binds: here is the tunnel wall, one and second, they glue each other and they block the tunnel. Make me a bit happier and we looked for another one. So Synercid is what I showed you, this is another one. Two compounds, a pair of compounds produced by one bug that grows in Sri Lanka, this is why it’s called lankacidin and lankamycin. We found that both of them bind together, that the interaction between them is very strong and they introduce also conformational change. So we have some hopes about that. Also we found that lankamycin, which is binding to lankacidin, is very similar to erythromycin, yet erythromycin is a competitor whereas lankamycin is a binder. And this gives us a tool to make the coupling better. So a movie that may work of how some of the antibiotics that we know today work: edeine doesn’t let messenger binding. The next one will be tetracycline, which doesn’t let A-site tRNA binding by occupying the space, the position here. The third one is erythromycin that we discussed. This is erythromycin; the last one is clindamycin, which doesn’t let the bond being made. So a minute about inspiration. We are using crystallography; I think that you all know that by using X-ray crystallography we can see details that we cannot see otherwise. The idea is we have crystals, we get diffraction in all directions, we collect it, and if we know how to collect them together and phase them, we do it correctly, we merge them correctly, we get electron density maps, and since X-rays are interacting with the electrons we know from there where the atoms are. And I really don’t go into details, just see what I mean by diffraction. So here there is a crystal; a beam comes in and is diffracted to all types of directions with all types of intensities. But we need crystals. And crystals are entities that have perfect order in three dimensions, or almost perfect order. So in salt, sodium chloride, it is no problem.
In the middle of last century people started to get crystals even of proteins, and a 6th grader could repeat it, so it’s possible; this is lysozyme. That was crystallised even by a child now. But those ribosomes that are so complicated, and you saw that they are very flexible. I didn’t say that they deteriorate fast; there was a big question mark when we started, actually the question mark was negative. It cannot be done. And these are the reasons that I told you earlier; the most important one, in my idea, is the marked tendency to deteriorate. They deteriorate very quickly, within one to four days, in the test tube or in the body. So why did I, being young, almost your age, why did I start it? Because I had a chance to read, even not only scientific, I read about the delegation that went to the North Pole and looked at the metabolism of the sleeping bears. And they found that while they were sleeping the ribosomes are orderly packed inside their membranes. Ribosomes can be orderly packed. Every bear, every winter, it’s not an accident. Why do they do this? This is what I thought, because I thought this is the way to keep a large pool of active ribosomes, because when they get up they want to do things they didn’t do when they were sleeping and they need proteins. So that’s more or less what you can see here. I was not the only one that read it, many other groups read it; for instance, people used this finding in order to make crystals, or mono-layers actually, on membranes. You know what they did? They took fertilised eggs, they put them in the fridge and in three minutes they had this beautiful order. So people thought that the organisation is because of cold, because of cooling; I thought that it’s general stress, and it was shown later that also bad diet does it. But anyway, this happens in the cell and all that we wanted to do is to make it in the test tube and use ribosomes from bacteria. Not from bears and not from fertilised eggs.
So the solution was to crystallise ribosomes from bacteria that live under harsh conditions. And to use procedures that extend their life a little bit; I was lucky, I found one like this. I want to show you harsh conditions, for instance the Dead Sea, the lowest point in the world. It’s in Israel; you can see it from the top, you can see it closer. These are salt crystals that come out of the very saturated sea. But even on them bacteria grow. You see here some colonies that grow on the salt; you can see it even better here. The level of water here is much lower, it’s September. This was the level of water in January, about 70 cm; they still grow. So we found that these could be good ones, or those that grow in the sewage of atomic piles, Deinococcus radiodurans, which can also grow on dust, in hunger, heat and cold, because only a quarter of it is functional and the other three quarters are packing their DNA and will come into action when that one quarter has to be repaired. They also have all the repair mechanisms known today. And to use the procedures, as I said earlier. Using these procedures we could see evidence for organisation even after a day. This is an electron micrograph of a drop that we set up for crystallisation. You see order here. So we could do it even in the lab. So yesterday we talked about persistence and belief. I say faint but solid evidence was given all the way through. I want to show you only a few pieces of it. So the curiosity was the reason but the evidence was the driving force. We wanted to get this picture; in the beginning we got that, and this was good enough for me because it had some evidence for high resolution. And when we cut it, it was very ordered. We looked at it; you see, the benefit of working on very large particles is that you can see them in an electron microscope. So we could see the order inside. We needed twenty-five different conditions, but it was not difficult; it took only four months. 
Because we devised a way, a matrix, to watch it, and when we found the region, for instance the pH region, where we started to see the difference between solubility and sedimentation, we transferred it to capillaries, and look how large the crystals became. They were still not good, but we developed them. And look how nicely they were ordered inside. So I don’t want to tell you about all the types of crystals that you can see from the inside; I will show you in a minute from the outside. We went from this to this; you remember what we desired, I showed it in the beginning. So this was our desire. This is the story. Four years after we started we had a few spots for the first time, still far, but much more than in the first days. And then we got really good crystals, many good crystals that could be exposed. We went to the synchrotron to measure; these are facilities where particles are being accelerated, and we sit on the tangent, in stations, and measure. And six years after we started we found we had good crystals, really fantastic diffraction. But after 0.1 of a second it disappeared. The decay was very large, and then we failed; here comes the Everest. We climbed an Everest, we had good crystals, fantastic, but only there did we find that the way was still very far, that the real Everest was still in front of us. So we could have stopped then; we did not. We thought about why there is damage. I can’t go into detail, but the main thing is that damage happens by x-rays hitting peptide bonds or any bonds, and we can only try to minimise the progression. We did it by cryo temperature; look at my face when the experiment started. At Stanford in 1986, I was not very hopeful. But this was the breakthrough. And today, with improvements, it has become routine in the whole world, and almost all structures are measured this way. So we have wonderful diffraction and the whole world has many more structures. So this is when we introduced cryo-crystallography, together with some other developments for phasing and detectors. 
Please multiply this by a thousand and see how many structures there are now in the world. PDB is the Protein Data Bank; each deposition is a structure. So the thanks. I said yesterday that there were only a few people who could grasp the potential of this work. They didn’t really believe like I did, they still doubted, but they suggested it to the Weizmann Institute presidents, and they let me work for a long time on a project that was considered maybe or maybe not. So Sir John Kendrew, Chris Anfinsen and Alex Rich, and the presidents Michael Sela and Haim Harari. The whole work started, as I said before, at the Max Planck in Berlin, together with Dr. Wittmann, who was crazy about knowing the structure of the ribosomes. He also helped us to set up work in Hamburg near the synchrotron. He died rather early and was replaced by Franceschi and Fucini. Now, because we had two research groups at the same time, I am thanking the young members there for their devotion and determination, in good and, more than this, in bad times. You can see the Hamburg group, with their angels, our technicians, and looking for better bacteria in the Dead Sea. And you can see the Israeli group, which is still functioning; it’s run by Dr. Anat Bashan, she is the senior scientist. I can’t tell you what everybody did, but everybody contributed; what you see here is the collection of the last three years. But I want you to look at Tamara, Tamara Auerbach; she came to us thirteen years ago for ten weeks, and she’s still there. She came as a bachelor, she’s still here, and she now has three kids. But the day I took the picture, and this is the important thing, she had her birthday and she baked a cake. And this is her cake, which shows that for my group ribosomes are considered sweet. I also want to thank my family, which supported me the whole time; it’s the whole family, but I want to focus on my granddaughter. 
And the reason I'm doing it is for you young female that hesitate whether it’s possible to do science and have a family. Have a look, you know what she’s saying, this is a speech she gave without telling me in Paris. As you know she’s very busy, she means me, but she always finds time for me. And at age of five she invited me to her kindergarten to explain what ribosomes are. At the age of ten she gave me the most important prize in my life: The grandma of the year is Ada Yonath. There is no year, and she told me I have to reprove myself every year which means it’s a loving but demanding family. When I fail she will take it off. Thank you.

Yonath Climbs the First Everest
(00:29:45 - 00:37:22)

Next we go back to the intellectual discoverer of X-ray diffraction, Max von Laue, who lectured on “X-ray Interferences” at the very first physics meeting in Lindau, in 1953. This historic lecture snippet is in German, which of course by far was the dominating science language at the time von Laue received his Nobel Prize. Even though the year of the prize officially is 1914, he did not actually receive it until 1920. That year the Nobel Foundation arranged two Nobel Ceremonies, one in June for the Laureates who had not been able to come to Stockholm because of WWI and one in December (as usual) for the Laureates of the year. This explains why von Laue in the beginning of his talk refers to his visit to Stockholm 33 years ago! In the lecture snippet, he talks about the discovery made in Munich in 1912. Of special interest is the role of Arnold Sommerfeld, who was the head of the department and who suggested the investigation, and of the two experimental physicists W. Friedrich and P. Knipping, who performed the actual measurements. If given today, the Nobel Prize could very well have included also one or two of these scientists (but not all three!).

Von Laue Discovers the Diffraction of X-rays (in German)
(00:08:08 - 00:16:05)

Luckily enough, also 1915 Physics Nobel Laureate Lawrence Bragg managed to come to Lindau once, in 1968, to give a lecture from which the next snippet is taken. When Bragg, at the end of the 1930’s, took over the directorship of the Cavendish Laboratory from Ernest Rutherford, there was already an ongoing investigation there of molecules of life using X-ray diffraction. The investigator was Max Perutz, who studied haemoglobin. He was eventually joined by John Kendrew, who looked for the structure of myoglobin. Another investigation was later planned for nucleic acid (DNA) with Francis Crick as investigator and with James Watson about to join. But even though these projects in a sense were quite unphysical, the physicist Bragg kept an open mind and let them go on in his laboratory. So with the Nobel Prize outcome of the investigations, Bragg could say at the beginning of his talk that he represented not only his own subject physics, but also chemistry and physiology/medicine, and therefore ought to be invited every year! The title of his talk is “History of the Determination of Protein Structure”.

William Lawrence Bragg (1968) - History of the Determination of Protein Structure

Count Bernadotte, Ladies and gentlemen. I would first like to take the opportunity to thank you all and Count Bernadotte particularly for the invitation which brings us here. This is unfortunately the first time that I have been able to attend one of these occasions. And I realise how much I have missed. I would only like to point out to Count Bernadotte how widely my subject extends. A few years ago the young people who were working in my physics laboratory, two of them shared the Nobel Prize for chemistry and two of them shared the Nobel Prize for medicine. There is therefore, Count Bernadotte, no reason that I can see why you should not invite me every year to... That is of course if you like to do so. A story I want to tell you is the history of a research which lasted over 25 years. I have always been interested in trying to apply x-ray analysis to more and more complicated bodies. And this story is the analysis of the most complicated bodies which have yet been successfully investigated, the protein structures. When Perutz and I started on this 25 years ago, success seemed almost impossible, they were so complicated. But the prize, the reward for getting them out was a dazzling one. And we felt we must have a try. Now, perhaps I may just outline the problem. The protein molecules are large molecules which play a part in the processes of life. Haemoglobin, the one with which we started, is a molecule containing 10,000 atoms. It has the special function in the body of conveying oxygen from the lungs to all parts of our body and then taking back the carbon dioxide again to be given out in the lungs. When we started this investigation, the most complex structures which had so far been investigated were those done by Dorothy Hodgkin, in particular the structure of vitamin B12, which contains 181 atoms. 
The difficulty of getting out a structure goes up as a high power of the number of atoms in it. So we were trying to do a molecule with 10,000 atoms when the best so far had 180. May I remind you of the nature of x-ray analysis. X-rays fall on a crystal structure composed of the molecules which you are examining. You notice how strongly the x-rays are diffracted by the molecule in different directions. And from this you have to deduce the arrangement of the molecules. You make a measurement on diffraction and the result is the positions of the atoms. Well, may I outline for you the nature of the problem in its full fearfulness. The molecule contains 10,000 atoms, now we have to find their positions. And their positions are defined by their coordinates, what we call parameters. So as we have to find X, Y, Z, the coordinates of each atom, there are 30,000 variables in our equations. Actually it is not as bad as that because fortunately, if I may draw on the board for a moment, the molecule of haemoglobin has an axis of symmetry. So if you know the position of an atom there, you know the position of its partner over there. So we have 15,000 variables only to determine. Now, we have enough equations to determine those variables. Because each spot of the diffraction pattern gives us one equation. A function for the first spot of 15,000 variables equals something we can measure, the amplitude of the x-rays that make that spot. And then you pass to the next spot, F2 and A2, and there are 30,000 of these equations. So that then is the problem, to solve 30,000 equations, which are more than sufficient to determine 15,000 variables. When we proposed to do this I went to Mellanby, who was at that time the head of the Medical Research Council, and I asked him for money, that is of course why one goes to people like Mellanby, and I said: “[The chance of success is practically zero.] On the other hand, the importance of getting it out, if we are successful, is practically infinity. 
And if you multiply zero by infinity it is possible that you will get something.” And I’m very glad to say that he agreed and that started our research. Now, what was my part in this research, because I think it is very wrong for heads of laboratories to talk too much about the work which their young people do, it should be left to the young people to do that. But I can perhaps again on the board illustrate what my part was, ... going back to earth was WLB, going into orbit was Perutz. I think that perhaps describes it, that I was the first stage of the rocket which got this research off the ground... What is the nature of a protein molecule? It is very interesting the way in which nature builds up these very complex molecules which have to play a very specific part in our bodies. Because the main characteristic of a protein molecule is that it does one task in the body and absolutely nothing else. One particular little bit of a chemical process in our living bodies has its appropriate protein molecule which does that and does nothing else. Now, instead of building a special structure for each of these protein molecules, nature has adopted a simpler device. It is a device very like that which we use for making letters serve for so many words. We have some 25 or 26 letters in the alphabet and with those we can build words of any kind conveying the most complicated ideas. That is not the only way in which you can do this, the Chinese, as you know, have a symbol for each fundamental idea, instead of writing letters one after another, they draw a symbol which represents a word. Nature is European and not Chinese. She builds up these protein molecules with 20 simple amino acids. Small bits of chemical structure, not at all complicated, rather conveniently 20 can represent the letters of an alphabet. And the different proteins, just like again words with letters, are built by using these 20 simple building blocks. And arranging them end to end. 
Could I have my first slide, please. This slide shows these chemical bits, they are slightly different on the right hand side, that is what gives them their character. And again, like the letters which a printer uses, the little blocks which he puts to make a word, they are all the same on the left hand side, it is these left hand sides which can fit together. The NH3, condense with a COOH of that one to form a bond, by elimination of water and then you can string these all in a row. And that is how the proteins are built up. Each protein has a characteristic order of these little building blocks which then form a structure which we will be examining. My next slide shows the kind of picture given by the protein molecule. That is only a sample, there are many other sheets of spots like that, but those are the spots we have to explain. One has altogether some 30,000 of those spots in the diffraction picture given by protein and those form the 30,000 equations. The positions of the atoms must explain the strength, strong or weak of those spots, that was our material. And there are so many spots, it’s clear there are more equations than variables, and therefore it should be possible to solve them. And we looked at those for 25 years, trying to find out how to do it. Now, the method of attack, in early x-ray analysis, what one did was to move the atoms about in a likely way to try various structures. Calculate how they would diffract the x-rays and then match that with what was observed. Clearly it is quite impossible to try all positions of 10,000 atoms. So that method was completely ruled out, particularly, too, because we had no idea of how they ought to be arranged, nobody knew what a protein structure should be like. So we had to try another method, the Fourier method, which is now used for all complex x-ray analysis. Could I have the next slide, please. That slide represents various musical notes. The variation of the pressure of the air in a musical note. 
And I use that just to point out the principle of a Fourier series. The top one, which has had its head cut off, is the noise made by a flute, it is very simple, it is a very pure tone, just the fundamental note and one overtone, that and that. This one is a little more complicated, the next one is a clarinet, the next one is an oboe and this horrible one here is a saxophone. Now, the Fourier series is just a representation of the fundamental and the overtones, which you add together to get a curve like that. One can add together the fundamental and the overtones to produce the curve. Only the fundamental and the first overtone for the flute, much higher overtones say for the oboe. Right, lights up, please. Now a crystal, you can think of a crystal as a musical note in three dimensions, if you like. In just the same way that you can build a 1-dimensional curve like that by adding fundamental and overtones. So in a crystal you have periodic variations of density in all directions, in three dimensions, that way and that way and this way and that way. Add those all together and you get the density in a crystal, because a crystal is a pattern and repeats, just like a note repeats. When we examine a crystal by x-rays, we are measuring these overtones. Here is the unit cell of our crystal. We can represent that crystal like the musical note by waves running this way and waves, another set shall we say, running this way. And perhaps you can see that if the strata, if the wave has a big amplitude, this particular white one, then when x-rays are reflected from those planes, they will be strong. That’s to say a strong spot means a strong stratification or Fourier element of the crystal in that particular direction. And if this one is weak, then when x-rays are reflected from there, we would only get a weak reflection. 
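To make the musical-note picture concrete, here is a minimal Python sketch of a Fourier sum: one period of a curve built by adding a fundamental and its overtones. The function name `fourier_sum` and the amplitudes chosen for the "flute" and the "oboe" are illustrative assumptions, not values taken from the lecture or its slides.

```python
import math

def fourier_sum(amplitudes, phases, x):
    """Sum of harmonics at position x, where x = 1.0 is one full period.

    amplitudes[k] and phases[k] describe harmonic k + 1, so index 0 is
    the fundamental and index 1 the first overtone.
    """
    return sum(
        a * math.cos(2 * math.pi * (k + 1) * x + p)
        for k, (a, p) in enumerate(zip(amplitudes, phases))
    )

# Hypothetical "flute": the fundamental plus one weak overtone.
flute = [1.0, 0.3]
# Hypothetical "oboe": several stronger, higher overtones.
oboe = [1.0, 0.8, 0.6, 0.5, 0.3]

samples = 8
for name, amps in (("flute", flute), ("oboe", oboe)):
    phases = [0.0] * len(amps)
    curve = [fourier_sum(amps, phases, i / samples) for i in range(samples)]
    print(name, [round(v, 2) for v in curve])
```

A crystal works the same way in three dimensions, with each diffraction spot playing the role of one overtone.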
In other words, the strength of each of those spots which you saw in the picture is a measure of the strength of the Fourier element which represents that repeat in the crystal structure. So if we measure all these spots and can put together these Fourier elements, then we have the answer to our crystal structure. That sounds very easy, if that is all x-ray people have to do, it is hard to see how they earn their salaries. But actually there is a difficulty. You can with x-rays measure the amplitude of that component but you cannot tell how much that way or that way it is in the crystal structure. Because wherever it exists it gives the same x-ray reflection. To put it mathematically, you can measure these amplitudes but you cannot measure the phases, not directly, anyhow. The strength of the spots does not tell you the phase. So x-ray analysis is a hunt for phases, if we could only find out those phases, we could find out our structure. Well, now for the next 5 minutes I am going to tell you how proteins were not solved. How we started up a blind alley, because it is a rather interesting, I think perhaps there’s a certain interest in this story. When we started we thought that the protein molecule probably had some kind of regular structure. That the protein chains in it would be like say a series, rows of these amino acids regularly arranged. Why did we think that? I think because it was too horrible to think that they were not, because it made it all seem so terribly difficult. So we started off with this idea. Now there is, I always think it is very bad indeed in a lecture to mention mathematics, one ought never to do that. But just to sum up the results of a certain mathematical treatment, may I just tell you about something which x-ray analysts use, which they call a Patterson, after the name of the man who first pointed out the relationship. Excuse me drawing so much on the board, but I always think better if I can draw at the same time. 
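The phase problem described above can be shown in a small one-dimensional Python sketch. Two "crystals" are built from the same three Fourier amplitudes but with different phases; their measurable amplitudes agree, yet their densities differ. The grid size `N`, the amplitudes and the phases are arbitrary choices made for illustration.

```python
import cmath
import math

N = 64  # sample points across one repeat of the 1-D "crystal"

def density(amplitudes, phases):
    """Density at N points built from harmonics 1..len(amplitudes)."""
    return [
        sum(a * math.cos(2 * math.pi * h * n / N + p)
            for h, (a, p) in enumerate(zip(amplitudes, phases), start=1))
        for n in range(N)
    ]

def measured_amplitudes(rho, n_harmonics):
    """Amplitude of each harmonic, which is all that spot strengths reveal."""
    result = []
    for h in range(1, n_harmonics + 1):
        coeff = sum(r * cmath.exp(-2j * math.pi * h * n / N)
                    for n, r in enumerate(rho))
        result.append(2 * abs(coeff) / N)
    return result

amps = [1.0, 0.5, 0.25]
rho_a = density(amps, [0.0, 0.0, 0.0])
rho_b = density(amps, [0.0, math.pi / 2, math.pi])

# The amplitudes match, so the "diffraction patterns" are indistinguishable.
same_amplitudes = all(
    abs(x - y) < 1e-9
    for x, y in zip(measured_amplitudes(rho_a, 3), measured_amplitudes(rho_b, 3))
)
# The densities themselves are clearly different.
largest_difference = max(abs(x - y) for x, y in zip(rho_a, rho_b))
print(same_amplitudes, round(largest_difference, 3))
```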
The real crystal, as I showed, is built up by adding Fourier elements. On the other hand, if you add up just the strength intensities of the spots in a Fourier series, do not bother about phase, you get a diagram which is called a vector diagram or a correlation diagram. In the real crystal, the actual crystal, the atom A there and atom B there, in the Patterson, this vector diagram, starting from an origin here, that vector appears again down here of strength AB. The strength of A multiplied by the strength of B, the mass of A, if you like, the mass of the atom of A and the mass of the atom of B, multiplied together, appears at that distance from the origin. Now, we made a Patterson of haemoglobin. Perutz faithfully measured all the intensities and summed them up in a Fourier series of intensities and got this diagram. Now, this is the difficulty, there are, in the real crystal 10,000 atoms, so in the Patterson there are 100 million vectors, because every pair of atoms gives you another vector. So this is a picture of 100 million vectors laid on top of each other. And at first sight it would seem you would not see very much in such a number. But if there were a regularity, if in the protein molecule there were rows of these amino acids parallel to each other, then we thought there’d be certain vectors which repeat so often that they will be overmastering and we will see them in the Patterson in spite of there being this very large number. So we tried this and the slide, next slide please, shows what the Patterson looked like and we thought, yes, there is something. There is the origin and you can see stripes of density going along this way, which might represent rows of these amino acids in the crystal. And we were very excited. We were even more excited when Pauling slightly later proposed his alpha-helix, which had the same repeat at about 10 Å that the vectors seemed to show in this picture here. My next slide shows Pauling’s alpha-helix. 
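The vector diagram just described can be sketched for point atoms on a one-dimensional lattice. This is a toy model under stated assumptions: the function name `patterson_peaks`, the integer positions and the atomic weights are hypothetical and chosen only to keep the arithmetic exact.

```python
from collections import defaultdict

def patterson_peaks(atoms, cell):
    """Interatomic vectors for point atoms on a 1-D lattice.

    atoms is a list of (position, weight) pairs with integer positions;
    cell is the repeat length. Every ordered pair (A, B) adds the product
    of the two weights at the vector from A to B, taken modulo the cell.
    """
    peaks = defaultdict(int)
    for pos_a, weight_a in atoms:
        for pos_b, weight_b in atoms:
            peaks[(pos_b - pos_a) % cell] += weight_a * weight_b
    return dict(peaks)

# Two hypothetical atoms: weight 6 at position 1, weight 8 at position 4.
peaks = patterson_peaks([(1, 6), (4, 8)], cell=10)
print(peaks)  # origin peak plus one peak for each direction of the A-B vector

# Ordered atom pairs for a 10,000-atom molecule: the 100 million vectors.
print(10_000 ** 2)
```

The origin peak collects every atom paired with itself, which is why it dominates a real Patterson map.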
He said that when the amino acids join in a chain, the form of least energy is a cork screw or helix going round and round. These are the different ends which characterise the amino acids which you saw in my first picture. The common COCHNH of the chain is the backbone here which runs down and is coiled in a helical form. So we said now this is wonderful, probably a protein is a set of Pauling helixes, which are all parallel to each other like logs of wood in a bundle, and so there is a chance that we can find out how these lie and solve our protein structure. We were in a state of great excitement and we were quite wrong because protein has no such simple structure at all. There is a saying in English, perhaps in other languages too, that fools rush in where angels fear to tread. It was very fortunate that Perutz and I were not angels because I think at this stage, if we’d known how much harder the real thing was, how complicated, we would have stopped the research altogether. Now, for the next stage, which had a partial success. I hope I can explain it. The crystal of haemoglobin has a series of shrinkage stages. If you put it in mother liquor of different pH, it will shrink or expand while remaining crystalline, a rather marvellous phenomenon. My next slide shows the nature of this shrinking or expansion. Two sides of the unit cell, the side this way and the side this way remain constant but the third one, the angle changes, and you get a series of stages of haemoglobin, there are more than this really, with different unit cells. Now, perhaps without my going in it too far, I can explain what my next slide means. Perutz measured the diffraction of all these shrinkage stages and plotted the results on one diagram like this. Now, you see our problem, if we looked just at the diffraction picture of the haemoglobin in this projection, it had 150 well marked spots. Now, these spots, each of these spots represented one of these waves of density. 
And as we were looking at the symmetrical projection, the one where there was an axis of symmetry, the corresponding waves of density had to have a phase either plus or minus. By symmetry, either the crest of the wave goes through the axis of symmetry or a trough of a wave, you can’t have a wave in any position if you’re to have a symmetry axis. So we had to assign signs, plus or minus, to about 150 spots and if you work out the number of ways of giving plus or minus to 150 spots, it is 2 to the power of 150, which was a very large number of possibilities to sort out. On the other hand, when we had plotted this diagram - this is the point I hope I can make clear – when we had plotted this diagram it was clear that the diffraction by the haemoglobin changed of course as the form of the crystal changed. The spot appeared in a different place because the crystal axis had different angles. Now, it is a very fortunate fact that if a quantity is plus or minus and if it changes through a zero value, it must be going from plus to minus. I hope that is mathematically correct, mathematicians always have a way of getting round these things. But to a simple-minded physicist it seems clear that if a quantity is real and can only be plus or minus, and if you find that it varies steadily and goes through a zero value, clearly it must be going from plus to minus or minus to plus. So you see, although considering a single form of the crystal of 150 spots had 2 to the power of 150 possibilities of signs, we could draw this and say, now, if that’s plus there, it’s minus there and it’s plus there. That means that if only we knew the sign of one of these loops, as we call them here, the other loops in this row would be known. In other words, we had reduced the number of possibilities from 2 to the power of 150 to 2 to the power of 7. We had reduced it by a factor of 2 to the power of 143, a big reduction. 
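The counting argument above is easy to check numerically. The short Python sketch below uses the figures quoted in the lecture (150 spots, 2 to the power of 7 remaining possibilities); the helper `alternate_signs` is a hypothetical name, and it assumes that every pair of neighbouring loops in a row is separated by a genuine zero crossing.

```python
def alternate_signs(first_sign, n_loops):
    """Signs of consecutive loops in one row, given the sign of the first.

    Assumes each neighbouring pair of loops is separated by a genuine
    zero crossing, so the sign flips from one loop to the next.
    """
    return [first_sign if i % 2 == 0 else -first_sign for i in range(n_loops)]

spots = 150            # spots needing a sign in the projection
unknowns_left = 7      # independent signs left after linking the loops

before = 2 ** spots
after = 2 ** unknowns_left
print(before)                      # sign combinations for 150 independent spots
print(after)                       # combinations left once loops are linked
print(before // after == 2 ** 143) # the reduction factor quoted in the lecture

# Once one loop in a row is known, the rest of the row follows.
print(alternate_signs(+1, 5))
```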
But still the number of possibilities remained rather large and we were stuck. Now, could I have the lights up, please. I want to draw again, I’m afraid. This difficulty was got round by a brilliant discovery by Perutz. And this is where Perutz left the first stage of the rocket and went off into orbit by himself. There is our protein molecule with its axis of symmetry. That protein molecule gives those diffraction results you saw there in the last slide. And we want to know the signs of those results. Perutz found that he could attach two molecules of mercury, atoms of mercury, if you attach one, you attach two, because the crystal has that symmetry. Now, if you attach two atoms of mercury, they add their diffraction to the diffraction by the protein. And the kind of pattern you get from two scattering objects are of course Young’s fringes, a series of fringes at right angles to the line joining the mercury atoms. Not there but in the diffraction picture. I just draw them there to show they are at right angles to the line joining the mercury. So Perutz discovered not only that he could attach the mercury atoms to the protein molecule but that they made a difference in the diffraction, a measurable difference he could observe. Could I have my next slide, please. This is a composite slide, it is the diffraction by the protein without the mercury compared with the diffraction of the protein with the mercury. And the two slides have been displaced slightly, so that the corresponding spots are just under each other. Now, I hope you can see that in many cases the attachment of the mercury has made a difference in the intensity of the spots. Where is a good one to observe - there’s a very striking one. There is a case where the pure protein has a strong spot and the mercury one has nothing, whereas here, when you attach the mercury, the spot becomes strong and the protein alone has nothing. If you look at these pairs, you see in many cases changes taking place in the intensity. 
Perutz was able to measure those accurately. Part of the success of this whole project was that Perutz was a brilliant experimenter and could get reliable measurements. Now, if I could have my next slide, please. You see that at once that tells us all the signs. Here are the mercury fringes, the mercury gives fringes that slope down this way. Actually this slide, I’m afraid, is wrong way round. Doesn’t matter I think we’ll leave it. Thank you. Now, you see, ... if there is a plus fringe and it makes this get less, that must be a negative loop on the diffraction pattern. If on the other hand the plus fringe makes something go up, as it does there, that must be a plus loop. U upside down there means it goes up, D means it goes down. So you see, by noticing whether the spot became stronger or weaker, one could tell whether a loop was negative or positive. Here is another case, a minus, a trough of the Young’s fringe, making this go down, become less, therefore that must have been a plus loop. Whereas it made that one go up, so that must be a minus loop and so on. So we had all the signs and we put those signs into a Fourier series, in this case only a 2-dimensional one to get a projection, and for the first time we got a picture of the haemoglobin molecule. The next slide shows this picture. And it tells us absolutely nothing at all. The trouble is that the protein molecule is so thick that it’s about 30 or 40 atoms thick, they’re all on top of each other and you really can’t see anything. But it was encouraging that anyhow, if you did look at a row of protein molecules, that is what they would look like, it was something in the way of success. It was clear, however, that if we were to go on and get a real picture of a protein molecule, we must do it in three dimensions. Now, that was hard, but in principle it is possible. Again, if I may draw a picture. The unit cell, one of the Fourier components perhaps is like that, but how do we know where it is. 
It may be anywhere this way, because that is a thing you cannot measure with x-rays. But now you see, if you put into the unit cell a heavy atom, mercury or gold or something of that kind, iodine, which you can do, as Perutz discovered. If you find an atom put in at A makes the spot stronger, and an atom put in at B makes the spot weaker because it’s in the trough you see of these waves. An atom put in at C, half way between the crest and the trough, makes very little difference. Then you know you’re right in putting the waves there. Of course, I have turned it the other way round, what you find out is A makes it stronger, B makes it weaker, C makes no difference, therefore my waves must lie like that. To do it really properly, you draw a vector diagram and make it all fit, but that gives the principle. So you see, if you can find out the change in intensity, when heavy atoms are put in like this, you can find the phase. But you need at least three of these heavy atoms to be sure of your phase. One is not good enough, one atom was good enough for Perutz’s haemoglobin, where we had worked out all these loops. But when you have the general problem, when the phase may have any value, you need at least three heavy atoms to get reasonable equations, to find the phase. Perutz could not do that for haemoglobin, it proved impossible at first. And that is where Kendrew came in, Perutz had the great idea of the heavy atom but it was Kendrew who went into orbit first and found out the structure of a protein, because he found he could get four heavy units stuck on to his protein, one called myoglobin. With that he worked out all the phases, using of course a computer. The computer had the task, it is quite a task, it had to form a series, a Fourier series with 20,000 terms. And it had to sum up this series at 250,000 places inside the cell. In order to get the distribution of density inside the cell. 
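The heavy-atom principle can be sketched in Python for the simplest situation, the symmetrical projection where every phase is just a sign. This is not the general three-derivative vector construction Bragg alludes to. The function name `native_sign` and the numbers are hypothetical, and the sketch assumes error-free magnitudes, a non-zero heavy-atom term already calculated from known heavy-atom positions, and a heavy-atom term smaller than twice the native amplitude.

```python
def native_sign(f_native, f_derivative, heavy_contribution):
    """Sign of a native reflection from one heavy-atom derivative.

    f_native and f_derivative are measured magnitudes (non-negative).
    heavy_contribution is the signed, non-zero heavy-atom term, assumed
    to have been calculated from known heavy-atom positions.
    Assumes |heavy_contribution| < 2 * f_native, so the sign of the sum
    cannot flip, and that the magnitudes carry no measurement error.
    """
    heavy_sign = 1 if heavy_contribution > 0 else -1
    if f_derivative > f_native:
        return heavy_sign    # spot got stronger: same sign as the heavy atoms
    return -heavy_sign       # spot got weaker: opposite sign

# Synthetic check with made-up signed values: (native, heavy-atom term).
cases = [(10.0, 3.0), (-10.0, 3.0), (8.0, -2.0), (-8.0, -2.0)]
recovered = []
for true_native, heavy in cases:
    derivative = abs(true_native + heavy)  # what the experiment would measure
    recovered.append(native_sign(abs(true_native), derivative, heavy))
print(recovered)
```

In the general case the phase can take any value, which is why one derivative is no longer enough and several heavy-atom sites are needed.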
Then there was the problem, when you had got all these densities as figures by the computer, what did they all mean? My next slide shows how, ... this is a slide of Kendrew trying to think what his results meant. He took a large room in the laboratory, he bought five kilometres of brass wire, which he stood up on blocks of wood, and then he got six young ladies who put little coloured clips, blue meant very dense and red meant very little, you see, on the wires. Because from the results from the computer you could see what the density was. So he assembled together all these little coloured clips and then the wires were on blocks of wood that could be moved a little so that he could walk inside and he tried to see what that meant in terms of atoms. And you can see here, he is beginning to build up the structure of the protein. These wires represent lines between the atoms. Wherever the wires join, there is an atom. And he is piecing it together and finding little bits. My next slide shows Kendrew contemplating the result of all this work. That is the structure of the myoglobin molecule. That is a simpler molecule, it’s only got 2,500 atoms in it. This white thing represents the course, it has a single polypeptide chain, a chain of aminoacids, and this is merely like the white line down the middle of the road, which tells you where the road runs. It just shows the direction of the chain. The next slide is a better one to show the molecule. If one wants to have a very good picture of something, the thing to do is to persuade the Scientific American to accept an article, because it always draws the most beautiful pictures. This is the myoglobin molecule, there’s quite a lot of Pauling helix in it, you can see a bit like that and a bit like that, this bit here with its iron atom and what is called the haem group around it, that is the place which holds the oxygen. 
In myoglobin there is one haem group, it holds one molecule of oxygen, in haemoglobin there are four, holding four molecules of oxygen. The myoglobin has to keep the oxygen in our muscles. So that then is the success of Kendrew’s results. The first protein to be worked out. Well, now I’ll say a little bit more about the structure of another protein, lysozyme. This I’m interested in because this was number two protein to be worked out and was done in the laboratory of the Royal Institution by Dr. Phillips and his colleagues. Lysozyme is a protein which is an enzyme. It has a very specific chemical job to do. Lysozyme is a protective enzyme in our bodies. It can attack certain kinds of bacteria and kill them, and it attacks these bacteria and kills them by destroying their cell walls. The walls of these bacteria have ribs in them, very like cellulose. A compound very akin to cellulose. And lysozyme is able to, as it were bite these ribs in two, to cut them. It applies itself to the wall, it catalyses a change which breaks the cellulose chain by introducing water and turning the usual chemical COC into COH and CH. And so you get a breaking of the chain. Lysozyme, we have lysozyme in our bodies as a protective enzyme. It was discovered by Alexander Fleming who at first thought he’d discovered something with the properties of penicillin. Actually lysozyme is so present already in our bodies that we don’t need any more. And it’s not a medical help. There is for instance a good deal of lysozyme in our tears, in the fluid of the eye, I suppose to protect the eye. Phillips has lent me this next slide which shows the first way in which lysozyme had to be produced. Fortunately it was discovered that egg white has a lot of lysozyme in it and so this process was no longer necessary. My next slide shows the nature of the lysozyme molecule. It’s a big molecule again, nearly as big as myoglobin, and it has a curious cleft in it.
I think perhaps you can see that running down there is a kind of valley, a valley in the molecule. And on either side of this valley are two very active units, the amino acids glutamic acid and aspartic acid. And it is these which are going to do the job of breaking down the wall of the bacteria. My next slide, again the Scientific American, has made the rest of the molecule rather faint, so that you can see clearly the bit of the wall of the bacterium which has placed itself in this cleft. And that is what the lysozyme is going to break down. Now, it is quite fascinating, as Phillips has shown, in the first place there are, along the chain which is part of the wall of the bacterium, there are units that can form hydrogen bonds. Each of these units comes exactly opposite a molecular item in the lysozyme, which can form a hydrogen bond. So when the chain fits into this crack, everywhere there are hydrogen bonds to hold it exactly in place and to bring the weak point of the chain, which is going to be broken, which is at this bend here, exactly opposite the two active units, the two acids which are going to hydrolyse the chain and break it there. So although we have now found out only a few of these proteins, even these first examples show the nature of a protein structure. It is like a kind of machine tool, such as is used in industry. In the case of the lysozyme, the hydrogen bonds are grips which hold the work in place, in exactly the right place. And bring the right point of the work opposite the cutting instrument, which is going to perform the necessary action. These amino acids perform two functions in the molecule. For the most part these amino acids are, how shall I put it, something more than mere packing, but they’re space filling elements. They are elements which are just right to bring the active part of the molecule into the right conformation.
Then the active part is at exactly the right conformation to do the job which nature wants it to do. Perutz has been examining the different forms of haemoglobin; there are many variations of the structure of haemoglobin, changes in the amino acids. But he finds that these, the innocuous changes, the ones that don’t matter, different species of animals have slightly different amino acid contents of their haemoglobin. The ones that don’t matter are all packing, as long as they have about the right shape they’re alright. But the ones which hold the chains together in haemoglobin and the ones which surround the place where the oxygen comes, they are vital, if one of those is changed, the person is either very sick or dies, some form of anaemia. Now there are certain vital amino acids which must be there to do the job. The others which hold those in the right place, you can change quite a lot and get away with it. So that is where the picture of a protein is beginning to form, as one of nature’s machine tools, which just do this one specific job in the body. Now, if I could have my last slide, please. That is the structure of haemoglobin. That is the first crystal to be worked out, to the same scale, the structure of rock salt. And when I look at this picture I always feel how very fortunate I was to get the Nobel Prize 53 years ago, when the standard was so very much lower.

Bragg Describes X-ray diffraction
(00:01:35 - 00:08:40)
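
Bragg's rule for reading signs off a heavy-atom derivative can be illustrated with a small sketch. It is only a toy under stated assumptions, not the procedure Perutz actually followed: the cell is one-dimensional and centrosymmetric (so every Fourier term is real and carries only a sign), the atom positions and weights in `PROTEIN` and `HEAVY` are invented for illustration, and the heavy-atom term is assumed to be smaller than twice the native amplitude, so that "spot gets stronger" really means "same sign as the heavy atom".

```python
import math


def structure_factor(h, atoms):
    """Real structure factor for reflection h of a centrosymmetric 1D cell.

    atoms is a list of (x, f) pairs; every atom at x has a twin at -x,
    so the sine terms cancel and only a signed cosine sum remains.
    """
    return sum(2.0 * f * math.cos(2.0 * math.pi * h * x) for x, f in atoms)


def infer_sign(amp_native, amp_derivative, heavy_term, tol=1e-9):
    """Return +1 or -1 for the sign of the native term, or 0 if undecided.

    Only amplitudes (what the x-ray spots give) are used for the protein;
    the heavy-atom term is assumed known because its position is known.
    """
    if abs(heavy_term) < tol or abs(amp_derivative - amp_native) < tol:
        return 0  # heavy atom sits on a node of this wave: no information
    heavy_sign = 1 if heavy_term > 0 else -1
    # Spot stronger: heavy atom adds to the wave, so the signs agree.
    # Spot weaker: heavy atom works against the wave, so the signs differ.
    return heavy_sign if amp_derivative > amp_native else -heavy_sign


# Hypothetical toy data (positions as fractions of the cell, arbitrary weights).
PROTEIN = [(0.07, 6.0), (0.19, 7.0), (0.31, 8.0), (0.43, 6.0)]
HEAVY = [(0.12, 5.0)]

results = []
for h in range(1, 9):
    f_native = structure_factor(h, PROTEIN)   # true signed value (hidden in practice)
    f_heavy = structure_factor(h, HEAVY)      # computable from the known heavy site
    f_derivative = f_native + f_heavy         # crystal with the heavy atom added
    guess = infer_sign(abs(f_native), abs(f_derivative), f_heavy)
    results.append((h, f_native, f_heavy, guess))
    print(h, round(f_native, 2), round(f_heavy, 2), guess)
```

Each printed row lists the reflection index, the true signed native term, the heavy-atom term, and the sign inferred from the two amplitudes alone. In the general three-dimensional case the phase can take any value, which is why, as Bragg says, several heavy-atom derivatives are needed rather than one.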

As described above, already in the 1930’s, the brilliant scientist J.D. Bernal had conceived the idea of using X-ray crystallography to determine the structure of proteins. His disciple Dorothy Crowfoot Hodgkin went to Oxford and started work on the insulin molecule in her laboratory. It took her about 20 years to determine the structure, but on the way she also determined other structures such as vitamin B12 and penicillin and received the 1964 Nobel Prize in Chemistry before she was actually finished with insulin. She obviously fell in love with the Lindau meetings and lectured there many times. Here we present a snippet of her 1980 lecture entitled “History and the X-ray Analysis of Protein Crystals”, in which she tells the story of the beginnings of the crystallographic work on proteins.

Dorothy Crowfoot Hodgkin (1980) - History and the X-ray analysis of protein crystals

Professor Hoppe and friends, I find Professor Hoppe's introduction very useful to me, I should be illustrating some of the remarks he has made in the course of this lecture. As you have heard already from other speakers, discoveries often get lost in the literature, well, perhaps Dickinson really showed you how to find them, however far back you had to go. And sometimes observations that should lead on to great developments get made and somehow not used, there are gaps of ten, fifteen, twenty years, before they are really, finally put to good use, that everybody works on the problems revealed, as in the case of interferon yesterday. And these gaps are, in themselves, quite interesting, and they occurred in the course of the story of the x-ray analysis of proteins. Protein crystals were observed in plants and in animal tissues during the course of the 19th century, and there are many nice drawings of them made by botanists and others who examined biological tissues. The first one, on my first slide, was made by Professor Schimper and I think from this part of Germany, pictures of crystals observed in plant cells, and you can see, he was looking at them through microscopes and he, there is one, particularly here - is there a pointer? - which you can see, he has viewed through Nicols in different directions, which shows pleochroism. And some of them are protein crystals. I'm meaning this one here where he's obviously got his Nicols in different directions. And the, another one on the next slide, also taken from more than a hundred years ago, from Preyer's book on "Die Blutkristalle", a very lovely photograph of haemoglobin crystals, I think they are dog haemoglobin, showing that he was viewing them through the microscope using Nicols, turning them round so that the crystals appeared different coloured in different directions. And even at that time it was realized that proteins, that molecules, whatever they were, in these crystals were large.
On the next slide there is a little early analysis made, figured in Preyer's book, giving a shot at the molecular weight of haemoglobin. It isn't quite right in any direction because the iron analysis is too high, and, but it gives you see a large figure of 13.000, it should be more like 17.000, and then the actual molecule is four times that. But it was known that there were large molecules in the crystals in the 19th century, and observations were made by both Schimper and Preyer, that showed that to get these beautiful pictures of crystals, you must keep the crystals covered with liquid. And that they dried or shrank when removed from their mother liquor. Now the next discovery was the discovery made in Munich by Von Laue, Friedrich and Knipping, illustrated by the next slide. Oh sorry, this is just another one that shows you a crystal actually growing, a crystal of haemoglobin actually growing in a red blood cell, and you can see that the haemoglobin crystal is occupying almost the whole of one of the blood cells. It's this particular cell, probably the wall was damaged, and so the crystal started to grow, whereas in the next one you can see the normal appearance. And now the photograph which was taken by Von Laue, Friedrich and Knipping in 1912 in Munich by passing x-rays through copper sulphate, and it shows that x-rays have wave lengths of the order of magnitude of diffracting units in the crystal, and that these units must be atoms arranged in a regular arrangement in three dimensions to produce these effects. Von Laue, Friedrich and Knipping didn't go on to work on this crystal, its structure wasn't solved for more than twenty years. It seemed quite complicated in those days, they moved over to a cubic crystal, zinc sulphide, and took very beautiful photographs. But the actually first use made of these x-ray diffraction photographs was made by a very young man, W.L. 
Bragg, age twenty four, in England, who showed how to use the diffraction effects to find the relative positions of the atoms in space in sodium chloride. He was helped by the structure having been suggested to him by Barlow, who in fact had published a proposed structure in 1885, quite correctly, again a long time before. I illustrate the structure on the next slide by an actual section in the electron density in the crystals of sodium chloride. The electrons scatter the x-rays and because they are grouped into atoms, in a regular arrangement in three dimensions, the interference is partly destructive, and from the spectra one can form a Fourier series as W.H. Bragg first suggested in 1915. The spectra provide the components, the terms of the Fourier series, the scattering separates the terms, they have to be recombined to give you back the pattern which produced them. And the recombination has to be done, is usually done mathematically by calculating the contribution of every term observed to every position in the crystal by a mathematical formula, and for this you have to know the amplitudes of the waves, which you can easily measure, and also their relative phases, which are lost in the process but can sometimes be easily recovered. The calculation, though suggested in 1915, was not in fact made till 1926 by Havighurst in America, and Duane who suggested it, pointed out that the phases were, one, known in sodium chloride from Bragg's work, but two, could have been inferred because the heavier atom would dominate the effects, or alternatively, as Bragg had used to begin with, the differences between sodium chloride and potassium chloride, where one ion varied in density, would give a direct method of finding the phase relations. And then the picture can be combined and the electrons plotted, the electron density plotted at any density intervals you liked, to show the arrangement of the atoms.
Now, when the experiments on x-ray diffraction were first made, passing x-rays through crystals, it was natural for different people in different parts of the world to repeat the experiments, W.L. Bragg was one. But they were also repeated in Japan, and in Japan, for the first time in 1913, immediately afterwards, x-rays were put through a protein, silk fibres. And I think the next slide should show you a photograph of silk fibroin. Now, this is a photograph in which the reflections are very fuzzy. Good photographs were obtained first actually in Berlin in 1921/22 by again a very young man, as he was then, Rudolf Brill, who I hope is still alive and living near Munich, as he was a year or two ago, and he took the photographs for his dissertation, helped in the interpretation by Michael Polanyi and Hermann Mark, who were slightly older, in the same laboratory. And the interpretation was that in these fibres there must be long chains of proteins, as indicated by Emil Fischer's experiments, and that these chains were not quite regular, that the amino acids might not repeat quite regularly. So this isn't a perfect crystalline photograph, but one in which the essential intervals shown by these fuzzy spots, are the intervals in the chains. And the next slide shows Professor Mark and Meyer's idea of what the protein chains, amino acid chains should be like to give the actual observed distances between these fuzzy reflections on the silk fibroin photographs. Extended zig-zag chains alternately glycine and alanine in the fibre structure, running through the unit cells. In this period of the 1920s, there were a number of experiments in which crystals were actually prepared in the laboratory from newly isolated enzymes and hormones, urease, Sumner, Northrop, pepsin, insulin by J.J. Abel in America, and it was natural for young crystallographers in the 1920s to try to put x-rays through these crystals too. So in the laboratory of W.H. 
Bragg at the Royal Institution, several attempts were made to get x-ray photographs of insulin, haemoglobin, one of the enzymes, edestin, a plant hormone, and they all got nothing but somewhat vague blurs. Two of the young men present were Astbury and J.D. Bernal, and when they left the laboratory of the Royal Institution, Bernal for Cambridge and Astbury for Leeds, to work on wool fibres at York, they were both very anxious to work on proteins. And they corresponded with one another, and their correspondence exists in the Cambridge University Library where I found it. And Astbury described how he wrote to Northrop for pepsin crystals and Northrop sent him ones and he got absolutely nothing on the photographs, except a sort of, he got two rather diffuse reflections, rather like some of the silk fibroin ones. And he took fibre photographs as well; in fact that particular silk fibroin photograph is taken by Astbury and not by Brill or anyone of the earlier workers. He found that protein fibres in general tended to give two patterns, one from hair when it was unstretched, with two reflections, which he called the alpha pattern, and then if you pulled it out, it gave the pattern that suggested stretched chains, the beta pattern. But he was really anxious to work on crystals and to collaborate with Bernal, the only thing he complained of in his letters was he would like to start a serious collaboration, if only you were not such a soft-hearted chap, and taking on problems for all sorts of other people. And the problem of course that J.D. Bernal was taking on at that particular moment was the structure of the sterols. He had just put x-ray photographs, x-rays through calciferol crystals and shown from his results that the Wieland-Windaus formula couldn't be correct, and so opened the way for a whole new passage in sterol chemistry.
But Astbury wrote, why not ask for haemoglobin crystals, Adair is the bloke, and Bernal, I think, hesitated a little, but suddenly the crystals were brought to him in his hand. They were brought from Uppsala, where they had been grown by a young man called John Philpot, who was a biochemist, learning how to purify proteins with Tiselius. And John Philpot enjoyed skiing. He went off skiing in the mountains for a fortnight, leaving his crystals growing in the fridge, and when he came back, he found his tubes of purified pepsin full of the most marvellous large crystals, about two millimetres long. And as good fortune for the advance of science would have it, there passed through the laboratory Glen Millikan, the son of R.A. Electron Millikan, who was working in Cambridge on fast reactions, and he was shown the crystals and he said, I know a man in Cambridge who would give his eyes for those crystals. And Philpot happened to know the same man, John Desmond Bernal, because he had earlier been involved in the isolation of vitamin D at the medical research institute in our country, and so he, very willingly, handed him a tube of the crystals, which Millikan stuck in his coat pocket right way up, crystals still in their mother liquor, and took them back to Cambridge. The year was then 1933, and when Bernal saw the crystals, of course he immediately did, he looked at them first within the tube under the microscope, and saw they were brightly shining, brightly birefringent, and he took one out of it, being in a hurry to see what was happening, just with a needle out of the tube, and took an x-ray photograph of it, and got exactly what Astbury had got, only perhaps rather less because he was in general a less skilful experimenter, hardly anything on the photograph. And he thought, this must be wrong, went back and looked at the crystals, bright in their mother liquor, and it suddenly struck him that they needed their mother liquor round them to keep their actual form.
And he was lucky in another way because he was working at that time also on the problem of ice and water, and he had in the laboratory a student, Helen Megaw, taking x-ray photographs of ice crystals which she grew in little fine Lindemann glass tubes and kept at low temperatures. So Bernal took just one of her little fine walled tubes, about half a millimetre across, and fished out a pepsin crystal within its mother liquor, sealed the tube at two ends and put x-rays through it and immediately got an x-ray photograph with reflections, ever so many reflections all over the photograph. Now, the next slide should, if I am remembering right, well, first we have the people, so here is J.D. Bernal much later on in life, and talking to Kathi Dornberger, a student who was working at that time with V. M. Goldschmidt in Göttingen, but came to work with him and came to work on some of the protein problems later, and myself, and I must say, this photograph was taken in relatively old age, when Kathi had just become the director of a small institute for x-ray diffraction studies in Berlin East. Now, the next slide shows another character in the story whom I will mention, A.L. Patterson, in his laboratory with two students. And the next slide shows the photographs, this isn't the original pepsin photograph, here are some pepsin crystals swimming about in their mother liquor, and above is a photograph taken of them by Professor Tom Blundell in Birkbeck College, London, showing very, very many reflections along these parallel lines on an x-ray photograph. 
The original photographs we think must have perished at Birkbeck since the laboratory, part of the laboratory was destroyed during bombing during the war, but it, this at least illustrates the character of the picture, quite different from silk fibroin, a definite crystal repeat, you can very easily measure one that is constant, 67 angstroms from the separation of the lines, and the other one, in fact, we got it wrong when first we measured it, we got it about half the size it really is, it's really nearly 300 angstroms, corresponding to the long dimension of the pepsin crystals. I was at that time working with Bernal, and it was only a sort of bit of bad luck, but perhaps it was good luck for science, that I was not in the laboratory the day the first crystals came in, I was having a bad cold or something, and Bernal made all of the first observations himself. I'm always a little afraid I might have got more on the first photograph since it was possible to get more reflections from the dried crystals than Bernal actually did, and so delayed the observation that it was absolutely necessary to keep these crystals in their mother liquor. But I went on to take most of the rest of the photographs, but I do some of the calculations which we didn't carry very far because our first measurements indicated that we had a very large unit cell, that it could correspond to there being 12 pepsin molecules in this cell, each of weight about 40.000, several thousand atoms each you see within each molecule, and that this was, it was beyond our possible means at that time to think that we could work out the structures of such molecules. And the, yet the reflections extended to about 1 1/2 angstroms, it was clear that they were sufficient to show us atoms if ever we could form an electron-density pattern from them and look at it. At that time I was under pressure to return to Oxford to a college teaching appointment that should lead to a permanent appointment. 
I was most unwilling to go but everyone in Cambridge said, difficult to get university jobs in this time, of course you must take it, so reluctantly I went back to Oxford. Bernal had got a small grant to support me, of 200 a year, which he gave to another young person and I think who had come over with W.L. Bragg and was working with W.L. Bragg at Manchester, and heard about the work at Cambridge and wanted to join in. And he came in to take the next protein crystal, chymotrypsin from Northrop, and then passed over to work on virus crystals that played a very important part in the development of the subject, and particularly later in America. And the next year there came another young person to work with Bernal in Cambridge and I think he's shown on the next slide, again much older than he was, Max Perutz, and Max Perutz came from Vienna wanting to work with Hopkins, but Mark had forgotten to ask Hopkins to have Max Perutz as a research student when he visited Cambridge in 1935, because he was so excited by the work that Bernal was doing and sent him instead to work with Bernal, saying there's someone who really needs you, and Max said, but I don't know any crystallography, and Mark said, you will learn, my boy, which he did, in the hard way, for many years to come. The other one in the picture is John Kendrew, who Professor Hoppe mentioned, and he doesn't come into the story for very much longer. Now, what happened to me in Oxford, working, going back to begin work all on my own, was that Sir Robert Robinson who was then Professor of organic chemistry, was given a small present of the first insulin crystals obtained by the firm Boots in our country following a prescription for growing insulin crystals given by D.A. Scott in America, that it was necessary to add zinc to the preparation.
They gave Robinson 10 milligrams in a little tube, and he hadn't any use for them, and knew the work that we had done in Cambridge taking x-ray photographs of pepsin crystals, so he said, why don't you try to photograph these? And they were microcrystalline but very bright and birefringent, and so I looked up all the preparations and grew the crystals finally, not very well, by Scott's method, large enough to take x-ray photographs of, and I made a horrible mistake, I decided that it didn't matter whether they were wet or dry and it was easier to handle them dry. I dried them like good organic chemists did, pouring methyl alcohol over them and then took x-ray photographs of them. And these very dry looking crystals, as you can see, are the crystals, not looking very good single crystals but they are, and up at the top is the little x-ray photograph they gave. Well, they gave an x-ray photograph, spots on the film, and I developed the first x-ray photograph about ten o' clock at night, and waited in the lab while I fixed it and washed it, and then walked out absolutely dazed, very excited, little spots on this photograph, down through the centre of Oxford, away from my lodgings and about midnight I was accosted by a policeman who said: "Where are you going?" So I said, not very truthfully: "Back to college", and turned round and went back. But I woke up in the morning, next morning, about six, and I was suddenly extremely worried, and I went to the - I thought, perhaps those spots, perhaps those crystals aren't really protein crystals at all, but something else, some impurity, some breakdown product in the preparation. 
And I went down very quickly before breakfast to the laboratory and picked one out of the tube and tried protein tests on it, and I tried the xanthoproteic reaction, which consisted of dropping first a drop of concentrated nitric acid and it turns yellow, and then a drop of ammonia and it turns brown, which it did, to my great relief, and I went back, happily, to breakfast. Now I perhaps should tell all those who are young here: Why I knew that reaction so well was because I was rather young and still at school I had a laboratory and did experiments on my own, and I was doing experiments suggested by Parson's Fundamentals of Biochemistry, and completing something one Sunday in a nice new silk frock, and of course one should never do this kind of thing, and I accidently dropped a spot of nitric acid on the front of my dress, and seized the nearest alkali which was ammonia and put it on it, well, of course it was much worse, and I was dreadfully upset, my mother comforted me and said she could cover it all with a frill, which she did, and so this particular reaction is indelibly engraved in my mind and I was very pleased when I could test it once again with the insulin crystals. Now, what happened after that? I didn't really remember until I was back reading the Bernal files at Cambridge, but it's obvious that directly after breakfast, I rang up the Cambridge Lab to tell them that I had taken these insulin photographs and got the very sad news that Bernal was at home with a temperature of 104. So then I wrote a little letter to his wife saying, please tell him when he's well enough that I have these insulin photographs, and I gave the rough dimensions of the crystal unit cell, and they are on the next slide. 
A rhomb is the real form of the crystals, 74.8 across 30.9 high, and within the crystals, there's roughly 36.000 molecular weight of protein, and it should formally be divided into three, by the crystal symmetry, to give you 12.000 molecular weight for the insulin molecules in the unit cell. Bernal recovered and wrote me a letter which begins "Dear Dorothy, Zn 0.52%, Co 0.49%, Cd 0.74%. This gives rather less than three in each case. I am going back to Cambridge on, I forget when, I will send you some cadmium stuff" and any crystallographer can see what he was saying in this letter. This zinc, according to D.A. Scott's observations, is replaceable by other elements, of which cadmium is the heaviest so far observed, you should try and see if the cadmium crystals show changes in the intensity of the x-ray reflections and then you might be able to use the method of isomorphous replacement to determine phase constants for the different reflections and really see the atoms in your crystal. Terribly premature, I'm afraid, again I didn't remember all the details, I found the letter in which I said, "I'm having a terrible time with scholarship examining for the college", and I did make one or two abortive efforts. But I had a feeling that cadmium wasn't really heavy enough to do what was wanted and that anyway, if I did a little calculation on the number of atoms that there were in insulin, it was too large a problem for myself to set out to work on at the age of 24, and that I must try to solve some simpler structure first. I tried out the idea of the isomorphous replacement, again actually in little calculations at the Royal Institution in a notebook on cholesteryl chloride and bromide, while I was taking the insulin photograph which is shown on the previous slide, because it's a very, I took that particular photograph for show, for publication in the Royal Society, using the very big x-ray tube at the Royal Institution for the purpose.
So, I didn't go on with insulin, but Max Perutz went on with haemoglobin. Could I have the next slide please? I went on, as I said, with sterols, and with the sterols I explored the possible use of both heavy atoms and isomorphous replacement for showing electron densities, and this is one of the experiments we did. And I show it because I have to introduce another character in the story, and this is A.L. Patterson. A.L. Patterson, I showed on that earlier slide, was one of the young men, A.L. Patterson, Astbury and Bernal ...

Crowfoot Hodgkin on History
(00:12:28 - 00:20:15)



As already noted, on a fundamental mathematical level, it is really impossible to determine the structure of any crystal or large molecule by only using X-ray diffraction as described above. The reason is the phase problem, where the term “phase” refers to the wave property of X-rays. Even though experimental tricks to overcome this problem have been developed, the 1985 Nobel Prize in Chemistry was given for a theoretical solution to the problem. Around 1950, Jerome Karle and Herbert Hauptman devised the so-called direct method, which turned out to be very useful when computers became available. Both have come frequently to Lindau and here we present a snippet from Hauptman’s very instructive 1989 lecture “A New Minimal Principle in X-ray Crystallography”.

Herbert Hauptman (1989) - A New Minimal Principle in X-ray Crystallography

In contrast to Dr. Deisenhofer’s beautiful lecture mine concerned as it is with methods of crystal and molecular structure determination is of necessity highly theoretical. However I hope to show by my lecture today that it doesn’t follow that it must be incomprehensible as well. The first slide shows in a schematic way the fundamental experiment which was done by Friedrich and Knipping in the year 1912 at the suggestion of Max von Laue. It shows very briefly that x-rays are scattered by crystals and the scattered x-rays if caused to strike a photographic plate will darken the photographic plate at the points where the scattered rays strike the plate. And the amount of blackening on the photographic plate depends upon the intensity of the corresponding scattered x-ray. Because of the consequences of this experiment, because this experiment was the key which unlocked during the course of the next seventy-five years, the mystery of molecular structures, this experiment must be regarded as a fundamental landmark experiment of this century. The slide on the right shows a typical molecular structure. It’s the structure of decaborane which consists of ten boron atoms and fourteen hydrogen atoms. The boron atoms are located at the vertices of a regular icosahedron. I’ve shown these two slides together because I wish to stress the mathematical equivalence between the diffraction pattern, which is to say the arrangement and the intensities of the x-rays scattered by crystal and the molecular structure on the right, the information content of this diffraction pattern and the information content of the molecular structure, which is to say the arrangement of the atoms in the molecule, the information content of these two slides is precisely the same. If one knows the molecular structure shown on the right one can calculate unambiguously completely the nature of the diffraction pattern shown on the left.
Which is to say the directions and intensities of the x-rays scattered by the crystal which consists of the molecules shown on the right. And conversely, if one has done the scattering experiment and has measured the directions and the intensities of the x-rays scattered by the crystal, then the molecular structure shown on the right is in fact uniquely determined. What I would like to describe next is precisely what the relationship is between the structure shown on the right as an example and the diffraction pattern shown on the left. Here we have an equation which I hope doesn’t frighten you. On the left hand side is simply the electron density function, which is simply a function of the position vector r, and it gives us the number of electrons per unit volume. And on the right hand side is the formula which enables us to calculate the electron density function Rho(r), if we knew all these quantities on the right. Those of you who are familiar with the elements of x-ray crystallography, or even with most elementary mathematics, know that this function on the right is simply a Fourier series, a triple Fourier series. The scaling parameter V is not important for our present purpose. This expression on the right is a sum taken over all triples of integers, so-called reciprocal lattice vectors. And on the right hand side we have simply a Fourier series expressed in pure exponential form. We have the magnitudes, or the non-negative numbers which are the coefficients of the exponential function. We have the reciprocal lattice vector H, a triple of integers. We have an arbitrary position vector r, which also has three components. This is simply the scalar product. And here we have the phases of the structure factors, the magnitudes of which are shown here as the coefficients of the exponential function. If we knew everything that we need to know on the right, which is to say these magnitudes and these phases, then we could calculate this function.
This triple Fourier series as a function of the position vector r. And therefore we could calculate the electron density function Rho(r), read off the positions of the maxima of the electron density function, and that would give us the positions of the atoms, or in other words the crystal structure. The problem which was alluded to just a few minutes ago is that although these magnitudes are obtainable directly from the diffraction experiment, from the measured intensities, the intensity of the x-ray scattered in the direction labelled by the reciprocal lattice vector H, although these magnitudes are directly obtainable from the experiment, these phases are lost in the diffraction experiment. And so, although from the very earliest years, because of the known relationship between diffraction patterns and crystal structures, it was felt that the diffraction experiment did in fact hold the key to the determination of crystal and molecular structures, because these phases were missing, because they were lost in the diffraction experiment, it was thought that after all what could be observed in the diffraction experiment was in fact not sufficient to determine unique crystal structures. The argument that was used was a very simple one and a very compelling one. It was simply that we could use for these coefficients, for these magnitudes, the quantities which were directly obtainable from the experiment, which is to say the intensities of the scattered x-rays, in calculating this function. And we could put in for the lost phases, the missing phases, arbitrary values. And depending upon which values we put in for these phases we would get different electron density functions, and therefore different crystal and molecular structures, all however consistent with what could be measured, which is to say the intensities of the x-rays scattered by the crystal.
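The slide itself is not reproduced in the transcript. As a hedged reconstruction from the spoken description only (a sum over reciprocal lattice vectors H, magnitudes as coefficients, a scalar product of H with r, a scaling parameter V), the equation would read roughly as follows. The sign convention in the exponent is an assumption and may differ from the one on the original slide.

```latex
\rho(\mathbf{r}) \;=\; \frac{1}{V} \sum_{\mathbf{H}} \lvert F_{\mathbf{H}} \rvert \,
  \exp\!\bigl[\, i\,\varphi_{\mathbf{H}} \;-\; 2\pi i\, \mathbf{H}\cdot\mathbf{r} \,\bigr]
```

Here the magnitudes |F_H| follow from the measured intensities, while the phases φ_H are the quantities lost in the experiment.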
And it was therefore believed for some forty years after this experiment was done, it was therefore believed that the diffraction experiment could not, even in principle, lead to unique crystal and molecular structures. Now there was a flaw in this argument. As simple as it appears to be, and as overwhelming as the logic appears to be, there was a fatal flaw in it. And that was that one could not use arbitrary values for these phases, for the simple reason that if one were to do that, one would obtain electron density functions which were not consistent with what was known about crystal structures. For example, one of the properties of the electron density function which must be satisfied by every crystal is that the electron density function must be non-negative everywhere. After all, the electron density function Rho(r) gives us the number of electrons per unit volume. And from its very definition therefore it must be non-negative everywhere. On the other hand, for a given set of known magnitudes F sub H, if one used arbitrary values for these phases, in general one would obtain electron density functions which were negative somewhere, for some values of the position vector r, and therefore would not be permitted. So that the known non-negativity of the electron density function restricts the possible values which the phases may have. In fact, it restricts rather severely the possible values which the phases may have. And the non-negativity condition alone, the non-negativity restriction on the electron density function, is in fact sufficient to enable one to solve some rather simple crystal structures. However, the restrictions on the phases which are obtainable in this way are of a rather complicated nature. And therefore the non-negativity condition alone has proven to be not very useful in the actual applications. A much more useful restriction may be summarised in the one word: atomicity.
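The non-negativity argument can be tried out numerically. The following is a minimal one-dimensional sketch, not anything shown in the lecture: the atom positions, the Gaussian atom width and the cut-off `H_MAX` are all hypothetical choices made for illustration. It keeps the magnitudes of a toy structure fixed and compares the density obtained from the true phases with the density obtained from arbitrary phases.

```python
import cmath
import math
import random

ATOMS = [0.10, 0.35, 0.60]   # hypothetical fractional coordinates in a 1D cell
SIGMA = 0.03                 # hypothetical width of each Gaussian "atom"
H_MAX = 30                   # highest reflection index kept in the series


def structure_factor(h):
    """F_h of the toy structure: a sum of phase factors damped by the atom shape."""
    damping = math.exp(-2.0 * (math.pi * SIGMA * h) ** 2)
    return damping * sum(cmath.exp(2j * math.pi * h * x) for x in ATOMS)


def density(magnitudes, phases, n_points=256):
    """rho(x) = |F_0| + 2 * sum_{h>0} |F_h| * cos(phi_h - 2*pi*h*x) on a grid."""
    grid = [i / n_points for i in range(n_points)]
    return [
        magnitudes[0] + 2.0 * sum(
            magnitudes[h] * math.cos(phases[h] - 2.0 * math.pi * h * x)
            for h in range(1, H_MAX + 1)
        )
        for x in grid
    ]


factors = [structure_factor(h) for h in range(H_MAX + 1)]
magnitudes = [abs(f) for f in factors]            # what the experiment provides
true_phases = [cmath.phase(f) for f in factors]   # what the experiment loses

rng = random.Random(0)
random_phases = [0.0] + [rng.uniform(-math.pi, math.pi) for _ in range(H_MAX)]

rho_true = density(magnitudes, true_phases)
rho_random = density(magnitudes, random_phases)

print("minimum density with true phases:     ", min(rho_true))
print("minimum density with arbitrary phases:", min(rho_random))
```

Both densities reproduce exactly the same magnitudes, yet only the first stays non-negative (up to rounding), which is the flaw in the old argument that the speaker points out.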
Since molecules consist of atoms, it follows that the electron density function is not only non-negative everywhere but must take on rather large positive values at the positions of the atoms, and must drop down to very small values at positions in between the atoms. And this requirement of atomicity, this property of the electron density function, turns out to be a severely restrictive one. And in general, at least for small molecules, say molecules consisting of a hundred or a hundred and fifty non-hydrogen atoms, this requirement is sufficiently restrictive that the measured intensities in the x-ray diffraction experiment are in general enough, in fact in general far more than enough, to determine unique crystal structures. What I should also mention before I leave this slide is that we should carry with us the fact that if we know these magnitudes, which as I said are obtainable directly from the measured intensities in the diffraction experiment, and if somehow or other we can find these phases, then by calculating this Fourier series on the right we can calculate the electron density function Rho(r) and therefore determine the crystal and molecular structure. In the next slide I want to show that not only do the crystal structure factors, which is to say magnitudes and phases of the crystal structure factors, determine crystal structures, but that the converse is also true. In order to exploit the atomicity property of real crystal structures, it turns out we have to make a small change in these F’s. We replace these structure factors by what are called the normalised structure factors E, shown on this slide, and defined in this way. Again we have a magnitude E sub H which is directly obtainable from the measured intensities in the diffraction experiment. We have the missing phases Phi(H), and this complex number may be represented in polar form in this way: the product of the magnitude times the pure exponential function e^(i Phi(H)).
Where this Phi(H) is of course the phase of the normalised structure factor E(H). And what this equation tells us is that if we know the atomic position vectors r(J), the r(J) now represents an atomic position vector labelled by the index J. We have here a sum, a linear combination of exponential functions taken over all the N atoms in the unit cell of the crystal. On the right hand side we have the atomic number of the atom labelled J. We have the atomic position vector r(J) of the atom labelled J. H is a fixed reciprocal lattice vector, an ordered triple of integers. Sigma sub 2 is not very important. For our present purpose it is simply the sum of the squares of the atomic numbers of all the atoms in the unit cell of the crystal. What this equation tells us then is that if we know atomic position vectors we can calculate magnitudes and phases of the normalised structure factors E(H). This slide tells us that the converse is true. If we know magnitudes and phases, by calculating the Fourier series we can get the electron density function and therefore the crystal structure. This tells us that the converse of that statement is also true. If we know atomic position vectors we can calculate essentially the coefficients of this Fourier series. However, I’ve already suggested that, because of the requirement of atomicity, measured magnitudes alone provide a very strong restriction on the values of the phases and in fact require that the phases have unique values. But what that means of course is that if we have measured a large number of intensities, therefore magnitudes E(H), somehow or other these phases are determined. And now our problem is, in fact the solution of the phase problem requires that we answer: using only known magnitudes E(H), how does one calculate the unknown phases Phi(H)? Now, this equation tells us actually, right away, if we examine it closely, that we have a complication.
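The slide with the normalised structure factor is likewise missing from the transcript. Reconstructed from the spoken description (atomic numbers as weights, a sum over the N atoms in the unit cell, Sigma sub 2 as the sum of squared atomic numbers), it would take approximately this form. The symbol Z_j for the atomic number and the sign of the exponent are assumptions.

```latex
E_{\mathbf{H}} \;=\; \lvert E_{\mathbf{H}} \rvert\, e^{\,i\varphi_{\mathbf{H}}}
  \;=\; \frac{1}{\sigma_2^{1/2}} \sum_{j=1}^{N} Z_j \,
        \exp\!\bigl( 2\pi i\, \mathbf{H}\cdot\mathbf{r}_j \bigr),
\qquad
\sigma_2 \;=\; \sum_{j=1}^{N} Z_j^{2}
```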
And the complication comes from the fact that the position vectors r(J) are not uniquely determined by the crystal structure. Because if we have a given crystal, then the atomic position vectors r(J) depend not only on the crystal structure but depend also on the choice of origin. If we move the origin around in the unit cell of the crystal, and in this way do not change the crystal structure, we change the value of this function and therefore we change the value of the normalised structure factor E(H) on the left hand side. What this suggests then is that these normalised structure factors, which is to say these magnitudes and these phases, depend not only on the crystal structure but also on the choice of origin. And this of course causes a complication. As it turns out, the crystal structure does determine unique values for these magnitudes no matter where the origin may be chosen. But the values of the individual phases do in fact depend not only on the crystal structure but also on the choice of origin. As you can see, that complicates our problem. Because if the phases are not uniquely determined by the crystal structure, then certainly they are not uniquely determined by measured intensities alone, or by the known values of these magnitudes. Because we have somehow or other to find unique values for the individual phases, we have to have a mechanism for specifying the origin. So what's called for, before we can even hope to solve the phase problem, to calculate the values of the phases for given values of these magnitudes, before we can even hope to do this, is that in the process which leads from known magnitudes to unknown phases we have to incorporate a recipe or a mechanism for origin fixing. Now that, as I say, introduces a complication which is not too difficult to resolve.
The way to resolve it is to separate out the contributions to the value of a given phase. There are, as I indicated, two kinds of contributions to the value of an individual phase: the contribution which comes from the crystal structure, and the contribution which comes from the choice of origin. And the first thing that has to be done is to separate out these two contributions, so we can decide once and for all what part of the value of the phase depends upon the crystal structure and what part comes from the choice of origin. And the best way to do that is to observe something which is not possible for me to show here without causing a lot of confusion. The best way to do that is to introduce the idea of what is called the structure invariant. Which is to say, certain special linear combinations of the phases which have the remarkable property that their values are in fact uniquely determined by the crystal structure, no matter what the origin may be. So the first thing to do then is of course to identify these very special linear combinations of the phases, the so-called structure invariants, and I would like to show on the next slide a typical example of such a special linear combination of the phases. The three-phase structure invariant, the so-called triplet, is simply a linear combination of three phases, Phi(H) + Phi(K) + Phi(L), where H + K + L = 0. If this condition is satisfied, this linear combination of three phases is a structure invariant, and it has the property that its value is uniquely determined by the crystal structure no matter where the origin may be chosen. Now, you can see the fundamental importance of these structure invariants, because it’s only linear combinations of this kind whose values we can hope to estimate in terms of measured intensities alone.
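The origin dependence of individual phases, and the origin independence of a triplet, can be checked directly on a toy model. This is a minimal sketch assuming a one-dimensional cell with equal point atoms. The positions, the shift and the indices `h`, `k` are hypothetical values chosen for illustration.

```python
import cmath
import math


def structure_factor(h, positions):
    """1D structure factor for equal point atoms at fractional coordinates."""
    return sum(cmath.exp(2j * math.pi * h * x) for x in positions)


def triplet_product(h, k, positions):
    """E_h * E_k * E_(-h-k); its argument is the triplet phi_h + phi_k + phi_(-h-k)."""
    return (structure_factor(h, positions)
            * structure_factor(k, positions)
            * structure_factor(-h - k, positions))


positions = [0.11, 0.37, 0.52, 0.80]   # hypothetical atoms
shift = 0.23                           # move the origin by this amount
shifted = [(x - shift) % 1.0 for x in positions]

h, k = 2, 3
before = structure_factor(h, positions)
after = structure_factor(h, shifted)

print("magnitude before/after:", abs(before), abs(after))
print("phase before/after (deg):",
      math.degrees(cmath.phase(before)), math.degrees(cmath.phase(after)))
print("triplet before/after (deg):",
      math.degrees(cmath.phase(triplet_product(h, k, positions))),
      math.degrees(cmath.phase(triplet_product(h, k, shifted))))
```

Moving the origin multiplies each structure factor by a phase factor that depends on its index. In the triplet the three indices sum to zero, so the three factors cancel and the triplet is unchanged.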
We’ve already seen that measured intensities alone do not determine unique values for the individual phases, because the values of the phases depend also on the choice of origin. But measured intensities alone do determine the values of these special linear combinations of the phases. So the phase problem then is really broken down into two parts. First, to use the measured intensities to provide estimates of these structure invariants, these special linear combinations of the phases. And once the values of a sufficiently large number of these structure invariants are known, then we can hope to calculate the values of the individual phases, provided that in the process leading from the estimated values of a large number of these structure invariants to the values of the individual phases we incorporate a mechanism for origin fixing. So these structure invariants therefore play a fundamental role in the solution of the phase problem. They serve to link the observed magnitudes, these quantities here, with the desired values of the individual phases. Because we can hope to estimate these linear combinations of the phases in terms of these measured magnitudes. And once we have estimated a sufficiently large number of these, we can hope to calculate the values of the individual phases. Now, I have to indicate briefly how one estimates the values of these, not only this structure invariant but others as well. In order to do this, the method which was introduced is a probabilistic one. Because of the large number of intensities which are available from experiment, a probabilistic approach to this problem, to the solution of the phase problem, is strongly suggested. And the strategy, the device which is used, is simply to replace these position vectors r, the atomic position vectors r(J), by random variables which are assumed to be uniformly and independently distributed. This is using the language of mathematical probability.
In everyday terms, what we are doing is assuming that all positions of the atoms in the crystal are equally likely, that no positions are preferred over any other. And that amounts to the same thing, that the atomic position vectors r(J) are assumed to be the primitive random variables, uniformly and independently distributed. Now, once we do that then the... (Could I have the previous slide on the right hand side please.) If we assume these atomic position vectors are random variables, uniformly and independently distributed, then the right hand side becomes a function of random variables. The left hand side is also a function of random variables and is therefore itself a random variable. And we can calculate by standard techniques its probability distribution, if we choose to do that. However, its probability distribution will not be useful to us. What will be more useful to us is the probability distribution of the structure invariants, these linear combinations of the phases. (The other direction, you are going the wrong direction, the one before this.) Okay, this is a structure invariant. What we are asking for now is the probability distribution of this structure invariant, because we know from the discussion that I’ve already given that it’s only the values of these special linear combinations of the phases which we can hope to estimate in terms of measured magnitudes alone. Therefore, what we are looking for is the probability distribution of a structure invariant, in the hope that the probability distribution will give us some information about its value. In particular, we are not only looking for the probability distribution of this structure invariant, but we are looking for the conditional probability distribution of this structure invariant, assuming as known a certain set of magnitudes. Because after all the magnitudes or intensities are known. This is what is given to us from the diffraction experiment.
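The probabilistic device described here can be imitated with a small simulation. The sketch below is an illustration under stated assumptions, not the derivation used in the lecture: it works in one dimension with equal point atoms, and the number of atoms, the indices and the number of trials are arbitrary choices. For each random structure it evaluates the real part of E_h E_k E_(-h-k), which equals the product of the three magnitudes times the cosine of the triplet.

```python
import cmath
import math
import random


def normalized_e(h, positions):
    """Normalised structure factor for equal point atoms in a 1D cell."""
    n = len(positions)
    return sum(cmath.exp(2j * math.pi * h * x) for x in positions) / math.sqrt(n)


def mean_triplet_product(n_atoms, h, k, trials, seed=0):
    """Average of Re(E_h E_k E_(-h-k)) over uniformly random structures."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        positions = [rng.random() for _ in range(n_atoms)]
        product = (normalized_e(h, positions)
                   * normalized_e(k, positions)
                   * normalized_e(-h - k, positions))
        total += product.real   # |E_h E_k E_(-h-k)| * cos(triplet)
    return total / trials


n_atoms = 5
estimate = mean_triplet_product(n_atoms, h=1, k=2, trials=4000)
print("simulated mean:", estimate)
print("1 / sqrt(N):   ", 1.0 / math.sqrt(n_atoms))
```

For independent, uniformly distributed atoms the exact mean of this quantity is 1/sqrt(N), which is positive: triplets built from strong reflections tend to have positive cosines, so their values cluster around zero.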
And we want to use that information in order to estimate the values of these structure invariants. What that calls for then is the conditional probability distribution of a structure invariant given a certain set of magnitudes. And on this slide, if I can have the next slide, we’ll see the formula which tells us what is the probability distribution of a structure invariant. Here we have a three-phase structure invariant, Phi(H) + Phi(K) + Phi(-H-K). I’ve written it in this form rather than this form, where we clearly show explicitly that the sum of the three indices, H+K-H-K, adds up to zero, so that this condition is satisfied. This triplet then is a structure invariant, and we can ask for its conditional probability distribution assuming as known these three magnitudes. And these three magnitudes are of course known from the diffraction experiment. I have written down the formula only in the case that all the atoms are identical and that we have N of them in the unit cell. It isn’t necessary to specialise it in this way, but I’ve done so in order to simplify the formulas. This gives us the conditional probability distribution then of the three-phase structure invariant, the triplet, assuming as known three magnitudes. And this is the analytic formula, and in a few seconds I’ll show you what it looks like by means of the next slide. Right now what I would like to emphasise is that we can calculate any parameters of this distribution that we choose, and in particular we can calculate the expected value or the average value of the cosine of this triplet. This is the formula for it; it turns out to be a ratio of these two Bessel functions. It’s not important for us to know what they look like at the moment; these functions are known functions and I’ve abbreviated it by writing T(H). And the important thing, the only thing we should carry away with us, is that the average value of the cosine of the triplet can be calculated from the distribution.
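The formula on the slide is not part of the transcript. Assuming it is the distribution usually quoted in the direct-methods literature for N identical atoms (often attributed to Cochran), it would read as below, with I_0 and I_1 the modified Bessel functions. The label T_HK for the expected cosine is an assumption patterned on the abbreviation Phi(HK) that the speaker uses later for the triplet.

```latex
P\bigl(\Phi \,\big|\, |E_{\mathbf{H}}|, |E_{\mathbf{K}}|, |E_{\mathbf{H}+\mathbf{K}}|\bigr)
  \;=\; \frac{1}{2\pi I_0(A)} \exp\!\bigl(A \cos\Phi\bigr),
\qquad
A \;=\; \frac{2}{\sqrt{N}}\,
        \lvert E_{\mathbf{H}} E_{\mathbf{K}} E_{\mathbf{H}+\mathbf{K}} \rvert ,
\qquad
T_{\mathbf{HK}} \;=\; \langle \cos\Phi \rangle \;=\; \frac{I_1(A)}{I_0(A)}
```

Because A is never negative, the expected cosine is never negative either, in line with the speaker's next remark.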
This is what it’s equal to; it depends only on known quantities, measured magnitudes and the number of atoms in the unit cell. And it turns out always to be greater than zero. The next slide, on this side, shows us a picture of what that distribution looks like. And we can clearly see, when the parameter A shown on the previous slide is about seven tenths, the distribution looks like this. It goes from -180° to +180°, and what the distribution tells us is that the values of the triplet, of the three-phase structure invariant, tend to cluster around zero. There are more values of this triplet in the neighbourhood of zero than there are, let’s say, in the neighbourhood of 180°. So the distribution then, the known distribution which we can calculate, carries information about the possible values of these triplets. And in fact it enables us to estimate the triplet; the estimate in this simple case would be that this triplet is probably approximately equal to zero. But in this case, when the parameter A is only about 7/10ths, the estimate is not a very good one, because values near 180°, while not very frequent, are still possible. It’s still possible to get a substantial number of values of the triplet in the neighbourhood of 180° when the parameter A is only about 7/10ths. However, when the parameter A is larger, as shown on this next slide, when the parameter A is 2.3 or so, the distribution looks like this. Again, values of the structure invariants in the neighbourhood of zero are much more common now than in the neighbourhood of 180°. So in this favourable case, when the parameter A is about 2.3, the zero estimate of the triplet is a particularly good one. When the parameter A is large, bigger than two or so, then we get a very reliable estimate of the triplet.
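To put numbers on the two slides described here, the sketch below evaluates a distribution of the form exp(A cos Φ) / (2π I0(A)) and its expected cosine I1(A)/I0(A) for A = 0.7 and A = 2.3. That functional form is an assumption (the standard one in the literature), since the slide is not in the transcript. The Bessel functions are summed from their power series so that only the standard library is needed.

```python
import math


def bessel_i(order, x, terms=40):
    """Modified Bessel function I_order(x), integer order >= 0, from its power series."""
    half = x / 2.0
    return sum(
        half ** (2 * m + order) / (math.factorial(m) * math.factorial(m + order))
        for m in range(terms)
    )


def expected_cosine(a):
    """Mean of cos(phi) under the density exp(a*cos(phi)) / (2*pi*I0(a))."""
    return bessel_i(1, a) / bessel_i(0, a)


def triplet_density(phi, a):
    """Assumed conditional density of the triplet, phi in radians."""
    return math.exp(a * math.cos(phi)) / (2.0 * math.pi * bessel_i(0, a))


for a in (0.7, 2.3):
    peak_ratio = triplet_density(0.0, a) / triplet_density(math.pi, a)
    print(f"A = {a}: <cos> = {expected_cosine(a):.3f}, "
          f"density at 0 deg / density at 180 deg = {peak_ratio:.1f}")
```

The ratio of the density at 0° to the density at 180° is exp(2A), so it rises from about 4 at A = 0.7 to about 100 at A = 2.3. This is why the zero estimate is described as poor in the first case and reliable in the second.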
And if we can estimate a sufficiently large number of them, as I’ve already indicated, we can then hope to calculate the values of the individual phases, provided once again that in the process leading from estimated values of the structure invariants to the values of the individual phases we incorporate a mechanism for origin fixing. What I would like to do next is show another class of structure invariants, the so-called quartets, which are linear combinations of four phases now, Phi(H)+Phi(K)+Phi(L)+Phi(M), where H+K+L+M is equal to zero. This is very analogous to the triplet that I showed on an earlier slide. It’s a linear combination of four phases now, instead of just three phases. Just as we did with the triplets, so we can do with the quartets. We can find the conditional probability distribution of the quartet assuming as known certain magnitudes. But there is an important difference between the quartet and the triplet which I showed earlier. The distribution actually has a very similar functional form. It’s exactly the same as for the triplet, but the parameter B(LMN) is an abbreviation for this, well, I see I didn’t write the quartet on this slide. I suppose because there wasn’t enough room. But B(LMN) is simply an abbreviation, no, it’s not an abbreviation. B(LMN) is given by this; Phi represents the quartet. What this shows us is that here too we can calculate the conditional probability distribution, of the quartet now, assuming as known not three magnitudes as we had in the case of the triplet but seven magnitudes: E(L), E(M), E(N) and this. These are the magnitudes corresponding to these indices, and three other magnitudes, so-called cross terms. It’s not important to know what these magnitudes are; it’s sufficient to know that the single parameter on which the distribution depends can be calculated from seven known magnitudes, magnitudes obtained from the diffraction experiment.
The important difference though between the quartet distribution and the triplet distribution is that the parameter B on which the distribution now depends may be positive or negative, depending upon the sign of this expression in braces. If these three cross terms are large, then this term in braces will be positive and the parameter B will be positive. And the distribution will have a maximum around zero, as we had for the case of the triplets. But if these three cross terms are small, the expression in braces is negative, and this distribution, instead of having a maximum at zero, will have a maximum at 180°, so that the estimate of the quartet in that case, and it’s a case which can be calculated in advance, becomes not zero but 180°. However, just as in the case of the triplet, we can calculate again the expected value of the cosine of the quartet; again it turns out to be the ratio of Bessel functions, because it has the same functional form as the distribution for the triplet. And we call it for abbreviation T(LMN), but now T may be positive or may be negative, depending upon whether this parameter B is positive or negative. And we know in advance which it will be. So the next slide on this side will show us what the distribution looks like in the case that the parameter B is negative. I’ve shown it for the case -7/10ths. Now, in sharp contrast to what the situation is for the triplets, the distribution has a maximum at 180°. So that the estimate for the quartet instead of zero will now be 180°. But it will not be a very reliable estimate in the case that B has such a small value, because as you can see, values of the quartet in the neighbourhood of zero, while less likely than values in the neighbourhood of 180°, still will occur. What's needed then is a distribution which is sharper than the one shown here, and that will happen when the value of the parameter B is say -1.2.
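The role of the cross terms can be illustrated with a short calculation. The transcript does not give the formula for B, so the sketch below assumes the form commonly quoted for N identical atoms, B = (2/N) |E_L E_M E_N E_(L+M+N)| (|E_(L+M)|² + |E_(M+N)|² + |E_(N+L)|² − 2). All magnitudes used are hypothetical, and the function names are illustrative only.

```python
import math


def bessel_i(order, x, terms=40):
    """Modified Bessel function I_order(x), integer order >= 0, from its power series."""
    half = x / 2.0
    return sum(
        half ** (2 * m + order) / (math.factorial(m) * math.factorial(m + order))
        for m in range(terms)
    )


def quartet_b(n_atoms, main_magnitudes, cross_magnitudes):
    """Assumed quartet parameter B from four main and three cross-term magnitudes."""
    main_product = math.prod(main_magnitudes)
    braces = sum(e * e for e in cross_magnitudes) - 2.0
    return (2.0 / n_atoms) * main_product * braces


def quartet_estimate_degrees(b):
    """Most probable value of the quartet: 0 if B is positive, 180 if B is negative."""
    return 0.0 if b >= 0.0 else 180.0


def expected_cosine(b):
    """I1(B)/I0(B); negative when B is negative because I1 is an odd function."""
    return bessel_i(1, b) / bessel_i(0, b)


main = [2.0, 2.0, 2.0, 2.0]                      # hypothetical |E_L|, |E_M|, |E_N|, |E_(L+M+N)|
b_large = quartet_b(50, main, [2.0, 2.0, 2.0])   # large cross terms
b_small = quartet_b(50, main, [0.2, 0.2, 0.2])   # small cross terms

for b in (b_large, b_small):
    print(f"B = {b:+.3f}: estimate = {quartet_estimate_degrees(b):.0f} deg, "
          f"<cos> = {expected_cosine(b):+.3f}")
```

With these made-up magnitudes the small cross terms give B close to -1.2, the value mentioned in the lecture for a reasonably sharp peak at 180°.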
In that case we again have a peak at 180°, so that the estimate of 180° is rather reliable but certainly not as reliable as we would like it to be. Now the traditional techniques of direct methods have proven to be useful in the case that we are determining structures of so-called small molecules, molecules of less than 100 or 150 non-hydrogen atoms in the molecule. Those can be solved in a rather routine way using estimated values of the structure invariants. The reason that the methods eventually fail when the structure becomes very large is that we can no longer obtain distributions which give us reliable estimates of the structure invariants. As the structures become more and more complex, there are very few distributions which have a sharp peak, whether at zero or 180°. And therefore there are very few structure invariants, whether they are triplets or quartets, whose values we can reliably estimate. And therefore eventually the methods fail. The one point which should be emphasised however, and which I have emphasised on the next slide on the right hand side, is what I’ve called the fundamental principle of direct methods. And this simply states that the structure invariants link the observed magnitudes E with the desired phases Phi. By this I mean, and this is what the traditional direct methods tell us, the direct methods for solving the phase problem, that if we can estimate from measured intensities alone a sufficiently large number of these structure invariants, whether they are triplets or quartets or whatever, then we can hope to use those estimates, which are after all determined by the measured magnitudes, to calculate the values of the individual phases, provided that in the process leading from estimates of the structure invariants to the values of the individual phases we incorporate a mechanism for origin fixing.
For this reason the structure invariants serve to link measured magnitudes, known magnitudes, with unknown phases. But they require that we estimate fairly reliably the values of a large number of structure invariants. Well, we can’t do that for very complex structures; for very complex structures we don’t get a sufficiently large number of probability distributions which yield reliable estimates for the structure invariants. So we have to do something else when we try to strengthen the traditional direct methods to be useful for much more complicated structures, say structures in the neighbourhood of three or four or five hundred or even more non-hydrogen atoms in the molecule. We have to do better than we have done in the past. But again we use the fundamental principle of direct methods. We use again the fact that it is the structure invariants which link these measured magnitudes with unknown phases, even though we can no longer estimate reliably the values of a large number of these structure invariants in the case of very complex molecular structures. We can always calculate reliably these conditional probability distributions. So just as for the traditional direct methods, the structure invariants link known magnitudes E with unknown phases Phi. Now they again link these magnitudes with these phases, but the property of these structure invariants which we surely know is their conditional probability distributions. That we surely know. And so we can try to solve the following problem. We can try to estimate the values of a large number of individual phases, say several hundred, three hundred, four hundred or five hundred individual phases in one block, at one stroke, by requiring that the values have the property that when we construct from those phases, several hundred phases, all the structure invariants which we can construct,
let’s say all the triplets and all the quartets, those structure invariants have a distribution of values which agrees with the theoretical distributions. We know their theoretical distributions, and we require that the individual phases have such values that when we generate all the triplets and all the quartets which we can, their distributions, their conditional distributions assuming as known certain magnitudes, agree with the known theoretical distributions. The one thing we know for sure is that even for complex structures we know the probability distributions of the structure invariants. We may not be able to use these distributions to give us reliable estimates of the structure invariants, but we know their distributions. And we have from this point of view a tremendous amount of overdetermination, because from a set of say three hundred phases or so we can generate in any given case some tens of thousands of triplets and hundreds of thousands of quartets. And we know of course the distributions of all these triplets and all these quartets. And we can ask the question, whether we can answer it or not is another question, but we can certainly ask the question: what must be the values of the individual phases so that when we generate these enormous numbers of structure invariants, perhaps millions of them in any given case, they have distributions of values which agree with their known theoretical distributions? If I may use that term. So that’s the problem that we try to answer now, and I hope in the next few minutes to tell you what the answer to that question is. On this slide I have just a brief summary of what I’ve already shown. I’ve already shown that for the triplets, Phi(HK), and for the quartets, Phi(LMN), we can calculate these parameters of the distribution. For example the expected value of the cosine for the triplet; I already showed you the formula for that.
We can also calculate what I’ve called the weight, which is the reciprocal of the variance of the cosine. I haven’t shown you the formula for that, but it’s easily calculated once we know the distribution. And we can do exactly the same thing for the quartet: we can calculate, as I’ve already shown you, what the expected value of the cosine of the quartet is, and we can also calculate the variance of the cosine of the quartet. So we can assume that these are known parameters of the distributions that we are concerned with. I should mention one other thing that I haven’t stressed. That is, because from a set of phases, let’s say three or four hundred phases, we can generate hundreds of thousands of invariants, it follows that there must exist a very large number of identities which the invariants must satisfy. The very fact of the redundancy here, the fact that we can generate hundreds of thousands of invariants from just a few hundred phases, means that the invariants must of necessity satisfy a very large number of identities. We shall make important use of that overdetermination property of this method. On this slide I’ve shown you what the mathematical formulation is of the requirement that the structure invariants, these hundreds of thousands of them which are generated by a set of several hundred phases, obey their known theoretical probability distributions. The requirement is very simple. Here we have the triplets. Here we have the quartets; incidentally, in this work it’s absolutely essential that we use the quartets in addition to the triplets, although the traditional direct method depends mostly on the triplets and very little on the quartets, if at all. For the present formulation we need to have both triplets and quartets, because of the fact that with the triplets the only estimates that we can obtain are the zero estimates, where the cosines are positive.
But for the quartets, where the most probable values may be 180°, the cosines are negative and we need to use those quartets. We have one or two orders of magnitude more of these so-called negative quartets, quartets for which the expected value of the cosine is negative, and we need to make very strong use of those. Well I’ve already told you that these parameters, this T is determined from the known distributions. It’s simply the expected value of the cosine of the triplet. This is the expected value of the cosine of the quartet. These are simply weights which I already described before and which are related to the variances of the cosines of the quartets and triplets. So all these parameters are known. Phi(HK) is an abbreviation for this triplet. Phi(LMN) is an abbreviation for this quartet. The condition which has to be satisfied if we are to find an answer to the question that I raised a few minutes ago concerns the value of this function of the invariants, Phi(HK) and Phi(LMN), of which there are maybe hundreds of thousands. So this is a sum over several hundred thousand terms. The value of this function of these invariants, this one and this one, must be a minimum. When this function is a minimum then we can be sure that we have answered the question which I raised before. That is to say - what must be the values of the individual phases so that when we generate triplets and quartets we get distributions of values for these which agree with their known theoretical distribution? The answer to the question is to minimise this function of the invariants, Phi(HK) and Phi(LMN), subject to the constraint that all the identities which the invariants must satisfy are in fact satisfied. 
Now that requirement, that the identities which must exist among the invariants simply because there are so many of them and there are relatively few phases are satisfied, is of course a tremendously restrictive requirement. So our problem then is formulated in a very simple way. Here is a known function of several hundred thousand invariants. We have to find the values of the phases which minimise that function of several hundred thousand invariants, subject to the condition that all identities which must hold among the invariants are in fact satisfied. The answer is very simple. However we still have a major problem. How do we find the answer? How do we determine the phases which will make this function a minimum, considered as a function of these invariants? And the first step to the answer to that question is shown on the next slide, on the right hand side which looks very similar to this. Except now, and I’ve called this the minimal principle. It’s the minimal principle for the individual phases. This is a function of invariants, Phi(HK), Phi(LMN), but the invariants themselves are explicitly expressed in terms of individual phases. So this defines implicitly a function of phases of which there may only be a few hundred. Here we have several hundred thousand invariants; here on the right hand side, when we consider this function to be a function of phases, we have only three or four or five hundred phases. So this is a function of a relatively small number of phases. And the minimal principle says that that set of phases is correct which minimises this function of the phases. So the answer to the question that I previously raised is in fact formulated in a very simple way. It’s formulated as this minimal principle. But there still remains a major problem. Even a function of three or four or five hundred phases is a function for which it is very difficult to find the global minimum, especially if as in this case there are many local minima. 
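The function of phases described here can be sketched in a few lines of Python. This is only an illustrative sketch, not the formula on the speaker’s slide: the name `minimal_function`, the index and sign arrays that say which phases make up each invariant, and the division by the total weight are all assumptions introduced for the example. Because every invariant is computed from the same small phase array, the identities among the invariants hold automatically.

```python
import numpy as np

def minimal_function(phases, trip_idx, trip_sign, trip_exp, trip_w,
                     quar_idx, quar_sign, quar_exp, quar_w):
    """Weighted mean squared gap between invariant cosines and their expected values."""
    phases = np.asarray(phases, dtype=float)
    # Each invariant is a signed sum of individual phases. The index and sign
    # arrays are hypothetical inputs; a sign of -1 stands for a Friedel mate.
    phi_t = (trip_sign * phases[trip_idx]).sum(axis=1)
    phi_q = (quar_sign * phases[quar_idx]).sum(axis=1)
    misfit = (np.sum(trip_w * (np.cos(phi_t) - trip_exp) ** 2)
              + np.sum(quar_w * (np.cos(phi_q) - quar_exp) ** 2))
    # Dividing by the total weight is an assumed normalisation.
    return misfit / (np.sum(trip_w) + np.sum(quar_w))

# Toy data: four phases, one triplet expected near 0 degrees, one negative quartet.
toy_phases = np.zeros(4)
r_value = minimal_function(
    toy_phases,
    np.array([[0, 1, 2]]), np.array([[1, 1, -1]]), np.array([1.0]), np.array([2.0]),
    np.array([[0, 1, 2, 3]]), np.array([[1, 1, 1, 1]]), np.array([-1.0]), np.array([1.0]),
)
```

In the toy data the triplet cosine matches its expected value and contributes nothing, while the quartet cosine is +1 against an expected value of -1, so the whole misfit comes from the negative quartet.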
In a case like this with several hundred phases there may be something of the order of ten to the one-hundredth power local minima. From this enormous number how are we to select the one global minimum which is the answer to our question? Well, it would be very nice of course, if this function were very well behaved in the sense that we could start with a random set of values for the phases. Just choose phases at random. And then use standard techniques to find the minimum nearby that. There are several ways of doing that. One is the least squares technique, which however has the disadvantage that it will get the local minimum which is near to the starting point, and so will be trapped in a local minimum far away from the global minimum that we are looking for. So that’s a method that in general will not give us the answer. Or we could use a different method, a method called the parameter shift method, in which we vary the phases one at a time, look for the minimum as a function of a single phase and that way escape the trap of being caught in the nearest local minimum. We may get an answer, a minimum far removed from the starting set, but in general still a local minimum as it turns out. Not the global minimum that we are looking for. So it looks as if we have traded one very difficult problem for another problem just as difficult. But I would like to describe in the remaining few minutes that I have what we have done in order to try to solve this problem. And to show in fact that at least for a small molecule we have been able to resolve this problem. We have in fact found the unique global minimum; chosen from this set of maybe ten to the one-hundredth power local minima, we have in fact gotten the global minimum. I would like to describe in the next few minutes how we have done this. We have taken a small molecule, a molecule consisting of twenty-nine atoms, non-hydrogen atoms in the molecule. And we’ve constructed this function, this R(Phi) function, and we calculated that function. 
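The parameter shift idea, varying one phase at a time and keeping the value that lowers the function most, can be sketched as follows. This is a minimal sketch under stated assumptions: the name `parameter_shift`, the 10 degree scan grid, the number of cycles and the toy objective are all chosen for illustration and are not taken from the lecture.

```python
import numpy as np

def parameter_shift(f, phases, n_steps=36, n_cycles=3):
    """Scan each phase in turn over a fixed grid and keep the best value found."""
    phases = np.array(phases, dtype=float)
    grid = np.linspace(0.0, 2.0 * np.pi, n_steps, endpoint=False)
    best = f(phases)
    for _ in range(n_cycles):
        for i in range(phases.size):
            keep = phases[i]
            for value in grid:
                phases[i] = value
                trial = f(phases)
                if trial < best:
                    best, keep = trial, value
            # Restore the best value seen for this phase before moving on.
            phases[i] = keep
    return phases, best

# Toy objective with a single known minimum (an assumption for the demo).
true_phases = np.deg2rad([30.0, 200.0, 310.0])
toy = lambda p: float(np.sum(1.0 - np.cos(p - true_phases)))
found, value = parameter_shift(toy, np.zeros(3))
```

The toy objective is separable, so the scan recovers the known phases. For a function with very many local minima, such as the one discussed in the lecture, the same loop would in general stop at a local minimum, which is the limitation the speaker describes.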
First when we put in, since we know the answer beforehand, we know the values of the phases. And when we put in those values, the value of this function turns out to be approximately four tenths. And then we also have put in seven other randomly chosen sets of values for the phases and in each case as you can see the value of the function is bigger than when we put in the true values of the phases. Which of course is in agreement with the property that I’ve already stated. That it is for the true phases that this function has a minimum. And it has the minimum of approximately four tenths compared to random phases which give values running around .67 or .68 or so on. Incidentally in this case we have calculated not merely the values of the function for seven randomly chosen sets of phases but for thousands of them. And in all cases the value of the function is much larger than four tenths. It runs from about .66 to .69 or so. So there is no doubt that we have in fact confirmation of the theoretical result that the function has a minimum when the phases are equal to their true values. Well, starting with the true values, we went through two methods for getting the local minimum near to the starting set. One method was the least squares method, we went through a number of cycles of least squares and we ended up with values for the phases near to the starting set, not exactly the same. And it gives us a minimum of .366. The set of phases incidentally corresponding to this global minimum now gives us by means of the Fourier synthesis essentially the whole structure. The whole 29 atoms appear in the Fourier map when the phases which are put in are the phases which correspond to the global minimum of this function which is .366. If we use a parameter shift method for getting the minimum near to the starting set we get the same minimum which is not too surprising. 
But what happens when we put in a random set of phases and we go through both processes? We get a local minimum, .44 here and .46 here. It’s not a global minimum clearly, this is the global minimum, so we get a local minimum. And the same thing happens with each of these other random starts; we get local minima which however are not the global minimum. Well of all these minima we have chosen two to be of particular interest, .4125 which is the smallest one in this column. And the other .43 which is the smallest one here except for the true global minimum. And we have made the assumption that because .41 and .43 are both less than the other local minima which run about .45 or .46, that the phases which give us these minima, these local minima now, somehow or other carry some structural information in them. They are not, certainly they are not the correct phases, we know that, the correct phases give us the global minimum. But the assumption is made that they carry some structural information. If they are to carry structural information the question is how do we find what that structural information is? And the answer of course is very simple, all we do is use the phases that we get let’s say from this local minimum, calculate the Fourier series and have a look at it. See if in fact the structure is in there. Well we’ve done that, the next slide shows what happens. We’ve done that for that minimum, this was the random start, after minimisation we get .4125. We construct the Fourier series with coefficients using these phases and known magnitudes and we take a look at it. Well it doesn’t look very good, it doesn’t seem to have any structural information in it. But we expect there will be some structural information in it and the way that we have chosen to extract that structural information is to assume that the information is contained in the largest peaks of that Fourier series. 
So we’ve taken the top six peaks of that Fourier series, that gives us what we hope is a fragment of the structure. Using those presumed atomic position vectors we can now calculate normalised structure factors E, which is to say both magnitudes and phases. In this way we get a new set of phases. Different from the random set we started with and certainly different from the set which gave us that local minimum. We get a new set of phases. We use the known magnitudes of the normalised structure factors with this new set of phases in our minimal function again. Well it turns out that the value of the function is now less than what happened when we had random start but more than the local minimum which we got before. And that's not surprising because we are using only six peaks among the total of maybe several hundred peaks. We are using the six strongest peaks. But when we go through the minimisation process again we find that we get a smaller minimum than we had before. Another local minimum .39, smaller than before and so we expect that the phases which give rise to this local minimum carry still more structural information than this set of phases. Well it turns out although we might have difficulty doing this if we didn’t know the structure that the full structure, all 29 atoms do in fact appear among the strongest 135 peaks. That may not seem like a very useful result of course because it may be difficult in the case that we didn’t know the structure to see it, to see the 29 atoms in the 135 strongest peaks. Well we don’t assume that we’ve done that. Instead from this Fourier series, the Fourier series calculated with the phases which give us this local minimum. From that Fourier series we take the top twelve peaks now, again under the presumption that most or all of these peaks do in fact correspond to true atomic positions. 
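The step of calculating normalised structure factors from a fragment of presumed atomic positions can be illustrated with a short sketch. It assumes equal atoms, fractional coordinates and no symmetry beyond translation (space group P1), in which case E for a reflection H is the sum of exp(2πi H·r) over the atoms divided by the square root of their number. The function name and the toy coordinates are hypothetical, not the speaker’s data.

```python
import numpy as np

def fragment_structure_factors(hkl, xyz):
    """Magnitudes and phases of E for equal atoms at fractional positions (P1 assumed)."""
    hkl = np.asarray(hkl, dtype=float)   # shape (n_reflections, 3)
    xyz = np.asarray(xyz, dtype=float)   # shape (n_atoms, 3), fractional coordinates
    e = np.exp(2j * np.pi * (hkl @ xyz.T)).sum(axis=1) / np.sqrt(len(xyz))
    return np.abs(e), np.angle(e)

# Toy fragment: two atoms half a cell apart along the first axis.
toy_hkl = [[1, 0, 0], [2, 0, 0]]
toy_xyz = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0]]
magnitudes, new_phases = fragment_structure_factors(toy_hkl, toy_xyz)
```

In the procedure described in the lecture only the phases from such a calculation are carried forward; they are combined with the known magnitudes and used as the starting point for the next round of minimisation.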
We go through the process once more, we calculate the value of this function for the set of phases calculated on the basis of these 12 peaks. And we now find the value of this minimal function to be .439. Smaller than each of these but bigger than what we got before. Again we are not surprised at that because we are using here only 12 peaks among maybe 135 peaks. But we go through the minimisation process again, and now the local minimum turns out to be .37. By doing this process, then, among these enormous numbers of local minima we have been able to find the unique global minimum or something very close to it. Sufficiently close that it’s trivial to pick out the structure. Now I see that my time is up, so I can’t describe the second application which however is very similar to this; instead of using the local minimum of .41, as the next slide shows, we used the next local minimum which was .43. We go through a rather similar process and we end up with essentially the same results. After two cycles 28 of the 29 atoms appear among the strongest 31 peaks and the 29th atom appears at peak number 44. For this starting point as well as the starting point shown on the previous slide we are able to find essentially the global minimum or something very close to the global minimum and in both cases to solve this structure. What remains to be seen is whether we can do the same thing for a much more complicated structure. Say a structure with several hundred atoms where the calculations then become much greater than they are now. Because instead of using only 300 phases as we’ve done in this case, we may need to use for a much more complicated structure maybe 1,000 phases. And instead of a couple of hundred thousand invariants we may need to use a couple of million. So the calculations become much greater. 
But if the only problem is complexity of calculation then we have made a big advance because even existing computers are capable of handling that kind of calculation. Thank you.

Hauptman on the Phase Problem
(00:03:09 - 00:11:11)

As pointed out several times above, computers have become extremely powerful tools for the X-ray diffraction method and some parts of the investigations have almost become routine. But for large biological molecules, there is also the problem of making high-quality crystals. In 1988, Johann Deisenhofer, Robert Huber and Hartmut Michel received the Nobel Prize in Chemistry for determining the structure and function of a so-called membrane protein, a protein no one had thought could be crystallized. With their German connection, the three Nobel Laureates are invited to Lindau every year and they have responded to this invitation several times. Here we present a snippet from Hartmut Michel’s 1998 lecture “From Photosynthesis to Respiration: Structure and Function of Energy Transforming Membrane Protein Complexes”, in which he briefly describes a very interesting high-tech approach to getting useful crystals.

Hartmut Michel (1998) - From Photosynthesis to Respiration: Structure and Function of Energy Transforming Membrane Protein Complexes

Thank you. As indeed was said I am going to talk about membrane proteins. You will see lots of membrane protein structures but I should make clear from the very beginning that we don’t do membrane protein structures in order to see what the structures look like. We would like to know how these machines work. And for this we have to know what the structure is. And this will then form the basis to understand the mechanism of action of these molecular machines. And I should say that membrane proteins in general perform many important roles. But the major problem is that we cannot study them in great detail because we don’t have much of the membrane protein available. And also they are very difficult to handle and we have still a severe lack of knowledge about membrane proteins. Membrane proteins constitute about 40% of all proteins of your body. But we know at present about 20 membrane protein structures from 12 different proteins. And we know about maybe 5,000, 7,000 water soluble proteins. And this tells you where the challenges are. And the challenges in membrane proteins are primarily crystallisation. Crystallography is a method, we have many different methods. And one of the points of my lecture will be that you need many different methods nowadays in order to solve biological problems. And on the first slide, I show you the bioenergetic system from purple bacteria, which is a form of photosynthesis. And we have here in the heart, the blue machine here is a photosynthetic reaction centre and that’s the one we were awarded the Nobel Prize for in 1988. And the next, this actually, this machine gets the energy from the light harvesting complexes. You see here this one in pink, which surrounds the whole reaction centre and actually there are more light harvesting complexes which are in the membrane which transfer the energy from a light harvesting complex 2 to light harvesting complex 1 to the reaction centre where we get electron transfer. 
I will only briefly touch on the reaction centre. The role of the reaction centre is to transfer electrons from a primary electron donor here across the membrane towards acceptor molecules which are quinones, here we’ve a QB and we have to transfer a second electron to get a double protonation. The hydroquinone diffuses in the membrane towards the cytochrome BC1 complex, gets oxidised there and the protons are transported across the membrane in this complex. And the electrons end up in cytochrome C2, diffuse back in the periplasmic space of the bacterium and reduce the primary electron donor again via bound heme groups. So we have a cyclic flow of electrons. And the purpose of that cyclic flow of electrons is to pump protons in the cytochrome BC1 complex across the membrane. And we will hear more about the cytochrome BC1 complex in the following talk given by Hans Deisenhofer. And the cytochrome BC1 complex also plays a very important role in respiration and you will see this complex in a similar scheme. The electrochemical proton gradient, consisting of an electric field which is the more important component and the proton gradient, drives protons back through the membrane. And it’s now clear from the work of Paul Boyer who presented that part yesterday and John Walker and his colleagues in Cambridge, that you have here this water soluble complex for which they determined the structure. That the gamma subunit here rotates and the backflow of protons here drives the rotation of the gamma subunit. And per one rotation of the gamma subunit you get one ATP formed from ADP and phosphate. ATP is the general currency of life, you could call it the Euro of life. So we have already a unified energy currency in biology, we will have it in Europe, maybe in the world at the end also. Now I start with the light harvesting complex. You see here the structure of a light harvesting complex determined by us recently in Frankfurt. 
And you see here in green chlorophyll molecules. We have two rows of chlorophyll molecules in overall. There are 8 chlorophylls in this complex, there are 16 of this type up here and they are actually, they are linked, they are bridged by carotenoid molecules. Carotenoids are very important molecules in life. First, they have a photoprotective function. And they prevent the damaging effect of activated oxygen species. And also they quench triplet states of chlorophylls which could generate such excited oxygen molecules. And second, they absorb light and they transfer the energy to the chlorophylls and the chlorophylls then transfer the energy to the next light harvesting complex and then to the reaction centre where some work is done. This here is the same complex viewed from the top on to the membrane. You see here 2 helical membrane proteins. This is the so called alpha subunit, this is the beta subunit. And in between we have in green the chlorophylls, one circle and another chlorophyll we have here. And you see again here the carotenoids with their double role. And this kind of circular arrangement is quite remarkable and maybe nature invented some kind of synchrotron before man did it. So nature maybe is always first. And how the whole structure is arranged in the membrane, photosynthetic membrane is in here. We have here the light harvesting complex 2. That’s the light harvesting complex 1 which surrounds the reaction centre. You see here in yellow the reaction centre. And the energy, only the energy is transferred from here to the next and from here to the primary donor of the reaction centre. You see the respective chlorophylls here vaguely in red and there you get electron transfer. And electron transfer then is chemistry. We start off with first light absorption, energy transfer and then electron transfer which is chemistry. And at the end we generate ATP, the ATP is used to fix carbon dioxide, to synthesise carbohydrates in the dark reaction. 
And this is the food we all live on. And of course we eat the food, we degrade food and at the end we have the citric acid cycle, we have glycolysis. And we end up in the splitting of our food stuff into hydrogen, in some biological form of hydrogen which is then converted further. And this is done in the respiratory chain. And the respiratory chain is shown on this slide. You see here the respiratory chain of a bacterium called Paracoccus denitrificans, which is quite well studied. It has the advantage that you can use genetic methods in order to study the role of these components. And also it appears to be closely related to the ancestor of your own mitochondria. And everything that I tell you here is also valid for your own mitochondria in your own body. First you generate NADH, in the citric acid cycle. And this is the bound form of hydrogen in biology. And what the respiratory chain does, it converts it with oxygen to form water. And it has developed a quite complicated machinery. Not only to prevent that you have a detonating gas reaction, to prevent explosions in your body all over and the loss of energy. And also we like to make use of the energy and the principle is the same as in photosynthesis. We have here four complexes, NADH is first oxidised here in this complex called complex 1. And protons are translocated across the membrane, in addition to the transfer of electrons for the reduction of quinone molecules. So we reduce quinone molecules in the reaction centre. Complex 1 does the same. The hydroquinone diffuses in the membrane towards the cytochrome BC1 complex. And we will hear all the details of that complex in the subsequent talk. Electrons there are transferred towards cytochrome C, cytochrome C diffuses towards cytochrome C oxidase, this is this complex. And this complex is of particular interest because it is the one where oxygen is reduced, water is formed. And here we have some kind of vectorial reaction. 
These kinds of vectorial reactions were first described by Peter Mitchell in his so called chemiosmotic hypothesis for which he was awarded the Nobel Prize too. And this is a very important concept and it was very tough to get the concept brought through and he fought about 20 years. And I think that Boyer yesterday had a slide where he showed you the rate of acceptance of his hypothesis among his colleagues. But I think now nearly everybody accepts it. And cytochrome C oxidase there is the terminal enzyme. So electrons from cytochrome C are transferred to a binuclear copper A centre. So we have here some inorganic chemistry going on, reduction of two copper atoms. Electrons are transferred further to a first heme A molecule. Heme A which is in this case called simply heme A. And the electron is further transferred to a second heme A which is now called heme A3, for historical reasons. And pretty near the heme A3 iron we have a copper B bound and the active site, the business unit here is between the iron and the copper B. That’s the place where oxygen is bound, where electrons reduce the oxygen, where protons are taken up. Protons are taken up exclusively from the inside and water is formed. So we get creation of an electric field by having the electrons from the outside and the protons from the inside. And in addition nature has invented a trick to transport the same amount of protons as there are consumed by water formation across the membrane. And this doubles the energy yield in our cytochrome C oxidase. The consequence of course is that you need only half of the food in order to get the same amount of ATP at the end. Nowadays of course in Europe we have the opposite problem. And there certainly are people who would like to switch off the proton pump in order to have less efficient energy conversion and to fight obesity in the body. But certainly this is not the approach, not the goal of our research to abolish the proton pump in cytochrome C oxidase. 
We tried again the method of crystallography to get the structural information which we need to understand the mechanism of this enzyme at the end. So we tried again to crystallise it, isolating this complex first from the membrane in the form of detergent micelles and trying to crystallise. This didn’t work from the beginning and we used another trick. We used monoclonal antibodies. So we used methods from immunology in order to get crystals. We used methods from gene technology to produce crystals. We produced monoclonal antibodies first, which means we take a mouse, immunise it with this complex, the mouse develops antibodies against the cytochrome C oxidase. We sacrifice the mouse, take the spleen cells, fuse them with cancer cell lines. We get hybridoma cell lines which produce antibodies against the cytochrome C oxidase. Then we continue, isolate the genes for these antibodies and express these genes in the bacterium E. coli. Then we go on, make the complex of cytochrome C oxidase and the antibody fragment and crystallise this complex. These are now the crystals. They have a length of about 1 millimetre, diameter is about 0.3. Then we use crystallography and the most important point when you get crystals is that they should diffract X-rays well. The diffraction here was, I would say, moderate at best, you see here the diffraction. It was taken at a synchrotron and the synchrotron actually was in Japan, in Tsukuba. And this synchrotron was particularly helpful because it had a mode of data collection which could be used for less well diffracting crystals and also for radiation sensitive crystals, much better than the European ones. And the diffraction limit here is about 2.8 angstrom but perpendicular to it, it’s only about 4 angstrom which is just the limit of getting a protein structure. Nevertheless you see here now the structure of the entire complex. So this is the cytochrome C oxidase. 
This is the antibody fragment which we used for getting the crystals. It helps to form the crystal lattice and without this addition to the cytochrome C oxidase, we would not get crystals because most of the protein here is actually buried in a detergent micelle. And in a detergent micelle this part of the protein is not available for crystallisation. So this is why we had to use this antibody fragment. And we tried it also with a BC1 complex, it worked well and at present, with this method, we have a 100% success rate. And I hope that we will be able to determine more membrane protein structures in the near future. And I hope you can hear about more membrane proteins 3 years from now in the same chemistry symposium here. Here you see in purple, subunit 2, you barely can see here copper atoms bound to the protein. You can see in here now a heme A which is bound to the yellow subunit 1. You see here heme A3 and here is the copper B. So electron transfer is restricted to this rather narrow part here. Cytochrome C binds towards this corner of the enzyme and transfers the electron towards here and then the electron is transferred here and then here. Protons are taken up from the inside. And protons are pumped across the membrane and it is of great interest to understand how the transfer of electrons and protons are coupled. What's also of particular interest is the distribution of amino acid residues. And I show you here the distribution of those residues which are frequently charged in nature. Of course arginine and lysine can be positively charged, aspartic acid and glutamic acid can be negatively charged. And we have only very few in the hydrophobic environment of the membrane. And they are there for either structural or functional reasons, these are there for structural reasons. This is, we don’t know it yet for structural reason. 
This is here a glutamic acid residue and when you change this by genetic methods, so you have to use genetic methods, site directed mutagenesis to change it, even to a glutamine which is a rather minor change. The enzyme is dead, it doesn’t work. And this is therefore of great importance. There is another residue of great importance which is this lysine here in blue. And if you change this to a neutral residue, you get the same phenotype, the enzyme is dead, it doesn’t function. And we presume that these residues are involved in proton transfer towards the active site. And this may be also involved in transferring those protons which are transported across the membrane. And I will also discuss some more residues later on. I simply, and I will describe individual protein subunits. We have here subunit 3, subunit 3 has an open V-shape arrangement. We have two transmembrane alpha helices here, we have here five transmembrane alpha helices. And we have some bound lipid molecules here in the cleft. And the role of subunit 3 is not well understood. There are two suggestions, first you can delete the gene in our bacterium and what you get is you get a reduced amount of cytochrome C oxidase consisting of two subunits. This 2 subunit cytochrome C oxidase is still active in proton pumping and in the turnover of the enzymes. So it can have nothing to do, the subunit with proton pumping or with the oxygen reduction. We think actually that this subunit has some role in oxygen diffusion and I will talk about this point later on a little bit. Here I show you subunit 2, subunit 2 has an N terminus up here. Here it goes to the C terminus and we have an irregular protein fold. Then we have two membranes spanning helices. Then we have two here, the kind of protein fold which was already known, it corresponds to that of the type 1 copper proteins. And here there are some differences. Primarily we have here, we have two copper atoms. 
You see them here at the tip together with the mode of binding. I don’t go into further details, due to the time limits. And why we have here two instead only of one copper atoms, is not well understood. It might be due to the reorganization energy and what this actually is, probably we will hear tomorrow by Rudi Marcus. And he would develop a theory of electron transfer and the reorganization energy is certainly an important parameter and might be the reason why we here have two and not only one copper atom. This is here subunit 1. Subunit 1 has a rather regular appearance, looks like a cylinder consisting of 12 transmembrane helices. And in here we have the binding site of the porphyrin rings, the hemes, we have here the heme A3. And quite interesting, the hydrophobic tail here is bent away and this allows the access of protons towards the active site here. And I will discuss this point also further. And here you see subunit 1 now from the top and this is actually the most interesting view. You see the 12 transmembrane helices, they are all tilted. They are rather long, much longer than was expected. Here we have helix number 1, the brown one, number 2 is the green one, number 3 the blue, 4 is purple, 5 is the red, this red one is 6, they continue like this with counting. It’s a sequential simple topology, sequential arrangement of helices. However 4 helices each in projection form a half circle. So here we have one half circle, here we have another half circle and here we have the third half circle. And in this kind of arrangement you generate three pores. So we have here some kind of pore. And we think that this pore is used for proton transfer to the centre of the membrane. And then the proton is diverted towards the active site. Here we have the second pore and this pore is used for proton access towards the active site partly, controlled proton access to the active site. And then the pore is blocked by the heme A3 which is seen edge on. 
Here you see the copper B nearby. And here is the place where the oxygen is bound. And the tail here is bent away again, as I mentioned already, in order to allow access of protons towards this place. Here we have the third pore and this appears to be tightly blocked by the heme A, which is the first electron acceptor and which transfers the electron towards this other heme. The entire cytochrome C oxidase can be seen here from the periplasmic space, again in a simplified, truncated version. Subunit 1 in yellow, subunit 2 in blue, together with a beta-strand-rich area (in red). And you can see here this part additionally covering the heme A3 and the copper B, which is in this position. And you see also very nicely here subunit 3. And what we actually think is, we have here identified a hydrophobic channel from the binding site of the lipids towards the active site here. And we think that this channel actually is used for the diffusion of oxygen towards the active site. And you are probably not aware of the fact that oxygen is a hydrophobic entity. It likes to be enriched in membranes and you know from spectroscopy that it is enriched in the membrane by a factor of 7. And therefore it’s very likely, and it makes good sense, that we have oxygen diffusion from the lipidic milieu of the membrane towards the active site. And if we here have loosely bound lipid molecules, we are probably able to enrich oxygen here further. So this may be an oxygen trap and the oxygen then can diffuse from this oxygen pool towards the active site. Here you see more details of this oxygen diffusion channel, of this presumed oxygen diffusion channel I should say. You see primarily hydrophobic residues, phenyl rings, lining up, tryptophan residues. And at the end you here have valine. Here we have the binding site for the copper B together with a ligand. Here we have the heme A3. 
And it is known from spectroscopy that when oxygen diffuses in, it first becomes temporarily bound to copper B and then it is transferred to the heme A3. It also makes sense from the channel structure. And site-directed mutagenesis, the genetic method, has shown that when you change this valine to the slightly larger isoleucine, you get a reduction of the diffusion speed, which means that the KM value of the enzyme for oxygen is increased by a factor of 10. But the maximum turnover number of the enzyme is still the same if you simply have a 10-fold higher oxygen concentration. So this makes sense with the assignment of this channel as an oxygen diffusion channel. Now I come to the most complicated thing in my talk. This is the environment of the binding site of the heme A3. You see the heme A3 in light blue together with the protein. The heme groups have two propionate side chains. So here we have one propionate side chain and this is charged, neutralised by forming an ion pair with an arginine here. We have the second propionate here and this is not charge-neutralised. The charge there gets stabilised by accepting hydrogen bonds from neighbouring residues here. Quite interestingly, from the very beginning we could identify the copper B down here. And from site-directed mutagenesis work it has been known that there are most likely three histidine ligands. But we could only identify two histidine ligands, and for the third one we had no electron density at all. And there had been the proposal in the literature that a histidine might change its protonation state during the turnover of the enzyme, being negatively charged, meaning in the imidazolate form, in one state of the enzyme, and then becoming imidazole and imidazolium, the positively charged form. And the imidazolium form cannot be a ligand to copper B. So the histidine could shuttle between two positions depending on the protonation state. 
And it could carry over protons by switching between the two positions and this might be the mechanism of how a proton pump works. But this I have to say is speculation and there is at present no evidence for that. And when we repeated the whole experiment in the absence of azide with the oxidised and reduced forms, we could not confirm the absence of the histidine ligand. We could clearly see the histidine as being still a ligand to copper B. And it stays there upon oxidation and reduction of the enzyme. And so we think that the absence of the histidine ligand was an artefact due to the presence of azide. So you have to be pretty careful. As I mentioned we had some problems with our original crystal form, and a graduate student in the lab, Christian Ostermeier, who also had made the first crystal form, worked very hard to get better crystals. He succeeded now with the 2 subunit form of the enzyme, with only two protein subunits present, and he got these crystals. They don’t look as nice as the other ones, the protein is pretty unhappy, you see some denatured protein under the crystallisation conditions, but the crystals are much bigger than our first version. So we could determine an improved structure, and the improvement was largely that we had a much better definition of the position of the atoms and we could start to identify bound water molecules. And each of these green balls means the position of a water molecule. And much to my surprise you see many water molecules up in the upper half of the enzyme but only very few below them. Which probably means that we have also water molecules here but they are much less ordered, whereas the water molecules here are much more highly ordered. And this could have some functional significance. This here, coming back to mechanism, shows you the electron density of the copper B. And this is the missing electron density for the histidine ligand. 
And this could be interpreted in the form of a mechanism of proton pumping when you consider that actually you have an iron atom here and a copper B atom; the iron has a formal charge of +3, the copper of +2. And most likely you have an OH-, a hydroxyl group, between both atoms, which cannot be resolved by X-ray crystallography at our resolution. But there is spectroscopic evidence for that. And I think also otherwise, if you would have only the positive charges here, you would get a too strong electrostatic repulsion, the distance is only about 5 angstrom. And the whole thing nearly would explode if you wouldn’t have a negative charge between both atoms. So the OH here makes good sense. The basic idea however in this histidine cycle mechanism is, when you start from the oxidised form of the enzyme, you get reduction of the enzyme from the periplasmic space. The electron is transferred first to the iron, then it hops over, jumps over on to the copper. And this then disturbs the charge balance in the whole environment. And in order to regain the charge balance, you take up a proton, very simply. And this then converts, upon the first reduction, this copper B ligand from the imidazolate to the imidazole. Then you take up the second electron, which stays on the iron. You neutralise this additional charge by taking up a proton. You convert this ligand to the imidazolium form, and the imidazolium can no longer be a copper B ligand and it flips over into this position. Then you bind oxygen; it is known that only after full reduction you bind oxygen here. And then the oxygen chemistry starts. You get uptake of protons to form the first water molecule. When you take up protons you disturb again the charge environment and the charge of the protons then expels these two protons up here. And so the uptake of the two protons to form water would expel the protons already here in order to have the same charge environment. 
And this hypothesis I think is the most simple one for explaining a pump mechanism, but it certainly cannot be phrased in terms of a histidine cycle and we have to think of alternatives. This is now the electron density for the oxidised and reduced form. And in the histidine cycle, in the reduced form we should not have all the histidine ligands present. They are there, and this I think is one of the fundamental experiments which tells you that the histidine cycle mechanism is unlikely. In addition I have great problems to see how a positively charged histidine would not deliver back its proton towards a reduced oxygen species at the binuclear site. Now I show you some more interesting new details. Again it’s the heart of the enzyme. You see here an unbiased electron density map, calculated by a technique called simulated annealing omit map: you omit from the model all these residues, do simulated annealing to get away from bias, use the new phases, calculate an unbiased electron density map, and this is this electron density here, and you rebuild the model. You see here the model of the heme A3, this is the iron, it’s a histidine ligand to the iron. You see here the copper B and you see here the histidine ligands towards copper B. This is the third one, and much to the general surprise what you find is that a nearby tyrosine is covalently crosslinked with the histidine. So we have here an unexpected covalent crosslink. And of course this kind of thing, these modifications cannot be discovered from DNA sequences. You still have to do some further experiment apart from DNA sequencing. And there is some meaning about this, and I would think that the principal meaning is that you get such a high oxidative power during the reaction that it extracts an electron from the environment. You generate a radical, and it’s known that tyrosine can form this kind of radical. And then in a radical mechanism you simply get this kind of crosslink. 
And such kind of crosslinks also have been observed, not with histidines but with for instance cysteines and other residues, in peroxidases. So peroxidases in general have enough oxidative power to generate radicals which then lead to crosslinks in proteins. And this means we can have the same type of reaction here. Now I think it becomes more systematic. This is now the catalytic cycle of cytochrome C oxidase. And this is simple, we start off with oxygen, the iron is in +3 form, copper +2. When we put on the first electron, which goes on to the copper, converting it to copper +1, and it is known that this is accompanied by the uptake of a proton, this is the one-electron reduced form. Take up the second electron, accompanied by the uptake of a proton, and you get the reduced form, iron +2, copper +1. Only then you take up the oxygen and you get formation of the so-called compound A, which was discovered by Britton Chance in Philadelphia more than 20 years ago. And then you form a compound called P. P was for a long time thought to be a peroxy compound. At least a formal peroxy compound, iron +3, copper +2, but the electrons from the metals are transferred onto the oxygen. I always considered this as a very unlikely structure and I wondered how this could be stable. And actually my scepticism was, I should say, satisfied by a recent experiment in Japan by Kitagawa, who by resonance Raman spectroscopy got clear-cut evidence that you actually have already in the P state a split oxygen compound. And you have actually oxo-ferryl. This means an oxygen bound to the iron in a double-bonded way, iron +4, oxygen -2 in this form, and you have the copper +2. But this would then mean that you miss an electron here. And this compound has to steal an electron. And the possibilities are that the electron is stolen from the porphyrin, or it’s stolen from a residue, and the tyrosine is a good explanation. People discuss also that the copper might become copper +3. 
And also people discuss that the iron might be +5. But these are all less likely than this explanation where you steal the electron from a tyrosine, creating a tyrosine radical. But this is under debate, but I think that this route became much less likely within the last year. So this is a rather new experiment. The major problem that we then have is however that it is known, primarily from Wikström’s work in Helsinki, that the proton pumping is of course only in the transition from the P state to the F state. This state here is well characterised and everybody agrees that this is the oxo-ferryl state here. And it’s also known that the next two protons are pumped in the conversion, in that way of conversion. And if we have this kind of mechanism we should have observed proton pumping already, and we have to think about a way out if we don’t have formation of water molecules and the incoming protons expel the protons which had been taken up during reduction at that time. So now I just want to come back towards the structure. You see here in red, together with the histidine ligands, the heme A, and this is the heme A3. And the point I want to discuss is, where do the protons go which are taken up upon reduction? I think actually that the protons which are taken up are stored on the propionates in this area up here. And there appears to be a rapid equilibration of protons in that area. And later you get uptake of the protons, and I have formulated a mechanism which I think would be in agreement with all kind of mechanisms. Actually there are in the protein, which I didn’t say so far, two identifiable proton transfer pathways, this one with the lysine involved and this one here which has an aspartic acid at the beginning and a glutamic acid at the end. And I think that the protons are delivered from this glutamic acid on to the propionates towards this area during the pumping. 
And when electrons come in towards forming the oxygen, they are then expelled from that area to the outside, this would explain pumping. But I cannot go further into details in order to explain that cycle. And I would take too long now to explain these details and you have to contact me privately in order to discuss it further. But just to say, we need different methods in order to prove this and what we did now, we made a biosynthetic deficient mutant for heme A and fed isotopically labelled precursors in a way that only these carbon atoms are labelled with 13C of the heme A and did Fourier transform infrared spectroscopy. So we now use even another method and we look what changes in the vibrational bands upon reduction. And you see this for labelled and unlabeled, you see the differences for labelled and unlabeled cytochrome C oxidase. And there are clear differences which tell you that there are changes of protonation states and conformation of the propionates upon reduction. And this at the end can be summarised in some kind of mechanism which purely operates on electrostatic grounds. And this is now our working hypothesis and we work very hard either to prove or disprove such kind of mechanism in detail. But at least you get an idea. The whole thing is complicated. And still we have to get an idea, we have to go into details and we need all the methods. And I think the message for you is, for the students is, that you have to know all the methods in order to use them efficiently to solve biological problems. And the people at the end I would like to acknowledge are here from my own group, Gerald Kleymann establishing the method of the antibody fragment production in the lab. Christian Ostermeier did most of the crystallisation work and also determined the second crystal structure. Hanni Müller isolated most of the material, So Iwata isolated in a very, sorry solved the protein structure in a very short time. 
Axel Harrenga then improved his structure and got now a much better structure. Aimo Kannt did electrostatic calculations, theoretical calculations which I didn’t mention. And Julia Behr is involved in this Fourier transform infrared study together with Werner Mäntele and Petra Hellwig from the University of Munich. Some mutant work was done at Frankfurt University by our biochemical collaborators Bernd Ludwig and Heike Witt. Now I thank you for your attention.

Michel on Using Gene Technology
(00:13:00 - 00:15:55)

Another complicated protein molecule that no one believed could be crystallized is the so-called ion channel. It was shown to be amenable to X-ray diffraction analysis by a young electrochemist, Roderick MacKinnon. The ion channel sits on the surface of nerve cells and acts as a gate-keeper, only letting particular ions through. This is the way that electrical signals are transmitted along the nerves, which was known earlier, but the way the gate-keeper actually works was unknown. A snippet from a lecture given in 2005 by one of the 2003 Nobel Laureates in Chemistry, Roderick MacKinnon, is presented here. The title of the talk is “Ion Channels: Life’s Electronic Hardware”.

Roderick MacKinnon (2005) - Ion Channels: Life's Electronic Hardware

Thank you, it’s very nice to be here. And I realise that I’m in this session because if you look at the program, I have ‘electronic’ in my title and all three speakers in this session have ‘electric’ or ‘electronic’ or ‘electron’ in the title, so we must have come out of a word search. What I’ll be talking about is a process that occurs at the cell membrane, and as you may or may not know, all living cells are surrounded by a very thin oily coat called the cell membrane. It’s about 40 Å thick and it essentially keeps all the chemical components of a cell in one place, so the chemistry of life can go on. But it presents a barrier to getting things in and out of the cell. For example the ions, that is charged atoms, have difficulty crossing this barrier because its inside is oily. So the bilayer membrane, it’s made of amphipathic molecules, it’s watery on the outside, facing the aqueous solution, with oily alkane tails on the inside. And the problem with ions crossing the membrane can be understood in this simple experiment of mixing cobalt chloride in oil and water and shaking it up and then letting the phases separate, and you see red from the cobalt here. The cobalt and chloride stay in the water phase, not in the oil phase, and you can shake it as long as you want, or wait as long as you want, and the salt stays here, the ions stay here, they don’t go into the oil. And the simple explanation for why is that water is a more polarisable substance, so an ion, for example a potassium ion in water, is surrounded by water molecules that point their partially negatively charged oxygen atoms toward the positive charge of the ion and thereby stabilise it. And so an ion is more stable in the polarisable medium of water than it is in oil. And because the inside of a membrane is oily, this is an energetic barrier for ions to cross the cell membrane. 
So life has come up with many protein molecules to deal with this problem, and in a sense you can divide them into two categories, 1 I would call pumps and another channels. And the distinction really is that a pump moves an ion against its electrochemical gradient, so it builds gradients across the membrane. So for example a sodium potassium ATPase uses the energy of ATP hydrolysis to pump potassium inside the cell and sodium outside the cell. And these gradients are a form of energy, of course. And that’s where the channels come in, channels are passive, unlike pumps. When an ion channel opens up, the ion simply diffuses down its electro chemical gradient. So the ion channels spend the energy of the gradient by dissipating it but it does this for useful purposes in the cell. And here is, in this cartoon, one example of what an ion channel, like a potassium channel does. If you imagine a cell membrane with a potassium channel in it, when the potassium channel is open, the potassium ions of course begin to run down their electrochemical gradient. But when a potassium ion runs down, of course it carries a positive charge and if this membrane doesn’t have other ion channels, for example anions cannot follow or sodium ions cannot go through because the channel is selective. Then what happens is this potassium channel, by letting some ions cross, creates a charge separation across the membrane. And what an electro physiologist says is this membrane is an electrical capacitor and the ion gradients are the battery and the potassium channel is actually, allows the battery to charge the capacitor. And in fact the potassium ion will stop running downhill when the electrical gradient balances the chemical gradient, at the Nernst potential. If you take any living cell and stick an electrode in it and measure the voltage across the cell membrane, inside compared to out, you find it’s negative, something like between - 50 to - 200 mV, depending on which cell you're looking at. 
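The balance point mentioned here, where the electrical gradient offsets the chemical gradient, is given by the Nernst equation, E = (RT/zF) ln(c_out/c_in). The following is a minimal sketch rather than anything shown in the lecture. The concentrations (5 mM potassium outside, 140 mM inside) and the temperature of 37 °C are illustrative assumptions, and the function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def nernst_potential_mv(conc_out_mM, conc_in_mM, valence=1, temp_K=310.15):
    """Nernst potential (inside relative to outside) in millivolts."""
    return 1000.0 * (R * temp_K) / (valence * F) * math.log(conc_out_mM / conc_in_mM)

# Assumed, illustrative potassium concentrations (not taken from the talk)
e_k = nernst_potential_mv(conc_out_mM=5.0, conc_in_mM=140.0)
print(f"E_K = {e_k:.0f} mV")
```

With these assumed values the result is roughly -89 mV, which falls inside the -50 to -200 mV range quoted for living cells.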
And in most cases the reason for this is exactly what's shown in the picture here. The membrane contains potassium selective channels that render this membrane an electrode for potassium ions. Now, the idea that what was called the cell wall, more than 100 years ago, the idea that channels could exist was already proposed. And it was to try to describe osmotic phenomenon with cells. The idea that ion channels, or at least somehow the membrane could be permeable to ions, specifically permeable, that is selectively, is more than 50 years old and two scientists from Britain, Alan Hodgkin and Andrew Huxley, described the theory of how this charging the capacitor is used to send an electrical impulse. Now, when they described this theory, they didn’t have a notion of channels. In fact they didn’t say whether it was channels, they made careful measurements and could tell that when an electrical impulse travels along a membrane and what an electrical impulse is, is a transient swing in this potential, from the direction you see it minus inside, positive outside to the opposite. And this moves like a wave across a cell membrane surface. They made careful measurements and discovered when that was happening the permeability of the membrane was going from weakly potassium selective, or very small conductance, to actually high sodium conductance. And they proposed that sodium rushes in to make it positive inside and then quickly after that the sodium conductivity goes away and the membrane becomes permeable to potassium, potassium rushes out and repolarises the membrane. And this could spread in a wave across the membrane. And if you don’t understand why it would spread as a wave, you’re right, because there’s a property of these channels that I didn’t tell you about, in fact I’m being careful about this because I’m told, I’m examined, I was told by a prominent Nobel laureate that if he didn’t understand my lecture, then I failed. 
So this wave part of it, in 30 minutes I’m not going to talk about, but I’m going to focus on the property that the membrane can change its permeability and that we know today that this is due to ion channels. I should just make one comment about this and putting in the context of the importance of this electrical impulse propagation and it’s just to say that when I think to wiggle my finger, obviously a signal got from my brain to my finger, that went about a meter, told the muscles to contract and it wiggles. And in fact this is how we move and it's such impulses between neurons in our nervous system that let us think these signals are travelling fast, ok. So in a fraction of a second the signal gets out to my finger. Now, this kind of process, this electrical property of the cell membrane is very important for that. And if you think about it, this was a challenge for life, to make organisms as big as we are. Because if a cell had to rely on diffusion alone, then we couldn’t have processes like this happening in a reasonable amount of time. Because with the diffusion coefficient of typical small molecules and ions in solution, something around 10 to the minus 5th cm2/sec, if you’re an E.coli and about a micron across, a molecule takes a mean time of about a millisecond to diffuse from one side to the other of the cell, so a millisecond is reasonable in terms of biological time, processes can happen just by diffusion. But if you’re limited to diffusion and you want the signal to get out to my finger, there’s a problem. Well, even for a molecule with that diffusion coefficient or an ion to diffuse a centimetre would take around a day. And to go a metre would take years actually. So if you think about it, if we had to rely on diffusion for signals to go out, life would be very different. This lecture would be a long time, it would be very boring. 
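The diffusion times quoted here can be checked with the mean-square-displacement relation t ≈ x²/(2D). This is a rough sketch: the one-dimensional form with the factor 2 is an assumption, since the lecture gives only the diffusion coefficient and the resulting orders of magnitude, and the function name is hypothetical.

```python
D_CM2_PER_S = 1e-5  # diffusion coefficient from the talk, cm^2/s

def diffusion_time_s(distance_cm, d=D_CM2_PER_S):
    """Mean time to diffuse a given distance, assuming 1-D diffusion: t = x^2 / (2 D)."""
    return distance_cm ** 2 / (2.0 * d)

SECONDS_PER_DAY = 86400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

for label, x_cm in [("1 micron", 1e-4), ("1 cm", 1.0), ("1 m", 100.0)]:
    t = diffusion_time_s(x_cm)
    print(f"{label}: {t:.3g} s = {t / SECONDS_PER_DAY:.3g} days = {t / SECONDS_PER_YEAR:.3g} years")
```

Under this assumption a micron takes about half a millisecond, a centimetre a bit over half a day, and a metre on the order of 16 years, in line with the figures given in the talk.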
So this is actually quite an interesting elaboration that biology has evolved to get signals across a space very quickly, in a short period of time. As I said, what I’m going to focus on for the limited time is just what is it that makes a membrane allow potassium to conduct across it, and this is really what got me interested in this problem. When I started my work in this field, I was an electrophysiologist and I was really kind of mesmerised by two aspects of these channels. And what I’m showing here is what happens if you isolate a tiny patch of membrane. There are different ways of doing it. You can do it with patch pipettes, a technique developed by Sakmann and Neher, by putting a patch pipette on a cell and isolating a tiny component of it so that on average you have only one channel and you can look at its electrical activity. Another way, shown here in a cartoon, is you can isolate the channel and put it, reconstituted, into a planar lipid membrane. You can then put an amplifier across this that controls the voltage across the membrane and at the same time measures the current. And what you see in this case, with this potassium channel, is two records; it’s a continuous record, with time on the x-axis and current on the y-axis. And what you’re looking at is a channel flipping open and closed. So when it’s closed it’s at the zero current level, and then you see it flicker open. And when it’s open, potassium ions are running through this channel, and then the channel closes again. And the reason it’s going back and forth is that, under the conditions in which this channel was recorded, it was at equilibrium and you were just looking at fluctuations between the open and closed state of the channel. So you’re looking at the sort of equilibrium, if you will, the fluctuations back and forth. 
And you’d say that this channel is probably open 30% of the time or so, on average, but you watch, it’s never open only 30% of the time, it’s either open or not and that’s what you are looking at. Now, what really interested me is the fact that when this channel opens, this is a scale of 10 picoamps, so 10 picoamps is 6 times 10 to the 7th ions per second, almost 10 to the 8th ions per second. And this is, the reason the ions are moving through is there’s a voltage across the membrane and it’s a voltage, the kind you’d see under physiological conditions in a cell, so it’s a sort of reasonable electrochemical driving force. And you’re watching ions go through very, very fast, near 10 to the 8th is near what you’d call the diffusion limit. And what that means is if you imagined a little patch opening to the channel, the pore of the channel, that would say be as, diameter being equal to that of a potassium ion and then you imagined and made a little calculation based on a solution outside having .15 molar potassium chloride, something you’d find in a cell. You’d ask how fast do potassium ions collide with that little disc and you’d come up with a number that’s very close to what you actually observe for throughput. And what that tells you is the channel has somehow lowered the barrier for the ion to go through that oily membrane very beautifully, so that it goes through with the energy barriers that are approximately equal to the energy barriers that an ion or that a potassium ion experiences as it diffuses through water. And then the other fascinating thing is that this is very selective for potassium over sodium, so it tells the difference by a factor of 10 to the 3 to 10 to the 4th, so it’s working near the diffusion limit and it’s discriminating between potassium and sodium very well. 
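The conversion from 10 picoamps to about 6 times 10 to the 7th ions per second follows from dividing the current by the charge carried per ion. A minimal sketch, assuming singly charged ions such as potassium; the function name is hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one monovalent ion, coulombs

def ions_per_second(current_A, valence=1):
    """Number of ions crossing per second for a given single-channel current."""
    return current_A / (valence * ELEMENTARY_CHARGE_C)

rate = ions_per_second(10e-12)  # 10 pA, the scale bar mentioned in the talk
print(f"{rate:.2e} ions per second")
```

The result is a little over 6 × 10⁷ ions per second, consistent with the figure quoted in the lecture.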
So this summarises what I just said, that the conduction rates for potassium channels are approaching the diffusion limit and yet the selectivity is quite exquisite and here is potassium and here is sodium. So it’s letting the larger ion through and not letting the smaller ion through. These alkali metal cations have differences but they’re not hugely different, how does this happen? That’s the problem that fascinated me. Now, as an electrophysiologist, what we had working on this 15 years ago, or a bit more now, really were the aminoacid, the gene, genes for a few potassium channels, which means we had the deduced amino acid sequence, so remember proteins are made of a polypeptide chain, amino acids linked together that fold up to form a 3-dimensional structure. We had no idea what that was. But what we could do is we could make the electrical recordings in the way I’ve shown you, we could then alter the gene such that we replaced amino acids with other amino acids and go through this and we could imagine what was, define the effects and then imagine what was happening. And we could tell a lot this way, we could tell the potassium channels had four subunits, we could tell, you know, where some of what made them open and close occurred. And what we could, the thing that we really came upon, and the thing I was interested in is that there was a little sequence of here, 8 amino acids shown here, TATTVGYG, each one of these, it’s a single letter code for an amino acid, threonine, alanine, threonine, threonine, valine, glycine, tyrosine, glycine. And when we started this work, there was one potassium channel gene in the shaker potassium channel, it has 100’s of amino acids in it, but it was out of a small number that we found that if you altered these you altered selectivity in the conduction process. And as different laboratories identified more potassium channel genes, they all have this sequence. 
So it was clear to me, you know, we call this the signature sequence of potassium channels, it was actually used to identify them: as DNA was sequenced, you’d see the sequence and call it a potassium channel. Now, this was a certain kind of information that was to some extent useful, but actually it was very limited. Because if you wanted to know how the potassium channel tells the difference between potassium and sodium and conducts at a high rate, you know, there are certain pieces of information missing here from this linear representation. And at that point I decided that we would have to see this and that I would study crystallography to work on this problem. Now, at that point I had two problems. The first problem was, people said it was impossible. Well, that seemed OK, it’s probably difficult, but I refused to believe it’s impossible, because scientists before me, Robert Huber, Johann Deisenhofer, Hartmut Michel, had determined the structure of a membrane protein, just like an ion channel, and had worked out some of the techniques to do that. And others had then after them done that with a few different membrane proteins. So it seemed to me this is not an impossible problem. The second problem I had though, is people said ‘You don’t know what you're doing.’ And that was a bigger problem actually. But when I thought about it I said: ‘If I don’t try, I’ll certainly never see this thing, or at least never be the first one to see it.’ And also there’s nothing like being afraid of making a fool of yourself to make you study hard and learn fast. And so this was clearly the thing to do. 
And obviously I’m jumping over a number of years to show this, this sequence makes this structure you're looking at, two of the subunits of a potassium channel, the blue stuff, blue mesh is electron density and this is a big jump, there were many steps along the way, lower resolution, actually initially non-membrane protein structures to learn crystallography, and then lower resolution structures, and then working out tricks to get better crystal lattice, higher resolution diffraction, to get this kind of description of the potassium channel. And this is the pore of the potassium channel, it’s actually a very simple structure. You’ve now seen these …-like structures, they’re simplified renditions of, so this thing that looks like … is actually a polypeptide chain folded up in Linus Pauling’s alpha helix, and so you see the helixes, there are four subunits, they’re shown in different colours, the membrane runs from here to here, so outside is on the top, inside is on the bottom, and there’s a pore down the middle between these subunits. And one thing I want to show you about this is a different representation of the channel, even at low resolution, what I found very beautiful in looking at this structure is that if this is outside the membrane, this is inside, in the centre of the membrane, the ion conduction pore is very wide, in fact it makes a cavity that will hold 30 some-odd water molecules in it. This red cherry here is actually electron density for a rubidium ion, it’s a good analogue of potassium. And remember the oil and water thing, I shook the cobalt chloride in oil and water and said the ion stay in water. Well, it turns out the energetically most unstable point for an ion to cross the membrane is at the centre of the membrane, where it’s furthest from the water. And what nature has done here is made a very beautiful structure, where it keeps an ion fully hydrated at this otherwise unstable point. 
And another feature of this is that these alpha helixes, shown in yellow, are pointed in a particular way, such that their partial negative charge is on this end, pointing at the positive ion. So when you think about that, you say: ‘Ah, I see what nature is up to here, it makes a protein structure with water, keeps the ion hydrated at the centre and in fact points partial negative charges close to the positive ion to help stabilise it, to overcome this barrier inside the membrane.’ What makes the potassium channel a potassium channel, though, is this part right up here. And that’s what we look at here, looking at two of the subunits. The potassium ions are in four positions, labelled 1 through 4, each one is surrounded by oxygens from the protein, shown as red sticks here, and so each one actually will sit in a little cage of protein oxygen atoms, and you can get a good feel, looking at this representation of the selectivity filter, of what nature is up to with the potassium channel. So here are the four sites in the selectivity filter, and in this crystal structure, in the centre of that opening, half way through the membrane, is a potassium ion that happens to be hydrated. You can see electron density for eight water molecules surrounding this potassium ion suspended in this pool of water. And if you look at this, you see, look how the water molecules are organised around the potassium ion, and then look at each position in the selectivity filter, and you see a very similar organisation of oxygen atoms surrounding the potassium ions, so here four on top, four on bottom. Three of the sites are in a square antiprism, like you see the water surrounding the potassium ion here in the crystal structure. 
And so you look at this and say, ok, it’s very simple, the potassium channel has presented four sequential sites in a selectivity filter, each site has protein oxygens that act as surrogate water molecules to let the potassium ion come out of its hydrated state and into the selectivity filter, where the protein replaces the water molecules. And then you ask, well, why wouldn’t sodium do this? And the experimental observation, as I’ll show you in a minute, is that it doesn’t, and the explanation would be that what I’m not showing around here is a well packed protein structure. So this structure actually makes these cages that would be the size for the potassium ion. The essence of selectivity really is the ion has to come out of the water and into the site in the protein, which means the site in the protein has to replace the water molecules or has to compensate for the energy of dehydration. Because the sodium ion has a smaller radius, its energy of dehydration is much larger. And so in a sense it’s easier for, in a sense you’re ahead in making a potassium channel, by that energetic argument. If you do the experiment of asking, well, let’s try to take the potassium away and put sodium in here, it’s interesting. So here’s a simple representation of the filter, you then go from a condition where you crystallise this with 150 mM potassium and you now replace all but 2 mM of that potassium with sodium. And what happens is the filter actually changes its conformation, it loses the ions from the centre. And what you conclude is that the structure that has evolved to select for potassium, actually its structure depends on the presence of potassium. And if you think about it, that’s a very nice sort of property of a device that has to select a potassium ion. When it’s challenged with the ion that’s not supposed to go through it, it changes its structure and actually pinches shut. 
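Editorial illustration (not part of the lecture): the remark that a smaller ion has a larger energy of dehydration can be made concrete with the textbook Born continuum model, in which the hydration free energy scales as 1/radius. The sketch below is only a rough estimate under stated assumptions: the function name `born_hydration_energy` is hypothetical, the ionic radii are approximate literature values, and the Born model is known to be quantitatively crude. It is meant to show the trend, not to reproduce measured values.

```python
import math

# Physical constants in SI units
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
AVOGADRO = 6.02214076e23     # particles per mole


def born_hydration_energy(radius_angstrom, charge=1, eps_water=78.4):
    """Born-model free energy (kJ/mol) of moving an ion from vacuum into water.

    The result is negative: the ion is stabilised by water, so the cost of
    dehydrating it is the same number with the opposite sign.
    """
    radius_m = radius_angstrom * 1e-10
    energy_joule = (
        -((charge * E_CHARGE) ** 2)
        / (8 * math.pi * EPS0 * radius_m)
        * (1 - 1 / eps_water)
    )
    return energy_joule * AVOGADRO / 1000.0


# Assumed approximate ionic radii in angstroms (illustrative values only)
ASSUMED_RADII = {"Na+": 0.95, "K+": 1.33}

for ion, radius in ASSUMED_RADII.items():
    cost = -born_hydration_energy(radius)
    print(f"{ion}: estimated dehydration cost {cost:.0f} kJ/mol")
```

In this model the ratio of the two dehydration costs is simply the inverse ratio of the radii, which is the sense in which the smaller sodium ion is harder to strip of its water.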
Another thing, if you look carefully, this is the conductive form, and now I’ll show you, just above, just outside the selectivity filter, in this picture we looked at before, this is something interesting, there are actually some low-occupancy potassium ions there. And in fact if we zoom in on that, it’s very peculiar, this is an electronegative region of the molecule, it should attract a cation, but actually the cation sits in sort of two possible energy minima that are too close to each other to be simultaneously occupied. Notice that this one would be dehydrated on the bottom but still hydrated on the top. The explanation for this, and I just give you the answer, but many experiments we’ve done to pursue this observation tell us that what this electron density represents is actually the fact that when we look in the crystal and see these four sites, what we’re looking at is two configurations of potassium ions in the filter. And half of them have ions in the 1 and 3 positions and half have potassium in the 2 and 4. We haven’t directly observed the water in between, but chemically it would make sense that there’s water in between. And functional experiments tell us that potassium and water are coupled moving through the selectivity filter. And these, just giving a survey here, leaving of course a lot out, but these two, what we call 1, 3 and 2, 4 configurations, actually make the end points of a very simple throughput cycle, shown here. Where potassium ions could be in a 1, 3 or a 2, 4 and they can rattle, the queue of potassium and water can thermally agitate back and forth. Then entry of a potassium ion from one side or the other would happen simultaneously with the movement of the potassium ion out the opposite side. So in a sense one throughput event isn’t one potassium ion going all the way, so one charge unit going all the way across corresponds to several moving a fraction of the way. 
And this is a little cartoon just to show you the rough idea, it’s only a cartoon, for people who don’t like to look at cycles, and the water that would be in between the ions is not shown here. But the idea is that the filter wants to, actually what I think is the filter wants 2½ potassium ions in it. And potassium ions only come in units of 1, 2, 3. And so it in a sense is trying to draw a third one in, but 3 is too many. So it’s in this sort of state of unhappiness that will, if you think about it, support a high throughput rate. Now, I know some people in the audience are probably thinking that’s very funny, he just showed us four equal peaks in the selectivity filter and yet he’s telling us that these represent two distributions of the potassium ions, why are they equal, why do half the proteins in the crystal have the ions in 1, 3 and the other half in 2, 4, why is it balanced this way? And I can’t give you the definite answer, but a thought experiment gives a good possible explanation. And that is if you just simulate throughput through a system like this, where you have 1, 3 and 2, 4 and you then look at what the throughput rate would be as a function of the energy difference between these states, ok. So here is energy difference between 1, 3 and 2, 4, imagining this cycle. And this is the rate. What you see is that you get a peak in the throughput rate when there’s no energy difference. And if you just close your eyes and think for a minute, this is a very intuitive result. Because if there’s an energy difference between them, every time the system goes around the cycle once it has to step up an energy barrier. And so this in a sense, I would say evolution has made this thing operate at near its maximum throughput rate by balancing these two configurations that we have deduced in the selectivity filter of the potassium channel. 
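Editorial illustration (not part of the lecture): the thought experiment about throughput versus energy difference can be sketched with a toy two-step cycle. This is not the simulation referred to in the lecture; it is a minimal model under stated assumptions. The function name `cycle_throughput`, the base rate `k0`, and the choice to split the energy difference symmetrically between the two steps are all hypothetical. Step one takes the filter from the 1, 3 to the 2, 4 configuration; step two returns it to 1, 3 as one ion enters and another leaves. For such a cycle the steady-state rate is k1·k2 / (k1 + k2).

```python
import math


def cycle_throughput(delta_e_kt, k0=1.0):
    """Steady-state rate of a toy two-step cycle (1,3) -> (2,4) -> (1,3).

    delta_e_kt is the energy of the 2,4 state minus that of the 1,3 state,
    in units of kT. Assumption: the uphill step is slowed and the downhill
    step is sped up by the same Boltzmann factor exp(|delta_e| / 2).
    """
    k_13_to_24 = k0 * math.exp(-delta_e_kt / 2)
    k_24_to_13 = k0 * math.exp(+delta_e_kt / 2)
    # Mean time per cycle is the sum of the mean waiting times of both steps.
    return (k_13_to_24 * k_24_to_13) / (k_13_to_24 + k_24_to_13)


# Sweep the energy difference and print the rate relative to the balanced case.
balanced = cycle_throughput(0.0)
for delta in (-4, -2, -1, 0, 1, 2, 4):
    relative = cycle_throughput(delta) / balanced
    print(f"delta E = {delta:+d} kT  relative throughput = {relative:.3f}")
```

In this toy model the rate reduces to k0 / (2 cosh(ΔE / 2kT)), which is largest when the two configurations have equal energy, in line with the intuition given in the lecture: any imbalance makes one of the two steps an uphill, rate-limiting one.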
And now, just for the last minute I want to say, I said nothing about why, when I showed you that record in the beginning, it was flipping open and closed, I just talked about what happens when the ions rush through. And it’s because in different potassium channels these pores have additional structures added to them that actually do something like binding of a ligand molecule, and use the energy of binding to make a conformational change that opens the pore. And in other cases, the channel has a little voltage meter on it and it changes its conformation in response to the value of the membrane voltage, and those are called voltage dependent ion channels. This is an example of a ligand gated channel that is opened by calcium, and this is a voltage gated channel. And the idea is that the channels, whether they’re open or closed, the probability that they’re open or closed is controlled by stimuli. And what we know from multiple structures is that when a channel opens and closes it does a conformational change like this. So if you are inside the cell, looking out through the pore, the closed channel would look like this and an open channel would look like this. And finally, I just want to say in the last 3 minutes I think I have, I just wanted to say three things to the students. The first is that we are only just scratching the surface of understanding the cell membrane. And I think over the next few years, in fact I’m certain that over the next few years our concept of the membrane is going to change a lot. Not just a barrier to things where you have a molecule, a protein sitting in the membrane, and a ligand binds and sends a signal through that happens on the inside. The membrane is a very interesting chemical environment with a hydrophobic core, a head group, and we can start to see now that biology has really exploited this situation in very interesting ways. 
I’m rather certain that signalling, there will be amplification steps within the membrane and lateral signalling through the membrane, and we know almost nothing about that. And so we’re just starting to scratch the surface, there’s a huge forefront. The second thing is that proteins in general, we know very little about them. We don’t know a lot more about them than when Max Perutz first showed us haemoglobin. We know more structures, we know what they look like, but we don’t know why that polypeptide folds up to make the structure it does. We don’t really know why the potassium channel undergoes its conformational change when you take potassium away. We can guess, but we don’t know. We see that evolution has selected certain aminoacids in the core and they’re probably for that purpose but we don’t understand the energetics. And the third thing I want to say is when you do your science because it’s something that fascinates you and in particular, if you feel like it’s something you're not supposed to do, forget it, just do it. As I said, there’s nothing like feeling like you’ll make a fool of yourself, nothing like feeling like you’ll be a fool to really work hard. And actually, so what if you make a fool of yourself, you’re pursuing the thing you like. Thank you.

MacKinnon on Learning a New Technique
(00:15:06 - 00:19:01)

Finally, to end up where we started, here is a snippet from one of Ada Yonath’s co-recipients of the Nobel Prize in Chemistry 2009, Thomas Steitz, who also lectured at Lindau the same year as she, in 2011. Their molecule, the ribosome, is the same, but the lecture, entitled “From the Structure of the Ribosome to the Design of New Antibiotics”, is quite different. Knowledge of the structure of the bacterial ribosome can help design new antibiotics that hinder the ribosomes “at work”.

Thomas A. Steitz (2011) - From the Structure of the Ribosome to the Design of New Antibiotics

Well, it’s a pleasure to be here and to talk to you all. This is a very wonderful meeting and venue for interacting between the laureates and the students. What I’m going to do is talk about some of our work on the ribosome and I’m also going to try and say a few words about how we got into studying this particular problem. Every set of studies has a pathway and we had a pathway. So when I was a postdoc in Cambridge, Brian Hartley, who is on the next slide, came up to me in the hallway. In Cambridge people talked a lot and Brian came up to me, and this was about 1969, the year before I was going to take my faculty position. He said: “Well, what are you going to do next? What is your next problem?” And I said: “Well, I thought I’d work on the structure of an aminoacyl-tRNA synthetase complexed with tRNA and amino acid.” And he went: “There there my boy, that’s a very good idea, but why don’t you work on something you could actually do? I suggest hexokinase.” So I thanked him and I ran down to the library to find out what hexokinase was. And I then found that Dan Koshland had used hexokinase as one of the bases for proposing induced fit. Because he was trying to address the question: Why does hexokinase not hydrolyse ATP in the absence of glucose? After all, a hydroxyl group of a sugar is just a water mimic, and he said: Ah, there must be an induced fit, only when the right substrate binds do the catalytic groups arrange themselves. And if you don’t have, in this case, both substrates you won’t get the real arrangement. So we solved the structure of hexokinase with and without glucose, and this is one of the structures that we’ve done that is in Stryer’s textbook, and there is the structure without glucose and there is the structure with glucose. So right, he was right, that is induced fit. So then I wanted to turn back to the problem that had been interesting me all along. 
And that was inspired by two individuals whom I interacted with a lot, Jim Watson and Francis Crick. Now, Jim was at Harvard when I was a graduate student, actually I played tennis with him, went to his group meetings, and he was working on the ribosome, so he got me interested in that area. And when I went to Cambridge, Francis of course was always around talking about everything, including the ribosome and other things. So that got me interested in thinking about the central dogma of molecular biology, which is DNA is copied into DNA, then DNA is transcribed into RNA and then finally the RNA is translated into protein. So we started out working on regulatory proteins and polymerases and did this for many years and we are still doing it. We are still trying to put together this entire machinery that is involved in replication, transcription, transcription regulation and the synthesis by the ribosome, and now I’m going to focus on how peptide bond formation occurs. Now, twenty years after my conversation with Brian Hartley, you probably can’t read it, but this is in 1989, our structure of an aminoacyl-tRNA synthetase, glutaminyl-tRNA synthetase with tRNA and ATP, was published, twenty years. That would have been a very long assistant professor project. I don’t think I would have made it. So it is important to pick important problems, but you have to have the timing right, the right thing at the right time. So the ribosome, in 1964, when I was a graduate student, this was Jim Watson’s drawing, this is what the ribosome looked like, what was known about it. It had two subunits, a large and a small one, and tRNA came into the A site with the amino acid, but they didn’t know what the tRNA structure was, and then the P site had a peptidyl group, and they didn’t know about the tunnel. And then there was translocation, so that was the general picture, but not much structure really. 
And then Jim Lake, in 1976, did the first EM studies of the ribosome and showed that the large subunit had this sort of a shape, that’s called the crown view, and the small subunit had a nice shape and they would snuggle up to each other in a friendly fashion, but it didn’t say much about function really. Some years later, actually this was published initially in 1997, by Joachim Frank, he introduced cryo-EM, which I think is a fabulous technique for the present and the future, and began to show some little invaginations in the ribosome. They approximately positioned the tRNAs in the A site, P site and the E site, but not quite right, because all these anticodons are together, so that was the beginning of understanding the structure. So in 1995 we figured that it was time to move on and decide on the ribosome, because we’d been studying everything else in the central dogma, and you need the right person at the right time. And Nenad Ban joined the lab in 1995 and said he wanted to work on the ribosome and he was absolutely the right person at the right time. And two years later, Poul Nissen joined and the two of them really initiated the project and did a fabulous job. So what was the problem? Well, my experience tells me that in an audience like this I’m not in 25 words or less going to tell you the details of what the problem in phasing is, or for that matter even what the phasing problem is, but we followed the work of the … and the first one to solve a structure of a ... using a heavy atom. You have to find the heavy atom, but then you have to locate it. It has to have a large enough signal so that you can measure its effect on the crystal, and on the ribosome we need a super big heavy atom, so we use cluster compounds at very, very low resolution, where the signal is much stronger. And so here’s Joachim Frank’s twenty ångström resolution map of the large subunit. 
The first structure that was published was at nine ångström resolution, that we published, and we could see helixes and we saw the shape was the same, so we knew we were on the right track, and then we went to progressively higher resolutions. And then finally in 2000, we got the atomic structure with the large ribosomal RNA in green and white and the 5S rRNA coloured in purple and white. And protein scattered around the surface and the peptidyl transferase centre right at the bottom on the left, totally surrounded by our … and I’m not going to go into this in detail, but we found that only RNA was at the site of peptide bond formation. So, as Francis Crick had predicted in 1968, the ribosome is indeed a ribozyme, that is, it’s the RNA that’s involved in catalysis, which makes sense. If you think about the fact that the machine that makes proteins, the first protein couldn’t have been a protein, it’s the chicken and the egg problem. So indeed we showed that was right. And if you split the ribosome in half, now you can see that there’s a tunnel through it that’s about 100 Å long and about 20 Å wide, and right at the surface of the RNA that’s split, showing it’s pretty tightly packed and that there’s protein that protrudes in towards the centre. And here is a model of the 70S, that we’ve done recently, and again you can see the RNA has a very complicated twisted shape with the proteins embedded around. Here’s the large subunit RNA again, protein scattered around on the surface, and the bits of protein that go in are stabilising the RNA, and here we come back to the A site tRNA. And the message is going in and being decoded here and the peptide bond formation is occurring in the bottom in the large subunit. So let me now turn to the source of the ribosome’s catalytic power, and again this was done by a graduate student, Martin Schmeing, who also made a movie that I’m going to show you momentarily. 
And so what he did was, what I think is important in using structural studies to understand function, he captured every state in the process and took a structure of that and then he put it together in a movie. So here is the process, the A site substrate with an amino acid, the alpha amino group attacking the carbonyl carbon of the peptidyl-tRNA. Now, he couldn’t, in the large subunit, look at the whole tRNA, so he actually had just CCA amino acid, CCA peptide, and so here are the three tRNAs and the long polypeptide tunnel, and you can see the proteins coming into the interior. And so we’re going to focus on this area here, and finally he got the situation right at the site of peptide bond formation with the attacking alpha amino group in a position to attack the carbonyl carbon, positioned by a three prime hydroxyl of a sugar of A76 and positioned also by this A. And so then the question is: How does it work, what’s involved in catalysis? Well, part of it is orienting the substrates, that’s standard. Orienting the substrates, very important, but there’s more to it than that. And what’s involved in chemistry, and that was what we were wondering, ... Scott Strobel and Rachel Green’s labs removed the 2 prime hydroxyl of the P site here, so this is the 2 prime hydroxyl of the A of the P site tRNA, because that is about the only thing that could be helping here. And indeed they found that reduced the rate by 10,000-fold. And earlier, Andrea Barta, using our structure work in her similar studies, proposed a shuttle mechanism in which the 2 prime hydroxyl is picking up the proton from the alpha amino group to facilitate a nucleophilic attack and then that proton is going to the leaving 3 prime hydroxyl group. And all the subsequent structural and biochemical studies are consistent with this shuttle mechanism. So what’s involved in a ribosome’s catalytic power are substrate orientation, this proton shuttle and probably transition state stabilisation by a water molecule. 
Well, I could show you all these detailed structures, one at a time, but rather than doing that I will show you Martin’s movie with music. Okay, so that’s the RNA, proteins in blue, now we’re focusing in onto the peptidyl transferase centre, we’re going to land on the peptidyl transferase centre, and on one side is going to be the P loop from the ribosome, it’s binding the P site substrate, and then the A loop is on the other side. So now we have coming in here an analogue of the P site substrate, it’s going to make hydrogen bonds, there’s the structure by the way, and the carbonyl carbon is protected by this base, Martin discovered. Now we have the attacking A site substrate coming in. Again making the hydrogen bond with the A loop, and now it’s oriented incorrectly to attack, but when you get a CCA, Martin found, you get a rearrangement and now it’s ready to attack, and there’s a water molecule stabilising the tetrahedral intermediate. So now we’re about to have the reaction here. It’s about to attack, okay. So there’s the tetrahedral intermediate, the oxyanion interacting with the water molecule, and it breaks down to give the product. And there’s the product CCA and that’s another crystal structure here. And now the product is going to go off to the exit site, the E site. There’s always a little bit of Walt Disney in between the structures here, so this is made up. So the E site is some distance away. Now the thing that’s important about the E site is it has to discriminate against tRNAs with an amino acid, it only binds the deacylated tRNA. So it goes into a little pocket and there is no room for an amino acid in this region, so it only binds the deacylated tRNA. Okay, and then one last trip from the A site to the P site, and it’s ready to start over again. So that’s what we know about peptide bond formation. 
Actually, Martin went from my lab to working in Venki Ramakrishnan’s lab and he made another movie with music for EF-Tu, you ought to listen to that as well. He’s great. He has a second career possibility, I’m sure. So now in the last twelve minutes or so I want to talk about antibiotics, what we know about how antibiotics bind to the ribosome and inhibit it, and the source of antibiotic resistance, and then what is being done to design new antibiotics, not by us but by a company we founded called Rib-X. I’m sure you’re all aware of MRSA, Mersa, methicillin-resistant Staphylococcus aureus, which, in the New York Times, October 17th, in 2007, four years ago almost, caused 19,000 deaths in US hospitals alone. I’m told now it’s around 20,000, or 100,000 worldwide. So it’s a major problem and it’s not the only resistant bacterium, so this is becoming a major problem that needs to be dealt with. So the question is how do antibiotics bind to the 50S. It turns out that about 50% of antibiotics bind to the ribosome, and most of them bind to the large subunit. So it’s a major target. And so this work was done by another postdoc, Jeff Hansen, initially, and then there was more work done subsequently that I’ll mention. So here’s the tRNA binding site, split away in the interior of the ribosome, and there are some antibiotics that are specifically against TB, that I will mention at the end. There is an E site inhibitor, chloramphenicol and anisomycin. I’m actually going to skip over this particular one, and then there are macrolides, which I shall concentrate on, that bind further down in the tunnel, erythromycin and azithromycin, Z-Pak, I expect many of you have taken that. These are current antibiotics. And so how do these bind? So these studies were done some years ago by Jeff Hansen looking at the binding of a whole group of macrolides, and the macrolide rings all superimpose, they have different substitutions making different interactions, almost all with the RNA. 
And it binds, here’s where peptide bond formation occurs, it binds further down the tunnel. So how do they work? So again, here’s where the antibiotic binds, in red, and in green are the residues whose mutation makes the ribosome resistant to the antibiotics, and there’s where polypeptide chain synthesis occurs. So if you look up the ribosome, there’s where peptide bond formation occurs, residues whose mutations lead to resistance, and there is the antibiotic, it’s blocking the tunnel. I like to call it molecular constipation, as a way of getting the point across. And mutation of this residue to a G reduces the binding constant, so it’s a G in Haloarcula marismortui, which is the species we use, and an A in E. coli. So they decided that in fact they should mutate this residue, so there is the A or the G, and you can see that that N2 is right under the macrolide ring, so they’ve got to get rid of that. So this student and postdoc decided to mutate that G to an A in Haloarcula marismortui, make a pseudo-eubacterium if you will. And in fact now erythromycin binds very well, no problem whatsoever. And you can see where there’s erythromycin, which could bind in either situation, it only changes the position a little bit due to this pesky N2 that’s on the G. So that means it doesn’t pack as tightly, which is why the binding constant goes down. Okay, well, now what? Well, the question that arose for many years is whether antibiotics bind differently to the archaeal ribosomes that we were studying, Haloarcula marismortui, and the bacterial ones, because an archaeon is sort of a eukaryote in disguise. They hung out with eukaryotes for a billion years or so. And so we thought, of course, they were the same, but there was some disagreement. So if we compare erythromycin as it’s bound to Deinococcus radiodurans in white and marismortui in yellow, the rings are orthogonal, so that didn’t look so good. 
So these three, a student and two postdocs, actually David is here, used Thermus thermophilus and redid the experiments, now asking on the eubacteria what happens, and the long and short of the story is that it binds the same. Here is erythromycin bound to the 70S, it superimposes exactly on our structure and doesn’t superimpose on the Deinococcus radiodurans model. And the same thing is true for azithromycin, so we conclude that in fact macrolides bind to the mutated 50S exactly as they bind to the Thermus thermophilus ribosomes. And that’s important if you’re going to use this information, because the next thing and last thing I’m going to talk about is how do you use this information to make new antibiotics. And this work I’m going to talk about is done by Rib-X Pharmaceuticals and I’m just a reporter here. And the idea is to take bits and pieces of adjacently bound antibiotics, so there are many different families of antibiotics that bind nearby in the peptidyl transferase centre. And so what they do is they take a piece of one and chemically tie it to a piece of another, with a lot of chemical synthesis and computational analysis to see what’s the right thing to do. So a lot of computational chemistry is involved in this as well. And so here, for example, is one combination, which led to a compound they call Radezolid, in which these two compounds are combined, and Radezolid has completed phase-two clinical trials successfully. And then another one is to combine a macrolide with groups on what’s called the A site, and that’s looking successful. And then finally, what’s really exciting is to just use this whole binding area and construct de novo, using computational methods as well as structure-based design, to make new antibiotics. And they have a couple of families of compounds that are getting ready for phase-one clinical trials, that are active against gram-negative bacteria, as well as all the resistant strains. 
So everybody is very excited about that as a new way, and so it looks like there’s many possibilities for designing new antibiotics. So this is the class called Oxazolidinone and here are the names of the people from Rib-X who worked on it and so here is, Linezolid is one that you can get from Pfizer, it’s a good compound but against various strains. So this is the MIC, minimum inhibitory concentration, and if it’s above four it’s not good, you only have to look at green and red. Red is bad, and so influenza, MRSA and so on, but Linezolid goes to Radezolid, the Rib-X compound, and it’s much better. And so again, different enterococci, again, Radezolid is very good, Linezolid is not. And then the next generation enhanced macrolide, again, the azythromycin, very bad against many things, stick it together and you get a new macrolide that works very well. So I think it’s going to work very very well, and then in the last minute or so I’m going to point out another direction that’s possible to take and that is to attack tuberculosis, a very big problem. There are, and this is done by a former graduate student and a post doc, binding some antibiotics to the 70S and here is the 70S, the A site tRNA, small subunit, large subunit, and this is where this family of compounds, capreomycin, viomycin bind, and they interact with both the small subunit and the large subunit right near the decoding centre, and let me just skip ahead quickly, and look at where it’s bound. It’s interacting with the large subunit RNA, small subunit RNA, the tRNA and here is the messenger RNA where it’s being decoded and it locks the tRNA on that orientation. And so what can you do with that? I mean, this is nice to understand, how it works, but what can you do about it? Well, it turns out that two other compounds that Venki’s lab have shown bind to the small subunit, hygromycin and paromomycin bind near to viomycin. So what would you do if you had that information? 
You’d combine it together and make a new compound, and so I’m hoping that someone will do that, hopefully at Rib-X, in the near future and get a better antibiotic that’s useful against TB, because there are some resistant strains of TB, particularly XDR, that’s resistant to all antibiotics, that are coming up, that really need to be addressed. So my main point at the end is that basic research, we only worked on the ribosomes to do basic research, but it’s having a consequence that will be useful in designing drugs effective against bacteria that are resistant. And so at the end let me just thank the people who did this work. Here, not entirely all on the screen, are those in my group who have worked on the ribosome over the years, some of whom I’ve mentioned. This was a collaboration with Peter Moore’s group, and here are some of the people in his group, and there is Scott Strobel’s group, who made some of the substrate analogues, and here are some of the people from Rib-X. And then finally I was able to take 14 guests to Stockholm to enjoy the festivities, which were really fantastic I must admit, and I was able to take some members of the lab who were pivotal, and again I’ll mention Peter Moore, a long-term friend and pillar of the ribosome community. He was very important in this work and interactions. Nenad Ban, who started it, Poul Nissen, who then joined in, Hansen, who did the antibiotic work, and Martin Schmeing, who did the substrate work, and my administrative assistant Peggy Etherton, who keeps me on the straight and is my memory chip and makes sure I go to the right meetings in the right place, and I’ll stop there. Thank you.

Steitz on the Machinery of the Cell
(00:03:20 - 00:10:20)


Additional Lectures by the Nobel Laureates associated with X-ray crystallography

Introductory Mini Lecture on X-ray crystallography.

Max von Laue 1956: "From Copernicus to Einstein".

James Watson 1967: "RNA Viruses and Protein Synthesis".

Dorothy Crowfoot Hodgkin 1970: "Structure of Insulin".

Dorothy Crowfoot Hodgkin 1983: "Insulin 1983".

Max Perutz 1986: "Hemoglobin as Receptor for Drugs: Stereochemistry of Bonding".

Maurice Wilkins 1987: "Ideals of Science and Medicine".

Dorothy Crowfoot Hodgkin 1989: "A Life in Science".

Johann Deisenhofer 1989: "The Three-Dimensional Structure of a Membrane Protein".

Hartmut Michel 1989: "Structure and Function of a Biological Light Energy Converter".

Johann Deisenhofer 2002: "Back to Proteins".

Johann Deisenhofer 2006: "Structural Insights into Cholesterol Homeostasis".

Jerome Karle 2006: "Kernel Energy Methods Illustrated with Peptides".

Robert Huber 2007: "Proteolysis and its Regulation, a Molecular Basis".

Johann Deisenhofer 2008: "Structural Biology – Quo Vadis?".

Robert Huber 2008: "Beauty and Usefulness of the Building Blocks of Life: The Architecture of Proteins".

Robert Huber 2009: "Molecular Machines for Protein Degradation Inside Cells".

Robert Huber 2010: "Basic Science and Co-entrepreneurship, My Experience".

Ada Yonath 2010: "The Amazing Ribosome".

Robert Huber 2011: "Proteasome and DegP Protease, Mechanisms and Drug Design".

Robert Huber 2013: "Proteases and Their Control in Health and Disease".

Brian Kobilka 2013: "G Protein Coupled Receptors: Challenges for Drug Discovery".

Hartmut Michel 2013: "Structure and Mechanism of Otto Warburg's Respiratory Enzyme, the Cytochrome c Oxidase".

Ada Yonath 2013: "Curiosity and its Fruits: From Basic Science to Advanced Medicine".

