Fundamentals of organic chemistry II: How molecules are different

4.2. Fundamentals of organic chemistry II: How molecules are different#

Professor: Jonathan Goodman (Yusuf Hamied Department of Chemistry)

Learning objectives:

Keeping track of molecules
Differentiating molecules
Determining Structure

Recap#

In the first lecture, we looked at atoms and put them together into molecules. Atoms can pack together in metallic structures, like lithium and beryllium, or they can form covalent bonds in which pairs of atomic orbitals interact to form new molecular orbitals. This is the language of organic chemistry and it takes a while to get used to it and become familiar with it. The names of molecules are not very important, but tracking molecules is. The molecules that covalent bonding creates do not have an even distribution of electron density, so some parts of the surface are slightly positive and some parts are slightly negative. This means molecules can stick together in an oriented way by matching positive and negative areas.

More on drawing molecules#

../../_images/organicmolecules1.png — Fig. 4.17 Some molecules from lecture one#

In all of these molecules, carbon atoms form four bonds, nitrogen atoms form three bonds, oxygen atoms form two bonds and hydrogen atoms form one bond. The number of bonds is called the valency of the atom. All of the bonds in these molecules take one electron from each of the atoms they connect to fill the bonding molecular orbital. However, it is also possible to take both electrons from one of the atoms, particularly if it has a lone pair. This is illustrated by the structure of nitric acid in Fig. 4.18.

../../_images/nox.png — Fig. 4.18 The nitrogen atom in nitrous acid has a lone pair which can form two bonds (\(\sigma\) and \(\pi\)) with the unfilled orbitals of an oxygen atom. This forms nitric acid. The new bond can be represented in several ways. The recommended convention is to use charges, as illustrated here. The positive charge on the nitrogen makes it possible to increase its valency to four, whilst the negative charge on the oxygen takes up the position of a second bond to oxygen#

The use of these formal charges makes it possible to represent molecules in which the orbitals combine to form a stable structure, but which do not fit neatly into structural drawings that respect the rigid valencies of row one and row two elements. Nitromethane (Fig. 4.19) is another example of a neutral molecule which is best represented using charges to balance the valencies. As it is written, one oxygen atom appears to have a negative charge and the other appears to be neutral. Considering the orbitals, however, the nitrogen is \(sp^2\) hybridised and its \(p\) orbital overlaps with the \(p\) orbitals on the \(sp^2\) oxygens. These three \(p\) orbitals combine to form three symmetrical molecular orbitals and the charge is spread evenly between the two oxygens.
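The bookkeeping behind formal charges can be sketched in a few lines. This is a minimal illustration, assuming the usual counting rule (formal charge = valence electrons, minus lone-pair electrons, minus the number of bonds, with a double bond counted as two). The function name and the electron counts entered for the nitro group are illustrative and entered by hand, not read from the figure.

```python
# Valence electrons for the neutral atoms used in this section.
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6}


def formal_charge(element, lone_pair_electrons, bond_order_sum):
    """Formal charge = valence electrons - lone-pair electrons - bonds.

    bond_order_sum counts a single bond as 1 and a double bond as 2.
    """
    return VALENCE_ELECTRONS[element] - lone_pair_electrons - bond_order_sum


# Nitro group of nitromethane, drawn with one N=O and one N-O bond.
# Electron counts are entered by hand for this drawing (an assumption).
nitro_group = {
    "N": formal_charge("N", lone_pair_electrons=0, bond_order_sum=4),
    "O (double bond)": formal_charge("O", lone_pair_electrons=4, bond_order_sum=2),
    "O (single bond)": formal_charge("O", lone_pair_electrons=6, bond_order_sum=1),
}

for atom, charge in nitro_group.items():
    print(f"{atom}: {charge:+d}")

# The charges cancel, so the molecule is neutral overall.
print("net charge:", sum(nitro_group.values()))
```

The drawing puts the negative charge on one oxygen only. As described above, the molecular orbitals spread it evenly over both, so this count describes the drawing convention rather than the real electron distribution.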

../../_images/cyanide.png — Fig. 4.19 Nitromethane and cyanide#

Fig. 4.19 also shows a molecule of hydrogen cyanide. The central carbon atom is \(sp\) hybridised and its two \(p\) orbitals each form a \(\pi\) bond with the nitrogen. The carbon and nitrogen are also linked by a \(\sigma\) bond, so there is a triple bond overall. A very similar molecule, hydrogen isocyanide, has the hydrogen on the nitrogen instead of the carbon. If there is a mechanism for these two molecules to interconvert, hydrogen cyanide is the preferred form. If the molecules are isolated, as they are between stars, they will not change and both have been detected. Both are neutral molecules. Removing a proton gives a cyanide anion with a net negative charge. This is a common species: the negative charge may well interact with anything it collides with, if there are enough molecules in its surroundings for it to collide with anything. Removing a hydrogen atom (a proton and an electron) from hydrogen cyanide gives a neutral cyanide radical. This has an unpaired electron, indicated by a dot, and is also likely to interact with anything it collides with. Hydrogen cyanide can be broken up by light to form a hydrogen atom and a cyanide radical.

Double bonds#

Atomic \(p\) orbitals can interact to form \(\pi\) bonds, which are represented as double bonds in our structural representations of molecules, as they reinforce a \(\sigma\) bond. This happens for a pair of \(p\) orbitals, but can also happen for larger groups of \(p\) orbitals, provided that the geometry of the molecule is appropriate for this to happen, with the lobes of the \(p\) orbitals lined up so that they can overlap well. This interaction lowers the energy of the system, as the lowest energy electrons get ever lower in energy the more \(p\) orbitals are able to overlap. This is called conjugation. It influences the geometry of molecules, because conjugated systems need to be flat, and it also influences the spectra of molecules, as conjugation adjusts the relative energy of orbitals and so the frequency of electromagnetic radiation required to move electrons between them.

../../_images/conjugation.png — Fig. 4.20 The first row shows conjugated systems, except for penta-1,4-diene which will have a higher energy than penta-1,3-diene. When a cyclic, conjugated system has \(4n+2\) electrons, where \(n\) is an integer, the molecule has enhanced stability, which is called *aromaticity*. In the second row, both benzene and toluene are aromatic and so considerably lower in energy than 5-methylenecyclohexa-1,3-diene, even though this has the same number of \(p\)-orbitals.#

Fig. 4.20 shows examples of conjugated systems. In the first row, penta-1,4-diene is not conjugated and so will be higher in energy than penta-1,3-diene. All the others will prefer flat geometries to permit orbital overlap. Benzene and toluene are aromatic. The atomic \(p\) orbitals are shown pairing up as double bonds, but moving the double bonds one place around the ring would give exactly the same molecule, just represented in a different way. All of the bonds in the six-membered ring are the same, even though they are represented as alternately single and double bonds.
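The \(4n+2\) electron count for aromaticity is easy to check mechanically. The sketch below tests only the electron count. It does not check that the system is cyclic, flat and fully conjugated, which the rule also requires. The function name is illustrative, and cyclobutadiene and cyclooctatetraene are added here as counter-examples that do not appear in the figure.

```python
def satisfies_huckel_count(pi_electrons):
    """Return True if pi_electrons equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# pi-electron counts entered by hand for some cyclic conjugated systems.
examples = {
    "benzene": 6,
    "cyclobutadiene": 4,
    "cyclooctatetraene": 8,
}

for name, count in examples.items():
    print(name, count, satisfies_huckel_count(count))
```

Of these three, only benzene passes: six electrons correspond to \(n = 1\).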

Stereochemistry#

All of the molecules we have looked at so far are achiral, which means they are indistinguishable from their mirror images. Many molecules are different to their mirror images and we need to have a way to represent this. We do this using dashed and wedged bonds, as shown in Fig. 4.21. The wedged bond comes forwards from its point, and the dashed bond goes backwards from its point. Butan-2-ol is a simple example: the two molecules in Fig. 4.21 are identical in every way, except that they are mirror images. Experimentally they can be distinguished because they interact with polarised light in different ways.

../../_images/mirror_images.png — Fig. 4.21 Mirror images#
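One way to see that a mirror image is a distinct object is to attach a sign to the arrangement of groups around a central atom. The sketch below assumes idealised tetrahedral bond directions (illustrative coordinates, not taken from the figure). It computes the scalar triple product of three bond vectors listed in a fixed order, and reflection reverses its sign.

```python
def signed_volume(a, b, c):
    """Scalar triple product a . (b x c) of three 3D vectors."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]


def mirror(vector):
    """Reflect a vector in the yz-plane (x -> -x)."""
    return (-vector[0], vector[1], vector[2])


# Idealised bond vectors from a central carbon to three of its four
# different substituents, listed in a fixed order (illustrative coordinates).
substituents = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)]
mirrored = [mirror(v) for v in substituents]

original_sign = signed_volume(*substituents)
mirrored_sign = signed_volume(*mirrored)

print(original_sign, mirrored_sign)
```

The two values have equal size and opposite sign. No rotation can change the sign, so the two arrangements cannot be superimposed. This sign is the information that wedged and dashed bonds record in a flat drawing.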

The right-hand molecules in Fig. 4.21 are alanine and its mirror image. Alanine is one of the naturally occurring amino acids which build up proteins, which are key biomolecules for life on Earth. Its mirror image is not. Three-dimensional representations are shown in Fig. 4.22. Homochirality is central to terrestrial life, as the mirror images of proteins, nucleic acids, sugars and many other naturally occurring molecules do not exist in the natural world.

Could there be achiral life? We cannot rule out this possibility, just because homochirality is present in all of the life we know about. However, even if the building blocks for some life were to be achiral, it is hard to imagine complex achiral structures interacting without differentiating between right and left. For example, the constraints on building assemblies of differently sized spheres so that there is always a mirror plane are severe.

../../_images/alanine.png — Fig. 4.22 Two alanine molecules in three dimensions. This complex shape is built from atoms connected by covalent bonds. This representation simplifies the structure, ignoring the clouds of electron density and the details of the orbitals, focussing on the positions of the atomic nuclei, which are determined by the wavefunctions of the electrons around them.#

The dashed and wedged line notation allows us to represent most molecules in an unambiguous way. You might like to try to construct reasonable molecules which cannot be represented in this way. How many molecules are there? Restricting ourselves to the first two rows of the periodic table and putting a limit on size, such as considering only molecules with fifty atoms or fewer, the answer is a gigantic number. A commonly used estimate is \(10^{60}\) although this was calculated in a rather crude way as the people who came up with the figure readily admit. Perhaps a more precise figure can be calculated. However, it is clear that the number of possible small molecules is very large.

This complexity has consequences. Suppose spectral data suggests that some sort of molecule exists, perhaps in the atmosphere of another planet. What is the molecule? Given a molecular structure, we can calculate spectra reasonably well. One approach to solving the problem would be to guess a molecular structure, see if the spectrum fits the observation and then go on to guess a different structure if it does not. This strategy is unlikely to be successful. Calculation of spectra requires some computer time, and calculating \(10^{60}\) spectra is both impossible now and unlikely to become possible at any time in the foreseeable future.
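A rough calculation shows the scale of the problem. The throughput of a billion spectra per second used below is an assumption, chosen to be far more generous than any real calculation, and the age of the universe is an approximate figure. Only the \(10^{60}\) estimate comes from the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

n_structures = 10**60          # estimate quoted in the text
spectra_per_second = 1e9       # assumed, deliberately generous throughput

years_needed = n_structures / spectra_per_second / SECONDS_PER_YEAR
age_of_universe_years = 1.38e10  # approximate

print(f"years needed: {years_needed:.1e}")
print(f"ratio to age of universe: {years_needed / age_of_universe_years:.1e}")
```

Even with this throughput, the time needed exceeds the age of the universe by tens of orders of magnitude, so faster computers alone cannot make exhaustive guessing practical.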

The problem is even more complicated than this. Most molecules are rather flexible. It is possible to rotate around \(\sigma\) bonds and the barriers to do this are usually rather low. At almost any temperature, even approaching absolute zero, molecules will be moving around. Calculation of the spectra usually requires consideration of all of the different shapes (or conformations) that they can adopt.

Distinguishing molecules#

Are two molecules the same? This sounds as if it should be an easy question, but it is not. One way would be to check whether they have the same name. I have been using chemical names to label molecules in these lectures, and they are helpful labels. I hope none of you have been memorising them. Many molecules have multiple names, and some names refer to multiple molecules. Using only the name is not a good way of distinguishing molecules. PubChem is a US-based index of molecules which includes names and alternate names. Most common molecules have a large number of different names. PubChem contains information on more than a hundred million different molecules, which is approximately the number that have been characterised since the development of our modern understanding of molecular structure over a century ago.

If we are searching to identify molecules in distant locations, or cataloguing the molecules that lead to the generation of extraterrestrial life, we need to have an effective way of doing this. Drawing structures of molecules is effective but imperfect. If you draw a molecule upside down, it is the same molecule but looks different. Translating a diagram into a text-based form is straightforward: simply list the Cartesian coordinates of the atoms and the connections between them. However, there are many different ways to draw each molecule, and so this method gives many different text-based descriptions for each of them. Which differences in structures are significant and which are simply decorative? Which atom should you list first?

The InChI (International Chemical Identifier) is the solution that the International Union of Pure and Applied Chemistry has produced to address this problem. Any structural representation of a molecule can be turned into a unique text string, the InChI, and each of these text strings corresponds to a single molecule. This can be tried out using the IUPAC InChI Web Demo.

The InChI algorithm provides a unique numbering of the atoms which will be the same whatever order was chosen in drawing the molecule. Many different structural diagrams will have the same InChI but they will all be different representations of the same substance.
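The idea of a numbering that does not depend on drawing order can be illustrated with a toy example. The sketch below is not the InChI algorithm. It tries every possible atom ordering and keeps the smallest resulting description, which is only practical for a handful of atoms. Hydrogens and bond orders are left out for brevity, and the function name is illustrative.

```python
from itertools import permutations


def canonical_form(elements, bonds):
    """Brute-force canonical description of a small molecular graph.

    elements: list of element symbols, one per atom.
    bonds: list of (i, j) index pairs.
    Tries every atom ordering and keeps the lexicographically smallest
    description, so the result does not depend on the input order.
    """
    n = len(elements)
    best = None
    for order in permutations(range(n)):
        # order[new_index] = old_index; invert it to relabel the bonds.
        new_index = {old: new for new, old in enumerate(order)}
        relabelled_elements = tuple(elements[old] for old in order)
        relabelled_bonds = tuple(
            sorted(tuple(sorted((new_index[i], new_index[j]))) for i, j in bonds)
        )
        candidate = (relabelled_elements, relabelled_bonds)
        if best is None or candidate < best:
            best = candidate
    return best


# Heavy atoms of ethanol (C-C-O), written down in two different orders.
drawing_a = (["C", "C", "O"], [(0, 1), (1, 2)])
drawing_b = (["O", "C", "C"], [(0, 1), (1, 2)])
# Dimethyl ether (C-O-C) has the same atoms but different connections.
ether = (["C", "O", "C"], [(0, 1), (1, 2)])

print(drawing_a == drawing_b)                                    # raw lists differ
print(canonical_form(*drawing_a) == canonical_form(*drawing_b))  # same molecule
print(canonical_form(*drawing_a) == canonical_form(*ether))      # different molecule
```

The two ethanol descriptions differ as written but share a canonical form, while dimethyl ether does not. A practical identifier has to achieve the same result without the factorial cost of trying every ordering.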

Unique Molecules#

The question of whether two molecules are the same is difficult, even for molecules in the first two rows of the periodic table. For example, how many stereoisomers are there for: 3,4-bis(1-hydroxyethyl)hex-3-ene-2,5-diol (Fig. 4.23)?

../../_images/howmany.png — Fig. 4.23 3,4-bis(1-hydroxyethyl)hex-3-ene-2,5-diol without stereochemistry.#

As drawn in Fig. 4.23 there appears to be only one structure. However, the structure is drawn flat. What happens when stereochemistry is considered? There are four hydroxyl groups, all of which could go either up or down. This suggests there are sixteen possible molecules. However, the molecule is highly symmetrical and so the correct answer is likely to be fewer than sixteen.

Working out the right answer is quite fiddly to do by hand, and potentially challenging for automation. Fortunately, the InChI provides a straightforward solution. Generation of the InChI for all sixteen structures shows there are many duplicates. Counting the unique InChI gives the correct answer: seven. This is illustrated in Fig. 4.24.

../../_images/seven_structures.png — Fig. 4.24 The sixteen possible structures for 3,4-bis(1-hydroxyethyl)hex-3-ene-2,5-diol, coloured by unique InChI. There are seven distinct structures.#
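The count of seven can be cross-checked with a small symmetry argument that does not use the InChI at all. The sketch below rests on two assumptions made here, not in the lecture: each of the four stereocentres is labelled R or S, and the only relevant symmetries are the identity and the three twofold rotations of the flat alkene core. Rotations move the four groups between positions without changing their R or S labels. The position labels and function names are illustrative.

```python
from itertools import product

# Positions of the four stereocentres: a1 and a2 sit on one alkene carbon,
# b1 and b2 on the other, with a1 cis to b1 and a2 cis to b2.
# Each symmetry is given as the tuple of source positions for (a1, a2, b1, b2).
ROTATIONS = [
    (0, 1, 2, 3),  # identity
    (1, 0, 3, 2),  # twofold rotation about the C=C axis
    (2, 3, 0, 1),  # twofold rotation swapping cis partners
    (3, 2, 1, 0),  # twofold rotation swapping trans partners
]


def canonical(assignment):
    """Smallest relabelling of an assignment under the assumed rotations."""
    return min(tuple(assignment[i] for i in rotation) for rotation in ROTATIONS)


assignments = list(product("RS", repeat=4))           # 2**4 = 16 drawings
unique = sorted({canonical(a) for a in assignments})  # symmetry-distinct ones

print(len(assignments), "assignments,", len(unique), "distinct stereoisomers")
for stereoisomer in unique:
    print("".join(stereoisomer))
```

Under these assumptions the sixteen assignments collapse to seven, in agreement with the count of unique InChI strings. The symmetry here had to be worked out by hand for this one molecule, whereas the InChI gives the same count without any such analysis.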

Handling large groups of molecules requires an effective way of indexing and distinguishing them. The InChI provides this and tools have been written which enable molecular structures to be manipulated, including the open source cheminformatics software RDKit.

This provides the greater part of the answer to the question of whether two molecules are the same and allows us to automate the handling of large libraries of molecules which is likely to be necessary for the investigation of partial structural information gathered from spectroscopic measurements of distant locations.

It is not a complete answer. Molecules might be different in the dark and cold of interstellar space, and yet indistinguishable when dissolved in water. For example, guanine (Fig. 4.25) is a key fragment of DNA and RNA. The figure shows four slightly different graphical representations, which differ by the attachment of the hydrogen atoms to nitrogen atoms. These structures are all distinct, but when dissolved in water they interconvert so rapidly that they cannot be distinguished. There are many other molecules which are different at low temperatures and identical at higher temperatures. Other molecules may be considered as isolated molecules, but may join together in groups, given the opportunity. Should these still be considered as the same molecule? There are edge cases which are unanswerable without knowing the environment of the molecule. Fortunately, nearly all molecules constructed of elements from the first two rows of the periodic table can be handled effectively by the InChI.

../../_images/guanine.png — Fig. 4.25 Guanine: when dissolved in water, all four of these structures are indistinguishable from each other.#

Determining Structure#

Tracking molecules is essential, but only useful if we can discover molecular structure from data that we have available. How can we get the information needed to work out what the molecular structures are, and how can we do it if the molecules themselves are a long way away?

First, how do we discover the structure of a molecule if we actually have some of the material?

If the material is crystalline, it should be possible to determine the position of every nucleus by X-ray diffraction. This works very well, but requires a good sample of the material. The Cambridge Structural Database provides an index of high-quality crystal structures and currently contains over a million small-molecule structures. This is a large number in absolute terms, but small compared with the total number of small molecules. The strengths of X-ray crystallography are the detail and quality of the primary data. The challenge is the difficulty of collecting very large amounts of these data.

The most common way to determine molecular structures is NMR spectroscopy. When a molecule is placed in a very strong magnetic field, nuclei with suitable spins align with the field. This makes it possible to measure the environment of every such nucleus, and so get a very large amount of information about molecules. Fortunately, both carbon and hydrogen have suitable nuclear spins for this technique. The majority of the hundred million structures in PubChem have been characterised by NMR spectroscopy.

Both of these techniques are very powerful, but neither can be applied to molecules which are present only in very small quantities or a long way away. Such molecules can instead be studied by observing the electromagnetic radiation which passes through them and seeing what is absorbed.

Infra-red radiation, which is low frequency and so low in energy, has the right level of energy to make molecules vibrate. This is very useful information. Infra-red spectroscopy is routinely used in laboratories and gathers valuable information, but usually not enough to identify a molecule precisely, except for rather simple molecules. Higher energy radiation, UV/vis, is able to excite electrons between orbitals. This also provides very useful information about molecules.
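The difference in energy between the two kinds of radiation can be made concrete with the relation \(E = hc/\lambda\). The two wavelengths used below are illustrative choices, not values from the lecture, and the function name is hypothetical. The physical constants are the standard SI values.

```python
PLANCK = 6.62607015e-34      # J s
LIGHT_SPEED = 2.99792458e8   # m / s
AVOGADRO = 6.02214076e23     # 1 / mol


def photon_energy_kj_per_mol(wavelength_m):
    """Energy of one mole of photons, E = N_A * h * c / wavelength, in kJ/mol."""
    return AVOGADRO * PLANCK * LIGHT_SPEED / wavelength_m / 1000.0


# Illustrative wavelengths (assumed, not taken from the lecture).
infrared = photon_energy_kj_per_mol(10e-6)      # 10 micrometres
ultraviolet = photon_energy_kj_per_mol(250e-9)  # 250 nanometres

print(f"infra-red:   {infrared:6.1f} kJ/mol")
print(f"ultraviolet: {ultraviolet:6.1f} kJ/mol")
```

For these wavelengths the ultraviolet photons carry about forty times as much energy as the infra-red ones, which is consistent with infra-red light exciting vibrations while UV/vis light moves electrons between orbitals.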

Provided that the exact spectra of a molecule are already known, this may well be enough to demonstrate that an observed molecule is the same as the known one. If the molecule is not known, it may well be possible to calculate the spectra to a high level of precision. However, these calculations require a significant amount of computer time, so it is necessary to have a fairly good idea of what the structure is.