A Deep Dive Into the Traditional Approaches to DNA Sequencing

Our DNA is made up of four nucleotides: adenine (A), thymine (T), cytosine (C), and guanine (G). All nucleotides contain a five-carbon sugar ring (called the pentose sugar) and a phosphate group, and they differ in their nitrogenous bases (A/T/C/G).

Figure 1: The basic structure of a nucleotide.

The sequence in which these nucleotides appear is read in sets of three by our body. Each set of three consecutive nucleotides is known as a triplet codon, which is further elaborated on in The Genetic Code.
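As a small illustration of reading in threes, the sketch below splits a sequence into consecutive triplets. The function name split_into_codons and the example sequence are made up for illustration. The sketch assumes that reading starts at the first base, and any leftover bases at the end are simply ignored.

```python
def split_into_codons(sequence):
    """Split a nucleotide sequence into consecutive, non-overlapping triplets."""
    sequence = sequence.upper()
    # Ignore trailing bases that do not fill a complete triplet.
    usable_length = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable_length, 3)]


print(split_into_codons("ATGGCCTAA"))  # ['ATG', 'GCC', 'TAA']
```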

One important question to ask ourselves is: how can we identify the nucleotide sequence of a specific gene? This is where DNA sequencing comes in.

We will be looking at Sanger sequencing and Maxam-Gilbert sequencing today.

Sanger Sequencing (Chain Termination Method)

Sanger sequencing was once regarded as the gold standard method for sequencing short fragments of DNA (that is, fewer than 1000 bases), with an astonishing accuracy of 99.99%! This technique was named after Frederick Sanger, the two-time Nobel Laureate in Chemistry who developed the method alongside Steven Nicklen and Alan R. Coulson in 1977.

Use Figure 2 as a guide as you read through the next few paragraphs:

Figure 2: A brief insight into the components and steps of Sanger sequencing. Figure created on BioRender.

The template DNA is the DNA strand we are looking to sequence. It must first be exposed to a high temperature so that the double-stranded template denatures into single strands. A short complementary sequence (a “primer”) then anneals to the single-stranded template. Primers are required because DNA polymerase cannot start a new strand from nothing; it can only extend an existing strand (i.e. the primer). DNA polymerase is the enzyme at the heart of DNA replication: it extends existing strands of DNA by attaching new nucleotides in a sequence complementary to the template strand. Using this same elongation action, DNA polymerase adds nucleotides to the 3’ end of the primer.
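To make "complementary to the template strand" concrete, here is a minimal sketch of the base-pairing rule (A with T, C with G). The function name synthesised_strand and the example template are hypothetical. The sketch assumes the template is written in the 3’ to 5’ direction, so the new strand comes out in the 5’ to 3’ direction in which the polymerase builds it.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesised_strand(template_3_to_5):
    """Return the new strand (5'->3') built against a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


# Hypothetical template, chosen only for illustration.
print(synthesised_strand("TGGATCGAC"))  # ACCTAGCTG
```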

The whole idea of Sanger sequencing is to terminate the growing strand at a specific nucleotide, which produces fragments of different lengths.

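To do this, the reaction uses a mixture of chain-terminating dideoxynucleotide triphosphates (ddNTPs) and regular nucleotides, which are also known as deoxynucleotide triphosphates (dNTPs). In the version of the method described here, the ddNTPs are fluorescently labelled; Sanger and his colleagues originally relied on radioactive labelling.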

Regular nucleotides come in four forms: dATP, dTTP, dCTP and dGTP. Likewise, there are four types of chain-terminating nucleotides: ddATP, ddTTP, ddCTP and ddGTP. Scientists can tell these dideoxynucleotides apart because each one fluoresces with a different colour.

Some of you may be asking – what is the difference between dNTPs and ddNTPs?

Good question. Let’s explore that a little further.

Figure 3: An overview of the difference between the deoxynucleotide triphosphate (dNTP) and dideoxynucleotide triphosphate (ddNTP).

As you can see in Figure 3, dNTPs have a hydroxyl group on carbon-3 of the sugar ring, and this feature is missing in ddNTPs.

This observation is important because the 3’ hydroxyl group of a nucleotide is where the phosphate group of the next nucleotide attaches. Because ddNTPs lack this hydroxyl group, no more nucleotides can be added, which causes a phenomenon known as chain termination.

During elongation by DNA polymerase, either a dNTP or a ddNTP is incorporated at random at each position, starting from the 3’ end of the primer and following a sequence that is complementary to the template strand. If a dNTP is added, the elongation continues. If a ddNTP is added, the elongation comes to a halt and no further nucleotides can be added. The resulting DNA is denatured to produce single-stranded fragments of various lengths, each of which fluoresces in one of the four possible colours depending on the ddNTP at its end.
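The random mix of continuing and terminating can be sketched as a small simulation. Everything here is an illustrative assumption rather than a real reaction protocol: the names extend_primer and simulate_reaction, the example template (written 3’ to 5’), and the dd_fraction parameter, which stands for the chance that any given incorporated nucleotide is a ddNTP.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_primer(template_3_to_5, dd_fraction, rng):
    """Extend one primer along the template.

    Returns the newly added bases (5'->3') and whether a ddNTP ended the chain.
    """
    new_bases = []
    for template_base in template_3_to_5:
        new_bases.append(COMPLEMENT[template_base])
        # With probability dd_fraction the base just added was a ddNTP.
        # It lacks the 3' hydroxyl group, so elongation stops here.
        if rng.random() < dd_fraction:
            return "".join(new_bases), True
    return "".join(new_bases), False


def simulate_reaction(template_3_to_5, n_strands=2000, dd_fraction=0.2, seed=0):
    """Collect the distinct ddNTP-terminated fragments from many extensions."""
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_strands):
        fragment, terminated = extend_primer(template_3_to_5, dd_fraction, rng)
        if terminated:
            fragments.add(fragment)
    return fragments


fragments = simulate_reaction("TGGATCGAC")
for fragment in sorted(fragments, key=len):
    print(len(fragment), fragment, "ends in dd" + fragment[-1])
```

Because termination can happen at any position, running many extensions yields a ladder of fragments, one for each possible stopping point, and the final base of each fragment is the labelled ddNTP.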

After the fragments are produced, electrophoresis is carried out on a polyacrylamide gel. Shorter fragments travel faster and so end up further down the gel, whereas longer fragments travel more slowly and so sit higher up. Finally, researchers read the DNA sequence starting from the bottom of the gel and moving up (i.e. from the shortest fragment to the longest fragment).

With that in mind, can you determine the sequence in step 4 of Figure 2? The sequence obtained should be read as “ACCTAGCTG”.
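Reading from the shortest fragment to the longest amounts to sorting the bands by fragment length and noting the terminal base of each. The sketch below shows this with hypothetical band data chosen to reproduce the sequence quoted above. The function name read_gel and the (length, base) representation are assumptions for illustration, and the lengths count only the bases added after the primer.

```python
def read_gel(bands):
    """Read a sequence from (fragment_length, terminal_base) pairs.

    Sorting by length mimics reading the gel from the bottom (shortest
    fragment) to the top (longest fragment).
    """
    ordered = sorted(bands, key=lambda band: band[0])
    return "".join(base for _, base in ordered)


# Hypothetical bands, deliberately listed in no particular order.
bands = [(4, "T"), (1, "A"), (9, "G"), (2, "C"), (6, "G"),
         (3, "C"), (8, "T"), (5, "A"), (7, "C")]
print(read_gel(bands))  # ACCTAGCTG
```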

Maxam-Gilbert Sequencing (Chemical Cleavage Method)

A few months prior to the emergence of Sanger sequencing, Allan M. Maxam and Walter Gilbert introduced the Maxam-Gilbert method of DNA sequencing. This method involves the preferential cleavage of DNA strands at specific sites through the use of certain chemicals.

Figure 4: An overview of the components and steps involved in Maxam-Gilbert DNA sequencing. Figure created on BioRender.

The single-stranded template sequence is obtained by first denaturing our target DNA using high heat. Alkaline phosphatase is then added to remove the terminal phosphate at the 5’ end of the template. With the original phosphate gone, we can attach a phosphate containing phosphorus-32 (32P) in its place, which radioactively labels the 5’ end of the template. This label lets us detect the radioactive bands by autoradiography after gel electrophoresis has been carried out.

Moving on, different combinations of chemicals are required to cause cleavage at specific sites on DNA, as illustrated in step 3 of Figure 4 and in the table above. Heated piperidine is required in all four reactions because it cleaves the phosphodiester backbone of DNA at the positions where the bases have been chemically modified by either dimethyl sulphate or hydrazine.

Dimethyl sulphate and piperidine cause cleavage at guanine (G) nucleotides only. On the other hand, in the presence of formic acid, dimethyl sulphate and piperidine cleave at both adenine (A) and guanine (G) nucleotides. In contrast, hydrazine and piperidine cause cleavage at cytosine (C) and thymine (T) nucleotides. However, when sodium chloride (NaCl) is present, hydrazine and piperidine will only cleave at cytosine (C) nucleotides.
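The four reactions can be summarised as a lookup from each reaction to the bases it cleaves at, as in the sketch below. The names REACTIONS and cleavage_positions and the example strand are hypothetical. As a simplification, the sketch only reports the positions (counted from the labelled 5’ end) at which each reaction can cut; it does not model the chemistry or the exact length of the resulting fragments.

```python
# Which bases each reaction cleaves at, following the description above.
REACTIONS = {
    "G": {"G"},         # dimethyl sulphate + piperidine
    "A+G": {"A", "G"},  # dimethyl sulphate + piperidine, with formic acid
    "C+T": {"C", "T"},  # hydrazine + piperidine
    "C": {"C"},         # hydrazine + piperidine, with NaCl
}


def cleavage_positions(sequence):
    """Map each reaction to the 1-based positions where it can cut the strand."""
    return {
        reaction: [position
                   for position, base in enumerate(sequence.upper(), start=1)
                   if base in targets]
        for reaction, targets in REACTIONS.items()
    }


# Hypothetical 5'-labelled strand, chosen only for illustration.
for reaction, positions in cleavage_positions("GTTAGCTAC").items():
    print(reaction, positions)
```

Note that every G appears in both the G and A+G reactions and every C in both the C and C+T reactions, which is what makes the four lanes decodable later on.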

The 5’-radioactively labelled fragments produced from chemical cleavage are then run on a polyacrylamide gel as part of the electrophoresis process. Each lane corresponds to one cleavage reaction:

Guanine alone
Adenine + Guanine
Cytosine + Thymine
Cytosine alone

A total of four lanes are run, with the shortest fragment moving the fastest and the longest fragment moving the slowest. After electrophoresis, the radioactive bands on the polyacrylamide gel are made visible by autoradiography. Reading the bands from the bottom of the gel upwards then gives the DNA sequence.

What do the bands in each lane mean? If a band is present on the G lane and the A+G lane, that represents a guanine (G) nucleotide. Meanwhile, if a band is present only on the A+G lane, that represents an adenine (A) nucleotide. Conversely, if a band is present on the C lane and the C+T lane, that represents a cytosine (C) nucleotide. And finally, if a band is present only on the C+T lane, that represents a thymine (T) nucleotide.

Taking that into consideration, can you determine the sequence in step 5 of Figure 4? The sequence obtained should be read as “GTTAGCTAC”.
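The lane-reading rules can be written as a small lookup from "which lanes show a band at this height" to a base. The sketch below uses hypothetical band data chosen to reproduce the sequence quoted above. The names BASE_CALLS and read_autoradiograph are illustrative assumptions, and band positions are simply counted from the bottom of the gel (1 is the lowest band).

```python
# The set of lanes showing a band at a given height identifies the base.
BASE_CALLS = {
    frozenset({"G", "A+G"}): "G",
    frozenset({"A+G"}): "A",
    frozenset({"C", "C+T"}): "C",
    frozenset({"C+T"}): "T",
}


def read_autoradiograph(bands):
    """Decode a sequence from a mapping of lane name to band positions."""
    positions = sorted({p for lane in bands.values() for p in lane})
    sequence = []
    for position in positions:  # bottom of the gel first
        lanes_with_band = frozenset(
            lane for lane, lane_positions in bands.items()
            if position in lane_positions
        )
        # A KeyError here would mean a combination the rules do not cover.
        sequence.append(BASE_CALLS[lanes_with_band])
    return "".join(sequence)


# Hypothetical band positions for the four lanes.
bands = {
    "G": [1, 5],
    "A+G": [1, 4, 5, 8],
    "C+T": [2, 3, 6, 7, 9],
    "C": [6, 9],
}
print(read_autoradiograph(bands))  # GTTAGCTAC
```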

Sanger or Maxam-Gilbert Sequencing – which is preferable?

The Maxam-Gilbert method involves the use of chemicals that are hazardous to human health. In particular, hydrazine is a neurotoxin whilst phosphorus-32 is radioactive. Additionally, the entire process is time-consuming.

Sanger sequencing, on the other hand, has much simpler and safer steps. Additionally, it can be automated. Hence, the Sanger sequencing technique is preferred over Maxam-Gilbert sequencing. Though initially the go-to technique, Maxam-Gilbert sequencing is no longer in widespread use at present.

The function of DNA sequencing – why does it matter?

With this technology available, scientists were able to map the entire human genome by 2003, in an effort known as the Human Genome Project (HGP). With the entire genome mapped out, scientists can now more easily identify mutations in genes that give rise to genetic diseases. This allows us to discover and develop new ways to treat those disorders or minimise their symptoms.

DNA sequencing also allows us to study gene function. We can then manipulate genes to produce more of a desirable product. A good example of this is the Golden Rice Project, which tackles Vitamin A deficiency in developing countries.

Nevertheless, it is important to note that many other DNA sequencing techniques have been developed since Sanger and Maxam-Gilbert; these are collectively known as “next-generation sequencing (NGS)”. NGS is much faster and cheaper than the traditional (“first-generation”) sequencing methods and can be carried out on a large scale.


Author: Amanda Goon Suet Min, BSc Biotechnology


Disclaimer: All figures created using BioRender are intended solely for educational purposes and not for profit.