Molecular Biology

Q: What is molecular biology?

Molecular biology is the branch of biology that studies the molecular mechanisms underlying biological processes — particularly how genetic information stored in DNA is copied, expressed as RNA and protein, and regulated in response to cellular and environmental signals. The field emerged in the mid-twentieth century from the convergence of genetics, biochemistry, and structural biology, and is now foundational to virtually every area of biomedical research. Its central concern is understanding the structure, function, and interactions of the major biological macromolecules — DNA, RNA, and proteins — and how these molecular events underlie cell function, heredity, development, and disease.

Q: What is the central dogma of molecular biology?

The central dogma of molecular biology, articulated by Francis Crick in 1958, describes the general scheme of information transfer in biological systems: DNA is replicated to produce new DNA; DNA is transcribed to produce RNA; RNA is translated to produce protein. The 'dogma' aspect refers to the directionality of information flow — genetic information normally flows from nucleic acid to protein, not in reverse. Retroviruses and retrotransposons represent the exception: they use reverse transcriptase to copy RNA back to DNA. The central dogma is not a law but a framework — it does not exclude all possible information flows but describes the predominant direction in which genetic information moves in cells.

Q: What is the structure of DNA?

DNA (deoxyribonucleic acid) is a double-stranded polymer of deoxyribonucleotides arranged in an antiparallel double helix, as described by Watson and Crick in 1953 based on Rosalind Franklin's X-ray diffraction data. Each strand consists of nucleotides connected by phosphodiester bonds between the 3' carbon of one deoxyribose sugar and the 5' carbon of the next. The two strands are held together by hydrogen bonds between complementary base pairs: adenine (A) pairs with thymine (T) via two hydrogen bonds; guanine (G) pairs with cytosine (C) via three hydrogen bonds. The B-form helix makes a complete turn roughly every 10.5 base pairs (~3.6 nm at 0.34 nm rise per base pair) and has a major groove (~22 Å wide) and a minor groove (~12 Å wide) that provide differential binding surfaces for regulatory proteins. The antiparallel orientation, in which one strand runs 5'→3' and the other 3'→5', is fundamental to replication and transcription directionality.

Q: How does the lac operon regulate gene expression?

The lac operon in Escherichia coli is the paradigmatic example of prokaryotic gene regulation. It controls three genes for lactose metabolism (lacZ for β-galactosidase, lacY for permease, lacA for transacetylase) through dual control: negative regulation by the lac repressor and positive regulation by catabolite activator protein (CAP). When lactose is absent, the lac repressor (encoded by lacI) binds the operator sequence between the promoter and the structural genes, blocking RNA polymerase access, and the operon is off. When lactose is present, its isomer allolactose binds the repressor as an inducer, causing a conformational change that reduces repressor-operator affinity so that RNA polymerase can transcribe the genes. However, even with the repressor removed, transcription is inefficient unless glucose is also absent: low glucose elevates cAMP, which binds CAP, enabling CAP to bind the CAP site upstream of the promoter and strongly stimulate RNA polymerase binding. The operon is maximally expressed only when lactose is present (repressor inactive) AND glucose is absent (CAP active).
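The four glucose/lactose combinations described above can be sketched as a small truth table. This is only a qualitative illustration: the function name `lac_operon_state` and the three output labels are hypothetical, and real expression levels are continuous rather than discrete.

```python
def lac_operon_state(glucose_present: bool, lactose_present: bool) -> str:
    """Qualitative lac operon output for one growth condition."""
    # Allolactose (derived from lactose) inactivates the repressor.
    repressor_bound = not lactose_present
    # Low glucose raises cAMP; cAMP-bound CAP can bind the CAP site.
    cap_active = not glucose_present

    if repressor_bound:
        return "off"        # operator occupied, RNA polymerase blocked
    if cap_active:
        return "maximal"    # repressor released and CAP stimulating
    return "weak"           # repressor released but no CAP stimulation


for glucose in (True, False):
    for lactose in (True, False):
        state = lac_operon_state(glucose, lactose)
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {state}")
```

The repressor check comes first because a bound repressor blocks transcription regardless of whether CAP is active.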

Q: What is PCR and how is it used in molecular biology?

PCR (polymerase chain reaction) is an in vitro method for amplifying a specific DNA sequence exponentially using a thermostable DNA polymerase and two short oligonucleotide primers flanking the target sequence. The reaction cycles through three temperature steps: denaturation (~95°C) separates the double-stranded template; annealing (~55–65°C) allows primers to hybridise to complementary template sequences; extension (~72°C) allows Taq polymerase to synthesise new strands from each primer. After 30–35 cycles, the target sequence has been amplified approximately 10⁶–10⁹-fold from a starting template of even a few molecules. PCR is used in virtually every area of molecular biology: diagnostic testing (COVID-19 RT-PCR, HIV viral load), genotyping, cloning, sequencing library preparation, gene expression analysis (RT-qPCR), forensics, and ancient DNA analysis. The method was developed by Kary Mullis in 1983, and he was recognised with the Nobel Prize in Chemistry in 1993.
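The exponential arithmetic behind the quoted 10⁶–10⁹-fold range can be checked with a short calculation. This is a sketch under a simplifying assumption: each cycle multiplies the target by (1 + efficiency), with a constant per-cycle efficiency. The function name and the efficiency parameter are illustrative; real reactions plateau in late cycles.

```python
def pcr_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold amplification after `cycles` of PCR.

    efficiency = 1.0 means perfect doubling in every cycle;
    lower values model incomplete primer extension (an assumption here).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


for n in (20, 30, 35):
    ideal = pcr_fold_amplification(n)
    realistic = pcr_fold_amplification(n, efficiency=0.9)
    print(f"{n} cycles: ideal {ideal:.2e}, at 90% efficiency {realistic:.2e}")
```

Perfect doubling over 30 cycles gives 2³⁰, a little over 10⁹, which matches the upper end of the range in the text.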

Q: What are restriction enzymes and how are they used in molecular cloning?

Restriction enzymes (restriction endonucleases) are bacterial enzymes that cut double-stranded DNA at specific recognition sequences, typically palindromic sequences of 4–8 base pairs. Type II restriction enzymes (the most commonly used in molecular biology) cut within or adjacent to their recognition sequence, producing either blunt ends or short single-stranded 5' or 3' overhangs (sticky ends), often 4 nucleotides long. In molecular cloning, restriction enzymes cut both a target DNA fragment and a vector (a plasmid or bacteriophage) at compatible sites; the complementary sticky ends allow the fragment and vector to be joined by DNA ligase in a ligation reaction. The recombinant vector can then be introduced into bacteria by transformation; bacteria containing the recombinant plasmid are selected (typically by antibiotic resistance) and grown to amplify the cloned insert. Restriction-ligation cloning has been largely supplemented by newer methods (Gibson assembly, Gateway recombination cloning, and Golden Gate assembly, which itself relies on Type IIS restriction enzymes) in modern molecular biology, but restriction enzymes remain essential for analytical restriction mapping, DNA fingerprinting, and many cloning applications.
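A minimal sketch of what "palindromic recognition sequence" and "digestion" mean in practice is shown below. EcoRI (site GAATTC, cut after the first G) is used as the example enzyme; the helper names and the test sequence are hypothetical, and only the top strand of a linear molecule is modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


def digest(seq: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear top strand at every occurrence of `site`,
    `cut_offset` bases into the site, and return the fragments in order."""
    seq, site = seq.upper(), site.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# EcoRI recognises GAATTC and cuts between G and A (G^AATTC).
print(is_palindromic("GAATTC"))
print(digest("AAGAATTCGGGGAATTCTT", "GAATTC", cut_offset=1))
```

Each internal fragment starts with AATTC on the top strand, which reflects the 4-nucleotide 5' overhang (AATT) that EcoRI leaves on each cut end.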


MOLECULAR BIOLOGY · BIOCHEMISTRY · GENETICS

Molecular Biology

The complete guide to DNA structure and replication, the central dogma, transcription and translation, prokaryotic and eukaryotic gene regulation, recombinant DNA technology, CRISPR genome editing, PCR, gel electrophoresis, epigenetics, RNA biology, and the molecular mechanisms driving medicine, biotechnology, and our understanding of life itself.



Custom University Papers Molecular Biology Team

Specialists in molecular biology, biochemistry, genetics, and biomedical sciences — supporting students across biology, biomedical science, medicine, pharmacy, and related disciplines through assignments, research papers, literature reviews, and dissertations spanning the entire scope of molecular biology from DNA structure to genome editing technology.

In 1953, James Watson and Francis Crick published a 900-word paper in Nature proposing a double-helical structure for DNA that, as they famously understated, “suggests a possible copying mechanism for the genetic material.” That understatement concealed a revolution. The structure explained, in a single elegant molecular model, how genetic information could be faithfully copied (through complementary base pairing), how mutations could arise (through copying errors or chemical damage), and how information encoded in sequence could in principle specify the structure of proteins. In the seven decades since, molecular biology has moved from that first structural insight to sequencing entire genomes in hours, editing any gene in any organism with nucleotide precision, and developing molecular medicines that treat diseases previously untreatable. Understanding molecular biology — its foundational concepts, its experimental tools, and its clinical applications — is now not merely useful for scientists but essential context for medicine, public health, biotechnology policy, and informed citizenship in an era when the molecular basis of life shapes virtually every significant biological advance.

What This Guide Covers

Molecular biology: definition and history
DNA structure and properties
DNA replication: mechanism and fidelity
The central dogma: information flow
Transcription: from DNA to RNA
RNA processing: splicing, capping, polyadenylation
Translation: the genetic code and ribosomes
Prokaryotic gene regulation: operons
Eukaryotic gene regulation: chromatin and TFs
Epigenetics: DNA methylation and histone modification
Recombinant DNA technology
PCR, sequencing, and gel electrophoresis
CRISPR-Cas9 genome editing
Molecular medicine and clinical applications
Frequently asked questions

Molecular Biology — Scope, History, and the Questions That Define the Field

Molecular biology is the study of biological processes at the molecular level — primarily the structure, function, and interactions of the nucleic acids (DNA and RNA) and proteins that carry, express, and regulate genetic information. It is both a discipline in its own right and a foundation for virtually every area of modern biology and medicine. Microbiology, cell biology, genetics, developmental biology, neuroscience, pharmacology, and cancer biology all depend on molecular biological understanding to interpret their observations and design their experiments.

The field took shape in the 1940s and 1950s through a series of landmark discoveries that established what genetic information is and how it works: Avery, MacLeod, and McCarty’s 1944 demonstration that DNA (not protein) is the transforming principle in bacteria; Chargaff’s 1950 base composition rules (A = T, G = C in any DNA sample); Hershey and Chase’s 1952 phage experiment confirming DNA as the genetic material; and Watson and Crick’s 1953 double helix model. The following two decades established the central dogma, cracked the genetic code, characterised the ribosome, identified restriction enzymes, and produced the first recombinant DNA molecules — laying the technical foundation for the biotechnology industry and modern genomic medicine.

1953: Watson and Crick’s double helix, the structural insight that made the molecular mechanisms of heredity and mutation comprehensible

3.2 Gb: Human genome size. About 3.2 billion base pairs encoding approximately 20,000 protein-coding genes; the Human Genome Project declared the sequence essentially complete in 2003

64: Codons in the universal genetic code, encoding 20 standard amino acids plus start and stop signals, with redundancy that buffers many point mutations

CRISPR: Genome editing technology recognised with the Nobel Prize in Chemistry in 2020, enabling targeted editing of genes in a wide range of organisms with precision previously impossible

DNA Structure — The Double Helix, Base Pairing, and the Chemical Basis of Heredity

DNA (deoxyribonucleic acid) is the molecule that stores genetic information in all cellular life forms and many viruses (others use RNA genomes). Its structure, elucidated by Watson and Crick in 1953 using X-ray diffraction data primarily from Rosalind Franklin and Maurice Wilkins, reveals a molecular architecture perfectly suited to its biological functions: faithful copying through semiconservative replication, long-term information storage in a chemically stable form, and encoding of information in sequence that can be read and regulated by proteins.

The Chemical Components of DNA

Each nucleotide monomer of DNA consists of three components: a deoxyribose sugar (lacking the 2′-OH of ribose in RNA), a phosphate group (connecting the 3′ carbon of one nucleotide to the 5′ carbon of the next via a phosphodiester bond), and one of four nitrogenous bases — the purines adenine (A) and guanine (G), and the pyrimidines thymine (T) and cytosine (C). The phosphodiester backbone is uniformly charged (negative) and forms the structural framework of each strand; the bases project inward and are responsible for sequence-specific interactions — both with the complementary strand and with regulatory proteins that read the DNA sequence.

DNA double helix: key structural parameters and base pairing rules (Structural Biology Reference)

B-FORM HELIX (predominant in cells):
  Diameter:          ~2 nm (20 Å)
  Rise per base pair: 0.34 nm (3.4 Å)
  Helix pitch:       ~3.6 nm (one full turn = ~10.5 bp × 0.34 nm)
  Major groove:      ~2.2 nm wide — principal binding site for sequence-specific proteins
  Minor groove:      ~1.2 nm wide — binding site for some drugs (netropsin, actinomycin D)
  Strand orientation: Antiparallel (one 5′→3′, complementary 3′→5′)

WATSON-CRICK BASE PAIRS:
  A  ···  T  (adenine–thymine):   2 hydrogen bonds  — AT/TA  (weaker)
  G  ···  C  (guanine–cytosine):  3 hydrogen bonds  — GC/CG  (stronger)

  Chargaff rules: [A] = [T]; [G] = [C] in any dsDNA sample
  %GC content varies by organism: bacteria 25–75%; human genome ~41%
  Higher %GC → higher melting temperature (more H-bonds per bp)

DNA TOPOLOGY:
  Relaxed circular DNA:     no supercoiling tension
  Positively supercoiled:   overwound (forms ahead of replication fork)
  Negatively supercoiled:   underwound (predominant in cells; facilitates strand separation)
  Topoisomerase I:          type I enzyme; relaxes supercoils (bacterial Topo I relaxes negative supercoils only; eukaryotic Topo I relaxes both)
  DNA gyrase:               bacterial type II topoisomerase; introduces negative supercoils (antibacterial target)
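
The base-pairing rules in the reference block above can be made concrete with a few lines of code. This is an illustrative sketch only: the helper names and the nine-base example strand are hypothetical, and counting hydrogen bonds is a simplification, since base stacking also contributes to duplex stability.

```python
from collections import Counter

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}   # per Watson-Crick base pair


def partner_strand(strand: str) -> str:
    """Complementary strand, written 5'->3' (hence the reversal)."""
    return "".join(PAIR[base] for base in reversed(strand.upper()))


def gc_fraction(strand: str) -> float:
    s = strand.upper()
    return (s.count("G") + s.count("C")) / len(s)


def hydrogen_bonds(strand: str) -> int:
    """Total Watson-Crick hydrogen bonds in the duplex formed by `strand`."""
    return sum(H_BONDS[base] for base in strand.upper())


top = "ATGCGGCTA"
bottom = partner_strand(top)
counts = Counter(top + bottom)          # base composition of the duplex

print(bottom)
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # Chargaff
print(round(gc_fraction(top), 3), hydrogen_bonds(top))
```

Chargaff's rules fall out automatically for the duplex because every A on one strand is matched by a T on the other, and every G by a C.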

Chromatin — Packaging DNA in the Eukaryotic Nucleus

A human cell must fit approximately 2 metres of linear DNA into a nucleus ~6 μm in diameter, a packaging challenge that requires ~10,000-fold compaction. This is achieved through successive levels of DNA organisation, beginning with the nucleosome: approximately 147 bp of DNA wrapped around an octamer of histone proteins (two each of H2A, H2B, H3, and H4) to form the ~11-nm “beads-on-a-string” nucleosomal array. Linker histone H1 associates with the linker DNA between nucleosomes, facilitating further folding into 30-nm fibres and higher-order chromatin domains. At larger scales, DNA is organised into topologically associating domains (TADs), chromatin domains of up to about a megabase within which sequences preferentially contact one another, and into A and B compartments that segregate active (A) and inactive (B) chromatin regions. Condensed, transcriptionally silent chromatin is called heterochromatin; open, transcriptionally accessible chromatin is euchromatin. Nucleosome positioning and histone modification state critically regulate gene expression by controlling transcription factor and RNA polymerase access to DNA, which is the molecular basis of epigenetic regulation.
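The "2 metres" figure can be recovered from numbers already given in this guide (3.2 Gb per haploid genome, 0.34 nm rise per base pair). The sketch below assumes a diploid nucleus and a nucleosome repeat length of about 200 bp (147 bp core plus linker); the repeat length is an assumption made for illustration, not a value taken from the text.

```python
RISE_NM_PER_BP = 0.34            # B-form rise per base pair
HAPLOID_GENOME_BP = 3.2e9        # human genome size quoted in this guide
DIPLOID_BP = 2 * HAPLOID_GENOME_BP
NUCLEOSOME_REPEAT_BP = 200       # assumption: ~147 bp core + ~50 bp linker

# Total end-to-end length of the DNA in one diploid nucleus, in metres.
contour_length_m = DIPLOID_BP * RISE_NM_PER_BP * 1e-9

# Rough number of nucleosomes needed to package that DNA.
nucleosomes = DIPLOID_BP / NUCLEOSOME_REPEAT_BP

print(f"DNA contour length: {contour_length_m:.2f} m")
print(f"Approximate nucleosomes per nucleus: {nucleosomes:.1e}")
```

Under these assumptions the contour length comes out at a little over 2 m, packaged by tens of millions of nucleosomes.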

DNA Replication — Semiconservative Copying, the Replisome, and Maintaining Fidelity

Watson and Crick’s 1953 paper noted that the complementary base-pairing of the double helix “immediately suggests a possible copying mechanism” — if the two strands separate, each could serve as a template for synthesis of a new complementary strand, producing two identical daughter duplexes. This semiconservative replication model was confirmed by Meselson and Stahl’s elegant 1958 experiment using density-labelled nitrogen isotopes, which showed that each daughter DNA molecule retains one parental strand and contains one newly synthesised strand. The molecular machinery that executes this copying — the replisome — is one of the most sophisticated biological machines known.

Origins of Replication

Replication begins at defined chromosomal sequences called origins of replication. The simple bacterium E. coli has a single origin (oriC) on its circular chromosome, where the DnaA initiator protein binds, melts the duplex, and recruits the replicative helicase (DnaB). The human genome has approximately 30,000–50,000 origins — necessary because eukaryotic chromosomes are too large to replicate from a single origin within the time constraints of S phase. Origins are licensed for replication by loading the MCM2-7 helicase complex during G1; firing of licensed origins is triggered during S phase by CDK2-cyclin E/A and DDK (Dbf4-dependent kinase) activities. Each origin, once fired, replicates bidirectionally — two replication forks travel in opposite directions from each origin.

Replication Fidelity — Three Levels

The remarkable accuracy of DNA replication (~1 error per 10⁹–10¹⁰ base pairs) is achieved through three sequential mechanisms. First, nucleotide selection: the induced-fit conformational change of the polymerase fingers domain preferentially incorporates correctly base-paired dNTPs (~1 error in 10⁵). Second, the 3′→5′ proofreading exonuclease activity of the replicative polymerase removes incorrectly incorporated nucleotides before the next addition (~100-fold improvement). Third, mismatch repair (MMR): post-replicative scanning by MutS/MutL proteins identifies and corrects residual mismatches (~100-fold further improvement). The three contributions multiply, giving an overall error rate of roughly one per 10⁹–10¹⁰ base pairs, about 10,000-fold lower than nucleotide selection alone would achieve.
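Because the three fidelity mechanisms act in sequence, their contributions multiply. The short calculation below uses the approximate values quoted above; the variable names are illustrative, and the 3.2 Gb genome size is the haploid human figure used elsewhere in this guide.

```python
selection_error = 1e-5         # errors per base after nucleotide selection
proofreading_gain = 100        # fold improvement from 3'->5' exonuclease
mismatch_repair_gain = 100     # fold improvement from MutS/MutL repair

after_proofreading = selection_error / proofreading_gain
after_mismatch_repair = after_proofreading / mismatch_repair_gain

haploid_genome_bp = 3.2e9
errors_per_genome_copy = after_mismatch_repair * haploid_genome_bp

print(f"After proofreading:     {after_proofreading:.0e} errors per base")
print(f"After mismatch repair:  {after_mismatch_repair:.0e} errors per base")
print(f"Expected errors per 3.2 Gb copied: {errors_per_genome_copy:.1f}")
```

With these round numbers, copying one haploid human genome would be expected to introduce only a handful of errors.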

The Meselson-Stahl Experiment — Proof of Semiconservative Replication

In 1958, Matthew Meselson and Franklin Stahl grew E. coli in medium containing heavy nitrogen (¹⁵N) until all cellular DNA was fully labelled with ¹⁵N. They then transferred cells to normal (¹⁴N) medium and allowed one, two, and three rounds of replication. After centrifuging DNA in a caesium chloride density gradient (which separates DNA by buoyant density), the results were unambiguous: after one replication, all DNA migrated at an intermediate density (one heavy strand, one light strand), as semiconservative copying predicts. After two replications, DNA appeared at two positions, intermediate and light, in a 1:1 ratio, a pattern consistent only with semiconservative replication. The experiment excluded conservative replication (which would show separate heavy and light bands, and no intermediate band, after one round) and dispersive replication (which would show a single band after every round, shifting lighter with each successive division).

The Meselson-Stahl experiment is frequently cited as “the most beautiful experiment in biology” — it used an elegant physical technique (density gradient centrifugation) to distinguish between three mechanistically distinct models with a single experiment, producing results that were definitive, visually clear, and immediately interpretable.
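
The band pattern predicted by the semiconservative model can be reproduced with simple bookkeeping. The sketch below is a hypothetical illustration (the function name and labels are not from the original experiment): it starts from one fully heavy duplex and tracks how many heavy, intermediate, and light duplexes exist after each round in ¹⁴N medium.

```python
from fractions import Fraction


def semiconservative_bands(generations: int) -> dict[str, Fraction]:
    """Fractions of heavy (15N/15N), intermediate (15N/14N) and light
    (14N/14N) duplexes after growth in 14N medium, starting fully heavy."""
    heavy, hybrid, light = 1, 0, 0
    for _ in range(generations):
        # Every parental strand templates a new light strand:
        # heavy duplex -> 2 hybrids; hybrid -> 1 hybrid + 1 light;
        # light duplex -> 2 light.
        heavy, hybrid, light = 0, 2 * heavy + hybrid, hybrid + 2 * light
    total = heavy + hybrid + light
    return {
        "heavy": Fraction(heavy, total),
        "intermediate": Fraction(hybrid, total),
        "light": Fraction(light, total),
    }


for gen in range(4):
    bands = semiconservative_bands(gen)
    print(gen, {name: str(frac) for name, frac in bands.items()})
```

The model gives a single intermediate band after one round and a 1:1 intermediate-to-light ratio after two, matching the observations described above.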

The Central Dogma — Information Flow from DNA to RNA to Protein

Francis Crick coined the term “central dogma” in 1958 to describe what was then a hypothesis about the directionality of biological information transfer. In its most general form, it states: information stored in nucleic acid sequences can be transferred to other nucleic acid sequences or to protein sequences, but information in protein sequences cannot be transferred back to nucleic acids. The specific transfers that normally occur in cells are: DNA replication (DNA → DNA), transcription (DNA → RNA), and translation (RNA → protein). The transfer RNA → DNA (reverse transcription by retroviral reverse transcriptase) is a known exception. Protein → nucleic acid and protein → protein transfers have never been demonstrated for sequence information under normal cellular conditions (though prion propagation involves protein-directed protein conformational change without sequence transfer).

DNA Replication — Faithful Copying of the Genome

Before cell division, the entire genome must be duplicated. DNA polymerase uses each parental strand as a template to synthesise a complementary daughter strand, producing two identical daughter duplexes. The fidelity of replication is approximately 1 error per 10⁹–10¹⁰ nucleotides — essential for maintaining genome integrity across trillions of cell divisions in a human lifetime. Replication occurs during S phase of the cell cycle and is tightly coupled to cell cycle checkpoints that verify completion before division.

Transcription — Converting DNA Information into RNA

RNA polymerase reads the template (antisense) DNA strand 3′→5′ and synthesises a complementary RNA strand 5′→3′. In eukaryotes, three RNA polymerases divide transcription: Pol I transcribes ribosomal RNA genes; Pol II transcribes protein-coding genes (mRNA) and most non-coding RNAs; Pol III transcribes tRNA, 5S rRNA, and small non-coding RNAs. Transcription is the primary control point for gene expression: the rate of transcription initiation determines how much of any given mRNA is present in the cell, and therefore how much of the encoded protein can be made.
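The strand relationships described above can be shown with a tiny example. The function name and the nine-base template are hypothetical; the template is written 3′→5′ so that the RNA can be read off left to right in the 5′→3′ direction.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


template = "TACCGAACC"          # template (antisense) strand, 3'->5'
coding = "ATGGCTTGG"            # coding (sense) strand, 5'->3'

mrna = transcribe(template)
print(mrna)
# The RNA matches the coding strand, with U in place of T.
print(mrna == coding.replace("T", "U"))
```

This is why sequence databases usually report the coding strand: it can be read as the mRNA directly by substituting U for T.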

RNA Processing — Preparing mRNA for Translation

In eukaryotes, the primary transcript (pre-mRNA) requires extensive processing before translation: 5′ capping (addition of a 7-methylguanosine cap that protects against degradation and aids ribosome recruitment), 3′ polyadenylation (addition of a ~200 nucleotide poly-A tail that aids export and stability), and splicing (removal of introns and joining of exons by the spliceosome). Alternative splicing of the same pre-mRNA can produce multiple different protein isoforms from a single gene — greatly expanding proteome diversity beyond the ~20,000 protein-coding genes in the human genome.

Translation — Decoding mRNA Sequence into Protein

The ribosome reads the mRNA sequence in triplet codons (5′→3′) and synthesises the encoded polypeptide chain. Each codon specifies a particular amino acid (or a start/stop signal) via the universal genetic code. Aminoacyl-tRNAs (tRNA molecules covalently linked to their cognate amino acid by aminoacyl-tRNA synthetases) serve as adaptors, presenting the correct amino acid to the ribosome’s A-site when the anticodon of the tRNA matches the codon of the mRNA. After synthesis, the polypeptide is folded (assisted by chaperone proteins), post-translationally modified, and targeted to its correct cellular location.

Transcription — Initiating, Elongating, and Terminating RNA Synthesis

Transcription is the synthesis of an RNA molecule using a DNA template, catalysed by RNA polymerase (RNAP). Unlike DNA polymerase, RNA polymerase can initiate RNA synthesis de novo: it does not require a primer with a free 3′-OH, because its active site can position the initiating NTP and catalyse formation of the first phosphodiester bond with the second NTP. As in every subsequent addition, pyrophosphate is released from the incoming NTP and hydrolysed by cellular pyrophosphatase, which drives the reaction forward. RNA polymerase also has lower fidelity than DNA polymerase (~1 error per 10⁵ nucleotides) and lacks a dedicated 3′→5′ proofreading exonuclease, which is acceptable because RNA is a transient product rather than a heritable record.

Prokaryotic Transcription

In bacteria, a single RNA polymerase (core enzyme: α₂ββ′ω) is responsible for transcribing all RNA species. The sigma (σ) factor confers promoter recognition specificity: different sigma factors direct the core RNAP to different promoter classes, allowing the cell to shift gene expression programmes in response to stress (σ⁷⁰ for exponential growth; σ³² for heat shock; σ⁵⁴ for nitrogen limitation). Prokaryotic promoters have two conserved sequence elements: the –10 element (consensus TATAAT) and –35 element (consensus TTGACA) upstream of the transcription start site. σ factor contacts these elements and positions RNAP, then is released after initiation. Termination occurs either by Rho-independent (intrinsic) termination, in which a GC-rich hairpin followed by a run of U residues causes RNAP to pause and dissociate, or by Rho-dependent termination, where the Rho helicase tracks the mRNA and dislodges RNAP at pause sites.
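As a rough illustration of how the –35 and –10 elements define a promoter, the sketch below scans a sequence for exact matches to both consensus hexamers. Two things here are assumptions rather than statements from the text: the 15–19 bp spacer window between the elements, and the requirement for exact consensus matches (real promoters usually deviate from consensus, so a practical search would score partial matches instead).

```python
import re

# -35 consensus, a spacer of 15-19 bp (assumed window), then -10 consensus.
PROMOTER = re.compile(r"TTGACA([ACGT]{15,19})TATAAT")


def find_consensus_promoters(seq: str) -> list[tuple[int, int]]:
    """Return (start of -35 element, spacer length) for each exact match."""
    return [(m.start(), len(m.group(1)))
            for m in PROMOTER.finditer(seq.upper())]


# Hypothetical test sequence: 2 bp, -35 box, 17 bp spacer, -10 box, 7 bp.
example = "GG" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "GCGCGCA"
print(find_consensus_promoters(example))
```

The function reports where the –35 element begins and how long the spacer is, which is enough to locate the –10 element and the approximate transcription start site downstream of it.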

Prokaryotic vs Eukaryotic Transcription

RNA Polymerase
  Prokaryotic: Single RNAP core enzyme; sigma factor confers promoter specificity. No nuclear membrane, so translation can begin co-transcriptionally (coupled transcription-translation).
  Eukaryotic:  Three nuclear RNA polymerases (Pol I, II, III) with distinct roles. Pol II transcribes mRNA; requires general transcription factors (TFIIA, B, D, E, F, H) for PIC assembly at the promoter.

Promoter Elements
  Prokaryotic: –10 (TATAAT) and –35 (TTGACA) consensus sequences recognised by sigma factor. Most bacterial promoters 20–200 bp upstream of gene start.
  Eukaryotic:  TATA box (~30 bp upstream), initiator element (Inr), downstream promoter element (DPE). Distal enhancers (up to Mb away) loop to contact the promoter via the Mediator co-activator complex.

mRNA Processing
  Prokaryotic: None: bacterial mRNAs are not capped, spliced, or polyadenylated. Transcription and translation occur simultaneously in the cytoplasm.
  Eukaryotic:  Extensive: 5′ 7-methylguanosine cap, 3′ polyadenylation (~200 nt poly-A tail), splicing of introns by the spliceosome, nuclear export. Processing occurs co-transcriptionally in the nucleus.

Termination
  Prokaryotic: Intrinsic (hairpin + U-run) or Rho-dependent termination. Termination sequences are relatively simple and well-defined.
  Eukaryotic:  Torpedo model: cleavage at the poly-A signal triggers 5′→3′ degradation of the downstream RNA by XRN2, which catches and dislodges Pol II; cleavage/polyadenylation coupled to termination.

RNA Processing — Splicing, the Spliceosome, and Expanding the Proteome

The discovery of introns in 1977 by Richard Roberts and Phillip Sharp (Nobel Prize 1993) revealed that eukaryotic genes are discontinuous: protein-coding sequences (exons) are interrupted by non-coding sequences (introns) that must be precisely removed from the pre-mRNA before translation. This process — RNA splicing — is catalysed by the spliceosome, one of the most complex macromolecular machines in the cell, and it fundamentally changes the relationship between gene number and protein diversity.

RNA Splicing

The Spliceosome — Five snRNPs, One Machine

The spliceosome is assembled from five small nuclear ribonucleoprotein complexes (snRNPs) — U1, U2, U4, U5, and U6 — each containing a snRNA and associated proteins. Assembly is sequential: U1 snRNP recognises the 5′ splice site; U2 snRNP binds the branch point adenosine (typically ~20–50 nt upstream of the 3′ splice site); U4/U6 and U5 join to form the active spliceosome. Two transesterification reactions (catalysed by the RNA component — the spliceosome is a ribozyme) remove the intron as a lariat structure and join the flanking exons. The entire spliceosome disassembles and recycles after each splicing event.

Proteome Diversity

Alternative Splicing — One Gene, Many Proteins

A single pre-mRNA can be spliced in different ways in different cell types, developmental stages, or in response to signals, producing multiple mRNA isoforms that encode different protein variants. This alternative splicing greatly expands the proteome from a fixed gene number, because independent splicing choices combine multiplicatively: it is estimated that >90% of human multi-exon genes undergo alternative splicing. The neurexin gene family exemplifies extreme alternative splicing: neurexins can generate thousands of protein isoforms from three genes through a combination of alternative promoters, alternative exon inclusion, and alternative 3′ splice site use. Alternative splicing is regulated by splicing regulators (SR proteins promoting exon inclusion; hnRNP proteins promoting exclusion) that bind exonic and intronic splicing enhancers and silencers.
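The way a handful of splicing choices multiplies into many isoforms can be sketched by enumeration. The example below assumes a hypothetical gene with fixed first and last exons and three cassette exons that are each included or skipped independently; real genes are more constrained, so this is an upper bound on diversity rather than a prediction.

```python
from itertools import product


def cassette_isoforms(first_exon: str, cassettes: list[str],
                      last_exon: str) -> list[list[str]]:
    """Enumerate exon combinations when each cassette exon is
    independently included or skipped (an idealised assumption)."""
    isoforms = []
    for choices in product((True, False), repeat=len(cassettes)):
        included = [exon for exon, keep in zip(cassettes, choices) if keep]
        isoforms.append([first_exon, *included, last_exon])
    return isoforms


isoforms = cassette_isoforms("E1", ["E2", "E3", "E4"], "E5")
print(len(isoforms))
for isoform in isoforms:
    print("-".join(isoform))
```

With n independent cassette exons the count is 2ⁿ, so three cassettes already give eight possible mRNAs from one gene.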

5′ Processing

The 5′ Cap — Translation Initiation and mRNA Stability

Within seconds of transcription initiation, the 5′ end of the nascent RNA is modified by addition of a 7-methylguanosine cap via an unusual 5′–5′ triphosphate linkage. The cap is added co-transcriptionally when the transcript is approximately 20–30 nucleotides long, by capping enzyme recruited by the phosphorylated CTD of RNA Pol II. The cap serves multiple functions: it protects the mRNA from 5′→3′ exonuclease degradation, is recognised by the cap-binding complex (CBC) for nuclear export, is recognised by eIF4E for ribosome recruitment during translation initiation, and marks the mRNA as a legitimate cellular transcript (distinguishing it from viral or aberrant RNAs).

3′ Processing

Polyadenylation — Adding the Poly-A Tail

After cleavage of the pre-mRNA at a polyadenylation signal (typically the hexanucleotide AAUAAA ~10–30 nt upstream of the cleavage site), poly-A polymerase adds approximately 150–250 adenosine residues to the 3′ end without a template. The poly-A tail binds poly-A binding protein (PABP), which protects the mRNA from 3′→5′ degradation, stimulates translation by circularising the mRNA through eIF4G-PABP interaction (promoting ribosome recycling), and facilitates nuclear export. Alternative polyadenylation, the selection of different cleavage and polyadenylation signals in a pre-mRNA, produces mRNA isoforms with different 3′ untranslated regions (UTRs) that differ in stability, translation efficiency, and microRNA responsiveness.

Non-Coding RNA

miRNA, lncRNA, and the Non-Coding Transcriptome

The majority of the human genome is transcribed, but the majority of transcripts are not translated. MicroRNAs (miRNAs, ~22 nt) are processed from hairpin precursors by Drosha and Dicer, then loaded into the RISC complex where they guide sequence-specific binding to 3′ UTRs of target mRNAs, causing translational repression or mRNA degradation. Each miRNA can regulate hundreds of targets; each mRNA has multiple miRNA binding sites — creating a complex regulatory network. Long non-coding RNAs (lncRNAs, >200 nt) regulate gene expression through diverse mechanisms: scaffolding chromatin-remodelling complexes (XIST in X-chromosome inactivation), enhancer RNAs, and competing endogenous RNAs. The regulatory capacity of the non-coding transcriptome is now understood to be extensive and essential to normal development and physiology.

RNA Surveillance

Nonsense-Mediated Decay — Quality Control for mRNA

Nonsense-mediated mRNA decay (NMD) is a cellular quality control pathway that detects and degrades mRNAs containing premature termination codons (PTCs), preventing the translation of potentially dominant-negative truncated proteins. NMD depends on the exon junction complex (EJC) deposited on mRNA at exon-exon junctions during splicing: a ribosome terminating at a PTC more than ~50–55 nt upstream of an exon-exon junction leaves a downstream EJC in place, which triggers NMD by activating UPF1/2/3-mediated mRNA decapping and degradation. NMD matters for understanding disease because many disease-causing nonsense mutations are subject to it: the severity of some genetic diseases (e.g., cystic fibrosis, Duchenne muscular dystrophy) depends partly on whether the mutant transcript escapes or is eliminated by NMD.
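The ~50–55 nt positional rule described above can be written as a one-line predicate. The sketch below is a simplified heuristic, not a model of the full pathway: the function name, the coordinate convention (positions counted in nucleotides along the spliced mRNA), and the default threshold of 50 nt are all assumptions made for illustration.

```python
def predicted_nmd_target(ptc_position: int,
                         junction_positions: list[int],
                         threshold_nt: int = 50) -> bool:
    """True if the premature stop codon lies more than `threshold_nt`
    upstream of at least one exon-exon junction (positional rule only)."""
    return any(junction - ptc_position > threshold_nt
               for junction in junction_positions)


junctions = [150, 420]          # hypothetical exon-exon junction positions

print(predicted_nmd_target(300, junctions))   # 120 nt before last junction
print(predicted_nmd_target(400, junctions))   # only 20 nt before it
print(predicted_nmd_target(500, junctions))   # in the last exon
```

Under this rule, stop codons in the last exon or close to the final junction are predicted to escape NMD, which is one reason the same type of mutation can have different consequences at different positions in a gene.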

Translation — The Ribosome, the Genetic Code, and Protein Synthesis

Translation is the decoding of the mRNA nucleotide sequence into the amino acid sequence of a polypeptide. It is the most energy-intensive process in the cell — a rapidly growing bacterial cell devotes approximately 80% of its total biosynthetic capacity to ribosome production and protein synthesis. The ribosome is the molecular machine at the centre of translation: a two-subunit ribonucleoprotein complex (small subunit + large subunit) that reads the mRNA in triplet codons, recruits aminoacyl-tRNAs carrying the appropriate amino acid, and catalyses peptide bond formation between successive amino acids through peptidyl transferase activity — which, like the spliceosome, is RNA-catalysed (the peptidyl transferase activity resides in the large subunit ribosomal RNA, making the ribosome a ribozyme).

The Genetic Code — 64 Codons, 20 Amino Acids, and Degeneracy

The genetic code maps all 64 possible triplet codons to either one of the 20 standard amino acids or to a stop signal. Because there are 64 codons but only 20 amino acids, the code is degenerate — multiple codons (synonymous codons) specify the same amino acid. Most amino acids are encoded by 2–4 codons; arginine, leucine, and serine each have 6 codons. Only methionine (AUG, also the start codon) and tryptophan (UGG) are encoded by a single codon. Three codons (UAA, UAG, UGA) do not specify amino acids but signal termination of translation.

The code is read in a continuous, non-overlapping, non-punctuated series of triplets from the AUG start codon. The reading frame (which triplet grouping is used) is set by the initiator AUG and maintained by the ribosome throughout elongation. Frameshift mutations (insertions or deletions of a number of nucleotides that is not a multiple of 3) shift the reading frame and change the identity of all downstream codons, typically introducing a premature stop codon and yielding a truncated, non-functional protein.

The genetic code is nearly universal — the same codon table applies from bacteria to humans — with only minor variations in mitochondria and some protists. This universality both confirms the common ancestry of all life and enables the expression of genes from one organism in another (heterologous expression) — the basis of recombinant protein production and gene therapy.

Prokaryotic Gene Regulation — Operons, Repressors, and Metabolic Responsiveness

Gene regulation allows cells to adjust protein synthesis in response to changing environmental conditions — producing metabolic enzymes only when their substrates are available, and repressing energetically costly biosynthetic pathways when their end-products are abundant. In prokaryotes, regulation at the transcriptional level is achieved primarily through operons — clusters of functionally related genes transcribed as a single polycistronic mRNA unit, controlled by shared regulatory sequences including the promoter and operator.

The lac operon: dual control by the lac repressor and catabolite activator protein (CAP)

LAC OPERON STRUCTURE:
  lacI  — repressor gene (constitutively expressed)
  P     — promoter (RNA Pol binding site)
  CAP site — upstream activator sequence
  O     — operator (repressor binding site, overlaps P)
  lacZ  — β-galactosidase (cleaves lactose → glucose + galactose)
  lacY  — permease (lactose import)
  lacA  — transacetylase (acetylates toxic galactosides)

REGULATION — FOUR CONDITIONS:

  Glucose present, Lactose absent:     OFF
    Repressor bound to operator → RNAP blocked
    Glucose keeps cAMP low → CAP inactive

  Glucose present, Lactose present:     Weak ON
    Allolactose binds repressor → repressor released from operator
    But high glucose keeps cAMP low → CAP inactive → low transcription

  Glucose absent, Lactose absent:      OFF
    Repressor bound to operator → no transcription despite CAP being active

  Glucose absent, Lactose present:      MAXIMUM ON
    No glucose → adenylyl cyclase active → high cAMP → CAP-cAMP binds CAP site
    Allolactose → repressor released → operator free
    CAP-cAMP recruits RNAP to promoter → ~50× stimulation of transcription
    Cell uses available lactose as carbon source efficiently
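
The four conditions above reduce to two independent switches, which can be sketched as a small truth-table function. The function name `lac_operon_state` and the three output labels are illustrative; real expression levels are continuous rather than three discrete states.

```python
def lac_operon_state(glucose_present, lactose_present):
    """Qualitative lac operon output for a given combination of sugars."""
    repressor_bound = not lactose_present   # allolactose releases the repressor
    cap_active = not glucose_present        # low glucose -> high cAMP -> CAP-cAMP forms
    if repressor_bound:
        return "OFF"                        # operator blocked regardless of CAP
    return "MAXIMUM ON" if cap_active else "WEAK ON"


for glucose in (True, False):
    for lactose in (False, True):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> "
              f"{lac_operon_state(glucose, lactose)}")
```

Writing it this way makes the asymmetry explicit: the repressor acts as a veto, while CAP only scales transcription once the operator is free.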

Other important prokaryotic regulatory mechanisms include the following. The trp operon is regulated by a repressor that is activated (not inactivated) by its end-product, tryptophan: it is a biosynthetic operon under end-product repression, the opposite of inducible catabolic operons. Attenuation is a transcription termination mechanism in which the secondary structure formed by the nascent leader RNA depends on how quickly the ribosome translates the leader peptide, and therefore on amino acid availability, fine-tuning the termination decision before the structural genes are reached. Riboswitches are RNA elements, usually in the 5′ UTR of mRNAs, that directly bind small-molecule metabolites and undergo conformational changes that alter transcription termination or translation initiation. They are a protein-independent regulatory mechanism found mainly in bacteria, with some examples (such as thiamine pyrophosphate riboswitches) in plants and fungi.

Eukaryotic Gene Regulation — Transcription Factors, Enhancers, and Chromatin Remodelling

Eukaryotic gene regulation is vastly more complex than prokaryotic regulation, reflecting the greater genomic complexity, the separation of transcription and translation by the nuclear envelope, and the requirements of multicellular development, in which hundreds of distinct cell types must each express specific subsets of the roughly 20,000 protein-coding genes in the genome. Regulation occurs at every step from chromatin structure to post-translational modification, but transcriptional regulation (controlling the rate of RNA Pol II initiation at gene promoters) remains the primary and best-understood level.

Transcription Factors — Sequence-Specific Regulators

Sequence-specific transcription factors (TFs) bind defined DNA sequences through structural domains — zinc fingers, helix-turn-helix, basic leucine zipper (bZIP), basic helix-loop-helix (bHLH) — and activate or repress transcription by recruiting coactivator/corepressor complexes. The human genome encodes approximately 1,600 TFs. Activators recruit the Mediator co-activator complex that bridges TFs to RNA Pol II; they also recruit histone acetyltransferases (HATs) that acetylate histone lysines, relaxing chromatin. Repressors recruit histone deacetylases (HDACs) and histone methyltransferases (HMTs) that compact chromatin and reduce accessibility. TF combinatorial binding — multiple TFs binding to an enhancer — generates the cell-type specificity of gene expression: each TF has broad genomic binding but its combination with cell-type-specific TF partners determines which genes are actually activated.

Enhancers — Remote Control of Transcription

Enhancers are cis-regulatory DNA elements that activate transcription independent of their distance (up to 1 Mb) and orientation relative to the target gene. They function by looping the chromatin to bring bound activators into contact with the promoter-associated preinitiation complex — a process mediated by the Mediator complex and cohesin-facilitated chromatin loops within topologically associating domains (TADs). Enhancers are marked by histone modifications (H3K4me1, H3K27ac), bidirectional transcription (producing enhancer RNAs), and accessible chromatin (detectable by ATAC-seq and DNase-seq). Cell-type specificity of enhancer activity — different TFs binding the same enhancer in different cell types — is a primary mechanism generating cell-type-specific gene expression programmes.

Chromatin Remodelling — Controlling DNA Accessibility

Chromatin remodelling complexes, grouped into the SWI/SNF (BAF), ISWI, CHD (which includes the NuRD complex), and INO80 families, use ATP hydrolysis to reposition, evict, or restructure nucleosomes, controlling access to underlying DNA sequences. SWI/SNF complexes (mutated in ~20% of human cancers) slide or evict nucleosomes to create accessible chromatin at promoters and enhancers. ISWI complexes typically space nucleosomes to generate regular arrays associated with repressed chromatin. Pioneer transcription factors, a special class of TFs, can bind nucleosomal DNA and recruit remodelling complexes to establish new accessible regions, enabling cell fate transitions and reprogramming. The chromatin accessibility landscape of a cell (its “chromatin state”) determines which genes are available for activation and which are stably silenced.

Post-Transcriptional Regulation

Gene expression is regulated not only at the transcriptional level but also at RNA processing, nuclear export, mRNA stability, and translational efficiency. RNA-binding proteins (RBPs) bind specific sequences in the 5′ UTR, 3′ UTR, or coding sequence of mRNAs, regulating splicing, polyadenylation site choice, mRNA localisation, stability, and translation rate. The iron response element (IRE) / iron-regulatory protein (IRP) system is a paradigmatic example: in low-iron conditions, IRP binds IRE hairpins in the ferritin mRNA 5′ UTR to repress translation, and in the transferrin receptor mRNA 3′ UTR to stabilise the mRNA — coordinating iron uptake and storage from a single post-transcriptional regulatory mechanism without changing transcription.

Phase Separation and Transcriptional Condensates

Recent discoveries have revealed that gene regulation involves liquid-liquid phase separation — the formation of condensate droplets in the nucleus through the concentration of intrinsically disordered regions (IDRs) of TFs and co-activators. Transcriptional condensates at super-enhancers concentrate RNA Pol II, Mediator, and activating TFs, potentially creating high local concentrations that drive burst-like transcriptional activity. Heterochromatin protein 1 (HP1) forms phase-separated condensates at constitutive heterochromatin, contributing to stable silencing. Phase separation may explain how enhancers communicate over long distances in the nucleus — through condensate-mediated concentration of regulatory factors — though the precise relationship between condensates and transcriptional control is still being resolved.

Epigenetics — DNA Methylation, Histone Modifications, and Heritable Gene Regulation

Epigenetics refers to heritable changes in gene expression that do not involve changes in the DNA sequence — instead, they involve covalent modifications of DNA or histones, or non-covalent changes in chromatin structure, that alter transcriptional activity and are transmitted through cell division. The discovery that identical genomes can produce vastly different cell types (a hepatocyte and a neuron share the same DNA but express very different gene sets) established that epigenetic regulation is essential to differentiation and development. The finding that some epigenetic states are transmitted across generations — transgenerational epigenetic inheritance — has profound implications for our understanding of inheritance and the relationship between environment and phenotype.

Key epigenetic marks and their associations with transcriptional state

  Mark               Typical location                  State
  H3K27ac            active enhancers and promoters    Active
  H3K4me3            active gene promoters             Active
  H3K4me1            active and poised enhancers       Poised/Active
  H3K27me3           Polycomb-repressed genes          Repressed
  H3K9me3            constitutive heterochromatin      Silenced
  CpG methylation    gene bodies of active genes       Variable
  CpG methylation    promoter CpG islands              Silenced

DNA methylation at CpG dinucleotides is catalysed by DNA methyltransferases: DNMT3A and DNMT3B establish de novo methylation, and DNMT1 maintains methylation patterns during replication by copying parental-strand methylation onto newly synthesised strands. Methylation of CpG islands at gene promoters is associated with stable transcriptional silencing. It operates in X-chromosome inactivation (where promoter CpG islands on the inactive X chromosome become methylated), genomic imprinting (where expression from one parental allele is silenced by methylation), and cancer (where aberrant CpG island hypermethylation silences tumour suppressor genes). Active DNA demethylation is mediated by the TET enzyme family, which converts 5-methylcytosine to 5-hydroxymethylcytosine and further oxidised forms that are removed by base excision repair.

Recombinant DNA Technology — Restriction Enzymes, Cloning, and Heterologous Expression

Recombinant DNA technology, the set of methods for cutting, joining, copying, and introducing DNA molecules between organisms, transformed biology after 1973, when Herbert Boyer and Stanley Cohen showed that plasmids recombined in vitro could be propagated in E. coli; the following year their groups showed that ribosomal DNA from the frog Xenopus laevis could be replicated and transcribed in bacterial cells from a recombinant plasmid. The necessary tools were already available: restriction enzymes (discovered by Werner Arber, Hamilton Smith, and Daniel Nathans; Nobel Prize 1978) to cut DNA at specific sequences; DNA ligase to join DNA fragments; plasmid vectors to replicate foreign DNA in bacteria; and transformation to introduce DNA into cells. The combination created the foundation of modern biotechnology, genetic medicine, and much of contemporary research.

Restriction Enzymes

Type II restriction endonucleases cut dsDNA at specific palindromic recognition sequences (4–8 bp), generating blunt or sticky ends. EcoRI (cuts G↓AATTC), HindIII, BamHI, and hundreds of others provide a molecular toolkit. Methylation by cognate methyltransferases protects bacterial DNA from self-cleavage — the restriction-modification system is bacterial innate immunity against phage DNA.
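
Recognition and cutting by EcoRI can be sketched as a simple string operation. This is a minimal illustration for a linear molecule that tracks only the top strand: the helper names `reverse_complement` and `digest` and the example sequence are assumptions made for the sketch.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top strand after `cut_offset` bases of every recognition site.

    For EcoRI (G^AATTC) the cut falls after the G; the bottom strand is cut at
    the symmetric position, which leaves 5' AATT overhangs (sticky ends).
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# A palindromic site reads the same on both strands
print(reverse_complement("GAATTC") == "GAATTC")

# Hypothetical fragment containing two EcoRI sites
print(digest("AAGAATTCTTGAATTCGG"))   # ['AAG', 'AATTCTTG', 'AATTCGG']
```

Because the site is palindromic, scanning one strand is enough to find every site on the duplex.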

Molecular Cloning

A target DNA fragment is cut with restriction enzymes compatible with the vector’s multiple cloning site (MCS), ligated into the vector with T4 DNA ligase, and transformed into competent bacteria. Antibiotic selection recovers cells that took up the plasmid, and blue-white screening (lacZ α-complementation) distinguishes recombinant colonies from those carrying empty vector. Modern alternatives include Gibson assembly (exonuclease-mediated overlap joining), Golden Gate (modular assembly with Type IIS enzymes such as BsaI), and TOPO cloning.

Heterologous Protein Expression

Cloned genes can be expressed in bacterial (E. coli), yeast, insect (baculovirus/Sf9 cells), mammalian (CHO, HEK293), or cell-free systems to produce recombinant proteins for research, diagnostics, and therapeutics. Insulin (1982), human growth hormone, erythropoietin, and monoclonal antibodies are among the recombinant proteins now produced at industrial scale — enabled entirely by recombinant DNA technology.

PCR, DNA Sequencing, and Gel Electrophoresis — The Core Analytical Toolkit

Three techniques form the experimental backbone of virtually all molecular biology work: PCR for amplifying specific sequences from complex genomic backgrounds; DNA sequencing for reading the nucleotide sequence of amplified products or entire genomes; and gel electrophoresis for separating and visualising DNA, RNA, and protein molecules by size. Together, these methods enable the identification of mutations, the verification of cloned constructs, the expression analysis of genes, the fingerprinting of organisms, and the diagnosis of infectious diseases — all from minute biological samples.

Method: PCR
How it works: Exponential amplification of target DNA using a thermostable polymerase (typically Taq), two flanking primers, and thermocycling (denature → anneal → extend). 30–35 cycles give roughly 10⁶–10⁹-fold amplification of the target.
Key variants: RT-PCR (RNA→cDNA→PCR), qPCR (real-time quantification), digital PCR (absolute quantification), multiplex PCR (multiple targets), nested PCR (sensitivity), RACE (rapid amplification of cDNA ends).
Major applications: Diagnostic testing (COVID-19, HIV, TB), gene expression analysis, genotyping, forensic DNA profiling, ancient DNA, cloning, mutagenesis, sequencing library preparation.

Method: DNA Sequencing
How it works: Sanger: chain termination using ddNTPs generates nested fragments separated by capillary electrophoresis. NGS: sequencing of millions of fragments simultaneously by cyclic reversible termination (Illumina), nanopore translocation, or single-molecule real-time (PacBio).
Key variants: Sanger (single locus, verification), Illumina short-read NGS (WGS, WES, RNA-seq, ChIP-seq), nanopore long-read (structural variants, methylation), PacBio HiFi (high-accuracy long reads).
Major applications: Whole genome sequencing, cancer mutation profiling, transcriptomics, metagenomics, clinical variant identification, epigenomics, phylogenetics, forensics.

Method: Gel Electrophoresis
How it works: DNA or RNA migrates through an agarose (or polyacrylamide) gel matrix under an electric field. Smaller molecules migrate faster (higher mobility), and migration distance decreases approximately linearly with log(size). Nucleic acids are stained with EtBr or SYBR Safe and visualised under UV or blue light.
Key variants: Agarose (DNA/RNA, 100 bp–20 kb), polyacrylamide (DNA, high resolution, sequencing gels), pulsed-field gel electrophoresis (PFGE, chromosomes/very large DNA), SDS-PAGE (proteins by molecular weight), 2D-PAGE (proteomics).
Major applications: PCR product verification, restriction mapping, RNA quality assessment, Southern/Northern blotting, protein size determination, DNA ladder calibration.

Method: Southern/Northern Blot
How it works: DNA (Southern) or RNA (Northern) is separated by gel electrophoresis, transferred to a membrane, and hybridised with a labelled probe complementary to the target sequence. Detects specific sequences of defined size in complex mixtures.
Key variants: Western blot (protein equivalent, using antibodies): detects specific proteins by size after SDS-PAGE. EMSA (electrophoretic mobility shift assay): detects protein-DNA binding by gel shift of a probe.
Major applications: Gene copy number determination, mRNA expression analysis, protein expression and modification, protein-DNA interaction studies, restriction fragment length polymorphism (RFLP) analysis.

CRISPR-Cas9 Genome Editing — From Bacterial Immunity to Precision Medicine

CRISPR-Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats — CRISPR-associated protein 9) is the bacterial adaptive immune system that has been repurposed as the most transformative tool in molecular biology since PCR. Discovered as a bacterial defence mechanism against phage infection (CRISPR arrays store fragments of phage DNA as an immunological memory; when reinfected, Cas9 guided by crRNA cleaves the matching phage DNA), its repurposing as a programmable genome editing tool was recognised with the Nobel Prize in Chemistry in 2020 awarded to Jennifer Doudna and Emmanuelle Charpentier.

Guide RNA target: 20 nt. Length of the spacer sequence in the sgRNA that base-pairs with the genomic target, defining the specificity of Cas9 cleavage.

PAM sequence: NGG. Protospacer adjacent motif, required immediately 3′ of the target sequence; Cas9 from S. pyogenes requires NGG (found approximately every 8 bp in the human genome).

Cut site: 3 bp. Cas9 makes a blunt-end double-strand break 3 bp upstream of the PAM sequence, within the 20-nt target region.

Nobel Prize: 2020. Chemistry Nobel awarded to Jennifer Doudna and Emmanuelle Charpentier for developing CRISPR-Cas9 as a genome editing tool.
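
The three targeting numbers (a 20-nt protospacer, an NGG PAM, and a cut 3 bp upstream of the PAM) can be combined in a short scan for candidate SpCas9 sites. This is a sketch only: the function name `find_spcas9_targets` and the example sequence are hypothetical, only the forward strand is searched, and no off-target or efficiency scoring is attempted.

```python
import re


def find_spcas9_targets(seq):
    """List candidate SpCas9 sites on the forward strand of `seq`.

    Each hit is (protospacer_start, protospacer, pam, cut_index), where the
    blunt cut falls between seq[cut_index - 1] and seq[cut_index].
    """
    targets = []
    # Lookahead so that overlapping candidate sites are all reported
    for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq.upper()):
        protospacer_start = match.start()
        pam_start = protospacer_start + 20
        cut_index = pam_start - 3            # 3 bp upstream of the PAM
        targets.append((protospacer_start, match.group(1), match.group(2), cut_index))
    return targets


# Hypothetical sequence: 2 nt of flank, a 20-nt protospacer, a TGG PAM, 2 nt of flank
example = "TT" + "ACGT" * 5 + "TGG" + "AA"
for hit in find_spcas9_targets(example):
    print(hit)
```

A fuller search would also scan the reverse complement, since a PAM on either strand can be used.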

CRISPR Applications — From Basic Research to Clinical Trials

Gene Knockout — Loss-of-Function Studies

The simplest CRISPR application: targeting a sgRNA to a coding exon creates an indel via NHEJ repair that frameshifts or truncates the protein — a functional knockout. This replaced the labour-intensive homologous recombination targeting used to create knockout mouse models, reducing timescales from years to weeks. Genome-wide CRISPR screens using pooled sgRNA libraries have mapped the genetic dependencies of cancer cells, identified essential genes, and uncovered drug resistance mechanisms at scale.

Base Editing — Single Nucleotide Changes Without Double-Strand Breaks

Base editors, developed by David Liu’s group, fuse a catalytically impaired Cas9 (nickase) to a deaminase enzyme, enabling conversion of one DNA base to another at the target site without creating a double-strand break. Cytosine base editors (CBEs) convert C→T; adenine base editors (ABEs) convert A→G. Because roughly half of known disease-causing point mutations are C→T or G→A transitions (which ABEs can revert by converting A→G), base editors have broad therapeutic potential. Clinical trials using ex vivo base editing of haematopoietic stem cells are ongoing for sickle cell disease and β-thalassaemia.

Prime Editing — Writing Any Change into the Genome

Prime editing (PE), also from David Liu’s group, uses a pegRNA (prime editing guide RNA) that contains both the target-matching spacer sequence and the desired edit sequence, combined with a Cas9-nickase fused to reverse transcriptase. After the pegRNA-directed nick, the reverse transcriptase uses the 3′ extension of the pegRNA as a template to copy the desired edit into the nicked strand — capable of introducing any point mutation, small insertion, or small deletion without double-strand breaks or donor templates. Prime editing substantially expands the range of editable mutations, addressing disease-causing variants that base editing cannot correct.

Clinical Applications — CRISPR in Human Trials

The first approved CRISPR-based therapeutic, Casgevy (exagamglogene autotemcel), was first authorised by the UK MHRA in November 2023 for sickle cell disease and transfusion-dependent β-thalassaemia, followed by FDA approval for sickle cell disease in December 2023 and for β-thalassaemia in January 2024, and by conditional marketing authorisation in the European Union in February 2024. Casgevy uses ex vivo CRISPR editing of patients’ haematopoietic stem cells to reactivate fetal haemoglobin by disrupting the erythroid-specific BCL11A enhancer, restoring functional haemoglobin production and, in most treated trial participants, eliminating vaso-occlusive crises or transfusion dependence. Multiple additional CRISPR therapies are in clinical trials for transthyretin amyloidosis (in vivo liver editing using lipid nanoparticles), acute myeloid leukaemia, metastatic cancers (CRISPR-engineered T cells), and Leber congenital amaurosis (in vivo retinal editing). The field of CRISPR therapeutics is advancing rapidly following the first successful regulatory approvals.

Molecular Medicine — Genomics, Gene Therapy, and the Clinical Applications of Molecular Biology

The translation of molecular biology knowledge into clinical medicine has produced some of the most significant advances in healthcare of the past three decades — from the molecular diagnosis of inherited diseases and the sequencing of tumour genomes to the development of targeted therapies, RNA-based vaccines, and gene therapies that address the root causes of genetic disease at the molecular level. Understanding the molecular basis of disease and the molecular mechanisms of its treatment is now a core requirement for medical education and clinical practice.

1977

Sanger DNA sequencing — first method to read DNA sequence

Frederick Sanger’s chain-termination sequencing method enabled the first direct reading of DNA sequences, including the bacteriophage ΦX174 genome. It remained the gold standard for single-locus sequencing for 30 years and remains the primary method for clinical mutation verification today.

1982

Recombinant human insulin — first approved recombinant protein therapeutic

Eli Lilly’s Humulin (recombinant human insulin expressed in E. coli, developed with Genentech) became the first recombinant DNA-derived therapeutic approved by the FDA, demonstrating that molecular biology could produce life-saving medicines at commercial scale. It progressively replaced animal-derived insulin and reduced the immunogenicity problems associated with it.

1998

RNA interference (RNAi) — discovered by Fire and Mello (Nobel 2006)

Andrew Fire and Craig Mello demonstrated that double-stranded RNA triggers specific silencing of complementary mRNA sequences — a conserved post-transcriptional regulation mechanism. RNAi became a standard research tool for gene knockdown and has been developed into RNA therapeutic drugs: patisiran (approved 2018) for hereditary transthyretin amyloidosis was the first RNAi drug approved by the FDA.

2003

Human Genome Project completed — reference sequence for all human genes

The Human Genome Project produced the first essentially complete reference sequence of the human genome: about 3.2 billion base pairs, covering the gene-rich euchromatic regions and locating approximately 20,000–25,000 protein-coding genes (a fully gapless assembly followed in 2022). This reference enabled systematic association of genetic variants with diseases, the development of genome-wide association studies (GWAS), and the field of precision medicine.

2021

mRNA vaccines — molecular biology enables pandemic response

The COVID-19 pandemic demonstrated the clinical power of mRNA technology developed over decades by Katalin Karikó and Drew Weissman (Nobel 2023). mRNA vaccines encoding the SARS-CoV-2 spike protein, delivered in lipid nanoparticles, were developed and authorised within 12 months of the pandemic’s onset — the fastest vaccine development in history, enabled by molecular biology tools including reverse genetics, codon optimisation, and nucleotide modification chemistry.

2023

CRISPR therapeutics approved — first genome editing medicines

Casgevy’s approval for sickle cell disease and β-thalassaemia marked the clinical arrival of CRISPR-based genome editing in medicine — a development that was only possible because of foundational molecular biology discoveries over the preceding 70 years, from the Watson-Crick double helix through CRISPR’s bacterial origins to its repurposing as a precision editing tool by Doudna and Charpentier.

Molecular Biology Across Academic Curricula

Molecular biology features at every level of biology, biochemistry, biomedical science, medicine, pharmacy, and nursing curricula. Introductory courses cover DNA structure, the central dogma, PCR, and gel electrophoresis. Intermediate courses address gene regulation, RNA processing, recombinant DNA technology, and sequencing methods. Advanced and graduate-level courses engage with epigenomics, single-cell genomics, CRISPR applications, RNA therapeutics, and the molecular basis of cancer and inherited disease. Medical curricula integrate molecular biology through genetics, pharmacogenomics, cancer biology, and infectious disease.

Frequently Asked Questions About Molecular Biology

What is molecular biology?

Molecular biology is the branch of biology that studies biological processes at the molecular level, particularly the structure, function, and interactions of the nucleic acids (DNA and RNA) and the proteins that carry, express, and regulate genetic information. It emerged from the convergence of genetics, biochemistry, and structural biology in the mid-twentieth century, with foundational discoveries including the double helix structure of DNA (Watson and Crick, 1953), the mechanism of DNA replication (Meselson and Stahl, 1958), the genetic code (Nirenberg, Khorana, Holley; Nobel 1968), and restriction enzymes (Arber, Smith, Nathans; Nobel 1978). Today it encompasses genomics, epigenomics, transcriptomics, proteomics, and the genome editing technologies that are transforming medicine.

What is the central dogma of molecular biology?

The central dogma, articulated by Francis Crick in 1958, describes the general flow of genetic information: DNA is replicated to produce new DNA; DNA is transcribed to produce RNA; RNA is translated to produce protein. Information flows from nucleic acid to protein but not in reverse — proteins cannot direct the synthesis of nucleic acids with specific sequences. Retroviruses represent a known exception through reverse transcriptase (RNA → DNA). The central dogma does not claim that all possible information transfers are prohibited — only that protein-to-nucleic-acid sequence transfer does not occur under normal cellular conditions.

What is the structure of DNA?

DNA is a double-stranded antiparallel helix composed of two polynucleotide strands connected by hydrogen bonds between complementary base pairs: adenine (A) pairs with thymine (T) via 2 hydrogen bonds; guanine (G) pairs with cytosine (C) via 3 hydrogen bonds. Each strand is a polymer of deoxyribonucleotides connected by 3′–5′ phosphodiester bonds. The B-form helix (predominant in cells) has a diameter of ~2 nm, a rise of 0.34 nm per base pair, and 10.5 bp per full turn (~3.6 nm pitch). The major groove (~2.2 nm wide) is the primary binding site for sequence-specific regulatory proteins. The antiparallel orientation (one strand 5′→3′, the other 3′→5′) is fundamental to the directionality of replication and transcription.

What is the difference between transcription and translation?

Transcription is the synthesis of RNA from a DNA template by RNA polymerase. The template DNA strand is read 3′→5′ and a complementary RNA is synthesised 5′→3′. In eukaryotes, it occurs in the nucleus and produces pre-mRNA requiring processing (capping, splicing, polyadenylation) before export. Translation is the decoding of the processed mRNA sequence by the ribosome to produce a polypeptide: each three-nucleotide codon in the mRNA specifies an amino acid via aminoacyl-tRNA adaptors. Translation occurs in the cytoplasm. Together, transcription and translation implement the second and third transfers of the central dogma.
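
The strand relationships in transcription can be shown with a few lines of code. This is a minimal sketch: the function name `transcribe` and the six-base example are illustrative, and the template is assumed to be written in the 3′→5′ direction so that the output reads 5′→3′.

```python
def transcribe(template_3to5):
    """Given a template DNA strand written 3'->5', return the mRNA written 5'->3'."""
    pairing = str.maketrans("ACGT", "UGCA")   # A pairs with U, C with G, G with C, T with A
    return template_3to5.upper().translate(pairing)


template = "TACCGA"        # template strand, 3'->5'
coding = "ATGGCT"          # coding (non-template) strand, 5'->3'

mrna = transcribe(template)
print(mrna)                                  # AUGGCU
print(mrna == coding.replace("T", "U"))      # the mRNA matches the coding strand, with U for T
```

The second check reflects the usual convention that gene sequences are reported as the coding strand, which has the same sequence as the transcript apart from T in place of U.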

How does the lac operon regulate gene expression?

The lac operon controls genes for lactose metabolism in E. coli through dual regulation. Negative regulation: the lac repressor binds the operator sequence, blocking RNA polymerase access when lactose is absent. When lactose (as allolactose) is present, it binds the repressor, reducing its DNA affinity and derepressing the operon. Positive regulation: low glucose levels raise cAMP, which activates catabolite activator protein (CAP). CAP-cAMP bound to the CAP site, centred ~60 bp upstream of the transcription start site, strongly stimulates RNA polymerase binding. Maximum transcription requires both repressor removal (lactose present) AND CAP activation (glucose absent), so the cell only fully produces lactose-metabolising enzymes when lactose is available and glucose (the preferred carbon source) is not.

What is CRISPR-Cas9 and how does it work?

CRISPR-Cas9 is a bacterial adaptive immune system repurposed as a precision genome editing tool. A synthetic single guide RNA (sgRNA), containing a 20-nucleotide spacer complementary to the genomic target, directs the Cas9 endonuclease to the matching chromosomal sequence adjacent to an NGG protospacer adjacent motif (PAM). Cas9 creates a blunt double-strand break 3 bp upstream of the PAM. Non-homologous end joining (NHEJ) repair introduces indels (disabling the gene); homology-directed repair (HDR) with a donor template introduces precise edits. The development of CRISPR-Cas9 genome editing was recognised with the 2020 Nobel Prize in Chemistry. The first CRISPR therapeutic (Casgevy) was approved in 2023 for sickle cell disease and β-thalassaemia.

What is PCR and how is it used in molecular biology?

PCR (polymerase chain reaction) amplifies a specific DNA sequence exponentially using thermostable Taq polymerase, two flanking oligonucleotide primers, and thermocycling. Denature (~95°C) → anneal primers (~55–65°C) → extend with polymerase (~72°C): after 30–35 cycles, the target is amplified ~10⁶–10⁹-fold. Variants include RT-PCR (reverse transcriptase converts RNA to cDNA first, for gene expression analysis), qPCR/real-time PCR (fluorescence-based quantification of amplification for gene expression measurement or viral load testing), and digital PCR (absolute quantification). Applications include diagnostic testing (COVID-19, HIV, genetic disease), forensic DNA profiling, cloning, sequencing library preparation, and genotyping.
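
The amplification figures follow from doubling at each cycle: with N₀ starting copies and per-cycle efficiency E, the idealised yield after n cycles is N₀(1 + E)ⁿ. The sketch below assumes this simple model; the function name `pcr_copies` and the 90% efficiency value are illustrative, and real reactions plateau as primers and nucleotides are consumed.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealised PCR yield: each cycle multiplies the copy number by (1 + efficiency)."""
    return initial_copies * (1 + efficiency) ** cycles


print(f"{pcr_copies(1, 20):.2e}")        # perfect doubling for 20 cycles, about 10^6
print(f"{pcr_copies(1, 30):.2e}")        # perfect doubling for 30 cycles, about 10^9
print(f"{pcr_copies(1, 30, 0.9):.2e}")   # 90% efficiency gives a noticeably lower yield
```

The same relationship underlies qPCR quantification, where the cycle at which fluorescence crosses a threshold is used to infer the starting copy number.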

What is epigenetics and how does it affect gene expression?

Epigenetics refers to heritable changes in gene expression that do not alter the DNA sequence — instead involving covalent modifications to DNA or histones that affect chromatin accessibility and transcription. Key mechanisms: DNA methylation at CpG dinucleotides (by DNMT3A/B de novo; DNMT1 for maintenance), associated with gene silencing at promoter CpG islands — used in X-chromosome inactivation, genomic imprinting, and cancer tumour suppressor silencing. Histone modifications: acetylation (by HATs) relaxes chromatin and activates transcription; H3K4me3 marks active promoters; H3K27me3 (by Polycomb) marks repressed genes; H3K9me3 marks constitutive heterochromatin. Non-coding RNAs (miRNA, lncRNA) regulate gene expression post-transcriptionally and through chromatin remodelling. Epigenetic marks are heritable through cell division and some are transmitted across generations.

What are restriction enzymes and how are they used in molecular cloning?

Restriction enzymes (Type II endonucleases) cut double-stranded DNA at specific palindromic recognition sequences (4–8 bp), producing blunt ends or short single-stranded overhangs (sticky ends), commonly 4 nucleotides long. In molecular cloning, a target DNA fragment and a plasmid vector are cut with compatible restriction enzymes; their complementary sticky ends are joined by T4 DNA ligase; the recombinant plasmid is introduced into competent bacteria by transformation and selected on antibiotic plates. Modern cloning alternatives include Gibson assembly, Golden Gate, and TOPO cloning. Restriction enzymes are also used for restriction mapping (characterising DNA fragments by size), DNA fingerprinting (RFLP analysis), and verifying cloned inserts.

Molecular Biology

Molecular Biology — Scope, History, and the Questions That Define the Field

DNA Structure — The Double Helix, Base Pairing, and the Chemical Basis of Heredity

The Chemical Components of DNA

Chromatin — Packaging DNA in the Eukaryotic Nucleus

DNA Replication — Semiconservative Copying, the Replisome, and Maintaining Fidelity

Origins of Replication

Replication Fidelity — Three Levels

The Central Dogma — Information Flow from DNA to RNA to Protein

DNA Replication — Faithful Copying of the Genome

Transcription — Converting DNA Information into RNA

RNA Processing — Preparing mRNA for Translation

Translation — Decoding mRNA Sequence into Protein

Transcription — Initiating, Elongating, and Terminating RNA Synthesis

Prokaryotic Transcription

RNA Processing — Splicing, the Spliceosome, and Expanding the Proteome

The Spliceosome — Five snRNPs, One Machine

Alternative Splicing — One Gene, Many Proteins

The 5′ Cap — Translation Initiation and mRNA Stability

Polyadenylation — Adding the Poly-A Tail

miRNA, lncRNA, and the Non-Coding Transcriptome

Nonsense-Mediated Decay — Quality Control for mRNA

Translation — The Ribosome, the Genetic Code, and Protein Synthesis

The Genetic Code — 64 Codons, 20 Amino Acids, and Degeneracy

Ribosome Structure and Sites

Prokaryotic Gene Regulation — Operons, Repressors, and Metabolic Responsiveness

Eukaryotic Gene Regulation — Transcription Factors, Enhancers, and Chromatin Remodelling

Transcription Factors — Sequence-Specific Regulators

Enhancers — Remote Control of Transcription

Chromatin Remodelling — Controlling DNA Accessibility

Post-Transcriptional Regulation

Phase Separation and Transcriptional Condensates

Epigenetics — DNA Methylation, Histone Modifications, and Heritable Gene Regulation

Recombinant DNA Technology — Restriction Enzymes, Cloning, and Heterologous Expression

Restriction Enzymes

Molecular Cloning

Heterologous Protein Expression

PCR, DNA Sequencing, and Gel Electrophoresis — The Core Analytical Toolkit

CRISPR-Cas9 Genome Editing — From Bacterial Immunity to Precision Medicine

CRISPR Applications — From Basic Research to Clinical Trials

Gene Knockout — Loss-of-Function Studies

Base Editing — Single Nucleotide Changes Without Double-Strand Breaks

Prime Editing — Writing Any Change into the Genome

Clinical Applications — CRISPR in Human Trials

Molecular Medicine — Genomics, Gene Therapy, and the Clinical Applications of Molecular Biology

Molecular Biology Across Academic Curricula

Frequently Asked Questions About Molecular Biology