Cas9 Protein – Structure, Types, Function

Name: Cas9 endonuclease
Alternative names: SpCas9/SpyCas9
Organism: Streptococcus pyogenes serotype M1
Molecular weight: ~163 kDa
Gene: cas9
Location on chromosome: 0.85 to 0.86 Mb
Protein: CRISPR-associated endonuclease Cas9/Csn1
Cofactor: Mg2+
Biological processes: Interference (defense response to phage); maintenance of CRISPR repeat sequences
Functions: DNA and RNA binding; metal ion binding; 3’-5’ exonuclease activity; endonuclease activity

  • Cas9 is a nuclease that degrades phage DNA through RNA-guided cleavage of both DNA strands, combining DNA-binding and nuclease activities.
  • Cas9 is the signature protein of bacterial type II CRISPR systems.
  • It requires both crRNA and tracrRNA to function properly.
  • Catalytic activity also requires a PAM sequence on the target DNA.
  • Cas9 has been engineered for a variety of functions, including gene activation and suppression of gene expression.
  • Cas9’s significance in CRISPR-mediated gene editing and applications such as disease modelling, gene role research, therapeutic and gene expression investigations is well established.
  • Using the PAM sequence as a marker, it locates, binds, and cleaves the target nucleic acid. To identify the target, the sgRNA, which combines the crRNA and tracrRNA, seeks complementarity with the target site.
  • However, this two-level authentication (sgRNA plus PAM) can significantly reduce in vitro gene editing efficiency. Therefore, customised Cas9 nucleases such as SpCas9, dCas9, SaCas9, and XCas9 are available.

What is Cas9 Protein?

  • Cas9, also known as CRISPR-associated protein 9, is one of the best-studied and most widely available nucleases, employed not only in bacterial systems but also in in vitro gene-editing techniques.
  • Cas9 is a DNA nuclease that can accurately cleave dsDNA, and it is exclusive to type II CRISPR systems. The best-known version comes from Streptococcus pyogenes, and the enzyme is referred to as a dual RNA-guided DNA endonuclease.
  • To understand why Cas9 is so commonly employed for gene editing, it helps to understand the structure, function, and significance of the Cas9 protein, formerly known as Cas5, Csx12, and Csn1.
  • S. pyogenes SpyCas9 is a large (1,368 amino acids), multidomain, and multifunctional DNA endonuclease.
  • It uses its two unique nuclease domains to snip dsDNA 3 bp upstream of the PAM: an HNH-like nuclease domain that cleaves the DNA strand complementary to the guide RNA sequence (target strand), and a RuvC-like nuclease domain that cleaves the DNA strand opposite the complementary strand (nontarget strand).
  • Cas9 also contributes to crRNA maturation and spacer acquisition, in addition to its essential involvement in CRISPR interference.
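
The cut-site rule described above (a blunt cut 3 bp upstream of the PAM) can be sketched in a few lines of Python. This is a minimal illustration, not a guide-design tool: the function name `find_spcas9_sites` is hypothetical, only the strand passed in is scanned (the reverse complement is ignored), and a 20-nt protospacer is assumed.

```python
import re

def find_spcas9_sites(dna, spacer_len=20):
    """Return (protospacer, pam, cut_index) for each 5'-NGG-3' PAM on this strand.

    cut_index is the 0-based position of the first base 3' of the cut,
    i.e. the break falls between dna[cut_index - 1] and dna[cut_index].
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. in "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough upstream sequence for a full protospacer
        protospacer = dna[pam_start - spacer_len:pam_start]
        cut_index = pam_start - 3  # 3 bp upstream of the PAM
        sites.append((protospacer, match.group(1), cut_index))
    return sites
```

For example, calling `find_spcas9_sites("ACGTACGTACGTACGTACGT" + "TGG" + "CA")` reports one site whose PAM starts at index 20 and whose cut falls before index 17.
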
A gene map of the Cas9 gene. | Image Credit: geneticeducation.co.in

Structure of Cas9

  • Cas9 in its apo state has two different lobes: the alpha-helical recognition (REC) lobe and the nuclease (NUC) lobe, which contains the conserved HNH and split RuvC nuclease domains as well as the more variable C-terminal domain (CTD).
  • Two linking segments join the two lobes, one created by the arginine-rich bridge helix and the other by a disordered linker (residues 712–717).
  • The REC lobe consists of three alpha-helical domains (Hel-I, Hel-II, and Hel-III) and is structurally distinct from all other known proteins.
  • The extended CTD has a Cas9-specific fold and contains PAM-interacting sites necessary for PAM interrogation. Nonetheless, this PAM-recognition region is highly disordered in the apo–Cas9 structure, indicating that the apo–Cas9 enzyme is maintained in an inactive state, unable to detect target DNA prior to binding to a guide RNA.
  • This structural finding is consistent with so-called DNA curtains assays demonstrating that apo–Cas9 binds nonspecifically to DNA and can be swiftly removed from nonspecific sites in the presence of a competitor (guide RNA or heparin).
  • The structural superimposition of apo–Cas9 with sgRNA-bound and DNA-bound structures indicates further that the enzyme adopts a catalytically inactive conformation in its apo state, requiring RNA-induced structural activation for DNA recognition and cleavage.
  • This structural result corroborates the biochemical findings that Cas9 enzymes are inactive as nucleases in the absence of bound guide RNAs and further supports their activity as RNA-guided endonucleases.
Structure of Cas9

HNH and RuvC Nuclease Domains

  • Comparing the structures of Cas9 nuclease domains to those of other DNA-bound nucleases shows that the Cas9 RuvC nuclease domain is similar to members of the retroviral integrase superfamily that have an RNase H fold. This suggests that RuvC probably uses a two-metal-ion catalytic mechanism to cut the nontarget DNA strand.
  • The HNH nuclease domain, on the other hand, has the same ββα-metal fold as other HNH endonucleases and most likely uses a single metal ion to cut the target-strand DNA.
  • One-metal-ion-dependent and two-metal-ion-dependent nucleic acid-cleaving enzymes are characterised by a conserved general base histidine and a conserved aspartate residue, respectively.
  • This is in line with Cas9 mutagenesis studies showing that mutating either the HNH domain (H840A) or the RuvC domain (D10A) turns Cas9 into a nickase, while mutating both nuclease domains (so-called “dead Cas9” or dCas9) preserves RNA-guided DNA binding but abolishes DNA cleavage.
  • However, these proposed catalytic mechanisms still await direct experimental confirmation.
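
The mutagenesis results above can be summarised as a small lookup. The sketch below is illustrative only: the function name `cas9_cleavage` and its return format are hypothetical, and it models just the two catalytic mutations named in the text (D10A in RuvC, H840A in HNH).

```python
def cas9_cleavage(mutations=()):
    """Classify an SpCas9 variant by which DNA strands it can still cut.

    Returns (kind, strands_cut), where kind is "nuclease", "nickase" or "dCas9".
    """
    mutations = set(mutations)
    ruvc_active = "D10A" not in mutations   # RuvC cuts the non-target strand
    hnh_active = "H840A" not in mutations   # HNH cuts the target strand

    strands_cut = []
    if hnh_active:
        strands_cut.append("target")
    if ruvc_active:
        strands_cut.append("non-target")

    if len(strands_cut) == 2:
        kind = "nuclease"   # wild type: double-strand break
    elif len(strands_cut) == 1:
        kind = "nickase"    # only one strand is cut
    else:
        kind = "dCas9"      # binds DNA but cuts neither strand
    return kind, strands_cut
```

With no mutations the function reports a full nuclease; with `["D10A"]` it reports a nickase that cuts only the target strand; with both mutations it reports dCas9.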

Mechanism of Action

  • In the first step of bacterial interference, the REC lobe and the gRNA work together to form the ribonucleoprotein (RNP) complex. Then, the nuclease domains RuvC and HNH each hydrolyze one phosphodiester bond, one on each DNA strand, which produces a double-strand break.
  • A more detailed analysis shows that the HNH active site hydrolyzes the phosphodiester bond of the complementary strand, while the RuvC active site hydrolyzes the phosphodiester bond of the non-complementary strand. Both reactions depend on metal ions: RuvC uses two and HNH uses one (Tang H et al., 2021).
Structure of Cas9 | Image Credit: geneticeducation.co.in
Lobe | Domain | Residues | Function
REC | Bridge helix | 60-93 | Recognition of DNA
REC | REC1 | 94-179, 308-713 | RNA-guided DNA targeting
REC | REC2 | 180-307 | DNA binding
NUC | RuvC (RuvCI, RuvCII and RuvCIII) | 1-59, 718-769, 909-1098 | RNase H-like fold; nuclease activity on the non-complementary (non-target) strand
NUC | HNH | 775-908 | Nuclease activity on the complementary (target) strand
NUC | PAM-interacting domain | 1099-1368 | Finds the PAM sequence on the target DNA
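
The residue ranges in the table above can be turned into a simple lookup, for example to check which domain a mutation falls in. This is a minimal sketch: the names `SPCAS9_DOMAINS` and `domain_of` are hypothetical, and the boundaries are copied from the table rather than from a sequence database. Residues in the short linkers between listed ranges return `None`.

```python
# Domain boundaries (1-based, inclusive) taken from the table above.
SPCAS9_DOMAINS = [
    ("RuvC", [(1, 59), (718, 769), (909, 1098)]),
    ("Bridge helix", [(60, 93)]),
    ("REC1", [(94, 179), (308, 713)]),
    ("REC2", [(180, 307)]),
    ("HNH", [(775, 908)]),
    ("PAM-interacting domain", [(1099, 1368)]),
]

def domain_of(residue):
    """Return the domain name containing a residue number, or None for linkers."""
    if not 1 <= residue <= 1368:
        raise ValueError("SpCas9 has residues 1 to 1368")
    for name, ranges in SPCAS9_DOMAINS:
        for start, end in ranges:
            if start <= residue <= end:
                return name
    return None  # e.g. the linker segments between listed ranges
```

Under these boundaries, D10 maps to RuvC, H840 to HNH, and the PAM-contacting arginines R1333 and R1335 to the PAM-interacting domain.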

Types of Cas9 nucleases

Cas9 nucleases come in many kinds, both natural and engineered. They are grouped by function or by the species of origin. A few of them are listed and explained here.

Types of Cas9 nucleases | Image Credit: geneticeducation.co.in

1. SpCas9

Structure: Bilobed (REC and NUC)
Domains: NUC (nuclease lobe): HNH and RuvC; REC (recognition lobe): REC1, REC2 and REC3
Bacterial CRISPR system: Type II
PAM sequence: 5’-NGG-3’ (N is any nucleotide)
sgRNA: Required (crRNA:tracrRNA)
Variants: SpCas9-NRRH, SpG, SpCas9-NRCH, SpCas9-NRTH

  • SpCas9 comes from Streptococcus pyogenes and is one of the most popular, well-studied, and widely used Cas9 nucleases in genetic engineering experiments.
  • As noted above, it needs an sgRNA (crRNA plus tracrRNA) and a PAM sequence to find its target.
  • Once the SpCas9 finds the PAM (5′-NGG-3′) sequence, the sgRNA sends the nuclease right to the target region, where the spCas9 cuts through both strands of DNA.
  • The structure is similar to the general structure of Cas9, with the nuclease lobe for catalytic activity and the recognition lobe for recognising and identifying the target DNA.

Advantages of SpCas9

  • Readily available and well-researched
  • Easy to isolate
  • Very efficient
  • Simple to use

Disadvantages of SpCas9

  • Requires a PAM sequence.
  • Can also act at incorrect PAM-adjacent sites, producing off-target effects.
  • Can recognise other PAMs, such as 5′-NAG-3′ and 5′-NGA-3′.
  • Its large size makes it difficult to deliver.
  • Difficult to express and deliver.

Applications of SpCas9

As noted above, this system has been carefully studied and is backed by a large body of data. Because of this, it is popular in gene therapy. Among the most common uses are:

  • Transcriptional repression
  • Activation of transcription
  • Epigenetic modulation
  • Gene disruption
  • Conversion of a single base pair

2. SaCas9

Structure: Bilobed (REC and NUC)
Domains: NUC (nuclease lobe): HNH and RuvC; REC (recognition lobe): REC1, REC2 and REC3
Bacterial CRISPR system: Type II
PAM sequence: 5’-NNGRRT-3’ (N is any nucleotide; R is A or G)
sgRNA: Required (crRNA:tracrRNA)
Variants: efSaCas9, KKHSaCas9 and SaCas9-HF

  • SaCas9 is another very popular Cas9 nuclease. Its structure is similar to that of SpCas9, but it is smaller. This small size is its main advantage, and because of it SaCas9 can be used in place of SpCas9.
  • SaCas9 comes from the bacterium Staphylococcus aureus. It is made up of only 1053 amino acids, so its coding sequence is about 1 kb shorter than that of SpCas9.
  • It also needs a PAM sequence, 5′-NNGRRT-3′, to tell the difference between its own DNA and foreign DNA. Like SpCas9, it cuts both DNA strands, leaving predominantly blunt ends.
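
PAM motifs such as NNGRRT are written with IUPAC ambiguity codes (N is any base, R is A or G). One way to test whether a candidate sequence satisfies a PAM is to translate the motif into a regular expression. The sketch below is only an illustration; `IUPAC`, `pam_to_regex` and `matches_pam` are hypothetical names, and the code table covers just a subset of the IUPAC symbols.

```python
import re

# Subset of the IUPAC nucleotide ambiguity codes, as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "W": "[AT]",    # weak pairing
    "H": "[ACT]",   # not G
    "N": "[ACGT]",  # any base
}

def pam_to_regex(pam):
    """Compile an IUPAC PAM motif (written 5'->3') into a regex pattern."""
    return re.compile("".join(IUPAC[base] for base in pam.upper()))

def matches_pam(sequence, pam):
    """True if the whole sequence is a valid instance of the PAM motif."""
    return pam_to_regex(pam).fullmatch(sequence.upper()) is not None
```

For instance, `matches_pam("TTGAGT", "NNGRRT")` is true for the SaCas9 motif, while `matches_pam("TTGCGT", "NNGRRT")` is false because C is not a purine.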

Advantages

  • Small in size
  • High accuracy
  • Versatile
  • Easy to package into a viral vector

Disadvantages

  • Requires a PAM sequence
  • Requires a longer sgRNA, and off-target effects remain possible

Applications

SaCas9 is widely used to edit plant genomes in studies of plant-pest interactions.

  • Research on stress tolerance
  • Research into pathogen resistance
  • It can also be used to treat diseases that are caused by viruses or genes.
  • Recently, a specialised SaCas9 was used to investigate the role of the myostatin gene in muscular atrophy.

3. ScCas9

Name: ScCas9
Species of origin: Streptococcus canis
PAM sequence: 5’-NNG-3’
sgRNA requirement: Yes, as crRNA:tracrRNA
Variants: ScCas9++, ScCas9n++

  • The ScCas9 nuclease was found in Streptococcus canis. It recognises a slightly different PAM site, 5′-NNG-3′ (instead of NGG).
  • Its structure is similar to that of other Cas9 nucleases, but it is less favoured because it is less efficient.
  • Plant genome editing is often done with ScCas9 and its variants, such as ScCas9+, ScCas9++, and ScCas9n++.

4. dCas9

dCas9 variant | Function
dCas9-TadA | Retains adenosine deaminase activity; can repair faulty or mutated resistance genes in bacteria for various gene-editing purposes
dCas9-rAPOBEC1 | Retains cytidine deaminase activity
dCas9-APOBEC3A | Retains cytidine deaminase activity
dCas9-AID | Retains cytidine deaminase activity
SunTag-VP64 | Transcriptional activator used to study the effects of overexpression
dCas9-VPR | Tripartite transcriptional activator complex
dCas9-CBP | Rearranges chromatin structure via a histone acetyltransferase domain
Falk-fused dCas9 | Transcriptional activator module

  • dCas9 is one of the most flexible and distinctive versions of the Cas9 nuclease because it lacks nucleolytic activity, the main job of a nuclease. For this reason it is called the dead Cas9 system.
  • With the catalytic residues inactivated, the protein can still find the target DNA but cannot cut it. As a result, transcription factors and other effector domains fused to dCas9 can be delivered to a chosen target location.

5. ThermoCas9

Feature | SpCas9 | GeoCas9
Size | 1368 aa | 1087 aa
PAM | NGG | CRAA (R = A or G)
Spacer length | 20 nt | 22 nt
Temperature | 33-45 °C | 50-70 °C

  • Mougiakos et al. (2017) characterised ThermoCas9, a nuclease that works well at higher temperatures. It comes from the thermophilic bacterium Geobacillus thermodenitrificans T12.
  • They also reported that it can delete genes and silence transcription at elevated temperature (55°C) without affecting its sensitivity or PAM requirement. In general, it is active between 20°C and 70°C.
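  • It is sometimes referred to as GeoCas9, although that name was originally given to a related thermostable Cas9 from Geobacillus stearothermophilus.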

6. HypaCas9

  • HypaCas9 (hyper-accurate Cas9) enhances genome-wide specificity without diminishing on-target activity in human and mouse cells.
  • It also reduces off-target activity. HypaCas9 is created by introducing the mutations N692A, M694A, Q695A, and H698A into SpCas9.

7. eSpCas9

  • eSpCas9 (enhanced specificity Cas9) is a mutant version of the natural SpCas9 carrying point mutations that reduce off-target activity.
  • It is sometimes referred to as high-fidelity SpCas9 or highly specific Cas9.

8. XCas9

  • XCas9 is an engineered nuclease with reduced off-target effects at both NGG and non-NGG PAMs.
  • As is well known, Cas9 requires a PAM sequence in order to function, which boosts its specificity but restricts the sites that can be targeted.
  • XCas9 can effectively recognise many PAM sequences, including NGG, GAA, and GAT.
  • It is therefore more flexible than SpCas9 or SaCas9 and significantly relaxes the PAM requirement (Hu et al., 2018).

Cas9 type | Origin | PAM sequence (5’ to 3’) | Specialization
SpCas9 | Streptococcus pyogenes | NGG | Cleaves dsDNA using the sgRNA
SaCas9 | Staphylococcus aureus | NNGRRT or NNGRR(N) | Small off-target effect
ScCas9 | Streptococcus canis | NNG | The PAM sequence can be altered depending on the variant used
ThermoCas9 | Geobacillus thermodenitrificans T12 | CRAA (R = A or G) | Works efficiently at higher temperatures
StCas9 | Streptococcus thermophilus | NNAGAAW | High on-target cleavage activity
HypaCas9 | Streptococcus pyogenes | N/A | Greater genome-wide specificity
eSpCas9 | Streptococcus pyogenes | NGG | Enhanced SpCas9 that works more effectively than native SpCas9
NmCas9 | Neisseria meningitidis | NNNNGATT | Needs a longer crRNA, which increases accuracy
XCas9 | Streptococcus pyogenes | NGG and non-NGG | Specialized Cas9 that works with NGG and non-NGG PAMs
dCas9 | Streptococcus pyogenes | NGG | Specialized Cas9 that lacks nuclease activity
Cas9-DD | Streptococcus pyogenes | NGG | Destabilized Cas9 designed to increase accuracy and efficiency
SpCas9-VQR | Streptococcus pyogenes | NGA | Altered PAM recognition, widening the range of targetable sites
SpCas9-EQR | Streptococcus pyogenes | NGAG | Altered PAM recognition, widening the range of targetable sites
SpCas9-VRER | Streptococcus pyogenes | NGCG | Altered PAM recognition, widening the range of targetable sites
SpCas9-NG | Streptococcus pyogenes | NG | Altered PAM recognition, widening the range of targetable sites
SpCas9-HF1 | Streptococcus pyogenes | NGG | High-fidelity variant with increased specificity
evoCas9 | Streptococcus pyogenes | NGG | High-fidelity variant with increased specificity
Sniper-Cas9 | Streptococcus pyogenes | NGG | High-fidelity variant with increased specificity

CRISPR–Cas9 Effector Complex Assembly

  • Cas9 must be associated with guide RNA (a natural crRNA–tracrRNA or a sgRNA) to create an active DNA surveillance complex for site-specific DNA recognition and cleavage.
  • The 20-nt spacer sequence of crRNA confers DNA target selectivity, whereas tracrRNA is indispensable for Cas9 recruitment.
  • Genetic and biochemical experiments have elucidated the significance of a so-called seed sequence of RNA nucleotides within the spacer region of crRNAs for target selectivity.
  • In type II CRISPR systems, the seed region is described as the 10–12 nucleotides positioned at the 3′ end of the 20-nt spacer sequence that are closest to the PAM.
  • Mismatches in this seed region severely impede or abrogate target DNA binding and cleavage, but close homology in the seed region frequently results in off-target binding events, even in the presence of numerous mismatches elsewhere.
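
The seed concept above can be made concrete by counting mismatches separately in the PAM-proximal seed and the PAM-distal remainder of the spacer. The sketch below is a simplified illustration, not a validated off-target scoring method: `mismatch_profile` is a hypothetical name, both sequences are assumed to be written 5′ to 3′ in DNA letters with the PAM lying just after the 3′ end, and the 12-nt seed length is the upper bound quoted in the text.

```python
def mismatch_profile(spacer, target, seed_len=12):
    """Count spacer/target mismatches in the seed and in the PAM-distal region.

    The seed is taken as the last `seed_len` positions (the 3' end,
    closest to the PAM); everything before it is the PAM-distal region.
    """
    spacer, target = spacer.upper(), target.upper()
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    if not 0 < seed_len <= len(spacer):
        raise ValueError("seed_len must be between 1 and the spacer length")

    seed_start = len(spacer) - seed_len
    distal = sum(a != b for a, b in zip(spacer[:seed_start], target[:seed_start]))
    seed = sum(a != b for a, b in zip(spacer[seed_start:], target[seed_start:]))
    return {"seed": seed, "distal": distal}
```

A candidate site with mismatches only in the `distal` count would, following the text, be more likely to be bound than one with mismatches in the `seed` count.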

Conformational Rearrangement Upon sgRNA Binding

  • The sgRNA-bound crystal structure best illustrates the principles of Cas9–sgRNA assembly and the placement of guide RNA prior to target recognition.
  • Comparison of the sgRNA-bound structure to that of apo–Cas9 reveals precisely how guide RNA binding induces Cas9 to undergo a substantial structural rearrangement from an inactive conformation to a DNA recognition–competent conformation, as suggested by studies with lower resolution electron microscopy.
  • Upon sgRNA binding, the most notable conformational shift occurs in the REC lobe, namely Hel-III, which moves ~65 Å toward the HNH domain.
  • Cas9 exhibits much smaller conformational changes upon binding to target DNA and PAM sequence, indicating that the majority of the extensive structural rearrangements occur prior to target DNA binding and reinforcing the notion that guide RNA loading is an essential regulator of Cas9 enzyme function.

Interactions with sgRNA

  • Cas9 interacts extensively with the sgRNA. It forms several direct interactions with the repeat–antirepeat duplex, stem loop 1, and the linker region between stem loops 1 and 2 via Hel-I, the arginine-rich bridge helix, and the CTD domain.
  • Cas9 makes far fewer interactions with stem loop 2 of the sgRNA, mostly through its RuvC and CTD domains.
  • Due to the absence of a 3′ tracrRNA tail in the sgRNA construct used for crystallography, no protein–RNA interaction was detected for stem loop 3 in the Cas9–sgRNA structure.
  • However, the DNA-target-bound structures demonstrate that Cas9 has very few interactions with stem loop 3.
  • According to biochemical investigations, sgRNAs lacking the linker region and stem loops 2 and 3 are still capable of inducing Cas9-mediated DNA cleavage, albeit with diminished efficiency, but stem loop 1 deletion entirely abolishes cleavage.
  • Nevertheless, functional studies demonstrate that stem loops 2 and/or 3 are necessary for substantial Cas9 activation in vivo.
  • These observations suggest that the repeat–antirepeat duplex and stem loop 1 are required for Cas9–sgRNA complex formation, whereas the linker, stem loop 2, and stem loop 3 are not required for function but may stabilise guide RNA binding to promote active complex formation, thereby enhancing catalytic efficiency in vivo.

Preordered Seed RNA and PAM-Interacting Cleft

  • Cas9 creates extensive interactions with the ribose–phosphate backbone of the guide RNA, thereby establishing the A-form conformation of the 10-nt RNA seed sequence required for initial DNA interrogation.
  • This preordering is assumed to be thermodynamically advantageous for target binding, similar to the positioning of guide RNA reported in other small regulatory RNA processes, such as the bacterial Hfq protein–RNA complex and eukaryotic Argonaute-mediated RNA silencing.
  • Notably, in the type I CRISPR interference complex Cascade, the guide RNA is preordered throughout the entire crRNA, not just in the seed region. This is likely due to the helical assembly of the complex and the release of topological constraints by completely flipped-out nucleotides at every sixth position.
  • The PAM-interacting sites R1333 and R1335, which are responsible for 5′-NGG-3′ PAM recognition and disordered in the apo structure, are prepositioned prior to establishing contact with target DNA, demonstrating that sgRNA loading permits Cas9 to form a DNA recognition-competent structure.
  • Notably, despite the fact that the 5′ 10-nt nonseed RNA sequence is completely disordered in the sgRNA-bound crystal structure, the electron microscopy (EM) structure of SpyCas9 bound to a full-length sgRNA (EMD-3276) reveals that the 5′ end of the guide RNA lies within the cavity formed between the HNH and RuvC nuclease domains.
  • This structural observation shows that the 5′ end of sgRNA is shielded from degradation and that an additional conformational change is necessary to liberate the 5′ distal end from constraint during target DNA binding.

References

  • Jiang, F., & Doudna, J. A. (2017). CRISPR-Cas9 Structures and Mechanisms. Annual review of biophysics, 46, 505–529. https://doi.org/10.1146/annurev-biophys-062215-010822.
  • Wada, N., Ueta, R., Osakabe, Y. et al. Precision genome editing in plants: state-of-the-art in CRISPR/Cas9-based genome engineering. BMC Plant Biol 20, 234 (2020).
  • Nishimasu, Hiroshi et al. “Crystal structure of Cas9 in complex with guide RNA and target DNA.” Cell vol. 156,5 (2014): 935-49. doi:10.1016/j.cell.2014.02.001.
  • Zuo, Z., Liu, J. Structure and Dynamics of Cas9 HNH Domain Catalytic State. Sci Rep 7, 17271 (2017). https://doi.org/10.1038/s41598-017-17578-6
  • Mougiakos, I., Mohanraju, P., Bosma, E.F. et al. Characterizing a thermostable Cas9 for bacterial genome editing and silencing. Nat Commun 8, 1647 (2017). https://doi.org/10.1038/s41467-017-01591-4
  • Hu, J., Miller, S., Geurts, M. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018). https://doi.org/10.1038/nature26155.
  • Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., & Lim, W. A. (2013). Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell, 152(5), 1173–1183. https://doi.org/10.1016/j.cell.2013.02.022.
  • https://geneticeducation.co.in/cas9-protein-structure-function-types-and-importance/
