Cas Proteins

A variable cassette of so-called CRISPR-associated (cas) genes, which encode the Cas proteins, forms an important part of the CRISPR locus. The cas genes are located adjacent to the CRISPR array and encode all proteins necessary for mediating the adaptive immune response. They exhibit an exceptional degree of variation, which adds to the complexity of CRISPR-Cas systems. Dozens of cas gene products have been defined to date, of which several are generally conserved (such as Cas1 to Cas6) and only two (Cas1 and Cas2) are present in all CRISPR loci. Computational analyses of this large group of proteins have identified domains characteristic of several nucleases, a helicase, a polymerase and various RNA-binding proteins.

Structure of Cas Proteins

CRISPR-Cas systems have been classified by bioinformatic analyses into three main types (I, II, and III) and numerous subtypes, based on the organization of their cas genes and on the sequence and structure of the corresponding proteins. Each of the three types is characterized by a signature gene product: Cas3, a target-degrading nuclease/helicase, in type I; Cas9, an RNA-binding and target DNA-degrading nuclease, in type II; and Cas10, a large multifunctional protein, in type III. The three types also differ in the composition and mechanisms of their effector complexes. Type I effector complexes are termed Cascade; type II effector complexes consist of a single Cas9 protein and two RNA molecules; and type III effector complexes are further divided into type III-A (the Csm complex, which targets DNA) and type III-B (the Cmr complex, which targets RNA). Cas proteins are important components of the effector complexes in all CRISPR-Cas systems.
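The signature-gene rule described above can be sketched as a simple lookup. This is only an illustration of the three-type scheme in the text: the function name, the input format (a list of gene names) and the lowercase matching are assumptions, and real classification also weighs locus organization and protein sequence and structure.

```python
# Signature genes for the three main types, as given in the text.
SIGNATURE_GENES = {
    "cas3": "I",
    "cas9": "II",
    "cas10": "III",
}


def classify_locus(cas_genes):
    """Return the sorted list of CRISPR-Cas types whose signature gene is present.

    `cas_genes` is a hypothetical list of gene names annotated in one locus.
    An empty result means no signature gene was found.
    """
    found = {
        SIGNATURE_GENES[gene.lower()]
        for gene in cas_genes
        if gene.lower() in SIGNATURE_GENES
    }
    return sorted(found)


if __name__ == "__main__":
    # A locus carrying cas9 alongside the universal cas1 and cas2 genes.
    print(classify_locus(["cas1", "cas2", "cas9", "csn2"]))
```

A locus that returns more than one type, or none, would need manual inspection rather than this lookup.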


Fig 1. Structural organization of Cas9 protein.

Here we briefly introduce the Cas9 protein of the most widely used system, CRISPR-Cas9. In the apo state, Cas9 has two distinct lobes: the alpha-helical recognition (REC) lobe and the nuclease (NUC) lobe, which contains the conserved HNH domain and the split RuvC nuclease domain as well as the more variable C-terminal domain (CTD). The two lobes are connected by two linking segments, one formed by the arginine-rich bridge helix and the other by a disordered linker. Cas9 from Streptococcus pyogenes (SpyCas9) is a large (1,368 aa), multidomain and multifunctional DNA endonuclease. It cleaves dsDNA 3 bp upstream of the protospacer adjacent motif (PAM) using its two distinct nuclease domains: the HNH domain cleaves the target strand, and the RuvC domain cleaves the nontarget strand. In addition to its critical role in CRISPR interference, Cas9 also participates in crRNA maturation and spacer acquisition.
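The cut position relative to the PAM can be illustrated with a short sketch. It assumes the commonly cited 5'-NGG-3' PAM for SpyCas9 and a 20-nt protospacer, neither of which is stated in the text above, and it scans only the strand that is passed in. The function and constant names are hypothetical.

```python
import re

PROTOSPACER_LEN = 20  # assumption: typical SpyCas9 protospacer length
CUT_OFFSET = 3        # the cut lies 3 bp upstream of the PAM, as in the text


def find_cut_sites(dna):
    """List (protospacer, pam, cut_index) for every NGG PAM on the given strand.

    `cut_index` is 0-based: the predicted cut falls between
    dna[cut_index - 1] and dna[cut_index]. PAMs without a full-length
    protospacer upstream are skipped.
    """
    dna = dna.upper()
    sites = []
    # The lookahead lets overlapping PAM candidates be reported as well.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < PROTOSPACER_LEN:
            continue
        protospacer = dna[pam_start - PROTOSPACER_LEN:pam_start]
        sites.append((protospacer, match.group(1), pam_start - CUT_OFFSET))
    return sites


if __name__ == "__main__":
    sequence = "GATTACAGATTACAGATTACAGGTT"
    for protospacer, pam, cut in find_cut_sites(sequence):
        # Show the two fragments that a blunt cut at this position would produce.
        print(protospacer, pam, sequence[:cut], sequence[cut:])
```

To cover both strands, the same function could be run on the reverse complement of the sequence.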

Function of Cas Proteins

Different Cas proteins play important roles at different stages of CRISPR-Cas immunity. The Cas proteins can be divided into four distinct functional modules: adaptation (spacer acquisition); expression (crRNA processing and target binding); interference (target cleavage); and ancillary (regulatory and other CRISPR-associated functions). In recent years, a wealth of structural and functional information has accumulated for the core Cas proteins (Cas1 to Cas10), which allows them to be assigned to these modules.

The adaptation module: The adaptation module is largely uniform across CRISPR-Cas systems and consists of the Cas1 and Cas2 proteins, with possible additional involvement of the restriction endonuclease superfamily enzyme Cas4 and, in type II systems, Cas9. Cas1, which adopts a unique α-helical fold, is an integrase that mediates the insertion of new spacers into CRISPR arrays by cleaving specific sites within the repeats. Cas2 has been shown to form a complex with Cas1 in the Escherichia coli type I CRISPR-Cas system and is required for adaptation. However, although Cas2 has RNase and DNase activities, its catalytic residues are dispensable for adaptation, indicating that these activities are not directly involved in this process.

The expression module: The expression and interference modules are represented by multi-subunit crRNA effector complexes or, in type II systems, by a single large protein, Cas9. In the expression stage, pre-crRNA is bound to the multi-subunit crRNA effector complex, or to Cas9, and processed into a mature crRNA in a step catalysed by an RNA endonuclease (typically Cas6, in type I and type III systems) or an alternative mechanism that involves RNase III and a trans-activating CRISPR RNA (tracrRNA) (in type II systems). However, in at least one type II CRISPR-Cas system, that of Neisseria meningitidis, crRNAs with mature 5′ ends are directly transcribed from internal promoters, and crRNA processing does not occur.


Fig 2. Functional classification of Cas proteins.

The interference module: In the interference module, the crRNA effector complex (in type I and type III systems) or Cas9 (in type II systems) combines nuclease activity with dedicated RNA-binding domains. Target binding relies on base pair formation with the spacer region of the crRNA. Cleavage of the target is catalysed by the HD family nuclease (Cas3″ or a domain in Cas3) in type I systems, by the combined action of Cas7 and Cas10 proteins in type III systems or by Cas9 in type II systems. In type I systems, the HD nuclease domain is either fused to the superfamily 2 helicase Cas3′ or is encoded by a separate gene, cas3″, whereas in type III systems a distinct HD nuclease domain is fused to Cas10 and is thought to cleave ssDNA during interference. In type II systems, each of the nuclease domains of Cas9 (RuvC and HNH) cleaves one of the strands of the target DNA.
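The base-pairing step behind target binding can be sketched as a mismatch count between a crRNA spacer and a candidate target strand. This is a deliberately simplified model: it assumes strict Watson-Crick pairing (no wobble pairs, bulges or position-dependent tolerance), equal-length sequences written 5' to 3', and hypothetical function names.

```python
# RNA base -> DNA base it pairs with under strict Watson-Crick rules.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_mismatches(spacer_rna, target_dna):
    """Count non-pairing positions between a crRNA spacer and a target strand.

    Both sequences are given 5'->3'. Because the strands pair in
    antiparallel orientation, the target strand is read in reverse.
    """
    if len(spacer_rna) != len(target_dna):
        raise ValueError("spacer and target must have the same length")
    return sum(
        RNA_TO_DNA_PAIR[rna_base] != dna_base
        for rna_base, dna_base in zip(spacer_rna.upper(), reversed(target_dna.upper()))
    )


if __name__ == "__main__":
    # "TGTAATC" is the reverse complement of the DNA sequence GATTACA,
    # so the RNA spacer GAUUACA pairs with it at every position.
    print(count_mismatches("GAUUACA", "TGTAATC"))
```

A count of zero means the spacer is fully complementary to the target strand under these assumptions.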

The ancillary module: The ancillary module is a combination of various proteins and domains that, with the exception of Cas4, are much less common than the core Cas proteins in CRISPR-Cas systems. Aside from its putative role in adaptation, Cas4 is thought to contribute to CRISPR-Cas-coupled programmed cell death. Other notable components of the ancillary module include a diverse set of proteins containing the CRISPR-associated Rossmann fold (CARF) domain, and the inactivated P-loop ATPase Csn2, which forms a homotetrameric ring that accommodates linear dsDNA in its central hole. Csn2 is not required for interference but apparently has a role in spacer integration. Ancillary module genes are often found outside of CRISPR-cas loci, but the functions of these stand-alone genes have not been characterized in depth.
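The four-module scheme can be summarized as a small lookup table. The assignments below are a simplified reading of the descriptions above, not an authoritative annotation: proteins with putative or system-specific roles (for example Cas4 and Cas9 in adaptation) are listed without qualification, and the names `MODULES` and `modules_for` are hypothetical.

```python
# Module membership as described in the text (simplified).
MODULES = {
    "adaptation": {"Cas1", "Cas2", "Cas4", "Cas9"},
    "expression": {"Cas6", "Cas9"},
    "interference": {"Cas3", "Cas7", "Cas9", "Cas10"},
    "ancillary": {"Cas4", "Csn2"},
}


def modules_for(protein):
    """Return the alphabetically sorted modules in which a protein is listed."""
    return sorted(name for name, members in MODULES.items() if protein in members)


if __name__ == "__main__":
    # Cas9 spans several modules in type II systems.
    print(modules_for("Cas9"))
```

Storing membership per module rather than per protein makes it easy to see that one protein, such as Cas9 or Cas4, can belong to more than one module.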
