Type I CRISPR-Cas systems are found in bacteria and archaea, where they provide immunity against foreign DNA. The defining features of CRISPR-Cas types and subtypes are the Cas proteins, which are genetically and functionally highly diverse, reflecting the many biochemical functions they carry out during CRISPR-mediated immunity. The three major types of CRISPR-Cas systems are distinguished by their signature genes: cas3 in type I systems, cas9 (formerly csn1) in type II, and cas10 in type III. In type I and type II systems, but not in type III, the selection of protospacers in invading DNA depends on a protospacer-adjacent motif (PAM). Moreover, related studies have revealed a striking similarity in the organization of effector complexes between type I and type III systems, suggesting that they originated from a common ancestor.
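The signature-gene rule described above can be written as a small lookup. The following is only a minimal sketch: the function names `classify_locus` and `pam_dependent` are hypothetical, the input is assumed to be a plain list of gene names from one annotated locus, and real classification also weighs locus organization and Cas1 phylogeny.

```python
# Signature genes named in the text; csn1 is kept as the former name of cas9.
SIGNATURE_GENES = {"cas3": "I", "cas9": "II", "csn1": "II", "cas10": "III"}


def classify_locus(genes):
    """Return the sorted CRISPR-Cas type(s) suggested by signature genes in a locus."""
    found = set()
    for gene in genes:
        crispr_type = SIGNATURE_GENES.get(gene.lower())
        if crispr_type is not None:
            found.add(crispr_type)
    return sorted(found)


def pam_dependent(crispr_type):
    """Per the text, protospacer selection depends on a PAM in types I and II only."""
    return crispr_type in {"I", "II"}


print(classify_locus(["cas1", "cas2", "Cas3", "cas7"]))
print(pam_dependent("III"))
```

A locus that carries only cas1 and cas2 returns an empty list, which matches the point that those two genes are shared across types and cannot serve as signatures.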
CRISPR-based immunity proceeds in three distinct stages: adaptation, expression, and interference. Adaptation involves the acquisition of specific nucleotide sequence tags from invading DNA, particularly that of bacteriophages and plasmids; in their native context within the invader, these tags are referred to as protospacers. During periods of predation, protospacers are rapidly acquired and incorporated into the host genome, where they are subsequently referred to as spacers. Cas1 and Cas2, which form a complex that mediates the acquisition of new spacers, are the only proteins conserved among all CRISPR-Cas subtypes. The endonuclease activity of Cas1 is required for spacer integration, whereas Cas2 appears to perform a nonenzymatic function. In type I and type II systems, the protospacer-adjacent motif (PAM) plays important roles in both the spacer acquisition and interference stages.
In the second stage, expression, the CRISPR array is first transcribed from a promoter located within the CRISPR leader into a single long pre-crRNA, which is then cleaved into individual mature crRNAs by the Cas6 endonuclease (Type I and III systems) or by the ubiquitous RNase III enzyme (Type II systems). Processing is directed by characteristic secondary structures (hairpins) formed by Type I pre-crRNAs, or by a trans-activating RNA (tracrRNA) with homology to the direct repeat sequences in Type II systems. In type I, Cas6 is typically the active endonuclease responsible for crRNA processing, and Cas5 and Cas7 are non-catalytic RNA-binding proteins; in type I-C systems, however, crRNA processing is catalysed by Cas5. Once processed, crRNAs form complexes with specific Cas proteins, including the endonucleases responsible for attacking invading DNA.
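The cleavage of a pre-crRNA into mature crRNAs can be illustrated with a toy string model. This is a sketch under strong assumptions, not a description of the enzymology: the function name `process_pre_crrna`, the invented 12-nt repeat, and the `handle_len` parameter are all hypothetical, and the cut is placed at a fixed offset inside every repeat, whereas in real systems the position is set by the repeat structure recognised by the processing enzyme.

```python
def process_pre_crrna(pre_crrna, repeat, handle_len=4):
    """Cut a toy pre-crRNA once inside every repeat and return the internal fragments.

    Each fragment keeps the last `handle_len` nt of the upstream repeat,
    the spacer, and the remainder of the downstream repeat.
    """
    cut_offset = len(repeat) - handle_len
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + cut_offset)
        start = pre_crrna.find(repeat, start + len(repeat))
    # Only fragments bounded by two cuts are returned; the 5' and 3' ends are dropped.
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


# Invented sequences for illustration only.
repeat = "GGGAAACCCUUU"
spacers = ["AUAUAUAU", "CGCGCGCG"]
pre = repeat + spacers[0] + repeat + spacers[1] + repeat

for crrna in process_pre_crrna(pre, repeat):
    print(crrna)
```

Each returned string is a spacer flanked on both sides by part of a repeat, which is the general shape of a guide RNA made up of a spacer plus a portion of a repeat.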
Fig 1. Mechanism of Type I CRISPR-Cas interference Systems.
Fig 2. Cryo-electron microscopic structures of type I Cascades.
In type I and type III CRISPR defence systems, crRNA effector complexes mediate the interference stage. In Type I systems, crRNAs assemble with the multiprotein 'Cascade' complex (composed of Cas6, Cas8b, Cas7 and Cas5 in Clostridium), which is encoded downstream of the Type I CRISPR array, and base-pair with invader DNA, triggering nucleolytic attack by Cas3; in type III-A and type III-B systems, the corresponding complexes are known as Csm and Cmr, respectively. Depending on the subtype, Cascade may include Cas5, Cas6, Cas7 and Cas8. The site of nucleolytic attack differs between CRISPR-Cas types: in Type II systems, the Cas9 nuclease cleaves DNA 3 nucleotides upstream of the PAM, producing a double-stranded DNA break (DSB), whereas Cas3 nicks the PAM-complementary strand of invading DNA outside the region that interacts with the crRNA. The nicked target DNA is subsequently unwound and progressively degraded by Cas3.
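The position of the Type II cut relative to the PAM can be made concrete with a short sketch. Several things here are assumptions that the text does not state: the PAM pattern NGG, the 20-nt protospacer length, the function name `cas9_cut_sites`, and the restriction to a single strand read 5′ to 3′. Only the "3 nucleotides upstream of the PAM" rule comes from the paragraph above.

```python
import re


def cas9_cut_sites(dna, pam_regex="[ACGT]GG", protospacer_len=20):
    """Return (pam_start, cut_index) pairs for every PAM with room for a protospacer.

    cut_index is a slicing index: dna[:cut_index] and dna[cut_index:] are the
    two products, so exactly 3 nt lie between the cut and the PAM.
    """
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(f"(?=({pam_regex}))", dna):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            sites.append((pam_start, pam_start - 3))
    return sites


# Invented target: a 20-nt protospacer, an assumed NGG PAM, then flanking bases.
target = "ACTACTACTACTACTACTAC" + "TGG" + "CAT"
for pam_start, cut in cas9_cut_sites(target):
    print(pam_start, cut, target[:cut], target[cut:])
```

The Cas3 nick is deliberately left out of the sketch, because the text places it outside the crRNA-interacting region without giving a fixed offset.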
Cas3 protein: The signature gene, cas3, encodes a large protein with a helicase domain that has single-stranded DNA (ssDNA)-stimulated ATPase activity coupled to the unwinding of double-stranded DNA (dsDNA) and RNA–DNA duplexes. Often, but not always, the helicase domain is fused to an HD family domain, which has endonuclease activity and is involved in cleavage of the target DNA. The HD domain is typically located at the N-terminus of Cas3 proteins (with the exception of subtype I-U and several subtype I-A systems, in which the HD domain is at the carboxyl terminus of Cas3) or is encoded by a separate gene within the same locus as the cas3 helicase. In the latter case, the helicase is denoted cas3′ and the HD nuclease is denoted cas3″. In type II and type III systems, no Cas3 orthologue is involved.
Cascades: The effector complexes of type I CRISPR-Cas systems display elaborate architectures, with a backbone consisting of paralogous RAMPs (Repeat-Associated Mysterious Proteins), such as Cas7 and Cas5, which contain the RRM (RNA Recognition Motif) fold, together with additional 'large' and 'small' subunits. These effector complexes contain one Cas5 subunit and several Cas7 subunits. The complex accommodates the guide RNA, which consists of the spacer and a portion of a repeat. The Cas5 subunit binds the 5′ handle of the crRNA and interacts with the large subunit (Cas8). The small subunit is typically present in several copies and interacts with the crRNA backbone bound to Cas7. Notably, the length of the bound spacer correlates with the number of Cas7 subunits in the backbone of the complex. An additional RAMP, Cas6, is loosely associated with the effector complex and typically functions as the repeat-specific RNase in pre-crRNA processing.
Subtypes of the type I CRISPR-Cas system are grouped according to their operon organizations and the phylogeny of the respective Cas1 proteins. Type I systems are currently divided into seven subtypes, I-A to I-F and I-U, each of which has its own signature gene and distinct features of operon organization. In the case of subtype I-U, U stands for uncharacterized because the mechanism of pre-crRNA cleavage and the architecture of the effector complex for this system remain unknown. The type I-C, I-D, I-E and I-F CRISPR-Cas systems are typically encoded by a single operon that contains the cas1, cas2 and cas3 genes together with the genes for the subunits of the Cascade complex. Notably, cas4 is absent in I-E and I-F systems, and cas3 is fused to cas2 in I-F systems. Many type I-A and I-B loci seem to have a different organization in which cas genes are clustered in two or more operons. Subtype I-B is the most abundant CRISPR-Cas system represented in nature.
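The operon features listed in this paragraph can be collected into a small lookup for quick comparison between subtypes. This is an illustrative sketch only: the feature labels and the `describe` helper are hypothetical names, and the table records just what the paragraph states, so a missing label means "not stated here" and not "absent".

```python
SUBTYPES = ["I-A", "I-B", "I-C", "I-D", "I-E", "I-F", "I-U"]

# Feature label -> subtypes for which the text makes that statement.
FEATURES = {
    "single_operon": {"I-C", "I-D", "I-E", "I-F"},
    "often_multiple_operons": {"I-A", "I-B"},
    "lacks_cas4": {"I-E", "I-F"},
    "cas3_fused_to_cas2": {"I-F"},
    "uncharacterized": {"I-U"},
}


def describe(subtype):
    """Return the sorted feature labels that the text attaches to a subtype."""
    if subtype not in SUBTYPES:
        raise ValueError(f"unknown type I subtype: {subtype}")
    return sorted(label for label, members in FEATURES.items() if subtype in members)


for subtype in SUBTYPES:
    print(subtype, describe(subtype))
```

Keeping each feature as a set of subtypes, and not as a per-subtype record of booleans, avoids having to invent values for combinations the text does not mention.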
Fig 3. Classification of Type I CRISPR-Cas Systems.
The only protein that shows no significant sequence similarity between the subtypes is Cas8. However, the Cas8 sequence is highly diverged even within subtypes, so consistent application of the signature gene approach would result in numerous new subtypes. For example, there are at least 10 distinct Cas8b families within subtype I-B and at least 8 Cas8a families within subtype I-A. Thus, notwithstanding its complex evolution, we retain subtype I-B, which is best defined by the ancestral type I gene composition. The three main subdivisions within subtype I-B roughly correspond to the previously described subtypes Hmari, Tneap and Myxan, and can now be defined, with a few exceptions, through specific Cas8b families: Cas8b1 (for Hmari), Cas8b2 (for Tneap) and Cas8b3 (for Myxan).
1. Makarova et al. An updated evolutionary classification of CRISPR–Cas systems. Nat Rev Microbiol. 2015 November; 13(11): 722–736.
2. Makarova and Koonin. Annotation and Classification of CRISPR-Cas Systems. Methods Mol Biol. 2015; 1311: 47–75.
3. Makarova et al. Evolution and classification of the CRISPR-Cas systems. Nat Rev Microbiol. 2011 June; 9(6): 467–477.
4. Koonin et al. Diversity, classification and evolution of CRISPR-Cas systems. Curr Opin Microbiol. 2017 June; 37: 67–78.
5. Pyne et al. Harnessing heterologous and endogenous CRISPR-Cas machineries for efficient markerless genome editing in Clostridium. Scientific Reports. 2016 May 9; 6(25666): 1–15.