In vivo diversification of target genomic sites using processive base deaminase fusions blocked by dCas9

An E. coli reporter strain to test the BD-T7RNAP fusions

A scheme of the overall strategy followed in this study is shown in Fig. 1a. To measure both the mutagenic and transcriptional activity of BD-T7RNAP fusions, we designed a gfp-URA3 genetic cassette comprising two reporter genes in opposite orientations: a promoter-less gfp gene and the URA3 gene from Saccharomyces cerevisiae (Fig. 1b). Transcription of the URA3 gene was placed under the Ptac promoter recognized by E. coli RNAP. Yeast URA3 encodes the enzyme orotidine 5′-phosphate decarboxylase involved in the synthesis of uridine monophosphate (UMP) [25]. The activity of URA3, and that of the E. coli orthologue pyrF, allows cell growth in the absence of uracil in the medium (positive selection). In addition, URA3 expression makes yeast and E. coli cells sensitive to 5-fluoroorotic acid (FOA), allowing selection of null mutants (negative selection or counterselection) [26,27]. To enable specific recruitment of BD-T7RNAP fusions, the promoter PT7 was placed downstream of URA3, in reverse orientation to the coding sequence of URA3 but in the same orientation as the promoter-less gfp gene (Fig. 1b). Thus, expression of GFP acts as a reporter of the transcriptional activity of BD-T7RNAP fusions. The gfp-URA3 cassette, flanked by transcriptional terminators (T1 and T0), was integrated in the chromosome of E. coli K-12, replacing the flu gene [28,29]. The E. coli K-12 strain used for integration (MG1655*ΔpyrF) was derived from the reference strain MG1655 [30] and has a deletion of pyrF and a corrected pyrE gene (Supplementary Information) that eliminates a natural mutation reducing its expression [31] (Supplementary Fig. 1a). The pyrE gene is required for the efficient incorporation of FOA to produce the toxic 5-FUMP. Deletion of pyrF makes bacteria auxotrophic for uracil and resistant to FOA (Supplementary Fig. 1b). The final reporter strain with the integrated cassette, named MG*-URA3, grows well in mineral media (M9) lacking uracil and is highly sensitive to FOA (Supplementary Fig. 1b). Lastly, an ung deletion mutant was obtained from this strain (MG*-URA3Δung).

Fig. 1: Graphic summary of the mutagenesis system development.

a Schematic representation of the mutagenic process. The T7 RNA polymerase fusion (BD-T7RNAP, blue shape joined by a black line to a purple elliptical shape) specifically binds the T7 promoter (i), initiates transcription, and moves along the target gene (yellow filled arrow) carrying the base deaminase (BD) that introduces mutations (red stripes) in the gene (ii). The fusion stops and detaches from the DNA when it encounters a dCas9 molecule (translucent ovoid shape) bound to a specific sequence determined by the CRISPR RNA (crRNA, purple line). The trans-activating CRISPR RNA (tracrRNA, blue line) is also required for this process (iii). b Representation of the chromosomally integrated reporter gfp-URA3 cassette used to test the mutagenesis system. The genes gfp and URA3 are represented with green and yellow filled arrows, respectively. Thin arrows indicate the promoters tac (Ptac) and T7 (PT7); lollipops indicate terminators T0 and T1.

Mutagenic activity of different CD-T7RNAP fusions

First, we investigated different N- and C-terminal fusions of the human CD AID [32] connected to T7RNAP with a flexible peptide linker (G3S)7, some of them including N-terminal thioredoxin 1 (TrxA) and a cytosolic version of the maltose-binding protein (MBPc). Native T7RNAP and these fusions were cloned under the control of the tetracycline-inducible promoter (TetR-PtetA) [33] in a low copy-number plasmid (pSEVA221) [34] and expressed in MG*-URA3Δung. These experiments revealed that all N-terminal fusions to T7RNAP produce a transcriptionally active polypeptide in E. coli with similar mutagenic activity (Supplementary Notes, and Supplementary Figs. 2 and 3a). We chose the AID-T7RNAP fusion, lacking any additional protein partner, to continue our work. Then, we constructed similar N-terminal fusions with other CDs, namely pmCDA1 and rAPOBEC1, in the same vector (Fig. 2a). MG*-URA3Δung strains carrying plasmids encoding native T7RNAP, AID-T7RNAP, pmCDA1-T7RNAP, rAPOBEC1-T7RNAP, or pSEVA221 (negative control) were grown and induced with anhydrotetracycline (aTc). Western blot analysis of whole-cell protein extracts revealed slightly higher expression levels of pmCDA1-T7RNAP and rAPOBEC1-T7RNAP than of AID-T7RNAP, and overexpression of native T7RNAP (Fig. 2b). Flow cytometry analysis showed similar levels of GFP in bacteria encoding AID, pmCDA1, and rAPOBEC1 fusions, roughly half of those found in bacteria with T7RNAP (Fig. 2c and Supplementary Fig. 3b). Thus, all CD-T7RNAP fusions were expressed and transcriptionally active in E. coli.

Fig. 2: Expression and mutagenic activity of AID-T7RNAP, pmCDA1-T7RNAP, and rAPOBEC1-T7RNAP.

a Scheme of the different CDs fused to the T7RNAP by the linker (G3S)7. b Expression of the different fusions determined by western blot analysis of the cell extracts from induced cultures of the strain MG*-URA3∆ung. A representative immunoblot is shown from two independent experiments with similar results. c Processivity of the fusions assessed by flow cytometry analysis to detect expression of GFP in the induced cultures. The gating strategy and the corresponding pseudocolor plots are shown in Supplementary Fig. 3b. d, e Mutagenic activity of the different fusions in URA3 (d) and rpoB (e), using MG*-URA3, MG*-URA3∆ung, and MG*-URA3∆ung∆PT7 as hosts. The histograms d and e show the single values (black dots), the means (bars), and standard errors (lines) of at least three independent experiments (n = 3, except for the strain MG*-URA3∆ung, n = 6). The statistical analysis was done using a two-tailed Mann–Whitney test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

Next, we determined the mutant frequency in URA3 (on-target) and rpoB (off-target) upon induction of native T7RNAP, the AID, pmCDA1, and rAPOBEC1 fusions, and pSEVA221, in three different E. coli strains: MG*-URA3 (ung+), MG*-URA3Δung, and MG*-URA3ΔungΔPT7 (lacking the PT7 in URA3). Bacteria were plated on M9 + uracil and M9 + uracil + FOA to determine the URA3 mutant frequency for each strain as the ratio of FOAR colony forming units (CFU ml⁻¹) vs. total CFU ml⁻¹ (Fig. 2d). The frequency of URA3 mutants in MG*-URA3Δung bacteria was ~10⁻⁶ for the control (pSEVA221), ~10⁻⁵ for native T7RNAP, and increased to ~10⁻³ for AID-T7RNAP, to ~5 × 10⁻² for rAPOBEC1-T7RNAP, and up to ~10⁻¹ for pmCDA1-T7RNAP (Fig. 2d). Thus, all CD fusions have a clear mutagenic activity over the control strain (≥1000-fold) in the Δung mutant, with the rAPOBEC1 and pmCDA1 fusions having a higher activity than the AID fusion (~50- and ~100-fold, respectively). All CD fusions showed a much lower mutagenic activity in the ung+ strain (<1% of that in Δung; Fig. 2d). Importantly, the frequency of URA3 mutants for all CD fusions and native T7RNAP dropped to background levels in the strain lacking the T7 promoter (MG*-URA3ΔungΔPT7 strain; Fig. 2d). Notably, induction of native T7RNAP increased the URA3 mutation rate ~10-fold in a PT7-dependent manner, but irrespective of UNG (Fig. 2d). Bacteria from these cultures were also plated on rifampicin (Rif)-containing plates to evaluate the specificity of the fusions. RifR colonies of E. coli are known to contain mutations in rpoB, encoding the β-subunit of E. coli RNAP [35]. Thus, RifR was used to determine the off-target rpoB mutant frequency as the ratio of RifR CFU ml⁻¹ vs. total CFU ml⁻¹ (Fig. 2e). The frequency of spontaneous rpoB mutants of the control strain (~10⁻⁷) was only mildly increased (~2- to 5-fold) by the expression of the AID fusion, and ~10- to 20-fold with the more active pmCDA1 and rAPOBEC1 fusions (Fig. 2e). Hence, the mutagenic activity of CD-T7RNAP fusions in URA3 requires the presence of PT7, is more efficient in the Δung mutant, and has a strong preference for the target URA3 gene vs. off-target rpoB.
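
As a worked illustration of how such mutant frequencies are computed, the following Python sketch converts colony counts into CFU per ml and takes the ratio of resistant to total CFU. Only the definition of the frequency (resistant CFU per ml vs. total CFU per ml) comes from the text; the function names, colony counts, dilution factors, and plated volume are hypothetical values chosen for the example.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate CFU per ml of the original culture from one plate count."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def mutant_frequency(resistant_cfu_per_ml: float, total_cfu_per_ml: float) -> float:
    """Ratio of resistant (e.g. FOA-resistant) CFU per ml to total CFU per ml."""
    if total_cfu_per_ml <= 0:
        raise ValueError("total CFU per ml must be positive")
    return resistant_cfu_per_ml / total_cfu_per_ml


# Hypothetical plate counts (not data from the study).
total = cfu_per_ml(colonies=150, dilution_factor=1e6, plated_volume_ml=0.1)
resistant = cfu_per_ml(colonies=150, dilution_factor=1e3, plated_volume_ml=0.1)

frequency = mutant_frequency(resistant, total)

# Fold increase over a hypothetical background frequency of 1e-6.
fold_over_background = frequency / 1e-6

print(f"mutant frequency: {frequency:.1e}")
print(f"fold over background: {fold_over_background:.0f}")
```

With these invented counts the frequency comes out at about 10⁻³, i.e. roughly 1000-fold over the assumed background, which is the kind of comparison made in Fig. 2d.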

Mutagenic activity of TadA*-T7RNAP fusion

To broaden the mutagenic capacity of the system, a fusion was constructed with TadA* [19] (Fig. 3a). TadA* deaminates adenines in DNA, generating inosines that lead to A:T > G:C transitions. TadA*-T7RNAP was expressed in MG*-URA3Δung at higher levels than AID-T7RNAP, as determined by western blot (Fig. 3b), but produced only a slightly higher level of GFP (Fig. 3c and Supplementary Fig. 3c). The mutagenic capacity of TadA*-T7RNAP was evaluated in different genetic backgrounds, using AID-T7RNAP and pSEVA221 as positive and negative controls, respectively (Fig. 3d). Expression of TadA*-T7RNAP generated URA3 mutants with a frequency of ~2–5 × 10⁻⁴ (~100-fold that of the control) in both ung+ and Δung strains, indicating that TadA* activity is independent of UNG (Fig. 3d). The gene nfi encodes the endonuclease V of E. coli, which eliminates inosines [36,37]. When TadA*-T7RNAP was expressed in Δnfi mutants (MG*-URA3Δnfi and MG*-URA3ΔungΔnfi), the frequency of URA3 mutants increased to ~10⁻³, similar to that of AID-T7RNAP in Δung (Fig. 3d). Deletion of nfi had no effect on AID-T7RNAP (Fig. 3d). Notably, expression of TadA*-T7RNAP did not produce any significant increase in the levels of off-target mutagenesis in rpoB (Fig. 3e). In addition, the mutagenic activity of TadA*-T7RNAP in URA3 requires the presence of PT7, dropping to background levels in MG*-URA3ΔungΔPT7 (Fig. 3f). These data demonstrate that the TadA*-T7RNAP fusion has a specific mutagenic activity for target DNA carrying PT7. This activity is independent of UNG and increases moderately when endonuclease V is absent.

Fig. 3: Expression and activity of TadA*-T7RNAP.

a Scheme of the fusion TadA*-T7RNAP with the linker (G3S)7. b Expression of the fusion TadA*-T7RNAP in comparison to AID-T7RNAP determined by western blot analysis of cell extracts from induced cultures. A representative immunoblot is shown from two independent experiments with similar results. c Processivity of the fusions assessed by flow cytometry analysis to detect expression of GFP in the induced cultures. The gating strategy and the corresponding pseudocolor plots are shown in Supplementary Fig. 3c. d, e Mutagenic activity of the AID- and TadA*-T7RNAP fusions in URA3 (d) and rpoB (e), using MG*-URA3, MG*-URA3∆ung, MG*-URA3∆nfi, and MG*-URA3∆ung∆nfi as hosts (n = 3 independent experiments). f URA3 mutant frequency when TadA*-T7RNAP is expressed in MG*-URA3∆ung and MG*-URA3∆ung∆PT7 (pSEVA221 n = 4, TadA*-T7RNAP n = 6). The histograms (d–f) show the single values (black dots), the means (bars), and standard errors (lines) of multiple independent experiments. The statistical analysis was done using a two-tailed Student’s t test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

Characterization of the mutations

We randomly picked 30 FOAR colonies (URA3 mutants) from each of the MG*-URA3Δung strains expressing native T7RNAP, AID, pmCDA1, and rAPOBEC1 fusions, and 30 additional FOAR colonies from MG*-URA3ΔungΔnfi expressing TadA*-T7RNAP. The chromosomal Ptac-URA3-PT7 region from these colonies was amplified by PCR and sequenced. As a control, the same region was sequenced from 13 FOA-sensitive (nonmutant) colonies from MG*-URA3Δung (pSEVA221), which showed no mutations relative to the wild-type allele. In contrast, all FOAR colonies expressing CD-T7RNAP fusions had multiple C:G > T:A transitions in both DNA strands along Ptac and the URA3 gene, but none in PT7 (Fig. 4a). URA3 alleles from bacteria expressing TadA*-T7RNAP contained A:T > G:C transitions in both DNA strands, except for a single C:G > T:A transition (Fig. 4b). No other types of mutations, deletions, or insertions were observed in any URA3 alleles from BD-T7RNAP fusions. This was not the case in FOAR colonies expressing native T7RNAP, which contained different types of mutations, including transitions, transversions, deletions, and insertions in URA3 (Supplementary Fig. 4). In correlation with the mutagenic capacity of the BD-T7RNAP fusions, the highest total number of mutations was found with pmCDA1-T7RNAP (426), followed by rAPOBEC1-T7RNAP (95), AID-T7RNAP (42), and TadA*-T7RNAP (37) (Fig. 4c). The average number of mutations per clone presented the same hierarchy: pmCDA1-T7RNAP (14.2) > rAPOBEC1-T7RNAP (3.2) > AID-T7RNAP (1.4) > TadA*-T7RNAP (1.2) (Fig. 4d). For all CD-T7RNAP fusions, G to A transitions were detected more frequently than C to T in the URA3 coding strand (Fig. 4a, c), indicating a higher mutation rate of Cs in the noncoding strand of URA3, which corresponds to the non-template strand for the CD-T7RNAP fusions (Supplementary Fig. 5). This bias toward the non-template strand of T7RNAP is less pronounced for AID (62%) than for rAPOBEC1 (74%) or pmCDA1 (91%) fusions (Fig. 4c). For the TadA* fusion, we also found a bias favoring T to C mutations in the coding strand of URA3 (84%), corresponding to A to G mutations in the non-template strand of T7RNAP (Fig. 4c and Supplementary Fig. 5). Therefore, DNA sequencing of URA3 mutants demonstrates that BD-T7RNAP fusions induce the expected mutations with a bias toward the non-template strand.
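
The per-clone averages quoted above follow directly from the total mutation counts and the 30 colonies sequenced per fusion. The short Python sketch below reproduces that arithmetic and shows one way a strand-bias percentage could be computed. The totals and the colony number are taken from the text; the function names are ours, and the counts passed to the bias function are toy numbers, since the per-strand counts are not given here.

```python
COLONIES_SEQUENCED = 30  # FOA-resistant colonies sequenced per fusion (from the text)

# Total mutations per fusion as reported in the text (Fig. 4c).
total_mutations = {
    "pmCDA1-T7RNAP": 426,
    "rAPOBEC1-T7RNAP": 95,
    "AID-T7RNAP": 42,
    "TadA*-T7RNAP": 37,
}


def mean_mutations_per_clone(total: int, clones: int = COLONIES_SEQUENCED) -> float:
    """Average number of mutations per sequenced clone."""
    if clones <= 0:
        raise ValueError("number of clones must be positive")
    return total / clones


averages = {
    name: round(mean_mutations_per_clone(count), 1)
    for name, count in total_mutations.items()
}


def non_template_bias(non_template_count: int, template_count: int) -> float:
    """Percentage of mutations assigned to the non-template strand."""
    total = non_template_count + template_count
    if total == 0:
        raise ValueError("no mutations to classify")
    return 100.0 * non_template_count / total


# Toy counts for illustration only (not data from the study).
toy_bias = non_template_bias(non_template_count=3, template_count=1)

for name, value in averages.items():
    print(f"{name}: {value} mutations per clone")
print(f"toy strand bias: {toy_bias:.0f}%")
```

Rounded to one decimal, the computed averages match the values reported for Fig. 4d.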

Fig. 4: Characterization of URA3 mutations found in FOAR colonies expressing BD-T7RNAP fusions.

a Number of mutations per nucleotide identified in the URA3 locus from 30 FOAR colonies isolated from each MG*-URA3Δung strain expressing the indicated CD-T7RNAP fusions and b from the MG*-URA3ΔungΔnfi strain expressing the TadA*-T7RNAP fusion. The promoters Ptac and T7 are shown with green and red arrow heads, respectively, and delimited by dashed lines. The gene URA3 is shown with a yellow filled arrow. The indicated base changes correspond to the coding sequence of URA3. Different base substitutions found are labeled with the color codes on the right. A single G to A transition found with TadA*-T7RNAP is labeled with an asterisk. c Total number of mutations for each BD-T7RNAP fusion indicating the base substitutions found. d Average number of mutations per clone found in the FOAR colonies analyzed for each of the indicated BD-T7RNAP fusions. Single values are represented with blue dots and means and standard errors with black lines. e, f Variant calling analysis of a 200 bp region of URA3 after its massive DNA sequencing (ca. 10⁶ reads). The number of reads with different variants vs. total reads are represented with circles (empty plasmid) and squares (AID-T7RNAP or TadA*-T7RNAP). The lines represent the means and the standard errors from each group. The statistical analysis was done using a two-tailed Mann–Whitney test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

For deeper analysis, a 284 bp PCR fragment of the URA3 gene was amplified from an induced culture of MG*-URA3Δung expressing AID-T7RNAP without FOA selection, and was subjected to massive next-generation sequencing (NGS). The same region was subjected to NGS from a control MG*-URA3Δung (pSEVA221). Comparison of the variant call analysis from the two samples (ca. 1 × 10⁶ reads/sample) indicated that only the C:G > T:A transition appeared in a statistically significant higher number in bacteria expressing AID-T7RNAP (Fig. 4e and Supplementary Fig. 6). We used the same approach to analyze the URA3 mutations induced by TadA*-T7RNAP in MG*-URA3ΔungΔnfi compared with the same strain carrying pSEVA221. In this case, only A:T > G:C transitions were detected in a statistically significant higher number in the sample expressing TadA*-T7RNAP (Fig. 4f and Supplementary Fig. 7). Hence, the massive DNA sequencing data are consistent with the DNA sequencing results of individual URA3 mutants, and confirm that BD-T7RNAP fusions generate only the expected mutations.

We also confirmed the adequacy of the RifR phenotype as a reporter of the off-target activity of CD and AD fusions. The RifR phenotype is mostly caused by rpoB mutations in a region between amino acids 507 and 687 of the β-subunit of E. coli RNAP known as the RIF-resistance determining region (RRDR) [38]. We sequenced the rpoB-RRDR segment in 30 RifR colonies from strains MG*-URA3Δung and MG*-URA3Δnfi carrying pSEVA221. Among other mutations, C to T and G to A transitions were readily found in rpoB-RRDR from RifR MG*-URA3Δung (Supplementary Fig. 8a). Similarly, RifR MG*-URA3Δnfi contained diverse A to G and T to C transitions in rpoB-RRDR (Supplementary Fig. 8b). Consequently, different bases within rpoB-RRDR that are sensitive to CD and AD mutagenesis can generate RifR mutants, validating rpoB as a reporter to assess the off-target activity of BDs.

Protection of downstream regions using dCas9

The BD-T7RNAP fusions were able to transcribe the gfp gene downstream of URA3 (relative to the direction of T7RNAP elongation), potentially generating mutations beyond URA3. To confirm this, we inserted sacB from Bacillus subtilis in the gfp-URA3 cassette as an additional counterselection gene [39]. B. subtilis sacB codes for the exoenzyme levansucrase, which utilizes sucrose to produce toxic levan that accumulates in the periplasm of E. coli, killing the bacterium. A Ptac-sacB fragment was cloned in gfp-URA3 and the new sacB-gfp-URA3 cassette (Fig. 5a) was integrated, replacing the chromosomal flu gene, in MG*ΔpyrFΔungΔnfi. The resulting strain, MG*-SacB-URA3ΔungΔnfi, was sensitive to sucrose with a frequency of spontaneous mutants of ~5 × 10⁻⁶. When AID-T7RNAP was expressed in this strain, the frequency of sacB mutants increased to ~6 × 10⁻⁴ (Supplementary Fig. 9), whereas that of URA3 (~1.7 × 10⁻³) was similar to that observed previously (Fig. 2d). This confirms that BD-T7RNAP fusions are able to mutate regions downstream of the target gene.

Fig. 5: Blocking AID-T7RNAP elongation and mutagenic activity by dCas9 and crRNAs.

a Scheme of the reporter cassette with sacB integrated in the MG*-SacB-URA3∆ung∆nfi strain. Thin arrows indicate the tac (Ptac) and T7 (PT7) promoters, lollipops indicate terminators T0 and T1; red lines mark targeting sequences of the crRNAs a, b, and c; filled arrows indicate the genes sacB (orange), gfp (green), and URA3 (yellow). b Representation of the dCas9 blocking activity showing one crRNA (in red) targeting the non-template strand relative to T7RNAP transcription. The direction of the progression of the fusion along the DNA is indicated with a gray arrow. The mutagenic protein fusion (AID-T7RNAP) is displaced from the transcription bubble (dashed arrow) by bound dCas9/crRNA. The orange shape represents the dCas9, the blue shape joined by a black line to a purple elliptical shape represents the fusion AID-T7RNAP, the red RNA molecule represents the crRNA, and the green RNA molecule represents the tracrRNA. c Relative GFP levels measured by flow cytometry of bacteria from strain MG*-SacB-URA3∆ung∆nfi expressing AID-T7RNAP and dCas9 in the absence (−) or presence of crRNA arrays b.a and b.a.c. The histogram shows the percentages of mean fluorescence intensities (MFI) for each condition relative to the strain lacking crRNAs. Background GFP fluorescence signals from this strain with pdCas9 and the empty vector pSEVA221 are subtracted from all values (n = 3 independent experiments). d Ratio of mutagenesis of sacB vs. URA3 in bacteria MG*-SacB-URA3∆ung∆nfi expressing AID-T7RNAP and dCas9 in the absence (−) or presence of crRNA arrays b.a and b.a.c. The ratio found in bacteria lacking crRNAs is set to 1 (control n = 6, b.a n = 6, b.a.c n = 3). For c and d, the histograms represent the single values (blue dots), the relative means (bars), and standard errors (black lines) from multiple independent experiments. The statistical analysis was done using a two-tailed Student’s t test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

To delimit the mutagenic activity within the target gene, we investigated the possibility of blocking elongation of BD-T7RNAP with dCas9. dCas9 can bind a target DNA sequence using crRNAs (or gRNAs) and has been used as a transcriptional repressor for the endogenous E. coli RNAP [40]. Co-expression of two gRNAs targeting the non-template strand of a gene enhances the transcriptional repression of E. coli RNAP by dCas9 [41]. Therefore, we designed two crRNAs (a and b) against the non-template strand of gfp (Fig. 5a, b), generating a double crRNA array (b.a). Plasmid pdCas9 [40] was used for the constitutive expression of dCas9, the trans-activating CRISPR RNA (tracrRNA), and the designed crRNA arrays. MG*-SacB-URA3ΔungΔnfi strains carrying pdCas9b.a, or control pdCas9, were transformed with the plasmid encoding AID-T7RNAP or the pSEVA221 control. After growth and aTc induction, we determined that GFP expression by AID-T7RNAP was reduced by the double crRNAb.a to ~30% of that produced in the isogenic strain lacking crRNA (Fig. 5c). Basal levels of GFP were detected in strains carrying pSEVA221 and were considered 0. Protection of sacB from AID-T7RNAP mutagenesis was assessed by the ratio of the mutant frequency in sacB vs. URA3 for each strain. This ratio was normalized to 1 for the strain expressing AID-T7RNAP and carrying pdCas9 without crRNAs (Fig. 5d). In concordance with the reduction of GFP expression, the mutagenesis of sacB dropped ~10-fold when the double crRNAb.a was expressed (Fig. 5d). Then, we tested whether the blockade of AID-T7RNAP could be enhanced with a triple (b.a.c) crRNA array (Fig. 5a). Expression of dCas9 and crRNAb.a.c repressed the levels of GFP to ~20% of those found with pdCas9 lacking crRNA, and the mutant frequency in sacB vs. URA3 decreased ~14-fold (Fig. 5c, d). Therefore, dCas9 directed by crRNAs is able to hamper the progression and mutagenesis of BD-T7RNAP fusions along the DNA, with a triple crRNA array being more effective.
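
The normalization described above (sacB vs. URA3 mutant-frequency ratio, set to 1 for the strain without crRNAs) can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs: the function names and all frequencies below are hypothetical example values, not measurements from Fig. 5d.

```python
def sacb_vs_ura3_ratio(sacb_frequency: float, ura3_frequency: float) -> float:
    """Ratio of the sacB mutant frequency to the URA3 mutant frequency."""
    if ura3_frequency <= 0:
        raise ValueError("URA3 mutant frequency must be positive")
    return sacb_frequency / ura3_frequency


def normalize_to_control(test_ratio: float, control_ratio: float) -> float:
    """Express a ratio relative to the no-crRNA control, which is defined as 1."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return test_ratio / control_ratio


# Hypothetical mutant frequencies (illustrative only).
control_ratio = sacb_vs_ura3_ratio(sacb_frequency=6e-4, ura3_frequency=1.7e-3)
blocked_ratio = sacb_vs_ura3_ratio(sacb_frequency=6e-5, ura3_frequency=1.7e-3)

relative_ratio = normalize_to_control(blocked_ratio, control_ratio)
fold_reduction = 1.0 / relative_ratio

print(f"relative sacB/URA3 ratio: {relative_ratio:.2f}")
print(f"fold reduction in sacB mutagenesis: {fold_reduction:.0f}")
```

Dividing by the URA3 frequency corrects for differences in overall mutagenic activity between cultures, so that a drop in the normalized ratio reflects protection of the downstream sacB gene rather than lower activity of the fusion.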

To evaluate whether dCas9 can be used to protect a particular region within a gene, we designed a new triple crRNA array (d.e.f) targeting URA3 (Fig. 6a). Cultures of the MG*-SacB-URA3ΔungΔnfi strain expressing AID-T7RNAP and dCas9 with crRNAd.e.f (targeting URA3), or crRNAb.a.c (targeting gfp) as a control, were induced and the URA3 alleles from 30 FOAR colonies from each culture were sequenced. As expected, mutations in FOAR clones expressing crRNAb.a.c (targeting gfp) were found distributed all along URA3, whereas mutations in FOAR clones expressing crRNAd.e.f (targeting URA3) were only found in the gene segment between the recognition sites of the crRNAs and PT7 (Fig. 6b). Interestingly, although mutations in URA3 were not detected distal to the hybridization site of crRNA d, we found mutations in the regions between the hybridization sites of crRNAs d and e and of crRNAs e and f (Fig. 6b), as well as in the URA3 segment proximal to PT7. Collectively, these results demonstrate that targeted dCas9 blockade with a triple crRNA array can be used to concentrate the mutagenic activity of BD-T7RNAP fusions on a specific target gene or gene segment, reducing the mutagenesis of downstream DNA regions.

Fig. 6: Protection from mutagenesis of a delimited segment of the target gene by dCas9 guided with crRNAs.

a Scheme of the URA3 cassette with the recognition sites (red lines) of the triple crRNAs b.a.c and d.e.f. Thin arrows indicate the tac (Ptac) and T7 (PT7) promoters, lollipop shapes indicate terminators T0 and T1. The genes gfp and URA3 are represented with green and yellow filled arrows, respectively. b URA3 mutations found in FOAR colonies of induced cultures of MG*-SacB-URA3ΔungΔnfi expressing AID-T7RNAP and the corresponding triple crRNAs. For each culture the URA3 alleles from 30 colonies were sequenced. The indicated base changes correspond to the coding sequence of URA3. Different base substitutions found are labeled with the color code on the right. The promoters Ptac and T7 are shown with green and red arrow heads, respectively. The gene URA3 is shown with a yellow filled arrow. Red rectangles over the URA3 gene mark the targeting sequences of the crRNAs d, e, and f, which are delimited with dashed lines. Source data are provided as a Source data file.

Fast directed evolution of the TEM-1 gene

To test this in vivo mutagenesis system in a directed evolution process, we chose the antibiotic resistance gene TEM-1 as proof-of-principle. This gene encodes the TEM-1 β-lactamase that confers resistance to penicillins, cephalosporins, and related β-lactams [42]. The evolution of this enzyme has been extensively documented due to its clinical relevance, with the description of >170 variants, some of them having increased resistance to third-generation cephalosporins, such as ceftazidime (CAZ) [42]. We planned to evolve TEM-1 in vivo to obtain variants that confer resistance to CAZ. To this end, the URA3 gene in the sacB-gfp-URA3 cassette was replaced by TEM-1, and the new sacB-gfp-TEM-1 cassette was inserted, replacing flu, in the chromosome of MG*ΔpyrFΔungΔnfi (Fig. 7a). The resulting strain (MG*-SacB-TEM-1ΔungΔnfi) was transformed with plasmids encoding AID-T7RNAP and dCas9-crRNAb.a.c (to protect downstream genes). Three independent transformants were cultured separately and subjected to two iterative cycles of growth and induction with aTc (Fig. 7b). As controls of non-induced evolution (i.e., spontaneous TEM-1 mutation), three colonies from the same strain carrying the empty vectors (pSEVA221 and pdCas9) were grown and induced in parallel. At the end of each induction cycle, serial dilutions of each culture were plated on LB agar with increasing concentrations of CAZ (0, 1, 4, and 16 μg ml⁻¹) to calculate the frequency of CAZR-mutants at each concentration, expressed as the ratio of CAZR CFU ml⁻¹ vs. total CFU ml⁻¹ (Fig. 7c). After one cycle, the frequency of CAZR-mutants in cultures expressing AID-T7RNAP and dCas9-crRNAb.a.c was ~6.5 × 10⁻⁵ at 1 μg ml⁻¹, and ~3 × 10⁻⁷ at 4 μg ml⁻¹ of CAZ, with no resistant colonies detected at 16 μg ml⁻¹. In contrast, spontaneous CAZR-mutants in control cultures appeared at ~2000-fold lower frequency at 1 μg ml⁻¹ of CAZ, and no resistant colonies appeared at any higher concentration (Fig. 7c). Interestingly, in bacterial cultures expressing AID-T7RNAP and subjected to two cycles of growth and induction, the frequency of CAZR-mutants at 1 and 4 μg ml⁻¹ of CAZ increased by ~10-fold (to ~5 × 10⁻⁴ and ~2 × 10⁻⁶, respectively) and CAZR colonies arose at the highest CAZ concentration (16 μg ml⁻¹) with a frequency of ~2 × 10⁻⁷ (Fig. 7c). In contrast, after two cycles of growth, spontaneous CAZR-mutants were not detected in control cultures at 4 or 16 μg ml⁻¹ of CAZ, and CAZR clones only appeared at low frequency (~1 × 10⁻⁷) in plates containing 1 μg ml⁻¹ of CAZ. These data indicate that AID-T7RNAP was producing mutations within TEM-1 at a rate significantly above that of spontaneous mutation (>1000-fold/mutagenic cycle), with mutations accumulating in each cycle and generating variants with different levels of resistance to CAZ.

Fig. 7: TEM-1 evolution using the mutagenesis system.

a Scheme of the cassette with the TEM-1 gene in the strain MG*-SacB-TEM-1ΔungΔnfi. Thin arrows indicate the tac (Ptac) and T7 (PT7) promoters, lollipop shapes indicate terminators T0 and T1. The genes sacB, gfp, and URA3 are represented with orange, green, and yellow filled arrows, respectively. Red lines mark targeting sequences of the crRNAs a, b, and c. b Scheme of the continuous evolution process with iterative cycles of mutagenesis induction. Transformed colonies of MG*-SacB-TEM-1∆ung∆nfi with the plasmids pdCas9b.a.c and pSEVA221AID-T7RNAP were grown overnight (O/N) in LB with Cm and Km at 37 °C with shaking (250 r.p.m.). The next day, the cultures were diluted 1:100 in fresh medium and incubated under the same conditions for 2 h. Then anhydrotetracycline (aTc; 200 ng ml⁻¹) was added for induction and the cultures were incubated for 1 h. To start the second cycle of mutagenesis, the induced cultures were diluted 1:100 in new medium and grown O/N, repeating the same steps as above. To monitor the mutagenic process, after every induction the cultures were washed with 1× PBS and serially diluted before being plated on LB agar alone and with increasing concentrations of ceftazidime (CAZ; 1, 4, and 16 µg ml⁻¹). The strain MG*-SacB-TEM-1∆ung∆nfi with the plasmids pdCas9 and pSEVA221 was used as a control of the spontaneous mutations occurring in TEM-1. c Frequency of resistance to increasing concentrations of ceftazidime (CAZ) after each cycle in the cytosine deaminase-induced cultures of MG*-SacB-TEM-1ΔungΔnfi pdCas9b.a.c pSEVA221AID-T7RNAP (CD). As negative control (C−), the strain MG*-SacB-TEM-1ΔungΔnfi with the plasmids pdCas9 and pSEVA221 was used. The histogram shows the single values (color coded dots), means (bars), and standard errors (black lines) from three independent cultures for each strain (n = 3). The statistical analysis was done using a two-tailed paired t test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

To characterize the TEM-1 variants from the evolved cultures expressing AID-T7RNAP, we sequenced TEM-1 alleles from 12 CAZR colonies (four from each triplicate culture) isolated at 1 μg ml⁻¹ CAZ after cycle 1, and from 12 CAZR colonies isolated at 16 μg ml⁻¹ CAZ after cycle 2. This revealed that all mutations found in the TEM-1 alleles were those expected from the CD activity of AID, with G > A transitions being more frequent than C > T in the coding strand of TEM-1 (Table 1). All sequenced TEM-1 variants from 1 μg ml⁻¹ CAZ had mutations in residue R164 (R164C or R164H), and only two clones contained additional mutations (A249V and G267R). In contrast, all TEM-1 variants selected at 16 μg ml⁻¹ CAZ after two cycles invariably contained the R164H and E104K mutations, frequently associated with additional mutations (e.g., V33I and A150T). Mutations R164H and E104K are reported to provide increased resistance to CAZ [42]. We confirmed this by determining the minimal inhibitory concentration (MIC) to CAZ of mutant R1.10 (R164H) [43] and of mutant R2.2 (E104K and R164H) [44] (Table 2), demonstrating that TEM-1 variants with increased resistance to CAZ were obtained in a fast and continuous manner using this in vivo mutagenesis system.

Table 1 TEM-1 mutations in ceftazidime resistant (CAZR) clones.
Table 2 Minimal inhibitory concentration (MIC) to ceftazidime (CAZ) of parental strain and evolved mutants.

Original Text (This is the original text for your reference.)

An E. coli reporter strain to test the BD-T7RNAP fusions

A scheme of the overall strategy followed in this study is shown in Fig. 1a. To measure both the mutagenic and transcriptional activity of BD-T7RNAP fusions, we designed a gfp-URA3 genetic cassette comprising two gene reporters in reverse orientation: a promoter-less gfp gene and the URA3 gene from Saccharomyces cerevisiae (Fig. 1b). Transcription of the URA3 gene was placed under the Ptac promoter recognized by E. coli RNAP. Yeast URA3 encodes the enzyme orotidine 5′-phosphate decarboxylase involved in the synthesis of uridine monophosphate (UMP)25. The activity of URA3, and that of the E. coli orthologue pyrF, allows cell growth in the absence of uracil in the medium (positive selection). In addition, URA3 expression makes yeast and E. coli cells sensitive to 5-fluoroorotic acid (FOA), allowing selection of null mutants (negative selection or counterselection)26,27. To enable specific recruitment of BD-T7RNAP fusions, the promoter PT7 was placed downstream of URA3, in reversed orientation to the coding sequence of URA3, but in the same orientation as the promoter-less gfp gene (Fig. 1b). Thus, expression of GFP acts as a reporter of the transcriptional activity of BD-T7RNAP fusions. The gfp-URA3 cassette, flanked by transcriptional terminators (T1 and T0), was integrated in the chromosome of E. coli K-12 replacing the flu gene28,29. The E. coli K-12 strain used for integration (MG1655*ΔpyrF) was derived from the reference strain MG1655 (ref. 30) and has a deletion of pyrF and a corrected pyrE gene (Supplementary Information) that eliminates a natural mutation reducing its expression31 (Supplementary Fig. 1a). The pyrE gene is required for the efficient incorporation of FOA to produce the toxic 5-FUMP. Deletion of pyrF makes bacteria auxotrophic for uracil and resistant to FOA (Supplementary Fig. 1b). The final reporter strain with the integrated cassette, named MG*-URA3, grows well in mineral media (M9) lacking uracil and is highly sensitive to FOA (Supplementary Fig. 1b). Lastly, an ung deletion mutant was obtained from this strain (MG*-URA3Δung).

Fig. 1: Graphic summary of the mutagenesis system development.

a Schematic representation of the mutagenic process. The T7 RNA polymerase fusion (BD-T7RNAP, blue shape joined by a black line to a purple elliptical shape) specifically binds the T7 promoter (i), initiating transcription and moving along the target gene (yellow filled arrow) carrying the base deaminase (BD) that introduces mutations (red stripes) in the gene (ii). The fusion stops and detaches from the DNA when it encounters a dCas9 molecule (translucent ovoid shape) bound to a specific sequence determined by the CRISPR RNA (crRNA, purple line). The trans-activating CRISPR RNA (tracrRNA, blue line) is also required for this process (iii). b Representation of the chromosomally integrated reporter gfp-URA3 cassette to test the mutagenesis system. The genes gfp and URA3 are represented with green and yellow filled arrows, respectively. Thin arrows indicate the promoters tac (Ptac) and T7 (PT7), lollipops indicate terminators T0 and T1.

Mutagenic activity of different CD-T7RNAP fusions

First, we investigated different N- and C-terminal fusions of human CD AID32 connected with a flexible peptide linker (G3S)7 to T7RNAP, and some including N-terminal thioredoxin 1 (TrxA) and a cytosolic version of the maltose-binding protein (MBPc). Native T7RNAP and these fusions were cloned under the control of the tetracycline-inducible promoter (TetR-PtetA)33 in a low copy-number plasmid (pSEVA221)34 and expressed in MG*-URA3Δung. These experiments revealed that all N-terminal fusions to T7RNAP produce a transcriptionally active polypeptide in E. coli with similar mutagenic activity (Supplementary Notes, and Supplementary Figs. 2 and 3a). We chose the AID-T7RNAP fusion, lacking any additional protein partner, to continue our work. Then, we constructed similar N-terminal fusions with other CDs, namely pmCDA1 and rAPOBEC1, in the same vector (Fig. 2a). MG*-URA3Δung strains carrying plasmids encoding native T7RNAP, AID-T7RNAP, pmCDA1-T7RNAP, rAPOBEC1-T7RNAP, or pSEVA221 (negative control), were grown and induced with anhydrotetracycline (aTc). Western blot analysis of whole-cell protein extracts revealed slightly higher expression levels of pmCDA1-T7RNAP and rAPOBEC1-T7RNAP than AID-T7RNAP, and the overexpression of native T7RNAP (Fig. 2b). Flow cytometry analysis showed similar levels of GFP in bacteria encoding AID, pmCDA1, and rAPOBEC1 fusions, roughly half of those found in bacteria with T7RNAP (Fig. 2c and Supplementary Fig. 3b). Thus, all CD-T7RNAP fusions were expressed and transcriptionally active in E. coli.

Fig. 2: Expression and mutagenic activity of AID-T7RNAP, pmCDA1-T7RNAP, and rAPOBEC1-T7RNAP.

a Scheme of the different CDs fused to the T7RNAP by the linker (G3S)7. b Expression of the different fusions determined by western blot analysis of the cell extracts from induced cultures of the strain MG*-URA3∆ung. A representative immunoblot is shown from two independent experiments with similar results. c Processivity of the fusions assessed by flow cytometry analysis to detect expression of GFP in the induced cultures. The gating strategy and the corresponding pseudocolor plots are shown in Supplementary Fig. 3b. d, e Mutagenic activity in URA3 (d) and rpoB (e) of the different fusions using as hosts MG*-URA3, MG*-URA3∆ung, and MG*-URA3∆ung∆PT7. The histograms d and e show the single values (black dots), the means (bars), and standard errors (lines) of at least three independent experiments (n = 3, except for the strain MG*-URA3∆ung, n = 6). The statistical analysis was done using a two-tailed Mann–Whitney test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

Next, we determined the mutant frequency in URA3 (on-target) and rpoB (off-target) upon induction of native T7RNAP, AID, pmCDA1, rAPOBEC1 fusions, and pSEVA221, in three different E. coli strains: MG*-URA3 (ung+), MG*-URA3Δung, and MG*-URA3ΔungΔPT7 (lacking the PT7 in URA3). Bacteria were plated on M9 + uracil and M9 + uracil + FOA to determine the URA3 mutant frequency for each strain as the ratio of FOAR colony forming units (CFU ml−1) vs. total CFU ml−1 (Fig. 2d). The frequency of URA3 mutants in MG*-URA3Δung bacteria was ~10^−6 for the control (pSEVA221), ~10^−5 for native T7RNAP, and increased to ~10^−3 for AID-T7RNAP, to ~5 × 10^−2 for rAPOBEC1-T7RNAP, and up to ~10^−1 for pmCDA1-T7RNAP (Fig. 2d). Thus, all CD fusions have a clear mutagenic activity over the control strain (≥1000-fold) in the Δung mutant, with rAPOBEC1 and pmCDA1 fusions having a higher activity than the AID fusion (~50- and ~100-fold, respectively). All CD fusions showed a much lower mutagenic activity in the ung+ strain (<1% that in Δung; Fig. 2d). Importantly, the frequency of URA3 mutants for all CD fusions and native T7RNAP dropped to background levels in the strain lacking the T7 promoter (MG*-URA3ΔungΔPT7 strain; Fig. 2d). Notably, induction of native T7RNAP increased the URA3 mutation rate ~10-fold in a PT7-dependent manner, but irrespective of UNG (Fig. 2d). Bacteria from these cultures were also plated on rifampicin (Rif)-containing plates to evaluate the specificity of the fusions. RifR colonies of E. coli are known to contain mutations in rpoB, encoding the β-subunit of E. coli RNAP35. Thus, RifR was used to determine the off-target rpoB mutant frequency as the ratio of RifR CFU ml−1 vs. total CFU ml−1 (Fig. 2e). The frequency of spontaneous rpoB mutants of the control strain (~10^−7) was only mildly increased (~2- to 5-fold) by the expression of the AID fusion, and ~10- to 20-fold with the more active pmCDA1 and rAPOBEC1 fusions (Fig. 2e). Hence, the mutagenic activity of CD-T7RNAP fusions in URA3 requires the presence of PT7, is more efficient in the Δung mutant, and has a strong preference for the target URA3 gene vs. off-target rpoB.
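The mutant frequency used throughout is a simple ratio of resistant to total CFU per ml. A minimal Python sketch of that arithmetic follows. The plate counts, dilution factors, and function names are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a colony count on one plate into CFU per ml of undiluted culture."""
    return colonies * dilution_factor / plated_volume_ml


def mutant_frequency(resistant_cfu_ml, total_cfu_ml):
    """Ratio of resistant (e.g. FOA-resistant) CFU/ml to total CFU/ml."""
    if total_cfu_ml <= 0:
        raise ValueError("total CFU/ml must be positive")
    return resistant_cfu_ml / total_cfu_ml


# Hypothetical plate counts (not data from the study).
total = cfu_per_ml(colonies=150, dilution_factor=1e6, plated_volume_ml=0.1)      # M9 + uracil
resistant = cfu_per_ml(colonies=150, dilution_factor=1e3, plated_volume_ml=0.1)  # M9 + uracil + FOA

freq = mutant_frequency(resistant, total)

# Order of magnitude reported in the text for the pSEVA221 control.
control_freq = 1e-6
fold_over_control = freq / control_freq

print(f"mutant frequency: {freq:.1e}; fold over control: {fold_over_control:.0f}")
```

The same ratio applies to the off-target measurement, with RifR CFU ml−1 in place of FOAR CFU ml−1.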

Mutagenic activity of TadA*-T7RNAP fusion

To broaden the mutagenic capacity of the system, a fusion was constructed with TadA*19 (Fig. 3a). TadA* deaminates adenines in DNA, generating inosines that lead to A:T > G:C transitions. TadA*-T7RNAP was expressed in MG*-URA3Δung at higher levels than AID-T7RNAP, as determined by western blot (Fig. 3b), but produced only a slightly higher level of GFP (Fig. 3c and Supplementary Fig. 3c). The mutagenic capacity of TadA*-T7RNAP was evaluated in different genetic backgrounds, using AID-T7RNAP and pSEVA221 as positive and negative controls, respectively (Fig. 3d). Expression of TadA*-T7RNAP generated URA3 mutants with a frequency of ~2–5 × 10^−4 (~100-fold that of the control) in both ung+ and Δung strains, indicating that TadA* activity is independent of UNG (Fig. 3d). The gene nfi encodes the endonuclease V of E. coli, which eliminates inosines36,37. When TadA*-T7RNAP was expressed in Δnfi mutants (MG*-URA3Δnfi and MG*-URA3ΔungΔnfi), the frequency of URA3 mutants increased to ~10^−3, similar to that of AID-T7RNAP in Δung (Fig. 3d). Deletion of nfi had no effect on AID-T7RNAP (Fig. 3d). Notably, expression of TadA*-T7RNAP did not produce any significant increase in the levels of off-target mutagenesis in rpoB (Fig. 3e). In addition, the mutagenic activity of TadA*-T7RNAP in URA3 requires the presence of PT7, dropping to background levels in MG*-URA3ΔungΔPT7 (Fig. 3f). These data demonstrate that the TadA*-T7RNAP fusion has a specific mutagenic activity for the target DNA having PT7. This activity is independent of UNG and increases moderately when endonuclease V is absent.

Fig. 3: Expression and activity of TadA*-T7RNAP.

a Scheme of the fusion TadA*-T7RNAP with the linker (G3S)7. b Expression of the fusion TadA*-T7RNAP in comparison to AID-T7RNAP determined by western blot analysis of cell extracts from induced cultures. A representative immunoblot is shown from two independent experiments with similar results. c Processivity of the fusions assessed by flow cytometry analysis to detect expression of GFP in the induced cultures. The gating strategy and the corresponding pseudocolor plots are shown in Supplementary Fig. 3c. d, e Mutagenic activity of the AID- and TadA*-T7RNAP fusions in URA3 (d) and rpoB (e), using as hosts MG*-URA3, MG*-URA3∆ung, MG*-URA3∆nfi, and MG*-URA3∆ung∆nfi (n = 3 independent experiments). f URA3 mutant frequency when TadA*-T7RNAP is expressed in MG*-URA3∆ung and MG*-URA3∆ung∆PT7 (pSEVA221 n = 4, TadA*-T7RNAP n = 6). The histograms (d–f) show the single values (black dots), the means (bars), and standard errors (lines) of multiple independent experiments. The statistical analysis was done using a two-tailed Student’s t test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

Characterization of the mutations

We randomly picked 30 FOAR colonies (URA3 mutants) from each of the MG*-URA3Δung strains expressing native T7RNAP, AID, pmCDA1, and rAPOBEC1 fusions, and 30 additional FOAR colonies from MG*-URA3ΔungΔnfi expressing TadA*-T7RNAP. The chromosomal Ptac-URA3-PT7 region from these colonies was amplified by PCR and sequenced. As a control, the identical region was sequenced from 13 FOA-sensitive (nonmutant) colonies from MG*-URA3Δung (pSEVA221), showing no mutation from the wild-type allele. In contrast, all FOAR colonies expressing CD-T7RNAP fusions had multiple C:G > T:A transitions in both DNA strands along Ptac and the URA3 gene, but none in PT7 (Fig. 4a). URA3 alleles from bacteria expressing TadA*-T7RNAP contained A:T > G:C transitions in both DNA strands, except for a single C:G > T:A transition (Fig. 4b). No other types of mutations, deletions, or insertions were observed in any URA3 alleles from BD-T7RNAP fusions. This was not the case in FOAR colonies expressing native T7RNAP, which contained different types of mutations, including transitions, transversions, deletions, and insertions in URA3 (Supplementary Fig. 4). Consistent with the mutagenic capacity of the BD-T7RNAP fusions, the highest total number of mutations was found with pmCDA1-T7RNAP (426) followed by rAPOBEC1-T7RNAP (95), AID-T7RNAP (42) and TadA*-T7RNAP (37) (Fig. 4c). The average number of mutations per clone presented the same hierarchy: pmCDA1-T7RNAP (14.2) > rAPOBEC1-T7RNAP (3.2) > AID-T7RNAP (1.4) > TadA*-T7RNAP (1.2) (Fig. 4d). For all CD-T7RNAP fusions, G to A transitions were detected more frequently than C to T in the URA3 coding strand (Fig. 4a, c), indicating a higher mutation rate of Cs in the noncoding strand of URA3, which corresponds to the non-template strand for the CD-T7RNAP fusions (Supplementary Fig. 5). This bias toward the non-template strand of T7RNAP is less pronounced for AID (62%) than for rAPOBEC1 (74%) or pmCDA1 (91%) fusions (Fig. 4c). For the TadA* fusion, we also found a bias favoring T to C mutations in the coding strand of URA3 (84%), corresponding to A to G mutations in the non-template strand of T7RNAP (Fig. 4c and Supplementary Fig. 5). Therefore, DNA sequencing of URA3 mutants demonstrates that BD-T7RNAP fusions induce the expected mutations with a bias toward the non-template strand.
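The per-clone averages and the strand assignment described above can be reproduced with a short Python sketch. The mutation totals and the number of sequenced clones are taken from the text. The helper names are illustrative, and the G>A/C>T split passed to the bias function at the end is a hypothetical example, not the reported per-substitution counts.

```python
# Totals and clone numbers as reported in the text.
CLONES_SEQUENCED = 30
total_mutations = {
    "pmCDA1-T7RNAP": 426,
    "rAPOBEC1-T7RNAP": 95,
    "AID-T7RNAP": 42,
    "TadA*-T7RNAP": 37,
}

# Average number of mutations per sequenced clone, rounded to one decimal.
mean_per_clone = {
    fusion: round(total / CLONES_SEQUENCED, 1)
    for fusion, total in total_mutations.items()
}

# Map a substitution read on the URA3 coding strand to the strand that was
# deaminated. In this cassette the URA3 noncoding strand is the non-template
# strand for T7RNAP, which transcribes in the opposite direction to URA3.
DEAMINATED_STRAND = {
    "C>T": "template",      # cytosine deaminated on the URA3 coding strand
    "G>A": "non-template",  # cytosine deaminated on the URA3 noncoding strand
    "A>G": "template",      # adenine deaminated on the URA3 coding strand
    "T>C": "non-template",  # adenine deaminated on the URA3 noncoding strand
}


def non_template_bias(counts):
    """Percentage of transitions assigned to the T7RNAP non-template strand.

    `counts` maps coding-strand substitutions (e.g. "G>A") to their numbers.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mutations to classify")
    non_template = sum(
        n for change, n in counts.items()
        if DEAMINATED_STRAND[change] == "non-template"
    )
    return 100 * non_template / total


for fusion, mean in mean_per_clone.items():
    print(f"{fusion}: {mean} mutations per clone")

# Hypothetical split of 42 mutations, for illustration only.
example_bias = non_template_bias({"G>A": 26, "C>T": 16})
print(f"{example_bias:.0f}% of transitions on the non-template strand")
```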

Fig. 4: Characterization of URA3 mutations found in FOAR colonies expressing BD-T7RNAP fusions.

a Number of mutations per nucleotide identified in the URA3 locus from 30 FOAR colonies isolated from each MG*-URA3Δung strain expressing the indicated CD-T7RNAP fusions and b from MG*-URA3ΔungΔnfi strain expressing TadA*-T7RNAP fusion. The promoters Ptac and T7 are shown with green and red arrow heads, respectively, and delimited by dashed lines. The gene URA3 is shown with a yellow filled arrow. The indicated base changes correspond to the coding sequence of URA3. Different base substitutions found are labeled with the color codes on the right. A single G to A transition found with TadA*-T7RNAP is labeled with an asterisk. c Total number of mutations for each BD-T7RNAP fusion indicating the base substitutions found. d Average number of mutations per clone found in the FOAR colonies analyzed for each of the indicated BD-T7RNAP fusions. Single values are represented with blue dots and means and standard errors with black lines. e, f Variant calling analysis of a 200 bp region of URA3 after massive DNA sequencing (ca. 10^6 reads). The number of reads with different variants vs. total reads are represented with circles (empty plasmid) and squares (AID-T7RNAP or TadA*-T7RNAP). The lines represent the means and the standard errors from each group. The statistical analysis was done using a two-tailed Mann–Whitney test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

For deeper analysis, a 284 bp PCR fragment of the URA3 gene was amplified from an induced culture of MG*-URA3Δung expressing AID-T7RNAP without FOA selection, and was subjected to massive next-generation sequencing (NGS). The same region was subjected to NGS from a control MG*-URA3Δung (pSEVA221). Comparison of the variant call analysis from the two samples (ca. 1 × 10^6 reads/sample) indicated that only the transition C:G > T:A appeared in a statistically significant higher number in bacteria expressing AID-T7RNAP (Fig. 4e and Supplementary Fig. 6). We used the same approach to analyze the URA3 mutations induced by TadA*-T7RNAP in MG*-URA3ΔungΔnfi compared with the same strain carrying pSEVA221. In this case, only transitions A:T > G:C were detected in a statistically significant higher number in the sample expressing TadA*-T7RNAP (Fig. 4f and Supplementary Fig. 7). Hence, the massive DNA sequencing data are consistent with the DNA sequencing results of individual URA3 mutants, and confirm that BD-T7RNAP fusions generate only the expected mutations.

We also confirmed the adequacy of the RifR phenotype as a reporter of the off-target activity of CD and AD fusions. The RifR phenotype is mostly caused by rpoB mutations in a region between amino acids 507 and 687 of the β-subunit of E. coli RNAP known as the rifampicin resistance-determining region (RRDR)38. We sequenced the rpoB-RRDR segment in 30 RifR colonies from strains MG*-URA3Δung and MG*-URA3Δnfi carrying pSEVA221. Among other mutations, transitions C to T and G to A were readily found in rpoB-RRDR from RifR MG*-URA3Δung (Supplementary Fig. 8a). Similarly, RifR MG*-URA3Δnfi contained diverse transitions A to G and T to C in rpoB-RRDR (Supplementary Fig. 8b). Consequently, different bases within rpoB-RRDR that are sensitive to CD and AD mutagenesis can generate RifR mutants, validating rpoB to assess the off-target activity of BDs.

Protection of downstream regions using dCas9

The BD-T7RNAP fusions were able to transcribe the gfp gene downstream of URA3 (relative to the direction of T7RNAP elongation), potentially generating mutations beyond URA3. To confirm this, we inserted sacB from Bacillus subtilis in the gfp-URA3 cassette as an additional counterselection gene39. B. subtilis sacB codes for the exoenzyme levansucrase, which utilizes sucrose to produce toxic levan that accumulates in the periplasm of E. coli, killing the bacterium. A Ptac-sacB fragment was cloned in gfp-URA3 and the new sacB-gfp-URA3 cassette (Fig. 5a) was integrated replacing the chromosomal flu gene in MG*ΔpyrFΔungΔnfi. The resulting strain, MG*-SacB-URA3ΔungΔnfi, was sensitive to sucrose with a frequency of spontaneous mutants of ~5 × 10^−6. When AID-T7RNAP was expressed in this strain, the frequency of sacB mutants increased to ~6 × 10^−4 (Supplementary Fig. 9), whereas that of URA3 (~1.7 × 10^−3) was similar to that observed previously (Fig. 2d). This confirms that BD-T7RNAP fusions are able to mutate regions downstream of the target gene.

Fig. 5: Blocking AID-T7RNAP elongation and mutagenic activity by dCas9 and crRNAs.

a Scheme of the reporter cassette with sacB integrated in MG*-SacB-URA3∆ung∆nfi strain. Thin arrows indicate the tac (Ptac) and T7 (PT7) promoters, lollipops indicate terminators T0 and T1; red lines mark targeting sequences of the crRNAs a, b, and c; filled arrows indicate the genes sacB (orange), gfp (green), and URA3 (yellow). b Representation of the dCas9 blocking activity showing one crRNA (in red) targeting the non-template strand relative to T7RNAP transcription. The direction of the progression of the fusion along the DNA is indicated with a gray arrow. The mutagenic protein fusion (AID-T7RNAP) is displaced from the transcription bubble (dashed arrow) by bound dCas9/crRNA. The orange shape represents the dCas9, the blue shape joined by a black line to a purple elliptical shape represents the fusion AID-T7RNAP, the red RNA molecule represents the crRNA and the green RNA molecule represents the tracrRNA. c Relative GFP levels measured by flow cytometry of bacteria from strain MG*-SacB-URA3∆ung∆nfi expressing AID-T7RNAP and dCas9 in the absence (−) or presence of crRNA arrays b.a and b.a.c. The histogram shows the percentages of mean fluorescence intensities (MFI) for each condition relative to the strain lacking crRNAs. Background GFP fluorescence signals from this strain with pdCas9 and the empty vector pSEVA221 are subtracted from all values (n = 3 independent experiments). d Ratio of mutagenesis of sacB vs. URA3 in bacteria MG*-SacB-URA3∆ung∆nfi expressing AID-T7RNAP and dCas9 in the absence (−) or presence of crRNA arrays b.a and b.a.c. The ratio found in bacteria lacking crRNAs is considered 1 (control n = 6, b.a n = 6, b.a.c n = 3). For c and d, the histograms represent the single values (blue dots), the relative means (bars), and standard errors (black lines) from multiple independent experiments. The statistical analysis was done using a two-tailed Student’s t test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

To delimit the mutagenic activity within the target gene, we investigated the possibility of blocking elongation of BD-T7RNAP with dCas9. dCas9 can bind a target DNA sequence using crRNAs (or gRNAs) and has been used as a transcriptional repressor for the endogenous E. coli RNAP40. Co-expression of two gRNAs targeting the non-template strand of a gene enhances the transcription repression of E. coli RNAP by dCas9 (ref. 41). Therefore, we designed two crRNAs (a and b) against the non-template strand of gfp (Fig. 5a, b), generating a double crRNA array (b.a). Plasmid pdCas9 (ref. 40) was used for the constitutive expression of dCas9, the trans-activating CRISPR RNA (tracrRNA), and the designed crRNA arrays. MG*-SacB-URA3ΔungΔnfi strains carrying pdCas9b.a, or control pdCas9, were transformed with the plasmid encoding AID-T7RNAP or pSEVA221 control. After growth and aTc induction, we determined that GFP expression by AID-T7RNAP was reduced by the double crRNAb.a to ~30% of that produced in the isogenic strain lacking crRNA (Fig. 5c). Basal levels of GFP were detected in strains carrying pSEVA221 and were considered 0. Protection of sacB from AID-T7RNAP mutagenesis was assessed by the ratio of the mutant frequency in sacB vs. URA3 for each strain. This ratio was normalized as 1 for the strain expressing AID-T7RNAP and carrying pdCas9 without crRNAs (Fig. 5d). In concordance with the reduction of GFP expression, the mutagenesis of sacB dropped ~10-fold when the double crRNAb.a was expressed (Fig. 5d). Then, we tested whether the blockade of AID-T7RNAP could be enhanced with a triple (b.a.c) crRNA array (Fig. 5a). Expression of dCas9 and crRNAb.a.c repressed the levels of GFP to ~20% of those found with pdCas9 lacking crRNA, and the mutant frequency in sacB vs. URA3 decreased ~14-fold (Fig. 5c, d). Therefore, dCas9 directed with crRNAs is able to hamper the progression and mutagenesis of BD-T7RNAP fusions along the DNA, with a triple crRNA array being more effective.
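The normalization used to quantify protection of sacB can be sketched as follows. This is an illustrative Python example, not the analysis code of the study. The frequencies for the no-crRNA condition are the approximate values reported earlier for AID-T7RNAP in MG*-SacB-URA3ΔungΔnfi, used here as stand-ins, and the sacB frequency with the crRNA array is a hypothetical number.

```python
def sacb_to_ura3_ratio(sacb_freq, ura3_freq):
    """Ratio of sacB (downstream) to URA3 (target) mutant frequencies."""
    if ura3_freq <= 0:
        raise ValueError("URA3 mutant frequency must be positive")
    return sacb_freq / ura3_freq


def normalized_ratio(sample_ratio, control_ratio):
    """Express a ratio relative to the no-crRNA control, which is set to 1."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return sample_ratio / control_ratio


# Stand-in values for the control without crRNA (approximate frequencies
# reported in the text) and a hypothetical sacB frequency with a crRNA array.
control = sacb_to_ura3_ratio(sacb_freq=6e-4, ura3_freq=1.7e-3)
with_array = sacb_to_ura3_ratio(sacb_freq=6e-5, ura3_freq=1.7e-3)

relative = normalized_ratio(with_array, control)
fold_protection = 1 / relative

print(f"relative sacB/URA3 mutagenesis: {relative:.2f}")
print(f"fold protection: {fold_protection:.0f}")
```

Dividing by the URA3 frequency controls for differences in overall mutagenic activity between cultures, so the normalized value reflects only how much of the activity reaches the downstream gene.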

To evaluate whether dCas9 can be used to protect a particular region within a gene, we designed a new triple crRNA array (d.e.f) targeting URA3 (Fig. 6a). Cultures of MG*-SacB-URA3ΔungΔnfi strain expressing AID-T7RNAP and dCas9 with crRNAd.e.f (targeting URA3), or crRNAb.a.c (targeting gfp) as a control, were induced and the URA3 alleles from 30 FOAR colonies from each culture were sequenced. As expected, mutations in FOAR clones expressing crRNAb.a.c (targeting gfp) were found distributed all along URA3, whereas mutations in FOAR clones expressing crRNAd.e.f (targeting URA3) were only found in the gene segment between the recognition sites of the crRNAs and PT7 (Fig. 6b). Interestingly, although mutations in URA3 were not detected beyond the hybridization site of crRNA d, we found mutations in regions between the hybridization sites of crRNAs d to e and e to f (Fig. 6b), as well as in the URA3 segment proximal to PT7. Collectively, these results demonstrate that targeted dCas9 blockade with a triple crRNA array can be used to concentrate the mutagenesis activity of BD-T7RNAP fusions to a specific target gene or gene segment, reducing the mutagenesis of downstream DNA regions.

Fig. 6: Protection from mutagenesis of a delimited segment of the target gene by dCas9 guided with crRNAs.

a Scheme of the URA3 cassette with the recognition sites (red lines) of the triple crRNAs b.a.c and d.e.f. Thin arrows indicate the tac (Ptac) and T7 (PT7) promoters, lollipop shapes indicate terminators T0 and T1. The genes gfp and URA3 are represented with green and yellow filled arrows, respectively. b URA3 mutations found in FOAR colonies of induced cultures of MG*-SacB-URA3ΔungΔnfi expressing AID-T7RNAP and the corresponding triple crRNAs. For each culture the URA3 alleles from 30 colonies were sequenced. The indicated base changes correspond to the coding sequence of URA3. Different base substitutions found are labeled with the color code on the right. The promoters Ptac and T7 are shown with green and red arrow heads, respectively. The gene URA3 is shown with a yellow filled arrow. Red rectangles over the URA3 gene mark targeting sequences of the crRNAs d, e, and f, and are delimited with dashed lines. Source data are provided as a Source data file.

Fast directed evolution of the TEM-1 gene

To test this in vivo mutagenesis system in a directed evolution process, we chose the antibiotic resistance gene TEM-1 as proof-of-principle. This gene encodes the TEM-1 β-lactamase that confers resistance to penicillins, cephalosporins, and related β-lactams42. The evolution of this enzyme has been extensively documented due to its clinical relevance, with the description of >170 variants, some of them having increased resistance to third-generation cephalosporins, such as ceftazidime (CAZ)42. We planned to evolve TEM-1 in vivo to obtain variants that confer resistance to CAZ. To this end, the URA3 gene in the sacB-gfp-URA3 cassette was replaced by TEM-1, and the new sacB-gfp-TEM-1 cassette was inserted, replacing flu, in the chromosome of MG*ΔpyrFΔungΔnfi (Fig. 7a). The resulting strain (MG*-SacB-TEM-1ΔungΔnfi) was transformed with plasmids encoding AID-T7RNAP and dCas9-crRNAb.a.c (to protect downstream genes). Three independent transformants were cultured separately and subjected to two iterative cycles of growth and induction with aTc (Fig. 7b). As controls of non-induced evolution (i.e., spontaneous TEM-1 mutation), three colonies from the same strain carrying the empty vectors (pSEVA221 and pdCas9) were grown and induced in parallel. At the end of each induction cycle, serial dilutions of each culture were plated on LB agar with increasing concentrations of CAZ (0, 1, 4, and 16 μg ml−1) to calculate the frequency of CAZR-mutants at each concentration, expressed as the ratio of CAZR CFU ml−1 vs. total CFU ml−1 (Fig. 7c). After one cycle, the frequency of CAZR-mutants in cultures expressing AID-T7RNAP and dCas9-crRNAb.a.c was ~6.5 × 10^−5 at 1 μg ml−1, and ~3 × 10^−7 at 4 μg ml−1 of CAZ, with no resistant colonies detected at 16 μg ml−1. In contrast, spontaneous CAZR-mutants in control cultures appeared at ~2000-fold lower frequency at 1 μg ml−1 of CAZ, and no resistant colonies appeared at any higher concentration (Fig. 7c). Interestingly, in bacterial cultures expressing AID-T7RNAP and subjected to two cycles of growth and induction, the frequency of CAZR-mutants at 1 and 4 μg ml−1 of CAZ increased by ~10-fold (~5 × 10^−4 and ~2 × 10^−6, respectively) and CAZR colonies arose at the highest CAZ concentration (16 μg ml−1) with a frequency of ~2 × 10^−7 (Fig. 7c). In contrast, after two cycles of growth, spontaneous CAZR-mutants were not detected in control cultures at 4 or 16 μg ml−1 of CAZ, and CAZR clones only appeared at low frequency (~1 × 10^−7) in plates containing 1 μg ml−1 of CAZ. These data indicate that AID-T7RNAP was producing mutations within TEM-1 at a significant rate above spontaneous mutation (>1000-fold/mutagenic cycle), with mutations being accumulated in each cycle generating variants with different levels of resistance to CAZ.
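The fold changes quoted above follow directly from the reported frequencies. The Python sketch below recomputes them from the approximate values given in the text. The variable and function names are illustrative, and the control frequency after cycle 1 is an estimate derived here from the reported ~2000-fold difference, not a value stated in the source.

```python
# Approximate CAZ-resistant frequencies reported in the text for cultures
# expressing AID-T7RNAP and dCas9-crRNAb.a.c, keyed by CAZ concentration (ug/ml).
cycle1 = {1: 6.5e-5, 4: 3e-7}
cycle2 = {1: 5e-4, 4: 2e-6, 16: 2e-7}


def fold_change(before, after):
    """Fold increase in resistant frequency between two conditions."""
    if before <= 0:
        raise ValueError("reference frequency must be positive")
    return after / before


for conc in sorted(cycle1):
    fc = fold_change(cycle1[conc], cycle2[conc])
    print(f"{conc} ug/ml CAZ: {fc:.1f}-fold increase from cycle 1 to cycle 2")

# Control frequency at 1 ug/ml after cycle 1, inferred from the reported
# ~2000-fold difference; a derived estimate, not a reported value.
control_cycle1 = cycle1[1] / 2000
print(f"estimated control frequency at 1 ug/ml after cycle 1: {control_cycle1:.1e}")

# After two cycles the control frequency at 1 ug/ml is reported as ~1e-7.
fold_over_control_cycle2 = fold_change(1e-7, cycle2[1])
print(f"fold over control after cycle 2: {fold_over_control_cycle2:.0f}")
```

With these rounded inputs the cycle-to-cycle increase comes out somewhat below 10-fold at both concentrations, which is in line with the approximate (~) values in the text.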

Fig. 7: TEM-1 evolution using the mutagenesis system.

a Scheme of the cassette with the TEM-1 gene in the strain MG*-SacB-TEM-1ΔungΔnfi. Thin arrows indicate the tac (Ptac) and T7 (PT7) promoters, lollipop shapes indicate terminators T0 and T1. The genes sacB, gfp, and URA3 are represented with orange, green, and yellow filled arrows, respectively. Red lines mark targeting sequences of the crRNAs a, b, and c. b Scheme of the continuous evolution process with iterative cycles of mutagenesis induction. Transformed colonies of MG*-SacB-TEM-1∆ung∆nfi with the plasmids pdCas9b.a.c and pSEVA221AID-T7RNAP were grown overnight (O/N) in LB with Cm and Km at 37 °C with shaking (250 r.p.m.). The next day, the cultures were diluted 1:100 in fresh medium and incubated under the same conditions for 2 h. Then anhydrotetracycline (aTc; 200 ng ml−1) was added for induction and the cultures were incubated for 1 h. To start the second cycle of mutagenesis, the induced cultures were diluted 1:100 in new medium and grown O/N, repeating the same steps as above. To monitor the mutagenic process, after every induction the cultures were washed with 1× PBS and serially diluted prior to plating on LB agar alone and with increasing concentrations of ceftazidime (CAZ; 1, 4, and 16 µg ml−1). The strain MG*-SacB-TEM-1∆ung∆nfi with the plasmids pdCas9 and pSEVA221 was used as a control of the spontaneous mutations occurring in TEM-1. c Resistance frequency to increasing concentrations of ceftazidime (CAZ) after each cycle of the cytosine deaminase induced cultures of MG*-SacB-TEM-1ΔungΔnfi pdCas9b.a.c pSEVA221AID-T7RNAP (CD). As negative control (C−), the strain MG*-SacB-TEM-1ΔungΔnfi with the plasmids pdCas9 and pSEVA221 was used. The histogram shows the single values (color coded dots), means (bars), and standard errors (black lines) from three independent cultures for each strain (n = 3). The statistical analysis was done using a two-tailed paired t test. Exact p values (p) are indicated in the figure. A p value < 0.05 was considered significant. Source data are provided as a Source data file.

To characterize the TEM-1 variants from the evolved cultures expressing AID-T7RNAP, we sequenced TEM-1 alleles from 12 CAZR colonies (four from each triplicate culture) isolated at 1 μg ml−1 CAZ after cycle 1, and from 12 CAZR colonies isolated at 16 μg ml−1 CAZ after cycle 2. This revealed that all mutations found in the TEM-1 alleles were those expected from the CD activity of AID, with G > A transitions being more frequent than C > T in the coding strand of TEM-1 (Table 1). All sequenced TEM-1 variants from 1 μg ml−1 CAZ had mutations in residue R164 (R164C or R164H), and only two clones contained additional mutations (A249V and G267R). In contrast, all TEM-1 variants selected at 16 μg ml−1 CAZ after two cycles invariably contained R164H and E104K mutations, frequently associated with additional mutations (e.g., V33I and A150T). Mutations R164H and E104K are reported to provide increased resistance to CAZ42. We confirmed this by determining the minimal inhibitory concentration (MIC) to CAZ of mutant R1.10 (R164H)43 and of mutant R2.2 (E104K and R164H)44 (Table 2), demonstrating that TEM-1 variants with increased resistance to CAZ were obtained in a fast and continuous manner using this in vivo mutagenesis system.

Table 1 TEM-1 mutations in ceftazidime resistant (CAZR) clones.
Table 2 Minimal inhibitory concentration (MIC) to ceftazidime (CAZ) of parental strain and evolved mutants.