Extension of the minimal functional unit of the RNA polymerase II CTD from yeast to mammalian cells

The carboxy-terminal domain (CTD) of the largest subunit of RNA polymerase II (Pol II) consists of 26 and 52 heptad-repeats in yeast and mammals, respectively. Studies in yeast showed that the strong periodicity of the YSPTSPS heptads is dispensable for cell growth and that di-heptads interspersed by spacers can act as minimal functional units (MFUs) to fulfil all essential CTD functions. Here, we show that the MFU of mammalian cells is significantly larger than in yeast and consists of penta-heptads. We further show that the distance between two MFUs is critical for the functions of mammalian CTD. Our study suggests that the general structure of the CTD remained largely unchanged in yeast and mammals; however, besides the number of heptad-repeats, also the length of the MFU significantly increased in mammals.


Introduction
The carboxy-terminal domain (CTD) of Pol II is a repetitive low-complexity domain that extends from the large subunit (Rpb1) in eukaryotes. The CTD comprises heptad-repeats with the consensus sequence of Tyr 1 -Ser 2 -Pro 3 -Thr 4 -Ser 5 -Pro 6 -Ser 7 . The general structure of the CTD remains largely unchanged in yeasts and metazoans; however, the number of repeats varies remarkably. There are 26 repeats in the CTD of Saccharomyces cerevisiae [1], 29 in Schizosaccharomyces pombe [2] and 52 in mammals [3]. The CTD serves as a binding platform for the recruitment of protein complexes and helps to coordinate the entire transcription cycle, RNA maturation and epigenetically modifying the chromatin [4][5][6][7]. Deletion of the entire CTD is lethal in yeast, Drosophila and mammalian cells [8 -11].
Genetic studies in yeast showed that the strong periodicity of heptadrepeats is not an absolute requirement for viability. Insertion of a single alanine residue between adjacent repeats is lethal, while insertion of alanine residues between pairs of heptads is well tolerated both in S. cerevisiae [12] and in S. pombe [13]. These analyses led to the concept of the minimal functional unit (MFU), in which essential CTD functions are accomplished through the interaction of protein factors with motif(s) that lie within di-heptads. Further analysis in S. cerevisiae revealed that the substitution of 'SPS' at positions 5-7 of the distal repeat with alanine (YSPTSPS YSPTAAA) or the removal of these residues (YSPTSPS YSPS) had no discernible impact on cell growth. These mutants were termed 'AR' and '252', respectively [14]. Similar observations were made in S. pombe, where a mutant with an array of 'YSPTSPS YSPAAA', grew well at 18 -348C, but did not thrive at 378C [13]. Together, these experiments suggested that a decapeptide unit could serve as an MFU and arrays of this MFU could suffice for the CTD functions in yeast.
Here, we identified the MFU in the CTD of mammalian cells by taking advantage of an Rpb1 knockin-knockout system, using an Rpb1 mutant that confers resistance to aamanitin [10,11]. Our experiments demonstrate that (i) the MFU of mammalian CTD is larger than in yeast and consists of penta-heptads; and (ii) similar to yeast, the distance between two MFUs is critical for mammalian cell growth.

Material and methods
(a) Establishing stable cell lines CTD sequences of rWT and CTD mutants (mAR, m252, Con 48, 15_AP, 8_AP, 6_AP, 5_AP, 4_AP, 3_AP, 2_AP, 6_2AP and 5_2AP) were synthesized by GeneArt (Regensburg, Germany) using human codon optimization. These constructs were cloned into LS*mock vector [11] and sequenced before transfection. 20 Â 10 6 Raji cells were used to transfect full-length Rpb1 vector carrying the rWT or CTD constructs using electroporation (10 mg of plasmid, 960 mF and 250 V). Polyclonal cell lines were established by selection with G418 (1 mg ml 21 ) for two to three weeks. Once the cell viability reached approximately 95%, the expression of recombinant Pol II was induced by removing tetracycline by washing the cells three times with 50 ml PBS supplemented with 1% fetal calf serum (FCS, Gibco). Twentyfour hours after induction, 2 mg ml 21 of a-amanitin (Sigma) was added to inhibit endogenous Pol II and cell lines were maintained in RPMI 1640 complete medium (10% FCS, 100 U ml 21 penicillin, 100 mg ml 21 streptomycin) at 378C and 5% CO 2 .

(c) Western blot analysis
Cells were washed twice with PBS and lysed in 2Â Laemmli buffer. Proteins were separated by SDS-PAGE and transferred to a nitrocellulose membrane (GE Healthcare). The membranes were first incubated with 5% milk in TBS-T for 1 h to block unspecific binding of antibodies. Membranes were then incubated with the primary antibody in blocking solution at 48C overnight. Next day, the membranes were washed with 1Â TBS-T and incubated for 1 h at room temperature (RT) with horseradish peroxidase (HRP)-conjugated secondary antibodies against rat (Sigma) or mouse (Promega) to be detected by chemoluminescence or incubated for 90 min at RT with IRDye-labelled, secondary antibodies against rat (680 nm; Alexa, Invitrogen) and mouse (800 nm; Rockford, Biomol) and analysed using an Odyssey Imaging System (Li-Cor).
(d) Growth kinetics 20 Â 10 6 cells of each cell line were induced and the numbers of living cells (N L ) and dead cells (N D ) were counted every day using trypan blue staining. The percentage of cell viability (N V ) was calculated using the following formula N V ¼ (N L /N L þ N D ) Â 100. To calculate the cumulative living cell number, the total number of living cells (N L ) was multiplied by the factor by which the cell culture was split.

(e) Statistical analysis
Two clones from each mutant were transfected into Raji cells to generate two biological replicates. Following the induction, three individual measurements of living cells and dead cells from each replicate were taken every day using a Fuchs -Rosenthal chamber. The mean of these three measurements for each biological replicate was calculated. Following this, the s.d. of the mean was used to calculate s.e.m. and plotted in the graphs.

Results and discussion (a) The MFU of the CTD is different in yeast and mammalian cells
Previous studies in S. cerevisiae and S. pombe identified a decapeptide unit of 'YSPTSPSYSPS' as an irreducible MFU of the CTD. We tested if CTD mutants with arrays of this decapeptide unit can suffice for cell proliferation of mammalian cells. We designed two CTD constructs, mAR and m252 (figure 1a), similar to the CTD mutants AR and 252 described in S. cerevisiae [14]. Stable cell lines expressing the CTD mutant were generated using an Rpb1 knockin-knockout system. Western blot analysis showed that the recombinant wild-type Pol II CTD (rWT) and the two CTD mutants expressed the hypophosphorylated (IIA) and hyperphosphorylated (IIO) forms of Rpb1 (figure 1b). This suggests that the mutants are transcriptionally active. Next, we monitored the growth kinetics of mutants by calculating the numbers of living and dead cells for a period of 10 days. Both mutants showed a continuous decline in the percentage of cell viability (figure 1c) and cumulative living cell number (figure 1d). Both mutants had less than 10% of living cells after 10 days of a-amanitin selection and eventually displayed a lethal phenotype. Together, these experiments showed that arrays of MFUs that confer growth and viability in yeast could not confer viability in mammalian cells. Thus, we conclude that the MFU of the CTD in mammalian cells is different from yeast.

(b) MFU of mammalian CTD consists of penta-heptads
We next determined the MFU of mammalian CTD. Repeats 1-3 and repeat 52 are important for the stability of CTD in mammalian cells, as the deletion of these repeats leads to the cleavage and degradation of the CTD from Pol II [18,19] (figure 2a). The mutant (Con48) in which repeats 4-51 of wild-type CTD were replaced by consensus heptads showed wild-type-like cell growth and proliferation, demonstrating that the non-consensus repeats are dispensable for cell growth [19]. Taking this information, we designed a set of CTD mutants using Con48 as template mutant. We replaced a consensus heptad with an AP spacer (AAPAAPA) after CTD mutants with arrays of MFUs of five or more consensus heptads (15_AP to 5_AP) were also viable under a-amanitin selection. Mutants 6_AP and 5_AP had a slight reduction in the percentage of cell viability as compared with Con48 but had nearly 80% cell viability after 9 days of selection, suggesting a viable phenotype. Interestingly, CTD mutants with arrays of MFUs of four or less consensus heptads showed a continuous decline in the percentage of cell viability and cumulative living cell numbers. The cell viability reached less than 10% after 9 days of a-amanitin selection, indicating a lethal phenotype for these mutants without any indication for growth adaptation. Together, we conclude that the MFU of mammalian CTD is significantly larger than in yeast and lies within penta-heptads. Our results suggest that, similar to yeast, the strong periodicity of heptad-repeats is not an absolute requirement in mammalian cells. However, in mammalian cells, essential CTD functions are accomplished through a motif that lies within penta-heptads in contrast to di-heptads in yeast [12]. Does this mean that the 35 amino acids covered by pentaheptads are all required for the functionality of mammalian MFUs? Previous studies have designed CTD variants to map the phosphorylation sites along the entire CTD in yeast [21] and mammals [22]. In these studies, a number of mutations including lysine (K) or arginine (R) were introduced after every two or three heptads to make the entire CTD accessible to mass spectrometry. Interestingly, these CTD variants were viable and showed comparable cell proliferation to that of recombinant wild-type CTD. This demonstrates that even within an MFU, mutations at certain positions are tolerated, suggesting that not all residues in MFUs of penta-heptads may be critical for CTD functions in mammalian cells. This is consistent with yeast mutants showing that only the presence of three SP (serine-proline) motifs and two tyrosine residues spaced seven amino acids apart were an absolute requirement for an MFU [14]. This appears also likely for mammalian CTD. However, the question of critical amino acids within a mammalian MFU has not been addressed in this study.
Why is the MFU of mammalian CTD longer than in yeast? It could well be that during evolution the CTD became more versatile to recruit larger proteins or protein complexes to regulate transcription-coupled processes. Additionally, CTD is an intrinsically disordered low-complexity domain that can undergo liquid-liquid phase separation in vitro and could cluster Pol II in living cells [23]. CTD interacts with other lowcomplexity proteins, including, MED-1, a subunit of the Mediator complex [24,25] to regulate Pol II localization and gene expression. The head and the middle module of Mediator make multiple contacts with the CTD [26,27] in a tyrosine-1dependent manner [17]. X-ray crystallography revealed that the head module of Mediator binds a peptide with several heptads [27]. The number of heptad-repeats binding to mammalian Mediator has not been studied; however, unpublished data of our laboratory indicate that a CTD with MFUs containing fewer than five heptad-repeats does not bind Mediator. Finally, mammalian CTD can recruit proteins which are specific for metazoans and not present in yeast. For example, a 14-subunit protein complex called Integrator interacts with the Pol II CTD in a Ser2 and Ser7 phosphorylation-dependent manner [28][29][30]. Recruitment of Integrator to Pol II CTD is crucial to regulate the 3 0 -end processing of small nuclear RNA (snRNA) genes [31] as well as polymerase pause-release [32,33]. Thus, mammalian CTD requires a longer MFU motif, probably to accommodate more and/or larger proteins for transcription-coupled processes. royalsocietypublishing.org/journal/rsbl Biol. Lett. 15: 20190068 (c) Spacing between functional units The distance between MFUs is critical in S. cerevisiae, as the growth rate slows down with the increase in the length of the spacer [12]. To understand the significance of distance between MFUs in mammalian cells, we designed CTD mutants in which MFUs of hexa-or penta-heptads were separated by two AP spacers. These mutants were termed as 6_2AP and 5_2AP, respectively (electronic supplementary material, figure S1a). Both mutants expressed the IIA and IIO forms of polymerase and showed comparable levels of Ser5 and Ser2 phosphorylation (electronic supplementary material, figure S1b). However, both mutants displayed a strongly reduced cell viability of less than 20% after 12 days under a-amanitin selection (electronic supplementary material, figure S1c,d). Together, these experiments suggest that the distance between two MFUs is also critical in mammalian cells and that increasing the distance between two MFUs by more than one AP spacer has a severe impact on cell growth. Separating MFUs by two AP spacers may affect the recruitment of certain essential transcription factors or affect the CTD-driven aggregation of Pol II [34], altering gene expression.

Conclusion
The length of the CTD considerably increased during the evolution from yeast to mammals, but the general structure of the CTD remained largely unchanged. We showed that besides the CTD length, the demand of the MFU length also increased in mammalian cells. This increase in the MFU length probably provides more versatility to the mammalian CTD to coordinate the complex transcription-coupled processes.