Cytoplasmic mRNA recapping has limited impact on proteome complexity

The m7G cap marks the 5′ end of all eukaryotic mRNAs, but there are also capped ends that map downstream within spliced exons. A portion of the mRNA transcriptome undergoes a cyclical process of decapping and recapping, termed cap homeostasis, which impacts the translation and stability of these mRNAs. Blocking cytoplasmic capping results in the appearance of uncapped 5′ ends at native cap sites but also near downstream cap sites. If translation initiates at these sites the products would lack the expected N-terminal sequences, raising the possibility of a link between mRNA recapping and proteome complexity. We performed a shotgun proteomics analysis on cells carrying an inducible inhibitor of cytoplasmic capping. A total of 21 875 tryptic peptides corresponding to 3565 proteins were identified in induced and uninduced cells. Of these, only 29 proteins significantly increased, and 28 proteins significantly decreased, when cytoplasmic capping was inhibited, indicating mRNA recapping has little overall impact on protein expression. In addition, overall peptide coverage per protein did not change significantly when cytoplasmic capping was inhibited. Together with previous work, our findings indicate cap homeostasis functions primarily in gating mRNAs between translating and non-translating states, and not as a source of proteome complexity.


Introduction
The 5′ cap structure is a defining feature of all eukaryotic mRNAs and, for the vast majority of transcripts, its binding by eIF4E is the first committed step in translation initiation [1]. The cap is added co-transcriptionally [2,3] by the coordinated action of the bifunctional nuclear capping enzyme (RNGTT, termed CE), and the heterodimer of cap methyltransferase (RNMT) with its activating subunit RAM (or RAMAC) [4]. Capped analysis of gene expression (CAGE) is based on the 5′ cap as a functional marker for transcription start sites [5]; however, some of the earliest applications of CAGE identified capped ends downstream of transcription start sites and within spliced exons [6,7]. These findings coincided with the identification by our laboratory of a cytoplasmic population of CE that co-sedimented with a 5′-monophosphate kinase. The 5′-kinase converts RNA with a 5′-monophosphate end to one with a 5′-diphosphate, with subsequent GMP addition by CE resulting in a GpppX terminus [8] (reviewed in [9]). The 5′-kinase and CE assemble in the cytoplasmic capping complex by their respective binding to the second and third SH3 domains of adapter protein NCK1 [10], and the metabolon capable of converting a 5′-monophosphate end to cap 0 is completed by binding of the RNMT:RAM heterodimer to the N-terminus of cytoplasmic CE [11]. A recent proteomics analysis showed the cytoplasmic CE interactome is unexpectedly complex. Whereas nuclear CE interacts with four proteins [12], cytoplasmic CE interacts with 66 proteins, 52 of which are RNA-binding proteins [13]. This suggested target specificity is determined by the binding of one or more of these proteins, which in turn nucleate assembly of the cytoplasmic capping complex.
Functional analysis of cytoplasmic capping required the development of tools to selectively disrupt this process without affecting nuclear 5′-end processing. One approach targets cytoplasmic m7G cap methylation by overexpressing a truncated form of RNMT in the cytoplasm (ΔN-RNMT) with a mutation in the SAM binding site. ΔN-RNMT competes with cytoplasmic RNMT for binding to both CE and RAM, and its overexpression results in loss of mRNAs with improperly methylated caps [14]. Another approach targets GMP addition using an inducible form of CE with a mutation in the GMP binding site (K294A). Like ΔN-RNMT, this protein is restricted to the cytoplasm by loss of the nuclear localization domain and addition of the HIV Rev nuclear export signal. Both approaches provided support for recapping at native 5′ ends, and results from targeting cap methylation showed recapping can occur downstream within at least the 5′-untranslated region. This cytoplasmic cycling of caps off and back on (cap homeostasis) has been proposed as a way of gating mRNAs between translating and non-translating states [15,16]. While there is evidence for recapping downstream within the coding region [15,17,18], it remains to be determined if cytoplasmic mRNA recapping has a measurable impact on the proteome.
The current study addressed this using an isogenic system consisting of tetracycline-inducible U2OS cells stably transfected with a transgene expressing the K294A cytoplasmic guanylylation inhibitor [8,10,15,17,19]. The goal was to obtain a representative sampling that was of sufficient depth to determine if cytoplasmic capping impacts proteome complexity by looking for changes in protein and peptide representation after inhibiting this process. In addition, because tetracyclines have been reported to affect the metabolism and function of a number of human cell lines [20-22], we took advantage of having the parental tetracycline-inducible cell line to determine if treating cells with a tetracycline antibiotic has any impact on the proteome. Our findings show doxycycline treatment alone has minimal impact on the parental cell proteome. More importantly, inhibiting cytoplasmic capping had limited impact on the representative shotgun proteomics profile measured at 70-80% confluence, thus indicating this process is not a major contributor to proteome complexity.

Doxycycline has minimal impact on the U2OS cell proteome
The parental cell line for much of our work consists of U2OS osteosarcoma cells that stably express the tetracycline repressor protein (U2OS-TR cells [8]). That makes these cells a good model for testing whether the proteome is affected by treating cultured mammalian cells with tetracyclines. A proteome screen was performed on triplicate cultures of U2OS-TR cells that were treated for 24 h without or with doxycycline (electronic supplementary material, table S1). A comparison of treated versus untreated cells showed doxycycline has minimal impact on the proteome of these cells (figure 1). There was a small increase in six proteins and a similarly small decrease in seven proteins (electronic supplementary material, table S1). These results are consistent with RNA-Seq of the same cells [14], which showed no impact of doxycycline on the transcriptome.

Cytoplasmic capping has limited overall impact on the proteome
As noted in the Introduction, one can inhibit cytoplasmic capping by interfering with GMP addition or with cap methylation. The current study employed the former approach, using a dominant-negative form of capping enzyme (termed K294A) to block the cytoplasmic guanylylation step. Twenty-four-hour induction of K294A causes prolonged inhibition of recapping that in turn results in the accumulation and/or loss of uncapped forms of cytoplasmic capping targets [15]. Total cellular protein was recovered from triplicate cultures of cells treated ± doxycycline, reductively alkylated, trypsinized and subjected to label-free shotgun proteomics as described in Material and methods. To increase the possibility of detecting quantitative changes in peptide representation, analysis was performed with a more sensitive mass spectrometer than that used to examine the effect of doxycycline on the parental cell line, coupled to improved front-end separation (UPLC and ion mobility). Differential peptide and protein expression were determined by comparing peptide spectral matches (PSMs) and filtering for false discovery rate (FDR) less than 0.01. This identified 21 875 peptides corresponding to 3565 proteins (electronic supplementary material, table S2), which represents about one-third of the total proteins that might be expressed by this cell line [23]. This number of proteins should be more than sufficient to draw conclusions on cap homeostasis. Note that we did not need or expect to measure the entire proteome of the cell line by the chosen shotgun proteomics approach. Moreover, since abundances vary over seven orders of magnitude, total proteome measurements would require additional fractionation and analyses performed at multiple time points after doxycycline addition.
Twenty-nine proteins (0.8% of 3565 proteins, plus the inhibitory form of RNGTT) showed a significant increase when cytoplasmic capping was inhibited (figure 2 and table 1). This is lower than the number of mRNAs that increased in [14], and may reflect both the shotgun proteomics approach, where the entire proteome is not captured, and more importantly, proteome buffering as described in [24]. The mRNAs that increased in [14] primarily encoded proteins involved in transcription and RNA processing, and we noted there that changes in these transcripts are likely a compensatory response to the decrease in a large number of other mRNAs as described in [25]. As one might expect from the experimental protocol, RNGTT (i.e. K294A) was the most highly induced protein.
royalsocietypublishing.org/journal/rsob Open Biol. 10: 200313
There was little evidence for functional relatedness between the proteins that increased in K294A-expressing cells, but it is not reasonable to draw conclusions on relatedness when only 0.8% of the proteins were increased. By Gene Ontology analysis, the increased proteins mostly fall under the broad categories of catalytic, binding, and structural proteins (figure 3a). A similarly small number of proteins (28, or 0.8%) were significantly decreased when cytoplasmic capping was blocked (figure 2 and table 1). Again, there was little evidence for functional relatedness between these. Gene Ontology analysis yielded groupings similar to those proteins that increased (figure 3b), but again one cannot draw conclusions based on such a small number of differentially expressed proteins. None of these proteins correspond to transcripts in [15] whose cap status changed following K294A induction. We suspect this is due to differences in the relative amount of uncapped RNA for any given transcript.

Peptide representation is unchanged by inhibition of cytoplasmic capping
In [26] and [9], we put forward the idea that one function for cytoplasmic capping might be to expand the proteome. This was based on several findings. Approximately 25% of capped ends are located downstream of transcription start sites [6,7], and many of these lie upstream of potential start codons. There is also evidence from positional proteomics for N-termini that map downstream of canonical initiation sites [27-29]. While some of these correspond to known alternative initiation sites, propeptides and signal peptide cleavage sites, for many the origin remains unknown. Lastly, a number of mRNA targets acquire uncapped 5′ ends in the vicinity of known downstream cap sites when cytoplasmic capping is blocked [17,18]. Because there was good peptide coverage across most of the identified proteins, we reasoned that downstream initiation events would be evident by changes in peptide representation in control versus K294A-expressing cells, with the most notable change being an increase in peptides nearest the native N-terminus. That turned out not to be the case. A comparison of peptide coverage between individual replicates in figure 4 showed no evidence for significant changes as a function of K294A expression. Thus, our results indicate that cytoplasmic capping has little overall impact on the proteome.
In summary, the accumulated data do not support a role for cytoplasmic capping in proteome complexity, but instead support the model of cap homeostasis put forward in [15], where decapping and recapping serves as a gating mechanism controlling the translation of a portion of the transcriptome. This is consistent with the observed recapping of RPS3, RPS4X and RPL8 mRNAs on their native 5′ ends [14], and with results of in vivo single-molecule translational dynamics that showed cycling of mRNAs between translating and non-translating states [16,30]. The only other known function of mRNA recapping also involves translation, as recapping within the 5′-UTR (e.g. EIF3D, EIF3K) can change secondary structures and binding sites for regulatory proteins.

Cell culture and protein extraction
Tetracycline-inducible U2OS (U2OS-TR) cells and tetracycline-inducible U2OS cells stably transfected with pcDNA4/TO/myc-K294ΔNLS+NES-Flag (U2OS-K294A) were described previously [8]. Cells were maintained in a humidified incubator at 37°C under 5% CO2 and were discarded after no more than 10 passages. Cells were grown in McCoy's 5A medium (Thermo Fisher 116600) supplemented with 10% tetracycline-free fetal bovine serum (FBS, Atlanta Biologicals S10350). Triplicate cultures of parental U2OS-TR or K294A-expressing cells at 70-80% confluence were switched to medium without or with 1 µg ml⁻¹ of doxycycline for 24 h. Prior to harvest, cultures were washed three times with phosphate-buffered saline (PBS) and lysed using ice-cold lysis buffer (0.1 M HEPES, pH 8.5, 6 M guanidine hydrochloride) supplemented with one tablet of protease inhibitor (cOmplete Mini EDTA-free cocktail, Roche Life Science) and one tablet of phosphatase inhibitor (PhosSTOP, Roche Life Science). The cell lysates were sonicated using a Sonic Dismembrator Model 100 (Fisher Scientific) for three cycles of alternating 30 s bursts and 30 s rests, and then centrifuged at 16 000 × g for 15 min at 4°C. The protein concentration of the collected supernatant was determined with a bicinchoninic acid (BCA) protein assay kit (ThermoFisher Scientific).

Sample preparation for shotgun proteomics analysis
Four hundred micrograms of lysate was reductively alkylated by first incubating for 1 h at 37°C with 10 mM dithiothreitol, followed by 30 min alkylation (in the dark) at 25°C with 55 mM iodoacetamide (Sigma Aldrich). Samples were diluted sixfold with 50 mM ammonium bicarbonate to reduce the concentration of guanidine hydrochloride to less than 1 M. Tryptic digestion was performed by adding 2 µl of 1 µg µl⁻¹ trypsin (1 : 200 w/w) supplemented with 1 µl of 1% ProteaseMAX surfactant (Promega) and incubating at 37°C for 3 h. Trypsin was inactivated by addition of trifluoroacetic acid (TFA) to a final concentration of 0.5%. The digestion products were centrifuged at 16 000 × g for 10 min, and the supernatants were collected and evaporated to dryness.
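The quantities in this protocol can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the published workflow; the function names are hypothetical and only the numbers stated in the text (6 M guanidine hydrochloride, sixfold dilution, 2 µl of 1 µg µl⁻¹ trypsin, 1 : 200 w/w) are used.

```python
# Illustrative arithmetic only; function names are hypothetical.

def diluted_concentration(stock_molar: float, fold: float) -> float:
    """Concentration after diluting a stock by the given fold factor."""
    return stock_molar / fold


def substrate_mass_for_ratio(enzyme_ug: float, substrate_per_enzyme: float) -> float:
    """Substrate mass (µg) implied by an enzyme : substrate ratio of 1 : N (w/w)."""
    return enzyme_ug * substrate_per_enzyme


# 6 M guanidine hydrochloride in the lysis buffer, diluted sixfold.
guanidine_after_dilution = diluted_concentration(6.0, 6.0)

# 2 µl of 1 µg/µl trypsin is 2 µg of enzyme.
trypsin_ug = 2.0 * 1.0

# At 1 : 200 (w/w), 2 µg of trypsin corresponds to this much protein.
protein_ug = substrate_mass_for_ratio(trypsin_ug, 200.0)

print(f"guanidine after dilution: {guanidine_after_dilution} M (upper bound)")
print(f"protein digested at 1:200: {protein_ug} µg")
```

The sixfold dilution alone brings the lysis buffer to 1 M guanidine hydrochloride; the volumes of dithiothreitol and iodoacetamide added beforehand dilute it slightly further, which is consistent with a final value below 1 M.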

Impact of doxycycline: Orbitrap Elite
Analysis of the impact of doxycycline on the U2OS cell proteome (U2OS-TR, figure 1b) [...] achieved at 35°C where the % B1 was maintained at 2% for 5 min; 2-35% over 75 min; 35-45% over 10 min and 45-85% over 10 min at a flow rate of 0.3 µl min⁻¹. The column was held at 85% (v/v) B1 for 5 min before reaching initial conditions after 10 min. The heated capillary temperature and electrospray voltage on the Orbitrap Elite were 200°C and 1.5 kV, respectively, using top 15 data-dependent acquisition in positive ion mode. The MS scans were acquired at a resolution of 120 000 with an automatic gain control (AGC) target value of 1 × 10⁶ for a scan range of 400-1600 m/z. Collision-induced dissociation (CID) spectra were obtained in the ion trap with an AGC target of 1 × 10⁴, maximum ion injection time (IT) of 50 ms, 1 m/z isolation width, normalized collisional energy (NCE) of 35 and ion activation time of 10 ms. The transfer tube S-lens RF was 49% and dynamic exclusion was set at 15 s with a repeat count of 1 for an exclusion list size of 500.

Impact of cytoplasmic capping: timsTOF Pro
Samples from U2OS-K294A cells were analysed using a nanoElute coupled to a timsTOF Pro equipped with a CaptiveSpray source (Bruker, Germany). Peptides (0.2 µg) were separated on a 25 cm × 75 µm analytical column, packed with 1.6 µm C18 beads (IonOpticks, Australia). The column temperature was maintained at 50°C using an integrated column oven (Sonation GmbH, Germany). Separation was achieved using 0.1% formic acid (A1) and acetonitrile with 0.1% formic acid (B1) as mobile phases. The column was equilibrated with four column volumes of 100% solvent A1 before loading the sample at a maintained pressure of 800 bar. Peptide separation was achieved at 0.4 µl min⁻¹ using a linear gradient from 2% to 25% solvent B1 over 90 min, 25% to 37% over 10 min, 37% to 80% over 10 min and maintained for 10 min, for a total separation method time of 120 min. Data acquisition on the timsTOF Pro used the parallel accumulation serial fragmentation (PASEF) acquisition mode. Instrument settings included default imeX mode, mass range 100 to 1700 m/z,

Statistical analysis
Label-free relative quantification was carried out using the limma package in R [31] using peptide spectral matches (PSMs) or spectral counts. Differential expression of individual proteins was determined using proteins with at least two PSMs and two unique peptides. Data were filtered for PSMs observed in all replicates of at least one condition and normalized using quantile normalization. PSMs were log2 transformed and differential enrichment analysis was carried out using linear models combined with the empirical Bayes statistics function in limma. False discovery correction was applied using the Benjamini-Hochberg method, and data were visualized using heatmaps and volcano plots with a significance threshold of p ≤ 0.05.
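Two of the generic steps named above, quantile normalization and Benjamini-Hochberg adjustment, can be illustrated in a few lines. The analysis itself was performed with limma in R; the sketch below is a standalone Python illustration of the underlying calculations, not a reimplementation of limma. The function names and toy values are hypothetical, and the tie handling in the normalization is a simplification.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per sample).

    Simplification: tied values are ranked by their original order.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # The mean of the k-th smallest value across samples is the reference.
    reference = [sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (hypothetical): two samples of three PSM counts, four raw p-values.
toy_normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(toy_normalized)
print(toy_adjusted)
```

After normalization every sample shares the same set of values, so differences between conditions reflect changes in rank rather than differences in overall signal between runs.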
Significant proteins were functionally classified by searching their corresponding gene names against the PANTHER Classification System, with Homo sapiens as the selected organism. Pie chart illustrations of gene ontology annotations for molecular function of proteins with significantly different expression are presented in figure 3.
[...], table S2 was compared between protein samples for each of the control cell populations and cells treated with doxycycline to induce the K294A cytoplasmic capping inhibitor. A heatmap was generated using the R package ComplexHeatmap to reveal potential patterns in protein sequence coverage at the peptide level.
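The idea behind comparing sequence coverage at the peptide level can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the pipeline used to produce the heatmap: the function names, the protein length and the peptide coordinates are all hypothetical, and peptide positions are taken as 1-based inclusive residue ranges.

```python
def sequence_coverage(protein_length, peptide_spans):
    """Fraction of residues covered by at least one identified peptide.

    peptide_spans: iterable of (start, end) residue positions, 1-based inclusive.
    """
    covered = [False] * protein_length
    for start, end in peptide_spans:
        for pos in range(max(start, 1), min(end, protein_length) + 1):
            covered[pos - 1] = True
    return sum(covered) / protein_length


def first_covered_residue(peptide_spans):
    """Position of the most N-terminal residue seen in any peptide."""
    return min(start for start, _ in peptide_spans)


# Hypothetical 100-residue protein observed under two conditions.
spans_a = [(1, 10), (25, 40), (38, 50)]   # includes an N-terminal peptide
spans_b = [(25, 40), (38, 50)]            # N-terminal peptide absent

coverage_a = sequence_coverage(100, spans_a)
coverage_b = sequence_coverage(100, spans_b)
print(coverage_a, first_covered_residue(spans_a))
print(coverage_b, first_covered_residue(spans_b))
```

A systematic difference between conditions in coverage near the N-terminus, across many proteins, is the kind of pattern a per-peptide coverage heatmap would be expected to reveal.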