Post-translational processing targets functionally diverse proteins in Mycoplasma hyopneumoniae

Mycoplasma hyopneumoniae is a genome-reduced, cell wall-less, bacterial pathogen with a predicted coding capacity of less than 700 proteins and is one of the smallest self-replicating pathogens. The cell surface of M. hyopneumoniae is extensively modified by processing events that target the P97 and P102 adhesin families. Here, we present analyses of the proteome of M. hyopneumoniae-type strain J using protein-centric approaches (one- and two-dimensional GeLC–MS/MS) that enabled us to focus on global processing events in this species. While these approaches only identified 52% of the predicted proteome (347 proteins), our analyses identified 35 surface-associated proteins with widely divergent functions that were targets of unusual endoproteolytic processing events, including cell adhesins, lipoproteins and proteins with canonical functions in the cytosol that moonlight on the cell surface. Affinity chromatography assays that separately used heparin, fibronectin, actin and host epithelial cell surface proteins as bait recovered cleavage products derived from these processed proteins, suggesting these fragments interact directly with the bait proteins and display previously unrecognized adhesive functions. We hypothesize that protein processing is underestimated as a post-translational modification in genome-reduced bacteria and prokaryotes more broadly, and represents an important mechanism for creating cell surface protein diversity.


Background
Mycoplasma spp. are bacteria that evolved by a process of degenerative evolution from the low G + C Firmicutes. Mycoplasmas have lost genes for cell wall biosynthesis and for many anabolic processes (including a TCA cycle), and are reliant on glycolysis for the production of cellular ATP [1,2]. Mycoplasmas typically have small genomes of less than 1000 kbp and are dependent on the host for the supply of cholesterol for membrane biosynthesis, amino acids, nucleotides and other macromolecular building blocks for cell growth [3]. As such, mycoplasmas are excellent model organisms to examine the complexity of post-translational modifications in prokaryotes.
Mycoplasma hyopneumoniae is an agriculturally significant swine respiratory pathogen that causes substantial economic losses, estimated in the billions of dollars per annum [4]. Complete genome sequences of four geographically distinct strains of M. hyopneumoniae are available [3,5,6], shedding light on the metabolic capacity, host specialization and evolutionary background of this minimal organism. Genomes range in size from 850 to 920 kb and encode approximately 700 open reading frames (ORFs). The M. hyopneumoniae strain 232 genome contains 691 known proteins and 728 annotated genes. A recent proteome analysis of strain 232 identified 8607 unique peptide sequences (false discovery rate of 0.53%), confirming the expression of 70% (483) of the 691 predicted ORFs during culture in Friis broth. This included 171 of the 328 predicted hypothetical proteins (52%), 80% of the lipoprotein genes, and all the P97/P102 adhesin gene families. In the same study, proteogenomic analysis of strain 232 uncovered previously unidentified genes and 5′ extensions to several genes [7]. Transcriptome studies indicate that 92% of predicted ORFs are transcribed in M. hyopneumoniae strain 7448 [8]. Seventy-eight non-coding RNAs were also identified in the analysis. Genes with the highest expression levels primarily encoded proteins involved in basal metabolism, as well as chaperones, adhesins, surface proteins, transporters and RNase P. A number of uncharacterized proteins were also identified. The gene encoding the P216 adhesin was also highly transcribed (RPKM, reads per kilobase of transcript per million mapped reads: 10 796.4) [8]. While these approaches have shed light on the protein coding capacity of M. hyopneumoniae, they reveal little about the extent to which it modifies its proteome post-translationally.
© 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution
During the early, critical stages of infection, M. hyopneumoniae adheres specifically along the entire length of cilia of ciliated epithelial cells that line the trachea, bronchi and bronchioles in the upper respiratory tract of pigs. This association causes ciliostasis, loss of cilia and eventual epithelial cell death, which effectively perturbs mucociliary function. The P97 and P102 adhesin families are central to mediating attachment of M. hyopneumoniae to epithelial cilia [9-19]. Notably, all members of the P97 and P102 adhesin families are processed post-translationally to the extent that it is difficult to find evidence of adhesin pre-proteins [9-12,15,17,18,20-23]. Most members of the P97 and P102 families are processed via highly efficient cleavage events typically at S/T-X-F↓X-D/E sites, but also within stretches of hydrophobic amino acids and by numerous, less efficient cleavage events often in a manner consistent with trypsin-like activity [20-22,24]. Consequently, the surface protein architecture of M. hyopneumoniae displays cleavage fragments derived via processing of the P97 and P102 adhesin families by several endopeptidases. What is unclear is how endoproteolysis alters the presentation of surface proteins not related to the P97 and P102 adhesin families, including members of the lipoprotein family.
The current trend in global proteomic analysis has been to use high-speed, ultra-sensitive mass spectrometers combined with orthogonal upfront chromatographic fractionation (i.e. two-dimensional LC-MS/MS) in a peptide-centric manner to characterize proteomes. These high-throughput protocols rely on all proteins in a sample being digested with an efficient protease (e.g. trypsin) into peptides for downstream analysis. Peptide-centric or 'bottom-up' approaches are used widely, because peptides are more readily solubilized for fractionation and are amenable to chromatographic separation, and mass spectrometry is more sensitive when analysing peptides, rather than intact proteins [25]. Conversely, protein-centric approaches aim to preserve intact proteins throughout fractionation steps, so that proteoform information may be retained [26], and then discrete proteins or fractions are digested to peptides and analysed individually by mass spectrometry. Protein-centric methods are thus not necessarily 'top-down' approaches that aim to analyse individual intact proteins by mass spectrometry [27]. Without selective enrichment, high-throughput peptide-centric approaches can fail to capture post-translational proteolytic modifications and can lead to an oversimplification of the complexity of the proteome. In this study, we applied protein-centric approaches that retain mass context with the aim of identifying proteins that are targets of processing events in M. hyopneumoniae-type strain J.

Experimental procedures

Preparation of Mycoplasma hyopneumoniae whole cell lysate
Mycoplasma hyopneumoniae (strain J) was grown in modified Friis broth [28] and harvested as described previously [29]. A 0.1 g pellet of M. hyopneumoniae cells was resuspended in 7 M urea, 2 M thiourea, 40 mM Tris-HCl pH 8.8, 1% w/v C7BzO and disrupted with four rounds of sonication at 50% power for 30 s bursts on ice. Proteins were reduced and alkylated with 5 mM tributylphosphine and 20 mM acrylamide monomers for 90 min. Insoluble material was pelleted by centrifugation at 16 000g for 10 min, and the remaining soluble protein was precipitated in five volumes of ice-cold acetone for 30 min and the pellet air-dried. For one-dimensional SDS-PAGE, the pellet was resuspended in SDS sample buffer (0.25 M Tris-HCl pH 6.8; 0.25% w/v SDS; 10% glycerol and 0.0025% w/v bromophenol blue). For two-dimensional PAGE, protein pellets were resuspended in 7 M urea, 2 M thiourea, 1% w/v C7BzO. If solution conductivity was measured to be greater than 200 µS cm⁻¹, samples were desalted and buffer exchanged into 7 M urea, 2 M thiourea, 1% w/v C7BzO using a microBioSpin column (Bio-Rad) according to the manufacturer's instructions.

Two-dimensional polyacrylamide gel electrophoresis
Two-dimensional gels were run using 250 µg of whole cell lysate with 0.2% pH 3-10 carrier ampholytes (Bio-Rad). Isoelectric focusing (IEF) was performed using 11 cm pH 4-7 IPG strips (Bio-Rad) and 11 cm pH 6-11 Immobiline DryStrips (GE Healthcare). Focusing was carried out using a Protean IEF system (Bio-Rad) at a constant 20 °C and 50 µA current limit per strip with a three-step programme: slow ramp to 4000 V for 4 h, linear ramp to 10 000 V for 4 h, then 10 000 V until 120 kVh was reached. Following IEF, the strips were equilibrated with 5 ml equilibration solution (2% SDS, 6 M urea, 250 mM Tris-HCl pH 8.5, 0.0025% (w/v) bromophenol blue) for 20 min before the second-dimension SDS-PAGE. The second-dimension gels were run using precast Bio-Rad TGX midi gels with TGS running buffer (Bio-Rad). Reference gels were stained with Coomassie blue G250 overnight and destained with 1% acetic acid to remove background. All visible spots (180 from the pH 4-7 gel and 160 from the pH 6-11 gel) were manually excised from the gel and subjected to in-gel trypsin digestion, before analysis by LC-MS/MS.

… figure 4a). One-dimensional GeLC-MS/MS was also performed on a TX-114 detergent fraction and on a high-load lane of whole cell extract (where mass context was not reliably retained owing to macromolecular crowding effects), and these were also analysed by Q-TOF MS. About 150 µg of protein from any preparation was separated by SDS-PAGE, and fixed and stained with Coomassie blue G-250. Additionally, a high-load lane was run using 500 µg protein from whole cell lysates. Entire gel lanes were cut into 16 equal slices for whole cell lysates, 30 for the high-load lane or 15 for the TX-114 fraction. Gel slices were further diced into approximately 1 mm² cubes, destained, washed and digested in-gel with trypsin for analysis. Identification of proteins was performed following clean-up of peptide fractions using OMIX C18 SPE pipette tips, using one of the LC-MS/MS methods described below.

Expression of recombinant proteins and creation of polyclonal antisera
Expression of recombinant P65 and creation of polyclonal antisera were carried out as described previously [9,14,31].

Blotting
Proteins separated on pH 6-11 two-dimensional gels were transferred to PVDF membranes as described previously [12]. Blots were blocked with 5% (w/v) skim milk powder in PBS with 0.1% Tween 20 (v/v) (PBS-T) at room temperature for 1 h. For detection of immunogenic proteins, membranes were probed with pooled convalescent sera collected from low-health-status M. hyopneumoniae-infected pigs described previously [9], diluted 1 : 100 in PBS-T for 1 h, followed by incubation with peroxidase-conjugated anti-pig antibodies diluted 1 : 3000 in PBS-T for 1 h. For detection of adhesin R1 cilium binding domains, membranes were probed with antisera raised against the F3 recombinant fragment that spans the R1 cilium binding domain of MHJ_0194 (F3 P97; described previously [14]), diluted 1 : 100 in PBS-T for 1 h, then peroxidase-conjugated anti-rabbit antibodies diluted 1 : 1500 in PBS-T for 1 h. For detection of P65 fragments, membranes were probed with antisera raised against recombinant P65 diluted 1 : 200 in PBS-T for 1 h, then peroxidase-conjugated anti-rabbit antibodies diluted 1 : 2000 in PBS-T for 1 h. Membranes were washed in three changes of PBS-T between incubations and were developed with SIGMAFAST 3,3′-diaminobenzidine tablets (Sigma-Aldrich) as per the manufacturer's instructions.

Affinity chromatography for identification of protein interactions
Heparin affinity chromatography and avidin purification of fibronectin-binding proteins and PK15 cell surface protein interactors were performed as described previously [20-22]. Avidin purification of actin- and plasminogen-binding proteins was carried out as follows. Actin from bovine muscle (Sigma-Aldrich) was solubilized in 8 M urea, 20 mM triethylammonium bicarbonate, pH 8.0. Cysteine residues were reduced and alkylated with 5 mM tributylphosphine and 20 mM acrylamide monomers for 90 min at room temperature. Actin monomers were labelled in 20-fold molar excess Sulfo-NHS-LC-Biotin for 3 h at room temperature. Plasminogen from human serum (Sigma-Aldrich) was labelled in 20-fold molar excess Sulfo-NHS-LC-Biotin for 3 h at room temperature. Excess biotin was removed by buffer exchange into PBS using a PD-10 Desalting Column (GE Healthcare, Life Sciences). Biotinylated actin and plasminogen were incubated with avidin agarose (Thermo Scientific) on a rotating wheel for 5 h. The separate slurries were packed into columns and the flow-through collected from each. Unbound ligand was thoroughly washed away with PBS. M. hyopneumoniae cells were pelleted by centrifugation at 10 000g for 20 min, washed with PBS, and gently lysed in 0.5% Triton X-100/PBS. Insoluble material was removed by centrifugation at 16 000g for 10 min, and the cleared lysate was incubated with biotinylated ligand-avidin agarose mixtures overnight on a rotating wheel at 4 °C. The mixtures were packed into columns, and the unbound proteins were thoroughly washed and collected in PBS. Interacting proteins were eluted with 30% acetonitrile, 0.4% trifluoroacetic acid. The eluted proteins were concentrated using a 3000 Da cut-off filter and acetone precipitated before pelleting by centrifugation. Elutions were subsequently subjected to one-dimensional SDS-PAGE for transfer and detection by blotting or GeLC-MS/MS for protein identification.
Surface proteins were identified by enzymatic cell surface shaving using trypsin for 5 min at 37 °C as previously described [12] and cell surface labelling using Sulfo-NHS-LC-Biotin for 30 s at 4 °C as previously described [10].
One-dimensional liquid chromatography tandem mass spectrometry using Q-TOF
These methodologies were performed as described previously [21,22]. Briefly, samples were loaded using an Eksigent AS-1 autosampler connected to a Tempo nanoLC system (Eksigent, USA) at 20 µl min⁻¹ onto a C8 trap column (Michrom, USA) before washing and elution at 300 nl min⁻¹ … spots, 50–80% MS B over 5 min, 80% MS B for 2 min, 80–5% for 3 min. An intelligent data acquisition experiment was performed, with a mass range of 350–1500 Da scanned for peptides of charge state 2+ to 5+ with an intensity of more than 30 counts scan⁻¹. Selected peptides were fragmented, and the product ion fragment masses were measured over a mass range of 50–1500 Da. The mass of the precursor peptide was then excluded for 120 s for gel slices or 15 s for gel spots.
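The precursor selection and exclusion rule described above can be illustrated with a minimal Python sketch. It is not the instrument vendor's acquisition logic: the `Precursor` class, the `select_precursors` function and the `mz_tol` matching tolerance are hypothetical names and values introduced here for illustration, and the scanned mass range is treated as an m/z window. Only the window (350–1500), the charge states (2+ to 5+), the intensity threshold (more than 30 counts per scan) and the exclusion times (120 s or 15 s) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    time_s: float     # time at which the ion was observed
    mz: float         # precursor m/z
    charge: int       # charge state
    intensity: float  # counts per scan


def select_precursors(ions, exclusion_s, mz_range=(350.0, 1500.0),
                      charges=(2, 3, 4, 5), min_intensity=30.0, mz_tol=0.1):
    """Return the ions that pass the filters and are not dynamically excluded.

    mz_tol is an assumed matching tolerance; it is not stated in the text.
    """
    selected = []
    excluded = []  # (m/z, time until which that m/z is excluded)
    for ion in sorted(ions, key=lambda i: i.time_s):
        if not (mz_range[0] <= ion.mz <= mz_range[1]):
            continue
        if ion.charge not in charges or ion.intensity <= min_intensity:
            continue
        if any(abs(ion.mz - mz) <= mz_tol and ion.time_s < until
               for mz, until in excluded):
            continue
        selected.append(ion)
        excluded.append((ion.mz, ion.time_s + exclusion_s))
    return selected


ions = [Precursor(0, 500.25, 2, 100), Precursor(20, 500.25, 2, 100),
        Precursor(130, 500.25, 2, 100)]
slice_times = [p.time_s for p in select_precursors(ions, exclusion_s=120)]
spot_times = [p.time_s for p in select_precursors(ions, exclusion_s=15)]
```

With the longer exclusion used for gel slices, the repeat observation at 20 s is skipped, whereas the shorter exclusion used for gel spots allows the same precursor to be fragmented again.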
One-dimensional liquid chromatography-mass spectrometry/mass spectrometry using ion trap
Peptide samples were analysed by nanoflow LC-MS/MS (nanoLC-MS/MS) using an LTQ-XL linear ion trap mass spectrometer (Thermo, San Jose, CA), using a fused silica capillary with an integrated electrospray tip (75 µm ID × 70 mm) packed with 100 Å, 5 µm Zorbax C18 resin (Agilent Technologies, CA, USA). An electrospray voltage of 1800 V was applied via a liquid junction upstream of the C18 column. Samples were injected onto the column using a Surveyor autosampler, which was followed by an initial wash step with buffer A (5% v/v acetonitrile, 0.1% v/v formic acid) for 10 min at 1 µl min⁻¹. Peptides were eluted from the column with 0-50% buffer B (95% v/v acetonitrile, 0.1% v/v formic acid) for 58 min at 500 nl min⁻¹. The column eluate was directed into a nanospray ionization source of the mass spectrometer. Spectra were scanned over the range of 400-1500 amu and, using XCALIBUR software (version 2.06, Thermo), automated peak recognition, dynamic exclusion and MS/MS of the top six most intense precursor ions at 35% normalized collision energy were performed. Peptide identifications were accepted if their calculated probability was greater than 95.0% with a false discovery rate of 1.27%, and protein identifications were accepted if their calculated probability using the Peptide Prophet algorithm was greater than 80.0% with a false discovery rate of 2.4%. Protein probabilities were assigned by the Protein Prophet algorithm. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.

Mass spectrometry/mass spectrometry data analysis
The use of multiple techniques improved confidence in 'one-hit wonders', that is, proteins identified by a single peptide in a single replicate. Adopting the approach of White et al. [32], if the same single peptide was identified in two or more replicates or experiments, the protein was considered to be present, rather than a 'one-hit wonder'. Similarly, if a single peptide identified a protein in one replicate and a different single peptide identified the same protein in a separate replicate, then the protein was considered to be expressed. Single peptide hits were only retained in the dataset if, after being subjected to manual validation, the MS/MS spectra had a considerable sequence of b- and y-ions that were the dominant ions in the spectra. Six proteins were identified to be true one-hit wonders, with the identifying spectra and fragmentation data shown in electronic supplementary material, figure S4.
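The decision rule for single-peptide identifications can be sketched in a few lines of Python. This is an illustrative reading of the rule as stated, not the authors' software: the function name `classify_proteins`, the tuple layout and the example identifiers are assumptions, and the manual spectral validation step is outside the scope of the sketch.

```python
from collections import defaultdict


def classify_proteins(observations):
    """Classify proteins from (protein_id, peptide, replicate_id) tuples.

    A protein seen with exactly one peptide in exactly one replicate is a
    'one-hit wonder' (a candidate for manual validation); anything else is
    treated as present.
    """
    peptides = defaultdict(set)
    replicates = defaultdict(set)
    for protein, peptide, replicate in observations:
        peptides[protein].add(peptide)
        replicates[protein].add(replicate)

    result = {}
    for protein in peptides:
        single_peptide = len(peptides[protein]) == 1
        single_replicate = len(replicates[protein]) == 1
        if single_peptide and single_replicate:
            result[protein] = "one-hit wonder"
        else:
            result[protein] = "present"
    return result


# Hypothetical identifiers and peptides, for illustration only.
calls = classify_proteins([
    ("protA", "PEPTIDEK", "rep1"),   # same peptide, two replicates
    ("protA", "PEPTIDEK", "rep2"),
    ("protB", "AAAAK", "rep1"),      # different single peptides per replicate
    ("protB", "CCCCR", "rep2"),
    ("protC", "DDDDK", "rep1"),      # one peptide, one replicate
])
```

Under these assumptions, only `protC` is flagged for manual inspection of its MS/MS spectrum.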

Protein-centric approaches to mapping the Mycoplasma hyopneumoniae proteome
We applied a series of fractionation technologies that retain mass context to lysates of M. hyopneumoniae-type strain J, to determine the diversity of proteins that are targets of endoproteolytic processing. Members of two adhesin families related to P97 and P102, respectively, are known to be extensively processed on the cell surface of M. hyopneumoniae, but the extent to which other proteins on the cell surface are targets of endoproteolytic processing has not been explored. Three hundred and forty-seven unique M. hyopneumoniae strain J proteins, representing approximately 52% of the predicted proteome, were identified from the combined experiments following analysis by SCAFFOLD (electronic supplementary material, table S1). Table 1 summarizes the identification of proteins expressed in M. hyopneumoniae as detected by each of these methods. Interestingly, two uncharacterized proteins were identified mapping only to strain 232: an 8.8 kDa protein, Q5ZZV3, identified by one peptide in two runs on both ion trap and Q-TOF; and an 11.3 kDa protein, Q5ZZV5, identified by two peptides in one run from ion trap data. A BLAST search of the UniProt database shows that these proteins are conserved among strains 232, 7448 and 168; however, they are not annotated to be present in strain J. Seventy-seven (22%) of the identified proteins are named in UniProt as 'uncharacterized protein', despite some sharing homology with proteins that are well characterized in the literature, such as the P97 and P102 paralogues MHJ_0369 and MHJ_0368 (Q4A9W4 and Q4A9W5), homologues of Mhp385 and Mhp384 (Q600R9 and Q600S0), respectively, in M. hyopneumoniae strain 232 [12]. GeLC-MS/MS preserves the intact molecular weight of proteins and was a valuable strategy to identify cleavage events that affected the migration of members of the P97 and P102 adhesin families [10,20-22]. Much finer resolution of cleavage fragments was achieved using two-dimensional PAGE.
pH 4-7 and 6-11 gels were run using whole cell extracts of M. hyopneumoniae (figure 1). Overall, 340 spots comprising 180 spots from a 4-7 isoelectric point gradient gel and 160 spots from a 6-11 isoelectric point gradient gel were resolved well enough to be excised and analysed by LC-MS/MS. Identifications were obtained for 302 spots (159 from pI 4-7 and 143 from pI 6-11; electronic supplementary material, figures S2 and S3). One hundred and thirty unique proteins were identified from these 302 spots, representing 19% of the predicted proteome (37% of the identifiable proteome).
Eighty-seven proteins were identified from multiple spots. Not all of these, however, could be attributed to processing events, with a significant number of proteins appearing as 'spot trains' at a specific molecular weight that track along the pI gradient. This is likely to be the result of other post-translational modifications that affect pI, such as deamidation, or phosphorylation, which has been previously documented in M. hyopneumoniae [18]. Of particular interest was the presence of 'cloud regions' where numerous spots could be detected, but could not be individually resolved (figure 1, boxed). These cloud regions are significant, as similar patterns in the same region have been previously identified when M. hyopneumoniae proteins were separated over nonlinear pH 6-11 gels using a different gel system, carried out in a different laboratory, and are thus unlikely to be an artefact of sample preparation or gel separation methods [23]. We postulated that these low-abundance cleavage fragments are generated by endoproteolysis of abundantly expressed members of the P97 and P102 adhesin families. A two-dimensional blot probed with rabbit anti-F3 P97 serum [14] showed that P97, P66 and a range of lower abundance fragments of MHJ_0194 are recognized (figure 1c). Identical blots probed with a pool of convalescent sera sourced from pigs testing positive for infection with M. hyopneumoniae showed a strong reaction to the low-abundance P97 and P102 adhesin cleavage fragments (figure 1d). These observations are consistent with the highly immunoreactive nature of proteins carrying proline-rich repeats [18] such as those recognized by anti-F3 P97 serum.
rsob.royalsocietypublishing.org Open Biol. 6: 150210
It is important to note that this list is not exhaustive, as many other proteins were not identified with sufficient sequence coverage to be confirmed as cleavage fragments. We selected lipoprotein P65, an uncharacterized protein of unknown function and the cytosolic protein lactate dehydrogenase, which we show are targets of endoproteolytic processing events, to provide an insight into the sequences that are targeted by the processing machinery and to present some of the putative functions of the cleavage fragments that are generated by these processing events.

Evidence that the P65 lipoprotein is processed on the surface of Mycoplasma hyopneumoniae
P65, MHJ_0656 (Q4A932), comprises 627 amino acids and is a 71 kDa lipolytic lipoprotein with a preference for short-chain fatty acids [38]. The N-terminal 29 amino acids comprise the signal sequence and are expected to be removed, followed by lipid modification of the cysteine residue at position 30, generating a mature lipoprotein with a mass of 68 kDa and a pI of 5.8. We identified P65 as a series of protein spots on a two-dimensional gel with a mass of approximately 68 kDa and a pI of 5.8 (figure 2, peptide coverage in black). This 68 kDa molecule was also identified in separate affinity-capture assays using heparin and biotinylated porcine epithelial-like surface proteins as bait (figure 2, peptide coverage in red and blue, respectively). P65 is predicted to display three regions of protein disorder from amino acids 189-228 (DR1), 340-418 (DR2) and 553-627 (DR3) according to the PONDR VSL2 algorithm. One of these, DR1, also overlaps with a coiled coil region (100% probability using the COILS algorithm) between amino acids 214-245, suggesting that this region may not be disordered [39]. Efficient cleavage events are known to occur in S/T-X-F↓X-D/E and related motifs that reside within acidic, disordered regions in the P97 and P102 adhesin families in M. hyopneumoniae [9,10,12,17,20-22]. We identified an S/T-X-F↓X-D/E motif in P65 with sequence 360 T-N-F↓D-D 364 that resides in DR2, and a cleavage site that cuts at phenylalanine with sequence 501 V-A-F↓F-A 505 that is not located within a region of disorder. Both motifs reside within acidic regions that display a pI of 5 or less ( … did identify a single tryptic peptide in a gel slice spanning 30-35 kDa that contained M. hyopneumoniae proteins captured during affinity chromatography using fibronectin as bait (fragment 5 in figure 2). This fragment is consistent with cleavage at positions 106 and 362, generating a protein with a mass of 30.4 kDa with a pI of 6.97.
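Candidate sites of the S/T-X-F-X-D/E type can be located in a protein sequence with a simple pattern search. The Python sketch below is an assumption-laden illustration rather than the authors' workflow: the function name `find_cleavage_motifs`, the `first_residue` parameter and the short test sequence are hypothetical, and cleavage is assumed to occur immediately after the phenylalanine, as described in the text for P65.

```python
import re

# S or T, any residue, F, then (after the presumed cut) any residue and D or E.
# The lookahead allows overlapping candidate sites to be reported.
MOTIF = re.compile(r"(?=([ST].F)(.[DE]))")


def find_cleavage_motifs(sequence, first_residue=1):
    """Return (residue number of F, 'xxF|xx') for each candidate motif.

    first_residue is the residue number of sequence[0], so that positions
    can be reported in the numbering of the full-length protein.
    """
    hits = []
    for match in MOTIF.finditer(sequence):
        f_position = match.start() + 2 + first_residue
        hits.append((f_position, match.group(1) + "|" + match.group(2)))
    return hits


# Hypothetical stretch built around the T-N-F-D-D motif reported for P65,
# numbered so that the threonine is residue 360.
p65_like_hits = find_cleavage_motifs("AATNFDDKK", first_residue=358)
```

A match only flags a sequence as a candidate; as the text notes for other proteins, processing at a predicted motif still has to be confirmed experimentally.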

Processing events identified in atypical cell surface proteins of Mycoplasma hyopneumoniae
Metabolic proteins such as elongation factor Tu, pyruvate dehydrogenase complex components A, B and D, glyceraldehyde-3-phosphate dehydrogenase and L-lactate dehydrogenase (LDH) showed evidence of post-translational processing and were also identified in cell surface analyses (table 2). Evidence that LDH is processed is presented in figure 3. LDH was identified at its predicted mass of 35 kDa and at multiple pI values between 5.7 and 7.5 on pH 4-7 and 6-11 two-dimensional gels (figure 3, peptide matches in black). Peptides mapping to LDH were also identified from gel spots at apparent molecular mass of 19 kDa and pI 5.3-5.8 on pH 4-7 gels and at 13 kDa and pI 8.5 on pH 6-11 gels. The full-length LDH protein was identified in separate affinity-capture assays using heparin, biotinylated fibronectin, actin and porcine epithelial-like surface proteins as bait (figure 3, peptide coverage in red, orange, purple and blue, respectively). While further studies are needed to confirm biologically meaningful interactions between LDH and these host molecules, affinity-capture assays provide independent evidence that regions within LDH bind host molecules and enrich for cleavage fragments. A single cleavage event between amino acids 188-199 would result in a theoretical N-terminal fragment of approximately 21 kDa with a pI of 5.2 and a C-terminal fragment of approximately 13 kDa with a pI of 9.2, which is similar to the fragments of LDH identified from two-dimensional gels. The shift in pI may be attributed to deamidation of asparagine residues at position 121 in the N-terminal fragment and positions 269 and 279 in the C-terminal fragment, as detected in peptides identified by GeLC-MS/MS. Neither fragment of LDH was detected from actin or PK15 cell surface binding pulldowns; however, peptides mapping to LDH were identified at masses between 40 and 200 kDa in actin pulldowns, possibly indicating incomplete dissociation from multimeric complexes prior to SDS-PAGE.
Peptides mapping to the N-terminal fragment were identified from heparin and fibronectin affinity GeLC-MS/MS experiments in slices at masses 15-20 and 15-23 kDa, respectively, whereas peptides mapping to the C-terminal fragment were identified only from heparin affinity GeLC-MS/MS experiments in a slice encompassing masses less than 15 kDa. Although no heparin-binding motif was identified in the C-terminal fragment, the lysine-rich sequence 290 DKEKEKFAKS 300 could facilitate interaction with heparin.
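Theoretical fragment masses of the kind compared with gel positions above can be estimated from sequence alone. The following Python sketch sums standard average residue masses and adds one water per fragment; it is a simplified stand-in for a tool such as ProtParam, not a reimplementation of it, and the function names and the toy sequence are assumptions introduced for illustration. It ignores modifications (for example, lipidation or deamidation) and does not estimate pI.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per free polypeptide chain


def average_mass(sequence):
    """Average mass (Da) of an unmodified polypeptide."""
    return sum(AVG_RESIDUE_MASS[residue] for residue in sequence) + WATER


def fragment_masses(sequence, cut_after):
    """Masses of the fragments produced by cleaving after the given
    1-based residue numbers."""
    bounds = [0] + sorted(cut_after) + [len(sequence)]
    return [average_mass(sequence[start:end])
            for start, end in zip(bounds, bounds[1:])]


# Toy sequence (not LDH): a single cut after residue 2 gives two fragments.
toy = "GAGA"
toy_fragments = fragment_masses(toy, cut_after=[2])
```

Each cleavage adds one water to the total, so the fragment masses sum to the intact mass plus 18.015 Da per cut; apparent masses on SDS-PAGE can still deviate from such estimates.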
Further work is needed to determine if 290 DKEKEKFAKS 300 binds heparin.

… proteolytic cleavage. Placing the true N-terminus of the C-terminal fragment at amino acid position 567 (M) would generate a protein with a mass of 12.5 kDa and a pI of 5.47 as predicted by ProtParam. We identified the C-terminal fragment from two-dimensional gels at the same approximate molecular mass, with pI ranging from approximately 5.5 to 6.2 (figure 1). MHJ_0009 was also identified from GeLC-MS/MS of samples following heparin affinity chromatography from a slice at molecular mass of approximately 10-12 kDa, in elutions carrying proteins with low heparin binding affinity (elution in 150-600 mM NaCl). This is consistent with the presence of putative heparin binding motifs within the C-terminus. Eight putative heparin-binding motifs similar to those described previously [21,22] were identified within MHJ_0009, in both the N- and C-terminal fragments, as denoted by grey underlined regions in figure 4. The protein was identified by the same two C-terminal peptides identified from low molecular mass slices in GeLC-MS/MS (underlined in black in figure 4). The C-terminal fragment of MHJ_0009 contains a thioredoxin-like domain, and a BLAST search of this fragment gives approximately 60% identity to thioredoxin from other Mycoplasma species (M. bovoculi: E-value: 2 × 10⁻⁴³, score: 375, identity: 62%). Further work is needed to confirm if the C-terminal cleavage fragment displays oxidoreductase activity. Only one of the cleaved proteins listed in table 2, the uncharacterized protein MHJ_0523, has not also been identified in surfaceome studies using enzymatic shaving and/or cell surface biotinylation [40]. MHJ_0523 encodes a 230 kDa putative lipoprotein and is predicted to possess a transmembrane domain at the N-terminus (TMpred score 1612) and three other putative transmembrane domains (figure 5), which would suggest that the protein is likely to traverse the cell membrane. Extraction of M. hyopneumoniae with TX-114 is likely to have concentrated MHJ_0523 into the detergent-soluble fraction, indicating that it may be surface-exposed but expressed at low levels, rendering it undetectable by our shaving/biotin labelling methods. Detection of MHJ_0523 in slice 1 indicates that the molecule is poorly soluble during SDS-PAGE or that it forms large mass multimeric structures. Fragments identified were from the C-terminus, ranging from masses upwards of 75 kDa on the TX-114 gel, with no coverage of the first 314 amino acids. Five putative S/T-X-F↓X-D/E cleavage motifs were identified along the length of the ORF, but we were unable to confirm if processing does occur at these sites.

Proteases identified in Mycoplasma hyopneumoniae
Eighteen ORFs have been annotated in the UniProt database (GO annotation) to have putative protease activity, 11 of which have been identified in our study (table 4) … rise to adhesin fragments, as well as potentially processing other proteins. Lon proteases are bioinformatically predicted to cleave at hydrophobic residues, including phenylalanine (F), and so may play a role in processing at the dominant cleavage motif S/T-X-F↓X-D/E [9,10]. Additionally, uncharacterized protein MHJ_0568 is predicted to possess a trypsin-like domain, which may be responsible for trypsin-like cleavage events at lysine (K) and arginine (R) residues. Efforts are currently under way to confirm these bioinformatically predicted results.

Figure 4 (partial caption). Analysis of the C-terminal cleavage fragment spanning amino acids 568-664 with ProtParam indicated that it was 12.5 kDa with a predicted pI of 5.47 (see also figure 1). MHJ_0009 was also identified by GeLC-MS/MS from slices at approximately 12 kDa from low-affinity heparin chromatography elutions. Putative heparin binding motifs are underlined in grey.

… [32,41-44], they enabled us to characterize endoproteolytic processing events in 35 functionally diverse, surface-associated proteins. The unidentified portion of the proteome consisted of 198 uncharacterized ORFs, which may be of low abundance or have a high rate of turnover, or may not be transcribed under the growth conditions used in our analyses. Additionally, some ORF sequences that remain unidentified contain too many (or rarely too few) lysine and/or arginine residues, making the tryptic peptides generated by digestion undetectable by the methods used. Instrument sensitivity only partially explains why we did not identify a greater proportion of the proteome, given that we were only able to identify 70% (483) of the 691 predicted ORFs in strain 232 during culture in Friis broth [7]. Our approach is consistent with our primary goal to preserve mass context prior to mass spectrometry as a means to identify the gamut of proteins targeted by processing mechanisms. Two-dimensional PAGE was able to resolve individual proteins and isoforms, providing information about post-translational modifications, whereas one-dimensional GeLC-MS/MS methods are higher-throughput, making them better suited for global proteome identification. The protein-centric approaches used in our studies provided insights into the extent of protein processing in M. hyopneumoniae.
Table 2

Our data suggest that protein processing is a post-translational modification that occurs with greater frequency than is currently recognized and occurs in a wide range of functionally diverse cell surface proteins. This is consistent with the processing machinery being associated with the cell surface or with the general secretory pathway. On a cautionary note, it remains to be determined what mechanisms are needed to export proteins with canonical functions in the cytosol onto the cell surface [45-47]. Nonetheless, we provide strong evidence that numerous proteins with functions in the cytosol are bound on the surface of M. hyopneumoniae, where they are targets of endoproteolytic processing. We detected cleavage at S/T-X-F↓X-D/E sites, consistent with the hypothesis that the same enzyme that cleaves the P97 and P102 families also targets other surface-accessible proteins. Enrichment procedures such as TX-114 fractionation and affinity-capture chromatography were useful for enriching the low-abundance proteome, delineating regions of proteins that bind host molecules and enriching for cleavage fragments, all of which provided clues to protein function. TX-114 extraction enriches for hydrophobic membrane proteins, which partition to the detergent phase [48]. As M. hyopneumoniae lacks a cell wall, the cell membrane is the mediator of contact between the bacteria and the extracellular environment; hence, membrane-bound proteins are potentially valuable targets for vaccine and therapeutic development. While the TX-114 GeLC-MS/MS protocol yielded the fewest protein identifications, at 206, it contributed five unique proteins to the overall analysis, all of which were uncharacterized proteins described as lipoproteins and/or predicted to contain transmembrane domains using TMpred. Overall, 26 of 50 M. hyopneumoniae lipoproteins were identified across all methods, and LC-MS/MS analysis of TX-114 solubilized proteins identified 22 of the 26.
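The identification counts quoted in the text can be cross-checked with a few lines of arithmetic. The sketch below only restates figures given in this article (347 of 672 ORFs in strain J, 483 of 691 ORFs in strain 232, 26 of 50 lipoproteins, and 22 of those 26 in the TX-114 fraction). The helper name `coverage` is hypothetical.

```python
def coverage(identified, total):
    """Percentage of `total` entries that were identified."""
    return 100.0 * identified / total

# Counts as quoted in the text; labels are descriptive only.
counts = {
    "strain J ORFs identified (all methods)": (347, 672),
    "strain 232 ORFs identified [7]": (483, 691),
    "lipoproteins identified (all methods)": (26, 50),
    "identified lipoproteins seen in TX-114 fraction": (22, 26),
}

for label, (hit, total) in counts.items():
    print(f"{label}: {hit}/{total} = {coverage(hit, total):.1f}%")
```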
While the precise functions of bacterial lipoproteins remain poorly understood, there is mounting evidence to suggest they are pathogen-associated molecular pattern (PAMP) molecules on the surface of Gram-positive bacteria. PAMPs are recognized by Toll-like receptors that trigger innate immune responses [49-52]. Most mycoplasma lipoproteins are surface-exposed with acyl groups anchoring these proteins in the cell membrane, where they are thought to function as cytadhesins, transport proteins or virulence factors with immunomodulatory capabilities [53]. P65 is an abundantly expressed, immunoreactive and lipolytic lipoprotein that selectively partitions to the detergent phase during extraction with TX-114 [38,54]. Schmidt et al. showed that anti-P65 antibodies inhibit the lipolytic activity of P65 and growth of M. hyopneumoniae, indicating that P65 performs a primary function on the external membrane surface by providing a source of essential lipids for growth [38]. It has also been suggested that P65 may alter surfactant properties in the lungs of pigs in vivo [38]. In our studies, P65 was recovered during affinity capture protocols using different host molecules as bait. Although these are preliminary data that require quantitative studies to confirm a direct role for P65 in these interactions, this suggests that P65 displays motifs that facilitate binding to a diverse range of host molecules. Consistent with these preliminary observations, we show here for the first time that P65 is a target of several processing events that generate cleavage fragments which are selectively retained during affinity chromatography using porcine epithelial cell surface proteins, fibronectin or porcine heparin as bait. The ability of the cleavage fragments of P65 to bind the same bait proteins as P65 lends weight to the hypothesis that the interactions with host molecules are direct and biologically relevant.
Cleavage occurred at a number of sites in the P65 protein sequence, including at a phenylalanine residue within an S/T-X-F↓X-D/E motif, a known processing site in the P97 and P102 adhesin families [9-12,15,18,20-23]. An immunoblot of biotinylated cell surface M. hyopneumoniae strain J proteins fractionated using TX-114 that was probed with anti-P65 polyclonal antibodies identified a 65 kDa protein and numerous smaller mass fragments of P65 consistent with cleavage at several sites within the molecule. These data show that P65 and cleavage fragments of P65 reside on the surface of M. hyopneumoniae. Notably, there is clear evidence of a doublet at approximately 65 kDa (boxed in figure 2b) in the lane containing M. hyopneumoniae aqueous phase proteins. Previous studies have shown that P65 may undergo clipping at the N-terminus and be a target of further post-translational processing events [54]. Our data suggest that the doublet may represent forms of P65 that have lost the lipid anchor because they partitioned to the aqueous phase. If correct, these data suggest that a small lipopeptide similar to the macrophage-activating lipopeptide 2 (MALP-2) of Mycoplasma fermentans may be produced from P65.
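Candidate sites matching the S/T-X-F-X-D/E motif can be located with a simple pattern search. The sketch below is illustrative only: it assumes the cut falls immediately C-terminal to the phenylalanine and that every matching site is cleaved to completion. The sequence is an invented toy example rather than P65, and the function names are hypothetical. A motif match predicts a candidate site only and is not evidence that cleavage occurs there.

```python
import re

# Lookahead so that overlapping motif occurrences are all reported.
MOTIF = re.compile(r"(?=([ST].F.[DE]))")

def find_cleavage_sites(seq):
    """Return (1-based position of F, matched motif) for each S/T-X-F-X-D/E match."""
    return [(m.start() + 3, m.group(1)) for m in MOTIF.finditer(seq)]

def fragments(seq):
    """Cut after each motif phenylalanine, assuming complete cleavage at every site."""
    cuts = [0] + [pos for pos, _ in find_cleavage_sites(seq)] + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]

toy_seq = "MKTAFGDLLSQFNEAA"  # invented sequence, for illustration only
print(find_cleavage_sites(toy_seq))
print(fragments(toy_seq))
```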
Lipoproteins of mycoplasmal origin are known targets of post-translational processing events. The first 14 amino acids of MALP-404, a 41 kDa lipoprotein in M. fermentans, are removed by a post-translational cleavage event generating a 2 kDa MALP-2 lipopeptide. The C-terminal 39 kDa cleavage fragment (known as RF) that results from this cleavage event has been isolated from culture supernatants, but its function remains unknown [55]. Unlike RF, both MALP-2 and MALP-404 are lipid-modified and remain associated with the membrane of M. fermentans. MALP-2 is a potent immunomodulatory molecule that engages Toll-like receptor 2 [50]. Like the MALP-2 lipopeptide, the N-terminus of P65 may play an immunomodulatory role in M. hyopneumoniae; however, further studies are required to confirm this. Similarly, MGA0674 is an 82 kDa lipoprotein in Mycoplasma gallisepticum whose expression is elevated in virulent strain R low compared with the attenuated vaccine strain F, suggesting that it may play a role in pathogenesis. MGA0674 is a target of a processing event at position 225 that releases a C-terminal 57 kDa fragment from the anchored N-terminal 22 kDa lipoprotein [56]. There are other reports of processing events that target lipoproteins in Mycoplasma pneumoniae, but their functions remain poorly characterized [57].

The extent of proteolytic processing in Mycoplasma hyopneumoniae
A significant number of the 35 proteins identified as targets of post-translational processing were glycolytic enzymes and other metabolic proteins. Glycolytic enzymes are increasingly being identified as multitasking or moonlighting proteins in a wide range of organisms, including parasites [58], yeasts and fungi [59], mammalian cells [60], plants [61] and bacteria [62], and this is reflected in the range of entries seen in MultitaskProtDB [63]. In other members of the Mollicutes, proteins with canonical functions in the cytosol have also been found to be surface-exposed and to interact with host components. For example, in Mycoplasma pneumoniae, elongation factor Tu (EfTu) and pyruvate dehydrogenase (PdhB) were identified as surface-exposed moonlighting proteins through screening for fibronectin-binding proteins by ligand blotting of whole cell lysates and fibronectin-coupled affinity chromatography, and their surface localization was confirmed by immunogold labelling and electron microscopy [64]. Further investigation of EfTu by immunogold labelling revealed that the carboxyl-terminus specifically was surface-exposed and responsible for fibronectin binding [65]. In Mycoplasma genitalium, glyceraldehyde-3-phosphate dehydrogenase was found to be surface-exposed and to bind mucin, probably functioning as an adhesin [66]. These proteins were found here to be cleaved. Many processing events are likely to alter canonical (enzymatic) function and profoundly influence how cleavage fragments interact with the mycoplasma membrane and host molecules [67]. LDH is a highly immunogenic cytoplasmic protein involved in the glycolytic process of M. hyopneumoniae [68,69]. Here, we have found LDH to be present at the cell surface both as a full-length molecule and as cleavage fragments.
We identified a single putative cleavage site between amino acids 188 and 199, and the cleaved form of LDH is unlikely to carry out its primary function owing to significant structural alteration. In eukaryotic organisms, LDH has been recognized as a moonlighting protein, along with other glycolytic enzymes such as hexokinase, glyceraldehyde-3-phosphate dehydrogenase and enolase, playing a role in transcriptional regulation [70]. LDH has also been identified as a single-stranded DNA-binding protein in eukaryotic cells [71,72]. In eukaryotic cells, this switch in function is likely to be due to translocation to the nucleus, where these functions take place, possibly through post-translational modifications such as tyrosine phosphorylation [70,73,74]. It is possible that, as in eukaryotic cells, post-translational modifications may also affect localization and function of LDH in M. hyopneumoniae, or direct a subset towards processing. Intact LDH was identified from spots on two-dimensional gels between pI of 5.7 and 7.5. Given the theoretical pI of 7.63, this indicates an acidic shift, which is likely to be caused by a variable degree of post-translational modification, such as deamidation, affecting a proportion of LDH [75]. LDH has also been identified from extracellular supernatants of various Lactobacillus and Bifidobacterium species from the honeybee Apis mellifera [76]. These lactic acid bacteria belong to the Firmicutes, and are genetically similar to the low G + C content mycoplasma species. It was hypothesized that LDH, once localized to the surface, could evolve alternative functions as a moonlighting protein, functioning as an auxiliary adhesin [76]. Indeed, we have also previously identified a glutamyl aminopeptidase from M. hyopneumoniae, MHJ_0125, which moonlights as a multifunctional adhesin at the cell surface [77], and a leucyl aminopeptidase, MHJ_0461, which functions as a multi-substrate peptidase and binds heparin, plasminogen and foreign DNA [47].
The cleavage fragments of LDH identified here bound to heparin, used as a structural mimic for glycosaminoglycans in the respiratory tract, and the N-terminal fragment also bound to fibronectin, an extracellular matrix component, indicating fragments may also have adhesin functions.
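The acidic pI shift attributed above to deamidation can be illustrated with a simple charge model. The sketch below is a rough approximation, not the method used in this study: it estimates pI by bisection on a Henderson-Hasselbalch net-charge function with approximate textbook pKa values (an assumption, as different tools use different sets), and it models deamidation as conversion of asparagine to aspartate. The sequence is an invented toy peptide, not LDH, and all names are hypothetical.

```python
# Approximate side-chain and terminal pKa values (assumed; tools differ).
POS_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEG_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6

def net_charge(seq, ph):
    """Net charge at a given pH from independent Henderson-Hasselbalch terms."""
    pos = 1 / (1 + 10 ** (ph - N_TERM_PKA))
    pos += sum(1 / (1 + 10 ** (ph - POS_PKA[a])) for a in seq if a in POS_PKA)
    neg = 1 / (1 + 10 ** (C_TERM_PKA - ph))
    neg += sum(1 / (1 + 10 ** (NEG_PKA[a] - ph)) for a in seq if a in NEG_PKA)
    return pos - neg

def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection on pH; net charge decreases monotonically as pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def deamidate(seq, n_sites):
    """Convert the first n_sites asparagines to aspartate (N -> D)."""
    return seq.replace("N", "D", n_sites)

toy_seq = "MKNLAVIGNGKHVRSNAEELKNG"  # invented peptide, for illustration only
for n in range(3):
    print(n, "deamidated site(s): pI ~", round(isoelectric_point(deamidate(toy_seq, n)), 2))
```

Each added aspartate contributes extra negative charge, so the estimated pI falls with every deamidated site. A population of molecules with variable numbers of modified sites would therefore be expected to resolve as a train of spots on the acidic side of the theoretical pI.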

Conclusion
We identified 347 (52%) of the 672 putative ORFs predicted from the genome sequence of M. hyopneumoniae strain J. The proteome coverage from well-resolved two-dimensional gels, while low, is unsurprising. The limitations of two-dimensional gels are well documented, particularly considering the nature of sample preparation required, which limits the ability to retain and resolve very basic, acidic, small, large or hydrophobic proteins [43]. However, protein-centric, gel-based separations provide a technique complementary to high-throughput two-dimensional LC-MS/MS protocols by maintaining mass and pI context, allowing the identification of cleavage products and the extent of proteolytic processing. We show for the first time that proteins with canonical functions in the cytosol that moonlight on the cell surface are also targets of endoproteolytic events. This describes a new dimension to protein moonlighting and suggests that much more biological information is inherent in proteins. While we cannot yet determine the exact nature of cleavage events as they occur in vivo, the analysis presented here is an important first step in determining physiologically relevant cleavage events. Cleavage events will undoubtedly complicate efforts to correlate the transcriptome with the proteome in future studies [78,79], and the protein-centric approaches presented here will provide a solid foundation for further investigation of post-translational processing in proteins involved in pathogenesis of M. hyopneumoniae, and will assist with delineating functionally important binding motifs.