Mechanistic insights into molecular evolution of species-specific differential glycosaminoglycan binding surfaces in growth-related oncogene chemokines

Chemokines are chemotactic cytokines involved in leucocyte trafficking to infected tissue. Growth-related oncogene (GRO) chemokines namely CXCL1, CXCL2 and CXCL3 are neutrophil activating chemokines sharing a conserved three-dimensional structure, but encompassing functional diversity due to gene duplication and evolutionary events. However, the evolutionary mechanisms including selection pressures involved in diversification of GRO genes have not yet been characterized. Here, we performed comprehensive evolutionary analysis of GRO genes among different mammalian species. Phylogenetic analysis illustrated a species-specific evolution pattern. Selection analysis evidenced that these genes have undergone concerted evolution. Seventeen positively selected sites were obtained, although the majority of the protein is under purifying selection. Interestingly, these positively selected sites are more concentrated on the C-terminal/glycosaminoglycan (GAG) binding and dimerization segment compared to receptor binding domain. Substitution rate analysis confirmed the C-terminal domain of GRO genes as the highest substituted segment. Further, structural analysis established that the nucleotide alterations in the GAG binding domain are the source of surface charge modulation, thus generating the differential GAG binding surfaces and multiple binding sites as per evolutionary pressure, although the helical surface is primordial for GAG binding. Indeed, such variable electrostatic surfaces are crucial to regulate chemokine gradient formation during a host's defence against pathogens and also explain the significance of chemokine promiscuity.

(e) Essential dimer interface contacts between C-terminal helices and with β-strands. (f ) Surface structure of HCXCL1 monomer marked with receptor binding residues (red) and GAG binding residues forming α-domain (pink) and β-domain (green).

Selection analysis
Multiple sequence alignment of GRO nucleotide sequences was screened for the gene recombination using genetic algorithm recombination detection (GARD) approach of DATAMONKEY server [32,33]. In addition, GENECONV method of the RDP4 program was used to identify the gene conversion events [34]. Codon-based maximum-likelihood method was employed to test positive selection in GRO genes and to infer amino acid sites under positive selection during evolution using DATAMONKEY server and codeml program in the PAML 4.9a [33,35]. DATAMONKEY programs include: single likelihood ancestor counting (SLAC), fixed-effect likelihood (FEL), internal fixed effects likelihood (iFEL), mixedeffect model evolution (MEME), branch-site random effect likelihood (REL), fast unbiased Bayesian approximation (FUBAR) [33,[36][37][38]. For SLAC, FEL, iFEL and MEME methods, a p-value of 0.1 was used. p-Value was set to 0.95 for FUBAR, and a Bayes factor of greater than 95 was used for REL. For codeml program, once the positive selection was confirmed, naive or Bayes empirical Bayes approach was used to calculate the posterior probability for all the sites evolved under positive selection [39]. As reported previously, the resulting positively selected amino acid sites that were obtained at least by two methods were considered [17,[40][41][42][43]. Further, to examine the nature of duplication (concerted/divergent) among GRO genes, we compared the pairwise values of dN and dS for all the species.

Conservation score and substitution rates
To determine the site-specific conservation score of GRO proteins, ConSurf server was used [44]. Conservation profile for each of the sites was generated using WebLogo [45]. Nucleotide substitution rates for GRO genes were determined using p-distance model in MEGA 6.0 [31].

Computation of pairwise omega values among different species
To assess the variations among the non-synonymous/synonymous substitution rate ratios among different GRO domains (figure 1a), pairwise omega values were computed for three GRO domain sequences independently using Nei-Gojobori method with JCODA [46,47].

Coevolution analysis
Coevolving residues in the GRO genes were predicted using MISTIC (mutual information server to infer coevolution) server [48]. Analysis was done using the available structures of HCXCL1, HCXCL2 and MCXCL2. Average mutual information (AMI) score was calculated for all the coevolving pairs of positively selected residues.

Sequence and structural analysis of growth-related oncogene chemokines
The nucleotide codon versus amino acid alterations of the CXCL1-3 sequences for all the species present in the phylogenetic tree were analysed independently by aligning amino acids with nucleotide sequences.
Structural models for human CXCL3, murine CXCL3 and horse CXCL1/CXCL2/CXCL3 were generated through homology modelling based on target template alignment using Promod II in Swiss model server by employing known GRO structures as templates (electronic supplementary material, tables S2 and S3) [49]. Electrostatic surface potential maps for human, horse and murine CXCL1-3 monomeric and dimeric chemokines were generated using vacuum electrostatics of PyMOL molecular graphic system [50].

Growth-related oncogene genes adapted species-specific evolutionary patterns
Phylogenetic analysis exploiting the sequence data enriches us with an accurate picture of how different genes evolve, and are related to each other with respect to time and speciation. It not only provides information about the gene evolution pattern at present but also predicts the patterns of evolution that can be followed in future. To rejuvenate the historical events that have occurred during the course of evolution of GRO genes, we constructed a phylogenetic tree using the amino acid sequences of GRO proteins (figure 2). Broadly, the tree topology displayed three major clades depending on the GRO genes belonging to different suites of species, namely primates, laurasiatheria and rodents. Distribution of these nodes in the tree revealed a distant relationship among GRO genes. Moving towards the inner branches, the tree has unlocked the trends of their inherent duplication and speciation events. Within primates, CXCL1, CXCL2 and CXCL3 are more closely related to their orthologues. Similar topology was observed in rodents, implying duplication as a pre-speciation event. A contrasting trend was marked in the case of laurasiatheria, in which paralogues of GRO chemokines are more closely related than to their orthologues.
Further, to establish the origin of such varied phylogenetic features of the GRO chemokines in mammalian species, we have traced their ancestral origin and classification (electronic supplementary material, figure S1). Boroeutheria with two superorder branches, euarchontoglires and laurasiatheria, were considered to analyse the distinguishable patterns of GRO chemokine evolution. The primates, lagomorpha and rodents belong to the superorder euarchontoglires, where they show close relationship to their paralogues implicating that duplication of GRO genes in euarchontoglires is a pre-speciation event and confirms its ancestral origin. In contrast to euarchontoglires, the superorder laurasiatheria comprising perissodactyla, cetoartiodactyla and carnivora species showed more similarity among the paralogues of GRO genes implying that the duplication of GRO genes in ferreungulata is probably a postspeciation event. Identification of the ancestral origin of laurasiatheria with current set of data of GRO chemokines is not feasible. Full genome annotation of more mammalian species under laurasiatheria is required to reconsider their evolutionary patterns and to consider magnoorder Boreoeutheria as the most probable ancestor for duplicated GRO gene family.
Moving towards the other panel of the tree, one alluring difference was marked in the case of rabbit GRO genes. Despite rabbit being a species of euarchontoglires, it forms a clade with laurasiatheria. Notably, it has been observed that within rabbit GRO genes, CXCL3 shows a different behaviour and is close to bovine CXCL2 and forms a single node. In contrast, the other two paralogues of rabbit (CXCL1 and CXCL2) form a separate node. Similar kind of tree topology for rabbit has also been reported earlier [16]. In summary, this tree outlines the fact that GRO genes have followed specific evolutionary patterns, which might be the outcome of selection pressures.

Structural basis for observed Darwinian selection criterion in growth-related oncogene chemokines
In general, after the gene duplication, newly formed genes evolve by undergoing mutations. Some of these mutational changes can be progressed/selected, whereas some can be lost. human CXCL1 b r o w n r a t C X C L 3 h o u s e m o u s e C X C L 3 C h i n e s e h a m s t e r C X C L 3 b r o w n r a t C X C L 2 h o u s e m o u s e C X C L 2 C h i n e s e h a m s t e r C X C L 2 C h in e s e h a m s te r C X C L 1 b ro w n ra t C X C L 1 hou se mo use CX CL 1 rabbit CXCL 2 h o r s e C X C L 2 h o r s e C X C L 1 g i b b o n C X C L 3 o ra n g u ta n C X C L 3 hu m an C X C L3 chimpan zee CXCL3 cra b-e atin g ma caq ue CX CL 3 c h im p a n z e e C X C L 2 h u m a n C X C L 2 o r a n g u t a n C X C L 2 g i b b o n C X C L 2 c r a b -e a t i n g m a c a q u e C X C L 2 c r a b -e a t i n g m a c a q u e C X C L 1 g i b b o n C X C L 1 o ra n g u ta n C X C L 1 ch im pa nz ee CX CL 1 Phylogenetic tree of GRO family chemokines among different mammalian species. Scale bar represents distance that corresponds to 5% amino acid differences between the sequences. [51][52][53]. The simplest way to find out the positive Darwinian selection in the protein coding region is to compare the non-synonymous (dN) and synonymous substitution rates (dS). The difference between the two rates is measured by the ratio omega (ω) equal to dN/dS which depicts the effect of amino acid selection on the encoded protein of a given gene [54]. If non-synonymous mutations are dysfunctional, purifying selection will diminish or prohibit their fixation and, therefore, dN/dS will be less than 1, but when the non-synonymous mutations are neutral, then they will be fixed at the rate similar to synonymous mutations and dN/dS will be equal to 1. Non-synonymous mutations are fixed at a rate higher than that of synonymous substitutions, with dN/dS > 1 only in the case of positive Darwinian selection, a stringent criterion for detecting positive selection [55,56].
Prior to selection analysis, GRO genes were analysed for the occurrence of any gene recombination events as these events can mislead the conclusions obtained from selection analysis. Previous work by Abrantes et al. reported the gene conversion events between CC chemokine receptors in various mammalian species, although they are absent in their CC chemokine ligands [57,58]. In order to analyse such gene conversion events in GRO chemokines, we performed GARD and GENECOV analysis. The results evidenced no statistically significant gene recombination and gene conversion events in GRO genes. These observations are in coherence with the recombination analysis for CC chemokine ligands of CCR5 as described above [57][58][59].
Selection analysis for GRO genes was performed using maximum-likelihood methods of codeml, SLAC, FEL, REL, iFEL, MEME and FUBAR. The selection analysis was individually performed for all three GRO chemokines (CXCL1, CXCL2 and CXCL3), and also for the complete set of GRO chemokines that comprises all the sets of these three duplicated genes. The main aim of performing the analysis for combined set of the GRO chemokines is to assess the correlative nature of duplication and positive selection (table 1). The datasets with individual chemokines returned 3, 5 and 3 sites for CXCL1, CXCL2  (c) Significant tertiary interactions observed among the residues forming coevolved pairs are shown on HCXCL1 structure. and CXCL3, respectively. Some of these sites such as A55, S72 have evolved positively in more than one GRO protein. A total of 17 positively selected sites were observed when the analysis was performed for the entire set of GRO chemokines. As expected, out of 17 sites, 9 of them are exactly the same as obtained from individual analysis. We presume that the other 8 are an outcome of the duplication phenomenon. All the selection analysis results for GRO genes are summarized in table 1 and figure 3a.
We then analysed the positioning and the importance of these residues with respect to the GRO chemokine structure and function. Among the positively selected residues, four of them (A4, T5, L15, P20) are in the N-terminal domain, four of them (V28, K29, S30, 50A) are in central-core domain and nine of them (A55, I58, K60, E64, S69, D70, K71, S72, N73) are in the C-terminal domain (figure 3a,b). It is evident from the data that most of the positively selected sites (10 out of 17 residues) are in the GAG binding surface (C-terminal helix + 3 10 helix). Interestingly, these residues are mostly spanning on either sides/close proximity of experimentally demonstrated conserved GAG residues (H19, K21 on 3 10 helix and K61, K65 on C-terminal helix) [28,29]. It has also been observed that the N-terminal region which is crucial for receptor binding and activation is under purifying selection, except the two positively selected residues (L15 and P20). Of these two residues, P20 also overlaps with GAG binding surface. These results indicate that receptor binding domain is mainly showing the signatures of purifying selection, and thus conserved among the GRO chemokines. Indeed such behaviour is very much anticipated for the receptor  binding domain considering functional constraints of GRO chemokine binding to its conservative CXCR2 receptor. In contrast, the GAG binding domains have evidenced for large number of positive selection sites indicating the evolution of differential GAG binding features/GAG surfaces among GRO chemokines in different species. These results suggest that the two variable and complementary functional segments (receptor binding domain and GAG binding domain) of the GRO proteins have a selection bias in order to balance their structure-function relationship. The other interesting segment of GRO chemokines that showed positive selection is the dimer interface (V28, K29, S30) of the central-core domain. Indeed, such a specific site/segmental selection of GRO chemokines at the GAG binding and their extensions to dimer interface regions can regulate the functional activity of these chemokines by providing conformational adjustments at dimer interface, and by altering the binding energetics through modulation of positive charge potential on the surface via amino acid substitutions/gene mutations (discussed in later sections). We further analysed the relationship among these positively selected residues in order to assess their role in coevolution. In coevolution/compensatory evolution, when one amino acid evolved at a particular position, the other amino acid evolved at different position simultaneously as a coevolving pair. We analysed the coevolving nature of all the positively selected residues (table 2, figure 3c). Our results demonstrated that several of these amino acids are coevolved and contribute to the structural stability through side chain networks via different tertiary interactions. For example, coevolving pair L15-A50 is involved in hydrophobic interaction. Similarly, L55-D70 and S69-S30 contribute to the inter-subunit interactions (figure 3c). It is worth noting that sometimes coevolved residues are a part of the larger interacting networks. In such cases, a direct interaction with a particular site (positively selected partner) is not evident from distance measurements as noticed for T5. Considering the conserved nature of the receptor domain, and interesting evolutionary patterns of GAG binding domain of GRO chemokines, the evolutionary analysis and discussion is more oriented towards understanding the species-specific differential GAG binding characteristics.

Duplication of growth-related oncogene genes followed concerted evolution
Different models have been proposed for the evolution of multigene families that includes divergent evolution, concerted evolution and birth and death evolution. In divergent evolution, genes obtained after being duplicated from their parental genes acquire new gene functions independently. Whereas in concerted evolution, all the members of a gene family evolve in a concerted/coincidental manner. That is, the mutation occurring in a repeat will spread into the entire members of the gene family by the repeated occurrence of unequal crossover or gene conversion, thus resulting in homogenization of DNA sequences of all the member genes [60,61]. As unequal crossing over or gene conversions are supposed to be driving forces for concerted evolution, dS should be similar to or slightly higher than dN, irrespective of purifying selection. Moreover, if birth-and-death evolution is acting on a gene, then the rate of silent mutation should be extensive, leading dS to be many folds higher than dN [62].   To identify the evolutionary method of GRO genes, we compared the pairwise dS and dN values of GRO genes for different species. It was found that in each species the dS values were similar to or slightly higher than dN values indicating the concerted mode of evolution (table 3). Indeed, concerted evolution has been followed by many genes including ribosomal RNA genes and histone genes that code for the large quantities of proteins with similar functions and are crucial for the endurance of an organism [63,64].

Substitution rate analysis of growth-related oncogene proteins
Accumulation of mutations in the coding sequences of the duplicated genes greatly influences their functions. They include differential expression patterns, and various interactions with biomolecular binding partners. Our selection analysis along with the calculated amino acid conservation scores suggested that all the positively selected sites exhibit a conservation score below 70% and several of the nucleotides/amino acids that correspond to the purifying/neutral selection do vary considerably (electronic supplementary material, figure S2). Therefore, in order to throw light on the rate and nature of nucleotide substitutions resulting after duplication events, we calculated the substitution rates ( figure 4). It was observed that rodent genes are more prone to substitution in comparison to other species, when substitution rates were calculated for the full-length sequence of all the species (figure 4a). Further, to determine the segments of the protein that have major contribution to the high substitution rates, the protein was divided into three domains: N-terminal domain, central domain and C-terminal domain (figure 1a). Substitution rates were calculated for each of the domains among all the species. It was observed that in primates and laurasiatherians, C-terminal domain alone showed high substitution rates, suggesting that the rest of the protein is under stringent purifying selection ( figure 4b,c). However, in contrast to the higher-order mammals, both N-terminal and C-terminal domains showed high substitution rates in the rodents. Moreover, the C-terminal domain has highest propensity of substitution changes compared with the N-terminal counterpart ( figure 4d). Further, the comparative analysis of the substitution rates of alone C-terminal domain of the rodents with that of the same

Sequence and surface properties of growth-related oncogene proteins
Selection pressure analysis at domain level of GRO proteins prompted us to further delineate the molecular level details and sources of differential ω value patterns across different species. In order to dig out the differential evolution characteristics of GRO proteins across the species, we simultaneously analysed the variations in their sequences at both nucleotide and amino acid level (figure 6). We observed very small number and specifically regulated nucleotide and amino acid changes across GRO chemokines in primates and laurasiatherians. Moreover, the changes in these species are mainly through facile single-nucleotide alterations, in order to guide the duplicated sibling for a functional advantage via minimal perturbation. In contrast to this, rodent genes exhibited large number of changes with multitude of nucleotide alterations accompanying amino acid makeovers. It is worth noting that in the case of primates, the amino acid alterations are on the helical surface in GRO genes, i.e. essentially confined within the C-terminal α-helix. Whereas in rodents, apart from the N-terminal, these changes have spanned across both 3 10 helix along with C-terminal α-helix that constitute the GAG binding surface of their positively charged residues. Moreover, the observed gene mutation(s) in the GAG binding surfaces specifically corresponds to the change(s) in amino acids that result in the charge reversal/charge neutralization of protein surface. These observations indicate that promiscuity of GRO genes under evolutionary pressure is focused towards modulating the GAG binding surfaces via minimal gene alterations through charge distribution.
To delineate the surface architectures and electrostatic potentials that can plausibly contribute to differential GAG binding, we modelled the dimeric structures of murine, horse and human GRO chemokines (electronic supplementary material, tables S2, S3 and figure S3), and analysed their  electrostatic potential surfaces ( figure 7). These structural models furnished a pictorial view of distinct GAG binding surfaces that are generated by virtue of amino acid changes. Analysis on human GRO proteins (as dimers) yielded complex electrostatic surfaces and evoked the feasibility of multiple modes of GAG binding (figure 7a, H). It is evident from the electrostatic potential pattern that the human GRO genes will have an alternate GAG binding surface comprising the positive residues (R8, K29, R48, K49; according to HCXCL1) (figure 1a). In line with our evolutionary theory, a recent study on HCXCL1 confirmed the presence of two non-overlapping GAG-binding domains in HCXCL1 [29]. Interestingly, due to surface charge alterations, HCXCL2 has evolved with a contrasting 90 o rotated positive surface on the β-sheet/dimer interface when compared with HCXCL1 and HCXCL3 ( figure 7a, H). Such a surface of HCXCL3 is due to the presence of K27 along with R/K29 at the dimer interface and lack of R48 at the start of β3-strand and is a resultant of single-nucleotide alteration at these two sites on the β-surface of human GRO proteins. Moreover, these changes resulted in specific positioning and shuttling of the residues within Arg and Lys (K29, R48 in hCXCL1 versus R29, K48 in HCXCL3) to fine tune their GAG binding interactions/affinities. Such a GAG binding interface at the β-domain surface has also been reported for homeostatic chemokine CXCL12 [65]. Similar analysis with horse proteins yielded surfaces that are different from the primates. In horse CXCL1 (SCXCL1), the positive surface is confined at the 3 10   In the case of rodents, the mechanism of differential GAG surface formation is different from those of the primates and laurasiatherians. In MCXCL1, 3 10  positive surface potentials constitute single GAG binding motif on the helical surface in murine GRO proteins, and indeed previous NMR experiments on MCXCL1-trapped dimer demonstrated that GAG binds orthogonal to the inter-helical axis and the residues H20, K22 on 3 10 helix and K60 and K64 on the C-terminal helix are crucial for GAG binding [28].
Considering all the above observed electrostatic surface variations on the mammalian GRO proteins (figures 6 and 7), we summarize that the β-sheet binding surface and its relative surface orientation(s) are confined to the primates and laurasiatherians and no such surface is evident in the rodents. Further, our analysis suggested that the electrostatic surfaces have evolved specifically among the GRO genes in a given organism, in order to serve the needs of particular species as per the evolutionary pressure, although the helical surface of GAG binding resembles the primordial mode of GAG binding in all the species. Indeed, our selection analysis supported this theme as the residues H19, K21 on 3 10 helix and K61 and K65 on the C-terminal helix that are crucial for GAG binding are on a tight purifying selection for all the species.

Mechanistic insights into evolutionary characteristics of growth-related oncogene chemokines
Immunity and defence-related genes evolve rapidly under positive selection pressures accumulating amino acid changes more rapidly than other genes. Chemokines listed as one of the 8 most rapidly changing proteins and domains [8,12]. The reason for continuous evolvability of inflammatory chemokines is intimately related to their functionality in host defence processes, as viruses copy the host endogenous chemokines, and target the host machinery. In order to fight against such evolving pathogens, the host genes need to undergo smart changes and also expand themselves to maintain their diversity. This piece of work is an effort to furnish mechanistic insights into the evolutionary characteristics of GRO chemokines. The phylogenetic analysis provided us an overview of the fact that GRO genes have evolved in species-specific patterns. The observed phylogenetic profile for GRO genes is very much consistent with the earlier literature available on CXC chemokines [7,11,16]. Indeed, such an interesting species-specific evolution of GRO genes is an outcome of the mechanism of duplication events that has been precisely reflected in the genomic arrangements of different species including primates, laurasiatherians and rodents [11]. In rodents, all GRO proteins are present in the same transcriptional orientation in contrast to primates where CXCL2 and CXCL3 are placed in an opposite direction to CXCL1, as primate GRO gene evolution is an outcome of inverse duplication when compared with other species [10]. Such a gene positioning via duplication and species-specific nucleotide alterations under evolutionary pressure confer multiple layers of regulation to chemokine functions in terms of differential transcriptional regulation, different cellular territories of distribution, homo/hetero dimerization and receptor/GAG recognition to form cell/chemokine-specific chemotactic gradients/cell signalling events. Indeed, experimental reports on human CXCL1, CXCL2 and CXCL3 suggested that the expression of these three highly similar genes is differentially regulated in a cell-and signal-specific manner and bind to CXC receptors with varied affinities [16,24,25].
Site-specific selection analysis of duplicated GRO chemokines confirmed that the genes underwent mainly purifying/negative selection with few positively selected sites, several of them coevolving and the majority of them lying in the GAG binding domain/dimerization region. In general, the majority of the genes evolve under negative selection. For example, fork head gene family has evolved as a result of gene duplication, rapid differentiation and subsequent fixation of amino acid changes through negative selection [66]. Similarly, the post duplication charge evolution of surface charge in phosphoglucose isomerase was not driven by strong selection on individual amino acid sites but by the weak selection on large number of amino acid sites and consequently by steady directional/purifying selection on overall structural properties of the protein, which are derived from many modifiable sites [67]. In the case of GRO genes also, the modification of the electrostatic surfaces is not in line with the principles of strong positive selection. They evolved mainly under purifying selection with an exception of C-terminal domain. Such domain-specific evolutionary patterns are seen in immune signalling protein TRAF3 interacting protein 2. In this protein, the N-terminal domain that is more disordered has been subjected to positive selection, and a purifying selection was observed for the C-terminal domain which is contributing to the core structure of the protein [68]. For GRO chemokines, the selection, substitution rates and nucleotide comparison studies established that significant degree of positive selection/alterations are concentrated in the C-terminal/GAG binding domain including residues involved in the dimerization contacts. However, no positive selection has been noticed in the structural part of the protein suggesting the conserved nature of the structural fold for its chemotactic functioning.
Our analysis also established that such evolutionary changes resulted in multiple species-specific pathways for differential GAG binding such as regulated electrostatic surface in the 3 10 helix in rodents and generation of novel β-sheet surface in primates and some of the GRO proteins in laurasiatherians, although the helical surface binding appears to be the primordial mode of GAG binding for GRO chemokines. Evolutionary analysis on formyl peptide receptors across mammals evidenced similar sort of differential electrostatic surfaces through amino acid alterations [69]. Essentially, chemokine immobilization through GAGs facilitates the formation of haptotactic gradient, and adds another layer of specificity and control to the cell migration beyond the receptor. These evolved species-specific and gene-specific GRO proteins can also have differential half-life and susceptibility to protease degradation upon GAG binding.
Further, these species-specific evolutionary events are also credited for gene inactivation and partial/complete deletions leading to pseudogenes (non-functional), like genes (novel functions) and extinct genes (erased) [9,12,13,18,70]. Like genes for CXCL1 and CXCL3 (CXCL1 L/CXCL3 L) were reported in Macaca fascicularis [18]. In both cases, although the contributing GAG residues are similar, the C-terminal region that is crucial for monomer-dimer equilibrium is truncated thus contributing to differential populations of GAG induced dimerization/oligomerization. On a similar note, a pseudogene (CXCL1p) found in human and chimpanzee in the inversely duplicated segment of chromosome 4, which is yet to be processed completely for complete functionality. Currently CXCL1p gene contains only two exons with one intron between, accompanied by downstream deletion resulting in a sequence that is devoid of C-terminal helix [18]. Our sequence analysis identified that in CXCL1p, residue R9 that is essential for GAG/receptor binding has been mutated to P9 thus switching off the functionality of this gene at this stage (data not shown).
Such gene alterations are not only confined to the GRO family chemokines; several other CXC chemokines also experienced these evolutionary pressures for functional advantages. For example, CXCL4L1, a homologue of platelet factor 4 (CXCL4) is a resultant of duplication of CXCL4 gene present in humans and chimpanzees, which differs in only three amino acids (P58 L, K66E and L67H) when compared with CXCL4 and exhibits potent anti-angiogenic and anti-tumourous properties. Recent structural studies have shown that C-terminal helix of CXCL4L1 adopts an open conformation due to L67H mutation. These amino acid changes coupled with the conformational transition in the C-terminal helix are responsible for the lowering of GAG binding affinities of CXCL4L1 when compared with CXCL4 [71].
Another essential feature is that, during the course of evolution, these genes have also acquired variations due to point mutations. Different types of polymorphisms like single-nucleotide polymorphisms, insertion and deletions of specific nucleotides, alternating splicing that may be associated with several diseases or sometimes may be beneficial [12]. CXCL2 which is an essential mediator in lung protection but polymorphism of a short tandem repeat (AC)n at -665 position in the promoter region altered the promoter's activity, consequently heightened the expression of CXCL2 thus contributing to severe sepsis [14]. In such pathological cases, redundancy/promiscuity of these duplicated inflammatory chemokines play a key role due to an overlap in major functions carried out by them; so that any defect in one chemokine can be easily resolved by an alternative family member to safeguard the cell from immune insults.

Concluding remarks
In summary, we reconstructed the evolutionary history of GRO genes across the diverse range of mammals. Our molecular evolutionary studies on GRO proteins threw light on the underlying principles responsible for their variable evolutionary patterns, positive selection of the protein segments, substitution rates and their lineage to GAG binding properties thus bridging our structural awareness with evolutionary programming and functional variance. Such knowledge is applicable to all the protein/chemokine families that demand a detailed lineage of structure-function relationships. Future comparative experimental studies involving protein-GAG interactions of GRO proteins from multiple species are imperative to decipher the regulatory role of multiple GAG binding surfaces/orientations during chemotactic gradients and hence neutrophil trafficking.