FIGURES FOR: Solution structure of CXCL13 and heparan sulfate binding show that GAG binding site and cellular signalling rely on distinct domains

Chemokines promote directional cell migration through binding to G-protein-coupled receptors, and as such are involved in a large array of developmental, homeostatic and pathological processes. They also interact with heparan sulfate (HS), the functional consequences of which depend on the respective location of the receptor- and the HS-binding sites, a detail that remains elusive for most chemokines. Here, to set up a biochemical framework to investigate how HS can regulate CXCL13 activity, we solved the solution structure of CXCL13. We showed that it comprises an unusually long and disordered C-terminal domain, appended to a classical chemokine-like structure. Using three independent experimental approaches, we found that it displays a unique association mode to HS, involving two clusters located in the α-helix and the C-terminal domain. Computational approaches were used to analyse the HS sequences preferentially recognized by the protein and gain atomic-level understanding of the CXCL13 dimerization induced upon HS binding. Starting with four sets of 254 HS tetrasaccharides, we identified 25 sequences that bind to CXCL13 monomer, among which a single one bound to CXCL13 dimer with high consistency. Importantly, we found that CXCL13 can be functionally presented to its receptor in a HS-bound form, suggesting that it can promote adhesion-dependent cell migration. Consistently, we designed CXCL13 mutations that preclude interaction with HS without affecting CXCR5-dependent cell signalling, opening the possibility to unambiguously demonstrate the role of HS in the biological function of this chemokine.

For CXCL13 refolding, the elution fractions were diluted sevenfold in refolding buffer (50 mM Tris at pH 8) and kept at 4 °C overnight under gentle rocking. A fourfold dilution in refolding buffer preceded loading onto a FastFlow SP Sepharose ion-exchange column (GE Healthcare) equilibrated with 50 mM potassium phosphate buffer at pH 6. Elution was performed with a NaCl gradient (0 to 1 M) over 30 min at 2 mL/min. Elution fractions were concentrated to a volume of 3 mL (MWCO 3000 Millipore Centricon) prior to loading onto a Hi-load Superdex S75 column (GE Healthcare) equilibrated with running buffer containing 20 mM potassium phosphate and 100 mM NaCl at pH 6. Elution fractions were concentrated to a volume of 1 mL and the final concentration was measured by absorbance at 280 nm, based on a molar extinction coefficient of 12,740 M⁻¹.cm⁻¹ (http://www.expasy.ch). The CXCL13 construct including an enterokinase cleavage site was processed similarly. To eliminate the -1 Met residue, enterokinase digestion (2 U, Sigma) was performed in 500 µL of 50 mM Tris pH 8 for 3 h at room temperature, prior to loading onto the Hi-load Superdex S75 column. The digestion was validated by N-terminal amino acid sequencing. All CXCL13 samples were then stored at -80 °C.
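The concentration measurement described above is an application of the Beer-Lambert law. The following minimal sketch shows the arithmetic, using the extinction coefficient quoted in the text; the 1 cm path length, the dilution factor and the example absorbance are illustrative assumptions, not values from the study.

```python
# Beer-Lambert estimate of protein concentration from A280.
# EPSILON_280 is the value quoted in the text; everything else is an
# illustrative assumption (path length, dilution, example reading).
EPSILON_280 = 12_740.0  # M^-1.cm^-1


def concentration_molar(a280, path_cm=1.0, dilution=1.0, epsilon=EPSILON_280):
    """Return the molar concentration of the undiluted sample."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 * dilution / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.2548 in a 1 cm cuvette, no dilution.
    c = concentration_molar(0.2548)
    print(f"{c * 1e6:.1f} µM")
```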

NMR spectroscopy
NMR signal assignment of wt- and Met-CXCL13 was performed on 200 μM protein samples containing 20 mM phosphate buffer (pH 6) and 100 mM NaCl. Spectra were recorded at 298 K using Bruker Avance III 600, 700, 850 or 950 MHz NMR spectrometers, all equipped with cryogenic probes. Data were processed with NMRPipe [2] and analysed with NMRFAM-SPARKY 1.2 [3]. Backbone and aliphatic side-chain assignments were obtained from 2D ¹⁵N and 3D ¹³C-NOESY spectra. Stereospecific assignments of Val and Leu methyl groups were obtained with the labelling method previously described [4], along with the recording of a high-resolution 2D ¹³C-HSQC.
The combined chemical shift perturbation ∆ of the ith residue upon dimer association or heparin binding was calculated using equation (1), as described [5].
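As an illustration of how a combined ¹H/¹⁵N chemical shift perturbation is typically computed per residue, here is a minimal sketch. It assumes one widely used form, the square root of the sum of the squared ¹H shift change and the squared, down-weighted ¹⁵N shift change. The weighting factor (0.2 here) and the example shifts are assumptions and may differ from the expression used in [5].

```python
import math

# Assumed 15N weighting factor; several values are in common use and the
# one applied in the original work is not stated in this section.
N_WEIGHT = 0.2


def combined_csp(delta_h, delta_n, n_weight=N_WEIGHT):
    """Combined 1H/15N perturbation (ppm) from the two shift changes."""
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def perturbations(free, bound, n_weight=N_WEIGHT):
    """Map residue -> combined CSP for residues assigned in both states.

    `free` and `bound` map a residue label to a (1H ppm, 15N ppm) tuple.
    """
    return {
        res: combined_csp(bound[res][0] - free[res][0],
                          bound[res][1] - free[res][1],
                          n_weight)
        for res in free
        if res in bound
    }


if __name__ == "__main__":
    # Hypothetical shifts for two residues, for illustration only.
    free = {"K48": (8.20, 121.5), "R50": (7.95, 118.0)}
    bound = {"K48": (8.23, 121.7), "R50": (7.95, 118.0)}
    for res, csp in perturbations(free, bound).items():
        print(res, round(csp, 3))
```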
The equilibrium thermodynamic constants of dimerization were calculated from the NMR signals of appropriate residues (see Results) as described, using equation (2) for the monomer-dimer association.
(2): KD =
The molecular weight was estimated using the NMR approach previously described [7]. Briefly, the tumbling correlation time (τc) is calculated from the T1/T2 ratio according to equation 8 of Kay et al. [8], considering the slow molecular motion limit (τc >> 0.5 ns) and the use of a high-frequency spectrometer (> 500 MHz). For residues located in secondary-structure elements, τc can then be expressed as a function of MW, the protein molecular weight; R, the gas constant; η, the dynamic viscosity of water; and T, the absolute temperature. This latter equation was simplified considering a partial specific volume of the protein, V̄ = 0.73 cm³.g⁻¹, and α = 0.3 g of bound water per gram of protein; and 1 cm³.g⁻¹ and 0.9 cP for the partial specific volume (V̄₂) and the dynamic viscosity of water, respectively. According to the plot of rotational correlation time versus protein molecular weight for known monomeric NESG (Northeast Structural Genomics Consortium) targets of various sizes, τc/MW was experimentally measured as 0.6 × 10⁻¹² s.mol.g⁻¹ [7]. Using the experimentally determined τc/MW ratio and the first equation, the molecular weights of both oligomeric states of CXCL13 were estimated.
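The molecular weight estimate can be sketched in two steps: τc from the T1/T2 ratio, then MW from the empirical τc/MW ratio quoted above. The sketch below assumes the commonly quoted slow-tumbling approximation τc ≈ sqrt(6·T1/T2 − 7) / (4π·νN), with νN the ¹⁵N resonance frequency in Hz. The relaxation times and the 60.8 MHz ¹⁵N frequency (a 600 MHz spectrometer) are hypothetical illustration values, not measurements from the study.

```python
import math

# Empirical ratio quoted in the text [7]: tau_c / MW in s.mol/g
TAU_C_PER_MW = 0.6e-12


def tau_c_from_t1_t2(t1, t2, nu_n_hz):
    """Rotational correlation time (s) in the slow-tumbling limit.

    t1 and t2 are 15N relaxation times in the same unit; nu_n_hz is the
    15N resonance frequency in Hz. The closed form is the assumed
    approximation named in the lead-in.
    """
    ratio = t1 / t2
    if 6.0 * ratio <= 7.0:
        raise ValueError("T1/T2 too small for the slow-tumbling approximation")
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)


def molecular_weight(tau_c, tau_c_per_mw=TAU_C_PER_MW):
    """Apparent molecular weight (g/mol) from tau_c (s)."""
    return tau_c / tau_c_per_mw


if __name__ == "__main__":
    # Hypothetical averaged relaxation times for structured residues.
    tau = tau_c_from_t1_t2(t1=0.7, t2=0.1, nu_n_hz=60.8e6)
    print(f"tau_c = {tau * 1e9:.2f} ns, MW = {molecular_weight(tau) / 1000:.1f} kDa")
```

Comparing the apparent molecular weights obtained at low and high protein concentration is one way to distinguish the monomeric from the dimeric state.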

Molecular Dynamics Studies
Wt-CXCL13 with dp4 bound at both the α-helical region (cluster 1) and the C-terminal region (cluster 5) from the CVLS method was used as the initial structure for unrestrained constant temperature and pressure MD simulation. Each co-complex (wt-CXCL13-dp4), involving a different combination of dp4 sequences bound at the binding sites, was systematically prepared using the LEaP module of AMBER14 as follows. The Amber ff12SB and GLYCAM_06j-1 force field parameters were used for the protein and the ligand, respectively [9]. The net charge of each co-complex was neutralized by adding the appropriate number of Na+/Cl- counterions, and the complex was then centred in a three-point (TIP3P) water box with a minimum distance of 12 Å between the box walls and any atom of the complex. The system was then relaxed to a minimum-energy state in two steps: i) the solute atoms were restrained with a force constant of 100 kcal/(mol.Å²) and the solvent molecules relaxed using 500 cycles of steepest descent and 2000 cycles of conjugate gradient minimization; ii) the whole system was relaxed using 2500 cycles of conjugate gradient minimization without any restraints. Following this, each co-complex was brought to the desired temperature and pressure (NPT) and further equilibrated for 1 ns. The MD production run was performed in the NPT ensemble with an integration time step of 1 fs. Bonds involving hydrogen atoms were constrained using the SHAKE algorithm. Initial velocities were assigned from a Maxwell distribution and the total trajectory of each co-complex was computed for 50 ns.
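To make the protocol numbers concrete, the sketch below collects them in one place, derives the step counts implied by the 1 fs time step, and renders a control block for minimization stage (i). The namelist variables used (imin, maxcyc, ncyc, ntr, restraint_wt, restraintmask) are standard AMBER &cntrl settings. The solute mask is a hypothetical placeholder, and the authors' actual input files are not reproduced here.

```python
from dataclasses import dataclass

FS_PER_NS = 1_000_000  # femtoseconds per nanosecond


@dataclass(frozen=True)
class MDProtocol:
    """Values taken from the text; field names are illustrative."""
    restraint_wt: float = 100.0    # kcal/(mol.A^2), stage (i) solute restraint
    stage1_sd_cycles: int = 500    # steepest descent, stage (i)
    stage1_cg_cycles: int = 2000   # conjugate gradient, stage (i)
    stage2_cg_cycles: int = 2500   # conjugate gradient, stage (ii)
    timestep_fs: int = 1
    equilibration_ns: int = 1
    production_ns: int = 50

    def steps(self, length_ns):
        """Number of integration steps needed to cover length_ns."""
        return length_ns * FS_PER_NS // self.timestep_fs


def stage1_minimization_input(protocol, solute_mask):
    """Render an AMBER-style control block for minimization stage (i).

    solute_mask is a placeholder atom mask selecting protein and ligand.
    """
    maxcyc = protocol.stage1_sd_cycles + protocol.stage1_cg_cycles
    return "\n".join([
        "Stage (i): solute restrained, solvent relaxed",
        " &cntrl",
        f"  imin=1, maxcyc={maxcyc}, ncyc={protocol.stage1_sd_cycles},",
        f"  ntr=1, restraint_wt={protocol.restraint_wt},",
        f"  restraintmask='{solute_mask}',",
        " /",
    ])


if __name__ == "__main__":
    p = MDProtocol()
    print(stage1_minimization_input(p, solute_mask=":1-95"))  # hypothetical mask
    print("equilibration steps:", p.steps(p.equilibration_ns))
    print("production steps:", p.steps(p.production_ns))
```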
Equilibration and the simulation processes were validated using physical observables of the system including potential/kinetic energies as well as temperature and pressure.
The ring puckering of the IdoAp residues was restrained to the desired conformation (²SO/¹C₄) during the entire simulation, while the puckering of the GlcNp residues remained stable in the ⁴C₁ conformation throughout the simulation.
Binding free energy calculation of each co-complex was computed using post-

Preparation and characterization of fluorescently labeled CXCL13
S6-(GDSLSWLLRLLN) tagged wt- or mutant-CXCL13 were prepared by chemical synthesis (ALMAC, East Lothian, UK) and fluorescently labelled essentially as described [12] (Figure S6A). The labelled protein was lyophilized following purification and the labelling validated by ESI-TOF MS (Figure 6B).

Preparation of heparin-derived oligosaccharides and biotinylated heparan sulfate
Porcine mucosal heparin was digested with heparinase I (8 mU/mL) in 0.1 mg/mL BSA, 2 mM CaCl2, 50 mM NaCl and 5 mM Tris buffer pH 7.5 for 50 h at 25 °C. The enzymatic reaction was stopped by heating the digest at 100 °C for 5 min. The digestion mixture was resolved from di- (dp2) to octadecasaccharide (dp18) using a Bio-Gel P-10 column, equilibrated with 0.25 M NaCl and run at 1 mL/min. To ensure homogeneity, only the top fractions of each peak were pooled, and each isolated fraction was re-chromatographed on a gel filtration column to further eliminate possible contamination. Samples were dialyzed against distilled water and quantified. Heparan sulfate (HS) at 1 mM in phosphate-buffered saline (PBS) was biotinylated at its reducing end for 24 h at room temperature with 10 mM biotin-LC-hydrazide [13]. The mixture was extensively dialyzed against H2O to remove unreacted biotin prior to freeze-drying.

Equilibrium dissociation constant determination
KD was determined by fitting the maximal responses for each CXCL13 concentration against the Hill equation $RU_{eq} = RU_{max} \, [\mathrm{CXCL13}]^{n} / (K_D^{\,n} + [\mathrm{CXCL13}]^{n})$, with RUeq, the signal when the steady state is reached for the concentration of ligand [CXCL13]; RUmax, the maximal signal obtained when all binding sites are occupied; and n, the Hill coefficient, using the least-squares method (Levenberg-Marquardt algorithm provided by Igor Pro version 6.03) with the three variables n, KD and RUmax as free fit parameters.
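A three-parameter Hill fit of this kind can be sketched with the standard library alone. This is not the Levenberg-Marquardt routine of Igor Pro: it swaps in a simple refined grid search over KD and n, with the maximal response solved in closed form at each grid point. It assumes the Hill form in which KD carries concentration units. The concentrations, search bounds and synthetic responses are hypothetical.

```python
import math


def hill(conc, ru_max, kd, n):
    """Steady-state response predicted by the Hill equation."""
    return ru_max * conc ** n / (kd ** n + conc ** n)


def fit_hill(conc, resp, log_kd=(0.0, 4.0), n_range=(0.5, 4.0), grid=25, rounds=8):
    """Least-squares estimate of (ru_max, kd, n) by repeated grid refinement.

    kd is scanned on a log10 scale and n linearly. For each (kd, n) pair the
    best ru_max follows in closed form because the model is linear in it.
    """
    (k_lo, k_hi), (n_lo, n_hi) = log_kd, n_range
    best = None
    for _ in range(rounds):
        k_step = (k_hi - k_lo) / (grid - 1)
        n_step = (n_hi - n_lo) / (grid - 1)
        for i in range(grid):
            kd = 10.0 ** (k_lo + i * k_step)
            for j in range(grid):
                n = n_lo + j * n_step
                shape = [c ** n / (kd ** n + c ** n) for c in conc]
                ru_max = (sum(y * s for y, s in zip(resp, shape))
                          / sum(s * s for s in shape))
                sse = sum((y - ru_max * s) ** 2 for y, s in zip(resp, shape))
                if best is None or sse < best[0]:
                    best = (sse, ru_max, kd, n)
        # Zoom in: the next window spans three grid steps either side of the best point.
        _, _, kd_best, n_best = best
        centre = math.log10(kd_best)
        k_lo, k_hi = centre - 3 * k_step, centre + 3 * k_step
        n_lo, n_hi = max(n_best - 3 * n_step, 0.1), n_best + 3 * n_step
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic, noise-free responses (concentrations in nM) for illustration.
    conc = [10, 20, 50, 100, 200, 500, 1000, 2000]
    resp = [hill(c, 500.0, 100.0, 1.5) for c in conc]
    ru_max, kd, n = fit_hill(conc, resp)
    print(f"RUmax = {ru_max:.1f}, KD = {kd:.1f} nM, n = {n:.2f}")
```

With real sensorgram data, a gradient-based least-squares routine such as the one named in the text is the more usual choice. The grid search is used here only to keep the example dependency-free.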

CXCL13 cell surface binding assays
Cell surface binding assays were conducted using fluorochrome-coupled chemokine.
Considering that, according to our data and in agreement with the crystal structure of human CXCL13, two hydrogen bonds also participate in the N-terminal stabilization, the relative contribution of the initiating methionine might be lower. However, the absence of the initiating methionine is enough to completely disrupt the N-terminal interaction with the core protein.
Presumably, an increase in CXCL13 concentration induces dimerization, resulting in a sample that displays a larger HS-binding surface than its monomeric counterpart. These kinetic values should thus be considered as estimates only.

Figure S8
Molecular model of CXCL13 homodimer. A, B: The CXCL13 homodimer was calculated by protein-protein docking using HADDOCK [14], in the manner of earlier modelling of dimers [15,16]. This led to 11 representative models, of which two were identified as the best in the initial evaluation, with similar HADDOCK scores and z-scores and an overall backbone RMSD of 7 Å. The two monomeric chains are shown in different colours. C: