Long non-coding RNA-polycomb intimate rendezvous

The interaction between polycomb-repressive complexes 1/2 (PRC1/2) and long non-coding RNA (lncRNA), such as the X inactive specific transcript Xist and the HOX transcript antisense RNA (HOTAIR), has been the subject of intense debate. While cross-linking, immuno-precipitation and super-resolution microscopy argue against direct interaction of Polycomb with some lncRNAs, there is increasing evidence supporting the ability of both PRC1 and PRC2 to functionally associate with RNA. Recent data indicate that these interactions are in most cases spurious, but nonetheless crucial for a number of cellular activities. In this review, we suggest that while PRC1/2 recruitment by HOTAIR might be direct, in the case of Xist, it might occur indirectly and, at least in part, through the process of liquid–liquid phase separation. We present recent models of lncRNA-mediated PRC1/2 recruitment to their targets and describe potential RNA-mediated roles in the three-dimensional organization of the nucleus.

genome in three-dimensional) [21], the catalytic activity of these complexes is critical for polycomb-mediated silencing [22][23][24]. As the role of these marks has been discussed elsewhere, we refer the reader to other excellent reviews [25,26]. In our review, we focus on the role of RNA, and in particular of long non-coding RNAs, in the recruitment of these complexes to the chromatin, using the two most studied lncRNAs, Xist and HOTAIR, as models.

Direct versus indirect binding of polycomb-repressive complexes 1/2 components to Xist and HOTAIR
Long non-coding RNAs (lncRNAs) are RNA molecules longer than 200 bases that lack protein-coding potential [27,28]. They represent a significant portion of the cell transcriptome [29] and work as activators or repressors of gene transcription acting on different regulatory mechanisms [30][31][32]. lncRNAs can act as scaffolds for protein recruitment [33][34][35][36][37][38][39][40] and behave as guides and/or sponges for titrating RNAs and proteins, influencing transcription at regulatory regions or triggering transcriptional interference [41][42][43]. In the large spectrum of activities, the RNA structure plays a central role and dictates precise functionalities by creating spatial patterns and alternative conformations and binding sites for proteins [44,45]. In this review, we will focus on the two best-studied lncRNAs, Xist and HOTAIR, to critically discuss what we know about the interaction of PRC1/2 complexes with RNA.
Xist is a long non-coding RNA and the master regulator of X chromosome inactivation (XCI) [46][47][48][49]. Xist works as a scaffold for the recruitment of repressive complexes on the inactive X chromosome (Xi) [46,50]. As for its structure, six conserved repetitive regions (Rep), named A to F, have been reported to be essential for its function [30,44]. The interaction between Xist and PRC1/2 has been studied in detail. In particular, PRC1 has been reported to interact with Xist B-repeats and PRC2 with Xist A-repeats (see below) (figure 1a). In the case of PRC1 and the Xist B-repeats, a study from the Heard laboratory showed that a region encompassing the Xist B/C-repeat is necessary for PRC1 recruitment [52]. The Brockdorff laboratory mapped this interaction mostly to the B-repeat and showed that HNRNPK, which physically interacts with PRC1, is directly involved in RNA binding (figure 1a) [54]. For the interaction of PRC2 with the Xist A-repeats, there is no agreement in the literature. A seminal study from the Lee laboratory showed that the Xist A-repeats recruit EZH2 through direct interaction with their stem and loops [51]. However, different lines of evidence stemming from developmental studies suggest that Xist expression and PRC2 recruitment can be decoupled. In particular, in developing female embryos, Xist RNA clouds seem to precede H3K27me3 domains, making a direct interaction unlikely [55,56]. In agreement with these observations, super-resolution microscopy [57] and genetic analysis [58] point towards an indirect interaction. In particular, Almeida et al. suggest that Xist attracts PRC2 to the chromatin via the recognition of the chromatin mark placed by PRC1 (i.e. H2AK119ub), in agreement with other models of PRC1/2 recruitment [59,60] (discussed in more detail below).
HOTAIR is another well-known lncRNA that regulates the expression of the HOX genes during development [61]. HOTAIR works as a scaffold for the recruitment of the PRC2 members EZH2 and SUZ12, and it is also able to act in trans to allow the establishment of a repressed chromatin state at the HOX clusters [62,63]. How HOTAIR interacts with PRC2 in vivo is still debated, but in vitro studies indicate a direct interaction between EZH2 and the HOTAIR 5′ end [63,64]. In particular, the HOTAIR interaction with PRC2, mapped at the HOTAIR repeat D1 helix 7 (H7) [53], appears to be direct (binding affinity in the range of 200 nM) [63,64]. HOTAIR-PRC2 interactions might be very different from those of Xist-PRC1/2 (figure 1b). The interaction between HOTAIR and PRC2 is likely sustained by the repetitive guanine stretches (G-tracts) found in the D1 helix [64]. This interpretation is in line with data from Somarowthu et al. showing equal affinity of the PRC2 complex for natively purified or refolded HOTAIR 5′/3′ in in vitro assays [53]. Notably, the putative Xist-PRC2 interaction region (A-repeats) is missing the key RNA recognition sequences needed for specific interactions (discussed below) [65].

Xist and HOTAIR show different modes of interactions with polycomb-repressive complexes 1/2 components
We analysed our previously published data on the binding abilities of Xist and HOTAIR [35,66,67] to PRC1/2 components. In our studies, we employed the catRAPID [35,68] method to estimate the binding potential of proteins to RNA molecules through van der Waals, hydrogen bonding and secondary structure propensities of both protein and RNA sequences. This allows the identification of binding partners with high confidence [69]. In agreement with experimental evidence [54], catRAPID identified a direct interaction between the Xist 5′ end and HNRNPK [35] (Global Score = 0.99 on a scale ranging from 0 to 1, where 0 indicates no RNA-binding ability and 1 strong affinity; figure 2a; by contrast, the negative control Dyskerin Pseudouridine Synthase 1, DKC1, has a score of 0.01). To identify interactions of long non-coding RNAs such as Xist, catRAPID exploits a special pipeline that is based on the division of the transcript into fragments and calculation of their individual binding propensities (Z-normalized to a mean of 0 and a standard deviation of 1), which is useful to spot the binding sites (figure 2a) [35].  [70]. In brief, using randomized Xist A-repeats as a control, EZH2 has been predicted to bind Xist with low affinity (EZH2-A-repeats interaction propensity is approximately 1, using a scale where positive interactions have scores greater than 10). These findings are in good agreement with three-dimensional SIM data (figure 2b), which show poor overlap between Xist and PRC2 [57], suggesting that this interaction might be sustained by intermediary proteins or via an indirect cascade (i.e. through PRC1-mediated H2AK119 ubiquitination, see below).
royalsocietypublishing.org/journal/rsob Open Biol. 10: 200126
On the other hand, catRAPID predictions indicate that HOTAIR and EZH2 might directly interact (Global Score = 0.99; figure 2a; by contrast, the negative control, the keratin-associated protein KRTAP21, has a score of 0.01), in agreement with previous biochemical evidence [63,64]. In both Xist and HOTAIR analyses, protein interactions strictly occur in highly structured regions of the transcripts (figure 2c) that contain G-rich stretches. These findings are in line with recent studies revealing that double-stranded regions in RNA molecules provide the scaffold for protein complexes [71,72]. Indeed, since RNA transcripts are highly flexible, an increase in secondary structure makes the protein partners bind tightly [72], favouring their accumulation on the scaffold, which can induce the formation of phase-separated assemblies (discussed below) [71]. With regard to RNA structure, the CROSS (Computational Recognition of Secondary Structure) algorithm predicts the propensity of a nucleotide to be double-stranded given the neighbouring nucleotides and the crowded cellular environment [73]. CROSS has been previously employed to compute the structural properties of Xist and HOTAIR [73,74]. In accordance with dimethyl sulfate (DMS)-sensitivity experiments [75], CROSS [73] analysis predicts that the Xist A repeats (nucleotides approximately 1-400), B and C repeats (nucleotides approximately 2000-5500) and E repeats (nucleotides approximately 10 000-12 000) are highly structured. Among the Xist-interacting proteins binding to RepE are the splicing regulators Polypyrimidine Tract Binding Protein 1 (PTBP1), MATRIN-3 (MATR3), CUG-Binding Protein 1 (CELF1) and TAR-DNA Binding Protein (TDP-43) [35][36][37]39].
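The fragment-based procedure described in this section (division of a long transcript into fragments, scoring of each fragment and Z-normalization to a mean of 0 and a standard deviation of 1) can be sketched as follows. This is only an illustration of the normalization step: the scoring function `toy_score` is a hypothetical placeholder (fraction of G in a fragment) and is not the catRAPID interaction propensity, and the fragment size and step are arbitrary assumptions.

```python
from statistics import mean, pstdev


def fragment(seq, size=50, step=25):
    """Split a sequence into overlapping windows as (start, subsequence) pairs.
    Trailing bases that do not fill a complete window are ignored."""
    if len(seq) <= size:
        return [(0, seq)]
    return [(i, seq[i:i + size]) for i in range(0, len(seq) - size + 1, step)]


def toy_score(frag):
    """Hypothetical placeholder score: fraction of G in the fragment.
    This is NOT the catRAPID interaction propensity."""
    return frag.count("G") / len(frag)


def z_normalize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All fragments score the same: no fragment stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def fragment_profile(seq, size=50, step=25):
    """Return (fragment start, Z-score) for each fragment of the transcript."""
    frags = fragment(seq, size, step)
    z = z_normalize([toy_score(f) for _, f in frags])
    return [(start, score) for (start, _), score in zip(frags, z)]


if __name__ == "__main__":
    # Invented sequence with a G-rich stretch in the middle.
    rna = "AUCU" * 25 + "GGGAGGGCGG" * 10 + "UACA" * 25
    for start, score in fragment_profile(rna):
        print(start, round(score, 2))
```

Fragments with Z-scores well above 0 stand out from the rest of the transcript, which is how a normalized profile helps to spot candidate binding sites.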

Polycomb-repressive complexes 1/2-long non-coding RNA interactions and phase separation
Phase separation is defined as the process by which a homogeneous solution divides into two or more separate phases. Paraspeckles, nucleoli and stress granules are classic examples of phase-separated cellular entities [19][44][45][46][47][48][49], which are membrane-less assemblies composed of RNA and proteins. The formation of cytoplasmic stress granules is an evolutionarily conserved mechanism. For example, stress granules are formed in response to environmental changes (e.g. heat shock) and favour the confinement of enzymes and nucleic acids in discrete regions of the nucleus or cytoplasm [77]. Structurally disordered and nucleic acid binding domains promote protein-protein and protein-RNA interactions in large 'higher-order' assemblies [78,79]. Intrinsically disordered proteins, which are enriched in polar and non-polar amino acids such as arginine and phenylalanine, have been shown to promote phase transitions in the cell [45].
In a recent publication [67], we reasoned that Xist exerts its functions, at least in part, through the formation of silencing granules by phase separation, into which PRC1 and PRC2 are also recruited. More precisely, we suggested that non-canonical recruitment of repressive PRC1 complexes is promoted or reinforced by the formation of higher-order assemblies. In this scenario, the primary de novo recruitment of PRC1/2 would happen through direct interaction with the Xist B repeats [54] and involve proteins with a strong propensity to phase separate. As predicted by the catGRANULE algorithm [45], which estimates the ability of proteins to form liquid-like assemblies containing protein and RNA molecules [67], both EZH2 and HNRNPK are prone to phase separate (figure 3a and table 1). Yet, HNRNPK shows a much higher granulation score than EZH2 (1.60 versus 0.71; note that the score is Z-normalized and 0 corresponds to the average protein propensity), which suggests an enhanced ability to form large ribonucleoprotein complexes. In agreement with this observation, experimental [57,81] and computational studies [67] have indicated that Xist could phase separate with its associated proteins, but no evidence has so far been reported for the ability of HOTAIR to form such assemblies. This finding is in line with the fact that PRC2 components might bind directly to HOTAIR, while most Xist-Polycomb associations [51,82] are largely indirect [54] (figures 1 and 2). Indeed, analysing the whole protein interactomes of both Xist [36,39] and HOTAIR [83], we found that Xist binding partners are highly prone to phase separation, while HOTAIR interactors show a lower propensity to phase separate, which is in accordance with the observation that indirect protein-protein interactions may mediate associations through structurally disordered domains (figure 3b) [67].
We note that HOTAIR binding partners have a non-negligible propensity to phase separate with respect to a negative control of similar length (antisense of the 3′ UTR of Alpha Synuclein; around 2500 nucleotides; figure 3b) [80], which suggests that HOTAIR might form medium-size assemblies [84].
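The legend of figure 3 reports a Kolmogorov-Smirnov test for this comparison between HOTAIR interactors and the control RNA. As a minimal sketch of what such a comparison measures, the function below computes the two-sample Kolmogorov-Smirnov statistic, that is, the largest gap between the empirical cumulative distributions of two sets of scores. The score lists are invented values used only for illustration and are not data from the analyses discussed here; the sketch does not compute a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the two empirical cumulative distributions."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The largest gap between two step functions is reached at an observed value.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    # Hypothetical granule-propensity scores (illustrative values only).
    interactors = [0.2, 0.5, 0.9, 1.1, 1.4]
    control = [-0.6, -0.3, 0.0, 0.1, 0.4]
    print(ks_statistic(interactors, control))
```

A statistic close to 1 means the two score distributions barely overlap, while a value close to 0 means they are nearly indistinguishable. Assessing significance additionally requires the sample sizes, for example through `scipy.stats.ks_2samp` if SciPy is available.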
In the Xist case, PRC1 positive feedback recruitment may be reinforced by liquid-like interactions in which specific elements such as CBX2 [85] (liquid-liquid phase separation propensity of 1.17 [45]) as well as SAM-domain multimerization [86] or intrinsically disordered domains could be involved. Based on their phase separation scores, we speculate that other proteins such as HNRNPU (phase separation propensity of 2.5) and MATR3 (liquid-liquid phase separation propensity of 1.5) might contribute towards the recruitment of polycomb proteins to the Xist body (table 1). These interactions might also be mediated by intrinsically disordered proteins, yet to be discovered, binding the Xist A-, D-3′ end repeats. Protein multimerization driven by phase separation, together with RNA-protein interactions, might play a critical role in this process [67] and, in turn, trigger the eviction of RNA Polymerase II (Pol-II) and basic transcription factors, inducing gene silencing and heterochromatinization (figure 3c).

Non-catalytic functions of polycomb-repressive complex in shaping the three-dimensional genome might be mediated by RNA interactions
Work from different laboratories has shown that PRC1/2 complexes are essential regulators of cellular three-dimensional structure (recently reviewed by Illingworth RS [85] and Cheutin and Cavalli [87]). Very recent work from the Cavalli lab has elegantly shown how PRC1 can exert different and apparently opposing functions such as gene repression, three-dimensional organization of the genome and gene activation [88]. In brief, Loubiere and colleagues showed, using PRC1 mutants at the dachshund locus in Drosophila, that genes are positively and negatively regulated by PRC1. In particular, they suggest that while PRC1 is mostly involved in gene silencing in the absence of activating transcription factors (TFs), in the presence of TFs it might be able to regulate gene expression by making PRC1-dependent promoter-enhancer contacts [88]. As PRC1 has also been shown to have a role in regulating occupancy, elongation and phosphorylation of RNA polymerase II (Pol-II) [89,90], it is tempting to speculate that these functions of PRC1 might be, in part, mediated by its ability to bind to RNA via RING1A/B or CBX7 [91] proteins (figure 4a). In support of this interpretation, a paper from the Moazed laboratory [92] has shown that the Rixosome, a conserved RNA degradation machinery, interacts with PRC1/2 and is recruited to Polycomb sites for efficient gene silencing. Similarly, Garland et al. [93] showed a link between the RNA degradation pathways and Polycomb silencing. In particular, they showed that knockout of Zfc3h1, a component of the poly(A) RNA exosome targeting (PAXT) complex, increases the cellular level of poly-adenylated RNA, triggering the destabilization of the PRC2 complex, impaired chromatin binding and reduction of gene silencing [93]. Furthermore, work from several laboratories has shown that Polycomb can interact with RNAs [94,95], nascent transcripts [96] or with R-loops at Polycomb-repressed targets [94,97]. These lines of evidence support the idea that the interaction of Polycomb proteins with RNA might be spurious, yet it is critical for numerous cellular functions, from nuclear three-dimensional organization [85,87][98][99][100] and repression of target genes [94,101,102] to spreading of PRC1/2 [103], cellular differentiation and lineage commitment.

Figure 3. [...] table 1). Comparison with control RNA (antisense of the 3′ UTR of Alpha Synuclein) [80] indicates that HOTAIR has a non-negligible propensity to associate with phase-separating proteins (***p-value < 0.001; Kolmogorov-Smirnov test). (c) The most likely Xist-mediated PRC2 recruitment pathway involves PRC1 recruitment through the direct interaction of HNRNPK with repeat B (light green). H2A ubiquitination by PRC1 may induce PRC2 recruitment on the Xi as previously shown (see main text). We suggest that Xist might also recruit PRC1/2 complexes by phase separation through the mediation of structurally disordered proteins binding the Xist repeat E. Phase-separated PRC1/2 recruitment could occur through a direct interaction with Xist repeat E. We suggest that PRC1/2 oligomerization can further recruit repressive proteins and/or disordered proteins, contributing to the eviction of Pol II and basic transcription factors, recruiting more structurally disordered proteins and, in turn, inducing further granule formation, heterochromatinization and gene repression. Xist repeats are shown: A repeat (pink), B repeat (orange), E repeat (blue). Proteins are shown by name. Wavy grey profiles on proteins indicate intrinsically disordered regions; Xist RNA (black line).

Conclusion
Elegant biochemical work from several laboratories showed that PRC1 [19] and specific PRC2 subcomplexes [20,104] (i.e. PRC2.1 and PRC2.2, depending on the accessory subunits present in the complex, reviewed in Van Mierlo and colleagues [9]) bind to RNA with different affinities and specificities. Recent work suggests that the interaction of PRC1/2 components with RNA is promiscuous [18,105], and in part mediated by protein-protein interactions [65]. It has also been shown that EZH2-RNA interactions can catalytically inactivate or expel EZH2 [101,104][106][107][108], suggesting that RNA binding is essential for the modulation of polycomb catalytic activities [50] (figure 4b). However, allosteric RNA inhibition can be relieved both by H3K27me3 and by methylated JARID protein interactions (the latter also in agreement with Cifuentes-Rojas and colleagues [106,109]). These lines of evidence suggest a new model of PRC2 recruitment that can explain both de novo polycomb recruitment (RNA binding) and spreading (using established polycomb domains) [20]. Taking into account previous experimental and computational work, we suggest that the 'canonical', direct lncRNA-mediated PRC2 recruitment has to be revisited [105]. As for Xist, the de novo recruitment of PRC1 and PRC2 is highly unlikely to occur through a mechanism of recruitment to the chromatin associated with catalytically inactivated complexes (i.e. allosteric inhibition), although the recruitment of Xist to pre-existing CpG islands might partially alleviate this catalytic inhibition [104]. Alternatively, these interactions occur indirectly (no complex inhibition), through intermediate proteins or by means of liquid-liquid phase separation (figure 3a-c).
For example, Xist A-repeats, the putative Xist-PRC2 interaction region, are missing the key RNA recognition sequences needed for specific interactions [65], which suggests that these interactions, although critical, might also be spurious [18,65,105] (binding many RNAs with low affinity) or indirect. As for the HOTAIR-mediated de novo Polycomb recruitment (possibly mediated by direct interactions), it is possible that residual H3K27me3 at the HOX locus might alleviate allosteric inhibition [110]. For PRC1/2 recruitment on the inactive X chromosome (Xi) at the onset of XCI, it is likely that de novo accumulation largely depends on PRC1-mediated marks on the chromatin, such as H2AK119ub (figure 1a) [23,54,58,104]. In this regard, work from the Pasini and Klose laboratories elegantly proved that H2AK119 ubiquitination is essential for PRC1/2 silencing and PRC2 de novo recruitment [22,23,59]. We believe that more work has to be done in order to reach a final model of lncRNA and Polycomb recruitment that is capable of reconciling all this evidence.

RNA-protein interaction predictions and granule propensity
To compute protein-RNA interactions, we used the catRAPID approach, which evaluates the interaction propensities of polypeptide and nucleotide chains based on their physicochemical properties predicted from primary structure [35,66]. Structural disorder, nucleic acid-binding propensity and amino acid patterns such as arginine-glycine and phenylalanine-glycine are key features of proteins coalescing in granules [45]. These features were combined in a computational approach, catGRANULE, that we employed to identify RBPs assembling into granules (scores > 0 indicate granule propensity). We predicted the secondary structure of transcripts using CROSS [73,74]. The algorithm predicts the structural profile (single- and double-stranded state) at single-nucleotide resolution using sequence information only and without sequence length restrictions (scores > 0 indicate double-stranded regions). HOTAIR repeats annotation: D1 (nucleotides 1-530) consists of 12 helices, 8 terminal loops and 4 junctions (three 3-way junctions and one 4-way junction). D2 (nucleotides 531-1040) consists of 15 helices, 11 terminal loops and 4 junctions (three 5-way junctions and one 3-way junction). D3 (nucleotides 1041-1513) is the smallest of the four domains and consists of 9 helices, 6 terminal loops and 3 junctions (two 4-way junctions and one 3-way junction). Finally, D4 (nucleotides 1514-2148) is the largest of the four domains and consists of 20 helices, 13 terminal loops and 7 junctions (one 6-way, two 4-way and four 3-way junctions).
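The HOTAIR domain annotation above lends itself to a small lookup structure. The sketch below encodes the four domains exactly as listed (coordinates, helices, terminal loops and junctions) and adds a helper that maps a nucleotide position to its domain. The class, field and function names are hypothetical choices made for this illustration and do not come from any published tool.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    start: int           # first nucleotide (1-based, inclusive)
    end: int             # last nucleotide (1-based, inclusive)
    helices: int
    terminal_loops: int
    junctions: dict      # junction order (n-way) -> number of junctions

    @property
    def length(self):
        return self.end - self.start + 1

    @property
    def junction_total(self):
        return sum(self.junctions.values())


# Values transcribed from the annotation in the text.
HOTAIR_DOMAINS = [
    Domain("D1", 1, 530, 12, 8, {3: 3, 4: 1}),
    Domain("D2", 531, 1040, 15, 11, {5: 3, 3: 1}),
    Domain("D3", 1041, 1513, 9, 6, {4: 2, 3: 1}),
    Domain("D4", 1514, 2148, 20, 13, {6: 1, 4: 2, 3: 4}),
]


def domain_of(position):
    """Return the name of the domain containing a 1-based nucleotide position."""
    for d in HOTAIR_DOMAINS:
        if d.start <= position <= d.end:
            return d.name
    raise ValueError(f"position {position} is outside nucleotides 1-2148")


if __name__ == "__main__":
    for d in HOTAIR_DOMAINS:
        print(d.name, d.length, d.helices, d.terminal_loops, d.junction_total)
    print(domain_of(300))
```

Encoding the annotation this way makes simple consistency checks straightforward, for example that the junction counts per type add up to the stated totals and that D3 and D4 are the shortest and longest domains, respectively.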

Funding. A.C. has been funded by Rett Syndrome Research Trust (RSRT) and BARTS Charity grants and by QMUL intramural support. The research leading to these results has been supported by the European Research Council (RIBOMYLOME 309545 and ASTRA 855923), the Spanish Ministry of Economy and Competitiveness (BFU2017-86970-P) and H2020 projects (IASIS 727658 and INFORE 825080).

Figure 4. RNA sustains Polycomb complex functions. (a) RNA can facilitate PRC1/2 complexes and sustain three-dimensional contacts and loops (also mediated by the cohesin complex; red/blue ring) to coordinate gene expression by bringing co-regulated genes together (gene A, green; gene B, purple; green/blue ribbons represent nascent RNA from gene A/B). The Rixosome could also participate in these interactions. (b) RNA inhibits PRC2 catalytic activity. RNA (green) can inhibit PRC2 catalytic activity. This inhibition can be relieved by H3K27me3 tails (red lollipop) or methylated Jarid2 proteins.