Open Biology
Open AccessReview articles

Mapping molecular complexes with super-resolution microscopy and single-particle analysis

Afonso Mendes

Afonso Mendes

Instituto Gulbenkian de Ciência, Oeiras, Portugal

Contribution: Conceptualization, Visualization, Writing – original draft

Google Scholar

Find this author on PubMed

,
Hannah S. Heil

Hannah S. Heil

Instituto Gulbenkian de Ciência, Oeiras, Portugal

Contribution: Conceptualization, Visualization, Writing – original draft

Google Scholar

Find this author on PubMed

,
Simao Coelho

Simao Coelho

Instituto Gulbenkian de Ciência, Oeiras, Portugal

Contribution: Writing – review & editing

Google Scholar

Find this author on PubMed

,
Christophe Leterrier

Christophe Leterrier

Aix Marseille Université, CNRS, INP UMR7051, NeuroCyto, Marseille, France

Contribution: Writing – review & editing

Google Scholar

Find this author on PubMed

and
Ricardo Henriques

Ricardo Henriques

Instituto Gulbenkian de Ciência, Oeiras, Portugal

MRC Laboratory for Molecular Cell Biology, University College London, London, UK

[email protected]

Contribution: Conceptualization, Supervision, Writing – review & editing

Google Scholar

Find this author on PubMed

    Abstract

    Understanding the structure of supramolecular complexes provides insight into their functional capabilities and how they can be modulated in the context of disease. Super-resolution microscopy (SRM) excels in performing this task by resolving ultrastructural details at the nanoscale with molecular specificity. However, technical limitations, such as underlabelling, preclude its ability to provide complete structures. Single-particle analysis (SPA) overcomes this limitation by combining information from multiple images of identical structures and producing an averaged model, effectively enhancing the resolution and coverage of image reconstructions. This review highlights important studies using SRM–SPA, demonstrating how it broadens our knowledge by elucidating features of key biological structures with unprecedented detail.

    1. Introduction

    Organized assemblies containing several copies of one or more molecular entities are a hallmark of the living world, and their existence underlies every biological process. Studying their structure and assembly dynamics is crucial to understanding their formation, function and resulting activity. Typically, these structural assemblies are characterized by specific molecular interactions, recruitment, conformational changes and catalysis of subsequent reactions. These complex supramolecular constructs range from cellular components to viruses and display structural redundancy across different levels of complexity (figure 1a). For example, the human immunodeficiency virus type 1 (HIV-1) capsid, which houses and directs the intracellular trafficking of the viral genome towards the nucleus during infection [5], is an approximately 40 MDa supramolecular complex containing at least 1500 copies of the HIV-1 capsid protein (CA) [1]. Notably, its formation involves assembling the CA monomers into approximately 12 pentameric and 250 hexameric complexes and their subsequent congregation to form a final and more complex structure [2,3] (figure 1b).

    Figure 1.

    Figure 1. Assembly of supramolecular complexes. (a) Supramolecular complexes such as clathrin-coated pits, the NPC, the HSV-1, the centriole and the VACV comprise highly ordered structures across a size range of several hundred nanometers. (b) Structural redundancy occurs at different scales within the same biological assembly. For example, the HIV-1 capsid is composed of at least 1500 copies of the HIV-1 capsid protein (CA) [1], which assemble into approximately 250 hexameric CA complexes before congregating to form the fully assembled capsid [2,3]. (c) Supramolecular complexes can be mapped by combining SRM with SPA. For this, several image sections containing super-resolved views of single particles are chosen. A filtering step removes non-matching objects based on predefined criteria. The remaining image sections are aligned, and an averaged view of the object's architecture is generated, which is then used to infer a model of the supramolecular structure. (NPC and HIV-1 CA structures were created using Mol* [4]; PDB IDs: 7N9F (NPC), 2M8N (CA monomer), 6OBH (CA hexamer) and 3J3Y (HIV-1 capsid)).

    Among the experimental tools to study biological structures, microscopy stands out by allowing their direct observation. In particular, fluorescence microscopy allows observations with molecular specificity. The spatial resolution achievable with conventional light-based microscopy is limited to approximately half of the wavelength (approx. 200 nm) [6]. Electron microscopy (EM) is currently the microscopy modality that achieves the greatest resolutions, being the only suitable option to resolve structures below a certain size. However, it provides limited information on the identity of molecular components and the precise location of a substantial population of molecules within very small complexes. The advent of super-resolution microscopy (SRM) has enabled nano-sized structures to be resolved while retaining the molecular contrast provided by molecule-specific fluorescent labelling.

    Despite the potential of SRM to resolve small molecular structures, factors such as low labelling densities and localization uncertainty hinder its ability to map supramolecular complexes with high precision and generate complete models. Single-particle analysis (SPA) is an analytical method that combines information from several views of a structure of interest, termed particle, and produces an averaged model (figure 1c). It was first developed in the field of cryo-EM [710] but has recently been applied successfully in SRM. Despite fundamental differences between EM and SRM, SPA can overcome important SRM limitations, such as under-labelling due to stochastic fluorophore excitation. SPA is primarily used to map biological structures by generating image reconstructions with a higher signal-to-noise ratio (SNR) than the original images used.

    Here, we provide an overview of SPA applications in SRM, discuss current limitations and consider future developments.

    2. Single-particle analysis overcomes critical single-molecule localization microscopy limitations

    Current SRM technologies can be classified into two broad categories based on the principle underlying their ability to achieve sub-diffraction resolutions. One type includes methods that modulate the light's path by exploring different illumination schemes, such as structured illumination microscopy (SIM) [11] and stimulated emission depletion (STED) microscopy [12]. The second category includes modalities that exploit the intrinsic properties of fluorophores, namely the ability to control their stochastic ‘on/off’ switching (i.e. ‘blinking’) and their excitation/emission spectra (i.e. photoconversion). This last category contains methods such as photoactivated localization microscopy (PALM) [13], direct stochastic optical reconstruction microscopy (dSTORM) [14,15] and point accumulation for imaging in nanoscale topology (PAINT) [16,17], which enable the reconstruction imaged structures from the localization coordinates of single and isolated fluorophores. These are known as single-molecule localization microscopy (SMLM) modalities.

    At the interface of these two categories lies minimal photon fluxes (MINFLUX), which uses a doughnut-shaped excitation light beam to detect the presence of a fluorophore and then triangulates its exact localization by refining the beam's position until the doughnut's zero centre matches the fluorophore's position, thus, minimizing the flux of photons [18]. A more detailed explanation of the principles, advantages and disadvantages of each SRM modality is available in [19].

    Despite its ability to reveal the localization of individual molecules within supramolecular complexes, SMLM has notable limitations such as labelling artefacts, target-to-label distances (i.e. linking error) and lack of temporal information. For example, single proteins are typically imaged using antibody-based labelling (i.e. immunolabelling) or fusions between fluorescent tags and the proteins of interest. In immunolabelling, the antibody's epitope is separated from the fluorophore by a short linker, resulting in a slight displacement of the fluorescent signal relative to the actual binding site in the target molecule. In turn, fusions with fluorescent tags minimize this linking error. Still, they can disturb the target proteins' function, an effect often related to steric hindrance or alterations in their spatial distribution and binding affinities (e.g. [20]). Another caveat of SMLM is related to the density and coverage of the labelling. SMLM stochastically samples the distribution of labels in a specimen [15]. Thus, there is no guarantee that all the fluorophores in a sample are acquired or excited proportionally. Furthermore, in immunolabelling, different regions within a supramolecular complex might not be equally accessible to the labelling antibodies, resulting in heterogeneous or incomplete coverage. In addition, immunolabelling is performed using chemically fixed structures. Although this allows for a snapshot of critical cellular activities and subsequent nanometer characterization, the spatial-temporal progression of the cellular activities is confined to the time of fixation and may contain fixation-induced artefacts.

    SPA is a computational approach that significantly enhances structural studies using microscopy. In SPA, particles are compared, and goodness-of-fit metrics such as cross-correlation [21,22] evaluate the particle before averaging. The typical procedure of SPA comprises (i) image acquisition, (ii) particle detection and segmentation, (iii) selection, (iv) alignment, (v) averaging and (vi) model generation. An example pipeline showcasing a dSTORM dataset containing the nuclear envelope of a Xenopus oocyte labelled for the nuclear pore complex (NPC) protein gp210 [23] (data available at https://doi.org/10.5281/zenodo.5068525) is shown in figure 2.

    Figure 2.

    Figure 2. Example SPA framework using dSTORM data of the NPC protein gp210. (a–c) A super-resolved image is acquired before the analysis. (d) Particles corresponding to the structure of interest are manually or automatically detected from a super-resolved image, generating an image library containing several segmented particles. (e) The particle population is filtered to remove unwanted objects according to specific criteria. (f) Each particle is aligned and used in an iterative averaging process [24], resulting in a final model (g) with enhanced structural accuracy. Scale bars represent 2 µm (a–d) and 100 nm (e–g).

    3. Mapping supramolecular structures at the nanoscale with super-resolution microscopy–single-particle analysis

    The studies referenced in the following sections highlight the potential and limitations of SRM and SPA in mapping the architecture of biological molecular assemblies.

    3.1. The nuclear pore complex

    The NPC mediates and regulates the translocation of molecules across the nuclear envelope. It is one of the largest supramolecular complexes in the eukaryotic cell, with a diameter of approximately 125 nm and a height of approximately 70 nm. Its cylindrical structure with eightfold symmetry is composed of multiple copies of at least 30 different proteins called nucleoporins (Nups) that form a cytoplasmic ring, a nuclear ring and a central channel with a diameter of 35–50 nm [2527] (figure 1a). Molecules crossing via the NPC need to be at least slightly smaller than its central channel. Thus, knowing the exact diameter of the NPC is crucial to understand which molecules or molecular complexes can travel across it. This becomes particularly relevant in the context of infection with pathogens trying to reach the cell's nucleus, such as HIV-1 [28]. In addition, owing to key structural features such as large size, radial symmetry and low variability between individual entities, NPCs became a preferential biological structure to evaluate SRM–SPA methods [29]. Furthermore, while averaged reconstructions of NPC densities were made using EM–SPA [30], the organization of its components within the complex remained unknown.

    Schermelleh et al. [31] used three-dimensional SIM to resolve single NPCs with molecular specificity, but the spatial resolution was insufficient to reveal fine structural details. Löschberger et al. [32] tackled this problem using dSTORM to image NPCs in nuclear envelopes isolated from Xenopus laevis (frog) oocytes. They focused on gp210, a protein that forms homodimers and then multi-mers that surround and anchor the NPC on the luminal side of the nuclear pore membrane [33]. Previous experiments using biochemical tools suggested a model where at least eight gp210 dimers were present in each NPC [34]. Löschberger et al. [32] resolved an eightfold symmetrical arrangement of gp210 and calculated an average diameter for the complex of 161 ± 17 nm. They generated images from 426 individual rings (approx. 160 000 localizations) that were automatically detected and aligned to confirm the statistically observed eightfold symmetry. The final averaged image displayed the eightfold symmetry of the gp210 ring previously observed and a diameter of 164 ± 7 nm. Attempts to distinguish gp210 monomers and dimers failed because the linkage error (approx. 18 nm) was too high to achieve the required spatial resolution. Finally, the authors exploited the high binding affinity of fluorescently labelled wheat germ agglutinin for N-acetylglucosamine-modified Nups to resolve the inner lining of the central channel of the NPC. The same analytical approach was performed by combining 621 rings (approx. 40 000 localizations) and calculating a 41 ± 7 nm pore diameter.

    Szymborska et al. [35] extended this type of approach to other components of the NPC using a distinct SPA routine. They labelled Nup133 in.whole cells and mapped the position of the fluorophores relative to the centre of the nuclear pore with a precision of 0.1 nm and an accuracy of 0.3 nm. The authors conducted a thorough characterization of the NPC's structure by labelling several members of the Nup107-160 complex, a primary component of the NPC. In doing so, they addressed the existence of three contradictory models for the orientation of the subcomplex and the radial position of its components on the nanometer scale [3638]. The same SPA framework was used to map the relative positions of the proteins in the complex. Nup-GFP fusion proteins and dye-coupled anti-GFP nanobodies were used to minimize the linking error and overcome the inability to obtain antibodies for several Nup107-160 members working in SRM. Using this methodology, they calculated similar ring diameters, and the differences between the two approaches corresponded to the difference in the linker's size between the two approaches. These measurements concluded that a head-to-tail arrangement of the Nup107-160 complexes along the nuclear pore's circumference probably explained the data. Importantly, they showed that the developed SPA implementation could be applied to asymmetric structures without losing precision if a molecular reference in a second colour channel was used.

    Many studies on the nanoarchitecture of NPCs followed, and this subject remains an intense area of research today. Some of these studies will be discussed further in light of the SPA implementations used (e.g. LocMoFit [39]; figure 3a).

    Figure 3.

    Figure 3. Overview of biological structures analysed with SRM–SPA. (a) NPC. Single NPC particles (raw data) were assumed to represent samples of the same underlying distribution. A particle was found (in the example the particle (iv)) so that it best describes all other sites based on the rank on sum LL of the all-to-all matrix, where the 50 subset sites were fitted to each other. The initial template is built based on sequential registration in the order of the sum LL rank. The final fused particle is used to register all sites in the 318-site dataset. This procedure yields an updated fused particle, which is used to register the dataset again. This process is iterated until convergence. The final average (model) was calculated from 318 particles without any assumption on the underlying geometry or symmetry in a tilted view (mode, top left). For comparison, the EM density of the NPC with C-termini of Nup96 is indicated in red (model, top right). Top and side views (model, bottom), where the nucleoplasmic and cytoplasmic rings are shown together (left panel), or separately (middle, right panels). The two proteins per ring per symmetric unit give rise to tilted elongated average protein distributions in the averages (arrows in model, bottom). Scale bars represent 50 nm. Adapted from [39]. (b) VACV. Segmented and aligned particles are used to generate multi-component models of single virions. For this, more than one seed selection criterion is applied to a single reference channel (VACV A4 frontal and sagittal pictured). Additional virion components (VACV F17 pictured) are aligned to the reference for generating virion orientation-based (frontal and sagittal) models of various viral components. A composite image of all sagital models is shown (A17(cyan)/CM(yellow)/A4(red)/DNA(blue)/L4(green)/F17(magenta)). Scale bar represents 100 nm. Adapted from [40]. (c) HSV-1. Virus images obtained from aligned particles (right, larger insets) and individual representative particles (left, smaller insets). Scale bar represents 100 nm. Adapted from [41]. (d) Clathrin-CME. (i) Dual-colour side-view super-resolution images of Las17-SNAP and Abp1-mMaple at individual sites. Images were rotated so endocytosis occurs upward and sorted by the distance of Abp1 centroid to Las17 at the base. (ii) Averages of Las17 and Abp1 at endocytic sites. For comparison, average outer boundaries of the actin network (dotted lines) and average plasma membrane profiles (solid line) obtained by correlative light-EM [42] are overlaid for each time point, as inferred from the images. Scale bar represents 100 nm. Adapted from [43]. (e) Cilia. (i) Two-dimensional STORM images of the ciliary DAs of mTEC cells with CEP164 labelled. The STORM localizations of an individual structure are fitted to an ellipse, which is then deformed to a circle. The circularized structure is normalized to a ring with a fixed diameter calculated by averaging the diameter of 31 original structures. Image (ii) shows the resulting averaged structure after alignment. Scale bar represents 100 nm. Adapted from [24]. (f) Centriole. RPE-1 C1-GFP cells were immunolabelled for the indicated proteins and imaged first in a wide-field mode, followed by three-dimensional STORM imaging, and correlative EM analysis. Averaged STORM signals were pseudocoloured, rotated and then superimposed to generate a horizontal distributional map of the DAPs. Scale bar represents 200 nm. Adapted from [44].

    3.2. Viruses

    Viruses are small multi-protein entities that can replicate their genome within a host cell. Due to their small size, viral structures have historically been described using EM [45]. However, this approach is limited in mapping specific molecules' position and molecular identity within viral supramolecular complexes because they either use labelling methods that probe only a small subset of protein copies (e.g. gold nanoparticle immunolabelling) or take advantage of light microscopy methods that fall outside the SMLM domain. SRM–SPA overcomes this limitation by using data from thousands of labelled viral particles to provide averaged three-dimensional models with molecular specificity.

    Moreover, Gray et al. [40] developed VirusMapper. This SPA framework automatically performs detection, segmentation, alignment, classification and averaging thousands of individual viral particles to generate high-resolution reconstructions of viral particles with substructural detail. Importantly, VirusMapper is a user-friendly and open-source ImageJ/Fiji implementation. The authors evaluated the analytic capabilities of the framework using multi-colour SIM and STED images of Vaccinia virus (VACV), the smallpox vaccine virus [46]. VirusMapper enabled mapping the relative localizations of several viral proteins within the fully assembled virion. Importantly, it also allowed the detection of changes in the arrangement of a particular viral protein (L4) in the viral complex induced by the virion's fusion with the cell membrane, a feature that even EM had not been able to demonstrate robustly. In a follow-up study [47], Gray et al. used VirusMapper and an extensive collection of virus mutants to investigate the nanoscale organization of the VACV entry complex, a supramolecular complex located on the viral membrane that is crucial for binding and fusion with the membrane of the host cell. They determined that fusion activity and success correlated with a specific orientation of the virion and the spatial distribution of viral binding and fusion proteins (figure 3b).

    The Herpes simplex virus type 1 (HSV-1) is a common cause of cold sores but can also cause blindness and encephalitis [48,49]. It is highly contagious and establishes a remarkably persistent and latent infection in sensory ganglia with periodic reactivation leading to symptomatic or asymptomatic virus shedding [50]. The structure of the virus, an approximately 200 nm diameter sphere with a 125 nm diameter icosahedral capsid housing the viral DNA genome, was first determined by EM/ET [5153]. The most complex layer of the virus is called tegument and contains over 20 proteins [54]. Laine et al. [41] used two-colour dSTORM to map the relative localization of several HSV-1 proteins in the viral envelope and tegument. Using viral proteins fused with fluorescent proteins, they resolved individual capsids inside infected cells. They could discriminate between enveloped and non-enveloped particles and reveal the subcellular localization of viral components and their relative amounts in the cytoplasm and the nucleus. With this information, the authors classified the resolved structures based on the presence of a capsid, tegument or envelope, effectively achieving nanoarchitecture-based characterization of single virions. A SPA framework was developed and used to map more than 50 viral particles. The algorithm generates averaged models for individual viral proteins based on their localization and radii distribution. Also, the thickness of each protein layer is estimated from the population's diameter variability and deviation from spherical symmetry (figure 3c).

    HIV-1 is the causative agent of acquired immunodeficiency syndrome (AIDS). The HIV-1 genome encodes at least 12 functional proteins [55], but several cellular factors are required for its successful replication, including members of the endosomal sorting complexes required for transport (ESCRT) [5658]. Understanding the nanoscale organization and function of ESCRT subcomplexes at viral assembly sites is crucial to eluicidate the mechanisms underlying HIV propagation. Van Engelenburg et al. [59] used PALM to image ESCRT subcomplexes that interact directly with HIV-1 Gag (the main component of the HIV-1 capsid) and play a role in HIV-1 membrane abscission. They expressed ESCRT proteins fused with fluorescent proteins in the COS7 cell line and resolved structural details of viral assembly sites at the plasma membrane. Furthermore, using a Gag-mEOS2 protein fusion as a reference, they detected ESCRT subcomplexes forming clusters at the plasma membrane that were discernible from their corresponding cytosolic pools. Importantly, a two-colour SPA approach allowed the authors to quantitatively determine that large fractions of some ESCRT proteins become trapped inside the HIV-1 Gag lattice of viral-like particles during viral assembly. Furthermore, the deletion of a specific amino acid motif in Gag precluded this phenomenon.

    3.3. Clathrin-mediated endocytosis

    Clathrin-mediated endocytosis (CME) is critical for many biological processes, such as signalling, nutrient uptake and pathogen entry. CME involves the assembly of supramolecular complexes containing clathrin and many other proteins at the plasma membrane, leading to the inward budding of an intracellular vesicle. The vesicle is then pinched off the plasma membrane and rapidly uncoats to allow its fusion with endosomes. Because of the high conservation of CMEs between yeast and humans, yeast models are often used to study their structures and dynamics. The endocytic machinery comprises sub-diffraction limit structures extensively characterized using microscopy and SPA.

    Berro & Pollard [60] developed a SPA framework that tracks protein patches located at the endocytic sites and then performs particle alignment and averaging. This approach captures in super-resolution the endocytic pathway as a function of the conformational state. This is done by classifying different conformational states within the membrane, thus inferring sequential progress. Among other discoveries, the authors mapped the average positions of endocytic proteins along membrane invagination using confocal fluorescence microscopy. Furthermore, Picco et al. [61] developed a similar SPA approach to characterize protein dynamics during vesicle budding. They combined live-cell imaging data with previous EM structural data and characterized the endocytic process thoroughly. Finally, Mund et al. [43] developed a SPA framework that uses SRM images to characterize the dynamics of vesicle budding during CME. They fluorescently tagged 23 different endocytic proteins and imaged thousands of fixed yeast cells, collecting data from more than 100 000 endocytic sites. Using live-cell imaging data as a temporal reference, the average positions of labelled proteins at endocytic sites were mapped with unprecedented accuracy, and three-dimensional models of their distribution at different stages of endocytosis were generated (figure 3d).

    3.4. Cilia

    Cilia are small organelles that take the shape of a protrusion projecting from the cell body. Their type functions as cellular antennas to integrate environmental signals or create extracellular flows by continuous beating. They are generated from the basal body, a cylindrical structure composed of triplet microtubules arranged with ninefold symmetry [6265]. The skeleton of cilia, called the axoneme [66], comprises microtubule doublets arranged with ninefold symmetry around a central microtubule doublet and is contained by a membrane contiguous to the plasma membrane. In humans, the importance of cilia is highlighted by several diseases, called ciliopathies, which arise from defective ciliary function typically caused by mutations. These include polycystic kidney disease, polydactyly and Joubert syndrome (JBTS).

    Similar to NPCs, cilia are the subject of several SRM studies. In particular, the transition zone separates the ciliary and plasma membranes and is proposed to function as a gate to the cilium [67]. EM studies on the transition zone's structure revealed ‘Y’-shaped densities called Y-links, with the stem anchored at the microtubule doublets and the two arms attached to the ciliary membrane [68]. However, the Y-links' components and the arrangement of proteins in the transition zone were unknown. Shi et al. [69] imaged the transition zone using a two-colour three-dimensional STORM. They observed several proteins forming rings with different diameters localized between the axoneme and the ciliary membrane. Based on the radial, angular and axial distribution of the ciliary components, they constructed a three-dimensional map of the transition zone with a resolution of 15–30 nm, demonstrating that the protein rings observed were consistent with being Y-links. A protein called Smoothened (SMO) formed a ring of discrete clusters specifically in the transition zone. The authors revealed that certain JBTS-associated mutations reduced SMO localization in the transition zone and the ciliary membrane.

    This study also showed that the diameter of the rings composing the ciliary transition zone, often with an elliptical shape, varied from 369 to 494 nm. This reveals an important structural characteristic of many large cellular organelles called semi-flexibility. Briefly, semi-flexible structures can display slightly different sizes and shapes due to elastic deformation or molecular composition variations while maintaining symmetry and angular arrangement. This heterogeneity poses a problem for particle alignment and averaging, reducing the accuracy of the final reconstructed images. To address this problem, Shi et al. [24] developed SRM–SPA algorithms that take structural flexibility as a degree of freedom for image registration. They work by deforming heterogeneous elliptical structures into more uniform ones and then aligning and averaging the deformed structures. The algorithms were evaluated using simulated and experimental SMLM data of ciliary appendages from mouse tracheal epithelial cells (figure 3e). They resolved the ninefold ciliary symmetry and generated multi-colour three-dimensional models with a higher resolution than state-of-the-art algorithms based on rigid structure registration [7072]. Importantly, particle deformations that can be explained with the structural semi-flexibility model can instead be a consequence of perspective, i.e. the observation of particles randomly oriented in the sample. The method developed in this study can correct for this aspect.

    Robichaux et al. [73] combined cryo-ET and STORM imaging to map repeating structures in a specialized structure of the rod sensory cilium called ‘connecting cilium’. Subtomogram averaging provided a structural framework of the connecting cilium, used as a reference to map the distribution of specific molecular components resolved with STORM. This method allowed mapping subcompartments of the connecting cilium with high precision and revealed structural differences in knockouts for basal body proteins.

    3.5. Centriole

    The basal ciliary body originates from the centriole, a cellular structure [74] sharing a similar organization, including microtubule triplets arranged radially with ninefold symmetry [75]. They display proximal-distal polarity. The proximal end shows a supramolecular matrix called the pericentriolar material [7678], which is important for microtubule nucleation and centriole duplication, the latter resulting in a mother and a daughter centriole. The distal end of the mother centriole contains projections called distal appendages (DAs), which are required for ciliogenesis [7985], microtubule anchoring and centriole positioning [8688]. Several proteins are distributed around the centriole's centre, forming 100–200 nm diameter rings. Due to centrioles' small size and high molecular complexity, mapping the exact distribution of several centriole proteins remains a challenge.

    Gartenmann et al. [89] devised a SPA method to resolve the diameter of centriole rings with unprecedented precision in Drosophila larval wing discs. Their framework involved an initial imaging step with a three-dimensional SIM to resolve the ring of the well-established centriole protein Asl. Then, several GFP fusions to centriole proteins were imaged with SMLM and using the centre of the Asl rings as initial estimates for the centriole's centroid, localizations within 100 nm were averaged to calculate a new centre. The process was repeated until the centroid positions converged. Finally, the average radius of the rings was calculated, and the distribution of GFP fusion proteins in the centriole was reconstructed. By taking advantage of the high labelling density of three-dimensional SIM, the high precision of SMLM and the resolution enhancements and statistical robustness of SPA, this framework allowed calculating radii with an accuracy of ±4–5 nm.

    Bowler et al. [44] used a combination of cryo-ET, STORM imaging and SPA to map the precise location of DA proteins in three-dimensional (figure 3f). They revealed the dynamic nature of DAs by demonstrating with ultrastructural detail the reorganizations that occur before and during mitosis.

    The centriole is also a central component of the centrosome, a microtubule-organizing structure comprising a pair of centrioles responsible for the mitotic spindle formation during cell division. Sieben et al. [72] developed SPARTAN, a graphical user interface employing a novel SRM–SPA framework that generates three-dimensional structure reconstructions from two-channel two-dimensional SMLM data. In this study, centrosomes from a human cell line were purified and concentrated in coverslips by centrifugation before imaging with STORM. A library of densely labelled particles (10%−20% of the entire population) was created, and the particles were aligned and classified according to their orientation. An averaged model for one labelled protein was generated, on which models for other proteins could be mapped. Using this tool, they confirmed the ninefold symmetry of Cep164, a crucial centriolar component. They also revealed unknown features of the human centriole's architecture, such as closer proximity of the Cep164 N-terminus to the centriolar wall than previously reported [78].

    4. Quantitative structural descriptions outside the scope of single-particle analysis

    SPA refers to frameworks that register and average particles to produce a final model with improved SNR. Despite comparable analytical methods (e.g. pair-correlation) sharing key statistical approaches, such as numerical descriptions of ensemble parameters (e.g. particle diameter), these are not well suited to produce improved image reconstructions—unlike SPA.

    For example, spatial descriptive statistics such as pair-correlation [90], Ripley's K-function [91] and density-based spatial clustering of applications with noise [92] have been employed to determine clustering states and average cluster sizes of particle populations. Gunzenhäuser et al. [93] developed a quantitative method to reveal the functional and morphological aspects of the HIV-1 assembly based on SMLM data. They analysed hundreds of HIV-1 Gag clusters and determined the number of Gag molecules per cluster, as well as the frequency distribution of the clusters with respect to their radius and aspect ratio. Malkusch et al. [94] evaluated the performance of different cluster analysis methods in the same context. They were able to classify Gag clusters corresponding to three different stages of viral assembly. Finally, Floderer et al. [95] expressed assembly defective Gag mutants in live cells to follow the trajectories of individual molecules. They calculated the duration and the energy proportions involved in each assembly stage.

    A type of analytical tool that has recently been explored in microscopy is deep-learning (DL). Examples of its applications in bioimage analysis are image processing (e.g. denoising) and particle tracking and segmentation, with several user-friendly tools available to researchers as open-source (e.g. [96]). The use of DL in SPA is showcased in [97]. Here, six DL models were trained and benchmarked on their ability to recognize the septin ring, a structure arising in cells treated with the actin polymerization inhibitor cytochalasin D. Models were able to identify hundreds of septin rings with high accuracy. The segmented rings were then used for particle averaging, producing models with sensitivity for slight structural deviations induced by a septin ring-related gene knockout.

    Finally, supramolecular complexes can be mapped using different fluorescent labels in the same protein. For example, Mennella et al. [77] resolved kendrin/pericentrin, a component of the pericentriolar material in human centrioles, with antibodies against the protein's N-terminus and a GFP tag in the C-terminus using SIM. The distance from the different fluorescent signal to the centre of the structure allowed to determine the protein's orientation in the supramolecular complex, with the C-terminus positioned closer to the centre and the N-terminus extending outwards with radial symmetry. Another example is Leterrier et al. [98], where the nanoscale organization of the neuronal protein ankG in a region of the axon called the axon initial segment was determined using antibodies targeting different domains of the protein. A known spectrin-binding domain of ankG, which is closer to the protein's N-terminus, was colocalized with spectrin bands just under the plasma membrane and displayed a marked periodicism. By contrast, downstream domains closer to the C-terminus progressively lost this periodicity and were found to be located deeper in the cytoplasm than the N-terminus by three-dimensional STORM.

    5. Current challenges in single-particle analysis

    The studies mentioned in the previous sections showcase the potential of SRM–SPA to map the nanoarchitecture details of supramolecular complexes. However, they also highlight current limitations. In general, the increased computation resources and times required to perform the analyses in novel SPA frameworks are minimized by exploiting different mathematical approaches and hardware specifications, such as fast Fourier transforms [99] and graphical processing units (e.g. [100]). Additionally, structures occluded by noise in SMLM data can be revealed by introducing spatial descriptive statistics in the SPA pipelines [101]. However, more fundamental challenges remain. In particular, crucial steps in SPA methodologies involve using an initial template. Particles detected in a dataset can be included or excluded from the analysis with respect to how well they match the chosen template. Additionally, particle averaging involves iterative refinement of the shape of an initial template. The template chosen is typically arbitrary, often consisting of a simplified shape recapitulating a priori knowledge on the structures of interest. For example, a simple spherical shape can be used as a template if the structure of interest is a mature HIV-1 or HSV-1 particle. Another option is to select a representative particle from the dataset, which can be a single particle, a subpopulation or even a class average (e.g. VirusMapper [40]). The problem shared among these approaches is that the final reconstructed structures are biased towards the template used.

    Another important limitation of SPA arises from the assumption that all particles represent the same underlying structure. However, this is not necessarily true, as most particle populations are expected to display some degree of heterogeneity. For example, some biological structures sporadically or transiently manifest as variations of their canonical structure, such as the NPC variant with ninefold symmetry [35], semi-flexible ciliary structures [24] and multi-core HIV-1 particles [102]. These particular structures are under-represented in the final reconstructions, decreasing their accuracy and remaining undetected.

    5.1. Novel particle registration approaches minimize template bias

    Several studies focused on developing so-called ‘template-free’ SPA methodologies, which use non-arbitrary or data-driven templates to overcome template bias. For example, Fortun et al. [103] developed a SPA framework that reconstructs three-dimensional supramolecular assemblies from two-dimensional SRM images of particles in different orientations. While this framework uses a small number of hand-picked templates to detect particles, it estimates their orientations and performs volume reconstructions without resorting to a priori knowledge, creating an initial model that is iteratively refined by averaging. This framework allowed mapping the distribution of Cep63 around the central centriole barrel from STED images with unprecedented accuracy.

    Salas et al. [70] used multi-variate statistical analysis to classify particles according to their orientation. Representative examples of each class were then selected based on the visual match between the class average and the individual particles and used to perform multi-reference alignments. The algorithm successfully produced high-resolution three-dimensional reconstructions of DNA origami [104] structures and T4 bacteriophages from SMLM and simulated data.

    Furthermore, Heydarian et al. [100] developed an all-to-all’ registration approach. Each particle is registered pairwise against every other particle, producing similarity metrics and estimates of each pair's relative orientation and position. All the possible absolute positions and orientations of each particle are then calculated using a technique from the field of computer vision called ‘structure from motion’ [105]. A quality control step is performed to remove outliers and registration errors by inferring the combinations of relative parameters from the absolute parameters obtained previously. These retrodicted parameters are then compared to those found in the all-to-all registration, and pairs deviating from their registered counterparts above a specified threshold are discarded from the next steps. These procedures generate a data-driven template that is further refined by particle averaging. The algorithm's performance was first evaluated using a dataset containing two-dimensional DNA origami particles imaged with DNA-PAINT and generated image reconstructions with approximately 3 nm resolution. In addition, the algorithm was able to perform using particles with labelling densities as low as 30%. The algorithm was also tested using the NPC dataset in [32] and successfully retrieved the NPC's eightfold symmetry without requiring prior knowledge. More recently, this framework was extended to enable three-dimensional SPA [106].

    Another example is Blundell et al. [107], where DL was used to predict each particle's pose. Here, a neural network fits a three-dimensional model against a library of particles viewed in two-dimensional. The framework was able to discern the centriole's expected toroidal structure from the STORM dataset acquired in [72].

    It is important to note that, although ‘template bias’ exists when using arbitrary models as an initial reference for SPA, completely discarding prior knowledge on the structures of interest might not be the most sensible action. Indeed, the ‘template-free’ algorithms showcased so far produce reconstructions that can be further improved by taking prior knowledge into account. On this note, Wu et al. [39] developed LocMoFit. This framework extracts particle geometry from SMLM data and fits the particles to geometric models based on maximum-likelihood estimation (MLE). When the geometry of the structure of interest is known, the user can decide on the class of geometric models to be used for the fitting. Importantly, when an underlying geometry cannot be recovered, the algorithm can perform particle averaging to generate a data-driven template by combining the all-to-all registration methodology developed by Heydarian et al. [100] with MLE. Mund et al. [108] used LocMoFit to extend their work on CME developed in [43]. Here, they densely labelled clathrin in the human melanoma cell line SK-MEL-2 and used three-dimensional SMLM to resolve the endocytic sites. Then, they performed extensive quantitative descriptions of the structures analysed and were able to infer a spatial-temporal model of the endocytic process.

    5.2. Improving sensitivity to detect structure heterogeneity

    Image classification can be used to enhance SPA's lack of sensitivity to particle heterogeneity. This usually involves calculating similarity metrics (e.g. cross-correlation at different relative rotations [21,22] for particle pairs and then sorting the particles into a suitable number of classes). Several SPA frameworks mentioned so far use classification to group particles according to their viewing orientation (e.g. [40,72]). The particles' relative orientations can be calculated to produce a three-dimensional model from two-dimensional projections with this information.

    A more underlooked challenge of SPA that benefits from particle classification is the ability to detect and classify particles representing different underlying structures. This becomes particularly difficult when the various structures are not known a priori and reasonable templates cannot be provided. Huijben et al. [109] extended the pipeline developed in [100] to achieve this goal. In this implementation, the similarity metric obtained from the all-to-all registration strategy is converted into a dissimilarity metric, and multi-dimensional scaling [110] is used to translate the values of each registration pair into spatial coordinates projected in a multi-dimensional space. The multi-dimensional particles are then classified using k-means clustering [111] and each class is averaged to generate the final reconstructions. In particular, the authors provide a relatively simple strategy to determine an optimal number of classes without a priori knowledge. The framework was evaluated using a two-dimensional DNA origami test dataset imaged with DNA-PAINT [16,112], containing particles belonging to several different classes. Under these conditions, the algorithm correctly classified greater than 95% of the analysed particles. The averaged reconstructions achieved resolutions between 3.7 and 5.7 nm. Most importantly, the algorithm correctly classified an NPC with ninefold symmetry in a simulated dataset where this class was highly under-represented (2% of the total number of particles). Furthermore, the algorithm detected classes corresponding to elliptical NPCs in the Xenopus oocyte NPC dataset mentioned previously [32]. Although the presence of these elliptical structures in the data was deemed by the authors to be an artefact generated during sample preparation without any biological significance, their successful detection showcases once again the high sensitivity of the classification framework and its ability to detect unknown structures. In this context, Sabinina et al. [113] used SPA to perform quantitative descriptions and obtain an average model of the NPC. They observed that the nuclear basket protein TPR was distributed over a larger volume than expected, suggesting averaging over different NPC conformations. Accordingly, they found significant variation in descriptive parameters, such as circularity and diameter, which exceeded the expected registration error. Thus, they developed a classification method that does not depend on a priori information about the structures of interest and used it to confirm the presence of different NPC conformations.

    6. Conclusion and outlook

    Determining the fine structure of supramolecular complexes gives crucial insight into their assembly dynamics and functional capabilities. SRM excels at imaging small biological structures containing multiple components with different molecular identities at high resolutions. It successfully tackles the challenge of surpassing the diffraction limit of light by bringing together cutting-edge microscopy technology, modulation of the physical properties of fluorescent labels and advanced statistical analysis. Furthermore, the implementation of SPA in SRM substantially improves the accuracy of image reconstructions overcoming crucial caveats of SRM such as under-labelling and labelling heterogeneity. It combines information from thousands of imaged structures and produces reconstructions that effectively contain more localizations than their underlabelled counterparts. This enables the mapping of supramolecular complexes in three-dimensional and retrieving architectural features with single-digit nanometer precision. Additionally, the analysis can be performed in multiple colour channels representing different labelled molecules present in the same molecular assembly, which allows using a reference molecule to register the remaining (e.g. [40]).

    Despite the great success of SRM–SPA in mapping supramolecular complexes, its capabilities have not yet been fully explored. So far, SPA studies have focused almost exclusively on generating reconstructions of a unique and fully assembled structure. However, these same structures undergo a process of assembly comprising multiple metastable structures that share crucial molecular players. Resolving the fine structural details of intermediate structures is crucial to describing the assembly dynamics of the corresponding supramolecular complexes. There are several obstacles to mapping metastable structures. For example, the degree to which each structure is represented in the data is a function of its relative stability or frequency, resulting in a heterogeneous representation. Combined with the already existing caveats of SRM (e.g. under-labelling) increases the chances that less stable structures are under-represented, potentially to a point where it becomes hard or even impossible to acquire a sample with enough particles to accurately represent the structure. In addition, structural plasticity as observed in Shi et al. [24] and Sabinina et al. [113] also contributes to under-representation of a particular structure.

    Furthermore, the lack of temporal information due to fixed-sample imaging precludes the mapping of the sequential assembly of supramolecular complexes. This problem was addressed in Berro & Pollard [60], where a ‘brute force’ approach was taken. Briefly, CME was oversampled to ensure that a suitable number of particles representing each metastable structure were detected. Then, the assembly process's temporal hierarchy was estimated from the similarity between metastable structures. A potential solution to this problem might reside in correlating the fine structural details resolved with SRM–SPA with structural and temporal information extracted from live-cell SRM imaging. While still being limited by the constraints of live-cell SRM today, remarkable progress and future development of both fields, SPA and live-cell SRM, will be the key to elucidating the dynamics of the structural rearrangement of molecular complexes within living cells.

    Data accessibility

    This article has no additional data.

    Authors' contributions

    A.M.: conceptualization, visualization and writing—original draft; H.S.H.: conceptualization, visualization and writing—original draft; S.C.: writing—review and editing; C.L.: writing—review and editing; R.H.: conceptualization, supervision and writing—review and editing.

    All authors gave final approval for publication and agreed to be held accountable for the work performed therein.

    Conflict of interest declaration

    The authors declare no competing interests.

    Funding

    This work was supported by the Gulbenkian Foundation (A.M., H.S.H., S.C. and R.H.) and received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 101001332 to R.H.), the European Molecular Biology Organization (EMBO-2020-IG-4734 to R.H. and ALTF 499-2021 to H.S.H), the Wellcome Trust (203276/Z/16/Z to R.H.), and NHMRC (APP1183588 to S.C.). A.M. would like to acknowledge support from the Integrative Biology and Biomedicine PhD programme from Instituto Gulbenkian de Ciência.

    Acknowledgements

    We thank Marie-Christine Dabauvalle and Georg Krohne for the nuclear envelope preparation and Xiaoyu Shi for sharing the Deformed Alignment code.

    Footnotes

    The authors contributed equally.

    Special Feature: Advances in Quantitative Bioimaging. Guest edited by Ricardo Henriques, Ilaria Testa, Christophe Leterrier and Aubrey Weigel.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.