Signature of transition to supershear rupture speed in coseismic off-fault damage zone

Most earthquake ruptures propagate at speeds below the shear wave velocity within the crust, but in some rare cases, ruptures reach supershear speeds. The physics underlying the transition of natural subshear earthquakes to supershear ones is currently not fully understood. Most observational studies of supershear earthquakes have focused on determining which fault segments sustain fully-grown supershear ruptures. Experimentally cross-validated numerical models have identified some of the key ingredients required to trigger a transition to supershear speed. However, the conditions for such a transition in nature are still unclear, including the precise location of this transition. In this work, we provide theoretical and numerical insights to identify the precise location of such a transition in nature. We use fracture mechanics arguments with multiple numerical models to identify the signature of supershear transition in coseismic off-fault damage. We then cross-validate this signature with high-resolution observations of fault zone width and early aftershock distributions. We confirm that the location of the transition from subshear to supershear speed is characterized by a decrease in the width of the coseismic off-fault damage zone. We thus help refine the precise location of such a transition for natural supershear earthquakes.


Introduction
While most earthquake ruptures propagate at speeds below the shear wave speed (sub-Rayleigh regime), some earthquakes can occasionally accelerate above the shear wave speed. Such events are known as supershear earthquakes. The earthquake rupture speed, and specifically its abrupt changes, control the high-frequency radiation. 1,2 As an earthquake goes to supershear speeds, it manifests Mach fronts that produce unusually large ground motion at distances far from the fault. [3][4][5] Rupture speed also governs the spatial extent of the off-fault coseismic damage zone, i.e., the volume within the crust directly surrounding the rupture that experienced mechanical damage. [6][7][8][9][10][11] Coseismic off-fault damage refers to fractures created or reactivated in the off-fault volume due to the dynamic rupture. As variations in rupture speed affect the seismic radiation and the off-fault coseismic damage, understanding the conditions for a rupture to transition from the sub-Rayleigh to the supershear regime would greatly help constrain fault properties (geometry, friction, lithology, etc.) and traction conditions that promote supershear ruptures. In addition, knowing how, why, and where earthquakes attain supershear speeds would help in the reliable estimation of earthquake hazard assessment. 1,12 Whether supershear ruptures occur in nature was a matter of debate for a long time. While theoretical and numerical models demonstrated in the early 1970s that earthquakes could propagate at supershear speeds, [13][14][15][16][17][18][19] the absence of field observation and their extreme rarity in laboratory experiments 20 first suggested that supershear earthquakes could not exist in nature. It was not until the M w 6.5 Imperial Valley earthquake (California, 1979) that a supershear rupture was inferred for the first time. [21][22][23] Pioneering laboratory experiments [24][25][26] together with observations from the 1999 M w 7.4 Izmit and the 1999 M w 7.2 Düzce earthquakes in Turkey, 27,28 then conclusively confirmed that supershear ruptures are more common than previously expected. Supershear ruptures have now been inferred for several, albeit rare, events: the 2001 M w 7.8 Kunlun (China) earthquake, [29][30][31] the 2002 M w 7.8 Denali (Alaska) earthquake, 32,33 the 2010 M w 6.9 Qinghai (China) earthquake, 34 the 2012 M w 8.6 off-Sumatra (Indonesia) earthquake, 35 the 2013 M w 7.5 Craig (Alaska) earthquake, 36 the 2013 M w 6.7 Okhotsk (Kamtchatka) earthquake, 37,38 and most recently the 2018 M w 7.5 Palu (Indonesia) earthquake. 39,40 Several efforts have been made to identify and characterize these supershear earthquakes using various methods. 28,31,39,41,42 The methods are, in most cases, designed to reveal along which segment the rupture propagated at supershear velocities. The location of the transition from sub to supershear speeds is usually inferred as follows. The fault is subdivided into segments for which an average rupture speed is determined, using kinematic inversion methods. If the average rupture speed of a segment is deduced to be supershear, supershear transition is presumed to have occurred in between subshear and supershear segments. However, due to the lack of dense near-fault records, rupture speed estimates from seismological inversions and back-projection techniques rely on teleseismic records, leading to rough estimates of the transition's location (i.e., tens of kilometers precision). Even in a situation where near-field data exists, such as during the 2002 Denali earthquake, 32,33 records are still too sparse to precisely locate the transition.
Numerous theoretical and numerical models have been developed to characterize the mechanics of supershear transition. On planar faults with homogeneous stress and strength, the occurrence of supershear ruptures depends on the S ratio: S = (τ p − τ 0 )/(τ 0 − τ r ), where τ 0 is the background shear stress and τ p and τ r are the peak and residual frictional strengths. [15][16][17] Supershear ruptures occur if S < 1.77 in 2D, 15,17 or S < 1.19 in 3D, 43 and the fault length is long enough for transition to occur. 15 The time and location of the transition are determined by how far S is from the Burridge-Andrews threshold of 1.77.
However, natural faults do not have homogeneous stress and strength conditions before an earthquake.
Numerical studies of supershear transition have also focused on spatial heterogeneity in stress and/or strength distribution, on a planar fault [43][44][45] and on a non-planar fault, [46][47][48] while Kaneko & Lapusta 49 and Hu et al. 50 looked at the effect of the free surface on supershear transition.
Despite these efforts, the conditions for supershear transition in nature are still poorly understood because the imprecise location of the transition does not allow us to properly characterize the conditions on, or off, the fault that led to it. In addition, it also appears difficult to precisely capture this transition in both laboratory earthquakes (e.g., Mello 51 ) and numerical models. This problem is mainly because the transition happens over very small length scales (of millimeters in the lab and over a few elements in numerical methods). A new approach that focuses specifically on the characterization of the region of transition is thus required to better understand why earthquakes accelerate to supershear speeds.
Although the modeling efforts helped better understand the physics of supershear ruptures, it is not straightforward to apply that knowledge in the field, especially since we do not directly measure the stress and strength along the fault. Therefore, we propose a new approach based on the physical characterization of parameters that may promote supershear ruptures. This approach focuses on the detailed analysis of coseismic off-fault damage as a proxy to characterize the transition to supershear speeds.
In this work, we cross-validate this idea by writing a theoretical analysis and comparing it with several numerical approaches. Then, we look at the observations available for natural supershear ruptures to corroborate this approach.

Coseismic Off-Fault Damage Around The Rupture Tip While Approaching Supershear Transition: Theoretical Analysis
Coseismic off-fault damage depends on the evolution of stress during the propagation of a rupture in the crust. We use Linear Elastic Fracture Mechanics (LEFM) to provide closed-form solutions to quantify the state of stress around the rupture tip.
Consider a semi-infinite plain-strain crack that moves at speed v ≤ c R , where c R is the Rayleigh wave speed. Let the origin of the polar coordinate system (r, θ) coincide with the crack tip. The near-tip stress, which also depends on the rupture speed v, is given as, 52 where σ 0 αβ is the initial stress state and K dyn II is the dynamic stress intensity factor which can be approximated as where c p , the P-wave speed, is the limiting speed for a mode II crack andL = t 0 v(t)dt is the current crack length and c R is the Rayleigh wave speed. 52 Here K sta II (L) = ∆τ √ πL is the static stress intensity factor of the crack and ∆τ is the stress drop defined as σ 0 yx − τ r . Here σ 0 yx is the initial shear stress on the fault and τ r is the residual frictional strength. This solution by Freund 53 allows us to transform a static solution into a dynamic one. The rupture velocity dependence of the stress field makes it undergo a Lorentz-like contraction affecting both the stress field and its angular distribution as the rupture speed approaches the Rayleigh wave speed, c R . 52 This contraction has already been observed and verified experimentally for ruptures approaching the Rayleigh wave speed. 54 For a crack-like rupture, there is a competing effect of the increase in stress intensity factor with increasing crack length. If it is a stable slip pulse of fixed length, then the effect of the crack length is invariant with rupture speed.
When an earthquake rupture transitions to the supershear regime, the rupture first accelerates to the Rayleigh wave speed. Within the framework of LEFM, as the rupture approaches the Rayleigh wave speed, K dyn II monotonically decreases to zero, strongly reducing the stress concentration at the rupture tip. Thus the off-fault domain affected by this stress concentration also shrinks. We illustrate this effect by calculating the extent of the region where the stress state exceeds the Drucker-Prager failure criterion, i. e. F DP ≥ 0, where Here p ≡ I 1 /3 = σ kk /3 is the hydrostatic stress derived from the first invariant of the stress tensor, I 1 , J 2 = s ij s ji /2 corresponds to the second invariant of the deviatoric stress tensor (s ij = σ ij − pδ ij ) and tan φ = f is the static coefficient of friction. This yield criterion, effectively a smooth approximation to the Mohr-Coulomb law using invariants of the stress tensor, 55 takes into account all possible planes of slip, as opposed to Mohr-Coulomb where the potential slip planes need to be defined a priori or are optimally oriented.
Using the LEFM solutions for a mode II crack, we can compute F DP . The closed-form expressions are quite long, but the key variables that affect the solution can be written as Thus, given a rupture velocity history, v(t), one can then calculate the region around the fault where the Drucker-Prager yield criterion is violated.
We illustrate the solution in Figure 1. Consider a fault in an elastic medium subject to an initial stress state (see Table 1 for all the parameters used). Assume that the rupture is accelerating to the Rayleigh wave speed (Figure 1a). The instantaneous static stress intensity factor increases with the increasing length of the crack (Figure 1b). However, Freund's universal function, F II (v), monotonically decreases to 0 ( Figure 1c). Consequently, the dynamic stress intensity factor first increases with increasing crack length but soon starts decreasing as it is dominated by F II (v) (Figure 1d). Thus, we should expect to see an initial increase in the spatial extent of the damage followed by a reduction as the rupture approaches the limiting speed. We illustrate this effect at key moments in the rupture's history (marked by points numbered 1 through 9 in Figure 1d). First, we compute, in Figure 1e, the 'instantaneous' damage, i. e., not accounting for the damage accumulated from the past history of the rupture around the rupture tip.
We observe that the spatial extent of the domain where Drucker-Prager failure criteria is violated first increases (time points 1 through 3) and then decreases (time points 4 through 9) as the rupture accelerates.
We can now superimpose all the snapshots of instantaneous damage to visualize the 'cumulative' damage accumulated by the rupture throughout its history ( Figure 1f). Note that this last exercise is purely for illustrative purposes as the LEFM solution does not really account for the history of the rupture. These limitations will be overcome in the following sections. Nevertheless, to the first order, this approach illustrates that, as the rupture approaches the Rayleigh wave speed, the width of the coseismic off-fault damage zone should decrease significantly.
It is also worth noting that in the calculations above, the off-fault medium remains elastic. Thus, the rupture is insensitive to changes in the constitutive response of the medium due to coseismic offfault damage. To resolve these issues, we need to take advantage of numerical simulations that allow for proper constitutive descriptions of the off-fault medium, including feedback from changes in elastic properties on the rupture behavior and that also allow for transition from subshear to supershear speeds.

.1 Modeling strategies
The two numerical models presented in this study account for fracture damage-dominated brittle rheology. Earthquake ruptures induce large dynamic strain rates ( ≥ 1s −1 ) around the rupture tip that cannot be fully accounted for by classical Mohr-Coulomb or Drucker-Prager plasticity, typically used low strain rates ( ≤ 10 −5 s −1 ) and the two models presented below allow to account for such large dynamic strain.
Okubo et al. 10,11 explicitly model the sub-kilometric off-fault fractures using a continuum-discontinuum numerical framework. The micromechanical model 8,9 accounts for coseismically activated "microfractures" (∼10's of meters) by computing the dynamic changes in the constitutive response due to these fractures. The difference between the two models is essentially in the modeled off-fault fracture networks scale, leading to two distinct modeling strategies. In essence, the two brittle failure models are complementary, and they should, ideally, be combined. Nevertheless, they are good proxies for modeling off-fault damage at various scales, and we further show that they produce similar results.
To reproduce coseismic damage at relatively large scales (i.e., 50m to several km), Okubo et al. 10,11 use the software suite HOSSedu, developed by the Los Alamos National Laboratory (LANL). 57,58 The numerical algorithms behind this tool are based on the combined Finite-Discrete Element Method (FDEM) proposed by Munjiza,59 to produce dynamically activated off-fault fracture networks. The key feature here is that FDEM allows each interface between the finite elements, describing the off-fault medium, to have its own tensile and shear failure criterion. Thus, these interfaces can rupture under appropriate traction conditions. When the earthquake rupture propagates, the dynamic stress field around its tip will increase to violate the tensile or shear failure criteria leading to off-fault fracture damage. The lower limit of fracture resolution is around 50m, which is defined by the minimum size of the discretized mesh. For a further detailed description of the method, see Okubo et al. 10,11 The second modeling strategy relies on laboratory experiments 60-62 and field observations 63, 64 that show significant changes in elastic properties related to fracturing. Observations in the field show up to 40% coseismic reduction in P-and S-wave velocities on spatial scales of hundreds of meters normal to the fault and few kilometers in depth. Following the damage constitutive laws proposed by Ashby & Sammis, 65 Deshpande and Evans, 66 Bhat et al., 67 Thomas et al. 8 implemented an energy-based micromechanical brittle rheology, as developed by Rice, 68 to account for such dynamic change of bulk rheological properties during earthquakes. In short, at each equilibrium state, the Gibbs free energy density Ψ of the damaged solid is defined as the sum of (1) the free energy Ψ e of a solid, without flaws, deforming purely elastically and (2) the free energy Ψ i corresponding to the contribution of the current set of microcracks. Using thermodynamic arguments, Ψ = Ψ e + Ψ i is used to derive the new stressstrain constitutive law and the changes of elastic properties in the medium at the equilibrium stage. In the model, following laboratory experiments, the evolution of Ψ i is determined by taking into account the effect of loading rate and crack-tip velocities on crack growth (see Zhang & Zhao 69 for an overview).
A complete description of the model can be found in Bhat et al. 67 and Thomas et al. 8

.2 Model set up
For comparison purposes, we set up the two brittle rheology models to be as close to each other as possible. In both cases, we consider a 2-D plane strain medium, with a 1-D right lateral fault embedded in it and loaded by uniform background stresses. The maximum compressive stress σ 1 , and the minimum compressive stress σ 3 are in the x − y plane, whereas the intermediate principal stress σ 2 coincides with σ zz . The fault plane makes an angle of 60 o with σ 1 with normal uniform traction (σ 0 yy ) and shear traction (σ 0 xy ) everywhere except in the nucleation zone of the micromechanical model where the shear traction is slightly above the nominal static strength. In the FDEM model, the dynamic rupture is initiated by locally decreasing the static friction, in the nucleation zone, instead of the local change of σ 0 xy . These different nucleation strategies do not affect the results since this study is focused on damage occurring when the rupture is dynamic (sub and supershear). For each model, the initial shear stress and the S-ratio were chosen so that the rupture transitions to supershear speed reasonably early on.
Rupture propagation along the main fault plane is governed by a slip-weakening friction law. 70 Static friction is set at 0.6, which corresponds to a value measured in laboratory experiments for a large range of rocks, 71 and dynamic friction at 0.1 as observed in high slip-rate experiments. 72 The fault length is 64km for the micromechanical model and 115km for the FDEM model. The domain width is determined based on the numerical method and on the scale of the fracture networks accounted for in the calculations. For the micromechanical model, the width around the fault is 6km with absorbing boundary conditions to avoid reflections interfering with the propagating dynamic rupture. For the FDEM model, we simply set the domain large enough (86km) so that the reflections do not arrive on the fault over the computation duration. We non-dimensionalize all length scales by the static size of the process zone (R 0 ) so that the models could be compared to natural or laboratory earthquakes. R 0 is defined as Here ν is Poisson's ratio, µ is shear modulus, G is the fracture energy associated with the slip-weakening law, τ p and τ r are the static and dynamic frictional strengths, respectively.
Reference values for the common parameters between the two models are summarized in Table 1, whereas parameters specific to different modeling strategies are provided in the Supplementary Information (Tables S1 & S2). Schematics and parameters for the simulations can also be found in Figure   S1.

.3 Influence of rupture dynamics on off-fault damage
To understand the complex interaction between the main fault and the surrounding medium undergoing coseismic damage, it is necessary to simultaneously look at the rupture dynamics, the associated stress field around the fault, and the triggered damage in bulk. Here we use the micromechanical model to underline the key features that arise from numerical studies. We note that the same conclusions can be drawn from the FDEM model.
In the micromechanical model, under a compressive regime, damage occurs by the growth of preexisting "flaws". They represent the faults, joints, cracks, mineral twins, defects in the crystal structure, grain boundaries, etc., we observe in nature. Frictional sliding occurs on these pre-existing fractures when the shear stress overcomes the frictional resistance acting on the fracture interface. As the faces slide in opposing directions, it creates a tensile wedging force that opens wing cracks at the tips of the shear fracture. The wing cracks grow when their stress intensity factors overcome the fracture toughness.
In Figure 2 we display the Drucker-Prager criterion to emphasize the regions where shear sliding is likely to occur, i.e., F DP > 0 (equation 4), and therefore where we expect wing cracks to grow. Fault slip rate (white curves) and cumulative damage (greyscale) are superimposed on each snapshot of F DP . New coseismic damage induced by the displayed stress field is highlighted in red. As the rupture propagates below the Rayleigh wave speed (Figure 2a), the damage is essentially generated behind the rupture front, in the tensional quadrant. This feature has been seen by Okubo et al. 10 as well, using the FDEM model. Damage also occurs ahead of the rupture front, where the S-wave field concentrates, although the resulting damage density is much smaller than behind the rupture front. The location of newly formed cracks corresponds to the region where F DP is positive. The transition to supershear velocity ( Figure   2b) directly impacts the generation of coseismic damage. Around t = 6.5s a new pulse is generated ahead of the rupture (Figure 3b). The transition to supershear velocity coincides with a decrease in the width of the damage zone (Figures 2b and 3b-d). As we argued in section 2. , this is likely related to the decrease in the stress intensity factor as the rupture speed increases. When approaching the Rayleigh wave speed, K dyn II becomes small, even if the total length of the fault that has ruptured (L in equation 3), has increased significantly. Moreover, a snapshot of F DP at t = 7.2s (Figure 2b) shows that the first pulse is not strong enough to change the state of stress optimally, thus reducing even further K dyn II . These two factors likely explain why damage mostly occurs behind the front of the pulse propagating at sub-Rayleigh rupture speed. After t = 7.2s, the second pulse ahead of the rupture front becomes stronger and the first pulse weaker (Figure 3d-e). Consequently, damage now essentially occurs behind the rupture front propagating at supershear speed and in the area behind the Mach cone ( Figure 2c). Therefore, even if the sub-Rayleigh pulse propagates now inside the "transition zone", it induces little to no further damage. Therefore, this local reduction in the damage zone width is expected to remain unchanged for the earthquake duration and could, in principle, be observed in the field.
Finally, it is worth noticing that while the surrounding medium records the transition to supershear rupture (via the damage zone width), the final displacement along the fault, that geodesy could provide, carries no such information. As one can see in Figure 3a, the slip continues to accumulate in the "transition zone" after the passage of the rupture, while the damage zone width remains unchanged ( Figure   2c).

.4 Comparison between the different modeling strategies
We now perform a similar exercise using the FDEM model and choose parameters such that the micromechanical and FDEM models are as close to each other as possible. We adopt a slightly different yield criterion to reflect on the way damage is accounted for in this numerical method (see section 3.
where c is the cohesion and R M C is given by: where where J 3 is the third invariant of the deviatoric stress tensor.   . They conducted a 2D in-plane dynamic rupture modeling with Drucker-Prager elasto-plastic constitutive law. In their model, the degree of damage is inferred from the accumulation of plastic strain. Although plasticity, as a proxy for damage, essentially holds only at low-strain rates, 69, 74 they nevertheless pointed out a "remarkable contraction" in the damage zone width associated with supershear transition.
We can thus conclude, quite confidently, that this reduction in damage zone width (a gap in off-fault fractures) is a universal characteristic of supershear transition and is insensitive to the constitutive law used to model damage.

Natural observations of coseismic off-fault damage
Theoretical and numerical models, with three different rheological descriptions of off-fault damage, 7,8,10 all suggest that the region affected by the stress field around the rupture tip shrinks during the supershear transition. This shrinkage of the stress field results in a narrow off-fault damage zone. We now look for a direct (or indirect) signature of this reduced damage zone in the case of natural supershear earthquakes, verifying their location with corresponding kinematic models that invert for rupture speeds.
Vallage et al. 75 and Klinger et al. 76 showed that there is a one-to-one relationship between the features of the displacement field, around the fault, of an earthquake and off-fault fracture damage.
Using the FDEM method described earlier, Klinger et al. 76 showed that the observed, spatially diffuse, displacement field around a fault is due to displacement accommodated by off-fault fractures (see Figure   3 in 76 ). This feature is in stark contrast with the sharp displacement field obtained, around a fault, when displacement is only accommodated by the main fault. Thus deviations from this sharp displacement field would allow us to characterize the width of the damage zone around a fault.
We can also make a reasonable assumption that the newly created/reactivated off-fault fractures are likely to host early off-fault aftershocks (based on the observed stress field). We should then expect a reduction in the spatial extent of the early off-fault aftershocks at the location of the supershear transition.
A more thorough analysis would involve continuing the simulations described earlier for up to a week, or so, after the earthquake. However, computational limitations make this beyond the scope of the present study.

.1 Optical image correlation to observe off-fault coseismic damage
Recent developments in satellite optical image analysis and sub-pixel correlation methods allow for detecting displacement variations due to an earthquake down to sub-metric resolutions (e.g., Vallage et al., 75 Barnhart et al., 77 Delorme et al. 78 and Ajorlou et al. 79 ). These methods enable characterizing the surface rupture geometry, the amount of surface displacement, and the width of the zone affected by this displacement after an earthquake, also referred to as the "fault zone width". 80 The fault zone width reflects the lateral extent of rock damage, and secondary deformation features around the fault. 81,82 In the field, the fault zone width corresponds to the last deformation features observed when moving away from the fault core. 81 The same definition is employed in geodetic studies providing a more detailed characterization of this region due to the density and resolution of the observations. 76 The 2001 M w 7.8 Kunlun (China) earthquake is a strike-slip event with ∼ 400km long surface rupture and a mean rupture speed between 3.3-3.9 km/s, which is larger than the shear wave speed and hence reported as a supershear rupture. 29,30 This earthquake was well recorded by satellite images, allowing us to investigate the spatial distribution of coseismic off-fault damage by focusing on the fault zone width. We use SPOT-1 to SPOT-4 images, covering the 2001 Kunlun fault area from 1988 to 2004 with a ground resolution of 10m, allowing for change detection of less than 1m. 75,83 We focus on the central part of the rupture dominated by fault parallel motion, 84,85 obtaining through optical image correlation the horizontal displacement field for the earthquake east-west and north-south components ( Figure S2).
The maximum strike-slip displacement for this area of the rupture is around 8 m. 84,85 Prior to correlation, SPOT images are corrected from viewing angle and topography using a unique  Figure S2 to see the data employed in this study). The MicMac method enables preserving the resolution of the input images, that is 10m, since it measures the displacement at every pixel location in the pre-earthquake image. The size of the pixel pattern considered in the pre-earthquake images, and that is searched for in the post-earthquake ones, is 5 pixels, so 50m.  Figure S3 for a profile example). This procedure allows us to evaluate the fault zone width in the strike-slip direction, over 396 profiles (Fig. 5b). Field measurements of fault zone widths are at least a few tens to a few hundred meters in the case of this earthquake. 84,85 Thus, we infer that the fault zone width can be estimated using displacement maps, employing satellite image correlation.  [29][30][31] The onset of slip-partitioning cannot cause a reduction in the damage zone width because the normal faulting motion was quite likely triggered by the dynamic stress field of the main supershear segment. 6 This means that the overall dynamics in that region was still dominated by the rupture on the strike-slip segment. With these observations, and considering the theoretical analysis and numerical modeling conducted in the previous sections, we interpret this localized fault zone width reduction as a shrinkage of the off-fault damage zone associated with the rupture transitioning to supershear speeds (e.g., Figure 2). Hence, this along-strike feature seen by the optical correlation analysis represents a natural observation of the supershear transition zone.

.2 Distribution of early aftershocks as a proxy for off-fault coseismic damage
Assuming that the nucleation of early aftershocks is mainly governed by the stress state left in the wake of the earthquake, we test whether the extent of coseismic off-fault damage in nature manifests in the spatial distribution of early off-fault aftershocks. The logic is that considering the state of stress in the wake of the earthquake, weakened regions will preferentially host early aftershocks, and thus we should observe a region with less events where the rupture transitioned to supershear. When re-analyzing the regions where established supershear ruptures transitioned from sub-Rayleigh to supershear speeds, we might find local minima in the spatial extent of early aftershocks. If this region coincides with kinematic observations of supershear transition, we would have an independent and more precise location of supershear transition. velocities of about 5.5km/s were reported, 32 and 1-year long catalog of aftershocks is available 88 (Figure   6b for 1-month aftershock locations). The 2013 M w 7.5 Craig earthquake (Alaska) was also inferred as a supershear event, with rupture velocities between 5.5 to 6.0km/s 36 and a 5-month aftershock catalog is available 89 (Figure 6c for 1-month aftershock locations).
We observe a paucity in aftershocks where the ruptures have transitioned to supershear, as highlighted by the pink boxes in Figures 7b, d, and f. We quantify such lack of aftershocks by computing the cumulative moment density released by aftershocks, considering those located within l a = 5km of the fault trace (see the Supplementary Information for the same analysis considering l a = 2.5km, Figure   S6). We assume that each aftershock, having a seismic moment M 0 , is a circular crack and compute the slip distribution δ(r) using: 90 where r is the crack radius, µ = 30GPa is the shear modulus, and ∆τ is the stress drop, assumed to be 3MPa. We bin the aftershock crack radius along-strike, based on the minimum aftershock magnitude for each case (Izmit M w =2.4, Denali M w =1.5, and Craig M w =1.7). Thus the binning sizes employed to discretize the main fault are 90m for Izmit, 32m for Denali, and 40m for Craig earthquake. With the slip distribution, the cumulative seismic moment density of the i th bin containing N aftershocks is given by Note that δ (i) j is part of the slip distribution of the j th aftershock projected on the i th fault segment. We also note that we do not compute the full moment density tensor as we do not have focal mechanisms for all the aftershocks. As we are interested in density relative spatial variation (along the strike of the main fault) of this quantity, the above approximation is adequate. We compute the seismic moment density over two periods following the mainshock: 1 and 3 weeks (Figures 7a, c, and e). Focusing on the region where the rupture is expected to transition to the supershear regime (based on kinematic models), we systematically observe a small area characterized by a reduced seismic moment density and lack of aftershocks (Figures 7b, d, and f, pink boxes). The extent of these regions is different for each earthquake c. f., 3.6km for Izmit, 6km for Denali, and 3.2km for Craig. The difference in the smoothness of the cumulative moment curves for different earthquakes is due to the number of aftershocks available for the calculation. In the Craig earthquake case, we have 68 (83)   In order to evaluate the robustness of our findings, we perform an uncertainty analysis of the seismic moment density estimation by accounting for errors in the location of aftershocks in each catalog. We employ a multivariate normal distribution {x} = N 2 (µ a , σ a ), assuming that the location uncertainty of aftershocks is normally distributed. Here, µ a corresponds to the aftershock epicenter, while σ a is the covariance matrix of the location. This procedure allows us to generate, randomly, 10000 synthetic catalogs and repeat the along-strike moment density evaluation with time for each of the 10000 catalogs, deriving the mean and the standard deviation (1-σ and 10-σ) of the spatio-temporal evolution of seismic moment density (red areas in Figure 8 for 1-week, denoting the mean of the seismic moment density on the main fault ± the standard deviation). Such analysis suggests that despite potential changes in the spatio-temporal distribution of seismic moment density, our conclusions are robust (see the Supplementary Information for a comparison between 1 and 3-week using 1-σ and 10-σ, Figures S7, S8, and S9).
We consider 3-week as the maximum reasonable period to investigate the evolution of the seismic moment for these earthquakes to avoid potential effects of postseismic deformation. Afterslip is proposed as a mechanism driving the aftershock triggering (e.g., Hsu et al. 91 and Ross et al. 92 ). For each studied event, the inferred transitional region co-locates with areas where some authors report afterslip occurrence. [93][94][95] Therefore, the observed gap in the early aftershocks productivity (less than 3-week after the mainshock) is mainly related to the mainshock rupture. It is worth noting that results for the M w 7.8 Kunlun earthquake reported by Robinson et al., 30 employing Harvard CMT solutions, also allude to the same conclusion. However, due to the lack of high spatio-temporal density of aftershocks in their catalog, we are more confident in our results using optical correlation techniques to characterize the transition zone.
We note that in the distribution of moment plotted for the entire fault profile (Figure 7), local gaps in aftershock density exist and do not necessarily indicate a supershear transition. We hypothesize that such gaps could also be related to abrupt local changes in rupture speeds, off-fault medium strength and fault geometrical complexities which warrants further investigation.
To conclude, we show that the reduction in the spatial extent of early aftershocks (≤ 3-week after the mainshock), and the cumulative seismic moment density around the fault (≤ 5km on either side of the fault) is associated with the transition to supershear speed. This type of analysis also helps us identify, more precisely than kinematic models, the region where supershear transition occurs.

Conclusion and Discussion
Using theoretical arguments and numerical models, that account for coseismic off-fault damage, we have shown that supershear transition is characterized by a significant reduction in the width of the off-fault damage zone. This feature is due to a Lorentz-like contraction of the spatial extent of the stress field around a rupture tip. We then cross-validated this phenomenon with natural observations of coseismic off-fault damage zone width using image correlation and analysis of aftershock catalogs. We confirm that supershear transition is indeed characterized by a significant reduction in the width of the off-fault Our results are in general agreement with the published kinematic models for the Izmit, 28,96 Denali, 32,33 Craig, 36 and Kunlun 29-31 earthquakes, relative to the location where the rupture accelerates to supershear speeds. Our inferred location of the transition zone can sometimes be different from those found in kinematic inversions but only by a few kilometers.
While the theoretical and numerical models are still idealized (modeled on a 1D planar fault with uniform traction and frictional strength), the reduction in the width of the coseismic off-fault damage zone, related to supershear transition, is observed for natural earthquakes (Figures 5 and 7). This approach could thus be used to refine and precisely define the zone of supershear transition inferred from coarser kinematic models. Moreover, as the zone of supershear transition is now narrowed down to a few kilometers, it could also be used to guide future fieldwork to study the segment of the fault where such a transition occurred. A better understanding of the physical conditions for supershear transition might help us foresee the location of future supershear transition, although, as of now, this remains a difficult task.
The lack of aftershocks on the segment experiencing supershear rupture has been debated in the past.
Bouchon & Karabulut, 41 for instance, already concluded that the entire supershear segment collocated with a region of aftershock quiescence. To explain this, they claimed that the stress and strength conditions were more homogeneous on the supershear segments, 41 impeding large ground accelerations near the fault. 28,32 Our work, in contrast, focuses on characterizing the region where the rupture transitions to supershear speed, before the said supershear segment.
The results of this study are valid for a well-developed sub-Rayleigh rupture that transitions to supershear speeds. However, the 2018 M w 7.5 Palu (Indonesia) earthquake might have either nucleated directly at a supershear speed, or transitioned very early. 39,40,97,98 The model we develop here might provide insights to understand whether an earthquake can nucleate and propagate directly at supershear speeds, and if so, why that would be the case.

Data Access
All the catalogs and numerical simulation results employed in this work have been obtained from pub-         Figure S4).    but considering a 10-σ standard deviation on the calculation.