The scaling structure of the global road network

Because of increasing global urbanization and its immediate consequences, including changes in patterns of food demand, circulation and land use, the next century will witness a major increase in the extent of paved roads built worldwide. To model the effects of this increase, it is crucial to understand whether possible self-organized patterns are inherent in the global road network structure. Here, we use the largest updated database comprising all major roads on the Earth, together with global urban and cropland inventories, to suggest that road length distributions within croplands are indistinguishable from urban ones, once rescaled to account for the difference in mean road length. Such similarity extends to road length distributions within urban or agricultural domains of a given area. We find two distinct regimes for the scaling of the mean road length with the associated area, holding in general at small and at large values of the latter. In suitably large urban and cropland domains, we find that mean and total road lengths increase linearly with their domain area, differently from earlier suggestions. Scaling regimes suggest that simple and universal mechanisms regulate urban and cropland road expansion at the global scale. As such, our findings bear implications for global road infrastructure growth based on land-use change and for planning policies sustaining urban expansions.


I. INTRODUCTION
Modern civilizations developed along with road networks, simple and efficient systems designed to colonize free land, improve human mobility and move goods among locations.Today, the road system grooves and fragments the 19 million hectare surface of the Earth with more than 14 million km of paved surface.At the backbone of human colonization, road expansion embodies a complex blend of economic growth and often unsustainable development [1,2].From an economic perspective, road expansion has typically been associated with economic growth, poverty reduction [3] and urbanization processes [4].However, roads cutting through ecosystems may cause severe environmental degradation, like habitat fragmentation and biodiversity loss, facilitating urban sprawl and efficient deforestation [5,6].Pressure from global population growth with the resulting increase in food demand [7,8] and from the ongoing global urban transition [9,10] will foster a massive road expansion in the upcoming decades.It has been estimated that the total paved length will increase by an additional 25 million km by 2050 [5].Controlling such an expansion will be of crucial importance for global environmental conservation strategies and sustainable agricultural development.Yet despite the fundamental role of road expansion for global human-environment relations and some attempts made to reconcile their double-edged consequences [5], a quantitative and exhaustive description of the structure of the global road network necessary sufficient to model this expansion is currently missing.
Statistical laws governing road networks have been extensively explored [11,12].Such studies investigated issues such as scaling [13][14][15][16], road centrality [17], evolution [4,[18][19][20] and urban sprawl [21], within the general domain of complex networks analysis [22].However, such efforts have focused on urban road networks, neglecting connections among nodes serving areas dedicated to non-urban land uses.Here, by coupling a detailed data set on the Global Road Network (GRN) with global land-use inventories, we provide an analysis of the structure of the network of major roads as of year 2015 and examine its dependence on land use.Such analysis is carried out by studying the distributions of road lengths and their scaling relationships.

RESULTS
The Gobal Road Network (GRN) has been extracted from commercial vectorial maps of major roads on Earth that are used mostly for navigation and cartography [see Materials and Methods (MM)].The GRN contains the major roads network for the year 2015 organized in four hierarchies: primary roads with limited access (H1), primary roads with non-limited access (H2), secondary roads (H3) and local roads (H4).The GRN does not contain local urban roads but only the major ones; therefore, the GRN can be used to analyze the major road infrastructure but not the urban morphology related to urban block size and form.The total road length deduced from the GRN database (14,522,470 km in 2015) is much larger than that estimated from a data set recently used for global road environmental impact estimation (gROADS) [5,23], which accounts for a total road length of 7,644,410 km.The gROADS data set comprises only hierarchies H1, H2 and H3 for the year 2009.Comparing gROADS with the correspondent subset of GRN, one can estimate an annual growth of 5.7 % between 2009 and 2015.However, by considering all GRN road hierarchies (H1 ∪ H2 ∪ H3 ∪ H4), up to 30% of the total road length at the global scale was not represented in previous analyses.
Starting from the original database, we produced the GRN as a primal road network [24], in which nodes are the road junctions and links are road segments and each link carries a weight indicating its length (l).In order to extract the final GRN network, we removed street junctions connecting only two roads and split each link, regardless of hierarchy, at any intersection with three or more roads.Defining three layers of major land uses-urban, cropland, and seminatural-we labeled each road with the land use it belongs to.These three land-use classes, corresponding to the three main land uses on Earth, are extensively used in the global land-use literature [25].We extracted the global urban footprint for 2013 from night-time lights (NTL) [26] and cropland from a recent global cropland inventory for 2005 [27].We then assigned each road to three mutually exclusive land-use classes as follows: roads have been labeled as urban (U ) if they totally belong to urban areas, cropland (C) if they totally or partially overlap a cropland area, and seminatural (Sn) if they are free of any agricultural or urban land use (that is, they are not dominated by direct human presence, e.g.roads crossing remote areas or natural parks).Figure 1A shows a global overview of the GRN and the three classes.It is important to note that the adopted labeling methodology allows us to avoid multiple land-use associations and treat croplands differently, as a single road segment can connect more than one cropland unit.A detailed description of the data set along with an atlas of detailed visualizations and spatial analysis methodologies are provided in MM and Supplementary Materials (SM).Key to our spatial inventory at the global scale [8,25], we show that the threshold used to discriminate land-use classes, such as illumination threshold, does not affect our main results (see SM). Moreover considering that cropland had very little growth in the last ten years [8], data from different years cannot effect the main results.
We here focused on a seemingly simple yet fundamental road network feature: the lengths (l) of road segments [28,29].The GRN in the year 2015 spanned a total length of 14,522,470 km divided into more than 3 million road segments, with mean road (segment) length independent of land use l = 4.8 km [with standard deviation (SD) 8.9 km].
The U , C and Sn classes cover respectively 12%, 37% and 51% of the GRN by road length as shown in Fig. 1B.These percentages depend mildly on the threshold levels used to discriminate between classes (see MM and SM).The mean road lengths in the different land-use classes are l U = 1.2 km (SD 1.3 km), l C = 7.4 km (SD 9.0 km) and l Sn = 7.0 km (SD 12.0 km).The large standard deviation values highlight an important feature of road length distributions: these distributions are heavy-tailed and potentially reminiscent of power-laws [30], and thus are not grouped tightly around a typical value.The mean road length in different land-use classes highlights a second feature of these distributions: C and Sn roads are generally longer than U roads.Indeed, the distribution of U road lengths is very different from that of C and Sn segment lengths.Expectedly, cities encompass shorter streets than agricultural or seminatural areas.But apart from these different mean lengths, how fundamentally different are these distributions?Are they possibly rescaled versions of the same universal distribution, obtained by varying only the mean road length?To test this hypothesis, we examined whether road length distributions can be described by a finite-size scaling form [31][32][33]: where l is the mean road length in the land-use class of interest and the function F (•) is identical across classes.Normalization of the distribution requires the exponents γ and α must satisfy α = 1/(2 − γ) [33].Furthermore, F (x) must satisfy appropriate limiting behaviors as x → 0 and x → ∞ (see [33] for details).We verified our hypothesis (Eq. 1) on the functional form of p(l) by data collapse [34], i.e. by plotting l γ p(l) versus l/ l α for each land-use class separately and varying α until the curves describing each class collapse onto the same curve, thus providing a plot of the scaling function F .Fig. 1C shows the plot of p(l) for each land use and Fig. 1D shows the resulting plot with α = 1 and γ = 1 (see MM for details on probability distribution collapse).We find that road length distributions associated with urban and cropland classes collapse onto nearly the same curve, whereas the road length distribution associated with seminatural areas fails to do so.Different definitions of urban boundaries and different methods for the classification of roads crossing land-use boundaries led to indistinguishable results (see SM).Therefore, road length distributions in urban and cropland areas share the same fundamental structure at the global scale, despite large differences in mean road length.Eq. 1 describes the ensemble distributions of road lengths obtained by grouping roads belonging to urban and cropland patches of different extents.Following previous approaches [35,36], we fragmented urban land use into separate components by means of spatial contiguity of urbanized and cropland cells, extracting all urban and cropland patches on Earth (see MM).As in our global analysis, we labeled each road with the land use, as well as the area of the patch it belongs too.We then study the road length distribution as a function of patch area for urban and cropland patches (Fig. 2A,D) finding different scaling behaviors in urban patches compared to cropland ones as explained below.

Cropland
We found that the mean road length l|A U within an urban patch of area A increases sub-linearly with A, l|A U ∝ A δ with δ = 0.41 ± 0.02 (mean ± standard error estimated via least-squares fit of log-transformed data) for areas below A U T 4 • 10 7 m 2 , above which it remains relatively constant (Fig. 2B, inset).The distribution p U (l|A) of urban road lengths conditional on the urban patch area A (Fig. 2B) appears to be well described by the scaling form: where G U is a scaling function with suitable properties [33], as verified by data collapse (Fig. 2A, B).For each patch, we computed the total road length L and found that mean total road length in urban patches increases sub-linearly with A, L U ∝ A β with β = 0.62 ± 0.01 (again, via least-squares fit of log-transformed data) below A U T , above which it becomes effectively linear (β = 1.06 ± 0.02) (Fig. 2C).Since l|A U is relatively constant above A U T , the linear scaling of L U implies that the mean total number of roads increases linearly with A above A U T .Road length statistics in cropland patches also display multiple scaling regimes, although with different behavior compared to urban areas.Specifically, the mean road length l|A C associated with a cropland patch of area A displays a triphasic behavior (Fig. 2E, inset).Initially, l|A C increases until A 10 5 m 2 , although very few crop patches are found below this scale and we will thus neglect these data in our discussion.Then, l|A C decreases until A C T 10 9 m 2 and remains relatively constant above A C T .The distribution p C (l|A) of cropland road lengths conditional on the cropland patch area A (Fig. 2D) appears to be well described by the scaling form: where G C is a scaling function with suitable properties [33], as demonstrated by data collapse in Fig. 2D, E. Analogous to the distribution of urban road lengths, the distribution p C (l|A) is invariant for all cropland areas larger than A C T .Conversely, the mean total road length in cropland patches (Fig. 2F) is approximately constant below A C T and increases linearly above this threshold (exponent β = 0.95 ± 0.05, estimated via least-squares fit of log-transformed data), implying that the mean total number of cropland roads also increases linearly with A above A C T .We thus found that the average road length versus cropland patch area is well described by < l|A > C with A C T = (7.4± 0.5) • 10 7 m 2 .The above results lead to a natural question: are the scaling functions G C and G U related? Superimposing the two scaling functions (Fig. 2B and Fig. 2E, as shown in Fig. 3A) suggests that G C G U , such that Eqs. 2 and 3 approximately coincide, although with different dependencies of l|A U and l|A C on A.
Scaling structure of road network can be also investigated by observing the recursive fragmentation dependent on the hierarchy of roads.The GRN subdivides the surface of the Earth into non-overlapping regions called faces (roads networks are planar graphs consisting of a series of land cells suitably surrounded by road segments [4]).We defined ensembles of faces for each road hierarchy (H1 − H4) and investigated the probabilistic relationship between the size of any face and the length of the roads within it at each scale of observation.For example, by considering only the coarse-grained view provided by highway faces alone (H1), it is possible to extract a series of large faces and study how they are fragmented at the smaller scale by major roads (H2).Then, similarly, we can study how the faces of major roads (H2) are fragmented by secondary roads (H3) and so on.Fig. ?? gives a global overview of the hierarchical organization of GRN and Fig. ?? illustrates the procedure of nested fragmentation.Formally, once all road segments are labeled by their hierarchical class, E H(i) , a protocol has been set up to assign each of them to a face of area A H(i−1) , on which we speculate the length distribution of such road segments is conditional.This is equivalent to a coarse-graining procedure.We binned the areas of all faces (i.e., at each scale of observation) into ten log-binned intervals (A (k) , k = 1, . . ., 10) to account for variable numerosity of the samples [30].We collected the road lengths {l} A (k) (k = 1, . . ., 10) belonging to the faces whose areas are included in the k th face area bin (irrespective of the hierarchical class of these roads) and computed the relative proportion p(l|A) = p(l|A (k) ) that measures the probability to find a road of a given length in the area bin A (k) .We then tested whether the curves l γ p(l|A) plotted against l/ l|A α (where l|A is the average road length in the set {l} k ) for each of the area bins collapse onto the same curve, i.e., whether p(l|A) displays finite-size scaling (Fig. 4D,E).By using the full data set, a satisfactory collapse has been found for α = 1.1 and γ = 1.1 as shown in Fig. 4 D,E.Such collapse indicates a universal scaling curve that regulates the fragmentation of the road network at all scales of observation.and 2E, demonstrating that the scaling functions GU and GC coincide, further confirming the universality of road length distributions in different land-use classes.(B) According to our approximation (Materials and Methods), the ensemble distribution of urban road lengths (dashed red curve here and in Fig. 1C) coincides with the distribution of urban road lengths belonging to urban patches larger than A > 10 8 m 2 (red solid curve).The same approximation also holds for the distribution of cropland road lengths, but the tail of the ensemble distribution of cropland road lengths (green dashed line) is 'fatter' than the distribution of cropland road lengths associated with cropland areas larger than A > 10 9 m 2 (solid green line), leading to a slight deviation in the collapse of the tails of the ensemble distributions visible in Fig. 1D.

DISCUSSION
Our study provides evidence for general statistical laws describing global road lengths conditional on land use.At the global scale, urban and cropland roads share a universal distribution after rescaling that satisfies (Eq. 1 with α = γ = 1): where F (•) is independent of the land-use class, with land use instead impacting the mean road length l .At finer scales, we investigated the dependence of road length distributions on the area of the homogeneous patch under study, separately for each land-use class.We found that road length statistics conditional on patch area closely match a universal scaling law of the form: where G(•) is independent of the land-use class (i.e.G = G U = G C ).That is, land use affects p(l|A) solely through the scaling of the mean road length l|A with patch area A.
Eq. 5 has interesting implications.Because l|A U and l|A C are approximately constant above the respective thresholds, A U T and A C T , the distributions p U (l|A) and p C (l|A) are invariant for all cities and cropland patches larger than such a threshold.In particular, because A U T 4 • 10 7 m 2 is the area of a relatively small town (∼ 3.6 km radius), our results suggest that the distribution of urban road lengths is virtually independent of the urban patch size for nearly all cities.Moreover, the finite-size scaling functions F and G are quite similar in shape and nearly coincide if one neglects small urban and cropland patches, i.e. below the critical sizes A U T and A C T (Materials and Methods).The observation of nearly universal probability distributions of urban and cropland road lengths suggests that the processes that govern road expansion are to a large extent inevitable, regardless of climate, topography and social and economic constraints, echoing general results on self-organized patterns [37,38].Properties arising from physical constraints imposed by the planarity of the road network may also be at play [19].
These results suggest that local conditions, such as the socio-economic development or the demography of a specific region, may simply accelerate or delay the development of road infrastructure, affecting the characteristic length scale of the road length distribution but not its scaling form.Significantly, we find that road length distributions belonging to seminatural areas do not collapse onto the same distribution, even after rescaling.We argue that such diversity stems from the diverse purpose of roads in natural areas which are not built specifically for direct access to the land and are therefore regulated by diverse, site-specific processes and possibly far from optimized.The linkage of optimality and self-organization is known to induce a variety of scaling phenomena, as shown by studies on network structures derived from selection principles [18,[39][40][41][42], including small-world constructs [43], fluvial trees [44][45][46] and the topology of the fittest networks [47].At this stage, however, attributing the departure of the scaling features of roads serving seminatural areas to transient or non-optimized processes would be premature.
Urban scaling approaches have suggested that larger cities require less infrastructure than bigger ones.Indeed, if the total road length (L) scales sub-linearly with the total population of a city (P ) within their boundary, i.e.L ∝ P η with 0.7 < η < 0.9 [48,49], such scaling would be directly related to the management of mega-cities [50] which would potentially be more efficient than small towns.Our results puts such scaling results in a different perspective.According to the linear scaling shown in Fig. 2C, an urban connected component of 10 9 m 2 and ten components of 10 8 m 2 together require approximately the same total length of major roads.This holds for all urban patches bigger than A U T 4 • 10 7 m 2 , thus including all major cities.This result is consistent with other recent studies performed for a set of cities in Europe and US [51] and for the entire UK [52], showing that the actual sizes of urban patches scale linearly with the total lengths of major roads within them.This then implies that the proposed sub-linear scaling between length and population is due to a super-linear scaling between population density and city size, and that road network lengths are not a good approximation of urban population density.However, more analysis is necessary to prove such speculation, because the suggestion of sub-linearity in the scaling relationship between L and population may have been an artifact of census-based data, which are built regardless of land use within the political boundary of a given region.Therefore, it seems plausible that roads from different land-use classes, with correspondingly different characteristic lengths, may have been mixed.Moreover, census areas are defined differently among various countries, thus precluding cross-country comparative analyses.Indeed, it has been shown that different methods to define city boundaries, based on tuning urban population density, could drastically affect the scaling of urban measurements [52][53][54].We used here a definition of urban patches only based of contiguity of urban land, therefore the proposed results must be interpreted on the basis of used data sets and adopted methodology.
Although the above implications are in need of deeper scrutiny, which should include scaling analyses of different urban measurables (e.g., energy and material flows sensu [50] or other infrastructures), our findings can be directly used as an urban planning tool aimed to estimate road infrastructure requirements.For example, the analysis of global urban evolution, carried out by using 16 night-light layers (see SM), reveals that the total number of urban patches is proportional to A Utot and thus also to L Utot .Therefore, our ansatz suggests, for example, that a 10% increase of global urban surface would imply a parallel increase in total urban road length of 10% (see SM for details).
As it has been estimated that by 2030 the total urban area, A Utot , could potentially grow by 200% [9,10], such urban expansion would entail a total length L tot of 3, 485, 400 km of new paved urban roads.
Our approach also suggests that cropland patches smaller than A C T = 7 • 10 7 m 2 require a road investment per unit area larger than cropland patches above such threshold.That said, of course many other issues must be considered before suggesting a development policy for agricultural land based on these observations.Most importantly, agricultural development policies must account for the preservation of ecological corridors and minimize the fragmentation of wildlands.Relatedly, because the total road length in cropland patches scales linearly with the area, large cropland patches (say, of area A) require the same investment in road paving as n smaller cropland patches of area A/n, provided that A/n > A C T .From this perspective, larger cropland patches are predicted to be as efficient in term of road infrastructure requirements as smaller ones (within the linear regime highlighted in Fig. 2D).In contrast, cropland areas smaller than A C T require a relatively greater length of roads (Fig. 2E and 2F).Such cropland patches are typically composed of several small and scattered cropland units located in wild or mountainous areas.A typical example is given by scattered agricultural plots in forest areas (see SM Fig 3A,B); in such configurations, long roads serve tiny cropland units indicating potentially higher environmental impacts of sparse agricultural expansions.Meanwhile, cropland patches around urban areas are generally more compact and more fragmented, in the linear regime.Because cropland roads can connect more than one cropland patch, however, the proportion N C ∝ A Ctot ∝ L Ctot would overestimate L Ctot .Moreover, by considering the competition among cropland patches and seminatural areas and the scarcity of residual free land to colonize [8], future cropland roads might merely partially substitute for the existing seminatural roads, thus partially conserving the existing extent of road length, L tot .Such a hypothesis is corroborated by the similarity of the distributions p(l) C and p(l) Sn shown in Fig. 1C, which implies a similar level of fragmentation of seminatural areas.An important distinction arises therefore between urban and cropland areas in that the site-specific degree of re-use of existing infrastructure in cropland areas prevents the same kind of strict predictability allowed for urban areas.

DATA PREPARATION AND DATA FUSION
The general idea here is to transfer the land-use information, represented by continuous values, onto the road network, which is represented by lines vectorial geometry.In order to do so, a series of spatial analysis operations have been performed, mainly in ArchMap, Qgis and Python environments.An illustration of the data preparation schema is presented in SM.Data preparation includes the extraction of the urban footprints, a preparation of the road network and a join phase between land-uses and the road network.Below we report details of each phase, in SM we report detailed visualization of the final road network.

Urban footprints extraction
Urban footprint data have been extracted from inter-calibrated night-time light (DMSP-OLS) remote imagery using original images and data processing by NOAA's National Geophysical Data Center and DMSP data collected by the US Air Force Weather Agency (https://www.ngdc.noaa.gov).The NTL data contains continuous values from D = 1 to D = 63representing the intensity of stable light in a grid of 30 arc seconds (approximately 1 km).DMSP-OLS have been widely used to characterize urban footprint and urban evolution at the global scale [23,26,55,56].After removal of gas flaring, the Jenks cluster algorithm has been applied to extract an urban settlement mask [57].The Jenks algorithm is an unsupervised classification method which imposes the number of clusters and is widely used in geographical analysis to cluster continuous surface data sets in separated areas.Starting from a set of randomly selected values, the Jenks algorithm works to minimize the variance inside classes while maximizing variance between classes.We thus classified the entire NTL into three classes and took the most illuminated class to be the urban footprint, considering a pixel to be classified as urban if D > 28.We tested other classification methods as well as different numbers of classes; but given the sharp separation between highly and poorly illuminated places, different values of D did not affect the main results of the scaling analysis (see SM).Using the same procedure, we classified urban footprints from 1997 to 2012.

Cropland
The 1 km global IIASA-IFPRI [27] cropland likelihood map has been developed by integrating a number of individual cropland maps at global to regional to national scales.The individual map products include existing global land cover maps such as GlobCover 2005 and MODIS v.5, regional maps such as AFRICOVER, and national maps from mapping agencies and other organizations.IIASA-IFPRI is a public data set that can be downloaded from the Geowiky platform (http://cropland.geo-wiki.org/downloads/).Detailed visulizations of the cropland mask are provided in SM.

Global road network preparation and land-use labeling method
Geo-located vectorial road data are usually produced for GPS navigation and cartography.Our data set was produced by and acquired from DeLorme (GARMIN, Yarmouth, ME, USA).This data represents a complete and updated data set of primary and secondary roads at the global scale.This data has commercial restrictions; additional information and request for data samples can be addressed to the contact details at https://developer.garmin.com/datasets/digital-atlas/ .The topology of the road network has been corrected using Archmap software and ad hoc Python scripts for the purpose of joining connected roads at junctions with only two roads and to remove small road links representing highway ramps and cross roads intersections which are not representative of any fragmentation process and are potential noise for the statistical analysis.Roads shorter than 100 m cover only ≈ 0.03% of the total road length; these roads, as confirmed by an extensive and scrupulous inspection of the GLR dataset, appear to be highway ramps or road segments for large road junctions and were therefore excluded from our analyses.Coupling the GNR with two global land-use inventories, road have been classified into three categories: urban roads, i.e. roads that entirely belong to urban areas, crop-land roads: i.e. road that intersect or belong to crop-land areas, and semi-natural roads, i.e. road that are completely free of direct urban or cropland use (see SM for detailed visualization of the final classified road network).Data of road network are released under license agreement with DeLorme (GARMIN, Yarmouth, ME, USA).

Probability distribution collapse
Normalization of the distribution p(l) = l −γ F (l/ l α ), requires that the exponents γ and α satisfy α = 1/(2 − γ) [33].Furthermore, F (x) must satisfy appropriate limiting behaviors for x → 0 and x → ∞ (see [33] for details).We verified our hypothesis (Eq. 1) on the functional form of p(l) by data collapse [34], i.e. by plotting l γ p(l) versus l/ l α for each land-use class separately and varying α until the curves describing each class collapse onto the same curve, thus providing a plot of the scaling function F (•).We similarly verified our scaling ansätze (Eqs. 1, 2 and 3) on the scaling form of the road length distributions via data collapse, as routinely used in statistical physics and beyond [33,34].We used an objective method [34] to determine the scaling exponent that gives the best data collapse in terms of minimizing the area between the rescaled distributions (Fig. 1D, inset).

Relationship between scaling functions
The relationship between the scaling functions F and G is mediated by the distribution of patch areas p(A), via p(l) = p(A)p(l|A) dA.In general, the integral cannot be computed analytically; however, focusing for this calculation on urban roads and neglecting the sub-linear regime A < A U T , we find p(l) = p(A)p(l|A) dA = p(A) where we have used the independence of l|A on A that holds approximately for A > A U T .Eq. 6 coincides with Eq. 5 for A > A U T .Within this approximation, therefore, the scaling functions F and G coincide.The same result can be derived for croplands, neglecting patches with area A < A C T .Deviations between F and G are thus ascribable to urban and cropland patches in the sub-linear scaling regimes.The quality of the approximation can be tested by contrasting the ensemble distribution of urban road lengths (red dashed curves in Figs.1C and 3B) with the distribution of urban road lengths conditional on the urban area being A > 10 8 m 2 (solid red curve in Fig. 3B, corresponding to the curves from yellow to red in Fig. 2 panels B).Fig. 3B shows that the two distributions overlap almost completely, implying that Eq. 6 is an excellent approximation for urban patches.The analogous approximation for croplands is slightly less satisfactory, caused by the fact that the longest roads are found in croplands of smaller area (see the inset of Fig. 2F).This unintuitive result appears to be due to long roads fragmenting wild areas for agricultural purposes.Therefore, the tail of the ensemble distribution of cropland road lengths (green solid line in Fig. 1C and green dashed line in Fig. 3B) is "fatter" than the tail of the cropland road length distribution conditional on cropland area A > 10 9 m 2 (i.e., above the threshold area A C T , green solid line in Fig. 3B).

1 FIG. 1 .
FIG. 1. (A) Visualization of the Global Road Network (GRN), with different colors representing classification into different land uses: urban (red), cropland (green) and seminatural (turquoise).(B) Fraction of total road length composed by roads with length less than l for each land-use class.(C) Probability distributions p(l) of road length for different land uses.(D)Collapse of distributions obtained by plotting p(l)l γ vs l/ l α (Materials and Methods), where l is computed separately for each road class, for α = 1 and γ = 1.The inset shows an objective function which reaches its minimum at the exponent α that gives the best collapse[34] (see Materials and Methods).A slight deviation between the urban and cropland distribution tails is visible in panel D, caused by very long cropland roads that are associated with small cropland patches (see Materials and Methods).Roads associated with seminatural areas deviate visibly from the collapse of the curves.

1 FIG. 2 .
FIG. 2. Length distributions for urban (A-C) and cropland (D-F) roads conditional on patch areas.Panels A and D plot road length distributions conditional on various values of urban (A) and cropland (D) patch area A, divided into logarithmic bins (color-coded as indicated in the insets of panels B and E).Panels B and E show road length distributions rescaled according to Eqs. 2 and 3, respectively.The insets show the mean road lengths l|A U (C) and l|A C (F) as functions of A. Panels C and F represent distributions on double-logarithmic scales of total road length L in urban (C) and cropland (F) patches of different areas, considering all urban and cropland patches on Earth.Red lines and dots indicate the mean total road length as a function of patch areas.Colormaps display logarithmic counts of pathes in base 10.

1 FIG. 3 .
FIG. 3. (A)Superimposed rescaled urban (red scale) and cropland (green scale) data from Figs.2B and 2E, demonstrating that the scaling functions GU and GC coincide, further confirming the universality of road length distributions in different land-use classes.(B) According to our approximation (Materials and Methods), the ensemble distribution of urban road lengths (dashed red curve here and in Fig.1C) coincides with the distribution of urban road lengths belonging to urban patches larger than A > 10 8 m 2 (red solid curve).The same approximation also holds for the distribution of cropland road lengths, but the tail of the ensemble distribution of cropland road lengths (green dashed line) is 'fatter' than the distribution of cropland road lengths associated with cropland areas larger than A > 10 9 m 2 (solid green line), leading to a slight deviation in the collapse of the tails of the ensemble distributions visible in Fig.1D.

1 FIG. 4 .
FIG. 4. (A)An overview of the hierarchical organization of the GRN.(B) A detailed view of the Indian road network, where each color, from red to green, represents the proper hierarchy from H1 to H4. (C) A sketch illustrating the process of hierarchical fragmentation by which, starting from H1, each face is fragmented by the link of the lower hierarchy.(D) Probability distributions p(l) of the link belonging to a face in the range k. (E) Collapse of road length distributions, with the best collapse found for p(l)l 1.1 vs l/ l 1.1 .