Taylor’s power law and its decomposition in urban facilities

As one of the few generalities in ecology, Taylor’s power law admits a power function relationship V = aM^b between the variance V and mean number M of organisms in a quadrat. We examine the spatial distribution data of seven urban service facilities in 37 major cities in China, and find that Taylor’s Law is validated among all types of facilities. Moreover, Taylor’s Law is robust if we shift the observation window or vary the size of the quadrats. The exponent b increases linearly with the logarithm of the quadrat size, i.e. b(s) = b_0 + A log(s). Furthermore, the ANOVA test indicates that b takes distinct values for different facilities in different cities. We decompose b into two different factors, a city-specific factor (CSF) and a facility-specific factor (FSF). Variations in b can be explained to a large extent by the differences between cities and types of facilities. Facilities are more evenly distributed in larger and more developed cities. Competitive interchangeable facilities (e.g. pharmacies), with larger FSFs and smaller bs, are less aggregated than complementary services (e.g. restaurants).


Introduction
Bettencourt & West [1] state: '... cities are remarkably robust: success, once achieved, is sustained for several decades or longer, thereby setting a city on a long run of creativity and prosperity' (p. 913). Urban facilities such as convenience stores, restaurants and schools play critical roles in setting a city on a long run of creativity and prosperity by providing services to meet citizens' basic needs. The spatial distributions of facilities show rather different patterns on the map, e.g. restaurants aggregate into clusters, while pharmacies are dispersed across the whole city. The location of facilities can be easily affected by a city's geographical landscape, the nature of the business, local consumer characteristics, etc. It is a complex task to understand the commercial logic behind the locations of urban facilities. Nevertheless, some statistical regularities can be found if we apply spatial statistical approaches to the location data and ignore other details. Pablo [2] uses a network approach to reveal many important facts about the commercial organization of retail trade based on location data alone.
Urban facilities resemble organisms of a species in an area, in the sense that facilities of the same type in a city may help each other survive while, at the same time, competing for various resources. As a fundamental law in ecology, Taylor's Law characterizes the fluctuation of the organisms' spatial distribution. Based on spatial data of urban facilities obtained through the API of the Baidu digital map (footnote 1), we are able to test whether Taylor's Law is applicable to urban facilities, and then test whether the exponents of Taylor's Law can be used to quantify the characteristics of the spatial distribution of different facilities by removing the city factor from the total exponent, which embeds both city and facility factors.
Taylor [3] describes a power function relationship V = aM^b between the between-sample variance in density V and the overall mean density M of a sample of organisms in an area. The exponent b describes the heterogeneity of the spatial or temporal distribution. For example, b = 1 corresponds to a Poisson distribution, while b > 1 indicates the clumping of organisms.
Besides the distribution of organisms, Taylor's power law has found application to seemingly unrelated phenomena like human sexual pairing [4], human haematogenous cancer metastases, the clustering of childhood leukaemia [5], measles epidemics [6] and gene structures [7]. Taylor's Law has recently been increasingly applied in social systems modelling, e.g. in [8] Taylor's Law is used for the analysis of human spatial behaviour. Given such broad applicability of Taylor's Law in many seemingly mysterious and complicated natural processes, one might ask whether there is any general principle at the basis of all these processes (footnote 2). A large body of literature has been devoted to this question, and many theoretical models have been introduced to explain Taylor's Law. For instance, Andersen et al. [9] propose a Markovian population model and justify this model through simulations, which show that as the average population density increases, the variance to mean relationship approaches a power function with a maximum exponent value of 2. Fronczak & Fronczak [10] suggest that Taylor's Law results from the second law of thermodynamics and the behaviour of the density of states. Kendal & Jørgensen [11] further notice that the cumulant generating function derived by Fronczak & Fronczak [10] is known as the Tweedie distribution and therefore suggest that Taylor's power law results from central-limit-theorem-like convergence.
Despite those theoretical attempts to explain the origin of the power law, no agreement has been reached on the meaning of the exponent b. Empirical evidence is clearly needed on how b varies among different species at the same location, and similarly on how b varies for the same species across different locations. In this respect, our study might shed light on the meaning of b by exploring the potential mean-variance relationship of Taylor's Law in the spatial distribution of urban facilities. We will measure b for various urban facilities in many cities. It is understood that the measurement of scaling laws is subject to inaccurate estimations [12,13]. The advantage of our approach is that we can measure b for many cities and various urban facilities. One measurement of b for one urban facility in a city may be uncertain. The combined statistical test, which uses many bs for different combinations of facilities and cities, can yield reasonably robust results.
Our statistical test shows that the values of b differ from one city to another; they also differ from one facility to another. Both a city-specific factor (CSF) and a facility-specific factor (FSF) are embedded in b. In order to study these two key factors contributing to the difference among bs and explore the mechanism underlying the distribution of urban facilities, we decompose the inverse of exponent b by examining the contributions of the two factors to the numbers of facilities located in a study region. The CSF plus the FSF accounts for a remarkably high level of over 92% of 1/b, even though there are inevitably estimation errors in b. The meaning of the CSF and the FSF is also discussed in the paper. The CSF reflects the overall density of all the facilities in a city, as well as the aggregation level of all types of facilities put together. The FSF indicates the overall aggregation level of each type of facility in all the cities. For example, the four first-tier cities rank in the top four in the CSF values due to the high overall facility density in these cities, while restaurants top the overall aggregation level due to their complementary nature. These findings are consistent with our intuitive understanding of these cities and urban facilities.
Footnote 1: Baidu is the largest search engine and digital map service provider in China, comparable to Google.
Footnote 2: For a more detailed and in-depth review of the relevant literature about Taylor's Law and the explanations offered, refer to [7].
royalsocietypublishing.org/journal/rsos R. Soc. open sci. 6: 180770
As suggested by Leitão et al. [14], city-specific observations scale nonlinearly with population. We therefore also check the CSF for cities of different population sizes, and find that larger and more developed cities tend to have smaller bs. This means that in larger and more developed cities, not only are more service facilities available to citizens, but the facilities are also more evenly distributed, providing more convenient services.
Serving as areas for the concentration of human activities, cities are considered to be the principal engines of innovation and economic growth [15,16]. Today, more than half of the world population lives in cities. The developed world is now about 80% urbanized and the entire planet will follow this pattern by around 2050, with some two billion people moving to cities, especially in China, India, Africa and Southeast Asia (footnote 3). Countries around the world are experiencing a rapid urbanization process, which presents an urgent challenge: developing predictive and quantitative theories and methods that provide the necessary technical support for urban organization and sustainable development.
As consumers of energy and resources and producers of artefacts, information and waste, cities have often been compared with biological entities and ecosystems [17-19]. Bettencourt et al. [17] show that there are very general and non-trivial quantitative regularities of social activities common to all cities across urban systems, and many diverse properties of cities, such as patent production, personal income and crime, are shown to be power law functions of population size. Besides, the size distribution of cities fits a power function (known as Zipf's Law): the number of cities with populations greater than S is proportional to 1/S [20]. Geometrically, the complex spatial structure of cities has an apparent fractal nature associated with individual cities and entire urban systems [21]. By exploring possible consequences of the scaling relations through derived growth equations, Bettencourt & West [1] quantify the dramatic difference between growth fuelled by innovation and growth driven by economies of scale, suggesting that as population grows, major innovation cycles must be generated at a continually accelerating rate, so as to sustain growth and avoid stagnation or collapse. Bettencourt & West [1] state that cities should be treated as complex dynamic systems, capable of aggregating and manifesting human cognitive ability, leading to open socioeconomic development.
The contribution of this paper mainly lies in the following aspects. This is the first paper revealing Taylor's Law in the spatial distribution of urban facilities. Furthermore, we discuss the size effect of quadrats and discover that the exponent b increases linearly with the logarithm of the size of the quadrat, i.e. b(s) = b_0 + A log(s), which has not been documented in the previous literature. Moreover, we decompose the inverse of exponent b to examine two different factors contributing to the numbers of facilities in a study region (we call it a quadrat) within a city, and find that both the CSF and the FSF have their own concrete and specific implications. This paper proceeds as follows. Section 2 introduces the source of the data and data processing method. Sections 3 and 4 present the empirical results of Taylor's Law and size effects of quadrats. Section 5 discusses the size effect of quadrats. Section 6 decomposes the factors affecting the number of a facility in the quadrats within a city. Section 7 discusses the results and concludes.

Data preparation
Baidu provides a programmable interface to its digital map. By calling the interface, we collect the spatial coordinates of seven types of service facilities in the city area and adjacent counties of 37 major cities in China. The seven types of facilities are: beauty salons, banks, stadiums, schools, pharmacies, convenience stores and restaurants (footnote 4). The exact meaning of these facilities is described in table 1. The 37 major cities consist of four direct-controlled municipalities (Beijing, Shanghai, Chongqing and Tianjin), 30 provincial capitals and sub-provincial cities and three other large cities with a high ranking in GDP output. These cities are the largest Chinese cities in both population and aggregate economic output. The population sizes and numbers of facilities studied in this paper are listed in table 7.
The raw data include the latitude and longitude coordinates of each facility. For illustrative purposes, we randomly choose 300 samples for each of the seven types of facilities in Beijing and mark these samples in figure 1. We then convert the raw latitude and longitude coordinates into planar coordinates in units of metres, so as to facilitate the calculation of distance and area selection. It is hard to define the exact boundary of a city. We select a central point (lng_0, lat_0) by examining the satellite map of a city, and use the central point as the origin of the planar coordinate system (footnote 5). We then convert the spherical coordinates of each facility to the corresponding planar coordinates. For example, given the longitude and latitude coordinates (lng_i, lat_i) of a facility, its planar coordinates (x_i, y_i) are calculated from a sign function Sign and a Distance function. The Distance function is defined as the length of the great-circle arc connecting two points on the spherical surface of the earth. The radius of the earth is taken as R = 6 371 004 m.
It is hard to draw a boundary to separate a city from its suburbs, since the city-suburb lines are rather blurred in most cities. In this study, our strategy is to combine an initial choice with a later examination of the data. The initial choice is to set a starting area large enough to cover the most populated city zones and to prevent the boundary from being drawn into adjacent towns. Then, in the starting area, we draw a grid of non-overlapping sub-areas. The later examination discards a sub-area if the number of facilities in that area is less than a threshold. The starting area is set as 40 000 × 40 000 m (footnote 6), centred at the city central point (footnote 7). This area is then divided into sixteen 10 000 × 10 000 m non-overlapping sub-areas (footnote 8). However, because of the irregularity of the city area, a few of the sub-areas could correspond to depopulated zones. We mark a sub-area as not valid if the total number of a given facility is less than 20 (footnote 9). This method naturally decides where we should draw a line for the city boundary according to the concentration of facilities.
Footnote 6: The starting area of 40 000 × 40 000 m is chosen because it is large enough to cover the most populated area of a city. The choice of this number should not be an issue for most cities. If we make it bigger, the newly added area will most likely be identified as a non-city zone. For some cities, it is possible that we can get a bigger exponent b if we enlarge the area, since we might include less populated adjacent towns.
Footnote 7: The central point is marked by hand for each city according to the satellite image.
Footnote 8: The choice of 16 balances the need for more data points against the need for less estimation error in each mean-variance pair. If the number is too big, say 25, we may end up with a small number of facilities in each sub-area, which leads to a large estimation error of the mean and variance. If the number is too small, say 9, for some irregular cities we may not have enough data points to draw a line for the mean and variance.
Footnote 9: The choice of the threshold number 20 to decide whether a sub-area is valid does not affect the validity of Taylor's Law. However, if the number is set too small, then in some sub-areas the mean and variance are inaccurate, since there are not many facilities in each quadrat. On the other hand, if the number is set too big, we may not have enough cities with at least five valid sub-areas, which we need to estimate the exponent b for all seven types of facilities. In our case, the threshold number is set as 20 to balance these two issues based on the data we have; we end up with 23 cities with all seven bs estimated.
For each sub-area i (i = 1, 2, ..., 16), the mean and variance are computed using the values of N_i,j, j = 1, 2, ..., J^2, the numbers of facilities in the quadrats of that sub-area. In order to get a good estimate of the mean and variance in each sub-area i, N_i,j should be bigger than 0 for a sufficient number of quadrats. In some cities, the numbers of certain facilities are not large enough, which may greatly increase the error in the estimation of the means and variances. To avoid this problem, if the total number of valid sub-areas is small, e.g. fewer than five (i.e. we have fewer than five pairs of means and variances), we consider that we cannot get a reasonably good estimate of the mean-variance relationship, and we give up the estimation for this combination of facility and city.
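The data preparation steps described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the haversine formula stands in for the Distance function, the signed-offset form of `to_planar` is a guess at the conversion, the sample variance (divisor n - 1) is assumed, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_004  # radius quoted in the text


def great_circle_m(lng1, lat1, lng2, lat2):
    """Haversine great-circle distance in metres (assumed Distance function)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def to_planar(lng, lat, lng0, lat0):
    """Signed east-west (x) and north-south (y) offsets from the central point.
    The exact form used in the paper is not known; this is an assumption."""
    x = math.copysign(great_circle_m(lng0, lat0, lng, lat0), lng - lng0)
    y = math.copysign(great_circle_m(lng0, lat0, lng0, lat), lat - lat0)
    return x, y


def quadrat_counts(points, x0, y0, side, J):
    """Count points in each of the J x J quadrats of the square sub-area
    whose lower-left corner is (x0, y0) and whose side length is `side`."""
    q = side / J
    counts = [0] * (J * J)
    for x, y in points:
        col = int((x - x0) // q)
        row = int((y - y0) // q)
        if 0 <= col < J and 0 <= row < J:
            counts[row * J + col] += 1
    return counts


def mean_variance(counts):
    """Mean and sample variance (divisor n - 1, an assumption) of the counts."""
    n = len(counts)
    m = sum(counts) / n
    v = sum((c - m) ** 2 for c in counts) / (n - 1)
    return m, v


def subarea_pairs(points, J, half=20_000, side=10_000, min_total=20):
    """(mean, variance) pairs for the valid sub-areas of the 4 x 4 grid that
    covers the 40 000 x 40 000 m starting area centred at the origin."""
    pairs = []
    for a in range(4):
        for b in range(4):
            counts = quadrat_counts(points, -half + a * side,
                                    -half + b * side, side, J)
            if sum(counts) >= min_total:  # validity threshold from the text
                pairs.append(mean_variance(counts))
    return pairs


# Hypothetical data: one facility at the centre of every quadrat of the
# sub-area [0, 10 000) x [0, 10 000) for J = 5 (25 facilities in total).
demo_points = [(1000 + 2000 * c, 1000 + 2000 * r)
               for r in range(5) for c in range(5)]
demo_pairs = subarea_pairs(demo_points, J=5)
```

In this toy case only one of the sixteen sub-areas reaches the threshold of 20 facilities, so a real estimate of b would be abandoned, in line with the rule that at least five valid sub-areas are needed.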

Taylor's power law
It should be noted that in our study, every pair of mean and variance is estimated in an area using the same quadrat size. Hence, the potential relationship between the means and variances only reflects the fluctuation of the number of events in a quadrat of the given size, not a scaling rule where the number of events is measured in a series of expanding quadrats, as discussed by Wu et al. [23]. By examining whether or not there is a linear correlation between the natural logarithms of the means and variances, i.e. log V = log a + b log M, we can judge whether or not Taylor's power law is applicable to the urban facilities.
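The log-log check described above amounts to an ordinary least-squares fit of log V on log M. The following is a minimal sketch with a hypothetical helper `fit_taylor`; the input values are synthetic and chosen only so that the expected answer is known, they are not data from this study.

```python
import math


def fit_taylor(means, variances):
    """OLS fit of log V = log a + b log M.

    Returns (a, b, r), where r is the correlation coefficient between
    log M and log V. All means and variances must be strictly positive.
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope = Taylor exponent
    log_a = my - b * mx           # intercept = log a
    r = sxy / math.sqrt(sxx * syy)
    return math.exp(log_a), b, r


# Synthetic mean-variance pairs that follow V = 2 * M^1.5 exactly.
means = [1.0, 2.0, 4.0, 8.0, 16.0]
variances = [2.0 * m ** 1.5 for m in means]
a, b, r = fit_taylor(means, variances)
```

With real data, each (mean, variance) pair would come from one valid sub-area, and a correlation coefficient close to 1 would support the power-law relationship.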
Taking Beijing as an example, we draw the scatter plots of the means and variances for beauty salons, stadiums, schools and banks in figure 2a for J = 5 and figure 2b for J = 8. Remember that a sub-area is divided into J × J quadrats. J = 5 means that each quadrat in which the number of facilities N_i,j is counted has a side length of 10 000/5 = 2000 m. [...] The exponent b varies with the features of the urban facilities, as well as the socioeconomic environment of a city. We will decompose the exponent in the later part of the paper, which can explain up to 92% of the fluctuations. The rest may be attributed to the estimation error. We organize the cities along the horizontal axis by population size from the highest to the lowest. One should understand that the population data are not accurate due to the rapid urbanization process in China. The population data reported in this study only include permanent residents; they do not include the migrating population who do not have the city Hukou. There is a slight tendency for the exponent b to increase from left to right. It seems that bigger and more developed cities have smaller values of b. In the next section, after we decompose the exponent b, we will plot the CSF against the population size.
As a robustness check, we test whether Taylor's Law still holds and how the exponent b varies when we shift the locations of the observation windows of the quadrats.
The whole study area of a city is covered by non-overlapping quadrats. For different choices of J (we choose J = 4, ..., 8), the size of the quadrat is L = 10 000/J. We notice that for different choices of J, Taylor's Law still holds: the variance is a power function of the mean. Figure 2 gives an illustration of the power law relationship for J = 5 and J = 8. The results are similar for other choices of J.
[Figure: exponent b by city. Horizontal axis: Shanghai, Beijing, Tianjin, Chengdu, Guangzhou, Shenzhen, Wuhan, Haerbin, Zhengzhou, Qingdao, Hangzhou, Xian, Chongqing, Shenyang, Nanjing, Changsha, Jinan, Xiamen, Dalian, Kunming, Wuxi, Nanchang, Taiyuan. Legend: beauty salons, banks, stadiums, schools, pharmacies, convenience stores, restaurants.]
Then we shift the location of the observation window to the maximum extent by moving the window up by L/2 and then to the right by L/2. Using the seven types of facilities in the 23 cities, we report the mean and standard deviation of b before and after we move the window in table 2. We then use a t-test to test the null hypothesis that b does not change if we shift the observation window. The p-values are larger than 0.05 for all Js, indicating that there is no significant difference in b. We will discuss how b varies when we use different sizes of the quadrats in the next subsection.
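The window shift described above can be illustrated by counting the same points on two grids, one with its origin at the corner of the sub-area and one moved up and to the right by L/2. This is a minimal sketch: `count_in_grid` is a hypothetical helper, the points are made up, and dropping points that fall outside the shifted J × J window is an assumption about how the edges are handled.

```python
def count_in_grid(points, origin_x, origin_y, L, J):
    """Counts in a J x J grid of square quadrats of side L whose lower-left
    corner is (origin_x, origin_y). Points outside the grid are ignored."""
    counts = [[0] * J for _ in range(J)]
    for x, y in points:
        col = int((x - origin_x) // L)
        row = int((y - origin_y) // L)
        if 0 <= row < J and 0 <= col < J:
            counts[row][col] += 1
    return counts


J = 5
L = 10_000 / J  # quadrat side length in metres
points = [(500.0, 500.0), (1500.0, 1500.0), (2500.0, 2500.0)]

before = count_in_grid(points, 0.0, 0.0, L, J)      # original window
after = count_in_grid(points, L / 2, L / 2, L, J)   # shifted by (L/2, L/2)
```

The two nearest points share a quadrat before the shift, while after the shift the second and third points share one and the first falls outside the window. The mean-variance pairs are then recomputed on the shifted counts and b is re-estimated.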

How does b vary with size of the quadrat
Note that in this paper, the mean-variance relationship is obtained when the density of the events fluctuates across different sub-areas of a city, while the size of the quadrat is kept fixed. For different choices of J (hence different sizes of the quadrats), we have demonstrated in the previous section that Taylor's Law still holds. We have the following proposition on how the exponent b varies with the size of the quadrat. It is a result that can be derived from the combination of two scaling laws regarding the mean-variance relationship, namely, Taylor's Law and the size-scaling law as studied in [23].
Proposition. The exponent b varies with the quadrat size s as b(s) = b_0 + A log(s), where b_0 is the exponent measured in quadrats of unit size, and A is a constant to be decided by experiment.
It should be noted that s is dimensionless, i.e. it does not have a unit. In other words, s is relative, measuring the ratio of the size of a quadrat to that of the base quadrat, for which we let s = 1. For instance, given two different scales, we pick one and let its size be s = 1; the size of the other one relative to the first is then s. The formula tells how the exponents at the two different scales are related.
To prove the proposition, we need to cite the other scale-invariant mean-variance relationship studied in [23], which applies if we vary the size of the quadrats while the density (i.e. number of events per unit area) is fixed. If we scale the size of the quadrats by s, the mean in the new quadrats scales as M(s) ∝ s^2 in a two-dimensional space. Owing to the power law of the covariance density, i.e. the covariance density g(r) is a power function of the physical distance r between two locations, g(r) ∝ r^(-c), it is shown in [23] that the variance scales as V ∝ s^(4-c). Thus, the mean-variance relationship is given by a power law V ∝ M^b' with b' = 2 - c/2. To differentiate the two power laws, we refer to the latter one as the size-scaling law, and to Taylor's Law as the density-scaling law. We use b' and b to refer to the exponents of the size-scaling law and the density-scaling law, respectively.
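The step from the two scaling relations to the exponent b' can be written out explicitly. The sketch below only eliminates s from the relations quoted from [23], treating the proportionalities as exact; it introduces no further assumptions.

```latex
\begin{align*}
M(s) &\propto s^{2} \;\Longrightarrow\; s \propto M(s)^{1/2},\\
V(s) &\propto s^{4-c} \propto \left(M(s)^{1/2}\right)^{4-c} = M(s)^{\,2 - c/2},\\
b' &= 2 - \frac{c}{2}.
\end{align*}
```

For instance, c = 2 gives b' = 1, and c = 0 gives b' = 2.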
Assume that there are two sub-areas i and j with different densities. The density (i.e. the number of events in a unit area) in the two sub-areas is M_i and M_j, respectively. First we use quadrats of unit size to measure the mean and variance. According to Taylor's Law, the variances V_i and V_j are power functions of M_i and M_j with the same exponent b (equation (4.1)). Then, we increase the quadrat size by s. In the new quadrats of sub-areas i and j, the mean is M_i(s) = s^2 M_i and M_j(s) = s^2 M_j, respectively. According to the size-scaling law as shown in [23], V_i(s) and V_j(s), the variances using the new quadrats (with size s) in sub-areas i and j, scale from V_i and V_j with exponents b'_i and b'_j, respectively. Note that V_i and V_j are measured in the base quadrats when s = 1. b' refers to the exponent of the size-scaling law. It may take different values in different sub-areas with different densities, so we use subscripts i and j to mark the difference. Combining the above three equations gives a relationship between the variances and the means at quadrat size s. Since Taylor's Law still holds when the quadrats are increased in size by s, we have the following two possible cases.
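Case 1. The size-scaling law of mean-variance does not depend on the density of the events. Then the exponent of Taylor's Law would be the same for all quadrat sizes.
Case 2. The size-scaling law depends on the density of the events, so that the exponent of Taylor's Law varies with the quadrat size s.
As shown in figure 2, the exponents are not equal for J = 5 and J = 8. Case 1 is not true.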
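The displayed equations of this argument are not reproduced in the text above. One reconstruction that is consistent with the surrounding definitions is sketched below; it is not necessarily the authors' exact formulation. It assumes that Taylor's Law at unit size holds with a common prefactor a in both sub-areas, and that the size-scaling law takes the form V(s) = V (M(s)/M)^{b'} with M(s) = s^2 M.

```latex
\begin{align*}
\log V_i - \log V_j &= b\,(\log M_i - \log M_j)
  && \text{(Taylor's Law at } s = 1\text{)}\\
\log V_i(s) &= \log V_i + 2\,b'_i \log s, \qquad
\log V_j(s) = \log V_j + 2\,b'_j \log s
  && \text{(size-scaling law)}\\
\log V_i(s) - \log V_j(s) &= b\,(\log M_i - \log M_j) + 2\,(b'_i - b'_j)\log s\\
\log M_i(s) - \log M_j(s) &= \log M_i - \log M_j\\
b(s) = \frac{\log V_i(s) - \log V_j(s)}{\log M_i(s) - \log M_j(s)}
  &= b + \frac{2\,(b'_i - b'_j)}{\log M_i - \log M_j}\,\log s
\end{align*}
```

If b'_i = b'_j (Case 1), then b(s) = b for every s. Otherwise, for Taylor's Law to hold at size s with a single exponent for all pairs of sub-areas, the ratio 2(b'_i - b'_j)/(log M_i - log M_j) must be the same constant A for every pair, which gives b(s) = b_0 + A log(s) with b_0 = b(1) = b.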
We can test if the above argument is true by using five different values of J, i.e. J = 4, 5, 6, 7, 8. Since the size of the sub-area is fixed as 10 000 × 10 000 m, the quadrat for a given J is of size 10 000/J. In figure 4, the average values of the exponents of 23 cities (in which we can estimate bs for all facilities) are plotted against log(1/J). The five points, corresponding to the five different choices of J, are on a straight line with an upward slope. So, our conjecture is verified, i.e. b(s) = b_0 + A log(s) with A > 0 for urban facilities.

Decomposition of b
As we have mentioned in §3, the differences in socioeconomic conditions of the cities and the distinct features of various facilities may result in the fluctuation of the exponent b. In order to explain the fluctuation in the values of b for various facilities and in different cities, we decompose the contributor affecting the number of a facility in the quadrats of a city into two major contributors: a city-specific contributor and a facility-specific contributor. Through ordinary least-squares regression, this decomposition may remove a proportion of the estimation error due to insufficient sample size.
Assume: (a) the number of a facility in a quadrat of a city is jointly determined by the city-specific contributor and the facility-specific contributor, which are assumed to be independent of each other; (b) these two contributors satisfy the following equation for decomposition: X_ij = Y_i Z_j, where: (i) X_ij stands for the quantity of facility j (j = 1, 2, ..., 7) in a quadrat of city i (i = 1, 2, ..., 23); (ii) Y_i represents the contribution of the city-specific contributor and (iii) Z_j represents the contribution of the facility-specific contributor. Here we do not specify any measurable variable or quantity for the city-specific contributor or the facility-specific contributor. Underlying Y_i or Z_j could be a complex function of many variables. We are only interested in the decomposition of the exponents b instead of the specific functional form of the factors. Note that the independence and multiplicative assumption is supported by the results in table 7 that large cities tend to have more facilities of all types than small cities (footnote 11). From the independence assumption, it is clear that the mean value of X_ij can be expressed as the product of the mean of Y_i and the mean of Z_j, which represent the average contribution of the city-specific contributor and that of the facility-specific contributor, respectively. We further assume that there is a power function relationship with exponent c_i between the variance V and the average contribution of the city-specific contributor, while the power function for the facility-specific contributor has exponent f_j. This assumption can be represented by two power-law equations. Based on Taylor's power function V = aM^b, the mean M can in turn be written as a power function of V with exponent 1/b. Hence, based on equations (5.1)-(5.4), we can infer for the variance part that
1/b_ij = c_i + f_j + ε_ij, (5.6)
where ε_ij represents the estimation error in b_ij. We assume that ε_ij is normally distributed with the same variance, i.e. ε_ij ~ N(0, σ^2).
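The chain of reasoning that leads to the additive form of 1/b_ij can be sketched as follows. This is one reading that reproduces equation (5.6), not necessarily the authors' exact equations (5.1)-(5.5); in particular, the direction of the two assumed power laws (average contributions written as powers of the variance) is an assumption made here so that the exponents add.

```latex
\begin{align*}
M_{ij} &= \mathrm{E}[X_{ij}] = \mathrm{E}[Y_i]\,\mathrm{E}[Z_j] = \bar{Y}_i\,\bar{Z}_j
  && \text{(independence)}\\
\bar{Y}_i &\propto V_{ij}^{\,c_i}, \qquad \bar{Z}_j \propto V_{ij}^{\,f_j}
  && \text{(assumed form of the two power laws)}\\
M_{ij} &\propto V_{ij}^{\,c_i + f_j}\\
V_{ij} &= a\,M_{ij}^{\,b_{ij}} \;\Longrightarrow\; M_{ij} \propto V_{ij}^{\,1/b_{ij}}
  && \text{(Taylor's power function)}\\
\frac{1}{b_{ij}} &= c_i + f_j
\end{align*}
```

Adding the estimation error of b_ij to the last line gives the regression form with the error term ε_ij.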
The inverse of b_ij is decomposed into two components, namely, c_i and f_j in equation (5.6). We call c_i the CSF and f_j the FSF. It should be noted that we can only solve for c_i and f_j up to a constant, since they always appear as a sum. For example, c_i - c_0 and f_j + c_0 are also solutions. It suffices for our purpose to examine whether the CSFs (or FSFs) take distinct values for different cities (or facilities). To solve equation (5.6), we define the objective function to minimize as the sum of squared residuals, Σ_{i=1}^{n_i} Σ_{j=1}^{n_j} (1/b_ij - c_i - f_j)^2, where n_i and n_j represent the number of cities and the number of facility types, respectively. By minimizing this objective function, we can derive the values of c_i and f_j. In appendix B, we give the details on how to solve the problem. In tables 3 and 4, we list the values of the two factors for all seven types of facilities in the 23 cities under the constraint Σ_{i=1}^{n_i} c_i = Σ_{j=1}^{n_j} f_j, which follows from the Moore-Penrose pseudo-inverse. Here, J = 5 is used. We notice that E[|1/b_ij - (c_i + f_j)| / (1/b_ij)] = 7.5%, which means that c_i and f_j jointly account for 92.5% of 1/b_ij.
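For a complete table of 1/b_ij values, the least-squares decomposition has a closed form based on row, column and grand means, and the normalisation Σ c_i = Σ f_j picks out the minimum-norm solution that a Moore-Penrose pseudo-inverse would return. The sketch below uses this closed form rather than the procedure of appendix B, which is not reproduced here; the function names and the numbers in the example are hypothetical.

```python
def decompose(inv_b):
    """Least-squares split of a complete table x[i][j] = 1/b_ij into
    c_i + f_j, normalised so that sum(c) == sum(f)."""
    n_i, n_j = len(inv_b), len(inv_b[0])
    row_mean = [sum(row) / n_j for row in inv_b]
    col_mean = [sum(inv_b[i][j] for i in range(n_i)) / n_i
                for j in range(n_j)]
    grand = sum(row_mean) / n_i
    # Any constant can be moved between c and f; alpha fixes it so that
    # the two sums are equal.
    alpha = n_j * grand / (n_i + n_j)
    c = [r - grand + alpha for r in row_mean]
    f = [q - alpha for q in col_mean]
    return c, f


def mean_relative_error(inv_b, c, f):
    """Average of |x_ij - (c_i + f_j)| / x_ij over all cells."""
    cells = [(inv_b[i][j], c[i] + f[j])
             for i in range(len(c)) for j in range(len(f))]
    return sum(abs(x - fit) / x for x, fit in cells) / len(cells)


# Hypothetical, exactly additive table: three "cities", two "facilities".
c_true = [0.30, 0.20, 0.10]
f_true = [0.40, 0.60]
table = [[ci + fj for fj in f_true] for ci in c_true]

c, f = decompose(table)
err = mean_relative_error(table, c, f)
```

Because the factors are identified only up to a constant, the recovered values are shifted relative to `c_true` and `f_true` (here by 0.08), while the differences between cities and between facilities are preserved.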
As we can see from table 3, the values of the CSF vary significantly for different cities. The meaning of the CSF can be understood in two ways. First, from equation (5.2), these values directly determine the mean value for all facilities within a city, hence they can be seen as indicators of the overall density of all facilities in a given city. Secondly, the CSF is a component of 1/b. A larger CSF means a smaller b, and a smaller b means that the facilities are more evenly distributed. Shenzhen, which has the highest density of facilities among all cities, has a CSF of 0.26. It is followed by Beijing (0.24), Shanghai (0.23) and Guangzhou (0.23). These four are the largest cities, ranking in the first tier of Chinese cities. Our results suggest that in larger and more developed cities, not only are more facilities available to citizens, but the facilities are also more evenly distributed so that citizens can better use them.
Footnote 11: There are some violations in the data; our assumption is an approximation which simplifies our analysis.
The values of the FSF need to be understood from another perspective. Because 1/b_ij = c_i + f_j, given a value of c_i in a city, the smaller the value of f_j, the larger the value of b_ij and therefore the larger V_ij for a given mean M_ij in the corresponding city. Larger variances imply greater differences among the numbers of facility j in different quadrats. At some places the number is small, but at others the number can be very large, which means that the facility tends to aggregate in space. On the contrary, when f_j becomes larger, given the same value of the mean, the variance falls, thus the distribution tends to behave more like the Poisson distribution, which implies a weaker aggregation. In our decomposition, the value of the FSF for restaurants is 0.45, which is the smallest, and for pharmacies it is 0.60, which is the largest. This shows that restaurants have the highest degree of aggregation, while pharmacies have the lowest. Restaurants with different styles can coexist at one place; however, due to their interchangeability, pharmacies tend to avoid staying close to each other.
In figure 5, we present the scatter plot of the CSF versus the logarithm of the population size of cities. Overall, the CSF increases with the population size. Table 5 gives the statistical test. The null hypothesis is that the CSF has nothing to do with the population, which is rejected at the 1% significance level. Positive correlation coefficients indicate that the CSF is positively correlated with the logarithm of population. Since the CSF is a component of the inverse of b, the positive correlation between the CSF and population indicates that b decreases with population. It is worth mentioning that the conclusion is applicable to all types of facilities, since the CSF is a component of the inverse of b for all types of facilities. It should be noted that China is experiencing rapid urbanization.
The population data, reflecting the official record of permanent residents, do not include the migrating population. The latter has become more and more important, as many Chinese people have migrated from rural areas to cities in recent decades. With larger CSFs and smaller bs, larger and more developed cities tend to have more facilities (more service) than small cities.

Statistical test
ANOVA test
As indicated in tables 3 and 4, cities differ in the CSF and facilities differ in the FSF, thus leading to the fluctuation of exponent b for various facilities in different cities.
To further investigate whether these differences have statistical significance, we use the two-way ANOVA test. The two-way ANOVA is an appropriate analysis method for a study with a quantitative outcome and two categorical explanatory variables.
The two-way ANOVA test has the following assumptions: the sample in each cell (i.e. for each combination of levels of the two factors) is independent of the samples in the other cells, the sample in each cell comes from an (approximately) normal distribution and the populations corresponding to each cell have the same variance (the homogeneous variance assumption). Our structural model for the two-way ANOVA is the no-interaction (additive) model. The additive model assumes that the effect on the outcome of a change in level of one explanatory variable does not depend on the level of the other explanatory variable.
The purpose of the ANOVA test is to investigate the dependence of b on two explanatory variables (city and facility). As in the factor decomposition, the statistical test on b for various facilities in different cities is carried out on the inverse of b, i.e. x_i,j = 1/b_i,j. The test can be called a 23 × 7 ANOVA because those are the numbers of levels of the two categorical explanatory variables. The ANOVA test has two null hypotheses: (i) the mean of 1/b is the same for all cities; and (ii) the mean of 1/b is the same for all types of facilities.

Discussion and conclusion
Based on the dataset of spatial coordinates of the seven types of facilities in 37 major cities in China, we explore the micro-structure of these cities and study the characteristics of the distribution of urban facilities. We find a power-law relationship $V = aM^b$ between the variance V and the mean M of the number of facilities in a quadrat, so the distribution of urban facilities complies with Taylor's Law. Facilities of the same type in a city may help each other survive while at the same time competing for various resources, which resembles the relationship between the organisms of a species in an area. Furthermore, in order to study the key factors contributing to the differences between the values of the exponent b and to explore the mechanism underlying the distribution of urban facilities, we decompose the inverse of the exponent b into two factors that contribute to the numbers of facilities in a city: the CSF and the FSF. We find that the values of the CSF vary significantly between cities, and that different facilities have different degrees of agglomeration. It is interesting to note that Beijing, Shanghai, Guangzhou and Shenzhen, the four largest and most developed cities in mainland China, have the largest CSFs and hence the smallest values of b. Our results suggest that larger and more developed cities not only have more service facilities available to citizens, but also have facilities that are more evenly distributed, so that citizens can make better use of them. Moreover, restaurants have the highest degree of agglomeration, while pharmacies have the lowest. These findings are consistent with our intuitive understanding of these cities and urban facilities.
Economic activities are often geographically concentrated in particular cities or metropolitan areas, and there are many theories explaining why the concentration may occur [24–26]. Ellison & Glaeser [26] assess the importance of natural advantage to geographical concentration, and find that one-quarter of industrial concentration can be explained by observable sources of natural advantage. Audretsch [27] states that 'knowledge is generated and transmitted more efficiently via local proximity, economic activity based on new knowledge has a high propensity to cluster within a geographic region' (p. 18).
The analytical results in this paper are in line with the findings in the literature mentioned in the above paragraph. Beijing, Shanghai, Shenzhen and Guangzhou are generally acknowledged as the four most urbanized cities in China, and our analytical results show that these cities have the highest level of concentration of urban facilities. Glaeser & Kohlhase [22] show evidence that services tend to be located in dense areas because they are more dependent on proximity to customers than manufacturing industries. Moreover, service industries have a strong tendency to locate near their suppliers and customers, because the costs of delivering services are much higher than the costs of delivering goods. City streets enable service providers to link readily with large numbers of diverse customers, and hence are a good setting for services. Waldfogel [28] reveals that retail establishments in sectors such as restaurants and media show a strong pattern of locating near the demographic groups that regularly buy from that sector.
As we have shown in the above analyses, the distribution of urban facilities resembles that of organisms in ecosystems. Organisms feed on various resources, while facilities 'feed on' consumer demands. Organisms are prone to form groups, but the group size varies between species. For example, zebras and wildebeest form large herds, while lions usually live in small groups. Urban facilities likewise tend to agglomerate in an area but, as table 3 shows, the degree of agglomeration varies between facilities. For instance, the FSF is smallest for restaurants (0.45) and largest for pharmacies (0.60). This shows that restaurants have the highest degree of agglomeration, while pharmacies have the lowest degree of agglomeration (or the highest degree of dispersion).
Further studies on the distribution of urban facilities are needed, and we see three potential directions. Firstly, by combining spatial statistics, economic theories and other relevant fields, we could further explore the rationales and mechanisms underlying the distribution of urban facilities, and examine its impact on socioeconomic development in a city and the adjacent regions. Secondly, when sufficient panel data are available, we could examine the evolution of the distribution of urban facilities over both time and space, and explore the relationship between this evolution and changes in socioeconomic development indicators such as income per capita, population density and health indicators. Lastly, we could explore theoretical frameworks that could help improve the distribution of urban facilities, thus facilitating the sustainable development of cities.
Ethics. No special permit was required. The data were collected using an API from Baidu map service.
Data accessibility. The data and Matlab source code have been uploaded to Dryad Digital Repository: https://doi.org/10.5061/dryad.g9b301f [29].
Table 7. Population and number of facilities in the study region (40 × 40 km around the centre) for each of the 37 major cities in China.
solution, so are $c_i - c_0$ and $f_j + c_0$. We need one more condition to determine the exact value of X. In this paper, we report the solution of X that has the smallest L2 norm among all solutions,
$$X = \mathrm{pinv}(M)\,B, \tag{B 6}$$
where pinv is the Moore-Penrose pseudo-inverse of a matrix. The Moore-Penrose pseudo-inverse is solved under the constraint of minimizing the L2 norm of X, that is, minimizing $\sum_{i=1}^{n_i} c_i^2 + \sum_{j=1}^{n_j} f_j^2$ by choosing $c_0$. In this case, it can be shown that the L2 norm constraint is equivalent to $\sum_{i=1}^{n_i} c_i = \sum_{j=1}^{n_j} f_j$.
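The equivalence between the minimum-norm condition and the equality of the two sums can be checked numerically. The sketch below is an assumption-laden illustration rather than the authors' Matlab code: it assumes a complete table of inverse exponents (one value per city and facility), uses the closed-form least-squares fit of the additive model in place of an explicit call to pinv, and then chooses the shift $c_0$ that minimizes the L2 norm. The function name and the input values are hypothetical.

```python
def min_norm_decomposition(x):
    """Least-squares fit of x[i][j] ~ c[i] + f[j] with the smallest L2 norm.

    Assumes a complete table with one value per (city, facility) pair.
    """
    n_i, n_j = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n_i * n_j)

    # One particular least-squares solution of the additive model.
    c = [sum(row) / n_j - grand for row in x]
    f = [sum(x[i][j] for i in range(n_i)) / n_i for j in range(n_j)]

    # Shifting to (c - c0, f + c0) leaves every c[i] + f[j] unchanged.
    # Setting the derivative of sum((c - c0)**2) + sum((f + c0)**2) with
    # respect to c0 to zero gives the shift below, i.e. sum(c) == sum(f).
    c0 = (sum(c) - sum(f)) / (n_i + n_j)
    return [ci - c0 for ci in c], [fj + c0 for fj in f]


# Hypothetical, exactly additive table built from made-up factors.
c_true = [0.20, 0.30, 0.40]
f_true = [0.45, 0.60]
x = [[ci + fj for fj in f_true] for ci in c_true]

c_hat, f_hat = min_norm_decomposition(x)
print("sum of c:", sum(c_hat), "sum of f:", sum(f_hat))
```

The recovered factors reproduce every entry of the table, but they differ from `c_true` and `f_true` by a common shift, which reflects the non-uniqueness that the minimum-norm condition resolves.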