Static landscape features predict uplift locations for soaring birds across Europe

Soaring flight is a remarkable adaptation to reduce movement costs by taking advantage of atmospheric uplifts. The movement pattern of soaring birds is shaped by the spatial and temporal availability and intensity of uplifts, which result from an interaction of local weather conditions with the underlying landscape structure. We used soaring flight locations and vertical speeds of an obligate soaring species, the white stork (Ciconia ciconia), as proxies for uplift availability and intensity. We then tested if static landscape features such as topography and land cover, instead of the commonly used weather information, could predict and map the occurrence and intensity of uplifts across Europe. We found that storks encountering fewer uplifts along their routes, as determined by static landscape features, suffered higher energy expenditures, approximated by their overall body dynamic acceleration. This result validates the use of static features as uplift predictors and suggests the existence of a direct link between energy expenditure and static landscape structure, thus far largely unquantified for flying animals. Our uplift availability map represents a computationally efficient proxy of the distribution of movement costs for soaring birds across the world's landscapes. It thus provides a base to explore the effects of changes in the landscape structure on the energy expenditure of soaring birds, identify low-cost movement corridors and ultimately inform the planning of anthropogenic developments.


Static environmental variables
For descriptive purposes we group the environmental layers in two categories -surface features and land cover. All environmental layers are publicly available (Tab. S2.1).

Surface features.
In order to characterize the surface features we used the publicly available elevation map EU-DEM (based on SRTM and ASTER Global Digital Elevation Model) [1] and we computed slope, aspect and roughness (topographic heterogeneity) using the R package raster [2]. The native spatial granularity of the elevation map is 1 arcsec (about 25 m near the Equator) but we aggregated the raster cells to match the 100 m resolution of the land cover map. Slope and aspect were computed according to Horn [3]. The roughness was calculated as the difference between the maximum and the minimum value of a cell and its 8 surrounding cells. Unevenness in the aspect, the slope and the elevation (this last one also called Topographic Position Index) were computed as the difference between the value of a cell and the mean value of its 8 surrounding cells. Highly correlated layers were excluded from the model to avoid multicollinearity (this was the case of the slope because of the high correlation with the roughness).
Land cover. We characterized the land cover using the Normalized Difference Vegetation Index (NDVI), CORINE Land Cover categories (CLC) and the Global Urban Footprint (GUF). The NDVI product is available as Spectral Indices product of Landsat 7 [4] with a spatial granularity of 30 m; the raster cells were resized to 100 m. We computed a summer (from June 1 st to September 30 th ) NDVI composite for 2014 to match the temporal resolution of our tracking data. For the composite we extracted the higher monthly NDVI value of each cell and we averaged the resulting maximum monthly values. Extracting the maximum monthly value instead of the average value allowed us to avoid low values of NDVI that could be associated with errors in the cloud cover mask of the Landsat NDVI product. The CORINE Land Cover 2012 is available from the European Environmental Agency [5] with a native spatial resolution of 100 m. We used the level 3 categories with few modifcations (Table S2.2). The Global Urban Footprint is a binary thematic map with values of 1 for built-up areas (man-made building structures) and 0 for non-built-up areas. The dataset is available with a native spatial resolution of 0.4 arc seconds (about 12 m near the Equator) [6].
We resized the raster cells to 100 m computing the mean for each 9 by 9 cells (proportion of built-up areas/100 m cell). Global urban footprint (GUF) Produced by Deutschen Zentrums für Luft und Raumfahrt, 2011 [6].

Model tuning and evaluation
Different parameter values can be modifed by the user in order to tune the random forest algorithm and improve its performances (such as number of variables chosen at each split, forest size and tree depth). In addition, in the feld of species distribution modelling, the proportion of data belonging to different classes in the dataset used to train the model has an important effect on different accuracy measures on the model output of different machine learning algorithms [10,11]. For these reasons, we tuned the static and the dynamic landscape models, choosing the optimal values of mtry (number of variables chosen at each split) with respect to Out-of-Bag error estimate, using the package's inbuilt function tuneRF. Additionally, we tried different values of ratio soaring to flapping locations (named hereafter prevalence, for similarity with the defnition used in species distribution modelling).
The role of prevalence on the model performances is controversial. Different studies suggested that certain accuracy measures (such as kappa) are sensitive to prevalence and that prevalence should be taken into account when evaluating the model [12,13], but few unclear suggestions have been made about whether the number of presences and absences should be manipulated in order to maximize the model performances [10,11,13]. This lack of clarity is probably due to multiple reasons, such as the algorithm used, the difference between using real absences or pseudo-absences, the difficulty to differentiate between biases in the measures used to evaluate the performance and biases in the performances itself, and fnally to the uncertain effect of data manipulation on the ecological interpretation of the result (for instance when dealing with rare and specialist species versus abundant and generalist ones, and the need to compare the predicted prevalence with the observed prevalence The results of the tuning procedure showed an increase in the model accuracy (AUC and TSS) with increasing prevalence. Sensitivity and specifcity showed variable values at different values of prevalence, but they both showed a slight positive trend and the difference between the two slightly decreased, with increasing prevalence (Fig. S3.1).
In contrast, the threshold that maximized the TSS increased, being around 0.5 with a prevalence equal to 2, and around 0.9 with a prevalence of 12, which is in agreement with the fact that the best performances in our model were associated with high thresholds. This didn't affect our model accuracy, but its robustness. In fact, all models            Figure S4.1 -Non linear relationship between some of the environmental predictors included as smooth terms in the static uplift intensity model (generalized additive model) and the response variable (vertical speed). In (A) the effect of NDVI and aspect, whereas in (B) the effect of DEM and latitude on the vertical speed. In lighter colours, combination of variables' values that predict a higher vertical speed.