Exploring the high-resolution mapping of gender-disaggregated development indicators

Improved understanding of geographical variation and inequity in health status, wealth and access to resources within countries is increasingly being recognized as central to meeting development goals. Development and health indicators assessed at national or subnational scale can often conceal important inequities, with the rural poor often least well represented. The ability to target limited resources is fundamental, especially in an international context where funding for health and development comes under pressure. This has recently prompted the exploration of the potential of spatial interpolation methods based on geolocated clusters from national household survey data for the high-resolution mapping of features such as population age structures, vaccination coverage and access to sanitation. It remains unclear, however, how predictable these different factors are across different settings, variables and between demographic groups. Here we test the accuracy of spatial interpolation methods in producing gender-disaggregated high-resolution maps of the rates of literacy, stunting and the use of modern contraceptive methods from a combination of geolocated demographic and health surveys cluster data and geospatial covariates. Bayesian geostatistical and machine learning modelling methods were tested across four low-income countries and varying gridded environmental and socio-economic covariate datasets to build 1×1 km spatial resolution maps with uncertainty estimates. Results show the potential of the approach in producing high-resolution maps of key gender-disaggregated socio-economic indicators, with explained variance through cross-validation being as high as 74–75% for female literacy in Nigeria and Kenya, and in the 50–70% range for many other variables. However, substantial variations by both country and variable were seen, with many variables showing poor mapping accuracies in the range of 2–30% explained variance using both geostatistical and machine learning approaches. The analyses offer a robust basis for the construction of timely maps with levels of detail that support geographically stratified decision-making and the monitoring of progress towards development goals. However, the great variability in results between countries and variables highlights the challenges in applying these interpolation methods universally across multiple countries, and the importance of validation and quantifying uncertainty if this is undertaken.


Datasets
The Demographic and Health Surveys (DHS) is a program of national household surveys implemented across a large number of LMICs. The DHS Program collects and analyses data on population demographic and health characteristics through more than 300 surveys in over 90 countries. The gender-disaggregated data we investigated in this report come from DHS datasets.

Georeferenced DHS indicators
In most DHS household surveys, a two (or three) stage cluster sampling procedure is adopted. Sampling clusters are usually the primary sampling units (PSU), which are pre-existing geographic areas known as census enumeration areas (EAs). A stratified sample of EAs is usually selected with probability proportional to size (PPS) (ICF International, 2012). The boundaries of the EAs are defined by the country's census bureau, as is the urban and rural status of each cluster. An EA can be a city block or apartment building in urban areas, while in rural areas it is typically a village or group of villages. At a second stage, within each identified EA, a fixed (or variable) number of households is selected and all eligible women and men in the selected households are interviewed (ICF International, 2012). The population and size of sampled clusters vary between and within countries. Typically, clusters contain 100-300 households, of which 20-30 households are randomly selected for survey participation (ICF International, 2012). The estimated center of each cluster is recorded as a latitude/longitude coordinate, obtained from a GPS receiver or derived from public online maps or gazetteers. The actual physical size or boundaries of the survey cluster are not made available. In recent years it has become more common for countries to use census EA boundary files to calculate the center of EAs (ICF International, 2012).
The georeferenced datasets can be linked to individual and household records in DHS household surveys through unique cluster identifiers. A displacement of up to 5 km in rural areas and up to 2 km in urban areas is introduced at the processing stage to anonymize cluster locations. Furthermore, around 1% of the rural clusters are displaced up to 10 km (Macro International Inc., 1996;Burgert et al. 2013;ICF International, 2012). The displacement affects the geolocation of the data, so this had to be taken into account when building the modelling dataset of indicators derived from the DHS survey. For linear modelling approaches, we used the mean value of each variable in a buffer of two kilometres from the cluster location for urban areas and five kilometres for rural areas, following published recommendations (Perez-Heydrich, 2013). For the non-linear modelling architectures (e.g. Artificial Neural Networks (ANN), the values came from a Monte Carlo analysis using the same buffers.
The publicly available DHS survey data and related cluster GPS coordinates for Kenya   (KNBS, 2010), Tanzania (2010) (NBS, 2011), Nigeria (2013 (NPC, 2014), Bangladesh (2011) (NIPOR, 2013) and Haiti (2012) (Cayemittes et al., 2013) were obtained. The sections below describe the cluster level variables we mapped. We produced gender-disaggregated maps of nutrition, literacy and family planning indicators. These indicators are clearly defined by the DHS program and were constructed following the instructions contained in individual country final reports, and information from Rutstein and Rojas (2003).

Geospatial covariate layers
To predict each indicator at locations where survey data were not available, we exploited the relationship with covariate layers taking into account spatial autocorrelation (e.g. Alegana et al., 2015;Sedda et al., 2015). An important aspect of geostatistical modelling is the exploitation of geospatial covariates that are correlated with the outcome of interest, and can partially explain variation in that response, allowing for more accurate predictions across the map. A suite of covariates was chosen from existing publicly available libraries, based on factors that have previously been shown to correlate with development indicators in different settings (Alegana et al 2015, Gething et al 2015. A set of several physical (topography, climate, land cover, etc.) and some social (population, ethnicity) covariate grids, completely covering the selected countries, were selected and tested as possible explanatory covariates across the analysed DHS indicators. With this being a short pilot project, because of the limited time and the long time necessary for assembling a proper set of covariates, differences among countries exists in the number and types of covariate grids in each of the datasets. These differences are mainly related to data availability and to the time necessary to build specific covariates when not already available. Due to the different spatial resolution, projection system, format and extent of the datasets, algorithms were developed and applied for converting all the layers in a common 1x1 km gridded datasets suitable to be used in map production. Information on features, sources, and naming conventions for each of the covariates are available in the following tables (1-4) and subsections.   In order to test the value of inclusion of these spatial datasets as covariates in the statistical models, the following steps were taken: 1) The study area was rasterized at a resolution of 30 arc-seconds (i.e., 0.00833333 degree corresponding to ~ 1km at the equator) 2) Vector datasets were rasterized at a resolution of 30 arc-seconds 3) If needed, raster datasets were resampled to 30 arc-seconds using the most appropriate interpolation technique, depending on the category, type, and original spatial resolution of each given dataset 4) Finally, all datasets were co-registered (i.e., spatially aligned so that all pixels representing the same area and location in different datasets are exactly coincidental) and matched to the rasterized study area (by clipping them to the study area and/or filling their "NoData" pixels located within the study area)

Accessibility
The EU Joint Research Center maintains a gridded surface that estimates accessibility, measured in likely travel times (via all transport methods), to cities with more than 50,000 inhabitants. In practice, this provides a useful composite measure of the extent to which regions are rural or urban, as well as the degree of their connectedness to the national system of transportation. Areas near major roads, for example, would be relatively well connected, even if they were some distance from major cities. More details of this geospatial layer can be found at http://forobs.jrc.ec.europa.eu/products/gam.

Nightlights
Visible Infrared Imaging Radiometer Suite (VIIRS) composite nightlight imagery for 2012 and 2014 were obtained (http://ngdc.noaa.gov/eog/). These surfaces allow differentiation of regions based on the density of population and the degree of electrification of dwellings, commercial and industrial premises, and infrastructure.

Population density
As part of the WorldPop project (www.worldpop.org.uk), gridded data on population density were created from settlement, land use and other geospatial data derived from satellite imagery, and used to disaggregated areal census population counts to 100x100m grid squares.

MODIS Enhanced Vegetation Index (EVI)
EVI composites are produced globally at 500-metre resolution on 16-day intervals from data collected by the MODIS sensor on the Terra satellite. MODIS EVI products are computed from atmospherically corrected bi-directional surface reflectances that have been masked for water, clouds, heavy aerosols, and cloud shadows. These data are used for global monitoring of vegetation condition (Thome et al). For this study, 16-day composites were downloaded for the most recent 5 years (2010 -2014) and averaged at the pixel level to create a longterm average of vegetation condition.

MODIS Mid-infrared (MIR) reflectance
MIR reflectance is the surface reflectance in the middle-infrared part of the electromagnetic spectrum. MIR reflectance (2105 -2155 nm) is collected globally at 500-metre resolution every 16 days by the MODIS sensor on the Terra satellite. These data are used for characterising biophysical properties of the land surface. Vegetation spectral signatures are characterised by low reflectance in the visible and MIR (controlled by pigment and water absorption features). For this study, 16-day composites were downloaded for the years 2001 -2005 and averaged at the pixel level to create a long-term average of MIR reflectance.

Livestock densities
Data from Gridded Livestock of the World v2.0 were obtained via the Livestock Geo-Wiki (http://livestock.geo-wiki.org) (Robinson, 2014). Distribution maps of livestock densities at 1 km resolution were downloaded for cattle, pigs, ducks, goats, sheep, and chickens.

Aridity and Potential Evapotranspiration (PET)
The CGIAR Consortium maintains high-resolution global raster climate data related to evapotranspiration processes and rainfall deficit for potential vegetative growth. These are based on data from the WorldClim project and ultimately from weather station data that has been interpolated by using covariates such as altitude. These grids, extracted for four of the countries, allowed for differentiation of areas with adequate rainfall and moisture regimes to sustain agriculture as opposed to areas where drier and more arid conditions prevail (http://csi.cgiar.org/Aridity/).

Urban/Rural and distance to settlements
For this study the urban extents grid for the year 2000, produced by Columbia University Centre for International Earth Science Information Network (CIESIN) was obtained. The grid distinguishes urban and rural areas based on combining population counts, settlement points, and nighttime lights (CIESIN, 2011). We also exploited a settlement location vector dataset produced within the framework of the WorldPop Project (refer to Tatem et al., 2004; for a description of the methodology used to produce the dataset) that was used to calculate some of the "Distance to settlements" variables. In Tanzania and Kenya we used the MODIS 500m Global Urban Extent to calculate the distance to human settlement. It is based on Moderate Resolution Imaging Spectroradiometer (MODIS) 500-m satellite data and is freely available through https://nelson.wisc.edu/sage/data-and-models/schneider.php.
In Haiti we used the Global Human Settlement Layer (GHSL) with a resolution of 38m, to calculate the distance per pixel to the boundary of the nearest human settlement, for the year 2015. For pixels located within urban areas the 'distance to' value is presented as a negative value, i.e. the distance to the outer urban boundary, thereby providing additional variability. GHSL data is produced within the framework of the European Commission JRC Global Human Settlement Analysis for Disaster Risk Reduction project, implemented by the EC JRC Global Security and Crisis Management Unit. These data are provided to WorldPOP as pre-public-release for scientific research purposes only.

Protected areas
Protected areas were obtained from the World Database on Protected Areas (WDPA), available through (IUCN & UNEP-WCMC, 2012).

Births
Birth data estimating the number of live births per 100-metres were downloaded from the WorldPop Project (http://www.worldpop.org.uk/). The estimates are adjusted to match UN national estimates on numbers of live births (Tatem et al., 2014). For this study we aggregated the 2012 estimates of number of live births per pixel to 1-kilometre.

Pregnancies
Pregnancy data estimating the number of pregnancies per 100-metres were downloaded from the WorldPop Project (http://www.worldpop.org.uk/). The estimates were adjusted to match national estimates on numbers of pregnancies made by the Guttmacher Institute 6 . For this study we aggregated the 2012 estimates of number of pregnancies per pixel to 1kilometre.

Ethnicity
Geocoded ethnic groups were obtained via the GeoEPR 2014 dataset available through ETH Zurich. These data identify politically relevant ethnic groups in every country from 1946 to 2013, including annual data on more than 800 groups (Vogt et al., 2015).

Bioclimatic variables
We used global climate layers produced by WorldClim for annual trends from 1950 -2000 for temperature and precipitation. BIO1 is the annual mean temperature, calculated by the mean of all the weekly mean temperatures. Each weekly mean temperature is the mean of that week's maximum and minimum temperature. BIO12 is the annual precipitation, calculated as the sum of all the monthly precipitation estimates (Hijmans et al., 2005).

Gross cell production
Gross cell production was derived from a geophysical-based data set on economic activity (representing the regional equivalent of gross domestic product) measured at a 1-degree longitude by 1-degree latitude resolution at a global scale (http://gecon.yale.edu/data-anddocumentation-g-econ-project). Each pixel (cell) contains a gross product value calculated by accounting for economic data merged with other important demographic and geophysical data including climate, physical attributes, location indicators, population, and luminosity (refer to Nordhaus et al. 2006 for a detailed description of derivation of G-Econ data).

Rainfed crop suitability
Rainfed crop suitability was derived from a global raster dataset with a resolution of 5 arcminutes (http://www.fao.org:80/geonetwork/srv/en/resources.get?id=14172&fname=Map6_61.zip &access=private). Each pixel contains the suitability index for rainfed crops, calculated using maximising crop and technology mix (2005 version), while pixels representing urban areas, closed forests, and irrigated areas contain negative values.

Land cover
The IGBP MODIS Land Cover Type product (MCD12Q1), with a native resolution of about 500m, is derived from input of Terra-and Aqua-MODIS data and includes 11 natural vegetation classes, 3 developed and mosaicked land classes, and three non-vegetated land classes (refer to Friedl et al., 2010 for a detailed description of how the MCD12Q1 dataset was produced).

Distance to roads
For all countries, the "Distance to roads" covariate was calculated using the most recent Open Street Map road network dataset (http://extract.bbbike.org/) available at the time of analysis.

Distance to waterways
The "Distance to waterways" covariate was calculate either using the most recent Open Street Map river network dataset (http://extract.bbbike.org/) available at the time of analysis or the corresponding VMAP0 dataset (http://geoengine.nga.mil/geospatial/SW_TOOLS/NIMAMUSE/webinter/rast_roam.html) depending on which one was more detailed for each country (refer to tables 1-4 above).

Distance to schools
The "Distance to schools" covariate was calculated using the most recent Open Street Map point location dataset (http://extract.bbbike.org/) available at the time of analysis.

Distance to health facility
The "Distance to health facility" covariate was derived from Open Street Map point location data. In this instance "Health Facility" locations relate to Cholera Treatment Centres and Cholera Treatment Units recorded up until Dec 2011, extracted from http://haitidata.org/.

Bayesian model specification
The Gaussian Function (GF) in INLA is represented as a Gaussian Markov Random Function (GMRF). Computations in INLA are carried out using the GMRF by approximating a set of spatial-temporal random function with weighted sum of basis functions. The advantage of computation using the GMRF as approximations to GF with Matérn covariance is due to the Markovian property of the former resulting in sparse matrices that are computationally efficient. The explicit link between the continuous domain GF is a solution to the stochastic partial differential equation (SPDE) expressed as: where C, G1 and G2 are sparse matrices while t Q precision was based on a temporal AR(1) process similar to Cameletti et al. (2012).
A linear model was implemented using gaussian, zero-inflated binomial or beta likelihood for the proportion of literacy, stunting and use of modern contraception methods.
are realizations of the process linked to a structured predictor in an additive way, ) (s x denotes set of covariates with  coefficients and ) (s  is the measurement error while ) (s  represent first order autoregressive dynamics with spatially correlated innovations.
Gaussian likelihood were adopted in the majority of the case; beta likelihood were applied for modelling male literacy in Kenya and the use of modern contraception methods in Tanzania and zero-inflated binomial for modelling the use of modern contraception methods in Nigeria.

Artificial Neural Networks specification
An artificial neuron is a computational model inspired by natural neurons. Natural neurons receive signals through synapses located on the dendrites or membrane of the neuron. When the signals received are strong enough (surpass a certain threshold), the neuron is activated and emits a signal though the axon. This signal might be sent to another synapse, and might activate other neurons.
The complexity of real neurons is highly abstracted when modelling artificial neurons. These basically consist of inputs (like synapses), which are multiplied by weights (strength of the respective signals), and then computed by a mathematical function which determines the activation of the neuron. Another function computes the output of the artificial neuron . An ANN is implemented by a system of interconnected nodes. Information propagates through nodes transforming the inputs in intermediate derived signals up to generate the final outputs. The internal nodes are called neurons and define the ANN hidden layers. Each node is a processing element propagating weighted inputs received from other nodes (Pradhan, 2009) (Fig. 5.4).
The higher the weight of an artificial neuron, the stronger the input which is multiplied by it will be. Weights can also be negative, thus we can say that the signal is inhibited by the negative weight. Depending on the weights, the computation of the neuron will be different. By adjusting the weights of an artificial neuron we can obtain the output we want for specific inputs.
Depending on the specific ANN architecture, the inputs of a given node may include or exclusively be constituted by intermediate derived signals. The learning process comes from adjusting the weights between neurons analysing the error between the predicted and target output. The output of a neural network, after the training, is a model that is capable of predicting a target value from an input dataset (Lee, 2007).
A useful class of ANN applications (Bosco et al., 2013;de Rigo et al., 2001;Secomandi, 2000) is based on the multilayer feedforward networks (multilayer perceptrons) due to their universal approximation properties (Kreinovich, 1991;Hornik et al., 1989). A feedforward neural network is an ANN having connections between the different units not forming a cycle or loop. In this architecture information moves in only one direction, from the input nodes, through the hidden layers (if any) to the output nodes. The main reason for the use of this type of ANN is the simplicity of its theory, ease of programming and good results (Fig. 5.5 shows the scheme of this kind of ANN). Fig. 5.4 -Artificial neuron model. Inputs to the network are represented with the symbol x n , each of these inputs are multiplied by a connection weight w n , summed and fed through the transfer function f() to generate a result and the output. This configuration is actually called a Perceptron. The perceptron (an invention of Rosenblatt (1962)), was one of the earliest neural network models. Each of the processing neurons calculates the weighted sum of all interconnected signals from the previous layer plus a bias term and then produce an output through the activation function. The effective incoming signal to node j is: The activation function associating individual nodes have typically a sigmoid shape (Fig. 5.6). The sigmoid function most often used for ANNs is the logistic function: in which can vary in the range ±∞ but y is bounded between 0 and 1. Other transfer functions can also be applied. The adjustment of the ANN function to experimental data (training of the network) is based on a non-linear regression procedure (Fraser, 2000). Random weights are assigned to each neuron, the output of the network is evaluated and the error between the output of the network and the training dataset is calculated. If the error is large, the weights are adjusted and the process goes back to evaluate the network's output. This cycle is repeated until the error is small or a stop criterion is satisfied. During the training of a neural network, the prediction error is evaluated at each iteration. The use of an ANN with too many neurons allows an excess of degrees of freedom. This can cause overfitting of the data. A test dataset can be separated and used to check how good the prediction capacity of the ANN is, on the basis of the sum of squared prediction errors. For obtaining the optimal degree of training it is possible to explore the ANN performance in order to minimize the sum of the training plus validation (or cross-validation) errors ( Figure 5.7), whilst ensuring that the training process is not stopped at the first point of minimum. Training should be allowed to proceed further to check whether or not it is a point of local minimum, since a local minimum can be found. Within the present study we explored the use of feed-forward neural networks through the package AMORE (A MORE flexible neural network) (Castejón Limas et al., 2014) in GNU R and through the package Neural Network (Schmid, 2009) available in GNU Octave (Eaton, 2008). The functions in these packages allow to develop the most common type of neural network model (the feed-forward multi-layer perceptron). The functions have enough flexibility to allow the user to develop the best or most optimal models by varying parameters during the training process.
In the package AMORE, the user can select the number of layers and the number of neurons in each layer, while controlling several parameters. These include the learning rate at which every neuron is trained, the momentum for every neuron, the error criterion (least mean squares, least mean logarithm squares or TAO), the activation function of the hidden and the output layer (Purelin, Tansig, Sigmoid, or Hardlim), and the training method (Adaptive gradient descent or BATCH gradient descent, with or without momentum). The same principle applies for the package Neural Network where the user can select the number of layers and neurons in each layer and the activation function of the hidden and the output layers (Purelin, Tansig, logsig) to apply with the training algorithm (Levenberg-Marquardt). With these parameters selected, the algorithm trains the network with the manually or randomly selected samples before testing it with the rest of the samples.
Feed-forward neural networks provide a flexible way also to generalize linear regression functions. They are non-linear regression models but with many parameters, such that they are extremely flexible-flexible enough to approximate any smooth function. (Venables and Ripley, 2002).

Selection of geospatial covariate layers
For obtaining a more appropriate combination of covariates to produce high-resolution prediction maps for each of the modelled indicators, a sensitivity analysis using a jackknife approach was carried out (Fig.1). The jackknife analysis consists of dropping one observation at a time from one set of data, and calculating the estimate each time. It was developed by Maurice Quenouille, (1949, 1956 and John Tukey (1958) expanded on the technique and proposed the name "jackknife". Fig. 1 -Results of the jackknife analysis implemented in MATLAB language in GNU Octave for selecting the final subset of covariates of an artificial neural network. The x-axis shows the subset of covariates where the i th covariate was removed. The y-axis is the score obtained by comparing the MSE of all the subset of covariates in each of the 400 runs of the jackknife loop. The two bars represent the results with one and two neurons, respectively. In this example, the 12 th covariate is a good candidate to be removed both in the ANN with one and two neurons.
Within the modelling architectures, categorical covariates with more than two levels were recoded into a number of separate dichotomous variables in order for the results to be interpretable. All covariates were also normalized to make all variables have a mean of zero and unit variance.

Semantic Array programming
Managing heterogeneous arrays of data and data transformation models in a systematic and structured way is a challenging task. The multiplicity of model families, covariates and modelled quantities in this work required the support of a common, flexible and scalable modelling architecture. The applied modelling architecture is based on the Semantic Array Programming (SemAP) paradigm (de Rigo, 2012;2015). Array programming (AP) emerged as a way to reduce the gap between mathematical notation and algorithm implementations by promoting arrays (vectors, matrices, tensors) as atomic quantities with compact manipulating operators (Iverson, 1980). Atomicity here implies that even a large array of data is managed as a single logical piece of information. For example, a regional-scale gridded layer may be managed by AP languages as if it were a single variable instead of a large matrix of elements.A disciplined use of AP (Iverson, 1980) may allow nontrivial data-processing to be expressed with very concise expressions (Taylor, 2003) and a potentially simpler control flow. However, this capability for abstraction and simplification of AP may be limited by the very same generality of AP data structures-multi-dimensional arrays where the value of some elements may be infinite or not-a-number (IEEE 754 standard) or even complex-valued (de Rigo, 2015). The Semantic Array Programming paradigm has been introduced for supporting a disciplined semantics-aware implementation of AP concepts and methods, with additional systematic semantic checks for the semantic correctness of the chain of modelling blocks (de Rigo, 2012). This is why our computational modelling methodology follow the SemAP paradigm by combining concise implementation of the model with its conceptual subdivision in semantically enhanced abstract modules.
Two additional design concepts (de RIgo, 2012) define the SemAP paradigm as supported by the Mastrave library: 1. The modularisation of sub-models (and more generally of any autonomous task which displays a nontrivial extent of complexity) with a systematic effort toward their most concise generalization and reusability in other contexts. 2. The use of semantic constraints (based on the mathematics of arrays and defined following the physical/conceptual meaning of the modelled quantities) with array oriented -thus concise-invariants applied to the information entered in and returned by each module. Even this design pattern is applied systematically within each SemAP module instead of relying on external assumptions (which might prove more fragile in case of, for example, subsequent refactoring of a certain chain of data transformations where key external checks are inadvertently removed).
Modularization also expects consistent code self-documentation and uniform predictable conventions for module interfaces (without directly interfering with the preferred module's implementation). Semantic constraints contribute to enforce, within each module, autonomous distributed consistency checks instead of assuming top-down correctness of input information (e.g. instead of relying on object-oriented "monolithically-designed-to-besafe" data).

Results
This section presents the results for the gender-disaggregated indicator mapping addressed in this project. We organize the presentation of results by indicator, at gender disaggregated level, in the following order: literacy, stunting in children, use of modern contraception methods. For each indicator, the results of a first exploratory analysis are presented with gender disaggregated histograms showing the basic statistical distribution of the indicator at cluster level and a scatter plot of the predicted versus observed data both in training and validation. We then present the results of the covariate selection exercise, detailing which covariates were selected as the optimum performing set for the given indicator for each country at gender disaggregated level and, for each indicator having an associated modelling explained variance higher than 0.3, we show maps of the survey clusters and the indicator value at each cluster, maps of the predicted proportion of modelled indicators and the level of uncertainty associated with these maps in each pixel, and finally the quantile-quantile (Q-Q) plot in training and validation. The maps reported in the following paragraphs are: male and female literacy rate in Nigeria and Kenya, female literacy rate in Tanzania, male and female stunting in Nigeria and the proportion of women using modern contraception methods in Nigeria and Tanzania.   showing the associated interdecile range. Figure 13. Gender disaggregated histograms (top row) related to the distribution of a 70% subset of DHS data (training) and scatter plot of the predicted (y-axis) and observed (x-axis) proportion of stunting for children under the age of 5, in the training (middle row) and validation (bottom row).   . Q-Q plot of the predicted (y-axis) and observed (x-axis) proportion of women using modern contraception methods in the training (left) and validation (right) dataset. Figure 17. Map of the cluster-level survey data (top row), for the indicator on use of modern contraception methods in female age 15-49. Maps of the mean predicted proportion of women using modern contraception methods at 1 km 2 resolution (middle row) and related uncertainty maps (bottom row) showing the associated standard deviation. . Q-Q plot of the predicted (y-axis) and observed (x-axis) proportion of women using modern contraception methods in the training (left) and validation (right) dataset. Figure 19. Map of the cluster-level survey data (top row), for the indicator on use of modern contraception methods in females age 15-49. Maps of the mean predicted proportion of women using modern contraception methods at 1 km 2 resolution (middle row) and related uncertainty maps (bottom row) showing the associated standard deviation.