Geospatial Data & Tools for Analysis
More location-specific data is available today than ever before, but organizations often don't have the means to use it. AidData's GEO program breaks down technological barriers and empowers a broad range of data users to generate analysis and insights with next-generation geospatial data, methods and tools.
Our spatial data integration and extraction infrastructure is available for free public use through GeoQuery, a powerful platform that enables users without significant computing power or technical expertise to easily find, access and merge together spatial data from a wide variety of sources.
The GEO program also creates innovative measures of development outcomes that can be observed at high frequency and spatial precision; provides the largest and most up-to-date open-source repository of geographic boundaries; and develops new geostatistical analysis and machine learning-based methods that address the triple challenges of spatial measurement imprecision, spatial spillovers and spatially heterogeneous impacts.
GeoQuery enables individuals and organizations without significant computing power or data science expertise to freely find and aggregate satellite, economic, health, conflict, and other spatial data from anywhere in the world into a single, simple-to-use file.
GeoBoundaries provides accurate data on the geographic boundaries of administrative areas around the globe. Unlike other boundary datasets, GeoBoundaries is an open product: all boundaries are free and redistributable, and are released with extensive metadata and license information to inform users.
GeoSIMEX is a geospatially adapted SIMEX (simulation and extrapolation) model available as an R package. It helps researchers establish the relationship between measurement error and the covariate bias introduced by geospatial uncertainty, in order to estimate the impact of an intervention.
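To give a feel for the simulation and extrapolation idea, here is a minimal sketch of generic SIMEX on a toy regression, written in Python rather than R. It is not the GeoSIMEX package or its geospatial adaptation. The data are simulated, and the true slope, the known error standard deviation, and the function names are all assumptions made for illustration.

```python
import random

random.seed(0)

TRUE_SLOPE = 2.0   # assumed "true" effect in this toy example
ERROR_SD = 0.5     # assumed known standard deviation of the measurement error
N = 5000           # number of simulated observations
B = 20             # number of re-simulations per extra-noise level


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Simulated data: x is the true covariate, w is the error-prone version we observe.
x = [random.gauss(0, 1) for _ in range(N)]
y = [TRUE_SLOPE * xi + random.gauss(0, 1) for xi in x]
w = [xi + random.gauss(0, ERROR_SD) for xi in x]


def slope_with_extra_noise(lam):
    """Average slope after adding extra error with variance lam * ERROR_SD**2."""
    if lam == 0:
        return ols_slope(w, y)
    extra_sd = (lam ** 0.5) * ERROR_SD
    slopes = []
    for _ in range(B):
        noisy = [wi + random.gauss(0, extra_sd) for wi in w]
        slopes.append(ols_slope(noisy, y))
    return sum(slopes) / B


# Simulation step: estimate the slope at increasing levels of added error.
f0, f1, f2 = (slope_with_extra_noise(lam) for lam in (0, 1, 2))
naive_slope = f0

# Extrapolation step: evaluate the quadratic through (0, f0), (1, f1), (2, f2)
# at lam = -1, the hypothetical level at which there is no measurement error.
simex_slope = 3 * f0 - 3 * f1 + f2

print("naive slope:", round(naive_slope, 3))
print("SIMEX slope:", round(simex_slope, 3))
```

In this toy setup the naive slope is pulled toward zero by the measurement error, and the extrapolated estimate should land closer to the assumed true slope. A quadratic extrapolation reduces the bias but does not remove it entirely.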
GeoMatch is an R package that helps researchers overcome a major barrier to causal inference: treatment spillover to (nearby) control units. An example of this challenge is when a clinic may not only improve health outcomes in the neighborhood where it is located, but also in nearby areas. Failing to adjust for these types of treatment spillovers can result in erroneous estimates of causal impact.
Recent work has demonstrated that machine learning methods (which identify features in satellite imagery that correlate with “ground truth” data) can be used to fill in major gaps in household surveys. AidData uses these methods to generate high-precision, high-frequency measures of development outcomes, and applies them in real-world settings (e.g. monitoring and evaluation of aid programs).
A revolutionary new tool to spur open access to usable geospatial data
Find quality assured datasets curated by experts
Filter and join datasets without using code
Data is exported to a clean CSV with predictable naming conventions
Supporting documentation includes metadata
Access a permanent link for each data extraction request
Watch a short demonstration or go right to GeoQuery.
How GeoQuery works
GeoQuery performs advanced spatial statistics to extract data from open-source datasets on topics such as:
- International Aid
- Population and the Environment
- Conflict and Health
- Economic Development
- Access to Infrastructure
Every data request you make returns an email with a single spreadsheet file (CSV), where each row is a geographic boundary and each column is a requested dataset. This file can be read by nearly all software packages, and we also include a full PDF of metadata. All requests are made accessible at a unique, permanent URL to promote data sharing. See our Quick Start Guide.
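As a rough illustration of working with a file in this shape (one row per geographic boundary, one column per requested dataset), the Python sketch below parses a small inline sample. The boundary names, column names and values are invented for the example and are not real GeoQuery output; actual column names are described in the metadata that accompanies each request.

```python
import csv
import io

# Hypothetical extract: these boundary names, column names and values are
# invented for illustration and do not come from GeoQuery.
SAMPLE_CSV = """boundary_id,boundary_name,population_2015_sum,nightlights_2013_mean
1,Region A,120000,3.2
2,Region B,45000,0.8
3,Region C,310000,7.5
"""


def load_extract(text):
    """Parse a one-row-per-boundary CSV into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def column_as_floats(rows, column):
    """Return one requested dataset (one column) as floats keyed by boundary name."""
    return {row["boundary_name"]: float(row[column]) for row in rows}


rows = load_extract(SAMPLE_CSV)
population = column_as_floats(rows, "population_2015_sum")
total_population = sum(population.values())

print(len(rows), "boundaries; total population:", total_population)
```

For a downloaded file, the same functions would apply after reading the file's text from disk, with the column names swapped for the ones in that extract.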
Visit our list of GeoQuery-related research publications to explore the technical details. Additional options for advanced users:
- Boundary Data: Get the geographic and administrative boundary data (GeoBoundaries) used in GeoQuery. Fully open and redistributable, with metadata and license information available for every country in the world.
- Measurement Datasets: Access the raw raster and vector data used to generate data in GeoQuery, including development project data, open-source satellite data, household survey data, and other information.
Contact firstname.lastname@example.org for more information on GeoBoundaries and raw raster and vector data.
A convolutional neural network approach to predict non‐permissive environments from moderate‐resolution imagery
Seth Goodman, Ariel BenYishay, Daniel Runfola
Impacts of a large-scale titling initiative on deforestation in the Brazilian Amazon
Benedict Probst, Ariel BenYishay, Andreas Kontoleon, Tiago N. P. dos Reis
Exploring the Socioeconomic Co-benefits of Global Environment Facility Projects in Uganda Using a Quasi-Experimental Geospatial Interpolation (QGI) Approach
Daniel Runfola, Geeta Batra, Anupam Anand, Audrey Way, Seth Goodman
GeoQuery: Integrating HPC systems and public web-based geospatial data tools
Seth Goodman, Ariel BenYishay, Zhonghui Lv, Daniel Runfola
Featured Blog Posts
The largest update yet to GeoQuery, AidData’s free spatial data platform
New data on Chinese development finance, and data at monthly intervals for frequently-used datasets.
Using Facebook’s Social Connectedness Index to study social networks’ impact on the diffusion of agricultural technology in sub-Saharan Africa
AidData found that the use of improved seeds was higher in regions more socially connected to other regions that had previously adopted improved seeds.