New insights into geospatial data at NYC Open Data Week 2021

How can innovative sources and applications of open geospatial data be used to empower insights at a global scale?

March 24, 2021
Soren Patterson

Demand for geospatial (or location-based) data and analytics is exploding: estimates vary, but the market is projected to reach anywhere from $125 billion to $300 billion by 2025.

Geospatial data is driving a diversity of human activities and industries, from retail and shipping to international development and local government. But with terabytes upon terabytes of data being unlocked from a dizzying array of sources, how can individuals and organizations seeking to employ geospatial data and tools possibly drink from this firehose of information?

Earlier this month, we provided some more manageable sips from that firehose. On March 9th, AidData hosted an event as part of NYC Open Data Week 2021. The workshop, Empowering Insights with Open Geospatial Data, gave attendees an opportunity to learn how they could use innovative sources and applications of open geospatial data to empower insights at a global scale. More than 75 individuals attended from a variety of government, nonprofit and commercial organizations. The event recording and the presentation slides have now been made available online.

Members of AidData and the geoLab, an affiliated research lab at William & Mary, including myself, Katherine Walsh, and Sydney Fuhrig, discussed three sources of data aimed at empowering users to better leverage geospatial information: (1) GeoQuery, AidData’s free platform for creating customized geospatial datasets, (2) AidData’s ongoing work tracking the hundred-billion-dollar scope of China’s global development footprint, and (3) the geoLab’s geoBoundaries project, the world's largest open, free and research-ready database of political administrative boundaries. Seth Goodman, a data engineer at AidData and developer of GeoQuery, moderated the event.

We began with GeoQuery, an AidData initiative to make geospatial data more accessible and useful for those who may have no experience with it. GeoQuery.org is a completely free online platform that anyone can use to create customized datasets for the geographic areas and topics that interest them.

GeoQuery saves time and lowers the barrier to working with geospatial data by handling the processing of large-scale geospatial datasets and providing the results in a simple, easy-to-use format. Behind the scenes, it uses a high-performance computing cluster (essentially, a supercomputer) to process terabytes of data, allowing anyone to find and aggregate dozens of datasets into a single spreadsheet. Users can select from over 70 satellite, economic, health, conflict, and other datasets that span decades and are available for over 195 countries and territories.

I provided an introduction to how GeoQuery works and the datasets available for the entire world, as well as a demo of how GeoQuery can be used to visualize data and produce meaningful insights into geospatial trends. The platform is and always will be free as a public good, thanks to the generous support of our partner, the Cloudera Foundation, as well as foundational investments from William & Mary, USAID and the William & Flora Hewlett Foundation.

We’ve also recently produced a short video series that explores GeoQuery and its features for new users. We hope to reach a wider range of users, from large international organizations to small grassroots nonprofits, who are curious about the potential for geospatial data, but may not be familiar with how to use it or how it can help their work.

One of the most popular datasets in GeoQuery is AidData’s geocoded global Chinese official finance dataset. China does not itself publish a country-by-country breakdown of its international official finance activities, nor does it make public project-level data. Katherine Walsh, program manager of AidData’s Transparent Development Footprints team, shared how AidData has worked to assemble the most comprehensive and detailed source yet of project-level information on China’s global development footprint, covering 4,300 projects worth $350 billion between 2000 and 2014. She also previewed some changes in the types of information that will be available when AidData releases the next version of the dataset later this year, which extends coverage to 2017.

Sydney Fuhrig, managing director of geoBoundaries, shared how her team currently tracks approximately 300,000 boundaries across 199 countries and territories to produce the world’s largest online, open-source dataset of different levels of administrative and political boundaries (e.g., state, county, district). The geoBoundaries data are fully available in AidData’s GeoQuery tool, so users can extract the data they are interested in for any of these boundaries.

All boundaries in the geoBoundaries Global Database of Political Administrative Boundaries are also available to view or download directly in common file formats, including shapefiles (a file format that can be used in GIS software). Attendees had the opportunity to learn how data for geoBoundaries is collected and made freely available, as well as how geoBoundaries is being used in numerous applications.

If you weren’t able to make it to the event, check out the recording and presentation slides. And if you would be interested in a follow-up webinar, please fill out this form. Finally, reach out with questions about GeoQuery and AidData’s research on China to info@aiddata.org and with questions about geoBoundaries to team@geoboundaries.org.

Soren Patterson is AidData's Communications Specialist.