FAQs about our Global Chinese Official Finance Dataset, Version 1.0

1. What types of Chinese overseas investments do you track in your dataset?

The Global Chinese Official Finance Dataset, Version 1.0 covers the known universe of projects officially financed by China in five major regions of the world from 2000 to 2014. The dataset includes both Chinese official development assistance (ODA) and other official flows (OOF) from the Chinese government to other countries with developmental, commercial, or representational intent.

Chinese ODA represents “Chinese aid” in the strictest sense of the term, but Chinese official finance (ODA + other official flows) is sometimes used as a broader definition of aid. AidData’s dataset allows users to disaggregate Chinese official finance into its constituent parts to examine Chinese assistance using either a narrow or broad definition of aid.

Export credit projects are coded as other official financing (OOF) flows that have commercial intent. China Export-Import Bank and China Development Bank are the two major providers of export credits for buyers and suppliers.

In addition to ODA and OOF, the dataset also includes one additional category, "Vague Official Finance," which we assign to any flow that represents official financing, but for which we have insufficient information. For example, vague official finance would include projects for which we do not know the level of concessionality to determine whether the project should be classified as ODA-like or OOF-like. This category has been created by AidData to be fully transparent about the uncertainty and imprecision in our data collection efforts.
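The narrow and broad definitions of aid described above can be illustrated with a simple tally. This is a minimal sketch in Python, not an official AidData tool: the `flow_class` field and its three values come from the dataset's field list, while the `amount_usd` field name and the sample figures are hypothetical placeholders. Whether to count "Vague (OF)" records in the broad total is left to the analyst; the sketch reports the total both ways.

```python
from collections import defaultdict

# Hypothetical sample records; amounts are invented for illustration only.
records = [
    {"flow_class": "ODA-like", "amount_usd": 100.0},
    {"flow_class": "OOF-like", "amount_usd": 250.0},
    {"flow_class": "Vague (OF)", "amount_usd": 50.0},
    {"flow_class": "ODA-like", "amount_usd": 25.0},
]

def totals_by_flow_class(rows):
    """Sum the (hypothetical) amount field within each flow_class."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["flow_class"]] += row["amount_usd"]
    return dict(totals)

totals = totals_by_flow_class(records)

# Narrow definition: ODA-like flows only.
narrow_aid = totals.get("ODA-like", 0.0)

# Broad definition: ODA-like plus OOF-like flows.
broad_aid = narrow_aid + totals.get("OOF-like", 0.0)

# All official finance, including records of uncertain class.
all_official_finance = sum(totals.values())
```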

For more information on how AidData and the OECD define these terms, see the glossary and how to use our data.

We do not systematically capture "unofficial" financing such as Joint Ventures, Foreign Direct Investment, military assistance, or corporate aid.

2. Which geographic areas and time periods does your China dataset cover?

AidData’s Global Chinese Official Finance Dataset, Version 1.0 tracks official financing to 140 countries and territories between 2000 and 2014. The dataset includes official finance investments in five major regions of the world: Africa, the Middle East, Asia and the Pacific, Latin America and the Caribbean, and Central and Eastern Europe.

The global dataset builds upon AidData’s previous Chinese Official Finance to Africa datasets (versions 1.0, 1.1, 1.1.1, and 1.2), which covered 2000-2013, collectively. The global dataset expands upon this initial effort by capturing Chinese investments in regions beyond Africa and adding activity from 2014.

The data collection process covered all low- and middle-income countries and territories—125 of which yielded at least one project during the time period. The dataset also includes some flows to 15 high-income countries in the specified regions. In the final version of the dataset, 140 countries and territories were found to have received at least some funding from China.

While we included North Korea in our data collection efforts, we caution users that this data is likely to be under-reporting actual official financing activity, given the secrecy of these financial flows. Users are advised to exclude North Korea from any statistical analysis or the generation of estimates of total Chinese official development assistance (ODA) or other official flows (OOF) to specific countries. More information on how to use our data is available here.

Significant time and effort are required to standardize and synthesize large volumes of structured and unstructured open-source information. Therefore, our dataset is currently reported with a time lag of 2-3 years. Depending on the availability of resources, we would like to expand this time series to 2017.

3. What information do you collect on Chinese official finance?

Our unit of analysis is generically referred to as a "project." Broadly defined, a "project" is a discrete transfer of goods, services or cash. Apart from discrete projects, the dataset also captures: individual activities that are subsets of larger projects; cash payments; economic and technical agreements; and MOUs for technical and economic cooperation.

We have differentiated records that refer to "mega deals" or conglomerates of projects/flows by marking them as "umbrella" projects. This is to reduce the chance of double counting across separate records; we generally recommend that users exclude umbrella projects from any estimates that they generate of aggregate amounts of Chinese official financing.

We believe that if you want to know what providers of official finance are really doing, you have to "follow the money"—that is to say, you must follow projects through their entire life cycle. By tracing the progress of projects over time and triangulating a wide variety of sources (all of which are posted on the individual project pages on China.AidData.org), we categorize each record in our database as either a pledge, an official commitment, a project in implementation, a completed project, or a suspended/cancelled project.

A pledge is a "verbal, informal agreement" between the development partner and partner country. We do not include pledges in our reporting of aggregate financial amounts of Chinese official financing, because there is no concrete evidence that these pledges have progressed. We advise all users to likewise exclude pledges from their analyses.
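The two exclusions recommended above (umbrella projects and pledges) can be sketched as a single filter applied before summing amounts. This is an illustrative sketch only: `status` and `umbrella` are fields in the dataset, but the boolean encoding of `umbrella`, the `amount_usd` field name, and the sample records are assumptions made for this example.

```python
# Hypothetical sample records. `status` and `umbrella` are dataset fields;
# `amount_usd` and the True/False encoding of `umbrella` are assumptions.
records = [
    {"project_title": "Road A", "status": "completion",
     "umbrella": False, "amount_usd": 80.0},
    {"project_title": "Framework deal", "status": "pipeline: commitment",
     "umbrella": True, "amount_usd": 500.0},
    {"project_title": "Hospital B", "status": "pipeline: pledge",
     "umbrella": False, "amount_usd": 40.0},
    {"project_title": "Port C", "status": "implementation",
     "umbrella": False, "amount_usd": 120.0},
]

def counts_toward_totals(row):
    """Return True if a record should be included in aggregate amounts."""
    if row["umbrella"]:
        # Umbrella records may duplicate smaller projects recorded elsewhere.
        return False
    if row["status"].strip().lower() == "pipeline: pledge":
        # Pledges carry no concrete evidence of progress.
        return False
    return True

aggregate = sum(r["amount_usd"] for r in records if counts_toward_totals(r))
```

In this sample only "Road A" and "Port C" contribute to the aggregate.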

An official commitment is a firm obligation, expressed in writing and backed by the necessary funds, undertaken by an official donor to provide specified assistance to a recipient country or a multilateral organization.

We also include project cancellation and suspension data. A central question for many aid analysts is the conditions under which either donors or recipients choose not to follow through on their commitments. In order to identify these conditions, one would ideally have data on the cases where the donor "changed its mind" or chose to suspend/cancel a project. Additionally, suspended or canceled projects can be used as evidence to understand changing relations between the Chinese government and the recipient.

In total, we track 63 variables for each project record in the dataset. Complete definitions for each dataset field are included in the readme file that accompanies the Version 1.0 download, and more detailed information is available in the methodology document. The most important variables include:

  • project_title - title of the project, as created by our researchers.
  • year - year in which the project was agreed upon between the donor and recipient (i.e. year committed). In the case of pledges, this is the year the project was pledged (e.g. announced).
  • recipients_all - the partner country(ies) receiving the financing or technical assistance.
  • status - the current state of the project. Options include: pipeline: pledge; pipeline: commitment; implementation; completion; suspended; and cancelled.
  • intent - the donor’s intent for the project. Options cover the following categories:  Development, Commercial, Representational, and Mixed (some development, no development, uncertain).
  • flow_type - the modality by which the project is conducted. Options include: grants; technical assistance; interest-free loans; in-kind contributions of goods and services; debt forgiveness; debt rescheduling; export credits; loan guarantees; scholarships; and strategic/supplier credits.
  • flow_class - a categorization of the flow according to the level of concessionality, aligned with definitions as established by the OECD-DAC. Options include: ODA-like, OOF-like, and Vague (OF).
  • umbrella - identifies mega-projects that likely comprise smaller projects captured elsewhere in the dataset.

We do not track disbursements, because this would require information on yearly aid disbursements that is not usually provided in open source documents. However, by using the "status" variable, users may classify projects that have been implemented or completed as disbursements of Chinese official financing; during the quality assurance phase, our team strives to update the status variable for each project record to accurately reflect the most up-to-date status of the project.
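The status-based proxy for disbursement described above can be sketched as a small classifier. The status labels are taken from the field list in this FAQ; the function name and the case-insensitive matching are assumptions for illustration, and the exact spelling of status values should be checked against the downloaded file.

```python
# Status values that this sketch treats as evidence of disbursement.
DISBURSEMENT_PROXY_STATUSES = {"implementation", "completion"}

def is_disbursement_proxy(status):
    """Return True for projects that have been implemented or completed."""
    return status.strip().lower() in DISBURSEMENT_PROXY_STATUSES

# The six status options listed for the `status` field.
statuses = [
    "pipeline: pledge",
    "pipeline: commitment",
    "implementation",
    "completion",
    "suspended",
    "cancelled",
]

flags = {s: is_disbursement_proxy(s) for s in statuses}
```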

Please see our methodology document for more information.

4. How do you collect your official finance data? How do you ensure the data is reliable?

AidData uses an innovative, open source data collection methodology called Tracking Underreported Financial Flows (TUFF) to capture the known universe of Chinese official financing flows at the project level from 2000 to 2014.

The TUFF methodology is a transparent, systematic, and replicable set of procedures for standardizing and synthesizing information from four types of sources:     

  1. English and Chinese language news reports                   
  2. Chinese ministries, embassies, and economic and commercial counselor offices
  3. Aid and debt information management systems of finance and planning ministries in counterpart countries
  4. Case study and field research undertaken by scholars and NGOs

When conflicting information is reported by different sources, we prioritize the official sources or the information that most sources report. For the purposes of financial reporting, when all of our sources are media reports and a majority of the available sources do not agree upon a consistent number, we default to the lowest financial estimate to err on the side of caution.
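One possible reading of the reconciliation rule above is sketched below. It is not the TUFF codebook itself: the source-type labels ("official", "media") and the function name are hypothetical, and the FAQ does not say how disagreements among official sources are resolved, so the sketch applies the same conservative lowest-value fallback there as an explicit assumption.

```python
from collections import Counter

def resolve_amount(reports):
    """Pick one financial amount from conflicting source reports.

    `reports` is a list of (source_type, amount) tuples, where source_type
    is "official" or "media" (hypothetical labels for this sketch).
    """
    if not reports:
        return None
    # Prioritize official sources whenever any are available.
    official = [amount for kind, amount in reports if kind == "official"]
    pool = official if official else [amount for _, amount in reports]
    # Use the value that a strict majority of the pooled sources report.
    value, count = Counter(pool).most_common(1)[0]
    if count > len(pool) / 2:
        return value
    # No majority: default to the lowest estimate to err on the side of caution.
    return min(pool)

majority_case = resolve_amount([("media", 100), ("media", 100), ("media", 300)])
no_majority_case = resolve_amount([("media", 100), ("media", 200), ("media", 300)])
official_case = resolve_amount([("official", 500), ("media", 100), ("media", 100)])
```

In the third call the single official figure is preferred even though two media reports agree on a different number.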

The TUFF methodology has been stress-tested, refined, codified, and subjected to scientific peer-review, resulting in dozens of working papers and journal publications. The use of TUFF-derived data on Chinese development finance has also resulted in more than 90 stories in elite and mass media outlets, including articles in The Guardian, The Economist, and the Financial Times.

A recent publication in the Journal of Development Studies by Muchapondwa et al. (2016) also found that field-based and TUFF-based data collection methods produce generally very similar data. However, field-based data collection is prohibitively costly and complex if one is trying to achieve comprehensive, global coverage of China’s official financing activities. Nor is it sustainable over the long run.

We have made several improvements to our TUFF methodology with this iteration:

  • Reduced reliance on media sources: The latest version of the dataset relies on media reports for only 56% of all sources -- this is down from 89% in the original Chinese Official Finance to Africa dataset.
  • Increased use of official and academic sources: Official government data and documentation from China, counterpart countries, and international organizations now constitute 27.6% of all sources (up from 21% in the 1.0 version of the dataset). Peer-reviewed journal articles and other academic publications represent 6.8% of all sources (up from 1% in the 1.0 version of the dataset).
  • Expanded number of sources for each project: There has also been an increase in the average number of sources that underpin each project record -- from 2.13 sources (in the 1.0 dataset version) to 3.6 sources (in the latest version of the dataset).
  • More complete information records: The average project record completeness score has increased from 6.09 (in the 1.0 dataset version) to 6.37 (in the current version of the dataset). This means that an increasing number of the core fields (e.g. transaction amounts, flow types, and commitment years) for each project record are populated.

To assemble AidData’s Global Chinese Official Finance Dataset, Version 1.0, we have collected project-level information from over 15,000 distinct information sources. On average, each project entry is informed by 3.6 sources. Although 24% of project records are based on information from a single source, they represent only 6% of China’s total financial commitments globally. For each project, we include a "Health of Record" score that rates its completeness and verifiability.
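The distinction above between the share of records and the share of financial value that rest on a single source can be computed along these lines. This is a sketch under stated assumptions: the `source_count` and `amount_usd` field names and the sample values are hypothetical and are not taken from the dataset's readme.

```python
def single_source_shares(rows):
    """Return (share of records, share of total value) from single-source records.

    Assumes each row has hypothetical `source_count` and `amount_usd` fields.
    """
    if not rows:
        raise ValueError("no records supplied")
    total_value = sum(r["amount_usd"] for r in rows)
    if total_value == 0:
        raise ValueError("total value is zero; value share is undefined")
    single = [r for r in rows if r["source_count"] == 1]
    record_share = len(single) / len(rows)
    value_share = sum(r["amount_usd"] for r in single) / total_value
    return record_share, value_share

# Invented sample: one of four records has a single source,
# and it accounts for a small part of the total value.
sample = [
    {"source_count": 1, "amount_usd": 10.0},
    {"source_count": 3, "amount_usd": 40.0},
    {"source_count": 5, "amount_usd": 100.0},
    {"source_count": 4, "amount_usd": 50.0},
]

record_share, value_share = single_source_shares(sample)
```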

Please see our complete methodology document for more information.

5. Is the data available on China.AidData.org the same as what is included in the static data?

Currently, only the geocoded data from our Chinese Official Finance to Africa (version 1.1.1) and Chinese Official Finance in Three Ecologically Sensitive Areas (version 1.0.1) datasets are available in the China.AidData.org online database with subnational location information.

All location information was collected from the same source documentation used to create project records. Media reports, government documents, and academic articles can provide very detailed information on the sub-national location of projects. AidData’s geocoding methodology, which has also been adopted as a global reporting standard by the International Aid Transparency Initiative, has been applied to more than 1,900 Chinese development projects in our China.AidData.org database.

6. How should I cite your dataset? Are there any licensing restrictions?

AidData makes our Global Chinese Official Finance Dataset, Version 1.0 available as a public good, so that it can be used and re-used widely. We only ask that you appropriately attribute the use of our data using the citations below.

We offer two citation formats for users of our Global Chinese Official Finance Dataset, Version 1.0, depending upon how the data is being used (see below).

For academic purposes, please cite the dataset as:

Dreher, Axel, Andreas Fuchs, Bradley Parks, Austin M. Strange, and Michael J. Tierney. 2017. Aid, China, and Growth: Evidence from a New Global Development Finance Dataset. AidData Working Paper #46. Williamsburg, VA: AidData.

For non-academic purposes, please cite the dataset as:

AidData. 2017. Global Chinese Official Finance Dataset, Version 1.0. Retrieved from http://aiddata.org/data/chinese-global-official-finance-dataset

To cite AidData's other data releases on China, please view the citation information separately available along with each of the datasets:

Chinese Official Finance in Africa, Versions, and 1.2

Chinese Official Finance in Ecologically Sensitive Areas, Version 1.0.