Exploring research questions on Gapminder

To improve my data science skills, I enrolled in the Data Management and Visualization course offered by Wesleyan University through Coursera and taught by Lisa Dierker, Professor of Psychology. In this series of blog posts, I relate my experience and provide updates on my progress in this certification program.

In this first post, I develop a research question to be addressed in the upcoming course sections. As a public health scientist, I am interested in factors that influence global health, welfare, climate, and development. For this reason, I chose to work on the Gapminder dataset provided by the instructor. Gapminder is an organization that compiles data from a handful of sources, including the Institute for Health Metrics and Evaluation, the US Census Bureau’s International Database, the United Nations Statistics Division, and the World Bank.

The Gapminder data file provided by the course instructor is a compilation of several datasets downloaded from the organization's website. It covers one year of data for numerous country-level indicators of health, wealth, and development. Countries serve as the unique identifiers.

Figure 1: Compiled Gapminder dataset provided for the Data Management and Visualization course on Coursera.

After looking at the codebook, I decided to study factors associated with HIV, breast cancer, and suicide. For now, my own codebook includes all variables pertaining to wealth, development, policy, and personal behavior: income per person, alcohol consumption, CO2 emissions, female employment rate, Internet use rate, life expectancy, oil consumption per capita, polity score, residential electricity consumption per person, and urban population.

I then reviewed the literature on Google and Google Scholar using the search terms "development + health + issues", "economic + development + health issues", "economic development factors", "economic development determinants", and "human development index". This review showed that several variables in the codebook are related to development issues.

Figure 2: Diagram of Human Development Dimensions. Credit: The United Nations Development Programme.

The United Nations Development Programme's (2015) perspective on human development embraces several factors, such as life expectancy, education, standard of living, political environment, environmental sustainability, human rights and security, and gender equality.

Empirical studies of economic growth across countries have highlighted the correlation between growth and a variety of variables such as natural resources, capital, labor, power, transport, communication, and human capital (Barro, 1997; Malecki, 1997). Thus, most of the variables in our codebook, such as income per person, CO2 emissions, female employment rate, Internet use rate, life expectancy, oil consumption per capita, polity score, residential electricity consumption per person, and urban population, are indicators of human development. Deaton (2003) and López i Casasnovas, Rivera, and Currais (2005) demonstrated several interactions between health, economic development, life expectancy, employment, and income. A study conducted by Kposowa (2001) revealed that unemployment is strongly related to suicide, and that this relationship is stronger and more enduring among women.

Based on these findings from the literature, I hypothesize that human development indicators (income per person, CO2 emissions, female employment rate, Internet use rate, life expectancy, oil consumption per capita, polity score, residential electricity consumption per person, and urban population) may be associated with the public health indicators in the codebook: HIV rate, breast cancer rate, and suicide rate.

Over the course of this program, I plan to address the following question: are development indicators associated with HIV, breast cancer, and suicide rates?
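To make the question concrete, here is a minimal Python sketch of one way it might be examined: a Pearson correlation between each development indicator and each health outcome, skipping countries with missing values. This is only an illustration and not the analysis the course prescribes. The column names are my assumptions about how the codebook labels the variables, and the four rows are made-up placeholder values, not real Gapminder data.

```python
import csv
import io
import math

# Assumed column names (hypothetical); the actual codebook labels may differ.
DEVELOPMENT = ["incomeperperson", "internetuserate", "urbanrate"]
HEALTH = ["hivrate", "suicideper100th"]

# Placeholder rows for illustration only; these are NOT real Gapminder values.
SAMPLE_CSV = """country,incomeperperson,internetuserate,urbanrate,hivrate,suicideper100th
A,1000,10,30,2.0,5
B,2000,25,45,1.4,7
C,3000,,50,1.1,
D,4000,40,60,0.5,12
"""


def to_float(text):
    """Return a float, or None for blank or non-numeric cells."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present.

    Returns None when fewer than three complete pairs exist or when
    either variable is constant.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def correlation_table(rows, predictors, outcomes):
    """Map each (predictor, outcome) pair to its correlation coefficient."""
    table = {}
    for predictor in predictors:
        xs = [to_float(row.get(predictor)) for row in rows]
        for outcome in outcomes:
            ys = [to_float(row.get(outcome)) for row in rows]
            table[(predictor, outcome)] = pearson(xs, ys)
    return table


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
table = correlation_table(rows, DEVELOPMENT, HEALTH)
for (predictor, outcome), r in table.items():
    shown = "n/a" if r is None else format(r, ".2f")
    print(f"{predictor} vs {outcome}: r = {shown}")
```

With the real data file, the in-memory sample would be replaced by the course CSV, and the two lists would hold the full set of indicators named in the hypothesis.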

In the next post, I will explore my data and write my first Python program. I will also rewrite the code in SAS, which I am already familiar with: I used it to analyze the Burkina Faso Demographic and Health Survey 2010 data for my Master of Public Health thesis at the Georgia State University School of Public Health (Yehadji, 2015).


Deaton, A. (2003). Health, Inequality, and Economic Development. Journal of Economic Literature, 41(1), 113–158. http://doi.org/10.1257/002205103321544710

Kposowa, A. J. (2001). Unemployment and suicide: a cohort analysis of social factors predicting suicide in the US National Longitudinal Mortality Study. Psychological Medicine, 31(1), 127–138.

López i Casasnovas, G., Rivera, B., & Currais, L. (Eds.). (2005). Health and economic growth: findings and policy implications. Cambridge, Mass: MIT Press.

Malecki, E. J. (1997). Technology and Economic Development: The Dynamics of Local, Regional, and National Change (SSRN Scholarly Paper No. ID 1496226). Rochester, NY: Social Science Research Network. Retrieved from http://papers.ssrn.com/abstract=1496226

Barro, R. J. (1997). Determinants of Economic Growth: A Cross-Country Empirical Study. Cambridge, Mass: MIT Press. Retrieved from https://mitpress.mit.edu/books/determinants-economic-growth

United Nations Development Programme. (2015). About Human Development. Retrieved November 25, 2015, from http://hdr.undp.org/en/humandev

Yehadji, D. (2015, April 20). Urban-rural disparities in HIV related knowledge, behavior and attitude in Burkina Faso: Evidence from Burkina Faso Demographic and Health Survey 2010. Retrieved from http://scholarworks.gsu.edu/iph_theses/390
