A look at GapMinder data

Data source:

GapMinder data comprise global development indicators curated by the Gapminder Foundation. The foundation is a non-profit venture registered in Stockholm, Sweden, that aims to promote sustainable global development and achievement of the United Nations Millennium Development Goals through increased use and understanding of statistics and other information about social, economic, and environmental development at local, national, and global levels.

Data collection:

Since its conception in 2005, Gapminder has grown to include over 200 indicators, including gross domestic product, total employment rate, and estimated HIV prevalence. Gapminder contains data for all 192 UN members, aggregating data for Serbia and Montenegro. Additionally, it includes data for 24 other areas, for a total of 215 areas. GapMinder collects data from multiple sources, including the Institute for Health Metrics and Evaluation, the US Census Bureau’s International Database, the United Nations Statistics Division, and the World Bank.

Measures:

The indicators I am interested in are: HIV rate, Income per person, Alcohol consumption, Breast cancer per 100,000 female, CO2 emissions, Female employment rate, Employment rate, Internet use rate, Life expectancy, Oil per person, Polity score, Residential electricity per person, Suicide per 100,000 people, and Urban rate.

HIV rate: 2009 estimated HIV prevalence, % (ages 15-49). Estimated number of people living with HIV per 100 population in the 15-49 age group.

Income per person: 2010 Gross Domestic Product per capita in constant 2000 US$. Inflation has been taken into account, but differences in the cost of living between countries have not.

Alcohol consumption: 2008 alcohol consumption per adult (age 15+), litres. Recorded and estimated average per capita consumption among adults (15+), in litres of pure alcohol.

Breast cancer per 100,000 female: 2002 new breast cancer cases per 100,000 females. Number of new cases of breast cancer per 100,000 female residents during the given year.
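To make the "per 100,000" scaling concrete, here is a minimal sketch of how such a rate is computed from a case count and a population. The function name and the numbers in the usage line are hypothetical and chosen for illustration; they are not GapMinder figures.

```python
def rate_per_100k(new_cases, population):
    """Scale a raw case count to a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so that whole-number inputs stay exact where possible.
    return new_cases * 100_000 / population


# Hypothetical example: 50 new cases among 200,000 female residents.
example_rate = rate_per_100k(50, 200_000)
```

The same scaling applies to the suicide indicator, although that one is additionally age adjusted, which this sketch does not attempt.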

CO2 emissions: 2006 cumulative CO2 emissions (metric tons). Total amount of CO2 emitted, in metric tons, since 1751.

Female employment rate: 2007 female employees age 15+ (% of population). Percentage of female population, age above 15, that has been employed during the given year.

Employment rate: 2007 total employees age 15+ (% of population). Percentage of total population, age above 15, that has been employed during the given year.

Internet use rate: 2010 Internet users (per 100 people). Internet users are people with access to the worldwide network.

Life expectancy: 2011 life expectancy at birth (years). The average number of years a newborn child would live if current mortality patterns were to stay the same.

Oil per person: 2010 oil consumption per capita (tonnes per person per year).

Polity score: 2009 Democracy score (Polity). Overall polity score from the Polity IV dataset, calculated by subtracting an autocracy score from a democracy score. It is a summary measure of a country’s democratic and free nature, ranging from -10 (lowest) to 10 (highest).
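The subtraction described above can be sketched in a few lines. This assumes that the democracy and autocracy components each lie on a 0 to 10 scale, which is what produces the -10 to 10 range of the combined score; the function name is my own and the input values are made up.

```python
def polity_score(democracy, autocracy):
    """Combined polity score: democracy score minus autocracy score.

    Assumes both components are on a 0 to 10 scale, so the result
    falls between -10 and 10.
    """
    for name, value in (("democracy", democracy), ("autocracy", autocracy)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} score must be between 0 and 10")
    return democracy - autocracy


# Hypothetical component scores, for illustration only.
mostly_democratic = polity_score(8, 1)
fully_autocratic = polity_score(0, 10)
```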

Residential electricity per person: 2008 residential electricity consumption, per person (kWh). The amount of residential electricity consumption per person during the given year, counted in kilowatt-hours (kWh).

Suicide per 100,000 people: 2005 Suicide, age adjusted, per 100 000. Mortality due to self-inflicted injury, per 100 000 standard population, age adjusted.

Urban rate: 2008 urban population (% of total). Urban population refers to people living in urban areas as defined by national statistical offices (calculated using World Bank population estimates and urban ratios from the United Nations World Urbanization Prospects).

The outcomes of interest (response variables) are HIV rate, breast cancer rate, suicide rate, and CO2 emissions. The explanatory variables are development or economic indicators such as income, urban rate, and residential electricity per person. All of these variables are quantitative and continuous, so there is no need to manage or recode them for regression analysis.
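As a rough sketch of what a first regression step could look like, the snippet below reads two quantitative columns from CSV text, skips rows with a blank cell, and fits a straight line by ordinary least squares using only the standard library. The column names `urbanrate` and `suicideper100th` are assumptions about how the file is labelled, the helper names are my own, and the sample rows are placeholders rather than real GapMinder values.

```python
import csv
import io


def to_float(value):
    """Return the cell as a float, or None when the cell is blank."""
    text = value.strip()
    return float(text) if text else None


def complete_pairs(rows, x_name, y_name):
    """Collect (x, y) pairs, skipping rows where either value is missing."""
    pairs = []
    for row in rows:
        x, y = to_float(row[x_name]), to_float(row[y_name])
        if x is not None and y is not None:
            pairs.append((x, y))
    return pairs


def fit_line(pairs):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Placeholder rows for illustration only, not real GapMinder figures.
SAMPLE = """country,urbanrate,suicideper100th
A,10,3
B,20,5
C,,4
D,30,7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
pairs = complete_pairs(rows, "urbanrate", "suicideper100th")
intercept, slope = fit_line(pairs)
```

No recoding is needed because every column is already numeric, but blank cells still have to be dropped (or otherwise handled) before fitting.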
