Simple linear regression with Python

The Coursera course I am taking is Regression Modeling in Practice, and this week (Week 2) covers the basics of linear regression. I decided to use the GapMinder dataset and fit linear regression models to assess the association between urbanicity and breast cancer rate. Urbanicity is the 2008 urban population (% of total), where urban population refers to people living in urban areas as defined by national statistical offices (calculated using World Bank population estimates and urban ratios from the United Nations World Urbanization Prospects). Breast cancer rate is the number of new breast cancer cases per 100,000 females in 2002.

The code I wrote is accessible here

Output 1: Mean of urban rate and mean of centered urban rate

Summary statistics.PNG
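
For readers who cannot see the output above, centering simply means subtracting the variable's mean from every value, so that the centered variable has a mean of (approximately) zero. The snippet below is a minimal sketch of that step, not the code linked above. The values in `urban_rate` are made up for illustration, and the helper name `center` is my own assumption.

```python
from statistics import mean

# Made-up urban rates (% of total population), for illustration only.
# These are NOT values from the GapMinder dataset.
urban_rate = [20.0, 35.5, 48.0, 61.5, 75.0, 90.0]


def center(values):
    """Return a new list with the mean of `values` subtracted from each value."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical helper applied to the illustrative data.
urban_rate_c = center(urban_rate)

print("mean of urban_rate:", mean(urban_rate))
print("mean of centered urban_rate:", mean(urban_rate_c))
```

With real data the centered mean is often a tiny non-zero number because of floating-point rounding, which is why it is usually checked against a small tolerance rather than compared to exactly zero.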

Figure 1: Scatterplot for the Association Between (non-centered) Urbanicity and Breast Cancer Rate

qt_img129123896786948.png

Figure 2: Scatterplot for the Association Between Mean-Centered Urbanicity and Breast Cancer Rate

qt_img129858336194564.png

Output 2: Regression model for the Association Between (non-centered) Urbanicity and Breast Cancer Rate

Urbanicity.PNG

Output 3: Regression model for the Association Between Mean-Centered Urbanicity and Breast Cancer Rate

Centered urban.PNG
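
To make the relationship between the two models concrete, here is a rough sketch of a simple least-squares fit computed by hand, run once on a raw predictor and once on its centered version. It is not the code linked above and uses only the standard library. The data are made up for illustration, and `ols_fit` is a hypothetical helper name. The point it is meant to show is that centering the predictor leaves the slope unchanged and moves the intercept to the mean of the response.

```python
from statistics import mean


def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Uses the closed-form solution for a single predictor:
    slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up values for illustration only (NOT taken from GapMinder).
urban_rate = [20.0, 35.5, 48.0, 61.5, 75.0, 90.0]
breast_cancer_rate = [18.0, 25.0, 30.5, 41.0, 47.5, 60.0]

# Center the predictor by subtracting its mean.
x_bar = mean(urban_rate)
urban_rate_c = [x - x_bar for x in urban_rate]

b0, b1 = ols_fit(urban_rate, breast_cancer_rate)
b0_c, b1_c = ols_fit(urban_rate_c, breast_cancer_rate)

print("non-centered: intercept =", b0, "slope =", b1)
print("centered:     intercept =", b0_c, "slope =", b1_c)
```

In the centered model the intercept is the expected breast cancer rate at the average urban rate, which is easier to interpret than the expected rate at an urban rate of zero.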

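Results: Urbanicity (Beta = 0.5616, p < 0.001) and breast cancer rate are significantly and positively associated. In other words, each additional percentage point of urban population is associated with about 0.56 more new breast cancer cases per 100,000 females.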

 
