Predicting Housing Prices with Linear Regression using Python, pandas, and statsmodels

This post was originally published here. In this post, we’ll walk through building linear regression models to predict housing prices resulting from economic activity. Topics covered will include: What is Regression; Variable Selection; Reading in the Data with pandas; Ordinary Least Squares (OLS) Assumptions; Simple Linear Regression; Regression Plots; Multiple Linear Regression; Another Look at […]

Read More

Data Science Things Roundup #6

Time again for the weekly data science things roundup. If you haven’t seen this before, check out some of the previous ones to get a feel for it. Each Tuesday I run through 3 things I’ve found interesting and bookmarked recently, generally related to Python and data science (with some admitted diversions). This week is […]

Read More

Data Science Things Roundup #5

Time again for the 5th edition of the data science things roundup, named suspiciously similarly to the much more established Data Science Roundup by RJMetrics (but we won’t worry about that this week). In previous weeks we’ve seen some pretty cool ML and data science libraries, mostly in Python; this week we branch out […]

Read More

Data Science Things Roundup #4

Time for another edition of the data science things roundup, where I round up some data science things for y’all. Today’s collection is uncharacteristically R heavy. It’s usually pretty Python and machine learning heavy, so if you find something you like here, be sure to check out previous editions as well. Without further ado: Scikit-Learn […]

Read More

Data Science Things Roundup #3

Time again for the 3rd edition of the data science things roundup, where I share a few data science things I’ve come across recently. Check out previous editions here and here. Self Organizing Maps with TensorFlow: Google’s open sourcing of TensorFlow late last year caused a pretty big splash in the machine learning and data […]

Read More

Data Science Things Roundup #2

This is the second edition of the now-regular series of posts, Data Science Things Roundup, where I round up data science things (as you’ve probably guessed). Last week we had a scikit-learn extension, a GUI framework for Python CLIs, and some writing about how Kaggle winners won their competitions. This week is a bit more […]

Read More

Data Science Things Roundup #1

This is the first in a new series of posts, tailored more towards the newsletter subscribers (join it here). There are a few of these around the internet that I like, notably: ds_ldn’s Data Machina, RJMetrics’ Data Science Roundup, and Jeremy Singer-Vine’s Data is Plural. Mine will probably be way less consistent, so if you like […]

Read More

GitHub.com cumulative blame in 5 lines of Python

Git-pandas has gotten to be pretty capable.  Currently in the master branch and soon to be in the v1.0.0 release, we’ve included a github.com interface to git-pandas via the GitHubProfile class.  With this, in just a few lines of code, you can see how your profile has grown over time: from gitpandas.utilities.plotting import plot_cumulative_blame from […]

Read More