Diagnosing and Fixing Memory Leaks in Python

Fugue uses Python extensively throughout the Conductor and in our support tools because of its ease of use, extensive package library, and powerful language tools. One thing we’ve learned from building complex software for the cloud is that a language is only as good as its debugging and profiling tools. Logic errors, CPU spikes, and memory leaks […]

Predicting Housing Prices with Linear Regression using Python, pandas, and statsmodels

This post was originally published here. In this post, we’ll walk through building linear regression models to predict housing prices resulting from economic activity. Topics covered will include: What is Regression; Variable Selection; Reading in the Data with pandas; Ordinary Least Squares (OLS) Assumptions; Simple Linear Regression; Regression Plots; Multiple Linear Regression; Another Look at […]

Becoming a Data Scientist

This blog post is an excerpt of Springboard’s free guide to data science jobs and originally appeared on the Springboard blog. Data Science Skills: Most data scientists use a combination of skills every day, some of which they have taught themselves on the job or otherwise. They also come from various backgrounds. There isn’t any one […]

Applied Data Science

How can data science improve products? What are predictive models? How do you go from insight to prototype to production application? This is an excerpt from “Applied Data Science,” a Yhat whitepaper about data science teams and how companies apply their insights to the real world. You’ll learn how successful data science teams are composed […]

NYC Subway Math

About Erik: Dad and CTO (Chief Troll Officer) at a fintech startup in NYC. Ex-Spotify, co-organizing NYC ML meetup, open source sometimes (Luigi, Annoy), blogs random stuff. NYC Subway Math: Apparently MTA (the company running the NYC subway) has a real-time API. My fascination for the subway takes autistic proportions and so obviously I had […]
