SQL Fundamentals

The pandas workflow is a common favorite among data analysts and data scientists. The workflow looks something like this: The pandas workflow works well when: the data fits in memory (a few gigabytes but not terabytes) the data is relatively static (doesn’t need to be loaded into memory every minute because the data has changed) […]

Read More

Revisiting Unit Testing and Mocking in Python

My previous blog post, Python Mocking 101: Fake It Before You Make It, discussed the basic mechanics of mocking and unit testing in Python. This post covers some higher-level software engineering principles demonstrated in my experience with Python testing over the past year and half. In particular, I want to revisit the idea of patching […]

Read More

Setting up a Python development environment

Setting up Python is usually simple, but there are some places where newcomers (and experienced users) need to be careful. What versions are there? What’s the difference between Python, CPython, Anaconda, PyPy? Those and many other questions may stump new developers, or people wanting to use Python. Note: this guide is opinionated. Contents Glossary and […]

Read More

Becoming a Data Scientist

This blogpost is an excerpt of Springboard’s free guide to data science jobs and originally appeared on the Springboard blog. Data Science Skills Most data scientists use a combination of skills every day, some of which they have taught themselves on the job or otherwise. They also come from various backgrounds. There isn’t any one […]

Read More