by Karlijn Willems | January 12, 2017. This post originally appeared on the DataCamp blog. Big thanks to Karlijn and all the fine folks at DataCamp for letting us share with the Yhat audience! Scikit-Learn library: Most of you who are learning data science with Python will definitely have heard of scikit-learn, the open […]
Category: Data Visualization
Jupyter with Vagrant
I’ve written about using Vagrant for 99.9% of my Python work on here before (see here and here for examples). In addition to Vagrant, I use Jupyter notebooks for 99.9% of the work that I do, so I figured I’d spend a little time describing how I use Jupyter with Vagrant. First off, you’ll […]
Top 10 Python libraries of 2016
Last year, we did a recap of what we thought were the best Python libraries of 2015, which was widely shared within the Python community (see post in r/Python). A year has gone by, and again it is time to give due credit for the awesome work that has been done by the open source […]
How to get a data science job
You’ve done it. You just spent months learning how to analyze data and make predictions. You’re now able to go from raw data to well-structured insights in a matter of hours. After all that effort, you feel like it’s time to take the next step and get your first data science job. Unfortunately for […]
Pandas Cheat Sheet for Data Science in Python
by Karlijn Willems | November 30, 2016. This post originally appeared on the DataCamp blog. Big thanks to Karlijn and all the fine folks at DataCamp for letting us share with the Yhat audience! Pandas library: The Pandas library is one of the tools data scientists most often prefer for data manipulation and analysis, […]
Git-pandas v1.0.0, or how to check for a stable release
In the process of making the v1.0.0 release of git-pandas, I had one primary goal: to simplify and solidify the interface to git-pandas objects (the ProjectDirectory and the Repository). At the end of the day, the usefulness of a project like git-pandas versus one-off analysis or rolling your own interface is consistent and predictable […]
Github.com cumulative blame in 5 lines of python
Git-pandas has gotten to be pretty capable. In the master branch, and soon in the v1.0.0 release, we’ve included a GitHub.com interface to git-pandas via the GitHubProfile class. With this, in just a few lines of code, you can see how your profile has grown over time: from gitpandas.utilities.plotting import plot_cumulative_blame from […]