Analyzing the conditions for studying stars
Author: yhat
Introduction to Split Testing Models
RPy2: Combining the Power of R + Python for Data Science
About Matthew: Matthew is a Data Scientist at C2FO in Kansas City. He previously studied Physics for his BS at the University of Notre Dame followed by the University of Kansas for his MS. When he is not programming, Matthew enjoys playing board games, especially Race for the Galaxy. Intro During my time as a […]
Building a (semi) Autonomous Drone with Python
They might not be delivering our mail (or our burritos) yet, but drones are now simple, small, and affordable enough that they can be considered a toy. You can even customize and program some of them via handy dandy Application Programming Interfaces (APIs)! The Parrot AR Drone has an API that lets you control not […]
Data Normalization in Python
Opening Day Well, it’s that time of the year again in the United States. The 162-game marathon MLB season is officially underway. In honor of the opening of another season of America’s Pastime, I was working on a post that uses data from the MLB. What I realized was that as I was writing […]
ROC Curves in Python and R
Customer Segmentation in Python
Interview with a Data Scientist Tool Developer
About Peadar: Peadar Coyle is a data scientist, author and math geek who specializes in applying robust statistical or machine learning models to data to extract business value. His academic interests range from quantum computing to time series forecasting. Peadar has worked or consulted for Amazon, Vodafone, Import.io and JobTODAY, to name a few. He […]
Electron Release Manager
Recently we released a new version of Rodeo, our data science IDE. In the past this meant our users would have to go to our homepage, click on the Rodeo page, download Rodeo again, and then reinstall it. But luckily this is no longer the case! As of the v1.1 release, we’re officially supporting […]
Summarizing Data in SQL
About Matt: Matt DeLand is Co-Founder and Data Scientist at Wagon. His team is building a collaborative SQL editor for analysts and engineers. He studied algebraic geometry at Columbia University, taught at the University of Michigan, and now enjoys applied machine learning; his mom is very proud! Introduction How quickly can you understand data from […]
What is Model-Based Machine Learning?
About Tom: Tom Diethe is a research fellow on the SPHERE project at the University of Bristol. His research interests include probabilistic machine learning, computational statistics, learning theory, and data fusion. He has a PhD in machine learning applied to multivariate signal processing from University College London. Contact him at tom.diethe@bristol.ac.uk. Introduction If you haven’t […]
Exploring U.S. Traffic Fatality Data
About Ben: Ben is a Data Analyst at DataScience. When not digging into a dataset, you’ll find him on a bicycle searching for funk records and the best tacos in LA. Introduction and Inspiration At a ChiPy event, Nick Bennett gave an excellent talk on traffic fatalities and how he attempts to visualize the publicly […]
How we built Rodeo with Electron
Last week we announced the release of Rodeo v1.0. The big deal was that we’d taken Rodeo from a command-line Python app built using Flask to a more legitimate-looking desktop app. There were comments on Reddit and Twitter mentioning that it seemed like Rodeo was running its own browser behind the scenes, and these […]
Rodeo 1.0: a Python IDE on your Desktop
Rodeo 1.0 Release When we released our in-browser IDE for Python earlier this year, we couldn’t believe the response. Thousands of our readers all over the world saddled up and told their friends and colleagues to do the same (no more puns, we promise). That reaction, as well as the endless search for hacks to […]
ScienceCluster Meets Spark
Getting Started When did all the ‘big data’ hoopla start? By the very first definition, in a 1997 paper by scientists at NASA, a data set that is too big to fit on a local disk has officially graduated to big-data-dom. Whether you’re working with large Excel files or processing the “10 terabytes generated by […]