Category: Statistics
The Six Elements of the Perfect Data Science Learning Tool
When I launched Dataquest a little under two years ago, one of the first things I did was write a blog post about why. At the time, if you wanted to become a data scientist, you were confronted with dozens of courses on sites like edX or Coursera with no easy path to getting a […]
Yhat Whitepaper: Data Science in Practice
This blog post is an excerpt from our most popular whitepaper, which covers how data science gets applied in the real world. You can also download the full whitepaper PDF if you’d like. What it’s about: In this whitepaper we introduce five common applications of data science that build upon that definition and goal. We debunk the […]
Stockstats – Python module for various stock market indicators
I’m always working with stock market data and stock market indicators. During this work, there are times when I need to calculate things like Relative Strength Index (RSI), Average True Range (ATR), Commodity Channel Index (CCI) and various other indicators and stats. My go-to for this type of work is TA-Lib and the Python wrapper for […]
Machine Learning Walkthrough Part One: Preparing the Data
Cleaning and preparing data is a critical first step in any machine learning project. In this blog post, Dataquest student Daniel Osei takes us through examining a dataset, selecting columns for features, exploring the data visually, and then encoding the features for machine learning. This post is based on a Dataquest ‘Monthly Challenge’, where our […]
Simulating the Monty Hall Problem
I’ve been hearing about the Monty Hall problem for years and it’s never quite made sense to me, so I decided to program up a quick simulation. In the Monty Hall problem, there is a car behind one of three doors. There are goats behind the other two doors. The contestant picks one of the […]
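The excerpt is cut off before the rules are complete, so the following is only a minimal sketch of such a simulation, not the post's own code. It assumes the standard version of the problem: after the contestant picks, the host opens one of the other doors that hides a goat and offers the chance to switch. The function names `play_round` and `win_rate` are made up for illustration.

```python
import random


def play_round(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed host rule: open a door that is neither the pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=10_000, seed=0):
    """Estimate the fraction of games won with a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumed rules, the estimate for switching should land near 2/3 and the estimate for staying near 1/3.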
How to get a data science job
You’ve done it. You just spent months learning how to analyze data and make predictions. You’re now able to go from raw data to well-structured insights in a matter of hours. After all that effort, you feel like it’s time to take the next step, and get your first data science job. Unfortunately for […]
“How I Interview Data Scientists” with Matt Fornito
by Matt Fornito | December 13, 2016 This interview is featured in Springboard’s guide to data science interviews. Matt is also a mentor for Springboard’s Data Science Career Track, the first data science bootcamp to guarantee a job after graduation. About Matt: Matt is the President & Founder of Summit Analytics. He has over ten […]
A Rebuttal For Python 3
Zed Shaw, of Learn Python the Hard Way fame, has now written The Case Against Python 3. I’m not involved with core Python development. The only skin I have in this game is that I like Python 3. It’s a good language. And one of the big factors I’ve seen slowing its adoption is that […]
Automate Your Browser: A Guided Selenium Adventure
Prerequisites: Have Python/Anaconda and Selenium installed. See the previous intro to Selenium if you’re not familiar with it. The full code for this post is included at the end. You might find it fun to first run the entire script and watch how it works before jumping in and following along with the post. And […]
Four ways to conduct one-way ANOVA with Python
The current post will focus on how to carry out between-subjects ANOVA using Python. As mentioned in an earlier post (Repeated measures ANOVA with Python), ANOVAs are commonly used in Psychology. We start with a brief introduction to the theory of ANOVA. If you are more interested in the four methods to carry out one-way […]
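The excerpt does not say which four methods the post uses, so the following is only an illustrative sketch of the underlying calculation, not necessarily one of them. It computes the one-way ANOVA F statistic by hand as the between-group mean square divided by the within-group mean square. The function name `one_way_f` and the sample data are hypothetical.

```python
def one_way_f(groups):
    """Return the F statistic for a between-subjects one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


if __name__ == "__main__":
    # Hypothetical scores for three independent groups.
    data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_f(data))
```

In practice a statistics library would also report the p-value; this sketch stops at the F statistic to keep the arithmetic visible.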
Debugging With Wireshark: TLS
Sometimes in my darker moments I forget that not all programmers get to work with computer networks every day, like I do. This means that many of you don’t have a chance to experience some of the tools and debugging experiences that I do on a nearly daily basis. This is a real shame, because […]
Integrating Python and R into a Data Analysis Pipeline – Part 1
By Chris Musselle and Kate Ross-Smith. For a conference on the R language, EARL London 2015 saw a surprising number of discussions about Python. I like to think that at least some of this was to do with the fact that the day before the conference, we ran a 3-hour workshop outlining various strategies […]
Is Python going to be better than R for Big Data Analytics and Data Science? #rstats #python
Until now, the R ecosystem of package developers has mostly shrugged away the Big Data question. In a fascinating insight, Hadley Wickham said this in a recent interview. Shockingly, it mimics the FUD you-know-who has been accused of (source: https://peadarcoyle.wordpress.com/2015/08/02/interview-with-a-data-scientist-hadley-wickham/). 5. How do you respond when you hear the phrase ‘big […]
Install Package in Python from Github
You can use pip install git+git://github.com/yhat/ggplot.git or pip install --upgrade https://github.com/yhat/ggplot/tarball/master Filed under: Analytics Tagged: GitHub, python