Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages, and makes importing and analyzing data much easier. Pandas builds on packages like NumPy and matplotlib to give you a single, convenient place to do most of your data analysis […]
Category: Libraries
Articles about Python libraries
Working with SQLite Databases using Python and Pandas
SQLite is a database engine that makes it simple to store and work with relational data. Much like the csv format, SQLite stores data in a single file that can be easily shared with others. Most programming languages and environments have good support for working with SQLite databases. Python is no exception, and a library […]
Python & JSON: Working with large datasets using Pandas
Working with large JSON datasets can be a pain, particularly when they are too large to fit into memory. In cases like this, a combination of command line tools and Python can make for an efficient way to explore and analyze the data. In this post, we’ll look at how to leverage tools like Pandas […]
Four ways to conduct one-way ANOVA with Python
This post focuses on how to carry out between-subjects ANOVA using Python. As mentioned in an earlier post (Repeated measures ANOVA with Python), ANOVAs are commonly used in Psychology. We start with a brief introduction to the theory of ANOVA. If you are more interested in the four methods to carry out one-way […]
The Case for a Data Science Lab
By Mark Sellors, Technical Architect – Mango Solutions. As more and more Data Science moves from individuals working alone, with small data sets on their laptops, to more productionised, or analytically mature settings, an increasing number of restrictions are being placed on Data Scientists in the workplace. Perhaps your organisation has standardised on a particular […]
Matplotlib Plotting Cookbook Review
I was given a copy of Matplotlib Plotting Cookbook by Alexandre Devert and asked to review it. Thanks PACKT! Here is my review. Preface: But first, I’ll mention I’ve worked on two projects recently that involved rendering matplotlib graphs directly to the browser, i.e. via content-type: image/png. This is fun! It’s […]
Pillow 2-3-0 is Out
Pillow is the friendly PIL fork by Alex Clark and Contributors. PIL is the Python Imaging Library by Fredrik Lundh and Contributors. Since Pillow 2.0, the Pillow Team has adopted a quarterly release cycle; as such, Pillow 2.3.0 has just been released. Here’s what’s new in this release: 2.3.0 (2014-01-01) Stop leaking filename parameter passed […]
Pillow 2.2.1 Released
Pillow is the “friendly” PIL fork. PIL is the Python Imaging Library. Note: An earlier version of this entry was published yesterday with the wrong date. Apologies for any annoyance or confusion. The Pillow 2.2.1 source distribution is now available on PyPI, featuring over 30 documented bug fixes and enhancements since 2.1.0 was released 3 […]
Requests 2.0
Every now and then the Requests project gets bored of fixing bugs and decides to break a whole ton of your code. But it doesn’t look good when we put it like that, so instead we call it a ‘major release’ and sell it as being full of shiny new features. Unfortunately it turns out […]
Requests: The Difference Between Params and Data
This question pops up a lot on Stack Overflow, on GitHub, and in the IRC channel, so I thought I’d write a short post to address it. The question is, broadly, this: How do I send data on a POST? I tried params, but that didn’t work! The answer is that Requests has two different […]
Python Requests And Proxies
One of Requests’ most popular features is its simple proxying support. HTTP as a protocol has very well-defined semantics for dealing with proxies, and this has led to widespread deployment of HTTP proxies. The vast majority of these proxies are ‘transparent’: that is, they sit on the message path and quietly capture HTTP messages before […]
Caching In Python Requests
I think I’ve made it clear in the past that I think Requests is awesome. At this stage it’s become a mature, feature-filled library that is more than capable of replacing urllib2 and friends in almost every situation you might be interested in. There are very few things that urllib2 can do that Requests can’t […]
Requests’ Two APIs
Kenneth Reitz’s excellent Requests library has been praised, rightfully, for its excellent API. In fact, its API is so good that it’s been praised in a literary context, as well as by almost every programmer who has come across it. There is no question that this API is one of the best you can find […]
Requests and the HTTP 302 Status Code
I wanted briefly to touch on the behaviour of the Python Requests library when it receives an HTTP 302 message. This has come up a couple of times on GitHub, and has usually been considered a bug, so it’s worth briefly stepping in and explaining what Requests does and why it does it. First, HTTP […]
Choosing The SSL Version In Python Requests
Over the last few months (and probably for quite a while before then too), a few issues have been raised on the Requests GitHub page asking how to select the version of SSL used by Requests. This is actually simple once you know how, so I thought I’d write a short post to show you […]