R from Python – an rpy2 tutorial


Recently I found the Python module rpy2, which offers a Python interface to R. rpy2 requires that you have both R (version 3.2+) and Python (version 2.7 or 3.3) installed. There are pre-compiled binaries available for Linux and Windows (unsupported and unofficial, however). In this short tutorial, I will show you how to carry out a repeated measures ANOVA (rmANOVA) using the R packages ‘afex’ and ‘lsmeans’, Python, and rpy2.


First, we need to install rpy2. I installed rpy2 on Ubuntu 14.04 using pip:


sudo pip install rpy2
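After installing, it can be handy to confirm that Python actually finds the module before going any further. The following is a minimal sketch using only the standard library (it assumes Python 3.4 or later); the helper name module_available is my own and not part of rpy2.

```python
import importlib.util


def module_available(name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


if module_available('rpy2'):
    print('rpy2 is installed')
else:
    print('rpy2 was not found; try installing it with pip')
```

Note that this only checks that the module can be located; it does not verify that rpy2 can find a working R installation.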

When we have a working installation, we start by importing the modules that we need. In this example we are going to use ‘afex’ to do the rmANOVA and ‘lsmeans’ to do the follow-up analysis.


import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector

Now we check whether the packages we want to use are installed. Note that the following script will install any R package that is missing. The rmANOVA itself will later be carried out using the afex function aov_ez.


packageNames = ('afex', 'lsmeans')

# Collect the R packages that are not installed yet
packnames_to_install = [x for x in packageNames if not rpackages.isinstalled(x)]

if len(packnames_to_install) > 0:
    # Pick the first CRAN mirror in the list, then install what is missing
    utils = rpackages.importr('utils')
    utils.chooseCRANmirror(ind=1)
    utils.install_packages(StrVector(packnames_to_install))
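The "which packages are missing?" step is plain Python and can be tried out without R. Here is a small sketch of that logic; the function missing_packages and the stand-in set pretend_installed are made up for illustration and are not part of rpy2.

```python
def missing_packages(names, is_installed):
    """Return the names for which is_installed(name) is false, in order."""
    return [name for name in names if not is_installed(name)]


# Stand-in for rpackages.isinstalled so the sketch runs without R:
# here we pretend that only 'afex' is installed.
pretend_installed = {'afex'}
to_install = missing_packages(
    ('afex', 'lsmeans'),
    lambda name: name in pretend_installed,
)
print(to_install)  # ['lsmeans']
```

With rpy2 available, you would pass rpackages.isinstalled as the second argument instead of the lambda.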

We borrow a data set from the package psych. In this case, we use the R function read.table to get the data.


data = robjects.r('read.table(file = "http://personality-project.org/r/datasets/R.appendix3.data", header = T)')
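Instead of passing a string of R code, an R function can also be looked up by name through robjects.r and called like a Python function. The sketch below wraps that in a helper; the name load_table and its r parameter are my own additions, and the r parameter exists only so the function can be tried with a stand-in object when R is not at hand.

```python
def load_table(url, r=None):
    """Read a whitespace-delimited table with R's read.table via rpy2.

    r is the rpy2 R interface (robjects.r). It is a parameter only so
    that the function can be exercised with a stand-in object.
    """
    if r is None:
        # Imported lazily so that defining the function does not need R.
        import rpy2.robjects as robjects
        r = robjects.r
    read_table = r['read.table']
    # Python's True is converted to R's TRUE by rpy2.
    return read_table(file=url, header=True)
```

Assuming rpy2 and R are installed, data = load_table("http://personality-project.org/r/datasets/R.appendix3.data") should give the same R data frame as the one-liner above.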


Repeated measures ANOVA

Before conducting our rmANOVA, we need to import the R package (i.e., afex). After importing it, we will use the function aov_ez to conduct the analysis.


afex = rpackages.importr('afex')
model = afex.aov_ez('Subject', 'Recall', data, within='Valence')
print(model)

The last line above prints the results. A main effect of Valence was found.


   Effect         df  MSE          F ges p.value
1 Valence 1.15, 4.60 9.34 189.11 *** .93  < .0001

Follow-up analysis

If we are interested in following up the main effect, we can do that using the package ‘lsmeans’. First we import the package, and then we run pairwise contrasts and adjust for familywise error using the Holm-Bonferroni correction.


lsm = rpackages.importr('lsmeans')
pairwise = lsm.lsmeans(model, "Valence", contr="pairwise", adjust="holm")
print(pairwise)
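If you plan to rerun the analysis, the two steps can be wrapped in a single function. This is only a sketch: the function name rm_anova_with_followup and the importer parameter are hypothetical, the column names are hard-coded to those of this data set, and the R calls are the same ones used above.

```python
def rm_anova_with_followup(data, importer=None):
    """Run the rmANOVA with afex and the pairwise follow-up with lsmeans.

    importer defaults to rpy2's importr; it is a parameter only so the
    function can be tried with a stand-in when R is not available.
    """
    if importer is None:
        # Imported lazily so that defining the function does not need R.
        from rpy2.robjects.packages import importr
        importer = importr
    afex = importer('afex')
    lsm = importer('lsmeans')
    # Same calls as in the tutorial: Subject is the id column, Recall the
    # dependent variable, and Valence the within-subjects factor.
    model = afex.aov_ez('Subject', 'Recall', data, within='Valence')
    pairwise = lsm.lsmeans(model, 'Valence', contr='pairwise', adjust='holm')
    return model, pairwise
```

With a working rpy2 setup, model, pairwise = rm_anova_with_followup(data) followed by print(model) and print(pairwise) should reproduce the steps above.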

That was easy, right? Particularly if you are used to doing analysis in R. Although rpy2 is relatively easy to use, I don’t think it will replace learning R. That is, you will have to know some R to make use of it. However, if you are a Python programmer and want to use available R scripts, it might be useful. Notably, I am not aware of any Python implementations of rmANOVA (except for the linear mixed-effects approach, maybe). In fact, that is why I learned how to use rpy2 in the first place: to use Python, and R, to conduct the analysis.

Update: In this rpy2 tutorial you learned how to do a repeated measures ANOVA with Python and R. I have since found a Python package that supports ANOVA for within-subjects designs natively in Python; see my tutorial Repeated Measures ANOVA using Python.


The post R from Python – an rpy2 tutorial appeared first on Erik Marsja.