Two-way ANOVA for repeated measures using Python

Previously I have shown how to analyze data collected using within-subjects designs using rpy2 (i.e., R from within Python) and Pyvttbl. In this post I will extend it into a factorial ANOVA using Python (i.e., Pyvttbl). In fact, we are going to carry out a Two-way ANOVA but the same method will enable you to analyze any factorial design. I start with importing the Python libraries that  are going to be use.

import numpy as np
import pyvttbl as pt
from collections import namedtuple

Numpy is be used in simulating the data. I create a data set in which we have one factor of two levels (P) and a second factor of 3 levels (Q). As in many of my examples the dependent variable is going to be response time (rt) and we create a list of lists for the different population means we are going to assume (i.e., the variable ‘values’). I was a bit lazy when coming up with the data so I named the independent variables ‘iv1’ and ‘iv2’. However, you could think of iv1 as two different memory tasks; verbal and spatial memory. Iv2 could be different levels of distractions (no distraction, synthetic sounds, and speech, for instance).

Simulate data

N = 20
P = [1,2]
Q = [1,2,3]

values = [[998,511], [1119,620], [1300,790]]

sub_id = [i+1 for i in xrange(N)]*(len(P)*len(Q))
mus = np.concatenate([np.repeat(value, N) for value in values]).tolist()
rt = np.random.normal(mus, scale=112.0, size=N*len(P)*len(Q)).tolist()
iv1 = np.concatenate([np.array([p]*N) for p in P]*len(Q)).tolist()
iv2 = np.concatenate([np.array([q]*(N*len(P))) for q in Q]).tolist()


Sub = namedtuple('Sub', ['Sub_id', 'rt','iv1', 'iv2'])               
df = pt.DataFrame()

for idx in xrange(len(sub_id)):
    df.insert(Sub(sub_id[idx],rt[idx], iv1[idx],iv2[idx])._asdict())    

I start with a boxplot using the method boxplot from Pyvttbl. As far as I can see there is not much room for changing the plot around. We get this plot and it is really not that beautiful.

df.box_plot('rt', factors=['iv1', 'iv2'])
A Boxplot before we do our Python two-way ANOVA
Boxplot Pyvttbl

Two-way ANOVA for within-subjects design in Python

To run the Two-Way ANOVA is simple; the first argument is the dependent variable, the second the subject identifier, and than the within-subject factors. In two previous posts I showed how to carry out one-way and two-way ANOVA for independent measures. One could, of course combine these techniques, to do a split-plot/mixed ANOVA by adding an argument ‘bfactors’ for the between-subject factor(s).

aov = df.anova('rt', sub='Sub_id', wfactors=['iv1', 'iv2'])
print(aov)

The output one get from this is an ANOVA table. In this table all metrics needed plus some more can be found; F-statistic, p-value, mean square errors, confidence intervals, effect size (i.e., eta-squared) for all factors and the interaction. Also, some corrected degree of freedom and mean square error can be found (e.g., Grenhouse-Geisser corrected). The output is in the end of the post. It is a bit hard to read.  If you know any other way to do a repeated measures ANOVA using Python please let me know. Also, if you happen to know that you can create nicer plots with Pyvttbl I would also like to know how! Please leave a comment.

Output ANOVA table

rt ~ iv1 * iv2

TESTS OF WITHIN SUBJECTS EFFECTS

Measure: rt
  Source                            Type III      eps      df         MS           F        Sig.      et2_G   Obs.     SE     95% CI    lambda    Obs.  
                                       SS                                                                                                         Power 
=======================================================================================================================================================
iv1           Sphericity Assumed   4419957.211       -        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
              Greenhouse-Geisser   4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
              Huynh-Feldt          4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
              Box                  4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1)    Sphericity Assumed    258996.722       -       19     13631.406                                                                           
              Greenhouse-Geisser    258996.722       1       19     13631.406                                                                           
              Huynh-Feldt           258996.722       1       19     13631.406                                                                           
              Box                   258996.722       1       19     13631.406                                                                           
-------------------------------------------------------------------------------------------------------------------------------------------------------
iv2           Sphericity Assumed   5257766.564       -        2   2628883.282   206.008   4.023e-21   3.920     40   18.448   36.158    433.701       1 
              Greenhouse-Geisser   5257766.564   0.550    1.101   4777252.692   206.008   1.320e-12   3.920     40   18.448   36.158    433.701       1 
              Huynh-Feldt          5257766.564   0.550    1.101   4777252.692   206.008   1.320e-12   3.920     40   18.448   36.158    433.701       1 
              Box                  5257766.564   0.500        1   5257766.564   206.008   1.192e-11   3.920     40   18.448   36.158    433.701       1 
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv2)    Sphericity Assumed    484921.251       -       38     12761.086                                                                           
              Greenhouse-Geisser    484921.251   0.550   20.911     23189.668                                                                           
              Huynh-Feldt           484921.251   0.550   20.911     23189.668                                                                           
              Box                   484921.251   0.500       19     25522.171                                                                           
-------------------------------------------------------------------------------------------------------------------------------------------------------
iv1 *         Sphericity Assumed   1622027.598       -        2    811013.799    83.220   1.304e-14   1.209     20   22.799   44.687     87.600   1.000 
iv2           Greenhouse-Geisser   1622027.598   0.545    1.091   1486817.582    83.220   6.085e-09   1.209     20   22.799   44.687     87.600   1.000 
              Huynh-Feldt          1622027.598   0.545    1.091   1486817.582    83.220   6.085e-09   1.209     20   22.799   44.687     87.600   1.000 
              Box                  1622027.598   0.500        1   1622027.598    83.220   2.262e-08   1.209     20   22.799   44.687     87.600   1.000 
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1 *   Sphericity Assumed    370327.311       -       38      9745.456                                                                           
iv2)          Greenhouse-Geisser    370327.311   0.545   20.728     17866.175                                                                           
              Huynh-Feldt           370327.311   0.545   20.728     17866.175                                                                           
              Box                   370327.311   0.500       19     19490.911                                                                           

TABLES OF ESTIMATED MARGINAL MEANS

Estimated Marginal Means for iv1
iv1    Mean     Std. Error   95% Lower Bound   95% Upper Bound 
==============================================================
1     983.755       43.162           899.157          1068.354 
2     599.917       21.432           557.909           641.925 

Estimated Marginal Means for iv2
iv2     Mean     Std. Error   95% Lower Bound   95% Upper Bound 
===============================================================
1      525.025       19.324           487.150           562.899 
2      814.197       49.416           717.342           911.053 
3     1036.286       43.789           950.459          1122.114 

Estimated Marginal Means for iv1 * iv2
iv1   iv2     Mean     Std. Error   95% Lower Bound   95% Upper Bound 
=====================================================================
1     1      553.522       24.212           506.066           600.978 
1     2     1103.488       28.411          1047.804          1159.173 
1     3     1294.256       19.773          1255.501          1333.011 
2     1      496.528       29.346           439.009           554.047 
2     2      524.906       20.207           485.301           564.512 
2     3      778.317       21.815           735.560           821.073 

The post Two-way ANOVA for repeated measures using Python appeared first on Erik Marsja.