Tuesday, 31 July 2012 at 2:05 pm

Performing principal component analysis with matplotlib is extremely easy.

The input is a 2d numpy array where columns are the dimensions you want reduced and rows are samples. For example, a set of transcriptome data:

        Condition1   Condition2   Condition3   Condition 4   Condition5   
geneA 231 321 4 221 2312
geneB 211 4 53 34 53
geneC 4 343 .. .. ..
geneD  43 .. .. .. ..
.. .. .. .. .. .. 

Here is all you need to do a PCA (assuming you have already gotten your data into a native python 2d array):

from matplotlib.mlab import PCA
#construct your numpy array of data
myData = numpy.array(data) 
results = PCA(myData) 

#this will return an array of variance percentages for each component

#this will return a 2d array of the data projected into PCA space

Here is a great how-to on how to plot a 3d graph using matplotlib and Tk. Here is the code from the how-to plotting the PCA results from above:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

x = []
y = []
z = []
for item in result.Y:

plt.close('all') # close all latent plotting windows fig1 = plt.figure() # Make a plotting figure
ax = Axes3D(fig1) # use the plotting figure to create a Axis3D object.
pltData = [x,y,z]
ax.scatter(pltData[0], pltData[1], pltData[2], 'bo') # make a scatter plot of blue dots from the data

# make simple, bare axis lines through space:
xAxisLine = ((min(pltData[0]), max(pltData[0])), (0, 0), (0,0)) # 2 points make the x-axis line at the data extrema along x-axis
ax.plot(xAxisLine[0], xAxisLine[1], xAxisLine[2], 'r') # make a red line for the x-axis.
yAxisLine = ((0, 0), (min(pltData[1]), max(pltData[1])), (0,0)) # 2 points make the y-axis line at the data extrema along y-axis
ax.plot(yAxisLine[0], yAxisLine[1], yAxisLine[2], 'r') # make a red line for the y-axis.
zAxisLine = ((0, 0), (0,0), (min(pltData[2]), max(pltData[2]))) # 2 points make the z-axis line at the data extrema along z-axis
ax.plot(zAxisLine[0], zAxisLine[1], zAxisLine[2], 'r') # make a red line for the z-axis.

# label the axes
ax.set_xlabel("x-axis label")
ax.set_ylabel("y-axis label")
ax.set_zlabel("y-axis label")
ax.set_title("The title of the plot")
plt.show() # show the plot