Thursday, April 9, 2015

Stacked area plots with matplotlib

In a stacked area plot, the values on the y axis are accumulated at each x position and the area between the resulting values is then filled. These plots are helpful when it comes to compare quantities through time. For example, considering the the monthly totals of the number of new cases of measles, mumps, and chicken pox for New York City during the years 1931-1971 (data that we already considered here). We can compare the number of cases of each disease month by month. First, we need to load and organize the data properly:
from scipy.io import loadmat
NYCdiseases = loadmat('NYCDiseases.mat') # loading a matlab file
chickenpox = np.sum(NYCdiseases['chickenPox'],axis=0)
mumps = np.sum(NYCdiseases['mumps'],axis=0)
measles = np.sum(NYCdiseases['measles'],axis=0)
In the snippet above we read the data from a Matlab file and summed the number of cases for each month. We are now ready to visualize our values using the function stackplot:
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt

plt.stackplot(arange(12)+1,
          [chickenpox, mumps, measles], 
          colors=['#377EB8','#55BA87','#7E1137'])
plt.xlim(1,12)

# creating the legend manually
plt.legend([mpatches.Patch(color='#377EB8'),  
            mpatches.Patch(color='#55BA87'), 
            mpatches.Patch(color='#7E1137')], 
           ['chickenpox','mumps','measles'])
plt.show()
The result is as follows:


We note that the highest number of cases happens between January and Jul, also we see that measles cases are more common than mumps and chicken pox cases.