Some Random Matplotlib Hacking

This isn't much of an example yet. The data is a subset of US Census Bureau data extracted from one of the sample data sets in Google BigQuery -- I should show how that part was done. Here's the dataset.

What I do is grab that portion of it for the three zip codes I'm interested in, and graph the distribution of population by minimum age in each zip code.

I'm still at the phase in Matplotlib where I'm feeling my way along, but this came out rather nicely in a short time.

In [56]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

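# Read zip codes as strings so leading zeros (e.g. '02895') are preserved.
# np.str was removed in NumPy 1.24; the built-in str does the same job.
df = pd.read_csv('../data/census1/zip_subset.csv', dtype={'zipcode': str})
df = df.dropna()
# sort_values returns a new DataFrame, so assign the result back
df = df.sort_values(['minimum_age', 'zipcode'])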

#print(df.head())
zips = ['95662', '28262', '02895']
fig, ax = plt.subplots()
for zipcode in zips:  # 'zipcode' avoids shadowing the built-in zip()
    df_new = df[df.zipcode == zipcode]

    # We don't care about the male/female difference, so sum the
    # population for each minimum age
    result = df_new.groupby('minimum_age').agg({'population': 'sum'})

    ax.plot(result, label=zipcode)

# Label the axes and add the legend once, after all the lines are plotted
ax.set_ylabel('Population')
ax.set_xlabel('Age')
legend = ax.legend(loc='upper center', shadow=True)
plt.show()
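
To see what the filter-and-aggregate step inside the loop does without needing the CSV file, here is a minimal sketch on a few made-up rows. The numbers are invented for illustration, and the `gender` column name is an assumption about how the male/female split is stored; only `zipcode`, `minimum_age` and `population` appear in the code above.

```python
import pandas as pd

# Made-up rows for illustration only; 'gender' is an assumed column name
sample = pd.DataFrame({
    'zipcode': ['95662', '95662', '95662', '95662', '02895', '02895'],
    'gender': ['male', 'female', 'male', 'female', 'male', 'female'],
    'minimum_age': [0, 0, 5, 5, 0, 0],
    'population': [100, 110, 120, 130, 50, 60],
})

# Keep one zip code, then sum the male and female rows for each minimum age
one_zip = sample[sample.zipcode == '95662']
result = one_zip.groupby('minimum_age').agg({'population': 'sum'})
print(result)
```

The result has one row per `minimum_age`, with the male and female populations added together, which is the shape `ax.plot` receives for each zip code.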