
Plotting In Python With Congruent X-values

Goal: Get two different names on the same graph. Make sure that the years line up. Note that the file has some years twice (when a name has been given to both girls and boys), in t

Solution 1:

I'm guessing you're using the Census baby names dataset? The one used in Wes McKinney's book? In the future it's a good idea to include a sample from your dataset so that others can reproduce your work.

I've read just the years 2006 - 2010 into a DataFrame, like this:

In [75]: df.head()
Out[75]: 
       name sex    num  year
0     Emily   F  21365  2006
1      Emma   F  19092  2006
2   Madison   F  18599  2006
3  Isabella   F  18200  2006
4       Ava   F  16925  2006

Added in prop as defined above:

In [26]: df['prop'] = df.groupby('year')['num'].transform(lambda x: x / x.sum())


In [26]: df
Out[26]: 
         name sex    num  year      prop
0       Emily   F  21365  2006  0.005413
1        Emma   F  19092  2006  0.004837
2     Madison   F  18599  2006  0.004713
3    Isabella   F  18200  2006  0.004611
4         Ava   F  16925  2006  0.004288
5     Abigail   F  15615  2006  0.003956
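As a self-contained sketch of the prop step, the snippet below rebuilds a four-row frame from the 2006 and 2007 counts for Nancy and Joeseph in this dataset. Because it only contains those four rows, the proportions it computes are relative to that tiny frame and will not match the full-dataset values. It is only meant to show that transform returns one value per row and that each year's proportions sum to 1.

```python
import pandas as pd

# Counts for Nancy and Joeseph in 2006 and 2007; the frame is deliberately tiny,
# so the resulting prop values differ from those computed on the whole dataset.
df = pd.DataFrame({
    "name": ["Nancy", "Joeseph", "Nancy", "Joeseph"],
    "sex": ["F", "M", "F", "M"],
    "num": [1014, 34, 896, 39],
    "year": [2006, 2006, 2007, 2007],
})

# transform returns a Series with the same index as df, so it can be
# assigned straight back as a new column.
df["prop"] = df.groupby("year")["num"].transform(lambda x: x / x.sum())

# Within each year the proportions add up to 1.
year_totals = df.groupby("year")["prop"].sum()

print(df)
print(year_totals)
```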

I'd suggest a different approach to get the counts by name and year. I think it will make plotting easier. Instead of making two dataframes, one for each name, do it at the same time.

In [48]: df.query('name in ["Joeseph", "Nancy"]')
Out[48]: 
           name sex   num  year      prop
323       Nancy   F  1014  2006  0.000257
23206   Joeseph   M    34  2006  0.000009
34401     Nancy   F   896  2007  0.000225
57551   Joeseph   M    39  2007  0.000010
69300     Nancy   F   853  2008  0.000218
92066   Joeseph   M    45  2008  0.000011
104394    Nancy   F   663  2009  0.000174
127335  Joeseph   M    34  2009  0.000009
139050    Nancy   F   565  2010  0.000154
161863  Joeseph   M    29  2010  0.000008

[10 rows x 5 columns]

On pandas versions before 0.13, where query is not available, you can use df[df.name.isin(['Joeseph', 'Nancy'])] instead.
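To check that the two filters select the same rows, here is a minimal sketch on a small frame built from a few rows of this dataset. The variable names (names, via_isin, via_query) are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Nancy", "Joeseph", "Emily", "Nancy"],
    "num": [1014, 34, 21365, 896],
    "year": [2006, 2006, 2006, 2007],
})

names = ["Joeseph", "Nancy"]

# Boolean-mask version, works on old and new pandas alike.
via_isin = df[df["name"].isin(names)]

# query version; @names refers to the local Python variable.
via_query = df.query("name in @names")

# Both approaches keep the same rows with the same index labels.
same = via_isin.equals(via_query)
print(same)
```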

Since you already have prop calculated, we don't need any further groupbys (this is a bit simpler than what I had before):

In [42]: s = df.query('name in ["Joeseph", "Nancy"]').set_index(['year', 'name'])['prop']

In [46]: ax = s.unstack().plot()


With this method you shouldn't have to worry about aligning the x-values. unstack puts the years on the index and gives each name its own column, so both lines are plotted against the same years.
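One caveat, given that the question mentions years appearing twice when a name was given to both girls and boys: in that case the (year, name) pairs are no longer unique, and unstack on a set_index result would raise an error about duplicate entries. A possible workaround, sketched here under the assumption that you want girls and boys combined, is to sum prop per year and name before unstacking. The ("Nancy", "M") row is hypothetical, added only to create a duplicate, and plot_names is an illustrative helper name.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Nancy", "Nancy", "Joeseph", "Nancy", "Joeseph"],
    "sex": ["F", "M", "M", "F", "M"],
    # The ("Nancy", "M") value is made up to produce a repeated year.
    "prop": [0.000257, 0.000001, 0.000009, 0.000225, 0.000010],
    "year": [2006, 2006, 2006, 2007, 2007],
})

# Collapse girl/boy rows so each (year, name) pair occurs exactly once.
s = df.groupby(["year", "name"])["prop"].sum()

# Years become the index and each name its own column,
# so every column shares the same x-values.
wide = s.unstack()
print(wide)


def plot_names(table):
    """Draw one line per name column; requires matplotlib to be installed."""
    ax = table.plot()
    ax.set_xlabel("year")
    ax.set_ylabel("prop")
    return ax
```

Calling plot_names(wide) then draws both names against the shared year index.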
