data/most-common-name at master · mhinne/data

This directory contains the code and data behind the story:

The main script file is most-common-name.R

There are four input files:

state-pop.csv - Total population and Hispanic population by state.
surnames.csv - Data on surnames from the U.S. Census Bureau, including a breakdown by race/ethnicity.
aging-curve.csv - Data from the Social Security Administration on the chances that someone born in the decade shown was still alive in 2013: http://www.ssa.gov/oact/NOTES/as120/LifeTables_Tbl_7.html
adjustments.csv - Taken directly from Lee Hartman's article: http://mypage.siu.edu/lhartman/johnsmith.html.

And five output files:

adjusted-name-combinations-list.csv - Adjusted estimates for the most common full names.
adjusted-name-combinations-matrix.csv - The same data as adjusted-name-combinations-list.csv, in matrix form. These are the estimates presented in the second (and final) table of the article.
independent-name-combinations-by-pop.csv - Matrix of estimates for the 100 most common first names by the 100 most common surnames. These were calculated using independent odds and are displayed in the first table of the article.
new-top-firstNames.csv - Final estimated ranking of top first names.
new-top-surnames.csv - Final estimated ranking of top surnames.
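
As a rough illustration of what "independent odds" means for independent-name-combinations-by-pop.csv, the sketch below multiplies a first name's share of the population by a surname's share and scales by the population size. It is written in Python rather than the R of most-common-name.R, and it is not taken from that script: the function name, the shares, and the population figure are all hypothetical, chosen only to show the arithmetic.

```python
def independent_estimate(first_share, surname_share, population):
    """Expected number of people with a given full name, assuming the
    first name and the surname occur independently of each other."""
    return first_share * surname_share * population


# Hypothetical shares for illustration only (NOT values from the data files).
first_shares = {"James": 0.015, "Mary": 0.012}
surname_shares = {"Smith": 0.010, "Johnson": 0.008}
population = 1_000_000  # hypothetical population size

# Build a first-name by surname matrix of estimates, one row per first name.
matrix = {
    first: {
        last: independent_estimate(f_share, s_share, population)
        for last, s_share in surname_shares.items()
    }
    for first, f_share in first_shares.items()
}

for first, row in matrix.items():
    print(first, {last: round(count) for last, count in row.items()})
```

The adjusted output files exist because this independence assumption does not hold exactly: some first names are more or less common among people with particular surnames.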