How to replicate the data in our paper "Social Integration via Online Dating"

Theory and Simulations of Networks. You can download our Matlab script here.

State Variables. You can download them from this file. I explain below how to construct each variable. All IPUMS data use the default samples for 1980, 1990, and 2000-2016.

Broadband. You can download all the original FCC reports from this folder or manually from the FCC website https://www.fcc.gov/general/reports-high-speed-services-internet-access. We use the December data (the FCC sometimes also releases July data). The observations for Wyoming 2000 and Hawaii 2000-2005 are missing, so we set Wyoming 2000 equal to Wyoming 2001 and discard the Hawaii observations.
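The Wyoming fill-in and the Hawaii exclusion described above could be coded along the following lines. This is only a minimal sketch on toy data: the variable names state, year, and broadband, and all the values, are hypothetical and are not taken from the FCC files.

```stata
* Toy panel; variable names and values are hypothetical
clear
input str10 state int year double broadband
"Wyoming" 2000 .
"Wyoming" 2001 3.2
"Hawaii"  2000 .
"Hawaii"  2006 9.1
"Ohio"    2000 4.0
end

* Copy Wyoming's 2001 value into its missing 2000 observation
summarize broadband if state == "Wyoming" & year == 2001, meanonly
replace broadband = r(mean) if state == "Wyoming" & year == 2000 & missing(broadband)

* Discard the missing Hawaii observations (2000-2005)
drop if state == "Hawaii" & inrange(year, 2000, 2005)

list, clean
```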

Telephones. We obtained these data from the 1956 US County and City Data Book, available here.

MDUs. Multiple dwelling units. We obtained them from IPUMS (https://usa.ipums.org), using the default samples for 1980, 1990, 2000 and 2016. The key variable is unitsstr; MDUs are housing units in structures with three or more units.
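As an illustration, a state-level MDU share could be built from unitsstr roughly as follows. This is a sketch under stated assumptions, not necessarily the procedure used for the paper: it assumes that unitsstr codes 6 to 10 are the categories with three or more units (please check the IPUMS codebook for your samples), that one record per household is kept via pernum==1, and that households are weighted by hhwt. The toy data are invented.

```stata
* Toy extract; values are invented for illustration
clear
input int year byte stateicp byte pernum byte unitsstr double hhwt
2000 1 1 3 100
2000 1 2 3 100
2000 1 1 7  50
2000 1 1 0  20
end

keep if pernum == 1        // one record per household
drop if unitsstr == 0      // not applicable

* Assumption: codes 6-10 are structures with 3+ units; verify in the codebook
generate byte mdu = inrange(unitsstr, 6, 10)

* Weighted share of households living in MDUs, by state and year
* (toy data: 50 / (100 + 50))
collapse (mean) mdu_share = mdu [aweight = hhwt], by(stateicp year)
list, clean
```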

Unemployment. We obtain annual average unemployment rates from the Bureau of Labor Statistics; click here.

Median Household Income. We obtain it from the U.S. Census Bureau, Current Population Survey, Annual Social and Economic Supplements. Households are as of March of the following year. Income is in 2017 adjusted dollars. The 1980 data are estimated using data from 1984 to 2000. Download the originals here.
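The text does not say how the 1980 values are estimated. One plausible approach, shown here purely as an assumption and not necessarily the procedure used for the paper, is a state-by-state linear trend fitted on 1984-2000 and extrapolated back to 1980. The variable names stateicp, year, and income, and the toy values, are hypothetical.

```stata
* Toy panel with an empty 1980 row per state; values are invented
clear
input byte stateicp int year double income
1 1980 .
1 1984 50000
1 1992 54000
1 2000 58000
2 1980 .
2 1984 40000
2 1992 44800
2 2000 49600
end

* Fit a linear trend per state on 1984-2000 and predict for all years
generate double income_hat = .
levelsof stateicp, local(states)
foreach s of local states {
    quietly regress income year if stateicp == `s' & inrange(year, 1984, 2000)
    quietly predict double tmp
    quietly replace income_hat = tmp if stateicp == `s'
    drop tmp
}

* Use the extrapolated value only where 1980 is missing
replace income = income_hat if year == 1980 & missing(income)
list, sepby(stateicp)
```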

High-Tech Industries. We obtain this data from the County Business Patterns here. High-tech industries are classified following Hecker (1999). The data extend from 1986 to 2016; we estimate the data for 1980. Run the following Stata code on the original data. The SIC codes are converted to NAICS codes using this website: https://www.naics.com/sic-naics-crosswalk-search-results

* Keep the first three digits of the four-digit SIC code
gen six=floor(real(sic)/10)
gen hightech=1 if (six==281 | six==286 | six==283 | six==357 | six==366 | six==367 | six==372 | six==376 | six==381 | six==382 | six==737 | six==873 | six==282 | six==284 | six==285 | six==287 | six==289 | six==291 | six==348 | six==351 | six==353 | six==355 | six==356 | six==361 | six==362 | six==365 | six==371 | six==384 | six==386 | six==871 | six==874)
replace hightech=0 if hightech!=1
tab  fipstate hightech

Or, with the NAICS codes:

gen nai=real(naics)
drop if lfo!="-" // this line only for data after 2010
gen hightech=1 if inlist(nai, 325180, 325120, 325130, 211130, 325130, 325180, 325998, 331313, 325194, 325110, 325130, 325194, 325110, 325120, 325180, 325193, 325194, 325199, 325998, 325411, 325412, 325412, 325413, 325414, 334111, 334112, 334118, 333316, 334118, 334418, 334613, 333316, 333318, 333318, 334118, 333318, 334519, 339940, 334210, 334418, 334220, 334515, 334290, 334419, 334412, 334413, 334416, 334416, 334416, 334417, 334220, 334310, 334418, 334419, 334515, 336411, 541713, 336412, 541713, 332912, 336411, 336413, 541713, 336414, 541713, 336415, 541713, 336419, 541713, 334511, 333249, 333415, 333994, 333997, 333999, 337127, 339113, 334512, 334513, 334514, 334514, 334515, 334516, 333314, 334514, 334519, 334519, 339112, 334513, 541511, 334614, 511210, 541512, 518210, 517311, 517919, 541513, 532420, 443142, 811212, 518210, 541512, 541519, 541713, 541714, 541715, 541720, 541910, 541713, 541714, 541715, 541720, 541380, 541940, 325211, 325212, 325220, 325220, 325611, 325612, 325613, 325611, 325620, 325510, 325311, 325312, 325314, 325320, 325520, 325920, 325910, 325180, 311942, 325199, 325510, 325998, 324110, 332992, 332993, 332994, 332994, 333611, 333618, 336390, 333120, 333923, 336510, 333131, 333132, 333921, 333922, 333923, 332439, 332999, 333924, 333249, 333243, 333243, 333244, 333241, 332410, 333111, 333242, 333249, 333249, 333318, 333914, 332991, 333912, 333413, 333413, 333993, 333612, 333994, 333613, 314999, 333414, 333999, 335311, 335313, 335312, 335991, 335314, 335999, 334310, 334614, 512250, 336111, 336112, 336120, 336211, 336992, 336211, 336211, 336310, 336320, 336330, 336340, 336350, 336390, 336212, 336213, 332994, 333249, 333415, 333994, 333997, 333999, 337127, 339112, 339113, 322291, 334510, 339113, 339999, 339114, 334517, 334510, 334517, 325992, 333316, 541330, 541310, 541360, 541370, 236115, 236116, 236118, 236210, 236220, 237110, 237120, 237130, 237310, 237990, 561110, 541611, 541612, 541613, 541614, 561312, 541820, 561210, 541320, 541330, 541618, 541690, 611710)
replace hightech=0 if hightech!=1
tab  fipstate hightech

Ratio of Non-White People and Ratio of Men/Women. Download the data from IPUMS (https://usa.ipums.org).

GDP. We obtain the GDP data in chained 2012 dollars from the Bureau of Economic Analysis https://www.bea.gov/data/gdp/gdp-state

Population Density. Computed as the state population divided by the state's land area in square miles. The original data can be downloaded here.

Individual Variables.

Interracial Marriage. We download our data from https://usa.ipums.org. To get the interracial marriage rates, we use the following Stata code.

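clear all
* Adjust the path to wherever you saved the IPUMS extract
use "C:\Users\JOR\Desktop\IndividualData.dta"
* Keep only people whose spouse lives in the same household
drop if sploc==0

* Recode race: 4 = latino (hispan != 0), 5 = asian, 7 = other
replace race=5 if race==6
replace race=5 if race==4
replace hispan=1 if hispan~=0
replace race=7 if race==8
replace race=7 if race==9
replace race=4 if hispan==1
label define racelabel 1 white 2 black 3 native 4 latino 5 asian 7 other
label values race racelabel

* Build one identifier per couple from the two spouses' person numbers
generate couple_id = cond(pernum < sploc, string(pernum) + " " + string(sploc), string(sploc) + " " + string(pernum))

* mixed = 1 if the two spouses report different races
bysort stateicp year serial couple_id : generate mixed = race[1] != race[2] if _N == 2
* Drop same-sex couples
bysort stateicp year serial couple_id : gen hetero=0 if sex[1]==sex[2]
drop if hetero==0

* Weighted share of mixed couples by year, separately for each state
by stateicp: tab year mixed [fweight= hhwt ], row nofreq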
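The couple-matching step above can be checked on a small hand-made example before running it on the full extract. The toy households below are invented; only the variable names serial, pernum, sploc, race, and sex follow the code above. Household 1 contains a married couple with different race codes plus one unmarried person, and household 2 contains a couple with the same race code.

```stata
* Toy data: two households, values invented for illustration
clear
input long serial byte pernum byte sploc byte race byte sex
1 1 2 1 1
1 2 1 2 2
1 3 0 1 1
2 1 2 1 1
2 2 1 1 2
end

* Drop the person with no spouse in the household
drop if sploc == 0

* Both spouses get the same identifier, e.g. "1 2"
generate couple_id = cond(pernum < sploc, string(pernum) + " " + string(sploc), string(sploc) + " " + string(pernum))

* mixed is 1 for the couple in household 1 and 0 for the couple in household 2
bysort serial couple_id: generate byte mixed = race[1] != race[2] if _N == 2

list serial pernum sploc race couple_id mixed, sepby(serial)
```

Because each couple appears as two person records, any tabulation on these data counts every couple twice. This leaves the shares unchanged.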

Figures. To produce Figure 2 with your own Facebook friend graph, you can use https://lostcircles.com/.

To produce Figure 3, download the data from the Pew Research Center here.

To produce Figures 8, 9, and 10, use this file and then run the following Stata commands (or a minor modification of them):

drop if State=="Total"  
drop if State=="Hawaii"
gen bbchange=BB2017 - BB2000 
gen imchange=IM2017 - IM2000  
scatter bbchange imchange, mlabel(B) || lfit bbchange imchange