}}
=New Work=
==Files==
New files are in E:\projects\agglomeration
Finally in this part, build '''hcllevels''' and '''hcllayerwzero'''. For hcllevels we are going to compute mean distances between clusters; it is computationally infeasible to do this for all layers. Then, for all layers (including zero), we are going to run our selection regression.
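As a rough illustration of the mean-distance computation, the sketch below averages the Euclidean distance over all pairs of cluster centroids. It is only a sketch under assumptions: the page does not say how the distance between clusters is defined, so representing each cluster by a centroid in projected coordinates (meters) and reporting hectometers are guesses, and the function names are hypothetical. The number of pairs grows quadratically with the number of clusters, which is one plausible reason the calculation becomes expensive across all layers.

```python
import math
from itertools import combinations


def pair_count(n_clusters):
    """Number of unordered cluster pairs, n * (n - 1) / 2."""
    return n_clusters * (n_clusters - 1) // 2


def mean_pairwise_distance_hm(centroids):
    """Mean Euclidean distance over all pairs of centroids, in hectometers.

    centroids: list of (x, y) tuples in meters (assumed projected coordinates).
    Returns None when there are fewer than two clusters.
    """
    pairs = list(combinations(centroids, 2))
    if not pairs:
        return None
    total_m = sum(math.dist(a, b) for a, b in pairs)
    return total_m / len(pairs) / 100.0  # 1 hectometer = 100 meters


# Hypothetical centroids forming a 300-400-500 meter triangle.
example = [(0.0, 0.0), (300.0, 0.0), (0.0, 400.0)]
print(mean_pairwise_distance_hm(example))
print(pair_count(1000))
```

With a layer containing 1,000 clusters there are already 499,500 pairs to evaluate, and that work repeats for every city, year and layer.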
For the next steps on the data see [[Jeemin Sim (Work Log)]]. This includes details of how to load the TIF data.

===First Estimation(s)===

At this stage we have MasterLevels.txt and MasterLayers.txt as datafiles. MasterLevels.txt contains only layers corresponding to levels 0 through 12 and also has noothergeoms and avgdisthm as variables.

The questions we need to answer are:
#Is there an agglomeration effect?
#Which level or layer best describes a city (perhaps for a year, or perhaps over its life)?

We can just pick a level (say 25 hectares) and run a within-city regression:

 . xtreg growthinv17l_f growthinv17l nosingletonl totmultitoncountl totpaircountl tothullcountl avgpairlengthl avghulldensi
 > tyl avgdisthml i.year if level==6, fe cluster(placelevelid)
 Fixed-effects (within) regression               Number of obs     =      5,027
 Group variable: placelevelid                    Number of groups  =        198
 R-sq:                                           Obs per group:
      within  = 0.4097                                         min =          3
      between = 0.8310                                         avg =       25.4
      overall = 0.5974                                         max =         37
                                                 F(44,197)         =      78.20
 corr(u_i, Xb)  = 0.4087                         Prob > F          =     0.0000
                               (Std. Err. adjusted for 198 clusters in placelevelid)
 -----------------------------------------------------------------------------------
                   |               Robust
    growthinv17l_f |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 ------------------+----------------------------------------------------------------
      growthinv17l |   .1388644   .0176074     7.89   0.000     .1041412    .1735877
      nosingletonl |   .1447935   .0402488     3.60   0.000     .0654197    .2241673
 totmultitoncountl |   .0909545   .0481349     1.89   0.060    -.0039714    .1858803
     totpaircountl |   .1724367   .0383185     4.50   0.000     .0968695    .2480039
     tothullcountl |   .7120504   .0467915    15.22   0.000     .6197739    .8043269
    avgpairlengthl |  -.0219417    .023633    -0.93   0.354    -.0685478    .0246645
   avghulldensityl |    .049566   .0202756     2.44   0.015     .0095808    .0895511
        avgdisthml |   .0933327    .076309     1.22   0.223    -.0571546    .2438201

Or:

 . xtreg growthinv17l_f growthinv17l numstartups numstartupssq nosinglemulti nosinglemultisq nohull nohullsq nopair nopairs
 > q i.year if level==6, fe cluster(placelevelid)
 Fixed-effects (within) regression               Number of obs     =      5,773
 Group variable: placelevelid                    Number of groups  =        200
 R-sq:                                           Obs per group:
      within  = 0.4017                                         min =          4
      between = 0.8425                                         avg =       28.9
      overall = 0.5708                                         max =         37
                                                 F(45,199)         =      72.39
 corr(u_i, Xb)  = 0.4222                         Prob > F          =     0.0000
                             (Std. Err. adjusted for 200 clusters in placelevelid)
 ---------------------------------------------------------------------------------
                 |               Robust
  growthinv17l_f |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 ----------------+----------------------------------------------------------------
    growthinv17l |    .220333    .018699    11.78   0.000     .1834595    .2572066
     numstartups |   .0062875   .0022944     2.74   0.007      .001763    .0108119
   numstartupssq |  -8.04e-07   1.14e-06    -0.70   0.483    -3.06e-06    1.45e-06
   nosinglemulti |   .0648575   .0168134     3.86   0.000     .0317023    .0980127
 nosinglemultisq |  -.0021614   .0006336    -3.41   0.001    -.0034108    -.000912
          nohull |   .1747691   .0255105     6.85   0.000     .1244636    .2250747
        nohullsq |  -.0057148   .0012164    -4.70   0.000    -.0081136    -.003316
          nopair |   .0896908   .0248207     3.61   0.000     .0407455    .1386361
        nopairsq |  -.0097196   .0024153    -4.02   0.000    -.0144825   -.0049567

Note that the following don't work, either alone or with other variables (including numstartups and numstartupssq), probably because they are third-order effects:

 . xtreg growthinv17l_f growthinv17l avghulldensity avghulldensitysq avgpairlength avgpairlengthsq avgdisthm avgdisthmsq i.
 > year if level==6, fe cluster(placelevelid)
 Fixed-effects (within) regression               Number of obs     =      5,027
 Group variable: placelevelid                    Number of groups  =        198
 R-sq:                                           Obs per group:
      within  = 0.3579                                         min =          3
      between = 0.5926                                         avg =       25.4
      overall = 0.3753                                         max =         37
                                                 F(43,197)         =    2152.49
 corr(u_i, Xb)  = 0.2529                         Prob > F          =     0.0000
                              (Std. Err. adjusted for 198 clusters in placelevelid)
 ----------------------------------------------------------------------------------
                  |               Robust
   growthinv17l_f |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -----------------+----------------------------------------------------------------
     growthinv17l |   .2668427   .0208574    12.79   0.000     .2257101    .3079752
   avghulldensity |   .0008076   .0003875     2.08   0.038     .0000433    .0015718
 avghulldensitysq |  -1.14e-07   8.80e-08    -1.29   0.197    -2.87e-07    5.97e-08
    avgpairlength |  -.0018724   .0036128    -0.52   0.605    -.0089972    .0052524
  avgpairlengthsq |  -.0000296   .0000596    -0.50   0.620    -.0001471    .0000879
        avgdisthm |    .001429   .0035371     0.40   0.687    -.0055465    .0084045
      avgdisthmsq |   -.000012   .0000157    -0.76   0.447    -.0000429     .000019

We can also do it with fractions and their squares (omit fracsinglemulti). However, at level 6 (25 hectares), pairs seem more important than hulls:

 . xtreg growthinv17l_f growthinv17l numstartups numstartupssq fracpair fracpairsq frachull frachullsq i.year if level==6,
 > fe cluster(placelevelid)
 Fixed-effects (within) regression               Number of obs     =      5,773
 Group variable: placelevelid                    Number of groups  =        200
 R-sq:                                           Obs per group:
      within  = 0.3919                                         min =          4
      between = 0.8456                                         avg =       28.9
      overall = 0.5274                                         max =         37
                                                 F(43,199)         =      62.34
 corr(u_i, Xb)  = 0.4268                         Prob > F          =     0.0000
                           (Std. Err. adjusted for 200 clusters in placelevelid)
 -------------------------------------------------------------------------------
               |               Robust
 growthinv17~f |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 --------------+----------------------------------------------------------------
  growthinv17l |   .2481436   .0181447    13.68   0.000     .2123631     .283924
   numstartups |   .0100673   .0019792     5.09   0.000     .0061644    .0139702
 numstartupssq |  -6.92e-06   2.03e-06    -3.41   0.001    -.0000109   -2.92e-06
      fracpair |   .7540177   .3709212     2.03   0.043     .0225772    1.485458
    fracpairsq |     -1.936   .7030942    -2.75   0.006    -3.322472   -.5495289
      frachull |   .1969853    .562807     0.35   0.727    -.9128457    1.306816
    frachullsq |  -.1491389   .3878513    -0.38   0.701    -.9139649     .615687

Whereas across all levels:

 . xtreg growthinv17l_f growthinv17l numstartups numstartupssq fracpair fracpairsq frachull frachullsq i.year, fe cluster(p
 > lacelevelid)
 Fixed-effects (within) regression               Number of obs     =     76,623
 Group variable: placelevelid                    Number of groups  =      2,600
 R-sq:                                           Obs per group:
      within  = 0.3956                                         min =          4
      between = 0.8330                                         avg =       29.5
      overall = 0.5279                                         max =         37
                                                 F(43,2599)        =     827.33
 corr(u_i, Xb)  = 0.4143                         Prob > F          =     0.0000
                         (Std. Err. adjusted for 2,600 clusters in placelevelid)
 -------------------------------------------------------------------------------
               |               Robust
 growthinv17~f |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 --------------+----------------------------------------------------------------
  growthinv17l |   .2522524   .0049725    50.73   0.000     .2425019    .2620028
   numstartups |   .0100677   .0005323    18.91   0.000     .0090239    .0111114
 numstartupssq |  -6.95e-06   5.51e-07   -12.62   0.000    -8.03e-06   -5.87e-06
      fracpair |   .4152028   .0956859     4.34   0.000     .2275745    .6028311
    fracpairsq |  -.9753654   .1271631    -7.67   0.000    -1.224717   -.7260141
      frachull |  -.8606939   .1231519    -6.99   0.000     -1.10218   -.6192081
    frachullsq |    .495785   .0976557     5.08   0.000     .3042942    .6872758

This is probably because of the variation in hulls vs pairs at level 6, which has lots of cities with nothing in pairs and everything in hulls. We might want to 'control' for cityarea by restricting our within-city analysis to large enough cities.

A 25 hectare target area might be too encapsulating, as more than 10% of observations are 100% in hulls:

 . su frachull if level==6, det
                           frachull
 -------------------------------------------------------------
       Percentiles      Smallest
  1%     .2162162       .1153846
  5%     .3333333       .1153846
 10%     .4285714       .1428571       Obs               6,032
 25%           .6       .1428571       Sum of Wgt.       6,032
 50%           .8                      Mean           .7539126
                         Largest       Std. Dev.      .2209328
 75%     .9666666              1
 90%            1              1       Variance       .0488113
 95%            1              1       Skewness      -.6390206
 99%            1              1       Kurtosis       2.400947
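
The level-and-square specifications above imply a turning point for each variable, at x* = -b1 / (2 * b2), where b1 is the coefficient on the level and b2 the coefficient on the square. The sketch below does that arithmetic for the level 6 estimates reported above. It is an illustrative calculation rather than part of the original analysis: the helper name <code>turning_point</code> is hypothetical, and a turning point is only informative when both coefficients are reasonably precisely estimated.

```python
def turning_point(b_linear, b_squared):
    """Return x* = -b1 / (2 * b2) for y = b1 * x + b2 * x**2.

    Returns None when the squared term is zero (no turning point).
    """
    if b_squared == 0:
        return None
    return -b_linear / (2.0 * b_squared)


# (level coefficient, squared coefficient) copied from the level 6 output above.
level6 = {
    "nosinglemulti": (0.0648575, -0.0021614),
    "nohull": (0.1747691, -0.0057148),
    "nopair": (0.0896908, -0.0097196),
    "fracpair": (0.7540177, -1.936),
}

for name, (b1, b2) in level6.items():
    # A negative squared coefficient means the turning point is a maximum.
    kind = "maximum" if b2 < 0 else "minimum"
    print(f"{name}: {kind} at {turning_point(b1, b2):.2f}")
```

For example, the nohull estimates imply a maximum at roughly 15 hulls, and the fracpair estimates imply a maximum at a pair fraction of roughly 0.19.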
===Other===
*[[TIF Project]]
=Old Work Using Circles=
==Very Old Summary==
Agglomeration is generally thought to be one of the most important determinants of growth for urban entrepreneurship ecosystems. However, there is essentially no empirical evidence to support this. This paper takes advantage of geocoding and introduces a novel measure of agglomeration. This measure is the smallest circle area that covers all startup offices, subject to having at least N startups in each circle. Using GIS data on cities, this paper controls for the density and socio-demographics of an area to identify the effect of just agglomeration.