Difference between revisions of "Urban Start-up Agglomeration and Venture Capital Investment"
Revision as of 16:00, 4 August 2017
Academic Paper | |
---|---|
Title | Urban Start-up Agglomeration |
Author | Ed Egan |
RAs | Peter Jalbert, Jake Silberman, Christy Warden |
Status | In development |
© edegan.com, 2016 |
Summary
Agglomeration is generally thought to be one of the most important determinants of growth for urban entrepreneurship ecosystems. However, there is essentially no empirical evidence to support this. This paper takes advantage of geocoding and introduces a novel measure of agglomeration: the smallest total circle area that covers all startup offices, subject to having at least N startups in each circle. Using GIS data on cities, this paper controls for the density and socio-demographics of an area to isolate the effect of agglomeration itself.
Description
Clusters of economic activity play a significant role in firm performance and growth. An important driver of growth is the knowledge spillover between firms. This includes, among other things, the flow of information and ideas between firms, which can be especially important for the growth of startups and small businesses. This project focuses on the effects of agglomeration on the performance and growth of startup firms. It introduces a novel measure of agglomeration which can be used to empirically test the effects of clustering. This measure is the smallest total circle area that covers all of the startups in the sample such that there are at least n firms in each circle. The project is based on the creation of an algorithm which gives an unbiased measure to be used in the empirical analysis. The regression we are interested in takes the following form:
The dependent variable is a measure of the growth of the firms. This measure could be investment forwarded one period or growth in investment. The explanatory variables include the number of startup firms, m, the agglomeration measure, A, and a vector of other control variables affecting the growth of firms at time t. Because the circle area, i.e. the agglomeration measure A, is endogenous, an instrumental variable is needed to get consistent estimates of the effects we are interested in. The proposed instrument is the presence of a river or road between the points representing the geographical locations of the venture capital-backed firms. The instrument affects agglomeration without having a direct impact on growth, which makes it a good candidate for a valid instrument. The next tasks are determining the additional control variables to include in the regression, the years to include in the analysis, and methods of finding an unbiased measure of agglomeration.
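One specification consistent with this description can be sketched as below. Only m, A and t come from the text; the city index c, the growth variable g, the control vector X, the instrument Z, the coefficients and the error terms are notation assumed here for illustration, and the unit of observation (city-year) is likewise an assumption.

```latex
\begin{align}
g_{c,t+1} &= \beta_0 + \beta_1 A_{c,t} + \beta_2 m_{c,t} + X_{c,t}'\gamma + \varepsilon_{c,t}
  && \text{(second stage)} \\
A_{c,t} &= \pi_0 + \pi_1 Z_{c} + \pi_2 m_{c,t} + X_{c,t}'\delta + u_{c,t}
  && \text{(first stage)}
\end{align}
```

Here Z would indicate the presence of a river or road between startup locations. The instrument is valid only if Z is correlated with A and affects growth solely through A.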
Data
- SDC VentureXpert
- GIS City Data
Data on NSF, NIH, population, income, clinical trials, employment, schooling, R&D expenditures and revenue of firms can be found in Hubs. Data on the number of new VC-backed firms in each city and year is in:
Z:\Hubs\2017\clean data
The name of the file is firm_nr.txt.
The database is cities and the SQL script is nr_firms.sql.
Raw data is in:
Z:\VentureCapitalData\SDCVCData\vcdb2
The file is colevelsimple.txt.
To check for outliers, I compute the average coordinates for each city and take the difference between each firm's coordinates and its city's average. The script for the average city coordinates is in:
Z:\Hubs\2017\sql scripts
The file name is newcolevel.sql.
The differences are taken in Excel. The file containing the differences is in:
Z:\Hubs\2017
The file name is new_colevel.txt.
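The outlier check described above can be sketched in a few lines. This is only an illustration of the idea, not the contents of newcolevel.sql or the Excel step; the function names, the tuple layout (city, latitude, longitude) and the sample coordinates are all hypothetical.

```python
from collections import defaultdict
from math import hypot


def city_centroids(firms):
    """Average latitude and longitude per city.

    `firms` is a list of (city, lat, lon) tuples (assumed layout).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for city, lat, lon in firms:
        entry = sums[city]
        entry[0] += lat
        entry[1] += lon
        entry[2] += 1
    return {city: (s[0] / s[2], s[1] / s[2]) for city, s in sums.items()}


def rank_by_offset(firms):
    """Return (firm, offset) pairs, largest offset from the city average first.

    The offset is the straight-line distance in degrees, which is crude but
    enough to surface firms geocoded far away from the rest of their city.
    """
    centroids = city_centroids(firms)
    ranked = []
    for firm in firms:
        city, lat, lon = firm
        clat, clon = centroids[city]
        ranked.append((firm, hypot(lat - clat, lon - clon)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample: three plausible Houston firms and one mis-geocoded row.
sample = [
    ("Houston", 29.76, -95.37),
    ("Houston", 29.72, -95.40),
    ("Houston", 29.80, -95.36),
    ("Houston", 40.71, -74.01),
]

for firm, offset in rank_by_offset(sample):
    print(firm, round(offset, 2))
```

Because the city average is itself pulled toward an outlier, ranking firms by offset is more informative than a fixed cutoff when a city has few firms.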
Data on the circle area in each city and year is in:
Z:\Hubs\2017\clean data
The name of the file is circles.txt. (It contains only 106 observations.)
The database is cities and the SQL script is circles.sql.
The script for joining the two tables on the VC table is in:
Z:\Hubs\2017\sql scripts
The name of the file is new_firm_nr_circles.sql.
We use the cities with more than 10 active VC-backed firms. Data on the cities and number of active firms is in:
E:\McNair\Projects\Hubs\Summer 2017
The file is CitiesWithGT10Active.txt.
The script for joining the final data with this file is located in:
Z:\Hubs\2017\sql scripts
The file name is final_joined_kerda.sql.
The final data is in:
Z:\Hubs\2017\clean data
The file name is new_final_kerda.txt.
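The joins described above amount to matching firm counts and circle areas on city and year, then keeping only the cities with more than 10 active firms. A minimal sketch of that logic follows. It is not a transcription of the SQL scripts: the inner-join behaviour, the field names and the sample values are assumptions made for illustration.

```python
def join_city_year(firm_counts, circle_areas, active_counts, min_active=10):
    """Join firm counts and circle areas on (city, year).

    Keeps only cities with more than `min_active` active firms and only
    city-years present in both inputs (an inner join, which is an assumption).
    """
    keep = {city for city, n in active_counts.items() if n > min_active}
    joined = []
    for key, n_firms in sorted(firm_counts.items()):
        city, year = key
        if city in keep and key in circle_areas:
            joined.append({
                "city": city,
                "year": year,
                "new_firms": n_firms,
                "circle_area": circle_areas[key],
            })
    return joined


# Hypothetical inputs keyed by (city, year) or by city.
firm_counts = {("Austin", 2015): 12, ("Austin", 2016): 15,
               ("Boise", 2015): 2, ("Denver", 2015): 7}
circle_areas = {("Austin", 2015): 3.4, ("Austin", 2016): 2.9,
                ("Boise", 2015): 0.8}
active_counts = {"Austin": 48, "Boise": 6, "Denver": 25}

rows = join_city_year(firm_counts, circle_areas, active_counts)
for row in rows:
    print(row)
```

With an inner join, city-years that lack a circle area drop out of the result, which would be one explanation for a final table much smaller than the firm-count table.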
Also:
- Enclosing Circle Algorithm
- Normalizer
- Geocode.py
Unbiased measure
The number of startups affects the total area of the circles according to some function. The task is to find an unbiased measure of the area, that is, one that is not affected by the number of startups, given the size and their distribution.
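The dependence on the number of startups can be seen in a small simulation. The sketch below draws points from the same uniform distribution on a unit square and measures a single covering circle centred on the centroid. That circle is a simple stand-in chosen here, not the project's Enclosing Circle Algorithm, and the function names and parameters are hypothetical.

```python
import random
from math import hypot, pi


def covering_circle_area(points):
    """Area of the circle centred on the centroid that covers every point.

    This is an upper bound on the minimum enclosing circle, used only as a
    simple proxy for the agglomeration measure.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radius = max(hypot(x - cx, y - cy) for x, y in points)
    return pi * radius ** 2


def mean_area(n_points, trials=500, seed=0):
    """Average covering area over repeated draws of n_points uniform points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n_points)]
        total += covering_circle_area(pts)
    return total / trials


# Same spatial distribution, different numbers of startups.
for n in (5, 20, 80):
    print(n, round(mean_area(n), 3))
```

The average area rises with the number of points even though the underlying distribution is unchanged, which is the bias an unbiased measure would need to remove.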
For the unbiased calculation of a measure in a different context see: http://users.nber.org/~edegan/w/images/d/d0/Hall_(2005)_-_A_Note_On_The_Bias_In_Herfindahl_Type_Measures_Based_On_Count_Data.pdf
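As an illustration of the kind of correction involved, consider a Herfindahl index computed from N observations spread across J categories. The derivation below is a sketch under a multinomial sampling assumption, with notation chosen here: p_j are the true shares, N_j the observed counts, H the true index, and the hat and tilde mark the raw and corrected estimates.

```latex
\begin{align}
\hat{H} &= \sum_{j=1}^{J} \left(\frac{N_j}{N}\right)^2, \qquad
E\!\left[N_j^2\right] = N p_j (1 - p_j) + N^2 p_j^2 \\
E[\hat{H}] &= \frac{1 - H}{N} + H = \frac{1 + (N - 1) H}{N}, \qquad
H = \sum_{j=1}^{J} p_j^2 \\
\tilde{H} &= \frac{N \hat{H} - 1}{N - 1}
\quad\Longrightarrow\quad E[\tilde{H}] = H
\end{align}
```

The raw index is biased upward when N is small, and the bias shrinks at rate 1/N. The analogous task here is to find how the expected circle area depends on the number of startups and to invert that relationship.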