Crunchbase Database

From edegan.com
Revision as of 15:55, 21 March 2019 by Hiep (talk | contribs)


Project
Crunchbase Database
Project Information
Has title Crunchbase Database
Has owner Hiep Nguyen
Has start date 2019/03/13
Has deadline date 2019/03/22
Has project status Active
Dependent(s): Ecosystem Organization Classifier, Incubator Seed Data
Copyright © 2019 edegan.com. All Rights Reserved.


Files and Dbase

Files are in:

  • E:\projects\crunchbase3
  • Z:\crunchbase3

Dbase is crunchbase3

The old project page is Crunchbase Data. File locations listed there as Z:/bulk/ are now under Z:/bulk/mcnair/. For example, an old load script is at /bulk/mcnair/crunchbase/crunchbaseData/load_crunchbase.sql


Crunchbase Pro

https://www.crunchbase.com/login

Login details:

  • mcnair@rice.edu (get the password from Ed)

Getting and cleaning data

The URL for the bulk CSV export API call is https://api.crunchbase.com/v3.1/csv_export/csv_export.tar.gz?user_key=[API KEY GOES HERE]

The (premium) API key is stored in E:\projects\crunchbase3

The bash script that downloads and extracts the data (1.9 GB) is at E:\projects\crunchbase3\get_data.sh

Alternatively, download and extract the data directly from the Windows command prompt with the following commands:

curl -o csv_export.tar.gz "https://api.crunchbase.com/v3.1/csv_export/csv_export.tar.gz?user_key=[API key goes here]"

tar -xvf csv_export.tar.gz
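The download-and-extract step can also be sketched as a small bash script. This is a minimal sketch under stated assumptions, not the contents of get_data.sh: the function names, the CB_API_KEY environment variable, and the data target directory are all hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a download-and-extract step; NOT the actual get_data.sh.

# Build the bulk-export URL for a given API key.
build_export_url() {
  local api_key="$1"
  printf 'https://api.crunchbase.com/v3.1/csv_export/csv_export.tar.gz?user_key=%s\n' "$api_key"
}

# Download the archive to a fixed local name, so the query string
# (and with it the API key) never ends up in the file name.
download_export() {
  local api_key="$1"
  local archive="$2"
  curl --fail --silent --show-error --location \
       -o "$archive" "$(build_export_url "$api_key")"
}

# Extract the archive into a target directory, creating it if needed.
extract_export() {
  local archive="$1"
  local dest="$2"
  mkdir -p "$dest"
  tar -xf "$archive" -C "$dest"
}

# Example usage (CB_API_KEY is an assumed environment variable holding the key):
#   download_export "$CB_API_KEY" csv_export.tar.gz
#   extract_export csv_export.tar.gz data
```

Passing the key as an argument rather than hard-coding it keeps it out of the script, which matters if the script is ever copied to a shared location.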

Current CSV files in the Crunchbase export:

data\acquisitions.csv
data\category_groups.csv
data\degrees.csv
data\events.csv
data\event_appearances.csv
data\funding_rounds.csv
data\funds.csv
data\investments.csv
data\investment_partners.csv
data\investors.csv
data\ipos.csv
data\jobs.csv
data\organizations.csv
data\organization_descriptions.csv
data\org_parents.csv
data\people.csv
data\people_descriptions.csv
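Since the export's contents can change between releases, it may help to confirm that every file listed above is present after extraction. The following is a minimal sketch; the function name missing_csvs and the variable EXPECTED_TABLES are hypothetical, and the directory to check is passed in by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical check that an extracted export contains the expected CSV files.

# Table names taken from the file list on this page (without the .csv suffix).
EXPECTED_TABLES="acquisitions category_groups degrees events event_appearances
funding_rounds funds investments investment_partners investors ipos jobs
organizations organization_descriptions org_parents people people_descriptions"

# Print the name of every expected CSV that is missing from the given directory.
# Returns non-zero if at least one file is missing.
missing_csvs() {
  local dir="$1"
  local status=0
  local name
  for name in $EXPECTED_TABLES; do
    if [ ! -f "$dir/$name.csv" ]; then
      echo "$name.csv"
      status=1
    fi
  done
  return "$status"
}

# Example usage:
#   missing_csvs data || echo "export is incomplete"
```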

The SQL script get_data.sql from last year has been copied to the current Crunchbase3 directory. However, the old and new databases now differ substantially, so the script needs adjustments. Hiep will fix it on 03/22/2019.
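One way to see which adjustments the script needs is to compare the header row of an old CSV file with that of the matching new file. The sketch below is only an illustration: the function names and the example paths are hypothetical, and it assumes simple headers (comma-separated column names with no embedded commas).

```shell
#!/usr/bin/env bash
# Hypothetical helper for spotting column changes between two exports.

# Print the column names of a CSV file, one per line, from its header row.
# Strips carriage returns and double quotes; assumes no commas inside names.
csv_columns() {
  head -n 1 "$1" | tr -d '\r"' | tr ',' '\n'
}

# Print the columns that appear in the first file but not in the second.
columns_only_in() {
  local first="$1"
  local second="$2"
  local col
  csv_columns "$first" | while IFS= read -r col || [ -n "$col" ]; do
    # -F fixed string, -x whole-line match, -q quiet
    csv_columns "$second" | grep -Fxq -- "$col" || echo "$col"
  done
}

# Example usage (paths are placeholders):
#   columns_only_in old/organizations.csv data/organizations.csv   # dropped columns
#   columns_only_in data/organizations.csv old/organizations.csv   # added columns
```

Columns reported by the first call would need to be removed from the old table definitions, and columns reported by the second call would need to be added.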