It is important to follow Ed's direction of cleaning the data using aggregate function before putting the data into excel. This will keep you from a lot of manual checking that is unnecessary. When ready, paste the data you have into an excel file. In that excel file, I made three columns: one to check whether state codes were equivalent, one checking whether the date of first investment was 3 years before the MA or IPO, and one checking whether both of these conditions were satisfied for each company. I did this using simple if statements. This process is manual checking and filtering to see whether matches are correct or not and are thus extremely subjective and tedious. First, I went through and checked the companies that did not have equivalent state codes. If the company was one that I knew or the name was unique to the point that I did not believe the same name would appear in another state, I marked the state codes as equivalent. I did the same for the date of first investment vs MA/IPO date. Then I removed all duplicates that had the marking Warning Multiple Matches, and the data sheets were clean.
==Finding Companies that Underwent IPOs and MAs==
#Load IPOClean and MAClean into the database.
#Make a table of keys by doing a JOIN on the two tables in order to find the companies that underwent both MAs and IPOs. (Do this by joining on the primary keys of portcos in both tables.) Make another table by left joining these results back with IPOClean and MAClean. COPY this table into an excel sheet and manually find which companies you want to keep as MAs and which you want to keep as IPOs. Make sure to keep the portco primary key in this table.
#Load the decided IPO and MA data back into the database into a table called IPOMASelected, including the primary keys of the portcos, and the primary keys of the MAs and IPOs selected.
#Make a table of primary keys of portcos and MAs and IPOs called ExitKeysClean. Do this by taking the IPOs which didn't have MAs from IPOClean and the MAs which didn't have IPOs from MAClean and the selected MAs and IPOs from IPOMASelected. Note that one way to do this is to LEFT JOIN the MA table with the IPO table. Select the companies where the IPO table are null as these are the companies that only had MAs. Do the same for IPOs. Now you have tables of companies that underwent only MAs and only IPOs.
#Make a table called PortCoExits by compiling information from ExitKeysClean Left joined to your main (no duplicates) IPO and MA tables, and using CASE WHEN statements.
Example Join on a composite Key:
DROP TABLE IPOMAForReview;
CREATE TABLE IPOMAForReview
SELECT A.*, B.targetname, B.targetstate, B.announcedate
FROM IPOClean AS A
JOIN MAClean AS B ON A.coname=B.coname AND A.statecode=B.statecode AND A.datefirstinv=B.datefirstinv;
Example CASE WHEN statement:
DROP TABLE PortCoExits;
CREATE TABLE PortCoExits
SELECT A.coname, A.statecode, A.datefirst,
CASE WHEN b.coname IS NOT NULL THEN 1::int WHEN c.coname IS NOT NULL THEN 0::int ELSE NULL::int END AS IPOvsMA,
CASE WHEN b.coname IS NOT NULL THEN b.proceedsall WHEN c.coname IS NOT NULL THEN c.transactionvalue ELSE NULL::int END AS exitvalue,
...
FROM ExitKeysClean AS A
LEFT JOIN IPONoDups AS B ON A.issuername=B.issuername AND A.statecode=B.statecode AND A.issuedate=B.issuedate
LEFT JOIN MANoDups AS B ON A.targetname=B.targetname AND A.statecode=B.statecode AND A.announcedate=B.announcedate;
When putting values together, make sure that they are in the same units (pref $m).
==Process For Creating the PortCoExits Table==