==Process For Creating the PortCoExits Table==
Even if you ===MA Process===First we must load the clean, manually checked tables back into the excel sheet for Warning Multiple Matches with the Hall warningdatabase. DROP TABLE MAClean; CREATE TABLE MAClean ( conamestd varchar(255), targetnamestd varchar(255), method varchar(100), x1 varchar(255), coname varchar(255), statecode varchar(10), datefirstinv date, x2 varchar(255), targetname varchar(255), targetstate varchar(10), announceddate date ); \COPY MAClean FROM 'MAClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV --7205 SELECT COUNT(*) FROM(SELECT DISTINCT targetname, targetstate, announceddate FROM MAClean)a; --7188 As you can see there are still may be duplicatesduplicate primary keys in the table. To get rid of these I wrote a query that only includes chooses primary keys that occur only once and matches them against MANoDups. That way you will have unique primary keys by construction.
There are two companies that have the name Masspower in the MAClean file. One is written in all caps and will thus not be caught by an aggregate function. I will select only the companies where the primary keys occurs once and join this to MAClean. I will then select needed info from MANoDps.
DROP TABLE MACleanNoDups;
CREATE TABLE MACleanNoDups AS
Now do the same for the IPOs.
===IPO Process===
DROP TABLE IPOClean;
CREATE TABLE IPOClean (
conamestd varchar(255),
issuernamestd varchar(255),
method varchar(100),
x1 varchar(255),
coname varchar(255),
statecode varchar(10),
datefirstinv date,
x2 varchar(255),
issuername varchar(255),
issuerstate varchar(10),
issuedate date
);
\COPY IPOClean FROM 'IPOClean.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV
--2146
SELECT COUNT(*) FROM(SELECT DISTINCT issuername, issuerstate, issuedate FROM IPOClean)a;
--2141
As with the MA process, there were duplicates in the clean IPO table. Get rid of these using the same process as with MAs. Only choose the primary keys that occur once and join these to the IPONoDups table.
DROP TABLE IPOCleanNoDups;