FROM (SELECT DISTINCT issuer, issuedate, statecode FROM ipos)a;
--9491
The keys are not unique so we must remove duplicate keys first. You will need to try different methods to isolate the duplicate keys. This is where you can be creative. I first started byfinding the duplicates based on issuer, issuedate and statecode which is our key. Have a look in textpad/excel for ways to filter these keys. We would like to save as much information as possible so rather than excluding all these entries which sum to 1888 rows in the ipos table maybe there's some other way we can filter out records and still have distinct keys. DROP TABLE ipoduplicates; CREATE TABLE ipoduplicates AS SELECT *, COUNT(*) FROM (SELECT issuer, issuedate, statecode FROM ipos)a GROUP BY issuer, issuedate, statecode HAVING COUNT(*) > 1; --939 \COPY ipoduplicates TO 'ipoduplicates.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV;In the file you will notice that many keys contain different principalamts. Let's keep the MAX principal amount and throw out the same key that has a lower principalamt. This query is shown below. DROP TABLE ipoinclude; CREATE TABLE ipoinclude AS SELECT issuer, issuedate, statecode, MAX(principalamt) AS principalamt FROM ipos GROUP BY issuer, issuedate, statecode; --9470
==Creating Stage Flags Table==