Working with PostgreSQL
Configuring a copy of Postgres on Windows
Add PostgreSQL to the path if it isn't already:
Control Panel->System->Advanced->Environmental Variables Add: C:\Program Files\PostgreSQL\9.0\bin
or from a cmd window (though sometimes this doesn't stick):
SET PATH=%PATH%;C:\Program Files\PostgreSQL\9.0\bin;
To use PLPerl on windows, you will need to be careful to mix and match the right versions of PostgreSQL and Perl.
Perl version 5.10.1 (a 64bit build is available from ActiveState) works with PostgreSQL 9.0.1 (64bit build 1500). To see your PostgreSQL version type:
psql -c "select version();" template1
To see your perl version type:
perl -v
With these two versions together you should be able to add plperl to template1 (which all new dbs will inherit) with the command:
createlang plperl template1
There is a list of commands/client applications, with links to documentation, which is useful.
You will almost surely want to 'performance tune' your postgresql database, as the default settings are near useless. In particular edit postgresql.conf (which is in the data directory of your install) to change:
shared_buffers = 512MB #Use about 10-15% of available RAM, to a max of 512Mb on windows effective_cache_size = 3GB #Use half to 3/4 of available RAM, depending on your pref work_mem = 128MB # This is the memory allocated to each query sort maintenance_work_mem = 256MB #This is for vacuum, and a max of 256 is recommended.
Note that with work_mem this is the allocation to each sort. Each query you run may do many sorts and you may have many users, so this can explode quickly. 128Mb is an aggresive setting that assumes only a single user. See the various documentation resources, especially the official performance optimization page for more information.
Restart the server (on windows use the 'services' control panel) for many changes to take effect:
pg_ctl restart
Create a user using pgAdmin or the createuser command:
createuser ed_egan
And then create a database again using pgAdmin or the createdb command:
createdb -O ed_egan DBName
Postgres.Haas
We have a new dedicated Postgres server available for use by PhD Students and Faculty only. Access to the server is available from within Haas or over the VPN.
To get an account see Ed Egan (in F533). Ed can either set you up with SSH access so that you can create an account on the database server and as many databases as you like, or he can create an account on the database server, and a blank database owner solely by your account, for you.
Once you have a username and password you and a blank database, can use psql on your local computer (available from [1]) to connect to the database server as detailed below. You will probably find that this gives you everything you need - you can load data, perform queries, and produce output datasets for analysis (for example in STATA).
Without an SSH account you may not be able to use the backup and restore command-line tools. We are currently experimenting with ways to overcome this if necessary. Please ask for more information. If you are comfortable using SSH then all advanced functionality will be available, including the ability to mount your R drive using the commands:
mkdir /mnt/username mount -t cifs //bear/username/ /mnt/ed -o user=username
Currently we are running Postgres version 8.1 on this server. An upgrade is being considered, but the core functionality is more than adequate for our purposes. One useful function that is missing is unnest, which was introduced in version 8.4. However, this can be created manually with details here.
Dumping and Restoring a Database
This can be done in pgAdmin in Windows, but the commands (needed on Linux with SSH access) are:
To dump a db:
pg_dump mydb > db.backup
To reload this database:
pg_restore -d DBName db.backup
Working with psql
To connect your psql client to a db type (for localhost don't specify the host):
psql -h host DBName
Therefore to connect to Postgres.Haas you will type:
psql -h postgres.haas.berkeley.edu DBName
The 'new' postgres server is currently on 128.32.252.201. Connect using your username and DBname (firstname_data by default):
psql -h 128.32.252.201 -U username dbname
Useful commands are:
\q Quits \i basics.sql Run script basics.sql \dt List tables \COPY Psql's version of copy \r Reset the query buffer Ctrl-c Abort the current query
SQL Commands
There is a list of SQL commands that may help.
Make/delete tables and functions with CREATE and DROP:
CREATE TABLE tablename AS SELECT * FROM tablename WHERE date_prod >= '2002-01-01'; DROP TABLE tablename; CREATE FUNCTION getreal (text) RETURNS real AS $$ if ($_[0]=~/^\d{1,}\.\d{0,}$/) { return $_[0]; } return undef; $$ LANGUAGE plperl; DROP Function correctyear(int,int);
Populate data with COPY, INSERT and UPDATE:
INSERT INTO tablename VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27'); COPY tablename FROM '/home/user/weather.txt'; --http://www.postgresql.org/docs/8.4/interactive/sql-copy.html UPDATE tablename SET kind = 'Dramatic' WHERE kind = 'Drama'; UPDATE accounts SET (contact_last_name, contact_first_name) = (SELECT last_name, first_name FROM salesmen WHERE salesmen.id = accounts.sales_id);
Retrieve results with SELECT:
SELECT x.a y.a FROM x,y WHERE x.a=y.a SELECT * FROM x LEFT OUTER JOIN y ON x.a=y.a; SELECT * FROM (SELECT ) AS x LEFT OUTER JOIN (SELECT ) AS y ON x.a=y.a; SELECT COUNT (*) FROM ( ) AS Temp; SELECT COUNT (*) FROM Acq; SELECT x, min(y) FROM tablename GROUP BY x; SELECT * FROM tablename LIMIT 10; SELECT CASE WHEN a > b THEN 1 ELSE 2 END As Colname, FROM tablename;
Get information on a table:
SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'tablename';
Change a table with ALTER:
ALTER TABLE tablename ADD COLUMN colname real; ALTER TABLE tablename RENAME COLUMN product_no TO product_number;
Find out how a query will be executed with EXPLAIN (a Postgre command):
EXPLAIN ANALYZE SELECT * FROM x;
Python Functions
Installing python in a database - if Python is not already installed
Source: https://www.postgresql.org/docs/9.4/static/plpython-funcs.html
To get into database via terminal:
1) ssh researcher@128.42.44.181 2) cd \folder_name 3) psql database_name
Creating Functions:
CREATE FUNCTION pymax (a integer, b integer) RETURNS integer AS $$ if a > b: return a return b $$ LANGUAGE plpythonu;