Using R in PostgreSQL

From edegan.com
Revision as of 13:25, 2 March 2018 by JeeminS (talk | contribs)
Jump to navigation Jump to search

Perhaps a Simpler Alternative

Aggregate functions in PostgreSQL:
https://www.postgresql.org/docs/9.0/static/functions-aggregate.html

Joe Conway's Documentation on pl/r

http://joeconway.com/doc/plr-US.pdf 
http://www.joeconway.com/plr/doc/plr-funcs.html
http://www.joeconway.com/plr/doc/plr-window-funcs.html

Sample Function Code in PostgreSQL

Creating the Sample Function

Inside a database (ex) tigertest):

CREATE OR REPLACE FUNCTION r_max (integer, integer) RETURNS integer AS ’
    if (arg1 > arg2)
        return(arg1)
    else
        return(arg2)
’ LANGUAGE ’plr’ STRICT;

Calling the Sample Function

SELECT * FROM r_max(2,3);

returns

 r_max 
-------
     3
(1 row)

hcllayerwinv table from tigertest database

                       Table "public.hcllayerwinv"
      Column      |          Type          | Collation | Nullable | Default 
------------------+------------------------+-----------+----------+---------
 place            | character varying(100) |           |          | 
 statecode        | character varying(2)   |           |          | 
 year             | integer                |           |          | 
 layer            | integer                |           |          | 
 lcount           | numeric                |           |          | 
 nosingleton      | bigint                 |           |          | 
 nomultiton       | bigint                 |           |          | 
 nopair           | bigint                 |           |          | 
 nohull           | bigint                 |           |          | 
 totmultitoncount | numeric                |           |          | 
 avgmultitoncount | numeric                |           |          | 
 totpaircount     | numeric                |           |          | 
 avgpaircount     | numeric                |           |          | 
 totpairlength    | double precision       |           |          | 
 avgpairlength    | double precision       |           |          | 
 minpairlength    | double precision       |           |          | 
 maxpairlength    | double precision       |           |          | 
 tothullcount     | numeric                |           |          | 
 avghullcount     | numeric                |           |          | 
 tothullarea      | double precision       |           |          | 
 avghullarea      | double precision       |           |          | 
 minhullarea      | double precision       |           |          | 
 maxhullarea      | double precision       |           |          | 
 tothulldensity   | double precision       |           |          | 
 avghulldensity   | double precision       |           |          | 
 seedearlyinvf    | double precision       |           |          | 
 inflator         | numeric                |           |          | 
 seedearlyinvl16f | double precision       |           |          |

Regress by

--By: place, statecode, layer, --Regress: seedearlyinvl16f ON nosingleton, totmultitoncount, totpaircount, tothullcount, avgpairlength, avghullarea, avghulldensity

Installing PL/R on the dbase server

Check the version of the dbase by doing a

SELECT version();

This gives us 9.5.10

Then as root:

apt-get install postgresql-9.5-plr

Finally enable the extension in a dbase:

create extension plr;

See also: http://www.joeconway.com/presentations/plr-PGConfNYC2014.pdf

Instructions on Use

To use R from pgAdmin III, follow the instructions in this tutorial: choose the database, click SQL, and run "CREATE EXTENSION plr;". This allows R functions to be used in this database from then on. You should be able to run the given examples in the tutorial after this is done. This was run on databases tigertest and template1.

Another possibly useful presentation on PL/R.

How PL/R was installed on the RDP

On 2018-01-18, PL/R was installed to allow PostgreSQL users to use R functions from SQL queries. It was installed using this tutorial. The link from the tutorial for PL/R doesn't work, I used this instead.

It required R version 3.3.0 for PL/R to work with PostgreSQL 9.5. If the environmental variables R_HOME or PATH are edited so that they point to a different version of R, this might not work, and PL/R might have to be reinstalled. If the version of PostgreSQL is updated, both the R version and PL/R version will have to match it as well.