plsql.sgml

<!--
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/plsql.sgml,v 2.31 2001/05/12 22:51:35 petere Exp $
-->

<chapter id="plpgsql"> 
  <title>PL/pgSQL - <acronym>SQL</acronym> Procedural Language</title>

 <indexterm zone="plpgsql">
  <primary>PL/pgSQL</primary>
 </indexterm>

 <para>
  PL/pgSQL is a loadable procedural language for the
  <productname>Postgres</productname> database system.
 </para>
   
 <para>
  This package was originally written by Jan Wieck. This
  documentation was in part written 
  by Roberto Mello (<email>rmello@fslc.usu.edu</email>).
 </para>

  <sect1 id="plpgsql-overview">
   <title>Overview</title>

   <para>
    The design goals of PL/pgSQL were to create a loadable procedural
    language that
    <itemizedlist>
     <listitem>
      <para>
       can be used to create functions and trigger procedures,
      </para>
     </listitem>
     <listitem>
      <para>
       adds control structures to the <acronym>SQL</acronym> language,
      </para>
     </listitem>
     <listitem>
      <para>
       can perform complex computations,
      </para>
     </listitem>
     <listitem>
      <para>
       inherits all user defined types, functions and operators,
      </para>
     </listitem>
     <listitem>
      <para>
       can be defined to be trusted by the server,
      </para>
     </listitem>
     <listitem>
      <para>
       is easy to use.
      </para>
     </listitem>
    </itemizedlist>
   </para>
   <para>
    The PL/pgSQL call handler parses the function's source text and
    produces an internal binary instruction tree the first time the
    function is called. The produced bytecode is identified
    in the call handler by the object ID of the function. This ensures
    that changing a function by a DROP/CREATE sequence will take effect
    without establishing a new database connection. 
   </para>
   <para>
    For all expressions and <acronym>SQL</acronym> statements used in
    the function, the PL/pgSQL bytecode interpreter creates a
    prepared execution plan using the <acronym>SPI</acronym> manager's 
    <function>SPI_prepare()</function> and
    <function>SPI_saveplan()</function> functions. This is done the 
    first time the individual
    statement is processed in the PL/pgSQL function. Thus, a function with
    conditional code that contains many statements for which execution
    plans would be required, will only prepare and save those plans
    that are really used during the lifetime of the database
    connection.
   </para>
  <para>
  This means that you have to be careful about your user-defined
  functions. For example:

<programlisting>
CREATE FUNCTION populate() RETURNS INTEGER AS '
DECLARE
    -- Declarations
BEGIN
    PERFORM my_function();
END;
' LANGUAGE 'plpgsql';
</programlisting>
    If you create the above function, it will reference the OID for
    <function>my_function()</function> in its bytecode. Later, if you
    drop and re-create <function>my_function()</function>, then
    <function>populate()</function> will not be able to find
    <function>my_function()</function> anymore. You would then have to
    re-create <function>populate()</function>.
   </para>

   <para>
    Because PL/pgSQL saves execution plans in this way, queries that appear
    directly in a PL/pgSQL function must refer to the same tables and fields
    on every execution; that is, you cannot use a parameter as the name of
    a table or field in a query.  To get around
    this restriction, you can construct dynamic queries using the PL/pgSQL
    EXECUTE statement --- at the price of constructing a new query plan
    on every execution.
   </para>
   <para>
    Except for input/output conversion and calculation functions
    for user defined types, anything that can be defined in C language
    functions can also be done with PL/pgSQL. It is possible to
    create complex conditional computation functions and later use
    them to define operators or use them in functional indices.
   </para>
  <sect2 id="plpgsql-advantages">
   <title>Advantages of Using PL/pgSQL</title>

   <itemizedlist>
    <listitem>
     <para>
      Better performance (see <xref linkend="plpgsql-advantages-performance">)
     </para>
    </listitem>

    <listitem>
     <para>
      SQL support (see <xref linkend="plpgsql-advantages-sqlsupport">)
     </para>
    </listitem> 

    <listitem>
     <para>
      Portability (see <xref linkend="plpgsql-advantages-portability">)
     </para>
    </listitem>
   </itemizedlist>

   <sect3 id="plpgsql-advantages-performance">
    <title>Better Performance</title>

    <para>
     <acronym>SQL</acronym> is the language PostgreSQL (and
     most other Relational Databases) use as query
     language. It's portable and easy to learn. But every
     <acronym>SQL</acronym> statement must be executed
     individually by the database server.
    </para>

    <para>
     That means that your client application must send each
     query to the database server, wait for it to process it,
     receive the results, do some computation, then send
     other queries to the server. All this incurs inter
     process communication and may also incur network
     overhead if your client is on a different machine than
     the database server.
    </para>

    <para>
     With PL/pgSQL you can group a block of computation and a
     series of queries <emphasis>inside</emphasis> the
     database server, thus having the power of a procedural
     language and the ease of use of SQL, but saving lots of
     time because you don't have the whole client/server
     communication overhead. Your application will enjoy a
     considerable performance increase by using PL/pgSQL. 
    </para>
   </sect3>

   <sect3 id="plpgsql-advantages-sqlsupport">
    <title>SQL Support</title>

    <para>
     PL/pgSQL adds the power of a procedural language to the
     flexibility and ease of <acronym>SQL</acronym>. With
     PL/pgSQL you can use all the datatypes, columns, operators
     and functions of SQL.    
    </para>
   </sect3>

   <sect3 id="plpgsql-advantages-portability">
    <title>Portability</title>

    <para>
     Because PL/pgSQL functions run inside PostgreSQL, these
     functions will run on any platform where PostgreSQL
     runs. Thus you can reuse code and have less development costs.
    </para>
   </sect3>
  </sect2>

  <sect2 id="plpgsql-overview-developing-in-plpgsql">
   <title>Developing in PL/pgSQL</title>

   <para>
    Developing in PL/pgSQL is pretty straight forward, especially
    if you have developed in other database procedural languages,
    such as Oracle's PL/SQL. Two good ways of developing in
    PL/pgSQL are:

    <itemizedlist>
     <listitem>
      <para>
       Using a text editor and reloading the file with <command>psql</command>
      </para>
     </listitem>

     <listitem>
      <para>
       Using PostgreSQL's GUI Tool: pgaccess
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <para>
    One good way to develop in PL/pgSQL is to simply use the text
    editor of your choice to create your functions, and in another
    console, use <command>psql</command> (PostgreSQL's interactive monitor) to load
    those functions. If you are doing it this way (and if you are
    a PL/pgSQL novice or in debugging stage), it is a good idea to 
    always <command>DROP</command> your function before creating it. That way
    when you reload the file, it'll drop your functions and then
    re-create them. For example:
<programlisting>
drop function testfunc(integer);
create function testfunc(integer) return integer as '
    ....
end;
' language 'plpgsql';
</programlisting>
   </para>

   <para>
    When you load the file for the first time,
    <productname>PostgreSQL</> will raise a warning saying this
    function doesn't exist and go on to create it. To load an SQL
    file (filename.sql) into a database named "dbname", use the command:
<programlisting>
psql -f filename.sql dbname
</programlisting>
   </para>

   <para>
    Another good way to develop in PL/pgSQL is using
    <productname>PostgreSQL</>'s GUI tool: pgaccess. It does some
    nice things for you, like escaping single-quotes, and making
    it easy to recreate and debug functions.
   </para>
  </sect2>
 </sect1>

  <!-- **** PL/pgSQL Description **** -->

  <sect1 id="plpgsql-description">
   <title>Description</title>

   <!-- **** PL/pgSQL structure **** -->

   <sect2>
    <title>Structure of PL/pgSQL</title>

    <para>
     PL/pgSQL is a <emphasis>block structured</emphasis> language. All
     keywords and identifiers can be used in mixed upper and
     lower-case. A block is defined as:

<synopsis>
<optional>&lt;&lt;label&gt;&gt;</optional>
<optional>DECLARE
    <replaceable>declarations</replaceable></optional>
BEGIN
    <replaceable>statements</replaceable>
END;
</synopsis>
    </para>

    <para>
     There can be any number of sub-blocks in the statement section
     of a block. Sub-blocks can be used to hide variables from outside a
     block of statements.
    </para>

    <para>
     The variables declared in the declarations section preceding a
     block are initialized to their default values every time the
     block is entered, not only once per function call. For example:
<programlisting>
CREATE FUNCTION somefunc() RETURNS INTEGER AS '
DECLARE
   quantity INTEGER := 30;
BEGIN
   RAISE NOTICE ''Quantity here is %'',quantity;  -- Quantity here is 30
   quantity := 50;
   --
   -- Create a sub-block
   --
   DECLARE
      quantity INTEGER := 80;
   BEGIN
      RAISE NOTICE ''Quantity here is %'',quantity;  -- Quantity here is 80
   END;

   RAISE NOTICE ''Quantity here is %'',quantity;  -- Quantity here is 50
END;
' LANGUAGE 'plpgsql';
</programlisting>
    </para>

    <para>
     It is important not to confuse the use of BEGIN/END for
     grouping statements in PL/pgSQL with the database commands for
     transaction control.  PL/pgSQL's BEGIN/END are only for grouping;
     they do not start or end a transaction.  Functions and trigger procedures
     are always executed within a transaction established by an outer query
     --- they cannot start or commit transactions, since
     <productname>Postgres</productname> does not have nested transactions.
    </para>
   </sect2>

   <sect2>
    <title>Comments</title>

    <para>
     There are two types of comments in PL/pgSQL. A double dash <literal>--</literal>
     starts a comment that extends to the end of the line. A <literal>/*</literal>
     starts a block comment that extends to the next occurrence of <literal>*/</literal>.
     Block comments cannot be nested, but double dash comments can be
     enclosed into a block comment and a double dash can hide
     the block comment delimiters <literal>/*</literal> and <literal>*/</literal>.
    </para>
   </sect2>

   <!-- **** PL/pgSQL Variables and Constants **** -->
   <sect2>
    <title>Variables and Constants</title>

    <para>
     All variables, rows and records used in a block or its
     sub-blocks must be declared in the declarations section of a block.
     The exception being the loop variable of a FOR loop iterating over a range
     of integer values. 
    </para>

    <para>
     PL/pgSQL variables can have any SQL datatype, such as
     <type>INTEGER</type>, <type>VARCHAR</type> and
     <type>CHAR</type>. All variables have as default value the
     <acronym>SQL</acronym> NULL value. 
    </para>

    <para>
     Here are some examples of variable declarations:
<programlisting>
user_id INTEGER;
quantity NUMBER(5);
url VARCHAR;
</programlisting>
    </para>

    <sect3 id="plpgsql-description-default-vars">
     <title>Constants and Variables With Default Values</title>

     <para>
      The declarations have the following syntax:
<synopsis>
<replaceable>name</replaceable> <optional> CONSTANT </optional> <replaceable>type</replaceable> <optional> NOT NULL </optional> <optional> { DEFAULT | := } <replaceable>value</replaceable> </optional>;
</synopsis>
     </para>

     <para>
      The value of variables declared as CONSTANT cannot be changed. If NOT NULL
      is specified, an assignment of a NULL value results in a runtime
      error. Since the default value of all variables is the
      <acronym>SQL</acronym> NULL value, all variables declared as NOT NULL
      must also have a default value specified.
     </para>

     <para>
      The default value is evaluated every time the function is called. So
      assigning '<literal>now</literal>' to a variable of type
      <type>timestamp</type> causes the variable to have the
      time of the actual function call, not when the function was
      precompiled into its bytecode.
     </para>

     <para>
      Examples:
<programlisting>
quantity INTEGER := 32;
url varchar := ''http://mysite.com'';
user_id CONSTANT INTEGER := 10;
</programlisting>
     </para>
    </sect3>

    <sect3 id="plpgsql-description-passed-vars">
     <title>Parameters Passed to Functions</title>

     <para>
      Parameters passed to functions are named with the identifiers
      <literal>$1</literal>, <literal>$2</literal>,
      etc.  Optionally, aliases can be declared for the <literal>$n</literal>
      parameter names for increased readability.  Some examples:
<programlisting>
CREATE FUNCTION sales_tax(REAL) RETURNS REAL AS '
DECLARE
    subtotal ALIAS FOR $1;
BEGIN
    return subtotal * 0.06;
END;
' LANGUAGE 'plpgsql';


CREATE FUNCTION instr(VARCHAR,INTEGER) RETURNS INTEGER AS '
DECLARE
    v_string ALIAS FOR $1;
    index ALIAS FOR $2;
BEGIN
    -- Some computations here
END;
' LANGUAGE 'plpgsql';
</programlisting>
     </para>
    </sect3>

    <sect3 id="plpgsql-description-attributes">
     <title>Attributes</title>

     <para>
      Using the <type>%TYPE</type> and <type>%ROWTYPE</type>
      attributes, you can declare variables with the same
      datatype or structure of another database item (e.g: a
      table field).
     </para>

     <variablelist>
      <varlistentry>
       <term>
        <replaceable>variable</replaceable>%TYPE
       </term>
       <listitem>
        <para>
         <type>%TYPE</type> provides the datatype of a
         variable or database column. You can use this to
         declare variables that will hold database
         values. For example, let's say you have a column
         named <type>user_id</type> in your
         <type>users</type> table. To declare a variable with
         the same datatype as users.user_id you write:
<programlisting>
user_id   users.user_id%TYPE;
</programlisting>
        </para>

        <para>
         By using <type>%TYPE</type> you don't need to know
         the datatype of the structure you are referencing,
         and most important, if the datatype of the
         referenced item changes in the future (e.g: you
         change your table definition of user_id to become a
         REAL), you won't need to change your function
         definition.
        </para>
       </listitem>
      </varlistentry>

      <varlistentry>
       <term>
        <replaceable>table</replaceable>%ROWTYPE
       </term>
       <listitem>
        <para>
	 <type>%ROWTYPE</type> provides the composite datatype corresponding
	 to a whole row of the specified table.
	 <replaceable>table</replaceable> must be an existing
	 table or view name of the database. The fields of the row are
	 accessed in the dot notation. Parameters to a function can be
	 composite types (complete table rows). In that case, the
	 corresponding identifier $n will be a rowtype, and fields can
	 be selected from it, for example <literal>$1.user_id</literal>.
        </para>

        <para>
         Only the user-defined attributes of a table row are accessible in a
	 rowtype variable, not OID or other system attributes (because the
	 row could be from a view).  The fields of the rowtype inherit the
	 table's field sizes or precision for <type>char()</type>
	 etc. data types.
        </para>
<programlisting>
DECLARE
    users_rec users%ROWTYPE;
    user_id users.user_id%TYPE;
BEGIN
    user_id := users_rec.user_id;
    ...

create function cs_refresh_one_mv(integer) returns integer as '
   DECLARE
        key ALIAS FOR $1;
        table_data cs_materialized_views%ROWTYPE;
   BEGIN
        SELECT INTO table_data * FROM cs_materialized_views
               WHERE sort_key=key;

        IF NOT FOUND THEN
           RAISE EXCEPTION ''View '' || key || '' not found'';
           RETURN 0;
        END IF;

        -- The mv_name column of cs_materialized_views stores view
        -- names.
 
        TRUNCATE TABLE table_data.mv_name;
        INSERT INTO table_data.mv_name || '' '' || table_data.mv_query;

        return 1;
end;
' LANGUAGE 'plpgsql';
</programlisting>
       </listitem>
      </varlistentry>
     </variablelist>
    </sect3>

    <sect3 id="plpgsql-description-remaning-vars">
     <title>
      RENAME
     </title>

     <para>
      Using RENAME you can change the name of a variable, record
      or row. This is useful if NEW or OLD should be referenced
      by another name inside a trigger procedure.
     </para>

     <para>
      Syntax and examples:
<programlisting>
RENAME <replaceable>oldname</replaceable> TO <replaceable>newname</replaceable>;

RENAME id TO user_id;
RENAME this_var TO that_var;
</programlisting>
     </para>
    </sect3>
   </sect2>

   <!-- **** PL/pgSQL expressions **** -->

   <sect2>
    <title>Expressions</title>

    <para>
     All expressions used in PL/pgSQL statements are processed using
     the backend's executor. Expressions that appear to contain
     constants may in fact require run-time evaluation
     (e.g. <literal>'now'</literal>  for the 
     <type>timestamp</type> type) so
     it is impossible for the PL/pgSQL parser
     to identify real constant values other than the NULL keyword. All
     expressions are evaluated internally by executing a query
<synopsis>
SELECT <replaceable>expression</replaceable>
</synopsis>
     using the <acronym>SPI</acronym> manager. In the expression, occurrences of variable
     identifiers are substituted by parameters and the actual values from
     the variables are passed to the executor in the parameter array. All
     expressions used in a PL/pgSQL function are only prepared and
     saved once.  The only exception to this rule is an EXECUTE statement
     if parsing of a query is needed each time it is encountered.
    </para>

    <para>
     The type checking done by the <productname>Postgres</productname>
     main parser has some side
     effects to the interpretation of constant values. In detail there
     is a difference between what these two functions do:

<programlisting>
CREATE FUNCTION logfunc1 (text) RETURNS timestamp AS '
    DECLARE
        logtxt ALIAS FOR $1;
    BEGIN
        INSERT INTO logtable VALUES (logtxt, ''now'');
        RETURN ''now'';
    END;
' LANGUAGE 'plpgsql';
</programlisting>

     and

<programlisting>
CREATE FUNCTION logfunc2 (text) RETURNS timestamp AS '
    DECLARE
        logtxt ALIAS FOR $1;
        curtime timestamp;
    BEGIN
        curtime := ''now'';
        INSERT INTO logtable VALUES (logtxt, curtime);
        RETURN curtime;
    END;
' LANGUAGE 'plpgsql';
</programlisting>

     In the case of <function>logfunc1()</function>, the 
     <productname>Postgres</productname> main parser knows when 
     preparing the plan for the INSERT, that the string 
     <literal>'now'</literal> should be interpreted as 
     <type>timestamp</type> because the target field of logtable
     is of that type. Thus, it will make a constant from it at this
     time and this constant value is then used in all invocations of 
     <function>logfunc1()</function> during the lifetime of the
     backend. Needless to say that this isn't what the
     programmer wanted.
    </para>

    <para>
     In the case of <function>logfunc2()</function>, the 
     <productname>Postgres</productname> main parser does not know
     what type <literal>'now'</literal> should become and therefore 
     it returns a data type of <type>text</type> containing the string 
     <literal>'now'</literal>. During the assignment
     to the local variable curtime, the PL/pgSQL interpreter casts this
     string to the timestamp type by calling the
     <function>text_out()</function> and <function>timestamp_in()</function>
     functions for the conversion.
    </para>

    <para>
     This type checking done by the <productname>Postgres</productname> main
     parser got implemented after PL/pgSQL was nearly done.
     It is a difference between 6.3 and 6.4 and affects all functions
     using the prepared plan feature of the <acronym>SPI</acronym> manager.
     Using a local
     variable in the above manner is currently the only way in PL/pgSQL to get
     those values interpreted correctly.
    </para>

    <para>
     If record fields are used in expressions or statements, the data types of
     fields should not change between calls of one and the same expression.
     Keep this in mind when writing trigger procedures that handle events
     for more than one table.
    </para>
   </sect2>

   <!-- **** PL/pgSQL statements **** -->

  <sect2>
   <title>Statements</title>

   <para>
    Anything not understood by the PL/pgSQL parser as specified below
    will be put into a query and sent down to the database engine
    to execute. The resulting query should not return any data.
   </para>
   
   <sect3 id="plpgsql-statements-assignment">
    <title>Assignment</title>
    <para>
     An assignment of a value to a variable or row/record field is
     written as:
<synopsis>
<replaceable>identifier</replaceable> := <replaceable>expression</replaceable>;
</synopsis>

     If the expressions result data type doesn't match the variables
     data type, or the variable has a size/precision that is known
     (as for <type>char(20)</type>), the result value will be implicitly casted by
     the PL/pgSQL bytecode interpreter using the result types output- and
     the variables type input-functions. Note that this could potentially
     result in runtime errors generated by the types input functions.
    </para>

<programlisting>
user_id := 20;
tax := subtotal * 0.06;
</programlisting>
   </sect3>

   <sect3 id="plpgsql-statements-calling-other-funcs">
    <title>Calling another function</title>

    <para>
     All functions defined in a <productname>Postgres</productname>
     database return a value. Thus, the normal way to call a function
     is to execute a SELECT query or doing an assignment (resulting
     in a PL/pgSQL internal SELECT). 
    </para>

    <para>
     But there are cases where someone is not interested in the 
     function's result. In these cases, use the PERFORM
     statement.
<synopsis>
PERFORM <replaceable>query</replaceable>
</synopsis>
     This executes a <literal>SELECT <replaceable>query</replaceable></literal> over the
     <acronym>SPI manager</acronym> and discards the result. Identifiers like local
     variables are still substituted into parameters.
    </para>
<programlisting>
PERFORM create_mv(''cs_session_page_requests_mv'',''
     select   session_id, page_id, count(*) as n_hits,
              sum(dwell_time) as dwell_time, count(dwell_time) as dwell_count
     from     cs_fact_table
     group by session_id, page_id '');
</programlisting>
   </sect3>
   
   <sect3 id="plpgsql-statements-executing-dyn-queries">
    <title>Executing dynamic queries</title>
    
    <para>
     Often times you will want to generate dynamic queries inside
     your PL/pgSQL functions. Or you have functions that will
     generate other functions. PL/pgSQL provides the EXECUTE
     statement for these occasions.
    </para>

    <para>
<synopsis>
EXECUTE <replaceable class="command">query-string</replaceable>
</synopsis>
     where <replaceable>query-string</replaceable> is a string of type
     <type>text</type> containing the <replaceable>query</replaceable>
     to be executed.
    </para>

    <para>
    When working with dynamic queries you will have to face
    escaping of single quotes in PL/pgSQL. Please refer to the
    table available at the "Porting from Oracle PL/SQL" chapter
    for a detailed explanation that will save you some effort.
    </para>
     
    <para>
     Unlike all other queries in PL/pgSQL, a
     <replaceable>query</replaceable> run by an EXECUTE statement is
     not prepared and saved just once during the life of the server.
     Instead, the <replaceable>query</replaceable> is prepared each
     time the statement is run. The
     <replaceable>query-string</replaceable> can be dynamically
     created within the procedure to perform actions on variable
     tables and fields.
    </para>
  
    <para>
     The results from SELECT queries are discarded by EXECUTE, and
     SELECT INTO is not currently supported within EXECUTE.  So, the
     only way to extract a result from a dynamically-created SELECT is
     to use the FOR ... EXECUTE form described later.
    </para>

    <para>
     An example:
<synopsis>
EXECUTE ''UPDATE tbl SET ''
        || quote_ident(fieldname)
        || '' = ''
        || quote_literal(newvalue)
        || '' WHERE ...'';
</synopsis>
    </para>

    <para>
     This example shows use of the functions
     <function>quote_ident</function>(<type>TEXT</type>) and
     <function>quote_literal</function>(<type>TEXT</type>).
     Variables containing field and table identifiers should be
     passed to function <function>quote_ident()</function>.
     Variables containing literal elements of the dynamic query
     string should be passed to
     <function>quote_literal()</function>.  Both take the
     appropriate steps to return the input text enclosed in single
     or double quotes and with any embedded special characters.
    </para>

    <para>
     Here is a much larger example of a dynamic query and EXECUTE:
<programlisting>
CREATE FUNCTION cs_update_referrer_type_proc() RETURNS INTEGER AS '
DECLARE
    referrer_keys RECORD;  -- Declare a generic record to be used in a FOR
    a_output varchar(4000);
BEGIN 
    a_output := ''CREATE FUNCTION cs_find_referrer_type(varchar,varchar,varchar) 
                  RETURNS varchar AS '''' 
                     DECLARE 
                         v_host ALIAS FOR $1; 
                         v_domain ALIAS FOR $2; 
                         v_url ALIAS FOR $3; ''; 

    -- 
    -- Notice how we scan through the results of a query in a FOR loop
    -- using the FOR &lt;record&gt; construct.
    --

    FOR referrer_keys IN select * from cs_referrer_keys order by try_order LOOP
        a_output := a_output || '' if v_'' || referrer_keys.kind || '' like '''''''''' 
                 || referrer_keys.key_string || '''''''''' then return '''''' 
                 || referrer_keys.referrer_type || ''''''; end if;''; 
    END LOOP; 
  
    a_output := a_output || '' return null; end; '''' language ''''plpgsql'''';''; 
 
    -- This works because we are not substituting any variables
    -- Otherwise it would fail. Look at PERFORM for another way to run functions
    
    EXECUTE a_output; 
end; 
' LANGUAGE 'plpgsql';
</programlisting>
    </para>
   </sect3>

   <sect3 id="plpgsql-statements-diagnostics">
    <title>Obtaining other results status</title>

    <para>
<synopsis>
GET DIAGNOSTICS <replaceable>variable</replaceable> = <replaceable>item</replaceable> <optional> , ... </optional>
</synopsis>

     This command allows retrieval of system status indicators.  Each
     <replaceable>item</replaceable> is a keyword identifying a state
     value to be assigned to the specified variable (which should be
     of the right datatype to receive it).  The currently available
     status items are <varname>ROW_COUNT</>, the number of rows
     processed by the last <acronym>SQL</acronym> query sent down to
     the <acronym>SQL</acronym> engine; and <varname>RESULT_OID</>,
     the Oid of the last row inserted by the most recent
     <acronym>SQL</acronym> query.  Note that <varname>RESULT_OID</>
     is only useful after an INSERT query.
    </para>
   </sect3>
   
   <sect3 id="plpgsql-statements-returning">
    <title>Returning from a function</title>

    <para>
<synopsis>
RETURN <replaceable>expression</replaceable>
</synopsis>
     The function terminates and the value of
     <replaceable>expression</replaceable> will be returned to the
     upper executor. The return value of a function cannot be
     undefined. If control reaches the end of the top-level block of
     the function without hitting a RETURN statement, a runtime error
     will occur.
    </para>

    <para>
     The expressions result will be automatically casted into the
     function's return type as described for assignments.
    </para>
   </sect3>
  </sect2>

   <!-- **** PL/pgSQL Control Structures **** -->

  <sect2 id="plpgsql-description-control-structures">

   <title>Control Structures</title>
   <para>
    Control structures are probably the most useful (and
    important) part of PL/SQL. With PL/pgSQL's control structures,
    you can manipulate <productname>PostgreSQL</> data in a very
    flexible and powerful way. 
   </para>
    
   <sect3 id="plpgsql-description-conditionals">
    <title>Conditional Control: IF statements</title>

    <para>
     <function>IF</function> statements let you take action
     according to certain conditions. PL/pgSQL has three forms of
     IF: IF-THEN, IF-THEN-ELSE, IF-THEN-ELSE IF. NOTE: All
     PL/pgSQL IF statements need a corresponding <function>END
     IF</function> statement. In ELSE-IF statements you need two:
     one for the first IF and one for the second (ELSE IF).
    </para>

    <variablelist>
     <varlistentry>
      <term>
       IF-THEN
      </term>

      <listitem>
       <para>
        IF-THEN statements is the simplest form of an IF. The
        statements between THEN and END IF will be executed if
        the condition is true. Otherwise, the statements
        following END IF will be executed.
<programlisting>
IF v_user_id &lt;&gt; 0 THEN
    UPDATE users SET email = v_email WHERE user_id = v_user_id;
END IF;
</programlisting>
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term>
       IF-THEN-ELSE
      </term>

      <listitem>
       <para>
        IF-THEN-ELSE statements adds to IF-THEN by letting you
        specify the statements that should be executed if the
        condition evaluates to FALSE.
<programlisting>
IF parentid IS NULL or parentid = ''''
THEN 
    return fullname;
ELSE
    return hp_true_filename(parentid) || ''/'' || fullname;
END IF;


IF v_count > 0 THEN 
    INSERT INTO users_count(count) VALUES(v_count);
    return ''t'';
ELSE 
    return ''f'';
END IF;
</programlisting>
       </para>

       <para>
        IF statements can be nested and in the following
        example:
<programlisting>
IF demo_row.sex = ''m'' THEN
  pretty_sex := ''man'';
ELSE
  IF demo_row.sex = ''f'' THEN
    pretty_sex := ''woman'';
  END IF;
END IF;
</programlisting>
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term>
       IF-THEN-ELSE IF
      </term>

      <listitem>
       <para>
        When you use the "ELSE IF" statement, you are actually
        nesting an IF statement inside the ELSE
        statement. Thus you need one END IF statement for each
        nested IF and one for the parent IF-ELSE.
       </para>

       <para>
        For example:
<programlisting>
IF demo_row.sex = ''m'' THEN
   pretty_sex := ''man'';
ELSE IF demo_row.sex = ''f'' THEN
        pretty_sex := ''woman'';
     END IF;
END IF;
</programlisting>
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
   </sect3>

   <sect3 id="plpgsql-description-control-structures-loops">
    <title>Iterative Control: LOOP, WHILE, FOR and EXIT</title>

    <para>
     With the LOOP, WHILE, FOR and EXIT statements, you can
     control the flow of execution of your PL/pgSQL program
     iteratively.
    </para>

    <variablelist>
     <varlistentry>
      <term>
       LOOP
      </term>

      <listitem>
       <para>
<synopsis>
<optional>&lt;&lt;label&gt;&gt;</optional>
LOOP
    <replaceable>statements</replaceable>
END LOOP;
</synopsis>
        An unconditional loop that must be terminated explicitly
        by an EXIT statement. The optional label can be used by
        EXIT statements of nested loops to specify which level of
        nesting should be terminated.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term>
       EXIT
      </term>

      <listitem>
       <para>
<synopsis>
EXIT <optional> <replaceable>label</replaceable> </optional> <optional> WHEN <replaceable>expression</replaceable> </optional>;
</synopsis>
        If no <replaceable>label</replaceable> is given,
        the innermost loop is terminated and the
        statement following END LOOP is executed next.
        If <replaceable>label</replaceable> is given, it
        must be the label of the current or an upper level of nested loop
        blocks. Then the named loop or block is terminated and control
        continues with the statement after the loops/blocks corresponding
        END.
       </para>

       <para>
        Examples:
<programlisting>
LOOP
    -- some computations
    IF count > 0 THEN
        EXIT;  -- exit loop
    END IF;
END LOOP;

LOOP
    -- some computations
    EXIT WHEN count > 0;
END LOOP;
BEGIN
    -- some computations
    IF stocks > 100000 THEN
        EXIT;  -- illegal. Can't use EXIT outside of a LOOP
    END IF;
END;
</programlisting>
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term>
       WHILE
      </term>

      <listitem>
       <para>
        With the WHILE statement, you can loop through a
        sequence of statements as long as the evaluation of
        the condition expression is true.
<synopsis>
<optional>&lt;&lt;label&gt;&gt;</optional>
WHILE <replaceable>expression</replaceable> LOOP
    <replaceable>statements</replaceable>
END LOOP;
</synopsis>
        For example:
<programlisting>
WHILE amount_owed > 0 AND gift_certificate_balance > 0 LOOP
    -- some computations here
END LOOP;

WHILE NOT boolean_expression LOOP
    -- some computations here
END LOOP;
</programlisting>
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term>
       FOR
      </term>

      <listitem>
       <para>
<synopsis>
<optional>&lt;&lt;label&gt;&gt;</optional>
FOR <replaceable>name</replaceable> IN <optional> REVERSE </optional> <replaceable>expression</replaceable> .. <replaceable>expression</replaceable> LOOP
    <replaceable>statements</replaceable>
END LOOP;
</synopsis>
        A loop that iterates over a range of integer values. The variable
        <replaceable>name</replaceable> is automatically created as type
        integer and exists only inside the loop. The two expressions giving
        the lower and upper bound of the range are evaluated only when entering
        the loop. The iteration step is always 1.
       </para>

       <para>
        Some examples of FOR loops (see <xref
        linkend="plpgsql-description-records"> for iterating over
        records in FOR loops):
<programlisting>
FOR i IN 1..10 LOOP
  -- some expressions here

    RAISE NOTICE 'i is %',i;
END LOOP;

FOR i IN REVERSE 1..10 LOOP
    -- some expressions here
END LOOP;
</programlisting>
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
      </sect3>
   </sect2>

   <!-- **** PL/pgSQL records **** -->

  <sect2 id="plpgsql-description-records">
   <title>Working with RECORDs</title>

   <para>
    Records are similar to rowtypes, but they have no predefined structure.
    They are used in selections and FOR loops to hold one actual
    database row from a SELECT operation.
   </para>

   <sect3 id="plpgsql-description-records-declaration">
    <title>Declaration</title>

    <para>
     One variables of type RECORD can be used for different
     selections. Accessing a record or an attempt to assign
     a value to a record field when there is no actual row in it results
     in a runtime error. They can be declared like this:
    </para>

    <para>
<synopsis>
<replaceable>name</replaceable> RECORD;
</synopsis>
    </para>
   </sect3>

   <sect3 id="plpgsql-description-records-assignment">
    <title>Assignments</title>

    <para>
     An assignment of a complete selection into a record or row can
     be done by:
<synopsis>
SELECT  INTO <replaceable>target</replaceable> <replaceable>expressions</replaceable> FROM ...;
</synopsis>
     <replaceable>target</replaceable> can be a record, a row variable
     or a comma separated list of variables and
     record-/row-fields. Note that this is quite different from
     Postgres' normal interpretation of SELECT INTO, which is that the
     INTO target is a newly created table.  (If you want to create a
     table from a SELECT result inside a PL/pgSQL function, use the
     equivalent syntax <command>CREATE TABLE AS SELECT</command>.)
    </para>

    <para>
     If a row or a variable list is used as target, the selected values
     must exactly match the structure of the target(s) or a runtime error
     occurs. The FROM keyword can be followed by any valid qualification,
     grouping, sorting etc. that can be given for a SELECT statement.
    </para>

    <para>
     Once a record or row has been assigned to a RECORD variable,
     you can use the "." (dot) notation to access fields in that
     record:
<programlisting>
DECLARE
    users_rec RECORD;
    full_name varchar;
BEGIN
    SELECT INTO users_rec * FROM users WHERE user_id=3;

  full_name := users_rec.first_name || '' '' || users_rec.last_name;
</programlisting>
    </para>

    <para>
     There is a special variable named FOUND of type
     <type>boolean</type> that can be used immediately after a SELECT
     INTO to check if an assignment had success.
  
<programlisting>
SELECT INTO myrec * FROM EMP WHERE empname = myname;
IF NOT FOUND THEN
    RAISE EXCEPTION ''employee % not found'', myname;
END IF;
</programlisting>

     You can also use the IS NULL (or ISNULL) conditionals to
     test for NULLity of a RECORD/ROW. If the selection returns
     multiple rows, only the first is moved into the target
     fields. All others are silently discarded.
    </para>

    <para>
<programlisting>
DECLARE
    users_rec RECORD;
    full_name varchar;
BEGIN
    SELECT INTO users_rec * FROM users WHERE user_id=3;

    IF users_rec.homepage IS NULL THEN
        -- user entered no homepage, return "http://"

        return ''http://'';
    END IF;
END;
</programlisting>
    </para>
   </sect3>

   <sect3 id="plpgsql-description-records-iterating">
    <title>Iterating Through Records</title>

    <para>
     Using a special type of FOR loop, you can iterate through
     the results of a query and manipulate that data
     accordingly. The syntax is as follow:
<synopsis>
<optional>&lt;&lt;label&gt;&gt;</optional>
FOR <replaceable>record | row</replaceable> IN <replaceable>select_clause</replaceable> LOOP
    <replaceable>statements</replaceable>
END LOOP;
</synopsis>
     The record or row is assigned all the rows 
     resulting from the select clause and the loop body executed 
     for each. Here is an example:
    </para>

    <para>
<programlisting>
create function cs_refresh_mviews () returns integer as '
DECLARE
     mviews RECORD;

     -- Instead, if you did:
     -- mviews  cs_materialized_views%ROWTYPE;
     -- this record would ONLY be usable for the cs_materialized_views table

BEGIN
     PERFORM cs_log(''Refreshing materialized views...'');

     FOR mviews IN SELECT * FROM cs_materialized_views ORDER BY sort_key LOOP

         -- Now "mviews" has one record from cs_materialized_views

         PERFORM cs_log(''Refreshing materialized view '' || mview.mv_name || ''...'');
         TRUNCATE TABLE mview.mv_name;
         INSERT INTO mview.mv_name || '' '' || mview.mv_query;
     END LOOP;

     PERFORM cs_log(''Done refreshing materialized views.'');
     return 1;
end;
' language 'plpgsql';
</programlisting>

     If the loop is terminated with an EXIT statement, the last
     assigned row is still accessible after the loop.
    </para>

    <para>
     The FOR-IN EXECUTE statement is another way to iterate over
     records:
<synopsis>
<optional>&lt;&lt;label&gt;&gt;</optional>
FOR <replaceable>record | row</replaceable> IN EXECUTE <replaceable>text_expression</replaceable> LOOP 
    <replaceable>statements</replaceable>
END LOOP;
</synopsis>
     This is like the previous form, except that the source SELECT
     statement is specified as a string expression, which is evaluated
     and re-planned on each entry to the FOR loop.  This allows the
     programmer to choose the speed of a pre-planned query or the
     flexibility of a dynamic query, just as with a plain EXECUTE
     statement.
    </para>
   </sect3>
  </sect2>

  <sect2 id="plpgsql-description-aborting-and-messages">
   <title>Aborting and Messages</title>

   <para>
    Use the RAISE statement to throw messages into the 
    <productname>Postgres</productname> elog mechanism.

<synopsis>
RAISE <replaceable class="parameter">level</replaceable> '<replaceable class="parameter">format</replaceable>' <optional>, <replaceable class="parameter">variable</replaceable> <optional>...</optional></optional>;
</synopsis>

    Possible levels are DEBUG (write the message into the postmaster log),
    NOTICE (write the message into the postmaster log and forward it to
    the client application) and EXCEPTION (raise an error,
    aborting the transaction).
    Inside the format string, <literal>%</literal> is replaced by the next
    optional argument's external representation.
    Write <literal>%%</literal> to emit a literal <literal>%</literal>.
    Note that the optional arguments must presently
    be simple variables, not expressions, and the format must be a simple
    string literal.
   </para>

   <!--
   This example should work, but does not:
   	RAISE NOTICE ''Id number '' || key || '' not found!'';
   Put it back when we allow non-string-literal formats.
    -->

   <para>
<programlisting>
RAISE NOTICE ''Calling cs_create_job(%)'',v_job_id;
</programlisting>
    In this example, the value of v_job_id will replace the % in the
    string.
   </para>

   <para>
<programlisting>
RAISE EXCEPTION ''Inexistent ID --> %'',user_id;
</programlisting>
    This will abort the transaction with the given error message.
   </para>
  </sect2>   

   <!-- **** PL/pgSQL exceptions **** -->

   <sect2>
    <title>Exceptions</title>

    <para>
     <productname>Postgres</productname> does not have a very smart
     exception handling model. Whenever the parser, planner/optimizer
     or executor decide that a statement cannot be processed any longer,
     the whole transaction gets aborted and the system jumps back
     into the main loop to get the next query from the client application.
    </para>

    <para>
     It is possible to hook into the error mechanism to notice that this
     happens. But currently it is impossible to tell what really
     caused the abort (input/output conversion error, floating point
     error, parse error). And it is possible that the database backend
     is in an inconsistent state at this point so returning to the upper
     executor or issuing more commands might corrupt the whole database.
     And even if, at this point the information, that the transaction
     is aborted, is already sent to the client application, so resuming
     operation does not make any sense.
    </para>

    <para>
     Thus, the only thing PL/pgSQL currently does when it encounters
     an abort during execution of a function or trigger
     procedure is to write some additional DEBUG level log messages
     telling in which function and where (line number and type of
     statement) this happened.
    </para>
   </sect2>
 </sect1>


   <!-- **** PL/pgSQL trigger procedures **** -->

 <sect1 id="plpgsql-trigger">
  <title>Trigger Procedures</title>

  <para>
   PL/pgSQL can be used to define trigger procedures. They are created
   with the usual <command>CREATE FUNCTION</command> command as a function with no
   arguments and a return type of <type>OPAQUE</type>.
  </para>

  <para>
   There are some <productname>Postgres</productname> specific details
   in functions used as trigger procedures.
  </para>

  <para>
   First they have some special variables created automatically in the 
   top-level blocks declaration section. They are
   
   <variablelist>
    <varlistentry>
     <term><varname>NEW</varname></term>
     <listitem>
      <para>
       Data type <type>RECORD</type>; variable holding the new database row on INSERT/UPDATE
       operations on ROW level triggers.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>OLD</varname></term>
     <listitem>
      <para>
       Data type <type>RECORD</type>; variable holding the old database row on UPDATE/DELETE
       operations on ROW level triggers.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_NAME</varname></term>
     <listitem>
      <para>
       Data type <type>name</type>; variable that contains the name of the trigger actually
       fired.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_WHEN</varname></term>
     <listitem>
      <para>
       Data type <type>text</type>; a string of either 
              <literal>BEFORE</literal> or <literal>AFTER</literal> 
              depending on the triggers definition.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_LEVEL</varname></term>
     <listitem>
      <para>
       Data type <type>text</type>; a string of either 
              <literal>ROW</literal> or <literal>STATEMENT</literal> depending on the
       triggers definition.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_OP</varname></term>
     <listitem>
      <para>
       Data type <type>text</type>; a string of 
              <literal>INSERT</literal>, <literal>UPDATE</literal> 
              or <literal>DELETE</literal> telling
       for which operation the trigger is actually fired.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_RELID</varname></term>
     <listitem>
      <para>
       Data type <type>oid</type>; the object ID of the table that caused the
       trigger invocation.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_RELNAME</varname></term>
     <listitem>
      <para>
       Data type <type>name</type>; the name of the table that caused the trigger
       invocation.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_NARGS</varname></term>
     <listitem>
      <para>
       Data type <type>integer</type>; the number of arguments given to the trigger
       procedure in the <command>CREATE TRIGGER</command> statement.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><varname>TG_ARGV[]</varname></term>
     <listitem>
      <para>
       Data type array of <type>text</type>; the arguments from
              the <command>CREATE TRIGGER</command> statement.
       The index counts from 0 and can be given as an expression. Invalid
       indices (&lt; 0 or &gt;= tg_nargs) result in a NULL value.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

   <para>
    Second they must return either NULL or a record/row containing
    exactly the structure of the table the trigger was fired for.
    Triggers fired AFTER might always return a NULL value with no
    effect. Triggers fired BEFORE signal the trigger manager
    to skip the operation for this actual row when returning NULL.
    Otherwise, the returned record/row replaces the inserted/updated
    row in the operation. It is possible to replace single values directly
    in NEW and return that or to build a complete new record/row to
    return.
   </para>

   <example>
    <title>A PL/pgSQL Trigger Procedure Example</title>

    <para>
     This trigger ensures, that any time a row is inserted or updated
     in the table, the current user name and time are stamped into the
     row. And it ensures that an employees name is given and that the
     salary is a positive value.

<programlisting>
CREATE TABLE emp (
    empname text,
    salary integer,
    last_date timestamp,
    last_user text
);

CREATE FUNCTION emp_stamp () RETURNS OPAQUE AS '
    BEGIN
        -- Check that empname and salary are given
        IF NEW.empname ISNULL THEN
            RAISE EXCEPTION ''empname cannot be NULL value'';
        END IF;
        IF NEW.salary ISNULL THEN
            RAISE EXCEPTION ''% cannot have NULL salary'', NEW.empname;
        END IF;

        -- Who works for us when she must pay for?
        IF NEW.salary < 0 THEN
            RAISE EXCEPTION ''% cannot have a negative salary'', NEW.empname;
        END IF;

        -- Remember who changed the payroll when
        NEW.last_date := ''now'';
        NEW.last_user := current_user;
        RETURN NEW;
    END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW EXECUTE PROCEDURE emp_stamp();
</programlisting>
    </para>
   </example>
  </sect1>

  <!-- **** PL/pgSQL Examples **** -->

  <sect1 id="plpgsql-examples">
   <title>Examples</title>

   <para>
    Here are only a few functions to demonstrate how easy it is to
    write PL/pgSQL
    functions. For more complex examples the programmer
    might look at the regression test for PL/pgSQL.
   </para>

   <para>
    One painful detail in writing functions in PL/pgSQL is the handling
    of single quotes. The function's source text on <command>CREATE FUNCTION</command> must
    be a literal string. Single quotes inside of literal strings must be
    either doubled or quoted with a backslash. We are still looking for
    an elegant alternative. In the meantime, doubling the single quotes
    as in the examples below should be used. Any solution for this
    in future versions of <productname>Postgres</productname> will be
    forward compatible.
   </para>

   <para>
    For a detailed explanation and examples of how to escape single
    quotes in different situations, please see <xref linkend="plpgsql-quote">.
   </para>

   <example>
    <title>A Simple PL/pgSQL Function to Increment an Integer</title>

    <para>
     The following two PL/pgSQL functions are identical to their
     counterparts from the C language function discussion. This
     function receives an <type>integer</type> and increments it by
     one, returning the incremented value.
    </para>

<programlisting>
CREATE FUNCTION add_one (integer) RETURNS integer AS '
    BEGIN
        RETURN $1 + 1;
    END;
' LANGUAGE 'plpgsql';
</programlisting>
   </example>

   <example>
    <title>A Simple PL/pgSQL Function to Concatenate Text</title>

    <para>
     This function receives two <type>text</type> parameters and
     returns the result of concatenating them.
    </para>

<programlisting>
CREATE FUNCTION concat_text (text, text) RETURNS text AS '
    BEGIN
        RETURN $1 || $2;
    END;
' LANGUAGE 'plpgsql';
</programlisting>
   </example>

   <example>
    <title>A PL/pgSQL Function on Composite Type</title>

    <para>
     In this example, we take EMP (a table) and an
     <type>integer</type> as arguments to our function, which returns
     a <type>boolean</type>. If the "salary" field of the EMP table is
     <literal>NULL</literal>, we return "f". Otherwise we compare with
     that field with the <type>integer</type> passed to the function
     and return the <type>boolean</type> result of the comparison (t
     or f). This is the PL/pgSQL equivalent to the example from the C
     functions.
    </para>

<programlisting>
CREATE FUNCTION c_overpaid (EMP, integer) RETURNS boolean AS '
    DECLARE
        emprec ALIAS FOR $1;
        sallim ALIAS FOR $2;
    BEGIN
        IF emprec.salary ISNULL THEN
            RETURN ''f'';
        END IF;
        RETURN emprec.salary > sallim;
    END;
' LANGUAGE 'plpgsql';
</programlisting>
   </example>
  </sect1>

  <!-- **** Porting from Oracle PL/SQL **** -->

 <sect1 id="plpgsql-porting">

  <sect1info>
   <date>
    February 2001
   </date>
   <author>
    <firstname>Roberto</firstname>
    <surname>Mello</surname>
    <affiliation>
     <address>
      <email>rmello@fslc.usu.edu</email>
     </address>
    </affiliation>
   </author>

   <legalnotice>
    <para>
     Except for portions of this document quoted from other sources,
     this document is licensed under the BSD License.
    </para>
   </legalnotice>
  </sect1info> 

  <title>Porting from Oracle PL/SQL</title>

  <indexterm>
   <primary>Oracle</primary>
  </indexterm>

  <indexterm>
   <primary>PL/SQL</primary>
  </indexterm>

  <note>
   <title>Author</title>
   <para>
    Roberto Mello (<email>rmello@fslc.usu.edu</email>)
   </para>
  </note>

  <para>
   This section explains differences between Oracle's PL/SQL and
   PostgreSQL's PL/pgSQL languages in the hopes of helping developers
   port applications from Oracle to PostgreSQL.  Most of the code here
   is from the <ulink url="http://www.arsdigita.com">ArsDigita</ulink>
   <ulink url="http://www.arsdigita.com/asj/clickstream">Clickstream
   module</ulink> that I ported to PostgreSQL when I took an
   internship with <ulink url="http://www.openforce.net">OpenForce
   Inc.</ulink> in the Summer of 2000.
  </para>

  <para>
   PL/pgSQL is similar to PL/SQL in many aspects. It is a block
   structured, imperative language (all variables have to be
   declared). PL/SQL has many more features than its PostgreSQL
   counterpart, but PL/pgSQL allows for a great deal of functionality
   and it is being improved constantly.
  </para>

  <sect2>
   <title>Main Differences</title>

   <para>
    Some things you should keep in mind when porting from Oracle to PostgreSQL:

    <itemizedlist>
     <listitem>
      <para>
       No default parameters in PostgreSQL.
      </para>
     </listitem>

     <listitem>
      <para>
       You can overload functions in PostgreSQL. This is often used to work 
       around the lack of default parameters.
      </para>
     </listitem>
     <listitem>
      <para>
       Assignments, loops and conditionals are similar. 
      </para>
     </listitem>

     <listitem>
      <para>
       No need for cursors in PostgreSQL, just put the query in the FOR 
       statement (see example below)
      </para>
     </listitem>

     <listitem>
      <para>
       In PostgreSQL you <emphasis>need</emphasis> to escape single
       quotes. See <xref linkend="plpgsql-quote">.
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <sect3 id="plpgsql-quote">
    <title>Quote Me on That: Escaping Single Quotes</title>

    <para>
     In PostgreSQL you need to escape single quotes inside your
     function definition. This can lead to quite amusing code at
     times, especially if you are creating a function that generates
     other function(s), as in 
     <xref linkend="plpgsql-porting-nastyquote">.  
     One thing to keep in mind
     when escaping lots of single quotes is that, except for the
     beginning/ending quotes, all the others will come in even
     quantity.
    </para>

    <para>
     <xref linkend="plpgsql-quoting-table"> gives the scoop.  (You'll
     love this little chart.)
    </para>

    <table id="plpgsql-quoting-table">
     <title>Single Quotes Escaping Chart</title>

     <tgroup cols="4">
      <thead>
       <row>
        <entry>No. of Quotes</entry>
        <entry>Usage</entry>
        <entry>Example</entry>
        <entry>Result</entry>
       </row>
      </thead>

      <tbody>
       <row>
        <entry>1</entry>
        <entry>To begin/terminate function bodies</entry>
        <entry><programlisting>
CREATE FUNCTION foo() RETURNS INTEGER AS '...'
LANGUAGE 'plpgsql';
</programlisting></entry>
        <entry>as is</entry>
       </row>

       <row>
        <entry>2</entry>
        <entry>In assignments, SELECTs, to delimit strings, etc.</entry>
        <entry><programlisting>
a_output := ''Blah'';
SELECT * FROM users WHERE f_name=''foobar'';
</programlisting></entry>
        <entry><literal>SELECT * FROM users WHERE f_name='foobar';</literal></entry>
       </row>

       <row>
        <entry>4</entry>
        <entry>
         When you need two single quotes in your resulting string
         without terminating that string.
        </entry>
        <entry><programlisting>
a_output := a_output || '' AND name 
    LIKE ''''foobar'''' AND ...''
</programlisting></entry>
        <entry><literal>AND name LIKE 'foobar' AND ...</literal></entry>
       </row>

       <row>
        <entry>6</entry>
        <entry>
         When you want double quotes in your resulting string
         <emphasis>and</emphasis> terminate that string.
        </entry>
        <entry><programlisting>
a_output := a_output || '' AND name 
    LIKE ''''foobar''''''
</programlisting></entry>
        <entry>
         <literal>AND name LIKE 'foobar'</literal>
        </entry>
       </row>

       <row>
        <entry>10</entry>
        <entry>
         When you want two single quotes in the resulting string
         (which accounts for 8 quotes) <emphasis>and</emphasis>
         terminate that string (2 more).  You will probably only need
         that if you were using a function to generate other functions
         (like in <xref linkend="plpgsql-porting-nastyquote">).
        </entry>
        <entry><programlisting>
a_output := a_output || '' if v_'' || 
    referrer_keys.kind || '' like '''''''''' 
    || referrer_keys.key_string || '''''''''' 
    then return ''''''  || referrer_keys.referrer_type 
    || ''''''; end if;''; 
</programlisting></entry>
        <entry>
         <literal>if v_<...> like ''<...>'' then return ''<...>''; end if;</literal>
        </entry>
       </row>
      </tbody>
     </tgroup>
    </table>
   </sect3>
  </sect2>  

  <sect2 id="plpgsql-porting-functions">
   <title>
    Porting Functions
   </title>

   <example>
    <title>
     A Simple Function
    </title>
    <para>
     Here is an Oracle function:
<programlisting>
CREATE OR REPLACE FUNCTION cs_fmt_browser_version(v_name IN varchar, v_version IN varchar)
RETURN varchar IS
BEGIN
    IF v_version IS NULL THEN
        RETURN v_name;
    END IF;
    RETURN v_name || '/' || v_version;
END;
/
SHOW ERRORS;
</programlisting>
    </para>

    <para>
     Let's go through this function and see the differences to PL/pgSQL:

     <itemizedlist>
      <listitem>
       <para>
        The <literal>OR REPLACE</literal> clause is not allowed. You
        will have to explicitly drop the function before creating it
        to achieve similar results.
       </para>
      </listitem>

      <listitem>
       <para>
        <productname>PostgreSQL</productname> does not have named
        parameters. You have to explicitly alias them inside your
        function.
       </para>
      </listitem>

      <listitem>
       <para>
        Oracle can have <literal>IN</literal>, <literal>OUT</literal>,
        and <literal>INOUT</literal> parameters passed to functions.
        The <literal>INOUT</literal>, for example, means that the
        parameter will receive a value and return another. PostgreSQL
        only has <quote>IN</quote> parameters and functions can return
        only a single value.
       </para>
      </listitem>

      <listitem>
       <para>
        The <literal>RETURN</literal> key word in the function
        prototype (not the function body) becomes
        <literal>RETURNS</literal> in PostgreSQL.
       </para>
      </listitem>

      <listitem>
       <para>
        On PostgreSQL functions are created using single quotes as
        delimiters, so you have to escape single quotes inside your
        functions (which can be quite annoying at times; see <xref
        linkend="plpgsql-quote">).
       </para>
      </listitem>

      <listitem>
       <para>
        The <literal>/show errors</literal> command does not exist in
        PostgreSQL.
       </para>
      </listitem>
     </itemizedlist>
    </para>

    <para>
     So let's see how this function would be look like ported to
     PostgreSQL:

<programlisting>
DROP FUNCTION cs_fmt_browser_version(varchar, varchar);
CREATE FUNCTION cs_fmt_browser_version(varchar, varchar)
RETRUNS varchar AS '
DECLARE
    v_name ALIAS FOR $1;
    v_version ALIAS FOR $2;
BEGIN
    IF v_version IS NULL THEN
        return v_name;
    END IF;
    RETURN v_name || ''/'' || v_version;
END;
' LANGUAGE 'plpgsql';
</programlisting>
    </para>
   </example>
 
   <example id="plpgsql-porting-nastyquote">
    <title>
     A Function that Creates Another Function
    </title>

    <para>
     The following procedure grabs rows from a
     <literal>SELECT</literal> statement and builds a large function
     with the results in <literal>IF</literal> statements, for the
     sake of efficiency. Notice particularly the differences in
     cursors, <literal>FOR</literal> loops, and the need to escape
     single quotes in PostgreSQL.

<programlisting>
create or replace procedure cs_update_referrer_type_proc is 
    cursor referrer_keys is 
        select * from cs_referrer_keys 
        order by try_order;

    a_output varchar(4000); 
begin 
    a_output := 'create or replace function cs_find_referrer_type(v_host IN varchar, v_domain IN varchar, 
v_url IN varchar) return varchar is begin'; 

    for referrer_key in referrer_keys loop 
        a_output := a_output || ' if v_' || referrer_key.kind || ' like ''' || 
referrer_key.key_string || ''' then return ''' || referrer_key.referrer_type || 
'''; end if;'; 
    end loop; 

    a_output := a_output || ' return null; end;'; 
    execute immediate a_output; 
end; 
/ 
show errors
</programlisting>
    </para>

    <para>
     Here is how this function would end up in PostgreSQL:

<programlisting>
CREATE FUNCTION cs_update_referrer_type_proc() RETURNS integer AS '
DECLARE
    referrer_keys RECORD;  -- Declare a generic record to be used in a FOR
    a_output varchar(4000);
BEGIN 
    a_output := ''CREATE FUNCTION cs_find_referrer_type(varchar,varchar,varchar) 
                  RETURNS varchar AS '''' 
                     DECLARE 
                         v_host ALIAS FOR $1; 
                         v_domain ALIAS FOR $2; 
                         v_url ALIAS FOR $3; ''; 

    -- 
    -- Notice how we scan through the results of a query in a FOR loop
    -- using the FOR &lt;record&gt; construct.
    --

    FOR referrer_keys IN select * from cs_referrer_keys order by try_order LOOP
        a_output := a_output || '' if v_'' || referrer_keys.kind || '' like '''''''''' 
                 || referrer_keys.key_string || '''''''''' then return '''''' 
                 || referrer_keys.referrer_type || ''''''; end if;''; 
    END LOOP; 
  
    a_output := a_output || '' return null; end; '''' language ''''plpgsql'''';''; 
 
    -- This works because we are not substituting any variables
    -- Otherwise it would fail. Look at PERFORM for another way to run functions
    
    EXECUTE a_output; 
end; 
' LANGUAGE 'plpgsql';
</programlisting>
    </para>
   </example>
 
   <example>
    <title>
     A Procedure with a lot of String Manipulation and OUT Parameters
    </title>

    <para>
     The following Oracle PL/SQL procedure is used to parse a URL and
     return several elements (host, path and query). It is an
     procedure because in PL/pgSQL functions only one value can be returned
     (see <xref linkend="plpgsql-porting-procedures">).  In
     PostgreSQL, one way to work around this is to split the procedure
     in three different functions: one to return the host, another for
     the path and another for the query.
    </para>

<programlisting>
create or replace procedure cs_parse_url(
    v_url IN varchar,
    v_host OUT varchar,  -- This will be passed back
    v_path OUT varchar,  -- This one too
    v_query OUT varchar) -- And this one
is
    a_pos1 integer;
    a_pos2 integer;
begin
    v_host := NULL;
    v_path := NULL;
    v_query := NULL;
    a_pos1 := instr(v_url, '//'); -- PostgreSQL doesn't have an instr function

    if a_pos1 = 0 then
        return;
    end if;
    a_pos2 := instr(v_url, '/', a_pos1 + 2);
    if a_pos2 = 0 then
        v_host := substr(v_url, a_pos1 + 2);
        v_path := '/';
        return;
    end if;

    v_host := substr(v_url, a_pos1 + 2, a_pos2 - a_pos1 - 2);
    a_pos1 := instr(v_url, '?', a_pos2 + 1);

    if a_pos1 = 0 then
        v_path := substr(v_url, a_pos2);
        return;
    end if;

    v_path := substr(v_url, a_pos2, a_pos1 - a_pos2);
    v_query := substr(v_url, a_pos1 + 1);
end;
/
show errors;
</programlisting>

    <para>
     Here is how this procedure could be translated for PostgreSQL:

<programlisting>
drop function cs_parse_url_host(varchar); 
create function cs_parse_url_host(varchar) returns varchar as ' 
declare 
    v_url ALIAS FOR $1; 
    v_host varchar; 
    v_path varchar; 
    a_pos1 integer; 
    a_pos2 integer; 
    a_pos3 integer; 
begin 
    v_host := NULL; 
    a_pos1 := instr(v_url,''//''); 

    if a_pos1 = 0 then 
        return '''';  -- Return a blank
    end if; 

    a_pos2 := instr(v_url,''/'',a_pos1 + 2); 
    if a_pos2 = 0 then 
        v_host := substr(v_url, a_pos1 + 2); 
        v_path := ''/''; 
        return v_host; 
    end if; 

    v_host := substr(v_url, a_pos1 + 2, a_pos2 - a_pos1 - 2 ); 
    return v_host; 
end; 
' language 'plpgsql';
</programlisting>
    </para>
   </example>

   <note>
    <para>
     PostgreSQL does not have an <function>instr</function> function,
     so you can work around it using a combination of other functions.
     I got tired of doing this and created my own
     <function>instr</function> functions that behave exactly like
     Oracle's (it makes life easier). See the <xref
     linkend="plpgsql-porting-appendix"> for the code.
    </para>
   </note>
  </sect2>

  <sect2 id="plpgsql-porting-procedures">
   <title>
    Procedures
   </title>
   <para>
    Oracle procedures give a little more flexibility to the developer
    because nothing needs to be explicitly returned, but it can be
    through the use of INOUT or OUT parameters.
   </para>

   <para>
    An example:

<programlisting>
create or replace procedure cs_create_job(v_job_id in integer)
is
    a_running_job_count integer;
    pragma autonomous_transaction;<co id="co.plpgsql-porting-pragma">
begin
    lock table cs_jobs in exclusive mode;<co id="co.plpgsql-porting-locktable">

    select count(*) into a_running_job_count from cs_jobs
    where end_stamp is null;

    if a_running_job_count > 0 then
        commit; -- free lock<co id="co.plpgsql-porting-commit">
        raise_application_error(-20000, 'Unable to create a new job: a job is currently running.');
    end if;

    delete from cs_active_job;
    insert into cs_active_job(job_id) values(v_job_id);

    begin
        insert into cs_jobs(job_id, start_stamp) values(v_job_id, sysdate);
        exception when dup_val_on_index then null; -- don't worry if it already exists<co id="co.plpgsql-porting-exception">
    end;
    commit;
end;
/
show errors
</programlisting>
   </para>

   <para>
    Procedures like this can be easily converted into PostgreSQL
    functions returning an <type>INTEGER</type>. This procedure in
    particular is interesting because it can teach us some things:

    <calloutlist>
     <callout arearefs="co.plpgsql-porting-pragma">
      <para>
       There is no <literal>pragma</literal> statement in PostgreSQL.
      </para>
     </callout>

     <callout arearefs="co.plpgsql-porting-locktable">
      <para>
       If you do a <literal>LOCK TABLE</literal> in PL/pgSQL, the lock
       will not be released until the calling transaction is finished.
      </para>
     </callout>

     <callout arearefs="co.plpgsql-porting-commit">
      <para>
       You also cannot have transactions in PL/pgSQL procedures. The
       entire function (and other functions called from therein) is
       executed in a transaction and PostgreSQL rolls back the results if
       something goes wrong. Therefore only one
       <literal>BEGIN</literal> statement is allowed.
      </para>
     </callout>

     <callout arearefs="co.plpgsql-porting-exception">
      <para>
       The exception when would have to be replaced by an
       <literal>IF</literal> statement.
      </para>
     </callout>
    </calloutlist>
   </para>

   <para>
    So let's see one of the ways we could port this procedure to PL/pgSQL:

<programlisting>
drop function cs_create_job(integer);
create function cs_create_job(integer) returns integer as ' declare
    v_job_id alias for $1;
    a_running_job_count integer;
    a_num integer;
    -- pragma autonomous_transaction;
begin
    lock table cs_jobs in exclusive mode;
    select count(*) into a_running_job_count from cs_jobs where end_stamp is null;

    if a_running_job_count > 0 then
        -- commit; -- free lock
        raise exception ''Unable to create a new job: a job is currently running.'';
    end if;

    delete from cs_active_job;
    insert into cs_active_job(job_id) values(v_job_id);

    SELECT count(*) into a_num FROM cs_jobs WHERE job_id=v_job_id;
    IF NOT FOUND THEN  -- If nothing was returned in the last query
        -- This job is not in the table so lets insert it.
        insert into cs_jobs(job_id, start_stamp) values(v_job_id, sysdate());
        return 1;
    ELSE
        raise NOTICE ''Job already running.'';<co id="co.plpgsql-porting-raise">
    END IF;

    return 0;
end;
' language 'plpgsql';
</programlisting>

    <calloutlist>
     <callout arearefs="co.plpgsql-porting-raise">
      <para>
       Notice how you can raise notices (or errors) in PL/pgSQL.
      </para>
     </callout>
    </calloutlist>
   </para>
  </sect2>

  <sect2 id="plpgsql-porting-packages">
   <title>
    Packages
   </title>

   <note>
    <para>
     I haven't done much with packages myself, so if there are
     mistakes here, please let me know.
    </para>
   </note>

   <para>
    Packages are a way Oracle gives you to encapsulate PL/SQL
    statements and functions into one entity, like Java classes, where
    you define methods and objects. You can access these
    objects/methods with a <quote><literal>.</literal></quote>
    (dot). Here is an example of an Oracle package from ACS 4 (the
    <ulink url="http://www.arsdigita.com/doc/">ArsDigita Community
    System</ulink>):

<programlisting>
create or replace package body acs
as
  function add_user (
    user_id     in users.user_id%TYPE default null,
    object_type     in acs_objects.object_type%TYPE
               default 'user',
    creation_date   in acs_objects.creation_date%TYPE
               default sysdate,
    creation_user   in acs_objects.creation_user%TYPE
               default null,
    creation_ip     in acs_objects.creation_ip%TYPE default null,
  ...
  ) return users.user_id%TYPE
  is
    v_user_id       users.user_id%TYPE;
    v_rel_id        membership_rels.rel_id%TYPE;
  begin
    v_user_id := acs_user.new (user_id, object_type, creation_date,
                creation_user, creation_ip, email,
    ...
    return v_user_id;
  end;
end acs;
/
show errors
</programlisting>
   </para>

   <para>
    We port this to PostgreSQL by creating the different objects of
    the Oracle package as functions with a standard naming
    convention. We have to pay attention to some other details, like
    the lack of default parameters in PostgreSQL functions. The above
    package would become something like this:
 
<programlisting>
CREATE FUNCTION acs__add_user(integer,integer,varchar,datetime,integer,integer,...)
RETURNS integer AS '
DECLARE
    user_id ALIAS FOR $1;
    object_type ALIAS FOR $2;
    creation_date ALIAS FOR $3;
    creation_user ALIAS FOR $4;
    creation_ip ALIAS FOR $5;
    ...
    v_user_id users.user_id%TYPE;
    v_rel_id membership_rels.rel_id%TYPE;
BEGIN
    v_user_id := acs_user__new(user_id,object_type,creation_date,creation_user,creation_ip, ...);
    ...

    return v_user_id;
END;
' LANGUAGE 'plpgsql';
</programlisting>
   </para>
  </sect2>

  <sect2 id="plpgsql-porting-other">
   <title>
    Other Things to Watch For
   </title>

   <sect3>
    <title>EXECUTE</title>

    <para>
     The PostgreSQL version of <literal>EXECUTE</literal> works
     nicely, but you have to remember to use
     <function>quote_literal(TEXT)</function> and
     <function>quote_string(TEXT)</function> as described in <xref
     linkend="plpgsql-statements-executing-dyn-queries">.  Constructs of the type
     <literal>EXECUTE ''SELECT * from $1'';</literal> will not work
     unless you use these functions.
    </para>
   </sect3>

   <sect3 id="plpgsql-porting-optimization">
    <title>Optimizing PL/pgSQL Functions</title>

    <para>
     PostgreSQL gives you two function creation modifiers to optimize
     execution: <literal>iscachable</literal> (function always returns
     the same result when given the same arguments) and
     <literal>isstrict</literal> (function returns NULL if any
     argument is NULL).  Consult the <command>CREATE
     FUNCTION</command> reference for details.
    </para>

    <para>
     To make use of these optimization attributes, you have to use the
     <literal>WITH</literal> modifier in your <command>CREATE
     FUNCTION</command> statement.  Something like:

<programlisting>
CREATE FUNCTION foo(...) RETURNS integer AS '
...
' LANGUAGE 'plpgsql'
WITH (isstrict, iscachable);
</programlisting>
    </para>
   </sect3>
  </sect2>

  <sect2 id="plpgsql-porting-appendix">
   <title>
    Appendix
   </title>

   <sect3>
    <title>
     Code for my <function>instr</function> functions
    </title>

    <comment>
     This function should probably be integrated into the core.
    </comment>

<programlisting>
--
-- instr functions that mimic Oracle's counterpart
-- Syntax: instr(string1,string2,[n],[m]) where [] denotes optional params.
-- 
-- Searches string1 beginning at the nth character for the mth
-- occurrence of string2. If n is negative, search backwards. If m is
-- not passed, assume 1 (search starts at first character).
--
-- by Roberto Mello (rmello@fslc.usu.edu)
-- modified by Robert Gaszewski (graszew@poland.com)
-- Licensed under the GPL v2 or later.
--

DROP FUNCTION instr(varchar,varchar);
CREATE FUNCTION instr(varchar,varchar) RETURNS integer AS '
DECLARE
    pos integer;
BEGIN
    pos:= instr($1,$2,1);
    RETURN pos;
END;
' language 'plpgsql';


DROP FUNCTION instr(varchar,varchar,integer);
CREATE FUNCTION instr(varchar,varchar,integer) RETURNS integer AS '
DECLARE
    string ALIAS FOR $1;
    string_to_search ALIAS FOR $2;
    beg_index ALIAS FOR $3;
    pos integer NOT NULL DEFAULT 0;
    temp_str varchar;
    beg integer;
    length integer;
    ss_length integer;
BEGIN
    IF beg_index > 0 THEN

       temp_str := substring(string FROM beg_index);
       pos := position(string_to_search IN temp_str);

       IF pos = 0 THEN
	         RETURN 0;
	     ELSE
	         RETURN pos + beg_index - 1;
	     END IF;
    ELSE
       ss_length := char_length(string_to_search);
       length := char_length(string);
       beg := length + beg_index - ss_length + 2;

       WHILE beg > 0 LOOP

           temp_str := substring(string FROM beg FOR ss_length);
	         pos := position(string_to_search IN temp_str);

	         IF pos > 0 THEN
		           RETURN beg;
	         END IF;

	         beg := beg - 1;
       END LOOP;
       RETURN 0;
    END IF;
END;
' language 'plpgsql';

--
-- Written by Robert Gaszewski (graszew@poland.com)
-- Licensed under the GPL v2 or later.
--
DROP FUNCTION instr(varchar,varchar,integer,integer);
CREATE FUNCTION instr(varchar,varchar,integer,integer) RETURNS integer AS '
DECLARE
    string ALIAS FOR $1;
    string_to_search ALIAS FOR $2;
    beg_index ALIAS FOR $3;
    occur_index ALIAS FOR $4;
    pos integer NOT NULL DEFAULT 0;
    occur_number integer NOT NULL DEFAULT 0;
    temp_str varchar;
    beg integer;
    i integer;
    length integer;
    ss_length integer;
BEGIN
    IF beg_index > 0 THEN
        beg := beg_index;
        temp_str := substring(string FROM beg_index);

        FOR i IN 1..occur_index LOOP
            pos := position(string_to_search IN temp_str);

            IF i = 1 THEN
                beg := beg + pos - 1;
            ELSE
                beg := beg + pos;
            END IF;

            temp_str := substring(string FROM beg + 1);
        END LOOP;

        IF pos = 0 THEN
            RETURN 0;
        ELSE
            RETURN beg;
        END IF;
    ELSE
        ss_length := char_length(string_to_search);
        length := char_length(string);
        beg := length + beg_index - ss_length + 2;

        WHILE beg > 0 LOOP
            temp_str := substring(string FROM beg FOR ss_length);
            pos := position(string_to_search IN temp_str);

            IF pos > 0 THEN
                occur_number := occur_number + 1;

                IF occur_number = occur_index THEN
                    RETURN beg;
                END IF;
            END IF;

            beg := beg - 1;
        END LOOP;

        RETURN 0;
    END IF;
END;
' language 'plpgsql';
</programlisting>
   </sect3>
  </sect2>
  
 </sect1>

</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->