diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml index 701943448ff5ece57cce50408abf6156b300954e..352fb4bb3e1bf4973cec2f665e5232a94af7283e 100644 --- a/doc/src/sgml/extend.sgml +++ b/doc/src/sgml/extend.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.28 2004/06/07 04:04:47 tgl Exp $ +$PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.29 2004/12/30 03:13:56 tgl Exp $ --> <chapter id="extend"> @@ -152,8 +152,8 @@ $PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.28 2004/06/07 04:04:47 tgl Exp $ <para> Domains can be created using the <acronym>SQL</> command - <command>CREATE DOMAIN</command>. Their creation and use is not - discussed in this chapter. + <xref linkend="sql-createdomain" endterm="sql-createdomain-title">. + Their creation and use is not discussed in this chapter. </para> </sect2> @@ -221,7 +221,7 @@ $PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.28 2004/06/07 04:04:47 tgl Exp $ Thus, when more than one argument position is declared with a polymorphic type, the net effect is that only certain combinations of actual argument types are allowed. For example, a function declared as - <literal>foo(anyelement, anyelement)</> will take any two input values, + <literal>equal(anyelement, anyelement)</> will take any two input values, so long as they are of the same data type. </para> diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index 831acc4371dfa8393f9da97a2f234018475306f5..36d58f4d8966477f5b4019da7c50b36912f6f93e 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.71 2004/12/29 23:36:47 tgl Exp $ +$PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.72 2004/12/30 03:13:56 tgl Exp $ --> <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [ @@ -192,18 +192,19 @@ $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.71 2004/12/29 23:36:47 tgl Exp user-defined functions, data types, triggers, etc. These are advanced topics which should probably be approached only after all the other user documentation about <productname>PostgreSQL</> has - been understood. This part also describes the server-side + been understood. Later chapters in this part describe the server-side programming languages available in the <productname>PostgreSQL</productname> distribution as well as - general issues concerning server-side programming languages. This - information is only useful to readers that have read at least the - first few chapters of this part. + general issues concerning server-side programming languages. It + is essential to read at least the earlier sections of <xref + linkend="extend"> (covering functions) before diving into the + material about server-side programming languages. </para> </partintro> &extend; - &rules; &trigger; + &rules; &xplang; &plsql; diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml index 3c9aea3c43c53340f9e2e925843d88944a9f9920..2b53c84c7b15d673cce3388bceb4a444d627dd29 100644 --- a/doc/src/sgml/rules.sgml +++ b/doc/src/sgml/rules.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/rules.sgml,v 1.36 2004/11/15 06:32:14 neilc Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/rules.sgml,v 1.37 2004/12/30 03:13:56 tgl Exp $ --> <Chapter Id="rules"> <Title>The Rule System</Title> @@ -104,19 +104,19 @@ <ListItem> <Para> The range table is a list of relations that are used in the query. - In a <command>SELECT</command> statement these are the relations given after - the <literal>FROM</literal> key word. + In a <command>SELECT</command> statement these are the relations given after + the <literal>FROM</literal> key word. </Para> <Para> Every range table entry identifies a table or view and tells - by which name it is called in the other parts of the query. - In the query tree, the range table entries are referenced by - number rather than by name, so here it doesn't matter if there - are duplicate names as it would in an <Acronym>SQL</Acronym> - statement. This can happen after the range tables of rules - have been merged in. The examples in this chapter will not have - this situation. + by which name it is called in the other parts of the query. + In the query tree, the range table entries are referenced by + number rather than by name, so here it doesn't matter if there + are duplicate names as it would in an <Acronym>SQL</Acronym> + statement. This can happen after the range tables of rules + have been merged in. The examples in this chapter will not have + this situation. </Para> </ListItem> </VarListEntry> @@ -128,21 +128,21 @@ <ListItem> <Para> This is an index into the range table that identifies the - relation where the results of the query go. + relation where the results of the query go. </Para> <Para> - <command>SELECT</command> queries normally don't have a result - relation. The special case of a <command>SELECT INTO</command> is - mostly identical to a <command>CREATE TABLE</command> followed by a - <literal>INSERT ... SELECT</literal> and is not discussed - separately here. + <command>SELECT</command> queries normally don't have a result + relation. The special case of a <command>SELECT INTO</command> is + mostly identical to a <command>CREATE TABLE</command> followed by a + <literal>INSERT ... SELECT</literal> and is not discussed + separately here. </Para> <Para> For <command>INSERT</command>, <command>UPDATE</command>, and - <command>DELETE</command> commands, the result relation is the table - (or view!) where the changes take effect. + <command>DELETE</command> commands, the result relation is the table + (or view!) where the changes are to take effect. </Para> </ListItem> </VarListEntry> @@ -167,39 +167,39 @@ <Para> <command>DELETE</command> commands don't need a target list - because they don't produce any result. In fact, the planner will - add a special <acronym>CTID</> entry to the empty target list, but - this is after the rule system and will be discussed later; for the - rule system, the target list is empty. + because they don't produce any result. In fact, the planner will + add a special <acronym>CTID</> entry to the empty target list, but + this is after the rule system and will be discussed later; for the + rule system, the target list is empty. </Para> <Para> For <command>INSERT</command> commands, the target list describes - the new rows that should go into the result relation. It consists of the - expressions in the <literal>VALUES</> clause or the ones from the - <command>SELECT</command> clause in <literal>INSERT - ... SELECT</literal>. The first step of the rewrite process adds - target list entries for any columns that were not assigned to by - the original command but have defaults. Any remaining columns (with - neither a given value nor a default) will be filled in by the - planner with a constant null expression. + the new rows that should go into the result relation. It consists of the + expressions in the <literal>VALUES</> clause or the ones from the + <command>SELECT</command> clause in <literal>INSERT + ... SELECT</literal>. The first step of the rewrite process adds + target list entries for any columns that were not assigned to by + the original command but have defaults. Any remaining columns (with + neither a given value nor a default) will be filled in by the + planner with a constant null expression. </Para> <Para> For <command>UPDATE</command> commands, the target list - describes the new rows that should replace the old ones. In the - rule system, it contains just the expressions from the <literal>SET - column = expression</literal> part of the command. The planner will handle - missing columns by inserting expressions that copy the values from - the old row into the new one. And it will add the special - <acronym>CTID</> entry just as for <command>DELETE</command>, too. + describes the new rows that should replace the old ones. In the + rule system, it contains just the expressions from the <literal>SET + column = expression</literal> part of the command. The planner will handle + missing columns by inserting expressions that copy the values from + the old row into the new one. And it will add the special + <acronym>CTID</> entry just as for <command>DELETE</command>, too. </Para> <Para> Every entry in the target list contains an expression that can - be a constant value, a variable pointing to a column of one - of the relations in the range table, a parameter, or an expression - tree made of function calls, constants, variables, operators, etc. + be a constant value, a variable pointing to a column of one + of the relations in the range table, a parameter, or an expression + tree made of function calls, constants, variables, operators, etc. </Para> </ListItem> </VarListEntry> @@ -211,12 +211,12 @@ <ListItem> <Para> The query's qualification is an expression much like one of - those contained in the target list entries. The result value of - this expression is a Boolean that tells whether the operation - (<command>INSERT</command>, <command>UPDATE</command>, - <command>DELETE</command>, or <command>SELECT</command>) for the - final result row should be executed or not. It corresponds to the <literal>WHERE</> clause - of an <Acronym>SQL</Acronym> statement. + those contained in the target list entries. The result value of + this expression is a Boolean that tells whether the operation + (<command>INSERT</command>, <command>UPDATE</command>, + <command>DELETE</command>, or <command>SELECT</command>) for the + final result row should be executed or not. It corresponds to the <literal>WHERE</> clause + of an <Acronym>SQL</Acronym> statement. </Para> </ListItem> </VarListEntry> @@ -228,17 +228,17 @@ <ListItem> <Para> The query's join tree shows the structure of the <literal>FROM</> clause. - For a simple query like <literal>SELECT ... FROM a, b, c</literal>, the join tree is just - a list of the <literal>FROM</> items, because we are allowed to join them in - any order. But when <literal>JOIN</> expressions, particularly outer joins, - are used, we have to join in the order shown by the joins. - In that case, the join tree shows the structure of the <literal>JOIN</> expressions. The - restrictions associated with particular <literal>JOIN</> clauses (from <literal>ON</> or - <literal>USING</> expressions) are stored as qualification expressions attached - to those join-tree nodes. It turns out to be convenient to store - the top-level <literal>WHERE</> expression as a qualification attached to the - top-level join-tree item, too. So really the join tree represents - both the <literal>FROM</> and <literal>WHERE</> clauses of a <command>SELECT</command>. + For a simple query like <literal>SELECT ... FROM a, b, c</literal>, the join tree is just + a list of the <literal>FROM</> items, because we are allowed to join them in + any order. But when <literal>JOIN</> expressions, particularly outer joins, + are used, we have to join in the order shown by the joins. + In that case, the join tree shows the structure of the <literal>JOIN</> expressions. The + restrictions associated with particular <literal>JOIN</> clauses (from <literal>ON</> or + <literal>USING</> expressions) are stored as qualification expressions attached + to those join-tree nodes. It turns out to be convenient to store + the top-level <literal>WHERE</> expression as a qualification attached to the + top-level join-tree item, too. So really the join tree represents + both the <literal>FROM</> and <literal>WHERE</> clauses of a <command>SELECT</command>. </Para> </ListItem> </VarListEntry> @@ -250,10 +250,10 @@ <ListItem> <Para> The other parts of the query tree like the <literal>ORDER BY</> - clause aren't of interest here. The rule system - substitutes some entries there while applying rules, but that - doesn't have much to do with the fundamentals of the rule - system. + clause aren't of interest here. The rule system + substitutes some entries there while applying rules, but that + doesn't have much to do with the fundamentals of the rule + system. </Para> </ListItem> </VarListEntry> @@ -322,7 +322,7 @@ CREATE RULE "_RETURN" AS ON SELECT TO myview DO INSTEAD Currently, there can be only one action in an <literal>ON SELECT</> rule, and it must be an unconditional <command>SELECT</> action that is <literal>INSTEAD</>. This restriction was required to make rules safe enough to open them for ordinary users, and - it restricts <literal>ON SELECT</> rules to real view rules. + it restricts <literal>ON SELECT</> rules to act like views. </Para> <Para> @@ -695,29 +695,29 @@ UPDATE t1 SET b = t2.b WHERE t1.a = t2.a; <ItemizedList> <ListItem> - <Para> - The range tables contain entries for the tables <literal>t1</> and <literal>t2</>. - </Para> + <Para> + The range tables contain entries for the tables <literal>t1</> and <literal>t2</>. + </Para> </ListItem> <ListItem> - <Para> - The target lists contain one variable that points to column - <literal>b</> of the range table entry for table <literal>t2</>. - </Para> + <Para> + The target lists contain one variable that points to column + <literal>b</> of the range table entry for table <literal>t2</>. + </Para> </ListItem> <ListItem> - <Para> - The qualification expressions compare the columns <literal>a</> of both - range-table entries for equality. - </Para> + <Para> + The qualification expressions compare the columns <literal>a</> of both + range-table entries for equality. + </Para> </ListItem> <ListItem> - <Para> - The join trees show a simple join between <literal>t1</> and <literal>t2</>. - </Para> + <Para> + The join trees show a simple join between <literal>t1</> and <literal>t2</>. + </Para> </ListItem> </ItemizedList> </para> @@ -860,34 +860,34 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a; <ItemizedList> <ListItem> - <Para> - They are allowed to have no action. - </Para> - </ListItem> + <Para> + They are allowed to have no action. + </Para> + </ListItem> <ListItem> - <Para> - They can have multiple actions. - </Para> - </ListItem> + <Para> + They can have multiple actions. + </Para> + </ListItem> <ListItem> - <Para> - They can be <literal>INSTEAD</> or <literal>ALSO</> (default). - </Para> - </ListItem> + <Para> + They can be <literal>INSTEAD</> or <literal>ALSO</> (default). + </Para> + </ListItem> <ListItem> - <Para> - The pseudorelations <literal>NEW</> and <literal>OLD</> become useful. - </Para> - </ListItem> + <Para> + The pseudorelations <literal>NEW</> and <literal>OLD</> become useful. + </Para> + </ListItem> <ListItem> - <Para> - They can have rule qualifications. - </Para> - </ListItem> + <Para> + They can have rule qualifications. + </Para> + </ListItem> </ItemizedList> Second, they don't modify the query tree in place. Instead they @@ -1875,14 +1875,15 @@ GRANT SELECT ON phone_number TO secretary; </Para> <Para> - For the things that can be implemented by both, - it depends on the usage of the database, which is the best. + For the things that can be implemented by both, which is best + depends on the usage of the database. A trigger is fired for any affected row once. A rule manipulates - the query tree or generates an additional one. So if many + the query or generates an additional query. So if many rows are affected in one statement, a rule issuing one extra - command would usually do a better job than a trigger that is + command is likely to be faster than a trigger that is called for every single row and must execute its operations - many times. + many times. However, the trigger approach is conceptually far + simpler than the rule approach, and is easier for novices to get right. </Para> <Para> diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml index fdecf5483ba171c44d5e8bb30e9a06ecdb9293b3..f7fa39b80207b58a37540ae1e9b22a64d7d0dd89 100644 --- a/doc/src/sgml/trigger.sgml +++ b/doc/src/sgml/trigger.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere Exp $ +$PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.39 2004/12/30 03:13:56 tgl Exp $ --> <chapter id="triggers"> @@ -58,6 +58,15 @@ $PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere E respectively. </para> + <para> + Statement-level <quote>before</> triggers naturally fire before the + statement starts to do anything, while statement-level <quote>after</> + triggers fire at the very end of the statement. Row-level <quote>before</> + triggers fire immediately before a particular row is operated on, + while row-level <quote>after</> triggers fire at the end of the statement + (but before any statement-level <quote>after</> triggers). + </para> + <para> Trigger functions invoked by per-statement triggers should always return <symbol>NULL</symbol>. Trigger functions invoked by per-row @@ -110,6 +119,21 @@ $PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere E triggers are not fired. </para> + <para> + Typically, row before triggers are used for checking or + modifying the data that will be inserted or updated. For example, + a before trigger might be used to insert the current time into a + timestamp column, or to check that two elements of the row are + consistent. Row after triggers are most sensibly + used to propagate the updates to other tables, or make consistency + checks against other tables. The reason for this division of labor is + that an after trigger can be certain it is seeing the final value of the + row, while a before trigger cannot; there might be other before triggers + firing after it. If you have no specific reason to make a trigger before + or after, the before case is more efficient, since the information about + the operation doesn't have to be saved until end of statement. + </para> + <para> If a trigger function executes SQL commands then these commands may fire triggers again. This is known as cascading @@ -140,6 +164,20 @@ $PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere E trigger. </para> + <para> + Each programming language that supports triggers has its own method + for making the trigger input data available to the trigger function. + This input data includes the type of trigger event (e.g., + <command>INSERT</command> or <command>UPDATE</command>) as well as any + arguments that were listed in <command>CREATE TRIGGER</>. + For a row-level trigger, the input data also includes the + <varname>NEW</varname> row for <command>INSERT</command> and + <command>UPDATE</command> triggers, and/or the <varname>OLD</varname> row + for <command>UPDATE</command> and <command>DELETE</command> triggers. + Statement-level triggers do not currently have any way to examine the + individual row(s) modified by the statement. + </para> + </sect1> <sect1 id="trigger-datachanges"> @@ -277,73 +315,73 @@ typedef struct TriggerData <term><structfield>tg_event</></term> <listitem> <para> - Describes the event for which the function is called. You may use the - following macros to examine <literal>tg_event</literal>: - - <variablelist> - <varlistentry> - <term><literal>TRIGGER_FIRED_BEFORE(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger fired before the operation. - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term><literal>TRIGGER_FIRED_AFTER(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger fired after the operation. - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term><literal>TRIGGER_FIRED_FOR_ROW(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger fired for a row-level event. - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term><literal>TRIGGER_FIRED_FOR_STATEMENT(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger fired for a statement-level event. - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term><literal>TRIGGER_FIRED_BY_INSERT(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger was fired by an <command>INSERT</command> command. - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term><literal>TRIGGER_FIRED_BY_UPDATE(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger was fired by an <command>UPDATE</command> command. - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term><literal>TRIGGER_FIRED_BY_DELETE(tg_event)</literal></term> - <listitem> - <para> - Returns true if the trigger was fired by a <command>DELETE</command> command. - </para> - </listitem> - </varlistentry> - </variablelist> + Describes the event for which the function is called. You may use the + following macros to examine <literal>tg_event</literal>: + + <variablelist> + <varlistentry> + <term><literal>TRIGGER_FIRED_BEFORE(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger fired before the operation. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>TRIGGER_FIRED_AFTER(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger fired after the operation. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>TRIGGER_FIRED_FOR_ROW(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger fired for a row-level event. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>TRIGGER_FIRED_FOR_STATEMENT(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger fired for a statement-level event. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>TRIGGER_FIRED_BY_INSERT(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger was fired by an <command>INSERT</command> command. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>TRIGGER_FIRED_BY_UPDATE(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger was fired by an <command>UPDATE</command> command. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>TRIGGER_FIRED_BY_DELETE(tg_event)</literal></term> + <listitem> + <para> + Returns true if the trigger was fired by a <command>DELETE</command> command. + </para> + </listitem> + </varlistentry> + </variablelist> </para> </listitem> </varlistentry> @@ -352,15 +390,15 @@ typedef struct TriggerData <term><structfield>tg_relation</></term> <listitem> <para> - A pointer to a structure describing the relation that the trigger fired for. - Look at <filename>utils/rel.h</> for details about - this structure. The most interesting things are - <literal>tg_relation->rd_att</> (descriptor of the relation - tuples) and <literal>tg_relation->rd_rel->relname</> - (relation name; the type is not <type>char*</> but - <type>NameData</>; use - <literal>SPI_getrelname(tg_relation)</> to get a <type>char*</> if you - need a copy of the name). + A pointer to a structure describing the relation that the trigger fired for. + Look at <filename>utils/rel.h</> for details about + this structure. The most interesting things are + <literal>tg_relation->rd_att</> (descriptor of the relation + tuples) and <literal>tg_relation->rd_rel->relname</> + (relation name; the type is not <type>char*</> but + <type>NameData</>; use + <literal>SPI_getrelname(tg_relation)</> to get a <type>char*</> if you + need a copy of the name). </para> </listitem> </varlistentry> @@ -369,13 +407,13 @@ typedef struct TriggerData <term><structfield>tg_trigtuple</></term> <listitem> <para> - A pointer to the row for which the trigger was fired. This is - the row being inserted, updated, or deleted. If this trigger - was fired for an <command>INSERT</command> or - <command>DELETE</command> then this is what you should return - to from the function if you don't want to replace the row with - a different one (in the case of <command>INSERT</command>) or - skip the operation. + A pointer to the row for which the trigger was fired. This is + the row being inserted, updated, or deleted. If this trigger + was fired for an <command>INSERT</command> or + <command>DELETE</command> then this is what you should return + from the function if you don't want to replace the row with + a different one (in the case of <command>INSERT</command>) or + skip the operation. </para> </listitem> </varlistentry> @@ -384,13 +422,13 @@ typedef struct TriggerData <term><structfield>tg_newtuple</></term> <listitem> <para> - A pointer to the new version of the row, if the trigger was - fired for an <command>UPDATE</command>, and <symbol>NULL</> if - it is for an <command>INSERT</command> or a - <command>DELETE</command>. This is what you have to return - from the function if the event is an <command>UPDATE</command> - and you don't want to replace this row by a different one or - skip the operation. + A pointer to the new version of the row, if the trigger was + fired for an <command>UPDATE</command>, and <symbol>NULL</> if + it is for an <command>INSERT</command> or a + <command>DELETE</command>. This is what you have to return + from the function if the event is an <command>UPDATE</command> + and you don't want to replace this row by a different one or + skip the operation. </para> </listitem> </varlistentry> @@ -399,8 +437,8 @@ typedef struct TriggerData <term><structfield>tg_trigger</></term> <listitem> <para> - A pointer to a structure of type <structname>Trigger</>, - defined in <filename>utils/rel.h</>: + A pointer to a structure of type <structname>Trigger</>, + defined in <filename>utils/rel.h</>: <programlisting> typedef struct Trigger diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index 4a2dac06e0945fb5b9bf186f04bea5bbbfea5f01..649fd0e4da025e4704ab207548bd4572ced9ab1a 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp $ +$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.91 2004/12/30 03:13:56 tgl Exp $ --> <sect1 id="xfunc"> @@ -24,7 +24,7 @@ $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp <listitem> <para> procedural language functions (functions written in, for - example, <application>PL/Tcl</> or <application>PL/pgSQL</>) + example, <application>PL/pgSQL</> or <application>PL/Tcl</>) (<xref linkend="xfunc-pl">) </para> </listitem> @@ -44,9 +44,10 @@ $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp <para> Every kind of function can take base types, composite types, or - combinations of these as arguments (parameters). In addition, + combinations of these as arguments (parameters). In addition, every kind of function can return a base type or - a composite type. + a composite type. Functions may also be defined to return + sets of base or composite values. </para> <para> @@ -64,7 +65,8 @@ $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp <para> Throughout this chapter, it can be useful to look at the reference - page of the <xref linkend="sql-createfunction"> command to + page of the <xref linkend="sql-createfunction" + endterm="sql-createfunction-title"> command to understand the examples better. Some examples from this chapter can be found in <filename>funcs.sql</filename> and <filename>funcs.c</filename> in the <filename>src/tutorial</> @@ -141,7 +143,7 @@ CREATE FUNCTION one() RETURNS integer AS $$ SELECT 1 AS result; $$ LANGUAGE SQL; --- Alternative syntax: +-- Alternative syntax for string literal: CREATE FUNCTION one() RETURNS integer AS ' SELECT 1 AS result; ' LANGUAGE SQL; @@ -335,16 +337,16 @@ $$ LANGUAGE SQL; <itemizedlist> <listitem> <para> - The select list order in the query must be exactly the same as - that in which the columns appear in the table associated - with the composite type. (Naming the columns, as we did above, - is irrelevant to the system.) + The select list order in the query must be exactly the same as + that in which the columns appear in the table associated + with the composite type. (Naming the columns, as we did above, + is irrelevant to the system.) </para> </listitem> <listitem> <para> - You must typecast the expressions to match the - definition of the composite type, or you will get errors like this: + You must typecast the expressions to match the + definition of the composite type, or you will get errors like this: <screen> <computeroutput> ERROR: function declared to return emp returns varchar instead of text at column 1 @@ -356,15 +358,9 @@ ERROR: function declared to return emp returns varchar instead of text at colum </para> <para> - A function that returns a row (composite type) can be used as a table - function, as described below. It can also be called in the context - of an SQL expression, but only when you - extract a single attribute out of the row or pass the entire row into - another function that accepts the same composite type. - </para> - - <para> - This is an example of extracting an attribute out of a row type: + When you call a function that returns a row (composite type) in a + SQL expression, you might want only one field (attribute) from its + result. You can do that with syntax like this: <screen> SELECT (new_emp()).name; @@ -374,11 +370,14 @@ SELECT (new_emp()).name; None </screen> - We need the extra parentheses to keep the parser from getting confused: + The extra parentheses are needed to keep the parser from getting + confused. If you try to do it without them, you get something like this: <screen> SELECT new_emp().name; ERROR: syntax error at or near "." at character 17 +LINE 1: SELECT new_emp().name; + ^ </screen> </para> @@ -412,9 +411,8 @@ SELECT name(emp) AS youngster </para> <para> - The other way to use a function returning a row result is to declare a - second function accepting a row type argument and pass the - result of the first function to it: + Another way to use a function returning a row result is to pass the + result to another function that accepts the correct row type as input: <screen> CREATE FUNCTION getname(emp) RETURNS text AS $$ @@ -428,6 +426,11 @@ SELECT getname(new_emp()); (1 row) </screen> </para> + + <para> + Another way to use a function that returns a composite type is to + call it as a table function, as described below. + </para> </sect2> <sect2> @@ -469,7 +472,7 @@ SELECT *, upper(fooname) FROM getfoo(1) AS t1; <para> Note that we only got one row out of the function. This is because - we did not use <literal>SETOF</>. This is described in the next section. + we did not use <literal>SETOF</>. That is described in the next section. </para> </sect2> @@ -598,7 +601,7 @@ ERROR: could not determine "anyarray"/"anyelement" type because input has type </para> <para> - It is permitted to have polymorphic arguments with a deterministic + It is permitted to have polymorphic arguments with a fixed return type, but the converse is not. For example: <screen> CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$ @@ -621,6 +624,201 @@ DETAIL: A function returning "anyarray" or "anyelement" must have at least one </sect2> </sect1> + <sect1 id="xfunc-overload"> + <title>Function Overloading</title> + + <indexterm zone="xfunc-overload"> + <primary>overloading</primary> + <secondary>functions</secondary> + </indexterm> + + <para> + More than one function may be defined with the same SQL name, so long + as the arguments they take are different. In other words, + function names can be <firstterm>overloaded</firstterm>. When a + query is executed, the server will determine which function to + call from the data types and the number of the provided arguments. + Overloading can also be used to simulate functions with a variable + number of arguments, up to a finite maximum number. + </para> + + <para> + When creating a family of overloaded functions, one should be + careful not to create ambiguities. For instance, given the + functions +<programlisting> +CREATE FUNCTION test(int, real) RETURNS ... +CREATE FUNCTION test(smallint, double precision) RETURNS ... +</programlisting> + it is not immediately clear which function would be called with + some trivial input like <literal>test(1, 1.5)</literal>. The + currently implemented resolution rules are described in + <xref linkend="typeconv">, but it is unwise to design a system that subtly + relies on this behavior. + </para> + + <para> + A function that takes a single argument of a composite type should + generally not have the same name as any attribute (field) of that type. + Recall that <literal>attribute(table)</literal> is considered equivalent + to <literal>table.attribute</literal>. In the case that there is an + ambiguity between a function on a composite type and an attribute of + the composite type, the attribute will always be used. It is possible + to override that choice by schema-qualifying the function name + (that is, <literal>schema.func(table)</literal>) but it's better to + avoid the problem by not choosing conflicting names. + </para> + + <para> + When overloading C-language functions, there is an additional + constraint: The C name of each function in the family of + overloaded functions must be different from the C names of all + other functions, either internal or dynamically loaded. If this + rule is violated, the behavior is not portable. You might get a + run-time linker error, or one of the functions will get called + (usually the internal one). The alternative form of the + <literal>AS</> clause for the SQL <command>CREATE + FUNCTION</command> command decouples the SQL function name from + the function name in the C source code. For instance, +<programlisting> +CREATE FUNCTION test(int) RETURNS int + AS '<replaceable>filename</>', 'test_1arg' + LANGUAGE C; +CREATE FUNCTION test(int, int) RETURNS int + AS '<replaceable>filename</>', 'test_2arg' + LANGUAGE C; +</programlisting> + The names of the C functions here reflect one of many possible conventions. + </para> + </sect1> + + <sect1 id="xfunc-volatility"> + <title>Function Volatility Categories</title> + + <indexterm zone="xfunc-volatility"> + <primary>volatility</primary> + <secondary>functions</secondary> + </indexterm> + + <para> + Every function has a <firstterm>volatility</> classification, with + the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or + <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the + <command>CREATE FUNCTION</command> command does not specify a category. + The volatility category is a promise to the optimizer about the behavior + of the function: + + <itemizedlist> + <listitem> + <para> + A <literal>VOLATILE</> function can do anything, including modifying + the database. It can return different results on successive calls with + the same arguments. The optimizer makes no assumptions about the + behavior of such functions. A query using a volatile function will + re-evaluate the function at every row where its value is needed. + </para> + </listitem> + <listitem> + <para> + A <literal>STABLE</> function cannot modify the database and is + guaranteed to return the same results given the same arguments + for all calls within a single surrounding query. This category + allows the optimizer to optimize away multiple calls of the function + within a single query. In particular, it is safe to use an expression + containing such a function in an index scan condition. (Since an + index scan will evaluate the comparison value only once, not once at + each row, it is not valid to use a <literal>VOLATILE</> function in + an index scan condition.) + </para> + </listitem> + <listitem> + <para> + An <literal>IMMUTABLE</> function cannot modify the database and is + guaranteed to return the same results given the same arguments forever. + This category allows the optimizer to pre-evaluate the function when + a query calls it with constant arguments. For example, a query like + <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to + <literal>SELECT ... WHERE x = 4</>, because the function underlying + the integer addition operator is marked <literal>IMMUTABLE</>. + </para> + </listitem> + </itemizedlist> + </para> + + <para> + For best optimization results, you should label your functions with the + strictest volatility category that is valid for them. + </para> + + <para> + Any function with side-effects <emphasis>must</> be labeled + <literal>VOLATILE</>, so that calls to it cannot be optimized away. + Even a function with no side-effects needs to be labeled + <literal>VOLATILE</> if its value can change within a single query; + some examples are <literal>random()</>, <literal>currval()</>, + <literal>timeofday()</>. + </para> + + <para> + There is relatively little difference between <literal>STABLE</> and + <literal>IMMUTABLE</> categories when considering simple interactive + queries that are planned and immediately executed: it doesn't matter + a lot whether a function is executed once during planning or once during + query execution startup. But there is a big difference if the plan is + saved and reused later. Labeling a function <literal>IMMUTABLE</> when + it really isn't may allow it to be prematurely folded to a constant during + planning, resulting in a stale value being re-used during subsequent uses + of the plan. This is a hazard when using prepared statements or when + using function languages that cache plans (such as + <application>PL/pgSQL</>). + </para> + + <para> + Because of the snapshotting behavior of MVCC (see <xref linkend="mvcc">) + a function containing only <command>SELECT</> commands can safely be + marked <literal>STABLE</>, even if it selects from tables that might be + undergoing modifications by concurrent queries. + <productname>PostgreSQL</productname> will execute a <literal>STABLE</> + function using the snapshot established for the calling query, and so it + will see a fixed view of the database throughout that query. + Also note + that the <function>current_timestamp</> family of functions qualify + as stable, since their values do not change within a transaction. + </para> + + <para> + The same snapshotting behavior is used for <command>SELECT</> commands + within <literal>IMMUTABLE</> functions. It is generally unwise to select + from database tables within an <literal>IMMUTABLE</> function at all, + since the immutability will be broken if the table contents ever change. + However, <productname>PostgreSQL</productname> does not enforce that you + do not do that. + </para> + + <para> + A common error is to label a function <literal>IMMUTABLE</> when its + results depend on a configuration parameter. For example, a function + that manipulates timestamps might well have results that depend on the + <xref linkend="guc-timezone"> setting. For safety, such functions should + be labeled <literal>STABLE</> instead. + </para> + + <note> + <para> + Before <productname>PostgreSQL</productname> release 8.0, the requirement + that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify + the database was not enforced by the system. Release 8.0 enforces it + by requiring SQL functions and procedural language functions of these + categories to contain no SQL commands other than <command>SELECT</>. + (This is not a completely bulletproof test, since such functions could + still call <literal>VOLATILE</> functions that modify the database. + If you do that, you will find that the <literal>STABLE</> or + <literal>IMMUTABLE</> function does not notice the database changes + applied by the called function.) + </para> + </note> + </sect1> + <sect1 id="xfunc-pl"> <title>Procedural Language Functions</title> @@ -754,7 +952,7 @@ CREATE FUNCTION square_root(double precision) RETURNS double precision <para> If the name starts with the string <literal>$libdir</literal>, that part is replaced by the <productname>PostgreSQL</> package - library directory + library directory name, which is determined at build time.<indexterm><primary>$libdir</></> </para> </listitem> @@ -864,17 +1062,17 @@ CREATE FUNCTION square_root(double precision) RETURNS double precision <itemizedlist> <listitem> <para> - pass by value, fixed-length + pass by value, fixed-length </para> </listitem> <listitem> <para> - pass by reference, fixed-length + pass by reference, fixed-length </para> </listitem> <listitem> <para> - pass by reference, variable-length + pass by reference, variable-length </para> </listitem> </itemizedlist> @@ -993,169 +1191,169 @@ memcpy(destination->data, buffer, 40); <title>Equivalent C Types for Built-In SQL Types</title> <tgroup cols="3"> <thead> - <row> - <entry> - SQL Type - </entry> - <entry> - C Type - </entry> - <entry> - Defined In - </entry> - </row> + <row> + <entry> + SQL Type + </entry> + <entry> + C Type + </entry> + <entry> + Defined In + </entry> + </row> </thead> <tbody> - <row> - <entry><type>abstime</type></entry> - <entry><type>AbsoluteTime</type></entry> - <entry><filename>utils/nabstime.h</filename></entry> - </row> - <row> - <entry><type>boolean</type></entry> - <entry><type>bool</type></entry> - <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry> - </row> - <row> - <entry><type>box</type></entry> - <entry><type>BOX*</type></entry> - <entry><filename>utils/geo_decls.h</filename></entry> - </row> - <row> - <entry><type>bytea</type></entry> - <entry><type>bytea*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>"char"</type></entry> - <entry><type>char</type></entry> - <entry>(compiler built-in)</entry> - </row> - <row> - <entry><type>character</type></entry> - <entry><type>BpChar*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>cid</type></entry> - <entry><type>CommandId</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>date</type></entry> - <entry><type>DateADT</type></entry> - <entry><filename>utils/date.h</filename></entry> - </row> - <row> - <entry><type>smallint</type> (<type>int2</type>)</entry> - <entry><type>int2</type> or <type>int16</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>int2vector</type></entry> - <entry><type>int2vector*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>integer</type> (<type>int4</type>)</entry> - <entry><type>int4</type> or <type>int32</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>real</type> (<type>float4</type>)</entry> - <entry><type>float4*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>double precision</type> (<type>float8</type>)</entry> - <entry><type>float8*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>interval</type></entry> - <entry><type>Interval*</type></entry> - <entry><filename>utils/timestamp.h</filename></entry> - </row> - <row> - <entry><type>lseg</type></entry> - <entry><type>LSEG*</type></entry> - <entry><filename>utils/geo_decls.h</filename></entry> - </row> - <row> - <entry><type>name</type></entry> - <entry><type>Name</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>oid</type></entry> - <entry><type>Oid</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>oidvector</type></entry> - <entry><type>oidvector*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>path</type></entry> - <entry><type>PATH*</type></entry> - <entry><filename>utils/geo_decls.h</filename></entry> - </row> - <row> - <entry><type>point</type></entry> - <entry><type>POINT*</type></entry> - <entry><filename>utils/geo_decls.h</filename></entry> - </row> - <row> - <entry><type>regproc</type></entry> - <entry><type>regproc</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>reltime</type></entry> - <entry><type>RelativeTime</type></entry> - <entry><filename>utils/nabstime.h</filename></entry> - </row> - <row> - <entry><type>text</type></entry> - <entry><type>text*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>tid</type></entry> - <entry><type>ItemPointer</type></entry> - <entry><filename>storage/itemptr.h</filename></entry> - </row> - <row> - <entry><type>time</type></entry> - <entry><type>TimeADT</type></entry> - <entry><filename>utils/date.h</filename></entry> - </row> - <row> - <entry><type>time with time zone</type></entry> - <entry><type>TimeTzADT</type></entry> - <entry><filename>utils/date.h</filename></entry> - </row> - <row> - <entry><type>timestamp</type></entry> - <entry><type>Timestamp*</type></entry> - <entry><filename>utils/timestamp.h</filename></entry> - </row> - <row> - <entry><type>tinterval</type></entry> - <entry><type>TimeInterval</type></entry> - <entry><filename>utils/nabstime.h</filename></entry> - </row> - <row> - <entry><type>varchar</type></entry> - <entry><type>VarChar*</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> - <row> - <entry><type>xid</type></entry> - <entry><type>TransactionId</type></entry> - <entry><filename>postgres.h</filename></entry> - </row> + <row> + <entry><type>abstime</type></entry> + <entry><type>AbsoluteTime</type></entry> + <entry><filename>utils/nabstime.h</filename></entry> + </row> + <row> + <entry><type>boolean</type></entry> + <entry><type>bool</type></entry> + <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry> + </row> + <row> + <entry><type>box</type></entry> + <entry><type>BOX*</type></entry> + <entry><filename>utils/geo_decls.h</filename></entry> + </row> + <row> + <entry><type>bytea</type></entry> + <entry><type>bytea*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>"char"</type></entry> + <entry><type>char</type></entry> + <entry>(compiler built-in)</entry> + </row> + <row> + <entry><type>character</type></entry> + <entry><type>BpChar*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>cid</type></entry> + <entry><type>CommandId</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>date</type></entry> + <entry><type>DateADT</type></entry> + <entry><filename>utils/date.h</filename></entry> + </row> + <row> + <entry><type>smallint</type> (<type>int2</type>)</entry> + <entry><type>int2</type> or <type>int16</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>int2vector</type></entry> + <entry><type>int2vector*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>integer</type> (<type>int4</type>)</entry> + <entry><type>int4</type> or <type>int32</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>real</type> (<type>float4</type>)</entry> + <entry><type>float4*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>double precision</type> (<type>float8</type>)</entry> + <entry><type>float8*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>interval</type></entry> + <entry><type>Interval*</type></entry> + <entry><filename>utils/timestamp.h</filename></entry> + </row> + <row> + <entry><type>lseg</type></entry> + <entry><type>LSEG*</type></entry> + <entry><filename>utils/geo_decls.h</filename></entry> + </row> + <row> + <entry><type>name</type></entry> + <entry><type>Name</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>oid</type></entry> + <entry><type>Oid</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>oidvector</type></entry> + <entry><type>oidvector*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>path</type></entry> + <entry><type>PATH*</type></entry> + <entry><filename>utils/geo_decls.h</filename></entry> + </row> + <row> + <entry><type>point</type></entry> + <entry><type>POINT*</type></entry> + <entry><filename>utils/geo_decls.h</filename></entry> + </row> + <row> + <entry><type>regproc</type></entry> + <entry><type>regproc</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>reltime</type></entry> + <entry><type>RelativeTime</type></entry> + <entry><filename>utils/nabstime.h</filename></entry> + </row> + <row> + <entry><type>text</type></entry> + <entry><type>text*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>tid</type></entry> + <entry><type>ItemPointer</type></entry> + <entry><filename>storage/itemptr.h</filename></entry> + </row> + <row> + <entry><type>time</type></entry> + <entry><type>TimeADT</type></entry> + <entry><filename>utils/date.h</filename></entry> + </row> + <row> + <entry><type>time with time zone</type></entry> + <entry><type>TimeTzADT</type></entry> + <entry><filename>utils/date.h</filename></entry> + </row> + <row> + <entry><type>timestamp</type></entry> + <entry><type>Timestamp*</type></entry> + <entry><filename>utils/timestamp.h</filename></entry> + </row> + <row> + <entry><type>tinterval</type></entry> + <entry><type>TimeInterval</type></entry> + <entry><filename>utils/nabstime.h</filename></entry> + </row> + <row> + <entry><type>varchar</type></entry> + <entry><type>VarChar*</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> + <row> + <entry><type>xid</type></entry> + <entry><type>TransactionId</type></entry> + <entry><filename>postgres.h</filename></entry> + </row> </tbody> </tgroup> </table> @@ -1567,9 +1765,9 @@ concat_text(PG_FUNCTION_ARGS) <listitem> <para> Always zero the bytes of your structures using - <function>memset</function>. Without this, it's difficult to - support hash indexes or hash joins, as you must pick out only - the significant bits of your data structure to compute a hash. + <function>memset</function>. Without this, it's difficult to + support hash indexes or hash joins, as you must pick out only + the significant bits of your data structure to compute a hash. Even if you initialize all fields of your structure, there may be alignment padding (holes in the structure) that may contain garbage values. @@ -1618,7 +1816,7 @@ concat_text(PG_FUNCTION_ARGS) &dfunc; <sect2 id="xfunc-c-pgxs"> - <title>Extension build infrastructure</title> + <title>Extension Building Infrastructure</title> <indexterm zone="xfunc-c-pgxs"> <primary>pgxs</primary> @@ -1868,14 +2066,14 @@ c_overpaid(PG_FUNCTION_ARGS) HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0); int32 limit = PG_GETARG_INT32(1); bool isnull; - int32 salary; + Datum salary; - salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull)); + salary = GetAttributeByName(t, "salary", &isnull); if (isnull) PG_RETURN_BOOL(false); /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */ - PG_RETURN_BOOL(salary > limit); + PG_RETURN_BOOL(DatumGetInt32(salary) > limit); } </programlisting> </para> @@ -1890,7 +2088,10 @@ c_overpaid(PG_FUNCTION_ARGS) return parameter that tells whether the attribute is null. <function>GetAttributeByName</function> returns a <type>Datum</type> value that you can convert to the proper data type by using the - appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function> macro. + appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function> + macro. Note that the return value is meaningless if the null flag is + set; always check the null flag before trying to do anything with the + result. </para> <para> @@ -2222,7 +2423,7 @@ testpassbyval(PG_FUNCTION_ARGS) /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { - MemoryContext oldcontext; + MemoryContext oldcontext; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); @@ -2393,196 +2594,6 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray </sect2> </sect1> - <sect1 id="xfunc-overload"> - <title>Function Overloading</title> - - <indexterm zone="xfunc-overload"> - <primary>overloading</primary> - <secondary>functions</secondary> - </indexterm> - - <para> - More than one function may be defined with the same SQL name, so long - as the arguments they take are different. In other words, - function names can be <firstterm>overloaded</firstterm>. When a - query is executed, the server will determine which function to - call from the data types and the number of the provided arguments. - Overloading can also be used to simulate functions with a variable - number of arguments, up to a finite maximum number. - </para> - - <para> - When creating a family of overloaded functions, one should be - careful not to create ambiguities. For instance, given the - functions -<programlisting> -CREATE FUNCTION test(int, real) RETURNS ... -CREATE FUNCTION test(smallint, double precision) RETURNS ... -</programlisting> - it is not immediately clear which function would be called with - some trivial input like <literal>test(1, 1.5)</literal>. The - currently implemented resolution rules are described in - <xref linkend="typeconv">, but it is unwise to design a system that subtly - relies on this behavior. - </para> - - <para> - A function that takes a single argument of a composite type should - generally not have the same name as any attribute (field) of that type. - Recall that <literal>attribute(table)</literal> is considered equivalent - to <literal>table.attribute</literal>. In the case that there is an - ambiguity between a function on a composite type and an attribute of - the composite type, the attribute will always be used. It is possible - to override that choice by schema-qualifying the function name - (that is, <literal>schema.func(table)</literal>) but it's better to - avoid the problem by not choosing conflicting names. - </para> - - <para> - When overloading C-language functions, there is an additional - constraint: The C name of each function in the family of - overloaded functions must be different from the C names of all - other functions, either internal or dynamically loaded. If this - rule is violated, the behavior is not portable. You might get a - run-time linker error, or one of the functions will get called - (usually the internal one). The alternative form of the - <literal>AS</> clause for the SQL <command>CREATE - FUNCTION</command> command decouples the SQL function name from - the function name in the C source code. For instance, -<programlisting> -CREATE FUNCTION test(int) RETURNS int - AS '<replaceable>filename</>', 'test_1arg' - LANGUAGE C; -CREATE FUNCTION test(int, int) RETURNS int - AS '<replaceable>filename</>', 'test_2arg' - LANGUAGE C; -</programlisting> - The names of the C functions here reflect one of many possible conventions. - </para> - </sect1> - - <sect1 id="xfunc-volatility"> - <title>Function Volatility Categories</title> - - <indexterm zone="xfunc-volatility"> - <primary>volatility</primary> - <secondary>functions</secondary> - </indexterm> - - <para> - Every function has a <firstterm>volatility</> classification, with - the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or - <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the - <command>CREATE FUNCTION</command> command does not specify a category. - The volatility category is a promise to the optimizer about the behavior - of the function: - - <itemizedlist> - <listitem> - <para> - A <literal>VOLATILE</> function can do anything, including modifying - the database. It can return different results on successive calls with - the same arguments. The optimizer makes no assumptions about the - behavior of such functions. A query using a volatile function will - re-evaluate the function at every row where its value is needed. - </para> - </listitem> - <listitem> - <para> - A <literal>STABLE</> function cannot modify the database and is - guaranteed to return the same results given the same arguments - for all calls within a single surrounding query. This category - allows the optimizer to optimize away multiple calls of the function - within a single query. In particular, it is safe to use an expression - containing such a function in an index scan condition. (Since an - index scan will evaluate the comparison value only once, not once at - each row, it is not valid to use a <literal>VOLATILE</> function in - an index scan condition.) - </para> - </listitem> - <listitem> - <para> - An <literal>IMMUTABLE</> function cannot modify the database and is - guaranteed to return the same results given the same arguments forever. - This category allows the optimizer to pre-evaluate the function when - a query calls it with constant arguments. For example, a query like - <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to - <literal>SELECT ... WHERE x = 4</>, because the function underlying - the integer addition operator is marked <literal>IMMUTABLE</>. - </para> - </listitem> - </itemizedlist> - </para> - - <para> - For best optimization results, you should label your functions with the - strictest volatility category that is valid for them. - </para> - - <para> - Any function with side-effects <emphasis>must</> be labeled - <literal>VOLATILE</>, so that calls to it cannot be optimized away. - Even a function with no side-effects needs to be labeled - <literal>VOLATILE</> if its value can change within a single query; - some examples are <literal>random()</>, <literal>currval()</>, - <literal>timeofday()</>. - </para> - - <para> - There is relatively little difference between <literal>STABLE</> and - <literal>IMMUTABLE</> categories when considering simple interactive - queries that are planned and immediately executed: it doesn't matter - a lot whether a function is executed once during planning or once during - query execution startup. But there is a big difference if the plan is - saved and reused later. Labeling a function <literal>IMMUTABLE</> when - it really isn't may allow it to be prematurely folded to a constant during - planning, resulting in a stale value being re-used during subsequent uses - of the plan. This is a hazard when using prepared statements or when - using function languages that cache plans (such as - <application>PL/pgSQL</>). - </para> - - <para> - Because of the snapshotting behavior of MVCC (see <xref linkend="mvcc">) - a function containing only <command>SELECT</> commands can safely be - marked <literal>STABLE</>, even if it selects from tables that might be - undergoing modifications by concurrent queries. - <productname>PostgreSQL</productname> will execute a <literal>STABLE</> - function using the snapshot established for the calling query, and so it - will see a fixed view of the database throughout that query. - Also note - that the <function>current_timestamp</> family of functions qualify - as stable, since their values do not change within a transaction. - </para> - - <para> - The same snapshotting behavior is used for <command>SELECT</> commands - within <literal>IMMUTABLE</> functions. It is generally unwise to select - from database tables within an <literal>IMMUTABLE</> function at all, - since the immutability will be broken if the table contents ever change. - However, <productname>PostgreSQL</productname> does not enforce that you - do not do that. - </para> - - <para> - A common error is to label a function <literal>IMMUTABLE</> when its - results depend on a configuration parameter. For example, a function - that manipulates timestamps might well have results that depend on the - <xref linkend="guc-timezone"> setting. For safety, such functions should - be labeled <literal>STABLE</> instead. - </para> - - <note> - <para> - Before <productname>PostgreSQL</productname> release 8.0, the requirement - that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify - the database was not enforced by the system. Release 8.0 enforces it - by requiring SQL functions and procedural language functions of these - categories to contain no SQL commands other than <command>SELECT</>. - </para> - </note> - </sect1> - <!-- Keep this comment at the end of the file Local variables: mode:sgml diff --git a/doc/src/sgml/xplang.sgml b/doc/src/sgml/xplang.sgml index b8355d58f3c170b1259d0a0609611196c46b7492..88d59d8577a1d8fcff232515902630b1a918b35d 100644 --- a/doc/src/sgml/xplang.sgml +++ b/doc/src/sgml/xplang.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/xplang.sgml,v 1.26 2003/11/29 19:51:38 pgsql Exp $ +$PostgreSQL: pgsql/doc/src/sgml/xplang.sgml,v 1.27 2004/12/30 03:13:56 tgl Exp $ --> <chapter id="xplang"> @@ -29,10 +29,16 @@ $PostgreSQL: pgsql/doc/src/sgml/xplang.sgml,v 1.26 2003/11/29 19:51:38 pgsql Exp <para> Writing a handler for a new procedural language is described in <xref linkend="plhandler">. Several procedural languages are - available in the standard <productname>PostgreSQL</productname> + available in the core <productname>PostgreSQL</productname> distribution, which can serve as examples. </para> + <para> + There are additional procedural languages available that are not + included in the core distribution. <xref linkend="external-projects"> + has information about finding them. + </para> + <sect1 id="xplang-install"> <title>Installing Procedural Languages</title>