diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3ae7d241e7bc17868e928747e1781d104911c522..727b00f0eafe5125b111fe6198dad62dddeb0a56 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.37 2005/01/09 17:47:30 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.38 2005/01/17 01:29:02 tgl Exp $ -->
 
 <chapter id="ddl">
 <title>Data Definition</title>
@@ -163,198 +163,6 @@ DROP TABLE products;
 </para>
 </sect1>
 
- <sect1 id="ddl-system-columns">
- <title>System Columns</title>
-
- <para>
- Every table has several <firstterm>system columns</> that are
- implicitly defined by the system. Therefore, these names cannot be
- used as names of user-defined columns. (Note that these
- restrictions are separate from whether the name is a key word or
- not; quoting a name will not allow you to escape these
- restrictions.) You do not really need to be concerned about these
- columns, just know they exist.
- </para>
-
- <indexterm>
- <primary>column</primary>
- <secondary>system column</secondary>
- </indexterm>
-
- <variablelist>
- <varlistentry>
- <term><structfield>oid</></term>
- <listitem>
- <para>
- <indexterm>
- <primary>OID</primary>
- <secondary>column</secondary>
- </indexterm>
- The object identifier (object ID) of a row. This is a serial
- number that is automatically added by
- <productname>PostgreSQL</productname> to all table rows (unless
- the table was created using <literal>WITHOUT OIDS</literal>, in which
- case this column is not present). This column is of type
- <type>oid</type> (same name as the column); see <xref
- linkend="datatype-oid"> for more information about the type.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><structfield>tableoid</></term>
- <listitem>
- <indexterm>
- <primary>tableoid</primary>
- </indexterm>
-
- <para>
- The OID of the table containing this row. This column is
- particularly handy for queries that select from inheritance
- hierarchies, since without it, it's difficult to tell which
- individual table a row came from. The
- <structfield>tableoid</structfield> can be joined against the
- <structfield>oid</structfield> column of
- <structname>pg_class</structname> to obtain the table name.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><structfield>xmin</></term>
- <listitem>
- <indexterm>
- <primary>xmin</primary>
- </indexterm>
-
- <para>
- The identity (transaction ID) of the inserting transaction for
- this row version. (A row version is an individual state of a
- row; each update of a row creates a new row version for the same
- logical row.)
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><structfield>cmin</></term>
- <listitem>
- <indexterm>
- <primary>cmin</primary>
- </indexterm>
-
- <para>
- The command identifier (starting at zero) within the inserting
- transaction.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><structfield>xmax</></term>
- <listitem>
- <indexterm>
- <primary>xmax</primary>
- </indexterm>
-
- <para>
- The identity (transaction ID) of the deleting transaction, or
- zero for an undeleted row version. It is possible for this column to
- be nonzero in a visible row version. That usually indicates that the
- deleting transaction hasn't committed yet, or that an attempted
- deletion was rolled back.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><structfield>cmax</></term>
- <listitem>
- <indexterm>
- <primary>cmax</primary>
- </indexterm>
-
- <para>
- The command identifier within the deleting transaction, or zero.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term><structfield>ctid</></term>
- <listitem>
- <indexterm>
- <primary>ctid</primary>
- </indexterm>
-
- <para>
- The physical location of the row version within its table. Note that
- although the <structfield>ctid</structfield> can be used to
- locate the row version very quickly, a row's
- <structfield>ctid</structfield> will change each time it is
- updated or moved by <command>VACUUM FULL</>. Therefore
- <structfield>ctid</structfield> is useless as a long-term row
- identifier. The OID, or even better a user-defined serial
- number, should be used to identify logical rows.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
-
- <para>
- OIDs are 32-bit quantities and are assigned from a single
- cluster-wide counter. In a large or long-lived database, it is
- possible for the counter to wrap around. Hence, it is bad
- practice to assume that OIDs are unique, unless you take steps to
- ensure that this is the case. If you need to identify the rows in
- a table, using a sequence generator is strongly recommended.
- However, OIDs can be used as well, provided that a few additional
- precautions are taken:
-
- <itemizedlist>
- <listitem>
- <para>
- A unique constraint should be created on the OID column of each
- table for which the OID will be used to identify rows.
- </para>
- </listitem>
- <listitem>
- <para>
- OIDs should never be assumed to be unique across tables; use
- the combination of <structfield>tableoid</> and row OID if you
- need a database-wide identifier.
- </para>
- </listitem>
- <listitem>
- <para>
- The tables in question should be created using <literal>WITH
- OIDS</literal> to ensure forward compatibility with future
- releases of <productname>PostgreSQL</productname>. It is
- planned that <literal>WITHOUT OIDS</> will become the default.
- </para>
- </listitem>
- </itemizedlist>
- </para>
-
- <para>
- Transaction identifiers are also 32-bit quantities. In a
- long-lived database it is possible for transaction IDs to wrap
- around. This is not a fatal problem given appropriate maintenance
- procedures; see <xref linkend="maintenance"> for details. It is
- unwise, however, to depend on the uniqueness of transaction IDs
- over the long term (more than one billion transactions).
- </para>
-
- <para>
- Command
- identifiers are also 32-bit quantities. This creates a hard limit
- of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
- within a single transaction. In practice this limit is not a
- problem &mdash; note that the limit is on number of
- <acronym>SQL</acronym> commands, not number of rows processed.
- </para>
- </sect1>
-
 <sect1 id="ddl-default">
 <title>Default Values</title>
 
@@ -391,7 +199,7 @@ CREATE TABLE products (
 </para>
 
 <para>
- The default value may be a scalar expression, which will be
+ The default value may be an expression, which will be
 evaluated whenever the default value is inserted
 (<emphasis>not</emphasis> when the table is created). A common example
 is that a timestamp column may have a default of <literal>now()</>,
@@ -460,9 +268,9 @@ CREATE TABLE products (
 
 <para>
 A check constraint is the most generic constraint type. It allows
- you to specify that the value in a certain column must satisfy an
- arbitrary expression. For instance, to require positive product
- prices, you could use:
+ you to specify that the value in a certain column must satisfy a
+ Boolean (truth-value) expression. For instance, to require positive
+ product prices, you could use:
 <programlisting>
 CREATE TABLE products (
 product_no integer,
@@ -500,7 +308,8 @@ CREATE TABLE products (
 </programlisting>
 So, to specify a named constraint, use the key word
 <literal>CONSTRAINT</literal> followed by an identifier followed
- by the constraint definition.
+ by the constraint definition. (If you don't specify a constraint
+ name in this way, the system chooses a name for you.)
 </para>
 
 <para>
@@ -513,7 +322,7 @@ CREATE TABLE products (
 name text,
 price numeric CHECK (price > 0),
 discounted_price numeric CHECK (discounted_price > 0),
- CHECK (price > discounted_price)
+ <emphasis>CHECK (price > discounted_price)</emphasis>
 );
 </programlisting>
 </para>
@@ -529,9 +338,13 @@ CREATE TABLE products (
 <para>
 We say that the first two constraints are column constraints, whereas the
 third one is a table constraint because it is written separately
- from the column definitions. Column constraints can also be
+ from any one column definition. Column constraints can also be
 written as table constraints, while the reverse is not necessarily
- possible. The above example could also be written as
+ possible, since a column constraint is supposed to refer to only the
+ column it is attached to. (<productname>PostgreSQL</productname> doesn't
+ enforce that rule, but you should follow it if you want your table
+ definitions to work with other database systems.) The above example could
+ also be written as
 <programlisting>
 CREATE TABLE products (
 product_no integer,
@@ -556,6 +369,22 @@ CREATE TABLE products (
 It's a matter of taste.
 </para>
 
+ <para>
+ Names can be assigned to table constraints in just the same way as
+ for column constraints:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric,
+ CHECK (price > 0),
+ discounted_price numeric,
+ CHECK (discounted_price > 0),
+ <emphasis>CONSTRAINT valid_discount</> CHECK (price > discounted_price)
+);
+</programlisting>
+ </para>
+
 <indexterm>
 <primary>null value</primary>
 <secondary sortas="check constraints">with check constraints</secondary>
@@ -564,7 +393,7 @@ CREATE TABLE products (
 <para>
 It should be noted that a check constraint is satisfied if the
 check expression evaluates to true or the null value. Since most
- expressions will evaluate to the null value if one operand is null,
+ expressions will evaluate to the null value if any operand is null,
 they will not prevent null values in the constrained columns. To
 ensure that a column does not contain null values, the not-null
 constraint described in the next section can be used.
@@ -608,7 +437,7 @@ CREATE TABLE products (
 
 <para>
 Of course, a column can have more than one constraint. Just write
- the constraints after one another:
+ the constraints one after another:
 <programlisting>
 CREATE TABLE products (
 product_no integer NOT NULL,
@@ -624,7 +453,7 @@ CREATE TABLE products (
 The <literal>NOT NULL</literal> constraint has an inverse: the
 <literal>NULL</literal> constraint. This does not mean that the
 column must be null, which would surely be useless. Instead, this
- simply defines the default behavior that the column may be null.
+ simply selects the default behavior that the column may be null.
 The <literal>NULL</literal> constraint is not defined in the SQL
 standard and should not be used in portable applications. (It was
 only added to <productname>PostgreSQL</productname> to be
@@ -695,10 +524,13 @@ CREATE TABLE example (
 <emphasis>UNIQUE (a, c)</emphasis>
 );
 </programlisting>
+ This specifies that the combination of values in the indicated columns
+ is unique across the whole table, though any one of the columns
+ need not be (and ordinarily isn't) unique.
 </para>
 
 <para>
- It is also possible to assign names to unique constraints:
+ You can assign your own name for a unique constraint, in the usual way:
 <programlisting>
 CREATE TABLE products (
 product_no integer <emphasis>CONSTRAINT must_be_different</emphasis> UNIQUE,
@@ -857,7 +689,7 @@ CREATE TABLE orders (
 <programlisting>
 CREATE TABLE orders (
 order_id integer PRIMARY KEY,
- product_no integer REFERENCES products,
+ product_no integer <emphasis>REFERENCES products</emphasis>,
 quantity integer
 );
 </programlisting>
@@ -877,10 +709,15 @@ CREATE TABLE t1 (
 <emphasis>FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)</emphasis>
 );
 </programlisting>
- Of course, the number and type of the constrained columns needs to
+ Of course, the number and type of the constrained columns need to
 match the number and type of the referenced columns.
 </para>
 
+ <para>
+ You can assign your own name for a foreign key constraint,
+ in the usual way.
+ </para>
+
 <para>
 A table can contain more than one foreign key constraint. This is
 used to implement many-to-many relationships between tables. Say
@@ -907,7 +744,7 @@ CREATE TABLE order_items (
 PRIMARY KEY (product_no, order_id)
 );
 </programlisting>
- Note also that the primary key overlaps with the foreign keys in
+ Notice that the primary key overlaps with the foreign keys in
 the last table.
 </para>
 
@@ -1004,6 +841,198 @@ CREATE TABLE order_items (
 </sect2>
 </sect1>
 
+ <sect1 id="ddl-system-columns">
+ <title>System Columns</title>
+
+ <para>
+ Every table has several <firstterm>system columns</> that are
+ implicitly defined by the system. Therefore, these names cannot be
+ used as names of user-defined columns. (Note that these
+ restrictions are separate from whether the name is a key word or
+ not; quoting a name will not allow you to escape these
+ restrictions.) You do not really need to be concerned about these
+ columns, just know they exist.
+ </para>
+
+ <indexterm>
+ <primary>column</primary>
+ <secondary>system column</secondary>
+ </indexterm>
+
+ <variablelist>
+ <varlistentry>
+ <term><structfield>oid</></term>
+ <listitem>
+ <para>
+ <indexterm>
+ <primary>OID</primary>
+ <secondary>column</secondary>
+ </indexterm>
+ The object identifier (object ID) of a row. This is a serial
+ number that is automatically added by
+ <productname>PostgreSQL</productname> to all table rows (unless
+ the table was created using <literal>WITHOUT OIDS</literal>, in which
+ case this column is not present). This column is of type
+ <type>oid</type> (same name as the column); see <xref
+ linkend="datatype-oid"> for more information about the type.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>tableoid</></term>
+ <listitem>
+ <indexterm>
+ <primary>tableoid</primary>
+ </indexterm>
+
+ <para>
+ The OID of the table containing this row. This column is
+ particularly handy for queries that select from inheritance
+ hierarchies, since without it, it's difficult to tell which
+ individual table a row came from. The
+ <structfield>tableoid</structfield> can be joined against the
+ <structfield>oid</structfield> column of
+ <structname>pg_class</structname> to obtain the table name.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>xmin</></term>
+ <listitem>
+ <indexterm>
+ <primary>xmin</primary>
+ </indexterm>
+
+ <para>
+ The identity (transaction ID) of the inserting transaction for
+ this row version. (A row version is an individual state of a
+ row; each update of a row creates a new row version for the same
+ logical row.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>cmin</></term>
+ <listitem>
+ <indexterm>
+ <primary>cmin</primary>
+ </indexterm>
+
+ <para>
+ The command identifier (starting at zero) within the inserting
+ transaction.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>xmax</></term>
+ <listitem>
+ <indexterm>
+ <primary>xmax</primary>
+ </indexterm>
+
+ <para>
+ The identity (transaction ID) of the deleting transaction, or
+ zero for an undeleted row version. It is possible for this column to
+ be nonzero in a visible row version. That usually indicates that the
+ deleting transaction hasn't committed yet, or that an attempted
+ deletion was rolled back.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>cmax</></term>
+ <listitem>
+ <indexterm>
+ <primary>cmax</primary>
+ </indexterm>
+
+ <para>
+ The command identifier within the deleting transaction, or zero.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>ctid</></term>
+ <listitem>
+ <indexterm>
+ <primary>ctid</primary>
+ </indexterm>
+
+ <para>
+ The physical location of the row version within its table. Note that
+ although the <structfield>ctid</structfield> can be used to
+ locate the row version very quickly, a row's
+ <structfield>ctid</structfield> will change each time it is
+ updated or moved by <command>VACUUM FULL</>. Therefore
+ <structfield>ctid</structfield> is useless as a long-term row
+ identifier. The OID, or even better a user-defined serial
+ number, should be used to identify logical rows.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ OIDs are 32-bit quantities and are assigned from a single
+ cluster-wide counter. In a large or long-lived database, it is
+ possible for the counter to wrap around. Hence, it is bad
+ practice to assume that OIDs are unique, unless you take steps to
+ ensure that this is the case. If you need to identify the rows in
+ a table, using a sequence generator is strongly recommended.
+ However, OIDs can be used as well, provided that a few additional
+ precautions are taken:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ A unique constraint should be created on the OID column of each
+ table for which the OID will be used to identify rows.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ OIDs should never be assumed to be unique across tables; use
+ the combination of <structfield>tableoid</> and row OID if you
+ need a database-wide identifier.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The tables in question should be created using <literal>WITH
+ OIDS</literal> to ensure forward compatibility with future
+ releases of <productname>PostgreSQL</productname>. It is
+ planned that <literal>WITHOUT OIDS</> will become the default.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Transaction identifiers are also 32-bit quantities. In a
+ long-lived database it is possible for transaction IDs to wrap
+ around. This is not a fatal problem given appropriate maintenance
+ procedures; see <xref linkend="maintenance"> for details. It is
+ unwise, however, to depend on the uniqueness of transaction IDs
+ over the long term (more than one billion transactions).
+ </para>
+
+ <para>
+ Command
+ identifiers are also 32-bit quantities. This creates a hard limit
+ of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
+ within a single transaction. In practice this limit is not a
+ problem &mdash; note that the limit is on number of
+ <acronym>SQL</acronym> commands, not number of rows processed.
+ </para>
+ </sect1>
+
 <sect1 id="ddl-inherit">
 <title>Inheritance</title>
 
@@ -1118,7 +1147,7 @@ SET SQL_Inheritance TO OFF;
 <para>
 In some cases you may wish to know which table a particular row
 originated from. There is a system column called
- <structfield>TABLEOID</structfield> in each table which can tell you the
+ <structfield>tableoid</structfield> in each table which can tell you the
 originating table:
 
 <programlisting>
@@ -1223,13 +1252,15 @@ WHERE c.altitude > 500 and c.tableoid = p.oid;
 
 <para>
 When you create a table and you realize that you made a mistake, or
- the requirements of the application changed, then you can drop the
+ the requirements of the application change, then you can drop the
 table and create it again. But this is not a convenient option if
 the table is already filled with data, or if the table is
 referenced by other database objects (for instance a foreign key
 constraint). Therefore <productname>PostgreSQL</productname>
- provides a family of commands to make modifications on existing
- tables.
+ provides a family of commands to make modifications to existing
+ tables. Note that this is conceptually distinct from altering
+ the data contained in the table: here we are interested in altering
+ the definition, or structure, of the table.
 </para>
 
 <para>
@@ -1275,7 +1306,7 @@ WHERE c.altitude > 500 and c.tableoid = p.oid;
 </indexterm>
 
 <para>
- To add a column, use this command:
+ To add a column, use a command like this:
 <programlisting>
 ALTER TABLE products ADD COLUMN description text;
 </programlisting>
@@ -1307,10 +1338,21 @@ ALTER TABLE products ADD COLUMN description text CHECK (description <> '')
 </indexterm>
 
 <para>
- To remove a column, use this command:
+ To remove a column, use a command like this:
 <programlisting>
 ALTER TABLE products DROP COLUMN description;
 </programlisting>
+ Whatever data was in the column disappears. Table constraints involving
+ the column are dropped, too. However, if the column is referenced by a
+ foreign key constraint of another table,
+ <productname>PostgreSQL</productname> will not silently drop that
+ constraint. You can authorize dropping everything that depends on
+ the column by adding <literal>CASCADE</>:
+<programlisting>
+ALTER TABLE products DROP COLUMN description CASCADE;
+</programlisting>
+ See <xref linkend="ddl-depend"> for a description of the general
+ mechanism behind this.
 </para>
 </sect2>
 
@@ -1366,6 +1408,13 @@ ALTER TABLE products DROP CONSTRAINT some_name;
 identifier.)
 </para>
 
+ <para>
+ As with dropping a column, you need to add <literal>CASCADE</> if you
+ want to drop a constraint that something else depends on. An example
+ is that a foreign key constraint depends on a unique or primary key
+ constraint on the referenced column(s).
+ </para>
+
 <para>
 This works the same for all constraint types except not-null
 constraints. To drop a not null constraint use
@@ -1398,7 +1447,7 @@ ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
 <programlisting>
 ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
 </programlisting>
- This is equivalent to setting the default to null.
+ This is effectively the same as setting the default to null.
 As a consequence, it is not an error
 to drop a default where one hadn't been defined, because the
 default is implicitly the null value.
@@ -1660,6 +1709,9 @@ CREATE SCHEMA myschema;
 <synopsis>
 <replaceable>schema</><literal>.</><replaceable>table</>
 </synopsis>
+ This works anywhere a table name is expected, including the table
+ modification commands and the data access commands discussed in
+ the following chapters.
 (For brevity we will speak of tables only, but the same ideas apply
 to other kinds of named objects, such as types and functions.)
 </para>
@@ -1669,9 +1721,9 @@ CREATE SCHEMA myschema;
 <synopsis>
 <replaceable>database</><literal>.</><replaceable>schema</><literal>.</><replaceable>table</>
 </synopsis>
- can be used too, but at present this is just for pro-forma compliance
- with the SQL standard. If you write a database name, it must be the
- same as the database you are connected to.
+ can be used too, but at present this is just for <foreignphrase>pro
+ forma</> compliance with the SQL standard. If you write a database name,
+ it must be the same as the database you are connected to.
 </para>
 
 <para>
@@ -1681,9 +1733,6 @@ CREATE TABLE myschema.mytable (
 ...
 );
 </programlisting>
- This works anywhere a table name is expected, including the table
- modification commands and the data access commands discussed in
- the following chapters.
 </para>
 
 <indexterm>
@@ -1844,7 +1893,7 @@ SET search_path TO myschema;
 </para>
 
 <para>
- See also <xref linkend="functions-info"> for other ways to access
+ See also <xref linkend="functions-info"> for other ways to manipulate
 the schema search path.
 </para>
 
@@ -2044,7 +2093,13 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC;
 
 <listitem>
 <para>
- Functions, operators, data types, domains
+ Functions and operators
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Data types and domains
 </para>
 </listitem>
 
@@ -2120,7 +2175,7 @@ DROP TABLE products CASCADE;
 <para>
 According to the SQL standard, specifying either
 <literal>RESTRICT</literal> or <literal>CASCADE</literal> is
- required. No database system actually implements it that way, but
+ required. No database system actually enforces that rule, but
 whether the default behavior is <literal>RESTRICT</literal> or
 <literal>CASCADE</literal> varies across systems.
 </para>
@@ -2132,7 +2187,7 @@ DROP TABLE products CASCADE;
 from <productname>PostgreSQL</productname> versions prior to 7.3
 are <emphasis>not</emphasis> maintained or created during the
 upgrade process. All other dependency types will be properly
- created during an upgrade.
+ created during an upgrade from a pre-7.3 database.
 </para>
 </note>
 </sect1>