diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml index a018849498136360af353779e01659193164ce35..6740d9d6f7687c10d895b0b4f8da24db0ad838ff 100644 --- a/doc/src/sgml/queries.sgml +++ b/doc/src/sgml/queries.sgml @@ -118,10 +118,12 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r </synopsis> A table reference can be a table name (possibly schema-qualified), - or a derived table such as a subquery, a table join, or complex - combinations of these. If more than one table reference is listed - in the <literal>FROM</> clause they are cross-joined (see below) - to form the intermediate virtual table that can then be subject to + or a derived table such as a subquery, a <literal>JOIN</> construct, or + complex combinations of these. If more than one table reference is + listed in the <literal>FROM</> clause, the tables are cross-joined + (that is, the Cartesian product of their rows is formed; see below). + The result of the <literal>FROM</> list is an intermediate virtual + table that can then be subject to transformations by the <literal>WHERE</>, <literal>GROUP BY</>, and <literal>HAVING</> clauses and is finally the result of the overall table expression. @@ -161,6 +163,16 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r A joined table is a table derived from two other (real or derived) tables according to the rules of the particular join type. Inner, outer, and cross-joins are available. + The general syntax of a joined table is +<synopsis> +<replaceable>T1</replaceable> <replaceable>join_type</replaceable> <replaceable>T2</replaceable> <optional> <replaceable>join_condition</replaceable> </optional> +</synopsis> + Joins of all types can be chained together, or nested: either or + both <replaceable>T1</replaceable> and + <replaceable>T2</replaceable> can be joined tables. Parentheses + can be used around <literal>JOIN</> clauses to control the join + order. 
In the absence of parentheses, <literal>JOIN</> clauses + nest left-to-right. </para> <variablelist> @@ -197,10 +209,28 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r <para> <literal>FROM <replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable></literal> is equivalent to - <literal>FROM <replaceable>T1</replaceable>, - <replaceable>T2</replaceable></literal>. It is also equivalent to <literal>FROM <replaceable>T1</replaceable> INNER JOIN <replaceable>T2</replaceable> ON TRUE</literal> (see below). + It is also equivalent to + <literal>FROM <replaceable>T1</replaceable>, + <replaceable>T2</replaceable></literal>. + <note> + <para> + This latter equivalence does not hold exactly when more than two + tables appear, because <literal>JOIN</> binds more tightly than + comma. For example + <literal>FROM <replaceable>T1</replaceable> CROSS JOIN + <replaceable>T2</replaceable> INNER JOIN <replaceable>T3</replaceable> + ON <replaceable>condition</replaceable></literal> + is not the same as + <literal>FROM <replaceable>T1</replaceable>, + <replaceable>T2</replaceable> INNER JOIN <replaceable>T3</replaceable> + ON <replaceable>condition</replaceable></literal> + because the <replaceable>condition</replaceable> can + reference <replaceable>T1</replaceable> in the first case but not + the second. + </para> + </note> </para> </listitem> </varlistentry> @@ -240,47 +270,6 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r <quote>match</quote>, as explained in detail below. </para> - <para> - The <literal>ON</> clause is the most general kind of join - condition: it takes a Boolean value expression of the same - kind as is used in a <literal>WHERE</> clause. A pair of rows - from <replaceable>T1</> and <replaceable>T2</> match if the - <literal>ON</> expression evaluates to true for them. 
- </para> - - <para> - <literal>USING</> is a shorthand notation: it takes a - comma-separated list of column names, which the joined tables - must have in common, and forms a join condition specifying - equality of each of these pairs of columns. Furthermore, the - output of <literal>JOIN USING</> has one column for each of - the equated pairs of input columns, followed by the - remaining columns from each table. Thus, <literal>USING (a, b, - c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND - t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that - if <literal>ON</> is used there will be two columns - <literal>a</>, <literal>b</>, and <literal>c</> in the result, - whereas with <literal>USING</> there will be only one of each - (and they will appear first if <command>SELECT *</> is used). - </para> - - <para> - <indexterm> - <primary>join</primary> - <secondary>natural</secondary> - </indexterm> - <indexterm> - <primary>natural join</primary> - </indexterm> - Finally, <literal>NATURAL</> is a shorthand form of - <literal>USING</>: it forms a <literal>USING</> list - consisting of all column names that appear in both - input tables. As with <literal>USING</>, these columns appear - only once in the output table. If there are no common - columns, <literal>NATURAL</literal> behaves like - <literal>CROSS JOIN</literal>. - </para> - <para> The possible types of qualified join are: @@ -358,19 +347,70 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r </varlistentry> </variablelist> </para> + + <para> + The <literal>ON</> clause is the most general kind of join + condition: it takes a Boolean value expression of the same + kind as is used in a <literal>WHERE</> clause. A pair of rows + from <replaceable>T1</> and <replaceable>T2</> match if the + <literal>ON</> expression evaluates to true. 
+ </para> + + <para> + The <literal>USING</> clause is a shorthand that allows you to take + advantage of the specific situation where both sides of the join use + the same name for the joining column(s). It takes a + comma-separated list of the shared column names + and forms a join condition that includes an equality comparison + for each one. For example, joining <replaceable>T1</> + and <replaceable>T2</> with <literal>USING (a, b)</> produces + the join condition <literal>ON <replaceable>T1</>.a + = <replaceable>T2</>.a AND <replaceable>T1</>.b + = <replaceable>T2</>.b</literal>. + </para> + + <para> + Furthermore, the output of <literal>JOIN USING</> suppresses + redundant columns: there is no need to print both of the matched + columns, since they must have equal values. While <literal>JOIN + ON</> produces all columns from <replaceable>T1</> followed by all + columns from <replaceable>T2</>, <literal>JOIN USING</> produces one + output column for each of the listed column pairs (in the listed + order), followed by any remaining columns from <replaceable>T1</>, + followed by any remaining columns from <replaceable>T2</>. + </para> + + <para> + <indexterm> + <primary>join</primary> + <secondary>natural</secondary> + </indexterm> + <indexterm> + <primary>natural join</primary> + </indexterm> + Finally, <literal>NATURAL</> is a shorthand form of + <literal>USING</>: it forms a <literal>USING</> list + consisting of all column names that appear in both + input tables. As with <literal>USING</>, these columns appear + only once in the output table. If there are no common + column names, <literal>NATURAL</literal> behaves like + <literal>CROSS JOIN</literal>. + </para> + + <note> + <para> + <literal>USING</literal> is reasonably safe from column changes + in the joined relations since only the listed columns + are combined. 
<literal>NATURAL</> is considerably more risky since + any schema changes to either relation that cause a new matching + column name to be present will cause the join to combine that new + column as well. + </para> + </note> </listitem> </varlistentry> </variablelist> - <para> - Joins of all types can be chained together or nested: either or - both <replaceable>T1</replaceable> and - <replaceable>T2</replaceable> can be joined tables. Parentheses - can be used around <literal>JOIN</> clauses to control the join - order. In the absence of parentheses, <literal>JOIN</> clauses - nest left-to-right. - </para> - <para> To put this together, assume we have tables <literal>t1</literal>: <programlisting> @@ -487,6 +527,8 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r clause is processed <emphasis>before</> the join, while a restriction placed in the <literal>WHERE</> clause is processed <emphasis>after</> the join. + That does not matter with inner joins, but it can change the result of + an outer join. </para> </sect3>
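The last hunk's point about restrictions in <literal>ON</> versus <literal>WHERE</> can be checked with a small runnable sketch. It is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module as a stand-in for a PostgreSQL session, and the tables t1 and t2 with their contents are hypothetical sample data rather than anything taken from this patch.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL here; both follow the same
# rule for ON versus WHERE restrictions in the queries below.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (num INTEGER, name TEXT);
    CREATE TABLE t2 (num INTEGER, value TEXT);
    -- Hypothetical sample rows, chosen only for illustration.
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO t2 VALUES (1, 'xxx'), (3, 'yyy'), (5, 'zzz');
""")


def run(sql):
    """Execute one query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Restriction in ON: it is part of the match test, so t1 rows without a
# qualifying t2 row are still emitted, padded with NULL (None in Python).
outer_on = run("""
    SELECT t1.num, t1.name, t2.value
    FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx'
    ORDER BY t1.num
""")

# Restriction in WHERE: it is applied to the joined rows, so the
# NULL-padded rows produced by the outer join are filtered out again.
outer_where = run("""
    SELECT t1.num, t1.name, t2.value
    FROM t1 LEFT JOIN t2 ON t1.num = t2.num
    WHERE t2.value = 'xxx'
    ORDER BY t1.num
""")

# For an inner join the placement of the restriction makes no difference.
inner_on = run("""
    SELECT t1.num, t1.name, t2.value
    FROM t1 INNER JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx'
""")
inner_where = run("""
    SELECT t1.num, t1.name, t2.value
    FROM t1 INNER JOIN t2 ON t1.num = t2.num
    WHERE t2.value = 'xxx'
""")

print("LEFT JOIN, restriction in ON:   ", outer_on)
print("LEFT JOIN, restriction in WHERE:", outer_where)
print("INNER JOIN results identical:   ", inner_on == inner_where)
```

With the restriction in <literal>ON</>, all three t1 rows appear and the unmatched ones carry a null value column; with the restriction in <literal>WHERE</>, only the single matching row remains. The two inner-join queries return the same row.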