diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index f2906cc82230150f72353609e9c831e90dcc10ca..7d6125c97e5203c9d092ceec3aaf351c1a5fcf1b 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.492 2009/11/24 19:21:15 petere Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.493 2009/12/15 17:57:46 tgl Exp $ --> <chapter id="functions"> <title>Functions and Operators</title> @@ -8440,7 +8440,8 @@ SELECT xmlroot(xmlparse(document '<?xml version="1.1"?><content>abc</content>'), The function <function>xmlagg</function> is, unlike the other functions described here, an aggregate function. It concatenates the input values to the aggregate function call, - like <function>xmlconcat</function> does. + much like <function>xmlconcat</function> does, except that concatenation + occurs across rows rather than across expressions in a single row. See <xref linkend="functions-aggregate"> for additional information about aggregate functions. </para> @@ -8459,18 +8460,29 @@ SELECT xmlagg(x) FROM test; </para> <para> - To determine the order of the concatenation, something like the - following approach can be used: + To determine the order of the concatenation, an <literal>ORDER BY</> + clause may be added to the aggregate call as described in + <xref linkend="syntax-aggregates">. For example: <screen><![CDATA[ -SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; +SELECT xmlagg(x ORDER BY y DESC) FROM test; xmlagg ---------------------- <bar/><foo>abc</foo> ]]></screen> + </para> - Again, see <xref linkend="functions-aggregate"> for additional - information. + <para> + The following non-standard approach used to be recommended + in previous versions, and may still be useful in specific + cases: + +<screen><![CDATA[ +SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; + xmlagg +---------------------- + <bar/><foo>abc</foo> +]]></screen> </para> </sect3> @@ -9887,20 +9899,19 @@ SELECT count(*) FROM sometable; The aggregate functions <function>array_agg</function> and <function>xmlagg</function>, as well as similar user-defined aggregate functions, produce meaningfully different result values - depending on the order of the input values. In the current - implementation, the order of the input is in principle unspecified. - Supplying the input values from a sorted subquery - will usually work, however. For example: + depending on the order of the input values. This ordering is + unspecified by default, but can be controlled by writing an + <literal>ORDER BY</> clause within the aggregate call, as shown in + <xref linkend="syntax-aggregates">. + Alternatively, supplying the input values from a sorted subquery + will usually work. For example: <screen><![CDATA[ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; ]]></screen> But this syntax is not allowed in the SQL standard, and is - not portable to other database systems. A future version of - <productname>PostgreSQL</> might provide an additional feature to control - the order in a better-defined way (<literal>xmlagg(expr ORDER BY expr, expr, - ...)</literal>). + not portable to other database systems. </para> <para> diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index 2d11d20984ce91ee9cc7fef2540e01c902c7b8ad..984908927f7b67af7decd1b6f654ce99ba831a90 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.138 2009/11/05 23:24:22 tgl Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.139 2009/12/15 17:57:46 tgl Exp $ --> <chapter id="sql-syntax"> <title>SQL Syntax</title> @@ -1525,17 +1525,19 @@ sqrt(2) syntax of an aggregate expression is one of the following: <synopsis> -<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] ) -<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] ) -<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>) +<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) +<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) +<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] [ <replaceable>order_by_clause</replaceable> ] ) <replaceable>aggregate_name</replaceable> ( * ) </synopsis> where <replaceable>aggregate_name</replaceable> is a previously - defined aggregate (possibly qualified with a schema name), and + defined aggregate (possibly qualified with a schema name), <replaceable>expression</replaceable> is any value expression that does not itself contain an aggregate - expression or a window function call. + expression or a window function call, and + <replaceable>order_by_clause</replaceable> is a optional + <literal>ORDER BY</> clause as described below. </para> <para> @@ -1545,8 +1547,9 @@ sqrt(2) whether to ignore null values or not — but all the standard ones do.) The second form is the same as the first, since <literal>ALL</literal> is the default. The third form invokes the - aggregate for all distinct non-null values of the expressions found - in the input rows. The last form invokes the aggregate once for + aggregate for all distinct values of the expressions found + in the input rows (ignoring nulls if the function chooses to do so). + The last form invokes the aggregate once for each input row regardless of null or non-null values; since no particular input value is specified, it is generally only useful for the <function>count(*)</function> aggregate function. @@ -1560,6 +1563,40 @@ sqrt(2) distinct non-null values of <literal>f1</literal>. </para> + <para> + Ordinarily, the input rows are fed to the aggregate function in an + unspecified order. In many cases this does not matter; for example, + <function>min</> produces the same result no matter what order it + receives the inputs in. However, some aggregate functions + (such as <function>array_agg</> and <function>xmlagg</>) produce + results that depend on the ordering of the input rows. When using + such an aggregate, the optional <replaceable>order_by_clause</> can be + used to specify the desired ordering. The <replaceable>order_by_clause</> + has the same syntax as for a query-level <literal>ORDER BY</> clause, as + described in <xref linkend="queries-order">, except that its expressions + are always just expressions and cannot be output-column names or numbers. + For example: + +<programlisting> +SELECT array_agg(a ORDER BY b DESC) FROM table; +</programlisting> + </para> + + <para> + If <literal>DISTINCT</> is specified in addition to an + <replaceable>order_by_clause</>, then all the <literal>ORDER BY</> + expressions must match regular arguments of the aggregate; that is, + you cannot sort on an expression that is not included in the + <literal>DISTINCT</> list. + </para> + + <note> + <para> + The ability to specify both <literal>DISTINCT</> and <literal>ORDER BY</> + in an aggregate function is a <productname>PostgreSQL</> extension. + </para> + </note> + <para> The predefined aggregate functions are described in <xref linkend="functions-aggregate">. Other aggregate functions can be added @@ -1588,13 +1625,6 @@ sqrt(2) appearing only in the result list or <literal>HAVING</> clause applies with respect to the query level that the aggregate belongs to. </para> - - <note> - <para> - <productname>PostgreSQL</productname> currently does not support - <literal>DISTINCT</> with more than one input expression. - </para> - </note> </sect2> <sect2 id="syntax-window-functions"> @@ -1697,7 +1727,8 @@ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING <literal>count(*) OVER (PARTITION BY x ORDER BY y)</>. <literal>*</> is customarily not used for non-aggregate window functions. Aggregate window functions, unlike normal aggregate functions, do not - allow <literal>DISTINCT</> to be used within the function argument list. + allow <literal>DISTINCT</> or <literal>ORDER BY</> to be used within the + function argument list. </para> <para> diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index 00234f0e236d4b7aa3e7458afc97ab6f4629ccae..b4cc1e553b3ad7365319aca1075f4806be3bda71 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -13,6 +13,10 @@ * If a finalfunc is not supplied then the result is just the ending * value of transvalue. * + * If an aggregate call specifies DISTINCT or ORDER BY, we sort the input + * tuples and eliminate duplicates (if required) before performing the + * above-depicted process. + * * If transfunc is marked "strict" in pg_proc and initcond is NULL, * then the first non-NULL input_value is assigned directly to transvalue, * and transfunc isn't applied until the second non-NULL input_value. @@ -63,7 +67,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/executor/nodeAgg.c,v 1.169 2009/09/27 21:10:53 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/executor/nodeAgg.c,v 1.170 2009/12/15 17:57:46 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -78,9 +82,9 @@ #include "miscadmin.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" +#include "optimizer/tlist.h" #include "parser/parse_agg.h" #include "parser/parse_coerce.h" -#include "parser/parse_oper.h" #include "utils/acl.h" #include "utils/builtins.h" #include "utils/lsyscache.h" @@ -104,9 +108,12 @@ typedef struct AggStatePerAggData AggrefExprState *aggrefstate; Aggref *aggref; - /* number of input arguments for aggregate */ + /* number of input arguments for aggregate function proper */ int numArguments; + /* number of inputs including ORDER BY expressions */ + int numInputs; + /* Oids of transfer functions */ Oid transfn_oid; Oid finalfn_oid; /* may be InvalidOid */ @@ -119,19 +126,24 @@ typedef struct AggStatePerAggData FmgrInfo transfn; FmgrInfo finalfn; - /* - * Type of input data and Oid of sort operator to use for it; only - * set/used when aggregate has DISTINCT flag. (These are not used - * directly by nodeAgg, but must be passed to the Tuplesort object.) - */ - Oid inputType; - Oid sortOperator; + /* number of sorting columns */ + int numSortCols; + + /* number of sorting columns to consider in DISTINCT comparisons */ + /* (this is either zero or the same as numSortCols) */ + int numDistinctCols; + + /* deconstructed sorting information (arrays of length numSortCols) */ + AttrNumber *sortColIdx; + Oid *sortOperators; + bool *sortNullsFirst; /* - * fmgr lookup data for input type's equality operator --- only set/used - * when aggregate has DISTINCT flag. + * fmgr lookup data for input columns' equality operators --- only + * set/used when aggregate has DISTINCT flag. Note that these are in + * order of sort column index, not parameter index. */ - FmgrInfo equalfn; + FmgrInfo *equalfns; /* array of length numDistinctCols */ /* * initial value from pg_aggregate entry @@ -142,6 +154,9 @@ typedef struct AggStatePerAggData /* * We need the len and byval info for the agg's input, result, and * transition data types in order to know how to copy/delete values. + * + * Note that the info for the input type is used only when handling + * DISTINCT aggs with just one argument, so there is only one input type. */ int16 inputtypeLen, resulttypeLen, @@ -150,18 +165,35 @@ typedef struct AggStatePerAggData resulttypeByVal, transtypeByVal; + /* + * Stuff for evaluation of inputs. We used to just use ExecEvalExpr, but + * with the addition of ORDER BY we now need at least a slot for passing + * data to the sort object, which requires a tupledesc, so we might as + * well go whole hog and use ExecProject too. + */ + TupleDesc evaldesc; /* descriptor of input tuples */ + ProjectionInfo *evalproj; /* projection machinery */ + + /* + * Slots for holding the evaluated input arguments. These are set up + * during ExecInitAgg() and then used for each input row. + */ + TupleTableSlot *evalslot; /* current input tuple */ + TupleTableSlot *uniqslot; /* used for multi-column DISTINCT */ + /* * These values are working state that is initialized at the start of an * input tuple group and updated for each input tuple. * - * For a simple (non DISTINCT) aggregate, we just feed the input values - * straight to the transition function. If it's DISTINCT, we pass the - * input values into a Tuplesort object; then at completion of the input - * tuple group, we scan the sorted values, eliminate duplicates, and run - * the transition function on the rest. + * For a simple (non DISTINCT/ORDER BY) aggregate, we just feed the input + * values straight to the transition function. If it's DISTINCT or + * requires ORDER BY, we pass the input values into a Tuplesort object; + * then at completion of the input tuple group, we scan the sorted values, + * eliminate duplicates if needed, and run the transition function on the + * rest. */ - Tuplesortstate *sortstate; /* sort object, if a DISTINCT agg */ + Tuplesortstate *sortstate; /* sort object, if DISTINCT or ORDER BY */ } AggStatePerAggData; /* @@ -220,7 +252,10 @@ static void advance_transition_function(AggState *aggstate, AggStatePerGroup pergroupstate, FunctionCallInfoData *fcinfo); static void advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup); -static void process_sorted_aggregate(AggState *aggstate, +static void process_ordered_aggregate_single(AggState *aggstate, + AggStatePerAgg peraggstate, + AggStatePerGroup pergroupstate); +static void process_ordered_aggregate_multi(AggState *aggstate, AggStatePerAgg peraggstate, AggStatePerGroup pergroupstate); static void finalize_aggregate(AggState *aggstate, @@ -254,12 +289,11 @@ initialize_aggregates(AggState *aggstate, { AggStatePerAgg peraggstate = &peragg[aggno]; AggStatePerGroup pergroupstate = &pergroup[aggno]; - Aggref *aggref = peraggstate->aggref; /* - * Start a fresh sort operation for each DISTINCT aggregate. + * Start a fresh sort operation for each DISTINCT/ORDER BY aggregate. */ - if (aggref->aggdistinct) + if (peraggstate->numSortCols > 0) { /* * In case of rescan, maybe there could be an uncompleted sort @@ -268,10 +302,23 @@ initialize_aggregates(AggState *aggstate, if (peraggstate->sortstate) tuplesort_end(peraggstate->sortstate); + /* + * We use a plain Datum sorter when there's a single input + * column; otherwise sort the full tuple. (See comments for + * process_ordered_aggregate_single.) + */ peraggstate->sortstate = - tuplesort_begin_datum(peraggstate->inputType, - peraggstate->sortOperator, false, - work_mem, false); + (peraggstate->numInputs == 1) ? + tuplesort_begin_datum(peraggstate->evaldesc->attrs[0]->atttypid, + peraggstate->sortOperators[0], + peraggstate->sortNullsFirst[0], + work_mem, false) : + tuplesort_begin_heap(peraggstate->evaldesc, + peraggstate->numSortCols, + peraggstate->sortColIdx, + peraggstate->sortOperators, + peraggstate->sortNullsFirst, + work_mem, false); } /* @@ -418,78 +465,110 @@ advance_transition_function(AggState *aggstate, static void advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup) { - ExprContext *econtext = aggstate->tmpcontext; int aggno; for (aggno = 0; aggno < aggstate->numaggs; aggno++) { AggStatePerAgg peraggstate = &aggstate->peragg[aggno]; AggStatePerGroup pergroupstate = &pergroup[aggno]; - AggrefExprState *aggrefstate = peraggstate->aggrefstate; - Aggref *aggref = peraggstate->aggref; - FunctionCallInfoData fcinfo; + int nargs = peraggstate->numArguments; int i; - ListCell *arg; - MemoryContext oldContext; + TupleTableSlot *slot; - /* Switch memory context just once for all args */ - oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory); + /* Evaluate the current input expressions for this aggregate */ + slot = ExecProject(peraggstate->evalproj, NULL); - /* Evaluate inputs and save in fcinfo */ - /* We start from 1, since the 0th arg will be the transition value */ - i = 1; - foreach(arg, aggrefstate->args) + if (peraggstate->numSortCols > 0) { - ExprState *argstate = (ExprState *) lfirst(arg); - - fcinfo.arg[i] = ExecEvalExpr(argstate, econtext, - fcinfo.argnull + i, NULL); - i++; - } + /* DISTINCT and/or ORDER BY case */ + Assert(slot->tts_nvalid == peraggstate->numInputs); - /* Switch back */ - MemoryContextSwitchTo(oldContext); + /* + * If the transfn is strict, we want to check for nullity + * before storing the row in the sorter, to save space if + * there are a lot of nulls. Note that we must only check + * numArguments columns, not numInputs, since nullity in + * columns used only for sorting is not relevant here. + */ + if (peraggstate->transfn.fn_strict) + { + for (i = 0; i < nargs; i++) + { + if (slot->tts_isnull[i]) + break; + } + if (i < nargs) + continue; + } - if (aggref->aggdistinct) - { - /* in DISTINCT mode, we may ignore nulls */ - /* XXX we assume there is only one input column */ - if (fcinfo.argnull[1]) - continue; - tuplesort_putdatum(peraggstate->sortstate, fcinfo.arg[1], - fcinfo.argnull[1]); + /* OK, put the tuple into the tuplesort object */ + if (peraggstate->numInputs == 1) + tuplesort_putdatum(peraggstate->sortstate, + slot->tts_values[0], + slot->tts_isnull[0]); + else + tuplesort_puttupleslot(peraggstate->sortstate, slot); } else { + /* We can apply the transition function immediately */ + FunctionCallInfoData fcinfo; + + /* Load values into fcinfo */ + /* Start from 1, since the 0th arg will be the transition value */ + Assert(slot->tts_nvalid >= nargs); + for (i = 0; i < nargs; i++) + { + fcinfo.arg[i + 1] = slot->tts_values[i]; + fcinfo.argnull[i + 1] = slot->tts_isnull[i]; + } + advance_transition_function(aggstate, peraggstate, pergroupstate, &fcinfo); } } } + /* - * Run the transition function for a DISTINCT aggregate. This is called - * after we have completed entering all the input values into the sort - * object. We complete the sort, read out the values in sorted order, - * and run the transition function on each non-duplicate value. + * Run the transition function for a DISTINCT or ORDER BY aggregate + * with only one input. This is called after we have completed + * entering all the input values into the sort object. We complete the + * sort, read out the values in sorted order, and run the transition + * function on each value (applying DISTINCT if appropriate). + * + * Note that the strictness of the transition function was checked when + * entering the values into the sort, so we don't check it again here; + * we just apply standard SQL DISTINCT logic. + * + * The one-input case is handled separately from the multi-input case + * for performance reasons: for single by-value inputs, such as the + * common case of count(distinct id), the tuplesort_getdatum code path + * is around 300% faster. (The speedup for by-reference types is less + * but still noticeable.) * * When called, CurrentMemoryContext should be the per-query context. */ static void -process_sorted_aggregate(AggState *aggstate, - AggStatePerAgg peraggstate, - AggStatePerGroup pergroupstate) +process_ordered_aggregate_single(AggState *aggstate, + AggStatePerAgg peraggstate, + AggStatePerGroup pergroupstate) { Datum oldVal = (Datum) 0; + bool oldIsNull = true; bool haveOldVal = false; MemoryContext workcontext = aggstate->tmpcontext->ecxt_per_tuple_memory; MemoryContext oldContext; + bool isDistinct = (peraggstate->numDistinctCols > 0); Datum *newVal; bool *isNull; FunctionCallInfoData fcinfo; + Assert(peraggstate->numDistinctCols < 2); + tuplesort_performsort(peraggstate->sortstate); + /* Load the column into argument 1 (arg 0 will be transition value) */ newVal = fcinfo.arg + 1; isNull = fcinfo.argnull + 1; @@ -502,13 +581,6 @@ process_sorted_aggregate(AggState *aggstate, while (tuplesort_getdatum(peraggstate->sortstate, true, newVal, isNull)) { - /* - * DISTINCT always suppresses nulls, per SQL spec, regardless of the - * transition function's strictness. - */ - if (*isNull) - continue; - /* * Clear and select the working context for evaluation of the equality * function and transition function. @@ -516,12 +588,18 @@ process_sorted_aggregate(AggState *aggstate, MemoryContextReset(workcontext); oldContext = MemoryContextSwitchTo(workcontext); - if (haveOldVal && - DatumGetBool(FunctionCall2(&peraggstate->equalfn, - oldVal, *newVal))) + /* + * If DISTINCT mode, and not distinct from prior, skip it. + */ + if (isDistinct && + haveOldVal && + ((oldIsNull && *isNull) || + (!oldIsNull && !*isNull && + DatumGetBool(FunctionCall2(&peraggstate->equalfns[0], + oldVal, *newVal))))) { /* equal to prior, so forget this one */ - if (!peraggstate->inputtypeByVal) + if (!peraggstate->inputtypeByVal && !*isNull) pfree(DatumGetPointer(*newVal)); } else @@ -529,23 +607,105 @@ process_sorted_aggregate(AggState *aggstate, advance_transition_function(aggstate, peraggstate, pergroupstate, &fcinfo); /* forget the old value, if any */ - if (haveOldVal && !peraggstate->inputtypeByVal) + if (!oldIsNull && !peraggstate->inputtypeByVal) pfree(DatumGetPointer(oldVal)); /* and remember the new one for subsequent equality checks */ oldVal = *newVal; + oldIsNull = *isNull; haveOldVal = true; } MemoryContextSwitchTo(oldContext); } - if (haveOldVal && !peraggstate->inputtypeByVal) + if (!oldIsNull && !peraggstate->inputtypeByVal) pfree(DatumGetPointer(oldVal)); tuplesort_end(peraggstate->sortstate); peraggstate->sortstate = NULL; } +/* + * Run the transition function for a DISTINCT or ORDER BY aggregate + * with more than one input. This is called after we have completed + * entering all the input values into the sort object. We complete the + * sort, read out the values in sorted order, and run the transition + * function on each value (applying DISTINCT if appropriate). + * + * When called, CurrentMemoryContext should be the per-query context. + */ +static void +process_ordered_aggregate_multi(AggState *aggstate, + AggStatePerAgg peraggstate, + AggStatePerGroup pergroupstate) +{ + MemoryContext workcontext = aggstate->tmpcontext->ecxt_per_tuple_memory; + FunctionCallInfoData fcinfo; + TupleTableSlot *slot1 = peraggstate->evalslot; + TupleTableSlot *slot2 = peraggstate->uniqslot; + int numArguments = peraggstate->numArguments; + int numDistinctCols = peraggstate->numDistinctCols; + bool haveOldValue = false; + int i; + + tuplesort_performsort(peraggstate->sortstate); + + ExecClearTuple(slot1); + if (slot2) + ExecClearTuple(slot2); + + while (tuplesort_gettupleslot(peraggstate->sortstate, true, slot1)) + { + /* + * Extract the first numArguments as datums to pass to the transfn. + * (This will help execTuplesMatch too, so do it immediately.) + */ + slot_getsomeattrs(slot1, numArguments); + + if (numDistinctCols == 0 || + !haveOldValue || + !execTuplesMatch(slot1, slot2, + numDistinctCols, + peraggstate->sortColIdx, + peraggstate->equalfns, + workcontext)) + { + /* Load values into fcinfo */ + /* Start from 1, since the 0th arg will be the transition value */ + for (i = 0; i < numArguments; i++) + { + fcinfo.arg[i + 1] = slot1->tts_values[i]; + fcinfo.argnull[i + 1] = slot1->tts_isnull[i]; + } + + advance_transition_function(aggstate, peraggstate, pergroupstate, + &fcinfo); + + if (numDistinctCols > 0) + { + /* swap the slot pointers to retain the current tuple */ + TupleTableSlot *tmpslot = slot2; + + slot2 = slot1; + slot1 = tmpslot; + haveOldValue = true; + } + } + + /* Reset context each time, unless execTuplesMatch did it for us */ + if (numDistinctCols == 0) + MemoryContextReset(workcontext); + + ExecClearTuple(slot1); + } + + if (slot2) + ExecClearTuple(slot2); + + tuplesort_end(peraggstate->sortstate); + peraggstate->sortstate = NULL; +} + /* * Compute the final value of one aggregate. * @@ -983,8 +1143,17 @@ agg_retrieve_direct(AggState *aggstate) AggStatePerAgg peraggstate = &peragg[aggno]; AggStatePerGroup pergroupstate = &pergroup[aggno]; - if (peraggstate->aggref->aggdistinct) - process_sorted_aggregate(aggstate, peraggstate, pergroupstate); + if (peraggstate->numSortCols > 0) + { + if (peraggstate->numInputs == 1) + process_ordered_aggregate_single(aggstate, + peraggstate, + pergroupstate); + else + process_ordered_aggregate_multi(aggstate, + peraggstate, + pergroupstate); + } finalize_aggregate(aggstate, peraggstate, pergroupstate, &aggvalues[aggno], &aggnulls[aggno]); @@ -1138,7 +1307,7 @@ agg_retrieve_hash_table(AggState *aggstate) AggStatePerAgg peraggstate = &peragg[aggno]; AggStatePerGroup pergroupstate = &pergroup[aggno]; - Assert(!peraggstate->aggref->aggdistinct); + Assert(peraggstate->numSortCols == 0); finalize_aggregate(aggstate, peraggstate, pergroupstate, &aggvalues[aggno], &aggnulls[aggno]); } @@ -1363,6 +1532,10 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) AggStatePerAgg peraggstate; Oid inputTypes[FUNC_MAX_ARGS]; int numArguments; + int numInputs; + int numSortCols; + int numDistinctCols; + List *sortlist; HeapTuple aggTuple; Form_pg_aggregate aggform; Oid aggtranstype; @@ -1401,19 +1574,24 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) /* Fill in the peraggstate data */ peraggstate->aggrefstate = aggrefstate; peraggstate->aggref = aggref; - numArguments = list_length(aggref->args); - peraggstate->numArguments = numArguments; + numInputs = list_length(aggref->args); + peraggstate->numInputs = numInputs; + peraggstate->sortstate = NULL; /* * Get actual datatypes of the inputs. These could be different from * the agg's declared input types, when the agg accepts ANY or a * polymorphic type. */ - i = 0; + numArguments = 0; foreach(lc, aggref->args) { - inputTypes[i++] = exprType((Node *) lfirst(lc)); + TargetEntry *tle = (TargetEntry *) lfirst(lc); + + if (!tle->resjunk) + inputTypes[numArguments++] = exprType((Node *) tle->expr); } + peraggstate->numArguments = numArguments; aggTuple = SearchSysCache(AGGFNOID, ObjectIdGetDatum(aggref->aggfnoid), @@ -1538,43 +1716,115 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) aggref->aggfnoid))); } + /* + * Get a tupledesc corresponding to the inputs (including sort + * expressions) of the agg. + */ + peraggstate->evaldesc = ExecTypeFromTL(aggref->args, false); + + /* Create slot we're going to do argument evaluation in */ + peraggstate->evalslot = ExecInitExtraTupleSlot(estate); + ExecSetSlotDescriptor(peraggstate->evalslot, peraggstate->evaldesc); + + /* Set up projection info for evaluation */ + peraggstate->evalproj = ExecBuildProjectionInfo(aggrefstate->args, + aggstate->tmpcontext, + peraggstate->evalslot, + NULL); + + /* + * If we're doing either DISTINCT or ORDER BY, then we have a list + * of SortGroupClause nodes; fish out the data in them and + * stick them into arrays. + * + * Note that by construction, if there is a DISTINCT clause then the + * ORDER BY clause is a prefix of it (see transformDistinctClause). + */ if (aggref->aggdistinct) { - Oid lt_opr; - Oid eq_opr; + sortlist = aggref->aggdistinct; + numSortCols = numDistinctCols = list_length(sortlist); + Assert(numSortCols >= list_length(aggref->aggorder)); + } + else + { + sortlist = aggref->aggorder; + numSortCols = list_length(sortlist); + numDistinctCols = 0; + } - /* We don't implement DISTINCT aggs in the HASHED case */ - Assert(node->aggstrategy != AGG_HASHED); + peraggstate->numSortCols = numSortCols; + peraggstate->numDistinctCols = numDistinctCols; + if (numSortCols > 0) + { /* - * We don't currently implement DISTINCT aggs for aggs having more - * than one argument. This isn't required for anything in the SQL - * spec, but really it ought to be implemented for - * feature-completeness. FIXME someday. + * We don't implement DISTINCT or ORDER BY aggs in the HASHED case + * (yet) */ - if (numArguments != 1) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("DISTINCT is supported only for single-argument aggregates"))); + Assert(node->aggstrategy != AGG_HASHED); + + /* If we have only one input, we need its len/byval info. */ + if (numInputs == 1) + { + get_typlenbyval(inputTypes[0], + &peraggstate->inputtypeLen, + &peraggstate->inputtypeByVal); + } + else if (numDistinctCols > 0) + { + /* we will need an extra slot to store prior values */ + peraggstate->uniqslot = ExecInitExtraTupleSlot(estate); + ExecSetSlotDescriptor(peraggstate->uniqslot, + peraggstate->evaldesc); + } + + /* Extract the sort information for use later */ + peraggstate->sortColIdx = + (AttrNumber *) palloc(numSortCols * sizeof(AttrNumber)); + peraggstate->sortOperators = + (Oid *) palloc(numSortCols * sizeof(Oid)); + peraggstate->sortNullsFirst = + (bool *) palloc(numSortCols * sizeof(bool)); + + i = 0; + foreach(lc, sortlist) + { + SortGroupClause *sortcl = (SortGroupClause *) lfirst(lc); + TargetEntry *tle = get_sortgroupclause_tle(sortcl, + aggref->args); + + /* the parser should have made sure of this */ + Assert(OidIsValid(sortcl->sortop)); - peraggstate->inputType = inputTypes[0]; - get_typlenbyval(inputTypes[0], - &peraggstate->inputtypeLen, - &peraggstate->inputtypeByVal); + peraggstate->sortColIdx[i] = tle->resno; + peraggstate->sortOperators[i] = sortcl->sortop; + peraggstate->sortNullsFirst[i] = sortcl->nulls_first; + i++; + } + Assert(i == numSortCols); + } + + if (aggref->aggdistinct) + { + Assert(numArguments > 0); /* - * Look up the sorting and comparison operators to use. XXX it's - * pretty bletcherous to be making this sort of semantic decision - * in the executor. Probably the parser should decide this and - * record it in the Aggref node ... or at latest, do it in the - * planner. + * We need the equal function for each DISTINCT comparison we will + * make. */ - get_sort_group_operators(inputTypes[0], - true, true, false, - <_opr, &eq_opr, NULL); - fmgr_info(get_opcode(eq_opr), &(peraggstate->equalfn)); - peraggstate->sortOperator = lt_opr; - peraggstate->sortstate = NULL; + peraggstate->equalfns = + (FmgrInfo *) palloc(numDistinctCols * sizeof(FmgrInfo)); + + i = 0; + foreach(lc, aggref->aggdistinct) + { + SortGroupClause *sortcl = (SortGroupClause *) lfirst(lc); + + fmgr_info(get_opcode(sortcl->eqop), &peraggstate->equalfns[i]); + i++; + } + Assert(i == numDistinctCols); } ReleaseSysCache(aggTuple); diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index c0bb01f1414c3e3203b0f20c625d4a4d044971df..018f36cef60e9bcdedfa3603e651f756b38a5357 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -15,7 +15,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.453 2009/12/07 05:22:22 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.454 2009/12/15 17:57:46 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1023,9 +1023,10 @@ _copyAggref(Aggref *from) COPY_SCALAR_FIELD(aggfnoid); COPY_SCALAR_FIELD(aggtype); COPY_NODE_FIELD(args); - COPY_SCALAR_FIELD(agglevelsup); + COPY_NODE_FIELD(aggorder); + COPY_NODE_FIELD(aggdistinct); COPY_SCALAR_FIELD(aggstar); - COPY_SCALAR_FIELD(aggdistinct); + COPY_SCALAR_FIELD(agglevelsup); COPY_LOCATION_FIELD(location); return newnode; @@ -1968,6 +1969,7 @@ _copyFuncCall(FuncCall *from) COPY_NODE_FIELD(funcname); COPY_NODE_FIELD(args); + COPY_NODE_FIELD(agg_order); COPY_SCALAR_FIELD(agg_star); COPY_SCALAR_FIELD(agg_distinct); COPY_SCALAR_FIELD(func_variadic); diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index 7a8bedf1cb960b4c6abd104fbd449929e39c43de..2387aaba410b582b34e0062f48c026cdcd40b4eb 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -22,7 +22,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.375 2009/12/07 05:22:22 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.376 2009/12/15 17:57:46 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -183,9 +183,10 @@ _equalAggref(Aggref *a, Aggref *b) COMPARE_SCALAR_FIELD(aggfnoid); COMPARE_SCALAR_FIELD(aggtype); COMPARE_NODE_FIELD(args); - COMPARE_SCALAR_FIELD(agglevelsup); + COMPARE_NODE_FIELD(aggorder); + COMPARE_NODE_FIELD(aggdistinct); COMPARE_SCALAR_FIELD(aggstar); - COMPARE_SCALAR_FIELD(aggdistinct); + COMPARE_SCALAR_FIELD(agglevelsup); COMPARE_LOCATION_FIELD(location); return true; @@ -1943,6 +1944,7 @@ _equalFuncCall(FuncCall *a, FuncCall *b) { COMPARE_NODE_FIELD(funcname); COMPARE_NODE_FIELD(args); + COMPARE_NODE_FIELD(agg_order); COMPARE_SCALAR_FIELD(agg_star); COMPARE_SCALAR_FIELD(agg_distinct); COMPARE_SCALAR_FIELD(func_variadic); diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 04a39f17526ca1cf7bd2de79fe2ec97a9f800f0e..d251ef01d18017629dbe035bc80bd4766c6abd62 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/nodeFuncs.c,v 1.43 2009/10/08 02:39:21 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/nodeFuncs.c,v 1.44 2009/12/15 17:57:46 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -882,6 +882,7 @@ exprLocation(Node *expr) FuncCall *fc = (FuncCall *) expr; /* consider both function name and leftmost arg */ + /* (we assume any ORDER BY nodes must be to right of name) */ loc = leftmostLoc(fc->location, exprLocation((Node *) fc->args)); } @@ -1082,6 +1083,7 @@ expression_tree_walker(Node *node, case T_SetToDefault: case T_CurrentOfExpr: case T_RangeTblRef: + case T_SortGroupClause: /* primitive node types with no expression subnodes */ break; case T_Aggref: @@ -1092,6 +1094,12 @@ expression_tree_walker(Node *node, if (expression_tree_walker((Node *) expr->args, walker, context)) return true; + if (expression_tree_walker((Node *) expr->aggorder, + walker, context)) + return true; + if (expression_tree_walker((Node *) expr->aggdistinct, + walker, context)) + return true; } break; case T_WindowFunc: @@ -1591,6 +1599,7 @@ expression_tree_mutator(Node *node, case T_SetToDefault: case T_CurrentOfExpr: case T_RangeTblRef: + case T_SortGroupClause: return (Node *) copyObject(node); case T_Aggref: { @@ -1599,6 +1608,8 @@ expression_tree_mutator(Node *node, FLATCOPY(newnode, aggref, Aggref); MUTATE(newnode->args, aggref->args, List *); + MUTATE(newnode->aggorder, aggref->aggorder, List *); + MUTATE(newnode->aggdistinct, aggref->aggdistinct, List *); return (Node *) newnode; } break; @@ -2403,6 +2414,8 @@ bool if (walker(fcall->args, context)) return true; + if (walker(fcall->agg_order, context)) + return true; if (walker(fcall->over, context)) return true; /* function name is deemed uninteresting */ diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 80a2fa76407ec1556804d7b10c6046888c326cb4..cac346464bd15b998e4b56e6fb34a7271a521d07 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.374 2009/12/07 05:22:22 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.375 2009/12/15 17:57:46 tgl Exp $ * * NOTES * Every node type that can appear in stored rules' parsetrees *must* @@ -869,9 +869,10 @@ _outAggref(StringInfo str, Aggref *node) WRITE_OID_FIELD(aggfnoid); WRITE_OID_FIELD(aggtype); WRITE_NODE_FIELD(args); - WRITE_UINT_FIELD(agglevelsup); + WRITE_NODE_FIELD(aggorder); + WRITE_NODE_FIELD(aggdistinct); WRITE_BOOL_FIELD(aggstar); - WRITE_BOOL_FIELD(aggdistinct); + WRITE_UINT_FIELD(agglevelsup); WRITE_LOCATION_FIELD(location); } @@ -1857,6 +1858,7 @@ _outFuncCall(StringInfo str, FuncCall *node) WRITE_NODE_FIELD(funcname); WRITE_NODE_FIELD(args); + WRITE_NODE_FIELD(agg_order); WRITE_BOOL_FIELD(agg_star); WRITE_BOOL_FIELD(agg_distinct); WRITE_BOOL_FIELD(func_variadic); diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 4192ca7e330e29ebf0b6324ca66d63bd5ee262a0..3e4a595e827609ac81bc3edac64fb8e7012e5f8d 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/readfuncs.c,v 1.227 2009/10/28 14:55:38 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/readfuncs.c,v 1.228 2009/12/15 17:57:46 tgl Exp $ * * NOTES * Path and Plan nodes do not have any readfuncs support, because we @@ -461,9 +461,10 @@ _readAggref(void) READ_OID_FIELD(aggfnoid); READ_OID_FIELD(aggtype); READ_NODE_FIELD(args); - READ_UINT_FIELD(agglevelsup); + READ_NODE_FIELD(aggorder); + READ_NODE_FIELD(aggdistinct); READ_BOOL_FIELD(aggstar); - READ_BOOL_FIELD(aggdistinct); + READ_UINT_FIELD(agglevelsup); READ_LOCATION_FIELD(location); READ_DONE(); diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c index 6bdbd2a913deb8e44b417c55b72717cc854a5fb5..2b7934b686fce34c6edcab49ce5ed51420fb64b5 100644 --- a/src/backend/optimizer/plan/planagg.c +++ b/src/backend/optimizer/plan/planagg.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/plan/planagg.c,v 1.46 2009/06/11 14:48:59 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/plan/planagg.c,v 1.47 2009/12/15 17:57:46 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -239,12 +239,12 @@ find_minmax_aggs_walker(Node *node, List **context) { Aggref *aggref = (Aggref *) node; Oid aggsortop; - Expr *curTarget; + TargetEntry *curTarget; MinMaxAggInfo *info; ListCell *l; Assert(aggref->agglevelsup == 0); - if (list_length(aggref->args) != 1) + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL) return true; /* it couldn't be MIN/MAX */ /* note: we do not care if DISTINCT is mentioned ... */ @@ -255,19 +255,19 @@ find_minmax_aggs_walker(Node *node, List **context) /* * Check whether it's already in the list, and add it if not. */ - curTarget = linitial(aggref->args); + curTarget = (TargetEntry *) linitial(aggref->args); foreach(l, *context) { info = (MinMaxAggInfo *) lfirst(l); if (info->aggfnoid == aggref->aggfnoid && - equal(info->target, curTarget)) + equal(info->target, curTarget->expr)) return false; } info = (MinMaxAggInfo *) palloc0(sizeof(MinMaxAggInfo)); info->aggfnoid = aggref->aggfnoid; info->aggsortop = aggsortop; - info->target = curTarget; + info->target = curTarget->expr; *context = lappend(*context, info); @@ -584,15 +584,15 @@ replace_aggs_with_params_mutator(Node *node, List **context) if (IsA(node, Aggref)) { Aggref *aggref = (Aggref *) node; + TargetEntry *curTarget = (TargetEntry *) linitial(aggref->args); ListCell *l; - Expr *curTarget = linitial(aggref->args); foreach(l, *context) { MinMaxAggInfo *info = (MinMaxAggInfo *) lfirst(l); if (info->aggfnoid == aggref->aggfnoid && - equal(info->target, curTarget)) + equal(info->target, curTarget->expr)) return (Node *) info->param; } elog(ERROR, "failed to re-find aggregate info record"); diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 6b24098cdd3e3a431129af905b133f3422f7c349..5de096e8319f6194e6d4815e26a857ec02758228 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.261 2009/10/28 14:55:38 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.262 2009/12/15 17:57:46 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1096,11 +1096,12 @@ grouping_planner(PlannerInfo *root, double tuple_fraction) bool can_sort; /* - * Executor doesn't support hashed aggregation with DISTINCT - * aggregates. (Doing so would imply storing *all* the input - * values in the hash table, which seems like a certain loser.) + * Executor doesn't support hashed aggregation with DISTINCT or + * ORDER BY aggregates. (Doing so would imply storing *all* the + * input values in the hash table, and/or running many sorts in + * parallel, either of which seems like a certain loser.) */ - can_hash = (agg_counts.numDistinctAggs == 0 && + can_hash = (agg_counts.numOrderedAggs == 0 && grouping_is_hashable(parse->groupClause)); can_sort = grouping_is_sortable(parse->groupClause); if (can_hash && can_sort) diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index a5c343298e6b61df02e9a4485434722519ef6c88..56b432c9bb2008404157b27dbadd65b0b126c1a2 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/util/clauses.c,v 1.280 2009/12/14 02:15:52 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/util/clauses.c,v 1.281 2009/12/15 17:57:47 tgl Exp $ * * HISTORY * AUTHOR DATE MAJOR EVENT @@ -471,21 +471,22 @@ count_agg_clauses_walker(Node *node, AggClauseCounts *counts) HeapTuple aggTuple; Form_pg_aggregate aggform; Oid aggtranstype; - int i; ListCell *l; Assert(aggref->agglevelsup == 0); counts->numAggs++; - if (aggref->aggdistinct) - counts->numDistinctAggs++; + if (aggref->aggorder != NIL || aggref->aggdistinct != NIL) + counts->numOrderedAggs++; - /* extract argument types */ - numArguments = list_length(aggref->args); - inputTypes = (Oid *) palloc(sizeof(Oid) * numArguments); - i = 0; + /* extract argument types (ignoring any ORDER BY expressions) */ + inputTypes = (Oid *) palloc(sizeof(Oid) * list_length(aggref->args)); + numArguments = 0; foreach(l, aggref->args) { - inputTypes[i++] = exprType((Node *) lfirst(l)); + TargetEntry *tle = (TargetEntry *) lfirst(l); + + if (!tle->resjunk) + inputTypes[numArguments++] = exprType((Node *) tle->expr); } /* fetch aggregate transition datatype from pg_aggregate */ diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 49ec8481846abd71a1e366d04696b67718a71cb0..0f47e7bae9b98f2bab4581ef5ce82f1e2dcf6d1a 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -17,7 +17,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/backend/parser/analyze.c,v 1.396 2009/10/31 01:41:31 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/parser/analyze.c,v 1.397 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -828,13 +828,13 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt) stmt->sortClause, &qry->targetList, true /* fix unknowns */, - false /* not window function */); + false /* allow SQL92 rules */); qry->groupClause = transformGroupClause(pstate, stmt->groupClause, &qry->targetList, qry->sortClause, - false /* not window function */); + false /* allow SQL92 rules */); if (stmt->distinctClause == NIL) { @@ -846,7 +846,8 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt) /* We had SELECT DISTINCT */ qry->distinctClause = transformDistinctClause(pstate, &qry->targetList, - qry->sortClause); + qry->sortClause, + false); qry->hasDistinctOn = false; } else @@ -1044,7 +1045,7 @@ transformValuesClause(ParseState *pstate, SelectStmt *stmt) stmt->sortClause, &qry->targetList, true /* fix unknowns */, - false /* not window function */); + false /* allow SQL92 rules */); qry->limitOffset = transformLimitClause(pstate, stmt->limitOffset, "OFFSET"); @@ -1298,7 +1299,7 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt) sortClause, &qry->targetList, false /* no unknowns expected */, - false /* not window function */); + false /* allow SQL92 rules */); pstate->p_rtable = list_truncate(pstate->p_rtable, sv_rtable_length); pstate->p_relnamespace = sv_relnamespace; diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 3257ccbbf839dbb437d521a5cab9763473461612..22449579d0f7372bc2fa8fdf5ddee02e14a4ccb6 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -11,7 +11,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/gram.y,v 2.696 2009/12/11 03:34:55 itagaki Exp $ + * $PostgreSQL: pgsql/src/backend/parser/gram.y,v 2.697 2009/12/15 17:57:47 tgl Exp $ * * HISTORY * AUTHOR DATE MAJOR EVENT @@ -8651,6 +8651,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("timezone"); n->args = list_make2($5, $1); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8711,6 +8712,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("like_escape"); n->args = list_make2($3, $5); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8725,6 +8727,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("like_escape"); n->args = list_make2($4, $6); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8739,6 +8742,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("like_escape"); n->args = list_make2($3, $5); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8753,6 +8757,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("like_escape"); n->args = list_make2($4, $6); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8766,6 +8771,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("similar_escape"); n->args = list_make2($4, makeNullAConst(-1)); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8778,6 +8784,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("similar_escape"); n->args = list_make2($4, $6); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8790,6 +8797,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("similar_escape"); n->args = list_make2($5, makeNullAConst(-1)); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -8802,6 +8810,7 @@ a_expr: c_expr { $$ = $1; } FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("similar_escape"); n->args = list_make2($5, $7); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9220,6 +9229,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9232,6 +9242,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = $3; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9244,6 +9255,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = list_make1($4); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = TRUE; @@ -9256,6 +9268,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = lappend($3, $6); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = TRUE; @@ -9263,11 +9276,25 @@ func_expr: func_name '(' ')' over_clause n->location = @1; $$ = (Node *)n; } - | func_name '(' ALL func_arg_list ')' over_clause + | func_name '(' func_arg_list sort_clause ')' over_clause + { + FuncCall *n = makeNode(FuncCall); + n->funcname = $1; + n->args = $3; + n->agg_order = $4; + n->agg_star = FALSE; + n->agg_distinct = FALSE; + n->func_variadic = FALSE; + n->over = $6; + n->location = @1; + $$ = (Node *)n; + } + | func_name '(' ALL func_arg_list opt_sort_clause ')' over_clause { FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = $4; + n->agg_order = $5; n->agg_star = FALSE; n->agg_distinct = FALSE; /* Ideally we'd mark the FuncCall node to indicate @@ -9275,19 +9302,20 @@ func_expr: func_name '(' ')' over_clause * for that in FuncCall at the moment. */ n->func_variadic = FALSE; - n->over = $6; + n->over = $7; n->location = @1; $$ = (Node *)n; } - | func_name '(' DISTINCT func_arg_list ')' over_clause + | func_name '(' DISTINCT func_arg_list opt_sort_clause ')' over_clause { FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = $4; + n->agg_order = $5; n->agg_star = FALSE; n->agg_distinct = TRUE; n->func_variadic = FALSE; - n->over = $6; + n->over = $7; n->location = @1; $$ = (Node *)n; } @@ -9306,6 +9334,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = $1; n->args = NIL; + n->agg_order = NIL; n->agg_star = TRUE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9366,6 +9395,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("now"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9437,6 +9467,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("current_user"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9449,6 +9480,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("current_user"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9461,6 +9493,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("session_user"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9473,6 +9506,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("current_user"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9485,6 +9519,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("current_database"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9497,6 +9532,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("current_schema"); n->args = NIL; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9511,6 +9547,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("date_part"); n->args = $3; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9528,6 +9565,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("overlay"); n->args = $3; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9541,6 +9579,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("position"); n->args = $3; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9556,6 +9595,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("substring"); n->args = $3; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9577,6 +9617,7 @@ func_expr: func_name '(' ')' over_clause */ n->funcname = SystemFuncName(((Value *)llast($5->names))->val.str); n->args = list_make1($3); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9592,6 +9633,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("btrim"); n->args = $4; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9604,6 +9646,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("ltrim"); n->args = $4; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9616,6 +9659,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("rtrim"); n->args = $4; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -9628,6 +9672,7 @@ func_expr: func_name '(' ')' over_clause FuncCall *n = makeNode(FuncCall); n->funcname = SystemFuncName("btrim"); n->args = $3; + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; @@ -11252,6 +11297,7 @@ makeOverlaps(List *largs, List *rargs, int location, core_yyscan_t yyscanner) errmsg("wrong number of parameters on right side of OVERLAPS expression"), parser_errposition(location))); n->args = list_concat(largs, rargs); + n->agg_order = NIL; n->agg_star = FALSE; n->agg_distinct = FALSE; n->func_variadic = FALSE; diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c index 05943969215a64c08c5534755feb5b074ecb1775..3ffa4f82605dddf0a2e4824973e5d3ddd9dde0af 100644 --- a/src/backend/parser/parse_agg.c +++ b/src/backend/parser/parse_agg.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.88 2009/06/11 14:49:00 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.89 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -19,8 +19,10 @@ #include "optimizer/tlist.h" #include "optimizer/var.h" #include "parser/parse_agg.h" +#include "parser/parse_clause.h" #include "parser/parsetree.h" #include "rewrite/rewriteManip.h" +#include "utils/builtins.h" #include "utils/lsyscache.h" @@ -43,15 +45,97 @@ static bool check_ungrouped_columns_walker(Node *node, * Finish initial transformation of an aggregate call * * parse_func.c has recognized the function as an aggregate, and has set - * up all the fields of the Aggref except agglevelsup. Here we must - * determine which query level the aggregate actually belongs to, set - * agglevelsup accordingly, and mark p_hasAggs true in the corresponding + * up all the fields of the Aggref except aggdistinct and agglevelsup. + * However, the args list is just bare expressions, and the aggorder list + * hasn't been transformed at all. + * + * Here we convert the args list into a targetlist by inserting TargetEntry + * nodes, and then transform the aggorder and agg_distinct specifications to + * produce lists of SortGroupClause nodes. (That might also result in adding + * resjunk expressions to the targetlist.) + * + * We must also determine which query level the aggregate actually belongs to, + * set agglevelsup accordingly, and mark p_hasAggs true in the corresponding * pstate level. */ void -transformAggregateCall(ParseState *pstate, Aggref *agg) +transformAggregateCall(ParseState *pstate, Aggref *agg, bool agg_distinct) { + List *tlist; + List *torder; + List *tdistinct = NIL; + AttrNumber attno; + int save_next_resno; int min_varlevel; + ListCell *lc; + + /* + * Transform the plain list of Exprs into a targetlist. We don't bother + * to assign column names to the entries. + */ + tlist = NIL; + attno = 1; + foreach(lc, agg->args) + { + Expr *arg = (Expr *) lfirst(lc); + TargetEntry *tle = makeTargetEntry(arg, attno++, NULL, false); + + tlist = lappend(tlist, tle); + } + + /* + * If we have an ORDER BY, transform it. This will add columns to the + * tlist if they appear in ORDER BY but weren't already in the arg list. + * They will be marked resjunk = true so we can tell them apart from + * regular aggregate arguments later. + * + * We need to mess with p_next_resno since it will be used to number any + * new targetlist entries. + */ + save_next_resno = pstate->p_next_resno; + pstate->p_next_resno = attno; + + torder = transformSortClause(pstate, + agg->aggorder, + &tlist, + true /* fix unknowns */, + true /* force SQL99 rules */); + + /* + * If we have DISTINCT, transform that to produce a distinctList. + */ + if (agg_distinct) + { + tdistinct = transformDistinctClause(pstate, &tlist, torder, true); + + /* + * Remove this check if executor support for hashed distinct for + * aggregates is ever added. + */ + foreach(lc, tdistinct) + { + SortGroupClause *sortcl = (SortGroupClause *) lfirst(lc); + + if (!OidIsValid(sortcl->sortop)) + { + Node *expr = get_sortgroupclause_expr(sortcl, tlist); + + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_FUNCTION), + errmsg("could not identify an ordering operator for type %s", + format_type_be(exprType(expr))), + errdetail("Aggregates with DISTINCT must be able to sort their inputs."), + parser_errposition(pstate, exprLocation(expr)))); + } + } + } + + /* Update the Aggref with the transformation results */ + agg->args = tlist; + agg->aggorder = torder; + agg->aggdistinct = tdistinct; + + pstate->p_next_resno = save_next_resno; /* * The aggregate's level is the same as the level of the lowest-level diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index 84443db8932dd6f18eef1e18e35ed0cf06b7762d..d6ee7c921c10cc187e8e94a51acc01b9aa2127c2 100644 --- a/src/backend/parser/parse_clause.c +++ b/src/backend/parser/parse_clause.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/parse_clause.c,v 1.193 2009/10/27 17:11:18 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_clause.c,v 1.194 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1440,7 +1440,7 @@ findTargetlistEntrySQL99(ParseState *pstate, Node *node, List **tlist) List * transformGroupClause(ParseState *pstate, List *grouplist, List **targetlist, List *sortClause, - bool isWindowFunc) + bool useSQL99) { List *result = NIL; ListCell *gl; @@ -1451,7 +1451,7 @@ transformGroupClause(ParseState *pstate, List *grouplist, TargetEntry *tle; bool found = false; - if (isWindowFunc) + if (useSQL99) tle = findTargetlistEntrySQL99(pstate, gexpr, targetlist); else tle = findTargetlistEntrySQL92(pstate, gexpr, targetlist, @@ -1510,15 +1510,15 @@ transformGroupClause(ParseState *pstate, List *grouplist, * ORDER BY items will be added to the targetlist (as resjunk columns) * if not already present, so the targetlist must be passed by reference. * - * This is also used for window ORDER BY clauses (which act almost the - * same, but are always interpreted per SQL99 rules). + * This is also used for window and aggregate ORDER BY clauses (which act + * almost the same, but are always interpreted per SQL99 rules). */ List * transformSortClause(ParseState *pstate, List *orderlist, List **targetlist, bool resolveUnknown, - bool isWindowFunc) + bool useSQL99) { List *sortlist = NIL; ListCell *olitem; @@ -1528,7 +1528,7 @@ transformSortClause(ParseState *pstate, SortBy *sortby = (SortBy *) lfirst(olitem); TargetEntry *tle; - if (isWindowFunc) + if (useSQL99) tle = findTargetlistEntrySQL99(pstate, sortby->node, targetlist); else tle = findTargetlistEntrySQL92(pstate, sortby->node, targetlist, @@ -1598,12 +1598,12 @@ transformWindowDefinitions(ParseState *pstate, windef->orderClause, targetlist, true /* fix unknowns */, - true /* window function */); + true /* force SQL99 rules */); partitionClause = transformGroupClause(pstate, windef->partitionClause, targetlist, orderClause, - true /* window function */); + true /* force SQL99 rules */); /* * And prepare the new WindowClause. @@ -1684,10 +1684,14 @@ transformWindowDefinitions(ParseState *pstate, * and allows the user to choose the equality semantics used by DISTINCT, * should she be working with a datatype that has more than one equality * operator. + * + * is_agg is true if we are transforming an aggregate(DISTINCT ...) + * function call. This does not affect any behavior, only the phrasing + * of error messages. */ List * transformDistinctClause(ParseState *pstate, - List **targetlist, List *sortClause) + List **targetlist, List *sortClause, bool is_agg) { List *result = NIL; ListCell *slitem; @@ -1716,6 +1720,8 @@ transformDistinctClause(ParseState *pstate, if (tle->resjunk) ereport(ERROR, (errcode(ERRCODE_INVALID_COLUMN_REFERENCE), + is_agg ? + errmsg("in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list") : errmsg("for SELECT DISTINCT, ORDER BY expressions must appear in select list"), parser_errposition(pstate, exprLocation((Node *) tle->expr)))); diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 8513102a9c35bde6cbd3234ee2a5ed10f4c69335..7dc20b4d6a7c4ba6a973ccdab1251ef54303e528 100644 --- a/src/backend/parser/parse_expr.c +++ b/src/backend/parser/parse_expr.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/parse_expr.c,v 1.250 2009/11/13 19:48:20 heikki Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_expr.c,v 1.251 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -431,7 +431,7 @@ transformIndirection(ParseState *pstate, Node *basenode, List *indirection) newresult = ParseFuncOrColumn(pstate, list_make1(n), list_make1(result), - false, false, false, + NIL, false, false, false, NULL, true, location); if (newresult == NULL) unknown_attribute(pstate, result, strVal(n), location); @@ -598,7 +598,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref) node = ParseFuncOrColumn(pstate, list_make1(makeString(colname)), list_make1(node), - false, false, false, + NIL, false, false, false, NULL, true, cref->location); } break; @@ -643,7 +643,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref) node = ParseFuncOrColumn(pstate, list_make1(makeString(colname)), list_make1(node), - false, false, false, + NIL, false, false, false, NULL, true, cref->location); } break; @@ -701,7 +701,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref) node = ParseFuncOrColumn(pstate, list_make1(makeString(colname)), list_make1(node), - false, false, false, + NIL, false, false, false, NULL, true, cref->location); } break; @@ -1221,6 +1221,7 @@ transformFuncCall(ParseState *pstate, FuncCall *fn) return ParseFuncOrColumn(pstate, fn->funcname, targs, + fn->agg_order, fn->agg_star, fn->agg_distinct, fn->func_variadic, diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index d86bd1732f9c095889e415361825cd320a346c49..9ece26a6bac709040106a6a434e5c943ab57858f 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/parse_func.c,v 1.218 2009/10/31 01:41:31 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_func.c,v 1.219 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -55,10 +55,12 @@ static Node *ParseComplexProjection(ParseState *pstate, char *funcname, * reporting a no-such-function error. * * The argument expressions (in fargs) must have been transformed already. + * But the agg_order expressions, if any, have not been. */ Node * ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, - bool agg_star, bool agg_distinct, bool func_variadic, + List *agg_order, bool agg_star, bool agg_distinct, + bool func_variadic, WindowDef *over, bool is_column, int location) { Oid rettype; @@ -170,8 +172,9 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, * the "function call" could be a projection. We also check that there * wasn't any aggregate or variadic decoration, nor an argument name. */ - if (nargs == 1 && !agg_star && !agg_distinct && over == NULL && - !func_variadic && argnames == NIL && list_length(funcname) == 1) + if (nargs == 1 && agg_order == NIL && !agg_star && !agg_distinct && + over == NULL && !func_variadic && argnames == NIL && + list_length(funcname) == 1) { Oid argtype = actual_arg_types[0]; @@ -240,6 +243,12 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, errmsg("DISTINCT specified, but %s is not an aggregate function", NameListToString(funcname)), parser_errposition(pstate, location))); + if (agg_order != NIL) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("ORDER BY specified, but %s is not an aggregate function", + NameListToString(funcname)), + parser_errposition(pstate, location))); if (over) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), @@ -372,9 +381,12 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, aggref->aggfnoid = funcid; aggref->aggtype = rettype; + /* args and aggorder will be modified by transformAggregateCall */ aggref->args = fargs; + aggref->aggorder = agg_order; + /* aggdistinct will be set by transformAggregateCall */ aggref->aggstar = agg_star; - aggref->aggdistinct = agg_distinct; + /* agglevelsup will be set by transformAggregateCall */ aggref->location = location; /* @@ -407,7 +419,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, parser_errposition(pstate, location))); /* parse_agg.c does additional aggregate-specific processing */ - transformAggregateCall(pstate, aggref); + transformAggregateCall(pstate, aggref, agg_distinct); retval = (Node *) aggref; } @@ -453,6 +465,15 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, NameListToString(funcname)), parser_errposition(pstate, location))); + /* + * ordered aggs not allowed in windows yet + */ + if (agg_order != NIL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("aggregate ORDER BY is not implemented for window functions"), + parser_errposition(pstate, location))); + if (retset) ereport(ERROR, (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION), diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c index f21fcb610443580a185b5635eaf8a8e1a63bfbb5..9184ca8c7637933712e5c2f3a62ec922d96ebcb6 100644 --- a/src/backend/parser/parse_utilcmd.c +++ b/src/backend/parser/parse_utilcmd.c @@ -19,7 +19,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/backend/parser/parse_utilcmd.c,v 2.31 2009/12/07 05:22:22 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_utilcmd.c,v 2.32 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -388,6 +388,7 @@ transformColumnDefinition(ParseState *pstate, CreateStmtContext *cxt, funccallnode = makeNode(FuncCall); funccallnode->funcname = SystemFuncName("nextval"); funccallnode->args = list_make1(castnode); + funccallnode->agg_order = NIL; funccallnode->agg_star = false; funccallnode->agg_distinct = false; funccallnode->func_variadic = false; diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 71e614d7004d9e2b5417f6a29287a222a13463a7..856a2f75806f8ea4ec15c024a611abd614d111b0 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -9,7 +9,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/utils/adt/ruleutils.c,v 1.316 2009/12/07 05:22:22 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/utils/adt/ruleutils.c,v 1.317 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -5442,32 +5442,44 @@ get_agg_expr(Aggref *aggref, deparse_context *context) { StringInfo buf = context->buf; Oid argtypes[FUNC_MAX_ARGS]; + List *arglist; int nargs; ListCell *l; - if (list_length(aggref->args) > FUNC_MAX_ARGS) - ereport(ERROR, - (errcode(ERRCODE_TOO_MANY_ARGUMENTS), - errmsg("too many arguments"))); + /* Extract the regular arguments, ignoring resjunk stuff for the moment */ + arglist = NIL; nargs = 0; foreach(l, aggref->args) { - Node *arg = (Node *) lfirst(l); + TargetEntry *tle = (TargetEntry *) lfirst(l); + Node *arg = (Node *) tle->expr; Assert(!IsA(arg, NamedArgExpr)); + if (tle->resjunk) + continue; + if (nargs >= FUNC_MAX_ARGS) /* paranoia */ + ereport(ERROR, + (errcode(ERRCODE_TOO_MANY_ARGUMENTS), + errmsg("too many arguments"))); argtypes[nargs] = exprType(arg); + arglist = lappend(arglist, arg); nargs++; } appendStringInfo(buf, "%s(%s", generate_function_name(aggref->aggfnoid, nargs, NIL, argtypes, NULL), - aggref->aggdistinct ? "DISTINCT " : ""); + (aggref->aggdistinct != NIL) ? "DISTINCT " : ""); /* aggstar can be set only in zero-argument aggregates */ if (aggref->aggstar) appendStringInfoChar(buf, '*'); else - get_rule_expr((Node *) aggref->args, context, true); + get_rule_expr((Node *) arglist, context, true); + if (aggref->aggorder != NIL) + { + appendStringInfoString(buf, " ORDER BY "); + get_rule_orderby(aggref->aggorder, aggref->args, false, context); + } appendStringInfoChar(buf, ')'); } diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h index 3fe284addbac271134734afdebe921b6836b93b9..14663347c51978d278c97072a1b8bdc2acc91a80 100644 --- a/src/include/catalog/catversion.h +++ b/src/include/catalog/catversion.h @@ -37,7 +37,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/catalog/catversion.h,v 1.557 2009/12/11 03:34:56 itagaki Exp $ + * $PostgreSQL: pgsql/src/include/catalog/catversion.h,v 1.558 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -53,6 +53,6 @@ */ /* yyyymmddN */ -#define CATALOG_VERSION_NO 200912111 +#define CATALOG_VERSION_NO 200912151 #endif diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 351b1c971826a9212efe645a344fe2152ea3d669..265c4e2a18acc9296b0b75c04748f91cf2e5ce6c 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -13,7 +13,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/nodes/parsenodes.h,v 1.418 2009/12/11 03:34:56 itagaki Exp $ + * $PostgreSQL: pgsql/src/include/nodes/parsenodes.h,v 1.419 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -264,9 +264,10 @@ typedef struct TypeCast /* * FuncCall - a function or aggregate invocation * + * agg_order (if not NIL) indicates we saw 'foo(... ORDER BY ...)'. * agg_star indicates we saw a 'foo(*)' construct, while agg_distinct - * indicates we saw 'foo(DISTINCT ...)'. In either case, the construct - * *must* be an aggregate call. Otherwise, it might be either an + * indicates we saw 'foo(DISTINCT ...)'. In any of these cases, the + * construct *must* be an aggregate call. Otherwise, it might be either an * aggregate or some other kind of function. However, if OVER is present * it had better be an aggregate or window function. */ @@ -275,6 +276,7 @@ typedef struct FuncCall NodeTag type; List *funcname; /* qualified name of function */ List *args; /* the arguments (list of exprs) */ + List *agg_order; /* ORDER BY (list of SortBy) */ bool agg_star; /* argument was really '*' */ bool agg_distinct; /* arguments were labeled DISTINCT */ bool func_variadic; /* last argument was labeled VARIADIC */ diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index 0320e231553595d511a3b72e818dcb033e75b97b..58d286ab9c8a287fa56f0a821f370038c0784dbe 100644 --- a/src/include/nodes/primnodes.h +++ b/src/include/nodes/primnodes.h @@ -10,7 +10,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/nodes/primnodes.h,v 1.151 2009/10/08 02:39:24 tgl Exp $ + * $PostgreSQL: pgsql/src/include/nodes/primnodes.h,v 1.152 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -209,16 +209,29 @@ typedef struct Param /* * Aggref + * + * The aggregate's args list is a targetlist, ie, a list of TargetEntry nodes + * (before Postgres 8.5 it was just bare expressions). The non-resjunk TLEs + * represent the aggregate's regular arguments (if any) and resjunk TLEs can + * be added at the end to represent ORDER BY expressions that are not also + * arguments. As in a top-level Query, the TLEs can be marked with + * ressortgroupref indexes to let them be referenced by SortGroupClause + * entries in the aggorder and/or aggdistinct lists. This represents ORDER BY + * and DISTINCT operations to be applied to the aggregate input rows before + * they are passed to the transition function. The grammar only allows a + * simple "DISTINCT" specifier for the arguments, but we use the full + * query-level representation to allow more code sharing. */ typedef struct Aggref { Expr xpr; Oid aggfnoid; /* pg_proc Oid of the aggregate */ Oid aggtype; /* type Oid of result of the aggregate */ - List *args; /* arguments to the aggregate */ - Index agglevelsup; /* > 0 if agg belongs to outer query */ + List *args; /* arguments and sort expressions */ + List *aggorder; /* ORDER BY (list of SortGroupClause) */ + List *aggdistinct; /* DISTINCT (list of SortGroupClause) */ bool aggstar; /* TRUE if argument list was really '*' */ - bool aggdistinct; /* TRUE if it's agg(DISTINCT ...) */ + Index agglevelsup; /* > 0 if agg belongs to outer query */ int location; /* token location, or -1 if unknown */ } Aggref; diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h index cb42bd4a958d4df2ef92619c4030db6567dfddc4..7eb7721758a263f8b08a681259a4198f9a920a00 100644 --- a/src/include/optimizer/clauses.h +++ b/src/include/optimizer/clauses.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/optimizer/clauses.h,v 1.98 2009/06/11 14:49:11 momjian Exp $ + * $PostgreSQL: pgsql/src/include/optimizer/clauses.h,v 1.99 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -23,7 +23,7 @@ typedef struct { int numAggs; /* total number of aggregate calls */ - int numDistinctAggs; /* number that use DISTINCT */ + int numOrderedAggs; /* number that use DISTINCT or ORDER BY */ Size transitionSpace; /* for pass-by-ref transition data */ } AggClauseCounts; diff --git a/src/include/parser/parse_agg.h b/src/include/parser/parse_agg.h index 05ec677cb9cc08ac869e7682ce5ce1fe97b9123c..b7c25f4181147fa5f918a431ecb885af7b6960db 100644 --- a/src/include/parser/parse_agg.h +++ b/src/include/parser/parse_agg.h @@ -6,7 +6,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/parser/parse_agg.h,v 1.39 2009/06/11 14:49:11 momjian Exp $ + * $PostgreSQL: pgsql/src/include/parser/parse_agg.h,v 1.40 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -15,7 +15,8 @@ #include "parser/parse_node.h" -extern void transformAggregateCall(ParseState *pstate, Aggref *agg); +extern void transformAggregateCall(ParseState *pstate, Aggref *agg, + bool agg_distinct); extern void transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc, WindowDef *windef); diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h index 6bfb021f862d0ac820e9372be2eff08d913e2a83..1989b3e4e44ac5336a7cf18bba69056290a3f202 100644 --- a/src/include/parser/parse_clause.h +++ b/src/include/parser/parse_clause.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/parser/parse_clause.h,v 1.56 2009/08/27 20:08:02 tgl Exp $ + * $PostgreSQL: pgsql/src/include/parser/parse_clause.h,v 1.57 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -28,16 +28,16 @@ extern Node *transformLimitClause(ParseState *pstate, Node *clause, const char *constructName); extern List *transformGroupClause(ParseState *pstate, List *grouplist, List **targetlist, List *sortClause, - bool isWindowFunc); + bool useSQL99); extern List *transformSortClause(ParseState *pstate, List *orderlist, - List **targetlist, bool resolveUnknown, bool isWindowFunc); + List **targetlist, bool resolveUnknown, bool useSQL99); extern List *transformWindowDefinitions(ParseState *pstate, List *windowdefs, List **targetlist); extern List *transformDistinctClause(ParseState *pstate, - List **targetlist, List *sortClause); + List **targetlist, List *sortClause, bool is_agg); extern List *transformDistinctOnClause(ParseState *pstate, List *distinctlist, List **targetlist, List *sortClause); diff --git a/src/include/parser/parse_func.h b/src/include/parser/parse_func.h index 0a38f740b1c74477473d77343e8945a11d0b19cd..8521f25ceaacc0b8248cee91c54af4274d1a8599 100644 --- a/src/include/parser/parse_func.h +++ b/src/include/parser/parse_func.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/parser/parse_func.h,v 1.66 2009/10/08 02:39:25 tgl Exp $ + * $PostgreSQL: pgsql/src/include/parser/parse_func.h,v 1.67 2009/12/15 17:57:47 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -44,7 +44,8 @@ typedef enum extern Node *ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, - bool agg_star, bool agg_distinct, bool func_variadic, + List *agg_order, bool agg_star, bool agg_distinct, + bool func_variadic, WindowDef *over, bool is_column, int location); extern FuncDetailCode func_get_detail(List *funcname, diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out index a48b45007f34d9604b943b0667975afd087873b1..9669a52fbeaeb448aff76bb74a2d8a52075b0c49 100644 --- a/src/test/regress/expected/aggregates.out +++ b/src/test/regress/expected/aggregates.out @@ -529,3 +529,273 @@ select max(unique2), generate_series(1,3) as g from tenk1 order by g desc; 9999 | 1 (3 rows) +-- +-- Test combinations of DISTINCT and/or ORDER BY +-- +select array_agg(a order by b) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); + array_agg +----------- + {3,4,2,1} +(1 row) + +select array_agg(a order by a) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); + array_agg +----------- + {1,2,3,4} +(1 row) + +select array_agg(a order by a desc) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); + array_agg +----------- + {4,3,2,1} +(1 row) + +select array_agg(b order by a desc) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); + array_agg +----------- + {2,1,3,4} +(1 row) + +select array_agg(distinct a) + from (values (1),(2),(1),(3),(null),(2)) v(a); + array_agg +-------------- + {1,2,3,NULL} +(1 row) + +select array_agg(distinct a order by a) + from (values (1),(2),(1),(3),(null),(2)) v(a); + array_agg +-------------- + {1,2,3,NULL} +(1 row) + +select array_agg(distinct a order by a desc) + from (values (1),(2),(1),(3),(null),(2)) v(a); + array_agg +-------------- + {NULL,3,2,1} +(1 row) + +select array_agg(distinct a order by a desc nulls last) + from (values (1),(2),(1),(3),(null),(2)) v(a); + array_agg +-------------- + {3,2,1,NULL} +(1 row) + +-- multi-arg aggs, strict/nonstrict, distinct/order by +select aggfstr(a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + aggfstr +--------------------------------------- + {"(1,3,foo)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select aggfns(a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + aggfns +----------------------------------------------- + {"(1,3,foo)","(0,,)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select aggfstr(distinct a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + aggfstr +--------------------------------------- + {"(1,3,foo)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select aggfns(distinct a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + aggfns +----------------------------------------------- + {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select aggfstr(distinct a,b,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + aggfstr +--------------------------------------- + {"(3,1,baz)","(2,2,bar)","(1,3,foo)"} +(1 row) + +select aggfns(distinct a,b,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + aggfns +----------------------------------------------- + {"(3,1,baz)","(2,2,bar)","(1,3,foo)","(0,,)"} +(1 row) + +-- test specific code paths +select aggfns(distinct a,a,c order by c using ~<~,a) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; + aggfns +------------------------------------------------ + {"(2,2,bar)","(3,3,baz)","(1,1,foo)","(0,0,)"} +(1 row) + +select aggfns(distinct a,a,c order by c using ~<~) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; + aggfns +------------------------------------------------ + {"(2,2,bar)","(3,3,baz)","(1,1,foo)","(0,0,)"} +(1 row) + +select aggfns(distinct a,a,c order by a) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; + aggfns +------------------------------------------------ + {"(0,0,)","(1,1,foo)","(2,2,bar)","(3,3,baz)"} +(1 row) + +select aggfns(distinct a,b,c order by a,c using ~<~,b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; + aggfns +----------------------------------------------- + {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"} +(1 row) + +-- check node I/O via view creation and usage, also deparsing logic +create view agg_view1 as + select aggfns(a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); +select * from agg_view1; + aggfns +----------------------------------------------- + {"(1,3,foo)","(0,,)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +-------------------------------------------------------------------------------------------------------------------------------------------------------- + SELECT aggfns(v.a, v.b, v.c) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c); +(1 row) + +create or replace view agg_view1 as + select aggfns(distinct a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; +select * from agg_view1; + aggfns +----------------------------------------------- + {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + SELECT aggfns(DISTINCT v.a, v.b, v.c) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c), generate_series(1, 3) i(i); +(1 row) + +create or replace view agg_view1 as + select aggfns(distinct a,b,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; +select * from agg_view1; + aggfns +----------------------------------------------- + {"(3,1,baz)","(2,2,bar)","(1,3,foo)","(0,,)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + SELECT aggfns(DISTINCT v.a, v.b, v.c ORDER BY v.b) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c), generate_series(1, 3) i(i); +(1 row) + +create or replace view agg_view1 as + select aggfns(a,b,c order by b+1) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); +select * from agg_view1; + aggfns +----------------------------------------------- + {"(3,1,baz)","(2,2,bar)","(1,3,foo)","(0,,)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + SELECT aggfns(v.a, v.b, v.c ORDER BY (v.b + 1)) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c); +(1 row) + +create or replace view agg_view1 as + select aggfns(a,a,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); +select * from agg_view1; + aggfns +------------------------------------------------ + {"(3,3,baz)","(2,2,bar)","(1,1,foo)","(0,0,)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +--------------------------------------------------------------------------------------------------------------------------------------------------------------------- + SELECT aggfns(v.a, v.a, v.c ORDER BY v.b) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c); +(1 row) + +create or replace view agg_view1 as + select aggfns(a,b,c order by c using ~<~) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); +select * from agg_view1; + aggfns +----------------------------------------------- + {"(2,2,bar)","(3,1,baz)","(1,3,foo)","(0,,)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ + SELECT aggfns(v.a, v.b, v.c ORDER BY v.c USING ~<~ NULLS LAST) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c); +(1 row) + +create or replace view agg_view1 as + select aggfns(distinct a,b,c order by a,c using ~<~,b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; +select * from agg_view1; + aggfns +----------------------------------------------- + {"(0,,)","(1,3,foo)","(2,2,bar)","(3,1,baz)"} +(1 row) + +select pg_get_viewdef('agg_view1'::regclass); + pg_get_viewdef +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + SELECT aggfns(DISTINCT v.a, v.b, v.c ORDER BY v.a, v.c USING ~<~ NULLS LAST, v.b) AS aggfns FROM (VALUES (1,3,'foo'::text), (0,NULL::integer,NULL::text), (2,2,'bar'::text), (3,1,'baz'::text)) v(a, b, c), generate_series(1, 2) i(i); +(1 row) + +drop view agg_view1; +-- incorrect DISTINCT usage errors +select aggfns(distinct a,b,c order by i) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list +LINE 1: select aggfns(distinct a,b,c order by i) + ^ +select aggfns(distinct a,b,c order by a,b+1) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list +LINE 1: select aggfns(distinct a,b,c order by a,b+1) + ^ +select aggfns(distinct a,b,c order by a,b,i,c) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list +LINE 1: select aggfns(distinct a,b,c order by a,b,i,c) + ^ +select aggfns(distinct a,a,c order by a,b) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list +LINE 1: select aggfns(distinct a,a,c order by a,b) + ^ diff --git a/src/test/regress/expected/create_aggregate.out b/src/test/regress/expected/create_aggregate.out index 1c540b2d95650b4933f6c8d5c0f5d7e514df3139..448e319794c664645f80a3e52b4094f858d93b94 100644 --- a/src/test/regress/expected/create_aggregate.out +++ b/src/test/regress/expected/create_aggregate.out @@ -32,6 +32,10 @@ CREATE AGGREGATE newcnt ("any") ( sfunc = int8inc_any, stype = int8, initcond = '0' ); +COMMENT ON AGGREGATE nosuchagg (*) IS 'should fail'; +ERROR: aggregate nosuchagg(*) does not exist +COMMENT ON AGGREGATE newcnt (*) IS 'an agg(*) comment'; +COMMENT ON AGGREGATE newcnt ("any") IS 'an agg(any) comment'; -- multi-argument aggregate create function sum3(int8,int8,int8) returns int8 as 'select $1 + $2 + $3' language sql strict immutable; @@ -39,7 +43,19 @@ create aggregate sum2(int8,int8) ( sfunc = sum3, stype = int8, initcond = '0' ); -COMMENT ON AGGREGATE nosuchagg (*) IS 'should fail'; -ERROR: aggregate nosuchagg(*) does not exist -COMMENT ON AGGREGATE newcnt (*) IS 'an agg(*) comment'; -COMMENT ON AGGREGATE newcnt ("any") IS 'an agg(any) comment'; +-- multi-argument aggregates sensitive to distinct/order, strict/nonstrict +create type aggtype as (a integer, b integer, c text); +create function aggf_trans(aggtype[],integer,integer,text) returns aggtype[] +as 'select array_append($1,ROW($2,$3,$4)::aggtype)' +language sql strict immutable; +create function aggfns_trans(aggtype[],integer,integer,text) returns aggtype[] +as 'select array_append($1,ROW($2,$3,$4)::aggtype)' +language sql immutable; +create aggregate aggfstr(integer,integer,text) ( + sfunc = aggf_trans, stype = aggtype[], + initcond = '{}' +); +create aggregate aggfns(integer,integer,text) ( + sfunc = aggfns_trans, stype = aggtype[], + initcond = '{}' +); diff --git a/src/test/regress/output/misc.source b/src/test/regress/output/misc.source index 91e0d2b04d79dab77c3dab10c274e7a496c856f1..0effa4b853debd01679302852ce4360a25e46989 100644 --- a/src/test/regress/output/misc.source +++ b/src/test/regress/output/misc.source @@ -571,6 +571,7 @@ SELECT user_relns() AS user_relns a_star abstime_tbl aggtest + aggtype array_index_op_test array_op_test arrtest @@ -668,7 +669,7 @@ SELECT user_relns() AS user_relns toyemp varchar_tbl xacttest -(101 rows) +(102 rows) SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer'))); name diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql index 39b0187a1e86af56dda7d487f7537d96ca31a510..18f2e57b72b7fadc3895f87e8025a9fc66d2ad54 100644 --- a/src/test/regress/sql/aggregates.sql +++ b/src/test/regress/sql/aggregates.sql @@ -230,3 +230,128 @@ select max(unique2) from tenk1 order by 1; select max(unique2) from tenk1 order by max(unique2); select max(unique2) from tenk1 order by max(unique2)+1; select max(unique2), generate_series(1,3) as g from tenk1 order by g desc; + +-- +-- Test combinations of DISTINCT and/or ORDER BY +-- + +select array_agg(a order by b) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); +select array_agg(a order by a) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); +select array_agg(a order by a desc) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); +select array_agg(b order by a desc) + from (values (1,4),(2,3),(3,1),(4,2)) v(a,b); + +select array_agg(distinct a) + from (values (1),(2),(1),(3),(null),(2)) v(a); +select array_agg(distinct a order by a) + from (values (1),(2),(1),(3),(null),(2)) v(a); +select array_agg(distinct a order by a desc) + from (values (1),(2),(1),(3),(null),(2)) v(a); +select array_agg(distinct a order by a desc nulls last) + from (values (1),(2),(1),(3),(null),(2)) v(a); + +-- multi-arg aggs, strict/nonstrict, distinct/order by + +select aggfstr(a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); +select aggfns(a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + +select aggfstr(distinct a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; +select aggfns(distinct a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + +select aggfstr(distinct a,b,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; +select aggfns(distinct a,b,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + +-- test specific code paths + +select aggfns(distinct a,a,c order by c using ~<~,a) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; +select aggfns(distinct a,a,c order by c using ~<~) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; +select aggfns(distinct a,a,c order by a) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; +select aggfns(distinct a,b,c order by a,c using ~<~,b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; + +-- check node I/O via view creation and usage, also deparsing logic + +create view agg_view1 as + select aggfns(a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +create or replace view agg_view1 as + select aggfns(distinct a,b,c) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +create or replace view agg_view1 as + select aggfns(distinct a,b,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,3) i; + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +create or replace view agg_view1 as + select aggfns(a,b,c order by b+1) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +create or replace view agg_view1 as + select aggfns(a,a,c order by b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +create or replace view agg_view1 as + select aggfns(a,b,c order by c using ~<~) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c); + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +create or replace view agg_view1 as + select aggfns(distinct a,b,c order by a,c using ~<~,b) + from (values (1,3,'foo'),(0,null,null),(2,2,'bar'),(3,1,'baz')) v(a,b,c), + generate_series(1,2) i; + +select * from agg_view1; +select pg_get_viewdef('agg_view1'::regclass); + +drop view agg_view1; + +-- incorrect DISTINCT usage errors + +select aggfns(distinct a,b,c order by i) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +select aggfns(distinct a,b,c order by a,b+1) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +select aggfns(distinct a,b,c order by a,b,i,c) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; +select aggfns(distinct a,a,c order by a,b) + from (values (1,1,'foo')) v(a,b,c), generate_series(1,2) i; diff --git a/src/test/regress/sql/create_aggregate.sql b/src/test/regress/sql/create_aggregate.sql index 9d02048e24579a2b8064a315ac55b8a42309102b..61742482523fc140e623484e78b691fa1382be9e 100644 --- a/src/test/regress/sql/create_aggregate.sql +++ b/src/test/regress/sql/create_aggregate.sql @@ -38,6 +38,10 @@ CREATE AGGREGATE newcnt ("any") ( initcond = '0' ); +COMMENT ON AGGREGATE nosuchagg (*) IS 'should fail'; +COMMENT ON AGGREGATE newcnt (*) IS 'an agg(*) comment'; +COMMENT ON AGGREGATE newcnt ("any") IS 'an agg(any) comment'; + -- multi-argument aggregate create function sum3(int8,int8,int8) returns int8 as 'select $1 + $2 + $3' language sql strict immutable; @@ -47,6 +51,23 @@ create aggregate sum2(int8,int8) ( initcond = '0' ); -COMMENT ON AGGREGATE nosuchagg (*) IS 'should fail'; -COMMENT ON AGGREGATE newcnt (*) IS 'an agg(*) comment'; -COMMENT ON AGGREGATE newcnt ("any") IS 'an agg(any) comment'; +-- multi-argument aggregates sensitive to distinct/order, strict/nonstrict +create type aggtype as (a integer, b integer, c text); + +create function aggf_trans(aggtype[],integer,integer,text) returns aggtype[] +as 'select array_append($1,ROW($2,$3,$4)::aggtype)' +language sql strict immutable; + +create function aggfns_trans(aggtype[],integer,integer,text) returns aggtype[] +as 'select array_append($1,ROW($2,$3,$4)::aggtype)' +language sql immutable; + +create aggregate aggfstr(integer,integer,text) ( + sfunc = aggf_trans, stype = aggtype[], + initcond = '{}' +); + +create aggregate aggfns(integer,integer,text) ( + sfunc = aggfns_trans, stype = aggtype[], + initcond = '{}' +);