  1. Jun 21, 2012
    • Fix memory leak in ARRAY(SELECT ...) subqueries. · 66567ab2
      Tom Lane authored
      Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which
      I think can only happen if the sub-select is embedded in a larger,
      correlated subquery) would leak memory for the duration of the query,
      due to not reclaiming the array generated in the previous execution.
      Per bug #6698 from Armando Miraglia.  Diagnosis and fix idea by Heikki,
      patch itself by me.
      
      This has been like this all along, so back-patch to all supported versions.
  2. Jun 10, 2012
  3. Mar 20, 2012
    • Restructure SELECT INTO's parsetree representation into CreateTableAsStmt. · 9dbf2b7d
      Tom Lane authored
      Making this operation look like a utility statement seems generally a good
      idea, and particularly so in light of the desire to provide command
      triggers for utility statements.  The original choice of representing it as
      SELECT with an IntoClause appendage had metastasized into rather a lot of
      places, unfortunately, so that this patch is a great deal more complicated
      than one might at first expect.
      
      In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
      subcommands required restructuring some EXPLAIN-related APIs.  Add-on code
      that calls ExplainOnePlan or ExplainOneUtility, or uses
      ExplainOneQuery_hook, will need adjustment.
      
      Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
      which formerly were accepted though undocumented, are no longer accepted.
      The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
      The CREATE RULE case doesn't seem to have much real-world use (since the
      rule would work only once before failing with "table already exists"),
      so we'll not bother with that one.
      
      Both SELECT INTO and CREATE TABLE AS still return a command tag of
      "SELECT nnnn".  There was some discussion of returning "CREATE TABLE nnnn",
      but for the moment backwards compatibility wins the day.
      
      Andres Freund and Tom Lane
  4. Jan 26, 2012
  5. Jan 02, 2012
  6. Dec 07, 2011
    • Create a "sort support" interface API for faster sorting. · c6e3ac11
      Tom Lane authored
      This patch creates an API whereby a btree index opclass can optionally
      provide non-SQL-callable support functions for sorting.  In the initial
      patch, we only use this to provide a directly-callable comparator function,
      which can be invoked with a bit less overhead than the traditional
      SQL-callable comparator.  While that should be of value in itself, the real
      reason for doing this is to provide a datatype-extensible framework for
      more aggressive optimizations, as in Peter Geoghegan's recent work.
      
      Robert Haas and Tom Lane
  7. Oct 11, 2011
    • Rearrange the implementation of index-only scans. · a0185461
      Tom Lane authored
      This commit changes index-only scans so that data is read directly from the
      index tuple without first generating a faux heap tuple.  The only immediate
      benefit is that indexes on system columns (such as OID) can be used in
      index-only scans, but this is necessary infrastructure if we are ever to
      support index-only scans on expression indexes.  The executor is now ready
      for that, though the planner still needs substantial work to recognize
      the possibility.
      
      To do this, Vars in index-only plan nodes have to refer to index columns
      not heap columns.  I introduced a new special varno, INDEX_VAR, to mark
      such Vars to avoid confusion.  (In passing, this commit renames the two
      existing special varnos to OUTER_VAR and INNER_VAR.)  This allows
      ruleutils.c to handle them with logic similar to what we use for subplan
      reference Vars.
      
      Since index-only scans are now fundamentally different from regular
      indexscans so far as their expression subtrees are concerned, I also chose
      to change them to have their own plan node type (and hence, their own
      executor source file).
  8. Oct 08, 2011
    • Support index-only scans using the visibility map to avoid heap fetches. · a2822fb9
      Tom Lane authored
      When a btree index contains all columns required by the query, and the
      visibility map shows that all tuples on a target heap page are
      visible-to-all, we don't need to fetch that heap page.  This patch depends
      on the previous patches that made the visibility map reliable.
      
      There's a fair amount left to do here, notably trying to figure out a less
      chintzy way of estimating the cost of an index-only scan, but the core
      functionality seems ready to commit.
      
      Robert Haas and Ibrar Ahmed, with some previous work by Heikki Linnakangas.
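The heap-skipping rule described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: `index_only_scan`, the `(value, page_no, offset)` entry layout, the `all_visible_pages` set standing in for the visibility map, and the `tuple_is_visible` callback are all hypothetical names invented for the example.

```python
def index_only_scan(index_entries, all_visible_pages, tuple_is_visible):
    """Return (values, heap_fetches) for a toy index-only scan.

    index_entries: iterable of (value, page_no, offset) in index order.
    all_visible_pages: set of heap page numbers whose tuples are all
        visible to everyone (stand-in for the visibility map).
    tuple_is_visible: callable(page_no, offset) -> bool, a hypothetical
        heap visit that checks visibility for this snapshot.
    """
    values = []
    heap_fetches = 0
    for value, page_no, offset in index_entries:
        if page_no in all_visible_pages:
            # The index already holds the needed columns and the page is
            # known all-visible, so the heap page is never touched.
            values.append(value)
            continue
        heap_fetches += 1
        if tuple_is_visible(page_no, offset):
            values.append(value)
    return values, heap_fetches
```

In this model the benefit grows with the share of pages marked all-visible; a page that is not marked costs the same heap visit as a regular index scan.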
  9. Sep 22, 2011
    • Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps. · f1972723
      Tom Lane authored
      This provides information about the numbers of tuples that were visited
      but not returned by table scans, as well as the numbers of join tuples
      that were considered and discarded within a join plan node.
      
      There is still some discussion going on about the best way to report counts
      for outer-join situations, but I think most of what's in the patch would
      not change if we revise that, so I'm going to go ahead and commit it as-is.
      
      Documentation changes to follow (they weren't in the submitted patch
      either).
      
      Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
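As a rough illustration of the counter this commit describes, the sketch below tracks rows that a scan visited but did not return. The function name `scan_with_filter` and its return shape are assumptions made for the example; the actual instrumentation lives in the executor's C code.

```python
def scan_with_filter(rows, predicate):
    """Return (passing_rows, rows_removed) for one filtered scan."""
    passing = []
    rows_removed = 0
    for row in rows:
        if predicate(row):
            passing.append(row)
        else:
            rows_removed += 1  # visited, but rejected by the filter
    return passing, rows_removed
```

A large `rows_removed` relative to the number of rows returned suggests that the scan is reading far more data than the query needs.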
  10. Sep 01, 2011
  11. Aug 22, 2011
    • Fix trigger WHEN conditions when both BEFORE and AFTER triggers exist. · b33f78df
      Tom Lane authored
      Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER
      ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger
      fired for the same update.  Fix by not trying to overload the use of
      estate->es_trig_tuple_slot.  Per report from Yoran Heling.
      
      Back-patch to 9.0, when trigger WHEN conditions were introduced.
  12. Jul 04, 2011
  13. Apr 13, 2011
    • Pass collations to functions in FunctionCallInfoData, not FmgrInfo. · d64713df
      Tom Lane authored
      Since collation is effectively an argument, not a property of the function,
      FmgrInfo is really the wrong place for it; and this becomes critical in
      cases where a cached FmgrInfo is used for varying purposes that might need
      different collation settings.  Fix by passing it in FunctionCallInfoData
      instead.  In particular this allows a clean fix for bug #5970 (record_cmp
      not working).  This requires touching a bit more code than the original
      method, but nobody ever thought that collations would not be an invasive
      patch...
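The design point here, that collation behaves like a per-call argument rather than a property of the cached lookup, can be sketched as follows. Everything in this block is hypothetical: `FnLookup` loosely plays the role of a cached FmgrInfo, `call_with_collation` the role of per-call data, and the two collation names are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FnLookup:
    """Cached lookup result: which function to call, nothing per-call."""
    fn: Callable


def text_lt(collation, a, b):
    """Toy comparison whose result depends on the collation passed in."""
    if collation == "case_insensitive":
        return a.casefold() < b.casefold()
    return a < b  # plain code point order


def call_with_collation(lookup, collation, *args):
    """Supply the collation with each call instead of storing it in the lookup."""
    return lookup.fn(collation, *args)
```

Because the lookup object is immutable, one cached `FnLookup(text_lt)` can serve callers that need different collations without any of them overwriting the others' setting.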
  14. Apr 10, 2011
  15. Feb 27, 2011
    • Refactor the executor's API to support data-modifying CTEs better. · a874fe7b
      Tom Lane authored
      The originally committed patch for modifying CTEs didn't interact well
      with EXPLAIN, as noted by myself, and also had corner-case problems with
      triggers, as noted by Dean Rasheed.  Those problems show it is really not
      practical for ExecutorEnd to call any user-defined code; so split the
      cleanup duties out into a new function ExecutorFinish, which must be called
      between the last ExecutorRun call and ExecutorEnd.  Some Asserts have been
      added to these functions to help verify correct usage.
      
      It is no longer necessary for callers of the executor to call
      AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now
      done by ExecutorStart/ExecutorFinish respectively.  If you really need to
      suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to
      ExecutorStart.
      
      Also, refactor portal commit processing to allow for the possibility that
      PortalDrop will invoke user-defined code.  I think this is not actually
      necessary just yet, since the portal-execution-strategy logic forces any
      non-pure-SELECT query to be run to completion before we will consider
      committing.  But it seems like good future-proofing.
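The required call order (start, any number of runs, finish, then end) can be modeled as a small state machine. This is a sketch only: `ToyExecutor` and its methods are hypothetical stand-ins, and the real functions take query descriptors and flags that are not modeled here. The sketch raises `RuntimeError` where the commit message mentions Asserts.

```python
class ToyExecutor:
    """Toy model of the start -> run* -> finish -> end protocol."""

    def __init__(self):
        self.state = "new"
        self.pending_after_triggers = []
        self.fired = []

    def _expect(self, wanted):
        if self.state != wanted:
            raise RuntimeError(f"expected state {wanted!r}, got {self.state!r}")

    def start(self):
        self._expect("new")
        self.state = "started"

    def run(self, rows):
        self._expect("started")
        # Processing rows may queue work that runs user-defined code later.
        self.pending_after_triggers.extend(rows)

    def finish(self):
        self._expect("started")
        # All user-defined code runs here, before any teardown happens.
        while self.pending_after_triggers:
            self.fired.append(self.pending_after_triggers.pop(0))
        self.state = "finished"

    def end(self):
        self._expect("finished")  # pure cleanup, no user code
        self.state = "ended"
```

Keeping user-visible work in `finish` means `end` can release resources without worrying that a trigger will still need them.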
  16. Feb 26, 2011
    • Support data-modifying commands (INSERT/UPDATE/DELETE) in WITH. · 389af951
      Tom Lane authored
      This patch implements data-modifying WITH queries according to the
      semantics that the updates all happen with the same command counter value,
      and in an unspecified order.  Therefore one WITH clause can't see the
      effects of another, nor can the outer query see the effects other than
      through the RETURNING values.  And attempts to do conflicting updates will
      have unpredictable results.  We'll need to document all that.
      
      This commit just fixes the code; documentation updates are waiting on
      author.
      
      Marko Tiikkaja and Hitoshi Harada
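The visibility rule stated above (every part of the query sees the same starting state, and effects surface only through RETURNING) can be sketched with a toy single-table model. The function `run_with_delete` and its arguments are hypothetical and model only one DELETE inside WITH.

```python
def run_with_delete(table, delete_where, outer_query):
    """Toy model of: WITH d AS (DELETE ... RETURNING *) <outer query>.

    table: list of rows before the statement runs.
    delete_where: predicate selecting the rows the WITH clause deletes.
    outer_query: callable(snapshot_rows, returning_rows) -> result.
    Returns (table_after_statement, outer_query_result).
    """
    snapshot = list(table)  # every part of the statement sees this state
    returning = [row for row in snapshot if delete_where(row)]
    table_after = [row for row in snapshot if not delete_where(row)]
    # The outer query reads the snapshot, not table_after; the only way
    # it learns about the deletion is through the RETURNING rows.
    return table_after, outer_query(snapshot, returning)
```

In this model an outer query that counts the table's rows still sees the pre-delete count, while counting the RETURNING rows gives the number deleted.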
  17. Feb 20, 2011
  18. Feb 10, 2011
    • Fix improper matching of resjunk column names for FOR UPDATE in subselect. · e617f0d7
      Tom Lane authored
      Flattening of subquery range tables during setrefs.c could lead to the
      rangetable indexes in PlanRowMark nodes not matching up with the column
      names previously assigned to the corresponding resjunk ctid (resp. tableoid
      or wholerow) columns.  Typical symptom would be either a "cannot extract
      system attribute from virtual tuple" error or an Assert failure.  This
      wasn't a problem before 9.0 because we didn't support FOR UPDATE below the
      top query level, and so the final flattening could never renumber an RTE
      that was relevant to FOR UPDATE.  Fix by using a plan-tree-wide unique
      number for each PlanRowMark to label the associated resjunk columns, so
      that the number need not change during flattening.
      
      Per report from David Johnston (though I'm darned if I can see how this got
      past initial testing of the relevant code).  Back-patch to 9.0.
  19. Jan 13, 2011
    • Fix PlanRowMark/ExecRowMark structures to handle inheritance correctly. · d487afbb
      Tom Lane authored
      In an inherited UPDATE/DELETE, each target table has its own subplan,
      because it might have a column set different from other targets.  This
      means that the resjunk columns we add to support EvalPlanQual might be
      at different physical column numbers in each subplan.  The EvalPlanQual
      rewrite I did for 9.0 failed to account for this, resulting in possible
      misbehavior or even crashes during concurrent updates to the same row,
      as seen in a recent report from Gordon Shannon.  Revise the data structure
      so that we track resjunk column numbers separately for each subplan.
      
      I also chose to move responsibility for identifying the physical column
      numbers back to executor startup, instead of assuming that numbers derived
      during preprocess_targetlist would stay valid throughout subsequent
      massaging of the plan.  That's a bit slower, so we might want to consider
      undoing it someday; but it would complicate the patch considerably and
      didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
  20. Jan 01, 2011
  21. Dec 31, 2010
    • Move symbols for ExecMergeJoin's state machine into nodeMergejoin.c. · 7b464015
      Tom Lane authored
      There's no reason for these values to be known anywhere else.  After
      doing this, executor/execdefs.h is vestigial and can be removed.
    • Support RIGHT and FULL OUTER JOIN in hash joins. · f4e4b327
      Tom Lane authored
      This is advantageous first because it allows us to hash the smaller table
      regardless of the outer-join type, and second because hash join can be more
      flexible than merge join in dealing with arbitrary join quals in a FULL
      join.  For merge join all the join quals have to be mergejoinable, but hash
      join will work so long as there's at least one hashjoinable qual --- the
      others can be any condition.  (This is true essentially because we don't
      keep per-inner-tuple match flags in merge join, while hash join can do so.)
      
      To do this, we need a has-it-been-matched flag for each tuple in the
      hashtable, not just one for the current outer tuple.  The key idea that
      makes this practical is that we can store the match flag in the tuple's
      infomask, since there are lots of bits there that are of no interest for a
      MinimalTuple.  So we aren't increasing the size of the hashtable at all for
      the feature.
      
      To write this without turning the hash code into even more of a pile of
      spaghetti than it already was, I rewrote ExecHashJoin in a state-machine
      style, similar to ExecMergeJoin.  Other than that decision, it was pretty
      straightforward.
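The per-tuple match flag can be illustrated with a toy FULL OUTER hash join. This is a minimal sketch, not the executor's implementation: `full_hash_join` is a hypothetical name, the flag is stored next to each row in a Python list rather than in a spare infomask bit, and NULL join keys are not modeled.

```python
from collections import defaultdict


def full_hash_join(outer, inner, key):
    """Toy FULL OUTER hash join: hash `inner`, then probe with `outer`."""
    table = defaultdict(list)
    for row in inner:
        table[key(row)].append([row, False])  # [tuple, matched flag]

    result = []
    for o in outer:
        matched = False
        for entry in table.get(key(o), []):
            entry[1] = True  # remember that this inner tuple found a partner
            matched = True
            result.append((o, entry[0]))
        if not matched:
            result.append((o, None))  # outer row, inner side null-extended

    # Final pass over the hash table: emit inner tuples that never matched.
    for entries in table.values():
        for row, was_matched in entries:
            if not was_matched:
                result.append((None, row))
    return result
```

The final pass is what a RIGHT or FULL join needs and what a single flag for the current outer tuple cannot provide.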
  22. Dec 03, 2010
    • Create core infrastructure for KNNGIST. · d583f10b
      Tom Lane authored
      This is a heavily revised version of builtin_knngist_core-0.9.  The
      ordering operators are no longer mixed in with actual quals, which would
      have confused not only humans but significant parts of the planner.
      Instead, ordering operators are carried separately throughout planning and
      execution.
      
      Since the API for ambeginscan and amrescan functions had to be changed
      anyway, this commit takes the opportunity to rationalize that a bit.
      RelationGetIndexScan no longer forces a premature index_rescan call;
      instead, callers of index_beginscan must call index_rescan too.  Aside from
      making the AM-side initialization logic a bit less peculiar, this has the
      advantage that we do not make a useless extra am_rescan call when there are
      runtime key values.  AMs formerly could not assume that the key values
      passed to amrescan were actually valid; now they can.
      
      Teodor Sigaev and Tom Lane
  23. Nov 01, 2010
    • Avoid using a local FunctionCallInfoData struct in ExecMakeFunctionResult and related routines. · 0811ff20
      Tom Lane authored
      
      We already had a redundant FunctionCallInfoData struct in FuncExprState,
      but were using that copy only in set-returning-function cases, to avoid
      keeping function evaluation state in the expression tree for the benefit
      of plpgsql's "simple expression" logic.  But of course that didn't work
      anyway.  Given the recent fixes in plpgsql there is no need to have two
      separate behaviors here.  Getting rid of the local FunctionCallInfoData
      structs should make things a little faster (because we don't need to do
      InitFunctionCallInfoData each time), and it also makes for a noticeable
      reduction in stack space consumption during recursive calls.
  24. Oct 26, 2010
  25. Oct 14, 2010
    • Support MergeAppend plans, to allow sorted output from append relations. · 11cad29c
      Tom Lane authored
      This patch eliminates the former need to sort the output of an Append scan
      when an ordered scan of an inheritance tree is wanted.  This should be
      particularly useful for fast-start cases such as queries with LIMIT.
      
      Original patch by Greg Stark, with further hacking by Hans-Jurgen Schonig,
      Robert Haas, and Tom Lane.
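The fast-start benefit can be seen in a small model that merges already-sorted child scans lazily. This is a sketch using Python's `heapq.merge`, not the executor's own merge logic; `merge_append` and `child_scan` are hypothetical helpers, and the `log` list exists only to count how many rows each child actually produced.

```python
import heapq
from itertools import islice


def child_scan(rows, log, name):
    """Yield rows from one pre-sorted child, recording each fetch in `log`."""
    for row in rows:
        log.append(name)
        yield row


def merge_append(children, key=None):
    """Lazily merge sorted child scans into a single sorted stream."""
    return heapq.merge(*children, key=key)


def first_rows(children, limit):
    """Model of a LIMIT on top of the merge: stop after `limit` rows."""
    return list(islice(merge_append(children), limit))
```

With a small limit, only a few rows are pulled from each child, whereas sorting the output of a plain append would have to read every child to the end first.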
  26. Sep 20, 2010
  27. Jul 28, 2010
    • Fix potential failure when hashing the output of a subplan that produces a pass-by-reference datatype with a nontrivial projection step. · 133924e1
      Tom Lane authored
      We were using the same memory context for the projection operation as for
      the temporary context used by the hashtable routines in execGrouping.c.
      However, the hashtable routines feel free to reset their temp context at
      any time, which'd lead to destroying input data that was still needed.
      Report and diagnosis by Tao Ma.
      
      Back-patch to 8.1, where the problem was introduced by the changes that
      allowed us to work with "virtual" tuples instead of materializing intermediate
      tuple values everywhere.  The earlier code looks quite similar, but it doesn't
      suffer the problem because the data gets copied into another context as a
      result of having to materialize ExecProject's output tuple.
  28. Feb 26, 2010
  29. Feb 12, 2010
    • Extend the set of frame options supported for window functions. · ec4be2ee
      Tom Lane authored
      This patch allows the frame to start from CURRENT ROW (in either RANGE or
      ROWS mode), and it also adds support for ROWS n PRECEDING and ROWS n FOLLOWING
      start and end points.  (RANGE value PRECEDING/FOLLOWING isn't there yet ---
      the grammar works, but that's all.)
      
      Hitoshi Harada, reviewed by Pavel Stehule
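The ROWS-mode frame bounds this commit adds can be sketched as index arithmetic over one partition. The helpers `rows_frame` and `moving_sum` are hypothetical; offsets are encoded as signed integers (negative for n PRECEDING, 0 for CURRENT ROW, positive for n FOLLOWING), which is an encoding chosen for this example only.

```python
def rows_frame(n_rows, current, start, end):
    """Return (first, last) row indexes of a ROWS frame, clamped to the partition."""
    first = max(0, current + start)
    last = min(n_rows - 1, current + end)
    return first, last


def moving_sum(values, start, end):
    """Sum each row's frame; an empty frame yields None, like SQL NULL."""
    out = []
    for i in range(len(values)):
        first, last = rows_frame(len(values), i, start, end)
        frame = values[first:last + 1] if first <= last else []
        out.append(sum(frame) if frame else None)
    return out
```

For example, `moving_sum(values, -1, 1)` corresponds to a frame of ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING, and `moving_sum(values, 0, 1)` to a frame that starts at CURRENT ROW.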
  30. Jan 06, 2010
    • Add support for doing FULL JOIN ON FALSE. · 90f4c2d9
      Tom Lane authored
      While this is really a rather peculiar variant of UNION ALL, and so
      wouldn't likely get written directly
      as-is, it's possible for it to arise as a result of simplification of
      less-obviously-silly queries.  In particular, now that we can do flattening
      of subqueries that have constant outputs and are underneath an outer join,
      it's possible for the case to result from simplification of queries of the
      type exhibited in bug #5263.  Back-patch to 8.4 to avoid a functionality
      regression for this type of query.
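Why a FULL JOIN with a constant-false condition resembles UNION ALL can be shown in a few lines. This is an illustrative sketch; `full_join_on_false` is a hypothetical name and `None` stands in for SQL NULL.

```python
def full_join_on_false(left, right):
    """Toy FULL JOIN ... ON FALSE: no pair ever matches.

    Every left row is emitted with a null-extended right side, followed
    by every right row with a null-extended left side.
    """
    return [(l, None) for l in left] + [(None, r) for r in right]
```

The output has one row per input row from either side, as UNION ALL would, but each row keeps the columns of both inputs with one side null.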
  31. Jan 02, 2010
  32. Dec 15, 2009
  33. Dec 07, 2009
  34. Nov 20, 2009
    • Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be checked to determine whether the trigger should be fired. · 7fc0f062
      Tom Lane authored
      
      For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
      triggers it can provide a noticeable performance improvement, since queuing of
      a deferred trigger event and re-fetching of the row(s) at end of statement can
      be short-circuited if the trigger does not need to be fired.
      
      Takahiro Itagaki, reviewed by KaiGai Kohei.
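The performance argument for AFTER triggers can be sketched as a check made at queue time. This is a toy model under stated assumptions: `queue_after_update_events` is a hypothetical name, and each update is represented as an `(old, new)` pair of row values.

```python
def queue_after_update_events(updates, when=None):
    """Return the AFTER-trigger events that would be queued for `updates`.

    updates: iterable of (old_row, new_row) pairs.
    when: optional predicate(old_row, new_row); None means no WHEN clause.
    """
    queue = []
    for old, new in updates:
        if when is not None and not when(old, new):
            # Condition is false: nothing is queued, so there is no event
            # to store and no row to re-fetch at end of statement.
            continue
        queue.append((old, new))
    return queue
```

A condition such as "fire only when the row actually changed" keeps no-op updates out of the queue entirely.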
  35. Oct 26, 2009
    • Re-implement EvalPlanQual processing to improve its performance and eliminate a lot of strange behaviors that occurred in join cases. · 9f2ee8f2
      Tom Lane authored
      We now identify the
      "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
      UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
      appropriate row into each scan node in the rechecking plan, forcing it to emit
      only that one row.  The former behavior could rescan the whole of each joined
      relation for each recheck, which was terrible for performance, and what's much
      worse could result in duplicated output tuples.
      
      Also, the original implementation of EvalPlanQual could not re-use the recheck
      execution tree --- it had to go through a full executor init and shutdown for
      every row to be tested.  To avoid this overhead, I've associated a special
      runtime Param with each LockRows or ModifyTable plan node, and arranged to
      make every scan node below such a node depend on that Param.  Thus, by
      signaling a change in that Param, the EPQ machinery can just rescan the
      already-built test plan.
      
      This patch also adds a prohibition on set-returning functions in the
      targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
      duplicate-output-tuple problem.  It seems fairly reasonable since the
      other restrictions on SELECT FOR UPDATE are meant to ensure that there
      is a unique correspondence between source tuples and result tuples,
      which an output SRF destroys as much as anything else does.
  36. Oct 12, 2009
    • Move the handling of SELECT FOR UPDATE locking and rechecking out of execMain.c and into a new plan node type LockRows. · 0adaf4cb
      Tom Lane authored
      Like the recent change
      to put table updating into a ModifyTable plan node, this increases planning
      flexibility by allowing the operations to occur below the top level of the
      plan tree.  It's necessary in any case to restore the previous behavior of
      having FOR UPDATE locking occur before ModifyTable does.
      
      This partially refactors EvalPlanQual to allow multiple rows-under-test
      to be inserted into the EPQ machinery before starting an EPQ test query.
      That isn't sufficient to fix EPQ's general bogosity in the face of plans
      that return multiple rows per test row, though.  Since this patch is
      mostly about getting some plan node infrastructure in place and not about
      fixing ten-year-old bugs, I will leave EPQ improvements for another day.
      
      Another behavioral change that we could now think about is doing FOR UPDATE
      before LIMIT, but that too seems like it should be treated as a follow-on
      patch.
  37. Oct 10, 2009
    • Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. · 8a5849b7
      Tom Lane authored
      They are now handled by a new plan node type called ModifyTable, which is
      placed at the top of the plan tree.  In itself this change doesn't do much,
      except perhaps make the handling of RETURNING lists and inherited UPDATEs a
      tad less klugy.  But it is necessary preparation for the intended extension of
      allowing RETURNING queries inside WITH.
      
      Marko Tiikkaja
  38. Sep 27, 2009
    • Replace the array-style TupleTable data structure with a simple List of TupleTableSlot nodes. · f92e8a4b
      Tom Lane authored
      This eliminates the need to count in advance
      how many Slots will be needed, which seems more than worth the small
      increase in the amount of palloc traffic during executor startup.
      
      The ExecCountSlots infrastructure is now all dead code, but I'll remove it
      in a separate commit for clarity.
      
      Per a comment from Robert Haas.