Skip to content
Snippets Groups Projects
  1. Jun 29, 2016
    • Tom Lane's avatar
      Fix match_foreign_keys_to_quals for FKs linking to unused rtable entries. · b32e6350
      Tom Lane authored
      Since get_relation_foreign_keys doesn't try to determine whether RTEs
      are actually part of the query semantics, it might make FK info records
      linking to RTEs that won't have a RelOptInfo at all.  Cope with that.
      Per bug #14219 from Andrew Gierth.
      
      Report: <20160629183338.1397.43514@wrigleys.postgresql.org>
      b32e6350
  2. Jun 18, 2016
    • Tom Lane's avatar
      Restore foreign-key-aware estimation of join relation sizes. · 100340e2
      Tom Lane authored
      This patch provides a new implementation of the logic added by commit
      137805f8 and later removed by 77ba6108.  It differs from the original
      primarily in expending much less effort per joinrel in large queries,
      which it accomplishes by doing most of the matching work once per query not
      once per joinrel.  Hopefully, it's also less buggy and better commented.
      The never-documented enable_fkey_estimates GUC remains gone.
      
      There remains work to be done to make the selectivity estimates account
      for nulls in FK referencing columns; but that was true of the original
      patch as well.  We may be able to address this point later in beta.
      In the meantime, any error should be in the direction of overestimating
      rather than underestimating joinrel sizes, which seems like the direction
      we want to err in.
      
      Tomas Vondra and Tom Lane
      
      Discussion: <31041.1465069446@sss.pgh.pa.us>
      100340e2
  3. Apr 22, 2016
    • Tom Lane's avatar
      Fix planner failure with full join in RHS of left join. · 80f66a9a
      Tom Lane authored
      Given a left join containing a full join in its righthand side, with
      the left join's joinclause referencing only one side of the full join
      (in a non-strict fashion, so that the full join doesn't get simplified),
      the planner could fail with "failed to build any N-way joins" or related
      errors.  This happened because the full join was seen as overlapping the
      left join's RHS, and then recent changes within join_is_legal() caused
      that function to conclude that the full join couldn't validly be formed.
      Rather than try to rejigger join_is_legal() yet more to allow this,
      I think it's better to fix initsplan.c so that the required join order
      is explicit in the SpecialJoinInfo data structure.  The previous coding
      there essentially ignored full joins, relying on the fact that we don't
      flatten them in the joinlist data structure to preserve their ordering.
      That's sufficient to prevent a wrong plan from being formed, but as this
      example shows, it's not sufficient to ensure that the right plan will
      be formed.  We need to work a bit harder to ensure that the right plan
      looks sane according to the SpecialJoinInfos.
      
      Per bug #14105 from Vojtech Rylko.  This was apparently induced by
      commit 8703059c (though now that I've seen it, I wonder whether there
      are related cases that could have failed before that); so back-patch
      to all active branches.  Unfortunately, that patch also went into 9.0,
      so this bug is a regression that won't be fixed in that branch.
      80f66a9a
  4. Mar 14, 2016
    • Tom Lane's avatar
      Rethink representation of PathTargets. · 307c7885
      Tom Lane authored
      In commit 19a54114 I did not make PathTarget a subtype of Node,
      and embedded a RelOptInfo's reltarget directly into it rather than having
      a separately-allocated Node.  In hindsight that was misguided
      micro-optimization, enabled by the fact that at that point we didn't have
      any Paths with custom PathTargets.  Now that PathTarget processing has
      been fleshed out some more, it's easier to see that it's better to have
      PathTarget as an indepedent Node type, even if it does cost us one more
      palloc to create a RelOptInfo.  So change it while we still can.
      
      This commit just changes the representation, without doing anything more
      interesting than that.
      307c7885
  5. Mar 10, 2016
    • Tom Lane's avatar
      Give pull_var_clause() reject/recurse/return behavior for WindowFuncs too. · c82c92b1
      Tom Lane authored
      All along, this function should have treated WindowFuncs in a manner
      similar to Aggrefs, ie with an option whether or not to recurse into them.
      By not considering the case, it was always recursing, which is OK for most
      callers (although I suspect that the case in prepare_sort_from_pathkeys
      might represent a bug).  But now we need return-without-recursing behavior
      as well.  There are also more than a few callers that should never see a
      WindowFunc, and now we'll get some error checking on that.
      c82c92b1
    • Tom Lane's avatar
      Refactor pull_var_clause's API to make it less tedious to extend. · 364a9f47
      Tom Lane authored
      In commit 1d97c19a and later c1d9579d, we extended
      pull_var_clause's API by adding enum-type arguments.  That's sort of a pain
      to maintain, though, because it means every time we add a new behavior we
      must touch every last one of the call sites, even if there's a reasonable
      default behavior that most of them could use.  Let's switch over to using a
      bitmask of flags, instead; that seems more maintainable and might save a
      nanosecond or two as well.  This commit changes no behavior in itself,
      though I'm going to follow it up with one that does add a new behavior.
      
      In passing, remove flatten_tlist(), which has not been used since 9.1
      and would otherwise need the same API changes.
      
      Removing these enums means that optimizer/tlist.h no longer needs to
      depend on optimizer/var.h.  Changing that caused a number of C files to
      need addition of #include "optimizer/var.h" (probably we can thank old
      runs of pgrminclude for that); but on balance it seems like a good change
      anyway.
      364a9f47
  6. Feb 19, 2016
    • Tom Lane's avatar
      Add an explicit representation of the output targetlist to Paths. · 19a54114
      Tom Lane authored
      Up to now, there's been an assumption that all Paths for a given relation
      compute the same output column set (targetlist).  However, there are good
      reasons to remove that assumption.  For example, an indexscan on an
      expression index might be able to return the value of an expensive function
      "for free".  While we have the ability to generate such a plan today in
      simple cases, we don't have a way to model that it's cheaper than a plan
      that computes the function from scratch, nor a way to create such a plan
      in join cases (where the function computation would normally happen at
      the topmost join node).  Also, we need this so that we can have Paths
      representing post-scan/join steps, where the targetlist may well change
      from one step to the next.  Therefore, invent a "struct PathTarget"
      representing the columns we expect a plan step to emit.  It's convenient
      to include the output tuple width and tlist evaluation cost in this struct,
      and there will likely be additional fields in future.
      
      While Path nodes that actually do have custom outputs will need their own
      PathTargets, it will still be true that most Paths for a given relation
      will compute the same tlist.  To reduce the overhead added by this patch,
      keep a "default PathTarget" in RelOptInfo, and allow Paths that compute
      that column set to just point to their parent RelOptInfo's reltarget.
      (In the patch as committed, actually every Path is like that, since we
      do not yet have any cases of custom PathTargets.)
      
      I took this opportunity to provide some more-honest costing of
      PlaceHolderVar evaluation.  Up to now, the assumption that "scan/join
      reltargetlists have cost zero" was applied not only to Vars, where it's
      reasonable, but also PlaceHolderVars where it isn't.  Now, we add the eval
      cost of a PlaceHolderVar's expression to the first plan level where it can
      be computed, by including it in the PathTarget cost field and adding that
      to the cost estimates for Paths.  This isn't perfect yet but it's much
      better than before, and there is a way forward to improve it more.  This
      costing change affects the join order chosen for a couple of the regression
      tests, changing expected row ordering.
      19a54114
  7. Jan 08, 2016
    • Tom Lane's avatar
      Delay creation of subplan tlist until after create_plan(). · c44d0138
      Tom Lane authored
      Once upon a time it was necessary for grouping_planner() to determine
      the tlist it wanted from the scan/join plan subtree before it called
      query_planner(), because query_planner() would actually make a Plan using
      that.  But we refactored things a long time ago to delay construction of
      the Plan tree till later, so there's no need to build that tlist until
      (and indeed unless) we're ready to plaster it onto the Plan.  The only
      thing query_planner() cares about is what Vars are going to be needed for
      the tlist, and it can perfectly well get that by looking at the real tlist
      rather than some masticated version.
      
      Well, actually, there is one minor glitch in that argument, which is that
      make_subplanTargetList also adds Vars appearing only in HAVING to the
      tlist it produces.  So now we have to account for HAVING explicitly in
      build_base_rel_tlists.  But that just adds a few lines of code, and
      I doubt it moves the needle much on processing time; we might be doing
      pull_var_clause() twice on the havingQual, but before we had it scanning
      dummy tlist entries instead.
      
      This is a very small down payment on rationalizing grouping_planner
      enough so it can be refactored.
      c44d0138
  8. Jan 02, 2016
  9. Dec 11, 2015
    • Tom Lane's avatar
      Get rid of the planner's LateralJoinInfo data structure. · 4fcf4845
      Tom Lane authored
      I originally modeled this data structure on SpecialJoinInfo, but after
      commit acfcd45c that looks like a pretty poor decision.
      All we really need is relid sets identifying laterally-referenced rels;
      and most of the time, what we want to know about includes indirect lateral
      references, a case the LateralJoinInfo data was unsuited to compute with
      any efficiency.  The previous commit redefined RelOptInfo.lateral_relids
      as the transitive closure of lateral references, so that it easily supports
      checking indirect references.  For the places where we really do want just
      direct references, add a new RelOptInfo field direct_lateral_relids, which
      is easily set up as a copy of lateral_relids before we perform the
      transitive closure calculation.  Then we can just drop lateral_info_list
      and LateralJoinInfo and the supporting code.  This makes the planner's
      handling of lateral references noticeably more efficient, and shorter too.
      
      Such a change can't be back-patched into stable branches for fear of
      breaking extensions that might be looking at the planner's data structures;
      but it seems not too late to push it into 9.5, so I've done so.
      4fcf4845
    • Tom Lane's avatar
      Still more fixes for planner's handling of LATERAL references. · acfcd45c
      Tom Lane authored
      More fuzz testing by Andreas Seltenreich exposed that the planner did not
      cope well with chains of lateral references.  If relation X references Y
      laterally, and Y references Z laterally, then we will have to scan X on the
      inside of a nestloop with Z, so for all intents and purposes X is laterally
      dependent on Z too.  The planner did not understand this and would generate
      intermediate joins that could not be used.  While that was usually harmless
      except for wasting some planning cycles, under the right circumstances it
      would lead to "failed to build any N-way joins" or "could not devise a
      query plan" planner failures.
      
      To fix that, convert the existing per-relation lateral_relids and
      lateral_referencers relid sets into their transitive closures; that is,
      they now show all relations on which a rel is directly or indirectly
      laterally dependent.  This not only fixes the chained-reference problem
      but allows some of the relevant tests to be made substantially simpler
      and faster, since they can be reduced to simple bitmap manipulations
      instead of searches of the LateralJoinInfo list.
      
      Also, when a PlaceHolderVar that is due to be evaluated at a join contains
      lateral references, we should treat those references as indirect lateral
      dependencies of each of the join's base relations.  This prevents us from
      trying to join any individual base relations to the lateral reference
      source before the join is formed, which again cannot work.
      
      Andreas' testing also exposed another oversight in the "dangerous
      PlaceHolderVar" test added in commit 85e5e222.  Simply rejecting
      unsafe join paths in joinpath.c is insufficient, because in some cases
      we will end up rejecting *all* possible paths for a particular join, again
      leading to "could not devise a query plan" failures.  The restriction has
      to be known also to join_is_legal and its cohort functions, so that they
      will not select a join for which that will happen.  I chose to move the
      supporting logic into joinrels.c where the latter functions are.
      
      Back-patch to 9.3 where LATERAL support was introduced.
      acfcd45c
  10. Aug 06, 2015
    • Tom Lane's avatar
      Further fixes for degenerate outer join clauses. · 8703059c
      Tom Lane authored
      Further testing revealed that commit f69b4b94 was still a few
      bricks shy of a load: minor tweaking of the previous test cases resulted
      in the same wrong-outer-join-order problem coming back.  After study
      I concluded that my previous changes in make_outerjoininfo() were just
      accidentally masking the problem, and should be reverted in favor of
      forcing syntactic join order whenever an upper outer join's predicate
      doesn't mention a lower outer join's LHS.  This still allows the
      chained-outer-joins style that is the normally optimizable case.
      
      I also tightened things up some more in join_is_legal().  It seems to me
      on review that what's really happening in the exception case where we
      ignore a mismatched special join is that we're allowing the proposed join
      to associate into the RHS of the outer join we're comparing it to.  As
      such, we should *always* insist that the proposed join be a left join,
      which eliminates a bunch of rather dubious argumentation.  The case where
      we weren't enforcing that was the one that was already known buggy anyway
      (it had a violatable Assert before the aforesaid commit) so it hardly
      deserves a lot of deference.
      
      Back-patch to all active branches, like the previous patch.  The added
      regression test case failed in all branches back to 9.1, and I think it's
      only an unrelated change in costing calculations that kept 9.0 from
      choosing a broken plan.
      8703059c
  11. Aug 02, 2015
    • Tom Lane's avatar
      Fix some planner issues with degenerate outer join clauses. · f69b4b94
      Tom Lane authored
      An outer join clause that didn't actually reference the RHS (perhaps only
      after constant-folding) could confuse the join order enforcement logic,
      leading to wrong query results.  Also, nested occurrences of such things
      could trigger an Assertion that on reflection seems incorrect.
      
      Per fuzz testing by Andreas Seltenreich.  The practical use of such cases
      seems thin enough that it's not too surprising we've not heard field
      reports about it.
      
      This has been broken for a long time, so back-patch to all active branches.
      f69b4b94
  12. Jul 25, 2015
    • Tom Lane's avatar
      Redesign tablesample method API, and do extensive code review. · dd7a8f66
      Tom Lane authored
      The original implementation of TABLESAMPLE modeled the tablesample method
      API on index access methods, which wasn't a good choice because, without
      specialized DDL commands, there's no way to build an extension that can
      implement a TSM.  (Raw inserts into system catalogs are not an acceptable
      thing to do, because we can't undo them during DROP EXTENSION, nor will
      pg_upgrade behave sanely.)  Instead adopt an API more like procedural
      language handlers or foreign data wrappers, wherein the only SQL-level
      support object needed is a single handler function identified by having
      a special return type.  This lets us get rid of the supporting catalog
      altogether, so that no custom DDL support is needed for the feature.
      
      Adjust the API so that it can support non-constant tablesample arguments
      (the original coding assumed we could evaluate the argument expressions at
      ExecInitSampleScan time, which is undesirable even if it weren't outright
      unsafe), and discourage sampling methods from looking at invisible tuples.
      Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
      within and across queries, as required by the SQL standard, and deal more
      honestly with methods that can't support that requirement.
      
      Make a full code-review pass over the tablesample additions, and fix
      assorted bugs, omissions, infelicities, and cosmetic issues (such as
      failure to put the added code stanzas in a consistent ordering).
      Improve EXPLAIN's output of tablesample plans, too.
      
      Back-patch to 9.5 so that we don't have to support the original API
      in production.
      dd7a8f66
  13. Apr 25, 2015
    • Tom Lane's avatar
      Prevent improper reordering of antijoins vs. outer joins. · 3cf86860
      Tom Lane authored
      An outer join appearing within the RHS of an antijoin can't commute with
      the antijoin, but somehow I missed teaching make_outerjoininfo() about
      that.  In Teodor Sigaev's recent trouble report, this manifests as a
      "could not find RelOptInfo for given relids" error within eqjoinsel();
      but I think silently wrong query results are possible too, if the planner
      misorders the joins and doesn't happen to trigger any internal consistency
      checks.  It's broken as far back as we had antijoins, so back-patch to all
      supported branches.
      3cf86860
  14. Mar 12, 2015
    • Tom Lane's avatar
      Improve planner's cost estimation in the presence of semijoins. · b5572269
      Tom Lane authored
      If we have a semijoin, say
      	SELECT * FROM x WHERE x1 IN (SELECT y1 FROM y)
      and we're estimating the cost of a parameterized indexscan on x, the number
      of repetitions of the indexscan should not be taken as the size of y; it'll
      really only be the number of distinct values of y1, because the only valid
      plan with y on the outside of a nestloop would require y to be unique-ified
      before joining it to x.  Most of the time this doesn't make that much
      difference, but sometimes it can lead to drastically underestimating the
      cost of the indexscan and hence choosing a bad plan, as pointed out by
      David Kubečka.
      
      Fixing this is a bit difficult because parameterized indexscans are costed
      out quite early in the planning process, before we have the information
      that would be needed to call estimate_num_groups() and thereby estimate the
      number of distinct values of the join column(s).  However we can move the
      code that extracts a semijoin RHS's unique-ification columns, so that it's
      done in initsplan.c rather than on-the-fly in create_unique_path().  That
      shouldn't make any difference speed-wise and it's really a bit cleaner too.
      
      The other bit of information we need is the size of the semijoin RHS,
      which is easy if it's a single relation (we make those estimates before
      considering indexscan costs) but problematic if it's a join relation.
      The solution adopted here is just to use the product of the sizes of the
      join component rels.  That will generally be an overestimate, but since
      estimate_num_groups() only uses this input as a clamp, an overestimate
      shouldn't hurt us too badly.  In any case we don't allow this new logic
      to produce a value larger than we would have chosen before, so that at
      worst an overestimate leaves us no wiser than we were before.
      b5572269
  15. Jan 06, 2015
  16. May 06, 2014
    • Bruce Momjian's avatar
      pgindent run for 9.4 · 0a783200
      Bruce Momjian authored
      This includes removing tabs after periods in C comments, which was
      applied to back branches, so this change should not effect backpatching.
      0a783200
  17. Jan 30, 2014
    • Tom Lane's avatar
      Fix bogus handling of "postponed" lateral quals. · 043f6ff0
      Tom Lane authored
      When pulling a "postponed" qual from a LATERAL subquery up into the quals
      of an outer join, we must make sure that the postponed qual is included
      in those seen by make_outerjoininfo().  Otherwise we might compute a
      too-small min_lefthand or min_righthand for the outer join, leading to
      "JOIN qualification cannot refer to other relations" failures from
      distribute_qual_to_rels.  Subtler errors in the created plan seem possible,
      too, if the extra qual would only affect join ordering constraints.
      
      Per bug #9041 from David Leverton.  Back-patch to 9.3.
      043f6ff0
  18. Jan 07, 2014
  19. Nov 22, 2013
    • Tom Lane's avatar
      Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. · 784e762e
      Tom Lane authored
      This patch adds the ability to write TABLE( function1(), function2(), ...)
      as a single FROM-clause entry.  The result is the concatenation of the
      first row from each function, followed by the second row from each
      function, etc; with NULLs inserted if any function produces fewer rows than
      others.  This is believed to be a much more useful behavior than what
      Postgres currently does with multiple SRFs in a SELECT list.
      
      This syntax also provides a reasonable way to combine use of column
      definition lists with WITH ORDINALITY: put the column definition list
      inside TABLE(), where it's clear that it doesn't control the ordinality
      column as well.
      
      Also implement SQL-compliant multiple-argument UNNEST(), by turning
      UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
      
      The SQL standard specifies TABLE() with only a single function, not
      multiple functions, and it seems to require an implicit UNNEST() which is
      not what this patch does.  There may be something wrong with that reading
      of the spec, though, because if it's right then the spec's TABLE() is just
      a pointless alternative spelling of UNNEST().  After further review of
      that, we might choose to adopt a different syntax for what this patch does,
      but in any case this functionality seems clearly worthwhile.
      
      Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
      significantly revised by me
      784e762e
  20. Nov 15, 2013
    • Tom Lane's avatar
      Compute correct em_nullable_relids in get_eclass_for_sort_expr(). · f3b3b8d5
      Tom Lane authored
      Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr
      must be able to compute valid em_nullable_relids for any new equivalence
      class members it creates.  I'd worried about this in the commit message
      for db9f0e1d, but claimed that it wasn't a
      problem because multi-member ECs should already exist when it runs.  That
      is transparently wrong, though, because this function is also called by
      initialize_mergeclause_eclasses, which runs during deconstruct_jointree.
      The example given in the bug report (which the new regression test item
      is based upon) fails because the COALESCE() expression is first seen by
      initialize_mergeclause_eclasses rather than process_equivalence.
      
      Fixing this requires passing the appropriate nullable_relids set to
      get_eclass_for_sort_expr, and it requires new code to compute that set
      for top-level expressions such as ORDER BY, GROUP BY, etc.  We store
      the top-level nullable_relids in a new field in PlannerInfo to avoid
      computing it many times.  In the back branches, I've added the new
      field at the end of the struct to minimize ABI breakage for planner
      plugins.  There doesn't seem to be a good alternative to changing
      get_eclass_for_sort_expr's API signature, though.  There probably aren't
      any third-party extensions calling that function directly; moreover,
      if there are, they probably need to think about what to pass for
      nullable_relids anyway.
      
      Back-patch to 9.2, like the previous patch in this area.
      f3b3b8d5
  21. Aug 19, 2013
    • Tom Lane's avatar
      Fix qual-clause-misplacement issues with pulled-up LATERAL subqueries. · c64de21e
      Tom Lane authored
      In an example such as
      SELECT * FROM
        i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true;
      it is safe to pull up the LATERAL subquery into its parent, but we must
      then treat the "i.n = j.n" clause as a qual clause of the LEFT JOIN.  The
      previous coding in deconstruct_recurse mistakenly labeled the clause as
      "is_pushed_down", resulting in wrong semantics if the clause were applied
      at the join node, as per an example submitted awhile ago by Jeremy Evans.
      To fix, postpone processing of such clauses until we return back up to
      the appropriate recursion depth in deconstruct_recurse.
      
      In addition, tighten the is-safe-to-pull-up checks in is_simple_subquery;
      we previously missed the possibility that the LATERAL subquery might itself
      contain an outer join that makes lateral references in lower quals unsafe.
      
      A regression test case equivalent to Jeremy's example was already in my
      commit of yesterday, but was giving the wrong results because of this
      bug.  This patch fixes the expected output for that, and also adds a
      test case for the second problem.
      c64de21e
  22. Aug 18, 2013
    • Tom Lane's avatar
      Fix planner problems with LATERAL references in PlaceHolderVars. · 9e7e29c7
      Tom Lane authored
      The planner largely failed to consider the possibility that a
      PlaceHolderVar's expression might contain a lateral reference to a Var
      coming from somewhere outside the PHV's syntactic scope.  We had a previous
      report of a problem in this area, which I tried to fix in a quick-hack way
      in commit 4da6439b, but Antonin Houska
      pointed out that there were still some problems, and investigation turned
      up other issues.  This patch largely reverts that commit in favor of a more
      thoroughly thought-through solution.  The new theory is that a PHV's
      ph_eval_at level cannot be higher than its original syntactic level.  If it
      contains lateral references, those don't change the ph_eval_at level, but
      rather they create a lateral-reference requirement for the ph_eval_at join
      relation.  The code in joinpath.c needs to handle that.
      
      Another issue is that createplan.c wasn't handling nested PlaceHolderVars
      properly.
      
      In passing, push knowledge of lateral-reference checks for join clauses
      into join_clause_is_movable_to.  This is mainly so that FDWs don't need
      to deal with it.
      
      This patch doesn't fix the original join-qual-placement problem reported by
      Jeremy Evans (and indeed, one of the new regression test cases shows the
      wrong answer because of that).  But the PlaceHolderVar problems need to be
      fixed before that issue can be addressed, so committing this separately
      seems reasonable.
      9e7e29c7
  23. Aug 15, 2013
    • Tom Lane's avatar
      Remove ph_may_need from PlaceHolderInfo, with attendant simplifications. · 1b1d3d92
      Tom Lane authored
      The planner logic that attempted to make a preliminary estimate of the
      ph_needed levels for PlaceHolderVars seems to be completely broken by
      lateral references.  Fortunately, the potential join order optimization
      that this code supported seems to be of relatively little value in
      practice; so let's just get rid of it rather than trying to fix it.
      
      Getting rid of this allows fairly substantial simplifications in
      placeholder.c, too, so planning in such cases should be a bit faster.
      
      Issue noted while pursuing bugs reported by Jeremy Evans and Antonin
      Houska, though this doesn't in itself fix either of their reported cases.
      What this does do is prevent an Assert crash in the kind of query
      illustrated by the added regression test.  (I'm not sure that the plan for
      that query is stable enough across platforms to be usable as a regression
      test output ... but we'll soon find out from the buildfarm.)
      
      Back-patch to 9.3.  The problem case can't arise without LATERAL, so
      no need to touch older branches.
      1b1d3d92
  24. Jul 23, 2013
    • Alvaro Herrera's avatar
      Tweak FOR UPDATE/SHARE error message wording (again) · c359a1b0
      Alvaro Herrera authored
      In commit 0ac5ad51 I changed some error messages from "FOR
      UPDATE/SHARE" to a rather long gobbledygook which nobody liked.  Then,
      in commit cb9b66d3 I changed them again, but the alternative chosen
      there was deemed suboptimal by Peter Eisentraut, who in message
      1373937980.20441.8.camel@vanquo.pezone.net proposed an alternative
      involving a dynamically-constructed string based on the actual locking
      strength specified in the SQL command.  This patch implements that
      suggestion.
      c359a1b0
  25. May 29, 2013
  26. Feb 06, 2013
  27. Jan 23, 2013
    • Alvaro Herrera's avatar
      Improve concurrency of foreign key locking · 0ac5ad51
      Alvaro Herrera authored
      This patch introduces two additional lock modes for tuples: "SELECT FOR
      KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
      other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
      FOR UPDATE".  UPDATE commands that do not modify the values stored in
      the columns that are part of the key of the tuple now grab a SELECT FOR
      NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
      with tuple locks of the FOR KEY SHARE variety.
      
      Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
      means the concurrency improvement applies to them, which is the whole
      point of this patch.
      
      The added tuple lock semantics require some rejiggering of the multixact
      module, so that the locking level that each transaction is holding can
      be stored alongside its Xid.  Also, multixacts now need to persist
      across server restarts and crashes, because they can now represent not
      only tuple locks, but also tuple updates.  This means we need more
      careful tracking of lifetime of pg_multixact SLRU files; since they now
      persist longer, we require more infrastructure to figure out when they
      can be removed.  pg_upgrade also needs to be careful to copy
      pg_multixact files over from the old server to the new, or at least part
      of multixact.c state, depending on the versions of the old and new
      servers.
      
      Tuple time qualification rules (HeapTupleSatisfies routines) need to be
      careful not to consider tuples with the "is multi" infomask bit set as
      being only locked; they might need to look up MultiXact values (i.e.
      possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
      whereas they previously were assured to only use information readily
      available from the tuple header.  This is considered acceptable, because
      the extra I/O would involve cases that would previously cause some
      commands to block waiting for concurrent transactions to finish.
      
      Another important change is the fact that locking tuples that have
      previously been updated causes the future versions to be marked as
      locked, too; this is essential for correctness of foreign key checks.
      This causes additional WAL-logging, also (there was previously a single
      WAL record for a locked tuple; now there are as many as updated copies
      of the tuple there exist.)
      
      With all this in place, contention related to tuples being checked by
      foreign key rules should be much reduced.
      
      As a bonus, the old behavior that a subtransaction grabbing a stronger
      tuple lock than the parent (sub)transaction held on a given tuple and
      later aborting caused the weaker lock to be lost, has been fixed.
      
      Many new spec files were added for isolation tester framework, to ensure
      overall behavior is sane.  There's probably room for several more tests.
      
      There were several reviewers of this patch; in particular, Noah Misch
      and Andres Freund spent considerable time in it.  Original idea for the
      patch came from Simon Riggs, after a problem report by Joel Jacobson.
      Most code is from me, with contributions from Marti Raudsepp, Alexander
      Shulgin, Noah Misch and Andres Freund.
      
      This patch was discussed in several pgsql-hackers threads; the most
      important start at the following message-ids:
      	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
      	1290721684-sup-3951@alvh.no-ip.org
      	1294953201-sup-2099@alvh.no-ip.org
      	1320343602-sup-2290@alvh.no-ip.org
      	1339690386-sup-8927@alvh.no-ip.org
      	4FE5FF020200002500048A3D@gw.wicourts.gov
      	4FEAB90A0200002500048B7D@gw.wicourts.gov
      0ac5ad51
  28. Jan 01, 2013
  29. Oct 18, 2012
    • Tom Lane's avatar
      Fix planning of non-strict equivalence clauses above outer joins. · 72a4231f
      Tom Lane authored
      If a potential equivalence clause references a variable from the nullable
      side of an outer join, the planner needs to take care that derived clauses
      are not pushed to below the outer join; else they may use the wrong value
      for the variable.  (The problem arises only with non-strict clauses, since
      if an upper clause can be proven strict then the outer join will get
      simplified to a plain join.)  The planner attempted to prevent this type
      of error by checking that potential equivalence clauses aren't
      outerjoin-delayed as a whole, but actually we have to check each side
      separately, since the two sides of the clause will get moved around
      separately if it's treated as an equivalence.  Bugs of this type can be
      demonstrated as far back as 7.4, even though releases before 8.3 had only
      a very ad-hoc notion of equivalence clauses.
      
      In addition, we neglected to account for the possibility that such clauses
      might have nonempty nullable_relids even when not outerjoin-delayed; so the
      equivalence-class machinery lacked logic to compute correct nullable_relids
      values for clauses it constructs.  This oversight was harmless before 9.2
      because we were only using RestrictInfo.nullable_relids for OR clauses;
      but as of 9.2 it could result in pushing constructed equivalence clauses
      to incorrect places.  (This accounts for bug #7604 from Bill MacArthur.)
      
      Fix the first problem by adding a new test check_equivalence_delay() in
      distribute_qual_to_rels, and fix the second one by adding code in
      equivclass.c and called functions to set correct nullable_relids for
      generated clauses.  Although I believe the second part of this is not
      currently necessary before 9.2, I chose to back-patch it anyway, partly to
      keep the logic similar across branches and partly because it seems possible
      we might find other reasons why we need valid values of nullable_relids in
      the older branches.
      
      Add regression tests illustrating these problems.  In 9.0 and up, also
      add test cases checking that we can push constants through outer joins,
      since we've broken that optimization before and I nearly broke it again
      with an overly simplistic patch for this problem.
      72a4231f
  30. Sep 01, 2012
    • Tom Lane's avatar
      Fix mark_placeholder_maybe_needed to handle LATERAL references. · 4da6439b
      Tom Lane authored
      If a PlaceHolderVar contains a pulled-up LATERAL reference, its minimum
      possible evaluation level might be higher in the join tree than its
      original syntactic location.  That in turn affects the ph_needed level for
      any contained PlaceHolderVars (that is, those PHVs had better propagate up
      the join tree at least to the evaluation level of the outer PHV).  We got
      this mostly right, but mark_placeholder_maybe_needed() failed to account
      for the effect, and in consequence could leave the inner PHVs with
      ph_may_need less than what their ultimate ph_needed value will be.  That's
      bad because it could lead to failure to select a join order that will allow
      evaluation of the inner PHV at a valid location.  Fix that, and add an
      Assert that checks that we don't ever set ph_needed to more than
      ph_may_need.
      4da6439b
    • Tom Lane's avatar
      Partially restore qual scope checks in distribute_qual_to_rels(). · c97a547a
      Tom Lane authored
      The LATERAL implementation is now basically complete, and I still don't
      see a cost-effective way to make an exact qual scope cross-check in the
      presence of LATERAL.  However, I did add a PlannerInfo.hasLateralRTEs flag
      along the way, so it's easy to make the check only when not hasLateralRTEs.
      That seems to still be useful, and it beats having no check at all.
      c97a547a
  31. Aug 27, 2012
    • Tom Lane's avatar
      Fix up planner infrastructure to support LATERAL properly. · 9ff79b9d
      Tom Lane authored
      This patch takes care of a number of problems having to do with failure
      to choose valid join orders and incorrect handling of lateral references
      pulled up from subqueries.  Notable changes:
      
      * Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to
      represent join ordering constraints created by lateral references.
      (I first considered extending the SpecialJoinInfo structure, but the
      semantics are different enough that a separate data structure seems
      better.)  Extend join_is_legal() and related functions to prevent trying
      to form unworkable joins, and to ensure that we will consider joins that
      satisfy lateral references even if the joins would be clauseless.
      
      * Fill in the infrastructure needed for the last few types of relation scan
      paths to support parameterization.  We'd have wanted this eventually
      anyway, but it is necessary now because a relation that gets pulled up out
      of a UNION ALL subquery may acquire a reltargetlist containing lateral
      references, meaning that its paths *have* to be parameterized whether or
      not we have any code that can push join quals down into the scan.
      
      * Compute data about lateral references early in query_planner(), and save
      in RelOptInfo nodes, to avoid repetitive calculations later.
      
      * Assorted corner-case bug fixes.
      
      There's probably still some bugs left, but this is a lot closer to being
      real than it was before.
      9ff79b9d
  32. Aug 18, 2012
    • Tom Lane's avatar
      Another round of planner fixes for LATERAL. · 084a29c9
      Tom Lane authored
      Formerly, subquery pullup had no need to examine other entries in the range
      table, since they could not contain any references to the subquery being
      pulled up.  That's no longer true with LATERAL, so now we need to be able
      to visit rangetable subexpressions to replace Vars referencing the
      pulled-up subquery.  Also, this means that extract_lateral_references must
      be unsurprised at encountering lateral PlaceHolderVars, since such might be
      created when pulling up a subquery that's underneath an outer join with
      respect to the lateral reference.
      084a29c9
  33. Aug 12, 2012
    • Tom Lane's avatar
      More fixes for planner's handling of LATERAL. · c1774d2c
      Tom Lane authored
      Re-allow subquery pullup for LATERAL subqueries, except when the subquery
      is below an outer join and contains lateral references to relations outside
      that outer join.  If we pull up in such a case, we risk introducing lateral
      cross-references into outer joins' ON quals, which is something the code is
      entirely unprepared to cope with right now; and I'm not sure it'll ever be
      worth coping with.
      
      Support lateral refs in VALUES (this seems to be the only additional path
      type that needs such support as a consequence of re-allowing subquery
      pullup).
      
      Put in a slightly hacky fix for joinpath.c's refusal to consider
      parameterized join paths even when there cannot be any unparameterized
      ones.  This was causing "could not devise a query plan for the given query"
      failures in queries involving more than two FROM items.
      
      Put in an even more hacky fix for distribute_qual_to_rels() being unhappy
      with join quals that contain references to rels outside their syntactic
      scope; which is to say, disable that test altogether.  Need to think about
      how to preserve some sort of debugging cross-check here, while not
      expending more cycles than befits a debugging cross-check.
      c1774d2c
  34. Aug 08, 2012
    • Tom Lane's avatar
      Implement SQL-standard LATERAL subqueries. · 5ebaaa49
      Tom Lane authored
      This patch implements the standard syntax of LATERAL attached to a
      sub-SELECT in FROM, and also allows LATERAL attached to a function in FROM,
      since set-returning function calls are expected to be one of the principal
      use-cases.
      
      The main change here is a rewrite of the mechanism for keeping track of
      which relations are visible for column references while the FROM clause is
      being scanned.  The parser "namespace" lists are no longer lists of bare
      RTEs, but are lists of ParseNamespaceItem structs, which carry an RTE
      pointer as well as some visibility-controlling flags.  Aside from
      supporting LATERAL correctly, this lets us get rid of the ancient hacks
      that required rechecking subqueries and JOIN/ON and function-in-FROM
      expressions for invalid references after they were initially parsed.
      Invalid column references are now always correctly detected on sight.
      
      In passing, remove assorted parser error checks that are now dead code by
      virtue of our having gotten rid of add_missing_from, as well as some
      comments that are obsolete for the same reason.  (It was mainly
      add_missing_from that caused so much fudging here in the first place.)
      
      The planner support for this feature is very minimal, and will be improved
      in future patches.  It works well enough for testing purposes, though.
      
      catversion bump forced due to new field in RangeTblEntry.
      5ebaaa49
  35. Jun 10, 2012
  36. Apr 19, 2012
    • Tom Lane's avatar
      Revise parameterized-path mechanism to fix assorted issues. · 5b7b5518
      Tom Lane authored
      This patch adjusts the treatment of parameterized paths so that all paths
      with the same parameterization (same set of required outer rels) for the
      same relation will have the same rowcount estimate.  We cache the rowcount
      estimates to ensure that property, and hopefully save a few cycles too.
      Doing this makes it practical for add_path_precheck to operate without
      a rowcount estimate: it need only assume that paths with different
      parameterizations never dominate each other, which is close enough to
      true anyway for coarse filtering, because normally a more-parameterized
      path should yield fewer rows thanks to having more join clauses to apply.
      
      In add_path, we do the full nine yards of comparing rowcount estimates
      along with everything else, so that we can discard parameterized paths that
      don't actually have an advantage.  This fixes some issues I'd found with
      add_path rejecting parameterized paths on the grounds that they were more
      expensive than not-parameterized ones, even though they yielded many fewer
      rows and hence would be cheaper once subsequent joining was considered.
      
      To make the same-rowcounts assumption valid, we have to require that any
      parameterized path enforce *all* join clauses that could be obtained from
      the particular set of outer rels, even if not all of them are useful for
      indexing.  This is required at both base scans and joins.  It's a good
      thing anyway since the net impact is that join quals are checked at the
      lowest practical level in the join tree.  Hence, discard the original
      rather ad-hoc mechanism for choosing parameterization joinquals, and build
      a better one that has a more principled rule for when clauses can be moved.
      The original rule was actually buggy anyway for lack of knowledge about
      which relations are part of an outer join's outer side; getting this right
      requires adding an outer_relids field to RestrictInfo.
      5b7b5518
  37. Jan 02, 2012
Loading