Skip to content
Snippets Groups Projects
  1. Jan 29, 2018
  2. Jan 28, 2018
    • Tom Lane's avatar
      Add stack-overflow guards in set-operation planning. · 1b2a3860
      Tom Lane authored
      create_plan_recurse lacked any stack depth check.  This is not per
      our normal coding rules, but I'd supposed it was safe because earlier
      planner processing is more complex and presumably should eat more
      stack.  But bug #15033 from Andrew Grossman shows this isn't true,
      at least not for queries having the form of a many-thousand-way
      INTERSECT stack.
      
      Further testing showed that recurse_set_operations is also capable
      of being crashed in this way, since it likewise will recurse to the
      bottom of a parsetree before calling any support functions that
      might themselves contain any stack checks.  However, its stack
      consumption is only perhaps a third of create_plan_recurse's.
      
      It's possible that this particular problem with create_plan_recurse can
      only manifest in 9.6 and later, since before that we didn't build a Path
      tree for set operations.  But having seen this example, I now have no
      faith in the proposition that create_plan_recurse doesn't need a stack
      check, so back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org
      1b2a3860
    • Simon Riggs's avatar
      Default monitoring roles - errata · 76e117db
      Simon Riggs authored
      25fff407 introduced
      default monitoring roles. Apply these corrections:
      
      * Allow access to pg_stat_get_wal_senders()
        by role pg_read_all_stats
      
      * Correct comment in pg_stat_get_wal_receiver()
        to show it is no longer superuser-only.
      
      Author: Feike Steenbergen
      Reviewed-by: Michael Paquier
      
      Apply to HEAD, then later backpatch to 10
      76e117db
  3. Jan 27, 2018
    • Tom Lane's avatar
      Update time zone data files to tzdata release 2018c. · 2d71b270
      Tom Lane authored
      DST law changes in Brazil, Sao Tome and Principe.  Historical corrections
      for Bolivia, Japan, and South Sudan.  The "US/Pacific-New" zone has been
      removed (it was only a link to America/Los_Angeles anyway).
      2d71b270
    • Tom Lane's avatar
      Avoid crash during EvalPlanQual recheck of an inner indexscan. · 78433f41
      Tom Lane authored
      Commit 09529a70 changed nodeIndexscan.c and nodeIndexonlyscan.c to
      postpone initialization of the indexscan proper until the first tuple
      fetch.  It overlooked the question of mark/restore behavior, which means
      that if some caller attempts to mark the scan before the first tuple fetch,
      you get a null pointer dereference.
      
      The only existing user of mark/restore is nodeMergejoin.c, which (somewhat
      accidentally) will never attempt to set a mark before the first inner tuple
      unless the inner child node is a Material node.  Hence the case can't arise
      normally, so it seems sufficient to document the assumption at both ends.
      However, during an EvalPlanQual recheck, ExecScanFetch doesn't call
      IndexNext but just returns the jammed-in test tuple.  Therefore, if we're
      doing a recheck in a plan tree with a mergejoin with inner indexscan,
      it's possible to reach ExecIndexMarkPos with iss_ScanDesc still null,
      as reported by Guo Xiang Tan in bug #15032.
      
      Really, when there's a test tuple supplied during an EPQ recheck, touching
      the index at all is the wrong thing: rather, the behavior of mark/restore
      ought to amount to saving and restoring the es_epqScanDone flag.  We can
      avoid finding a place to actually save the flag, for the moment, because
      given the assumption that no caller will set a mark before fetching a
      tuple, es_epqScanDone must always be set by the time we try to mark.
      So the actual behavior change required is just to not reach the index
      access if a test tuple is supplied.
      
      The set of plan node types that need to consider this issue are those
      that support EPQ test tuples (i.e., call ExecScan()) and also support
      mark/restore; which is to say, IndexScan, IndexOnlyScan, and perhaps
      CustomScan.  It's tempting to try to fix the problem in one place by
      teaching ExecMarkPos() itself about EPQ; but ExecMarkPos supports some
      plan types that aren't Scans, and also it seems risky to make assumptions
      about what a CustomScan wants to do here.  Also, the most likely future
      change here is to decide that we do need to support marks placed before
      the first tuple, which would require additional work in IndexScan and
      IndexOnlyScan in any case.  Hence, fix the EPQ issue in nodeIndexscan.c
      and nodeIndexonlyscan.c, accepting the small amount of code duplicated
      thereby, and leave it to CustomScan providers to fix this bug if they
      have it.
      
      Back-patch to v10 where commit 09529a70 came in.  In earlier branches,
      the index_markpos() call is a waste of cycles when EPQ is active, but
      no more than that, so it doesn't seem appropriate to back-patch further.
      
      Discussion: https://postgr.es/m/20180126074932.3098.97815@wrigleys.postgresql.org
      78433f41
    • Magnus Hagander's avatar
      Add missing semicolons in documentation examples · 56067dee
      Magnus Hagander authored
      Author: Daniel Gustafsson <daniel@yesql.se>
      56067dee
  4. Jan 26, 2018
  5. Jan 24, 2018
  6. Jan 23, 2018
    • Tom Lane's avatar
      Teach reparameterize_path() to handle AppendPaths. · c5e59bb6
      Tom Lane authored
      If we're inside a lateral subquery, there may be no unparameterized paths
      for a particular child relation of an appendrel, in which case we *must*
      be able to create similarly-parameterized paths for each other child
      relation, else the planner will fail with "could not devise a query plan
      for the given query".  This means that there are situations where we'd
      better be able to reparameterize at least one path for each child.
      
      This calls into question the assumption in reparameterize_path() that
      it can just punt if it feels like it.  However, the only case that is
      known broken right now is where the child is itself an appendrel so that
      all its paths are AppendPaths.  (I think possibly I disregarded that in
      the original coding on the theory that nested appendrels would get folded
      together --- but that only happens *after* reparameterize_path(), so it's
      not excused from handling a child AppendPath.)  Given that this code's been
      like this since 9.3 when LATERAL was introduced, it seems likely we'd have
      heard of other cases by now if there were a larger problem.
      
      Per report from Elvis Pranskevichus.  Back-patch to 9.3.
      
      Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
      c5e59bb6
    • Alvaro Herrera's avatar
      Remove unnecessary include · bfe23f83
      Alvaro Herrera authored
      autovacuum.c no longer needs dsa.h, since commit 31ae1638.
      Author: Masahiko Sawada
      Discussion: https://postgr.es/m/CAD21AoCWvYyXrvdANSHWWWEWJH5TeAWAkJ_2gqrHhukG+OBo1g@mail.gmail.com
      bfe23f83
    • Tom Lane's avatar
      5ca17d65
    • Robert Haas's avatar
      Report an ERROR if a parallel worker fails to start properly. · 383e4268
      Robert Haas authored
      Commit 28724fd9 fixed things so that
      if a background worker fails to start due to fork() failure or because
      it is terminated before startup succeeds, BGWH_STOPPED will be
      reported.  However, that only helps if the code that uses the
      background worker machinery notices the change in status, and the code
      in parallel.c did not.
      
      To fix that, do two things.  First, make sure that when a worker
      exits, it triggers the leader to read from error queues.  That way, if
      a worker which has attached to an error queue exits uncleanly, the
      leader is sure to throw some error, either the contents of the
      ErrorResponse sent by the worker, or "lost connection to parallel
      worker" if it exited without sending one.  To cover the case where
      the worker never starts up in the first place or exits before
      attaching to the error queue, the ParallelContext now keeps track
      of which workers have sent at least one message via the error
      queue.  A worker which sends no messages by the time the parallel
      operation finishes will be checked to see whether it exited before
      attaching to the error queue; if so, a new error message, "parallel
      worker failed to initialize", will be reported.  If not, we'll
      continue to wait until it either starts up and exits cleanly, starts
      up and exits uncleanly, or fails to start, and then take the
      appropriate action.
      
      Patch by me, reviewed by Amit Kapila.
      
      Discussion: http://postgr.es/m/CA+TgmoYnBgXgdTu6wk5YPdWhmgabYc9nY_pFLq=tB=FSLYkD8Q@mail.gmail.com
      383e4268
    • Bruce Momjian's avatar
      doc: simplify intermediate certificate mention in libpq docs · 5aaa8666
      Bruce Momjian authored
      Backpatch-through: 9.3
      5aaa8666
  7. Jan 22, 2018
    • Tom Lane's avatar
      Make pg_dump's ACL, sec label, and comment entries reliably identifiable. · 46246fd9
      Tom Lane authored
      _tocEntryRequired() expects that it can identify ACL, SECURITY LABEL,
      and COMMENT TOC entries that are for large objects by seeing whether
      the tag for them starts with "LARGE OBJECT ".  While that works fine
      for actual large objects, which are indeed tagged that way, it's
      subject to false positives unless every such entry's tag starts with an
      appropriate type ID.  And in fact it does not work for ACLs, because
      up to now we customarily tagged those entries with just the bare name
      of the object.  This means that an ACL for an object named
      "LARGE OBJECT something" would be misclassified as data not schema,
      with undesirable results in a schema-only or data-only dump ---
      although pg_upgrade seems unaffected, due to the special case for
      binary-upgrade mode further down in _tocEntryRequired().
      
      We can fix this by changing all the dumpACL calls to use the label
      strings already in use for comments and security labels, which do
      follow the convention of starting with an object type indicator.
      
      Well, mostly they follow it.  dumpDatabase() got it wrong, using
      just the bare database name for those purposes, so that a database
      named "LARGE OBJECT something" would similarly be subject to having
      its comment or security label dropped or included when not wanted.
      Bring that into line too.  (Note that up to now, database ACLs have
      not been processed by pg_dump, so that this issue doesn't affect them.)
      
      _tocEntryRequired() itself is not free of fault: it was overly liberal
      about matching object tags to "LARGE OBJECT " in binary-upgrade mode.
      This looks like it is probably harmless because there would be no data
      component to strip anyway in that mode, but at best it's trouble
      waiting to happen, so tighten that up too.
      
      The possible misclassification of SECURITY LABEL entries for databases is
      in principle a security problem, but the opportunities for actual exploits
      seem too narrow to be interesting.  The other cases seem like just bugs,
      since an object owner can change its ACL or comment for himself, he needn't
      try to trick someone else into doing it by choosing a strange name.
      
      This has been broken since per-large-object TOC entries were introduced
      in 9.0, so back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/21714.1516553459@sss.pgh.pa.us
      46246fd9
  8. Jan 21, 2018
    • Magnus Hagander's avatar
      Fix wording of "hostaddrs" · d66cfe1b
      Magnus Hagander authored
      The field is still called "hostaddr", so make sure references use
      "hostaddr values" instead.
      
      Author: Michael Paquier <michael.paquier@gmail.com>
      d66cfe1b
    • Bruce Momjian's avatar
      doc: update intermediate certificate instructions · e477d148
      Bruce Momjian authored
      Document how to properly create root and intermediate certificates using
      v3_ca extensions and where to place intermediate certificates so they
      are properly transferred to the remote side with the leaf certificate to
      link to the remote root certificate.  This corrects docs that used to
      say that intermediate certificates must be stored with the root
      certificate.
      
      Also add instructions on how to create root, intermediate, and leaf
      certificates.
      
      Discussion: https://postgr.es/m/20180116002238.GC12724@momjian.us
      
      Reviewed-by: Michael Paquier
      
      Backpatch-through: 9.3
      e477d148
  9. Jan 19, 2018
  10. Jan 18, 2018
    • Tom Lane's avatar
      Extend configure's __int128 test to check for a known gcc bug. · 5dcbdcbd
      Tom Lane authored
      On Sparc64, use of __attribute__(aligned(8)) with __int128 causes faulty
      code generation in gcc versions at least through 5.5.0.  We can work around
      that by disabling use of __int128, so teach configure to test for the bug.
      
      This solution doesn't fix things for the case of cross-compiling with a
      buggy compiler; to support that nicely, we'd need to add a manual disable
      switch.  Unless more such cases turn up, it doesn't seem worth the work.
      Affected users could always edit pg_config.h manually.
      
      In passing, fix some typos in the existing configure test for __int128.
      They're harmless because we only compile that code not run it, but
      they're still confusing for anyone looking at it closely.
      
      This is needed in support of commit 75180499, so back-patch to 9.5
      as that was.
      
      Marina Polyakova, Victor Wagner, Tom Lane
      
      Discussion: https://postgr.es/m/0d3a9fa264cebe1cb9966f37b7c06e86@postgrespro.ru
      5dcbdcbd
  11. Jan 17, 2018
    • Robert Haas's avatar
      postgres_fdw: Avoid 'outer pathkeys do not match mergeclauses' error. · 3f05a30b
      Robert Haas authored
      When pushing down a join to a foreign server, postgres_fdw constructs
      an alternative plan to be used for any EvalPlanQual rechecks that
      prove to be necessary.  This plan is stored as the outer subplan of
      the Foreign Scan implementing the pushed-down join.  Previously, this
      alternative plan could have a different nominal sort ordering than its
      parent, which seemed OK since there will only be one tuple per base
      table anyway in the case of an EvalPlanQual recheck.  Actually,
      though, it caused a problem if that path was used as a building block
      for the EvalPlanQual recheck plan of a higher-level foreign join,
      because we could end up with a merge join one of whose inputs was not
      labelled with the correct sort order.  Repair by injecting an extra
      Sort node into the EvalPlanQual recheck plan whenever it would
      otherwise fail to be sorted at least as well as its parent Foreign
      Scan.
      
      Report by Jeff Janes.  Patch by me, reviewed by Tom Lane, who also
      provided the test case and comment text.
      
      Discussion: http://postgr.es/m/CAMkU=1y2G8VOVBHv3iXU2TMAj7-RyBFFW1uhkr5sm9LQ2=X35g@mail.gmail.com
      3f05a30b
  12. Jan 15, 2018
  13. Jan 12, 2018
    • Tom Lane's avatar
      Fix postgres_fdw to cope with duplicate GROUP BY entries. · 67854bc5
      Tom Lane authored
      Commit 7012b132, which added the ability to push down aggregates and
      grouping to the remote server, wasn't careful to ensure that the remote
      server would have the same idea we do about which columns are the grouping
      columns, in cases where there are textually identical GROUP BY expressions.
      Such cases typically led to "targetlist item has multiple sortgroupref
      labels" errors.
      
      To fix this reliably, switch over to using "GROUP BY column-number" syntax
      rather than "GROUP BY expression" in transmitted queries, and adjust
      foreign_grouping_ok() to be more careful about duplicating the sortgroupref
      labeling of the local pathtarget.
      
      Per bug #14890 from Sean Johnston.  Back-patch to v10 where the buggy code
      was introduced.
      
      Jeevan Chalke, reviewed by Ashutosh Bapat
      
      Discussion: https://postgr.es/m/20171107134948.1508.94783@wrigleys.postgresql.org
      67854bc5
    • Tom Lane's avatar
      Avoid unnecessary failure in SELECT concurrent with ALTER NO INHERIT. · 55e5eb4d
      Tom Lane authored
      If a query against an inheritance tree runs concurrently with an ALTER
      TABLE that's disinheriting one of the tree members, it's possible to get
      a "could not find inherited attribute" error because after obtaining lock
      on the removed member, make_inh_translation_list sees that its columns
      have attinhcount=0 and decides they aren't the columns it's looking for.
      
      An ideal fix, perhaps, would avoid including such a just-removed member
      table in the query at all; but there seems no way to accomplish that
      without adding expensive catalog rechecks or creating a likelihood of
      deadlocks.  Instead, let's just drop the check on attinhcount.  In this
      way, a query that's included a just-disinherited child will still
      succeed, which is not a completely unreasonable behavior.
      
      This problem has existed for a long time, so back-patch to all supported
      branches.  Also add an isolation test verifying related behaviors.
      
      Patch by me; the new isolation test is based on Kyotaro Horiguchi's work.
      
      Discussion: https://postgr.es/m/20170626.174612.23936762.horiguchi.kyotaro@lab.ntt.co.jp
      55e5eb4d
    • Tom Lane's avatar
      Fix incorrect handling of subquery pullup in the presence of grouping sets. · d3ca1a6c
      Tom Lane authored
      If we flatten a subquery whose target list contains constants or
      expressions, when those output columns are used in GROUPING SET columns,
      the planner was capable of doing the wrong thing by merging a pulled-up
      expression into the surrounding expression during const-simplification.
      Then the late processing that attempts to match subexpressions to grouping
      sets would fail to match those subexpressions to grouping sets, with the
      effect that they'd not go to null when expected.
      
      To fix, wrap such subquery outputs in PlaceHolderVars, ensuring that
      they preserve their separate identity throughout the planner's expression
      processing.  This is a bit of a band-aid, because the wrapper defeats
      const-simplification even in places where it would be safe to allow.
      But a nicer fix would likely be too invasive to back-patch, and the
      consequences of the missed optimizations probably aren't large in most
      cases.
      
      Back-patch to 9.5 where grouping sets were introduced.
      
      Heikki Linnakangas, with small mods and better test cases by me;
      additional review by Andrew Gierth
      
      Discussion: https://postgr.es/m/7dbdcf5c-b5a6-ef89-4958-da212fe10176@iki.fi
      d3ca1a6c
  14. Jan 11, 2018
    • Teodor Sigaev's avatar
      Fix behavior of ~> (cube, int) operator · b8279a78
      Teodor Sigaev authored
      ~> (cube, int) operator was especially designed for knn-gist search.
      However, it appears that knn-gist search can't work correctly with current
      behavior of this operator when dataset contains cubes of variable
      dimensionality. In this case, the same value of second operator argument
      can point to different dimension depending on dimensionality of particular cube.
      Such behavior is incompatible with gist indexing of cubes, and knn-gist doesn't
      work correctly for it.
      
      This patch changes behavior of ~> (cube, int) operator by introducing dimension
      numbering where value of second argument unambiguously identifies number of
      dimension. With new behavior, this operator can be correctly supported by
      knn-gist. Relevant changes to cube operator class are also included.
      
      Backpatch to v9.6 where operator was introduced.
      
      Since behavior of ~> (cube, int) operator is changed, depending entities
      must be refreshed after upgrade. Such as, expression indexes using this
      operator must be reindexed, materialized views must be rebuilt, stored
      procedures and client code must be revised to correctly use new behavior.
      That should be mentioned in release notes.
      
      Noticed by: Tomas Vondra
      Author: Alexander Korotkov
      Reviewed by: Tomas Vondra, Andrey Borodin
      Discussion: https://www.postgresql.org/message-id/flat/a9657f6a-b497-36ff-e56-482a2c7e3292@2ndquadrant.com
      b8279a78
  15. Jan 10, 2018
    • Tom Lane's avatar
      Fix sample INSTR() functions in the plpgsql documentation. · 08adf688
      Tom Lane authored
      These functions are stated to be Oracle-compatible, but they weren't.
      Yugo Nagata noticed that while our code returns zero for a zero or
      negative fourth parameter (occur_index), Oracle throws an error.
      Further testing by me showed that there was also a discrepancy in the
      interpretation of a negative third parameter (beg_index): Oracle thinks
      that a negative beg_index indicates the last place where the target
      substring can *begin*, whereas our code thinks it is the last place
      where the target can *end*.
      
      Adjust the sample code to behave like Oracle in both these respects.
      Also change it to be a CDATA[] section, simplifying copying-and-pasting
      out of the documentation source file.  And fix minor problems in the
      introductory comment, which wasn't very complete or accurate.
      
      Back-patch to all supported branches.  Although this patch only touches
      documentation, we should probably call it out as a bug fix in the next
      minor release notes, since users who have adopted the functions will
      likely want to update their versions.
      
      Yugo Nagata and Tom Lane
      
      Discussion: https://postgr.es/m/20171229191705.c0b43a8c.nagata@sraoss.co.jp
      08adf688
    • Tom Lane's avatar
      Remove dubious micro-optimization in ckpt_buforder_comparator(). · 7eb0187a
      Tom Lane authored
      It seems incorrect to assume that the list of CkptSortItems can never
      contain duplicate page numbers: concurrent activity could result in some
      page getting dropped from a low-numbered buffer and later loaded into a
      high-numbered buffer while BufferSync is scanning the buffer pool.
      If that happened, the comparator would give self-inconsistent results,
      potentially confusing qsort().  Saving one comparison step is not worth
      possibly getting the sort wrong.
      
      So far as I can tell, nothing would actually go wrong given our current
      implementation of qsort().  It might get a bit slower than expected
      if there were a large number of duplicates of one value, but that's
      surely a probability-epsilon case.  Still, the comment is wrong,
      and if we ever switched to another sort implementation it might be
      less forgiving.
      
      In passing, avoid casting away const-ness of the argument pointers;
      I've not seen any compiler complaints from that, but it seems likely
      that some compilers would not like it.
      
      Back-patch to 9.6 where this code came in, just in case I've underestimated
      the possible consequences.
      
      Discussion: https://postgr.es/m/18437.1515607610@sss.pgh.pa.us
      7eb0187a
  16. Jan 09, 2018
    • Alvaro Herrera's avatar
      Change some bogus PageGetLSN calls to BufferGetLSNAtomic · 37dd1128
      Alvaro Herrera authored
      As src/backend/access/transam/README says, PageGetLSN may only be called
      by processes holding either exclusive lock on buffer, or a shared lock
      on buffer plus buffer header lock.  Therefore any place that only holds
      a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN,
      which internally obtains buffer header lock prior to reading the LSN.
      
      A few callsites failed to comply with this rule.  This was detected by
      running all tests under a new (not committed) assertion that verifies
      PageGetLSN locking contract.  All but one of the callsites that failed
      the assertion are fixed by this patch.  Remaining callsites were
      inspected manually and determined not to need any change.
      
      The exception (unfixed callsite) is in TestForOldSnapshot, which only
      has a Page argument, making it impossible to access the corresponding
      Buffer from it.  Fixing that seems a much larger patch that will have to
      be done separately; and that's just as well, since it was only
      introduced in 9.6 and other bugs are much older.
      
      Some of these bugs are ancient; backpatch all the way back to 9.3.
      
      Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal
      Reviewed-by: Michaël Paquier
      Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com
      37dd1128
    • Tom Lane's avatar
      While waiting for a condition variable, detect postmaster death. · d56a5f99
      Tom Lane authored
      The general assumption for postmaster child processes is that they
      should just exit(1), reasonably promptly, if the postmaster disappears.
      condition_variable.c neglected this consideration and could be left
      waiting forever, if the counterpart process it is waiting for has
      done the right thing and exited.
      
      We had some discussion of adjusting the WaitEventSet API to make it
      harder to make this type of mistake in future; but for the moment,
      and for v10, let's make this narrow fix.
      
      Discussion: https://postgr.es/m/20412.1515456143@sss.pgh.pa.us
      d56a5f99
    • Tom Lane's avatar
      Fix race condition during replication origin drop. · 1f5adbd7
      Tom Lane authored
      replorigin_drop() misunderstood the API for condition variables: it
      had ConditionVariablePrepareToSleep and ConditionVariableCancelSleep
      inside its test-and-sleep loop, rather than outside the loop as
      intended.  The net effect is a narrow race-condition window wherein,
      if the process using a replication slot releases it immediately after
      replorigin_drop() releases the ReplicationOriginLock, replorigin_drop()
      would get into the condition variable's wait list too late and then
      wait indefinitely for a signal that won't come.
      
      Because there's a different CV for each replication slot, we can't
      just move the ConditionVariablePrepareToSleep call to above the
      test-and-sleep loop.  What we can do, in the wake of commit 13db3b93,
      is drop the ConditionVariablePrepareToSleep call entirely.  This fix
      depends on that commit because (at least in principle) the slot matching
      the target replication origin might move around, so that once in a blue
      moon successive loop iterations might involve different CVs.  We can now
      cope with such a scenario, at the cost of an extra trip through the
      retry loop.
      
      (There are ways we could fix this bug without depending on that commit,
      but they're all a lot more complicated than this way.)
      
      While at it, upgrade the rather skimpy comments in this function.
      
      Back-patch to v10 where this code came in.
      
      Discussion: https://postgr.es/m/19947.1515455433@sss.pgh.pa.us
      1f5adbd7
    • Tom Lane's avatar
      Allow ConditionVariable[PrepareTo]Sleep to auto-switch between CVs. · 4af2190e
      Tom Lane authored
      The original coding here insisted that callers manually cancel any prepared
      sleep for one condition variable before starting a sleep on another one.
      While that's not a huge burden today, it seems like a gotcha that will bite
      us in future if the use of condition variables increases; anything we can
      do to make the use of this API simpler and more robust is attractive.
      Hence, allow these functions to automatically switch their attention to
      a different CV when required.  This is safe for the same reason it was OK
      for commit aced5a92 to let a broadcast operation cancel any prepared CV
      sleep: whenever we return to the other test-and-sleep loop, we will
      automatically re-prepare that CV, paying at most an extra test of that
      loop's exit condition.
      
      Back-patch to v10 where condition variables were introduced.  Ordinarily
      we would probably not back-patch a change like this, but since it does not
      invalidate any coding pattern that was legal before, it seems safe enough.
      Furthermore, there's an open bug in replorigin_drop() for which the
      simplest fix requires this.  Even if we chose to fix that in some more
      complicated way, the hazard would remain that we might back-patch some
      other bug fix that requires this behavior.
      
      Patch by me, reviewed by Thomas Munro.
      
      Discussion: https://postgr.es/m/2437.1515368316@sss.pgh.pa.us
      4af2190e
    • Bruce Momjian's avatar
      pg_upgrade: prevent check on live cluster from generating error · 1776c817
      Bruce Momjian authored
      Previously an inaccurate but harmless error was generated when running
      --check on a live server before reporting the servers as compatible.
      The fix is to split error reporting and exit control in the exec_prog()
      API.
      
      Reported-by: Daniel Westermann
      
      Backpatch-through: 10
      1776c817
  17. Jan 06, 2018
    • Tom Lane's avatar
      Reorder steps in ConditionVariablePrepareToSleep for more safety. · 83fe2708
      Tom Lane authored
      In the admittedly-very-unlikely case that AddWaitEventToSet fails,
      ConditionVariablePrepareToSleep would error out after already having
      set cv_sleep_target, which is probably bad, and after having already
      set cv_wait_event_set, which is very bad.  Transaction abort might or
      might not clean up cv_sleep_target properly; but there is nothing
      that would be aware that the WaitEventSet wasn't fully constructed,
      so that all future condition variable sleeps would be broken.
      We can easily guard against these hazards with slight restructuring.
      
      Back-patch to v10 where condition_variable.c was introduced.
      
      Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
      83fe2708
    • Tom Lane's avatar
      Rewrite ConditionVariableBroadcast() to avoid live-lock. · 1c77e990
      Tom Lane authored
      The original implementation of ConditionVariableBroadcast was, per its
      self-description, "the dumbest way possible".  Thomas Munro found out
      it was a bit too dumb.  An awakened process may immediately re-queue
      itself, if the specific condition it's waiting for is not yet satisfied.
      If this happens before ConditionVariableBroadcast is able to see the wait
      queue as empty, then ConditionVariableBroadcast will re-awaken the same
      process, repeating the cycle.  Given unlucky timing this back-and-forth
      can repeat indefinitely; loops lasting thousands of seconds have been
      seen in testing.
      
      To fix, add our own process to the end of the wait queue to serve as a
      sentinel, and exit the broadcast loop once our process is not there
      anymore.  There are various special considerations described in the
      comments, the principal disadvantage being that wakers can no longer
      be sure whether they awakened a real waiter or just a sentinel.  But in
      practice nobody pays attention to the result of ConditionVariableSignal
      or ConditionVariableBroadcast anyway, so that problem seems hypothetical.
      
      Back-patch to v10 where condition_variable.c was introduced.
      
      Tom Lane and Thomas Munro
      
      Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com
      1c77e990
  18. Jan 05, 2018
Loading