  1. Dec 27, 2017
  2. Dec 22, 2017
    • Fix UNION/INTERSECT/EXCEPT over no columns. · c252ccda
      Tom Lane authored
      Since 9.4, we've allowed the syntax "select union select" and variants
      of that.  However, the planner wasn't expecting a no-column set operation
      and ended up treating the set operation as if it were UNION ALL.
      
      Turns out it's trivial to fix in v10 and later; we just need to be careful
      about not generating a Sort node with no sort keys.  However, since a weird
      corner case like this is never going to be exercised by developers, we'd
      better have thorough regression tests if we want to consider it supported.
      
      Per report from Victor Yegorov.
      
      Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
      c252ccda
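Why a no-column set operation must not behave like UNION ALL can be seen in a small model. All zero-column rows compare equal, so any duplicate-eliminating set operation yields at most one row. The following is a minimal sketch of that row-count reasoning only; `ToySetOp` and `toy_setop_rows` are hypothetical names, not planner code.

```c
#include <assert.h>

/* Hypothetical model: every row has zero columns, so all rows are equal. */
typedef enum { TOY_UNION_ALL, TOY_UNION, TOY_INTERSECT, TOY_EXCEPT } ToySetOp;

/* Number of output rows for a set operation over zero-column inputs. */
long toy_setop_rows(ToySetOp op, long left_rows, long right_rows)
{
    assert(left_rows >= 0 && right_rows >= 0);
    switch (op) {
        case TOY_UNION_ALL:   /* keeps duplicates */
            return left_rows + right_rows;
        case TOY_UNION:       /* one distinct row if either side has any */
            return (left_rows + right_rows > 0) ? 1 : 0;
        case TOY_INTERSECT:   /* one distinct row if both sides have any */
            return (left_rows > 0 && right_rows > 0) ? 1 : 0;
        case TOY_EXCEPT:      /* the single distinct row survives only if absent on the right */
            return (left_rows > 0 && right_rows == 0) ? 1 : 0;
    }
    return 0;
}
```

In this model, treating `TOY_UNION` as `TOY_UNION_ALL` for two one-row inputs would give two rows instead of one, which is the symptom the commit message describes.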
  3. Dec 21, 2017
  4. Dec 20, 2017
  5. Dec 19, 2017
    • Try again to fix accumulation of parallel worker instrumentation. · 72567f61
      Robert Haas authored
      When a Gather or Gather Merge node is started and stopped multiple
      times, accumulate instrumentation data only once, at the end, instead
      of after each execution, to avoid recording inflated totals.
      
      Commit 778e78ae, the previous attempt at a fix, instead reset the state
      after every execution, which worked for the general instrumentation
      data but had problems for the additional instrumentation specific to
      Sort and Hash nodes.
      
      Report by hubert depesz lubaczewski.  Analysis and fix by Amit Kapila,
      following a design proposal from Thomas Munro, with a comment tweak
      by me.
      
      Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
      72567f61
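The inflation described in this commit message can be reproduced with a toy model in which each worker keeps a running total across executions. This is a sketch under that assumption; `WorkerStats`, `GatherStats`, and `run_gather` are hypothetical stand-ins, not PostgreSQL's instrumentation structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not PostgreSQL's real structures. */
typedef struct WorkerStats {
    long tuples;   /* running total kept by one worker across all executions */
} WorkerStats;

typedef struct GatherStats {
    long tuples;   /* leader-side total that would be reported */
} GatherStats;

/* Fold every worker's running total into the leader's total. */
static void accumulate(GatherStats *leader, const WorkerStats *workers, size_t n)
{
    for (size_t i = 0; i < n; i++)
        leader->tuples += workers[i].tuples;
}

/*
 * Run `loops` executions in which each of `n` workers produces `per_loop`
 * tuples.  With every_execution != 0 the totals are folded in after each
 * execution (re-adding earlier work); otherwise they are folded in once,
 * at the end.
 */
long run_gather(int loops, size_t n, long per_loop, int every_execution)
{
    WorkerStats workers[8] = {{0}};
    GatherStats leader = {0};

    assert(n <= 8);
    for (int l = 0; l < loops; l++) {
        for (size_t i = 0; i < n; i++)
            workers[i].tuples += per_loop;
        if (every_execution)
            accumulate(&leader, workers, n);
    }
    if (!every_execution)
        accumulate(&leader, workers, n);
    return leader.tuples;
}
```

With three executions, two workers, and ten tuples per execution, accumulating once gives 60, while accumulating after each execution gives 120.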
  6. Dec 18, 2017
    • doc: Fix figures in example description · db2ee079
      Peter Eisentraut authored
      
      oversight in 244c8b46
      
      Reported-by: Blaz Merela <blaz@merela.org>
      db2ee079
    • Fix bug in cancellation of non-exclusive backup to avoid assertion failure. · 133d2fab
      Fujii Masao authored
      Previously an assertion failure occurred when pg_stop_backup() for a
      non-exclusive backup was aborted while it was waiting for WAL files to
      be archived. This assertion failure happened in do_pg_abort_backup(),
      which was called when a non-exclusive backup was canceled.
      do_pg_abort_backup() assumes that there is at least one non-exclusive
      backup running when it's called. But pg_stop_backup() can be canceled
      even after it marks the end of the non-exclusive backup (e.g.,
      while waiting for WAL archiving). This broke the assumption that
      do_pg_abort_backup() relies on, which caused an assertion failure.
      
      This commit changes do_pg_abort_backup() so that it does nothing
      when the non-exclusive backup has already been marked as completed.
      That is, the assumption is also changed, and do_pg_abort_backup()
      can now handle even the case where it's called when there is
      no running backup.
      
      Backpatch to 9.6 where SQL-callable non-exclusive backup was added.
      
      Author: Masahiko Sawada and Michael Paquier
      Reviewed-By: Robert Haas and Fujii Masao
      Discussion: https://www.postgresql.org/message-id/CAD21AoD2L1Fu2c==gnVASMyFAAaq3y-AQ2uEVj-zTCGFFjvmDg@mail.gmail.com
      133d2fab
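The changed contract of the abort callback can be sketched as follows. This is a simplified single-session model; every `toy_` name is hypothetical and the real bookkeeping in PostgreSQL is more involved.

```c
#include <assert.h>

/* Hypothetical per-session state: is a non-exclusive backup in progress? */
typedef enum { TOY_BACKUP_NONE, TOY_BACKUP_NON_EXCLUSIVE } ToySessionBackupState;

static ToySessionBackupState toy_session_state = TOY_BACKUP_NONE;
static int toy_running_backups = 0;   /* stand-in for the shared counter */

void toy_start_backup(void)
{
    toy_running_backups++;
    toy_session_state = TOY_BACKUP_NON_EXCLUSIVE;
}

/* Marks the end of the backup; the caller may still wait for archiving. */
void toy_mark_backup_done(void)
{
    assert(toy_session_state == TOY_BACKUP_NON_EXCLUSIVE);
    toy_running_backups--;
    toy_session_state = TOY_BACKUP_NONE;
}

/*
 * Abort callback.  It no longer assumes a backup is running: if the backup
 * was already marked as completed, it simply does nothing.
 */
void toy_abort_backup(void)
{
    if (toy_session_state != TOY_BACKUP_NON_EXCLUSIVE)
        return;
    toy_running_backups--;
    toy_session_state = TOY_BACKUP_NONE;
}

int toy_backups_running(void)
{
    return toy_running_backups;
}
```

A cancel that arrives after `toy_mark_backup_done()` reaches `toy_abort_backup()` with no backup in progress, and the counter stays at zero instead of being decremented a second time.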
    • Fix crashes on plans with multiple Gather (Merge) nodes. · b70ea4c7
      Robert Haas authored
      es_query_dsa turns out to be broken by design, because it supposes
      that there is only one DSA for the whole query, whereas there is
      actually one per Gather (Merge) node.  For now, work around that
      problem by setting and clearing the pointer around the sections of
      code that might need it.  It's probably a better idea to get rid of
      es_query_dsa altogether in favor of having each node keep track
      individually of which DSA is relevant, but that seems like more than
      we would want to back-patch.
      
      Thomas Munro, reviewed and tested by Andreas Seltenreich, Amit
      Kapila, and by me.
      
      Discussion: http://postgr.es/m/CAEepm=1U6as=brnVvMNixEV2tpi8NuyQoTmO8Qef0-VV+=7MDA@mail.gmail.com
      b70ea4c7
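The workaround of setting and clearing a shared pointer around the code that needs it can be sketched as below. `Area`, `ExecState`, `GatherNode`, and `gather_fetch` are hypothetical stand-ins chosen for illustration; they are not the executor's real types.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Area { int id; } Area;                      /* stand-in for one DSA area */
typedef struct ExecState { Area *query_area; } ExecState;  /* stand-in for shared executor state */
typedef struct GatherNode {
    ExecState *estate;   /* shared by every node in the plan */
    Area      *area;     /* this node's own area */
} GatherNode;

/* Code below the Gather node finds "the" area through the shared state. */
static int child_reads_area(const ExecState *estate)
{
    assert(estate->query_area != NULL);
    return estate->query_area->id;
}

/*
 * Point the shared pointer at this node's area only while the code that
 * needs it runs, then clear it so no other node sees a stale value.
 */
int gather_fetch(GatherNode *node)
{
    int seen;

    node->estate->query_area = node->area;
    seen = child_reads_area(node->estate);
    node->estate->query_area = NULL;
    return seen;
}
```

Two Gather nodes that share one `ExecState` each see their own area during `gather_fetch`, and the shared pointer is NULL in between.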
  7. Dec 16, 2017
  8. Dec 15, 2017
    • Perform a lot more sanity checks when freezing tuples. · d3044f8b
      Andres Freund authored
      The previous commit has shown that the sanity checks around freezing
      aren't strong enough. Strengthening them seems especially important
      because the existence of the bug has caused corruption that we don't
      want to make even worse during future vacuum cycles.
      
      The errors are emitted with ereport rather than elog, despite being
      "should never happen" messages, so a proper error code is emitted. To
      avoid superfluous translations, mark messages as internal.
      
      Author: Andres Freund and Alvaro Herrera
      Reviewed-By: Alvaro Herrera, Michael Paquier
      Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de
      Backpatch: 9.3-
      d3044f8b
    • Fix pruning of locked and updated tuples. · 1224383e
      Andres Freund authored
      Previously it was possible that a tuple was not pruned during vacuum,
      even though its update xmax (i.e. the updating xid in a multixact with
      both key share lockers and an updater) was below the cutoff horizon.
      
      As the freezing code assumed, rightly so, that that's not supposed to
      happen, xmax would be preserved (as a member of a new multixact or
      xmax directly). That causes two problems: For one, the tuple is below
      the xmin horizon, which can cause problems if the clog is truncated or
      once there's an xid wraparound. The bigger problem is that this will
      break HOT chains, which in turn can lead to two breakages: First,
      failing index lookups, which in turn can e.g. lead to constraints being
      violated. Second, future HOT prunes / vacuums can end up making
      invisible tuples visible again. There are other harmful scenarios.
      
      Fix the problem by recognizing that tuples can be DEAD instead of
      RECENTLY_DEAD, even if the multixactid has alive members, if the
      update_xid is below the xmin horizon. That's safe because newer
      versions of the tuple will contain the locking xids.
      
      A followup commit will harden the code somewhat against future similar
      bugs and already corrupted data.
      
      Author: Andres Freund, with changes by Alvaro Herrera
      Reported-By: Daniel Wood
      Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter
         Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier
      Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier
      Discussion:
          https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com
          https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de
      Backpatch: 9.3-
      1224383e
  9. Dec 14, 2017
    • Fix walsender timeouts when decoding a large transaction · 14c15b1f
      Andrew Dunstan authored
      The logical slots have a fast code path for sending data so as not to
      impose too high a per message overhead. The fast path skips checks for
      interrupts and timeouts. However, the existing coding failed to consider
      the fact that a transaction with a large number of changes may take a
      very long time to be processed and sent to the client. This causes the
      walsender to ignore interrupts for potentially a long time and more
      importantly it will result in the walsender being killed due to
      timeout at the end of such a transaction.
      
      This commit changes the fast path to also check for interrupts and only
      allows calling the fast path when the last keepalive check happened less
      than half the walsender timeout ago. Otherwise the slower code path will
      be taken.
      
      Backpatched to 9.4
      
      Petr Jelinek, reviewed by Kyotaro HORIGUCHI, Yura Sokolov, Craig
      Ringer and Robert Haas.
      
      Discussion: https://postgr.es/m/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
      14c15b1f
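The fast-path condition described in this commit message amounts to a simple time comparison. A minimal sketch follows; the function name, the millisecond units, and the treatment of a non-positive timeout as "disabled" are assumptions made for the example, not the walsender's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the cheap send path may be used.  All times are in
 * milliseconds.  Assumption for this sketch: timeout_ms <= 0 means the
 * timeout is disabled, so the fast path is always acceptable.
 */
bool can_use_fast_path(int64_t now_ms, int64_t last_check_ms, int64_t timeout_ms)
{
    assert(now_ms >= last_check_ms);
    if (timeout_ms <= 0)
        return true;
    /* Allowed only if the last keepalive check was less than half a timeout ago. */
    return (now_ms - last_check_ms) < timeout_ms / 2;
}
```

With a 60-second timeout, the fast path is taken until just under 30 seconds have passed since the last keepalive check; after that the slower path, which performs the check, is used.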
  10. Dec 13, 2017
  11. Dec 11, 2017
    • Fix comment · c55253b7
      Peter Eisentraut authored
      
      Reported-by: Noah Misch <noah@leadboat.com>
      c55253b7
    • Fix corner-case coredump in _SPI_error_callback(). · e3d194f7
      Tom Lane authored
      I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL,
      and only fills it in some time later.  If an error were to happen in
      between, _SPI_error_callback would try to dereference the null pointer.
      This is unlikely --- there's not much between those points except
      push-snapshot calls --- but it's clearly not impossible.  Tweak the
      callback to do nothing if the pointer isn't set yet.
      
      It's been like this for a while, so back-patch to all supported branches.
      e3d194f7
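The defensive pattern applied to the callback can be sketched in isolation: an error-context callback whose argument may still be NULL when an early error fires. `ToyErrorContext` and `toy_query_error_callback` are hypothetical names, and the message text is only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error-context entry: a callback plus its argument. */
typedef struct ToyErrorContext {
    int  (*callback)(void *arg, char *buf, size_t buflen);
    void *arg;   /* stays NULL until setup has progressed far enough */
} ToyErrorContext;

/*
 * Add query text to an error report.  Returns 1 if context was written,
 * 0 if the argument was not set yet, in which case nothing is dereferenced.
 */
int toy_query_error_callback(void *arg, char *buf, size_t buflen)
{
    const char *query = arg;

    assert(buf != NULL && buflen > 0);
    if (query == NULL)
        return 0;
    snprintf(buf, buflen, "SQL statement \"%s\"", query);
    return 1;
}
```

An error raised between registering the context and filling in `arg` then produces a report without the extra context line rather than a crash.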
  12. Dec 09, 2017
    • Fix typo · 22e71b3a
      Magnus Hagander authored
      Reported by Robins Tharakan
      22e71b3a
    • MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries. · e2cc6505
      Noah Misch authored
      Notably, this permits linking to the 32-bit Perl binaries advertised on
      perl.org, namely Strawberry Perl and ActivePerl.  This has a side effect
      of permitting linking to binaries built with obsolete MSVC versions.
      
      By default, MSVC 2012 and later require a "safe exception handler table"
      in each binary.  MinGW-built, 32-bit DLLs lack the relevant exception
      handler metadata, so linking to them failed with error LNK2026.  Restore
      the semantics of MSVC 2010, which omits the table from a given binary if
      some linker input lacks metadata.  This has no effect on 64-bit builds
      or on MSVC 2010 and earlier.  Back-patch to 9.3 (all supported
      versions).
      
      Reported by Victor Wagner.
      
      Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
      e2cc6505
    • MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T. · 9b5c9979
      Noah Misch authored
      Commits 5a5c2fec and b5178c5d introduced support for modern
      MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl
      distributions like Strawberry Perl and modern ActivePerl.  Perl has no
      robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so
      test this.  Back-patch to 9.3 (all supported versions).
      
      The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when
      $Config{gccversion} is nonempty.  That banks on every gcc-built Perl
      using the same ABI.  gcc could change its default ABI the way MSVC once
      did, and one could build Perl with gcc and the non-default ABI.
      
      The GNU make build system could benefit from a similar test, without
      which it does not support MSVC-built Perl.  For now, just add a comment.
      Most users taking the special step of building Perl with MSVC probably
      build PostgreSQL with MSVC.
      
      Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com
      9b5c9979
  13. Dec 08, 2017
  14. Dec 06, 2017
    • Report failure to start a background worker. · a8ef4e81
      Robert Haas authored
      When a worker is flagged as BGW_NEVER_RESTART and we fail to start it,
      or if it is not marked BGW_NEVER_RESTART but is terminated before
      startup succeeds, what BgwHandleStatus should be reported?  The
      previous code really hadn't considered this possibility (as indicated
      by the comments which ignore it completely) and would typically return
      BGWH_NOT_YET_STARTED, but that's not a good answer, because then
      there's no way for code using GetBackgroundWorkerPid() to tell the
      difference between a worker that has not started but will start
      later and a worker that has not started and will never be started.
      So, when this case happens, return BGWH_STOPPED instead.  Update the
      comments to reflect this.
      
      The preceding fix by itself is insufficient to fix the problem,
      because the old code also didn't send a notification to the process
      identified in bgw_notify_pid when startup failed.  That might've
      been technically correct under the theory that the status of the
      worker was BGWH_NOT_YET_STARTED, because the status would indeed not
      change when the worker failed to start, but now that we're more
      usefully reporting BGWH_STOPPED, a notification is needed.
      
      Without these fixes, code which starts background workers and then
      uses the recommended APIs to wait for those background workers to
      start would hang indefinitely if the postmaster failed to fork a
      worker.
      
      Amit Kapila and Robert Haas
      
      Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com
      a8ef4e81
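The status decision described in this commit message can be modelled as a small pure function. This is a sketch only: the `TOY_BGWH_*` values and `ToyWorkerSlot` fields are hypothetical and merely echo the names used in the message; the real bookkeeping and the notification to bgw_notify_pid are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy status values echoing the names in the commit message. */
typedef enum {
    TOY_BGWH_STARTED,
    TOY_BGWH_NOT_YET_STARTED,
    TOY_BGWH_STOPPED
} ToyBgwStatus;

/* Hypothetical per-worker bookkeeping. */
typedef struct ToyWorkerSlot {
    int  pid;            /* > 0 once the worker is running */
    bool never_restart;  /* worker is flagged to never be restarted */
    bool start_failed;   /* the last attempt to start it failed */
    bool terminated;     /* termination requested before startup succeeded */
} ToyWorkerSlot;

ToyBgwStatus toy_worker_status(const ToyWorkerSlot *slot)
{
    assert(slot != NULL);
    if (slot->pid > 0)
        return TOY_BGWH_STARTED;
    /* Will never be started: report "stopped" rather than "not yet started". */
    if (slot->start_failed && slot->never_restart)
        return TOY_BGWH_STOPPED;
    if (slot->terminated)
        return TOY_BGWH_STOPPED;
    /* Still expected to start later. */
    return TOY_BGWH_NOT_YET_STARTED;
}
```

A caller that loops until the status is no longer "not yet started" terminates in both failure cases, which is the hang the commit message describes.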
  15. Dec 05, 2017
  16. Dec 04, 2017
    • Clean up assorted messiness around AllocateDir() usage. · 2a11b188
      Tom Lane authored
      This patch fixes a couple of low-probability bugs that could lead to
      reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE)
      concerning directory-open or file-open failures.  It also fixes places
      where we took shortcuts in reporting such errors, either by using elog
      instead of ereport or by using ereport but forgetting to specify an
      errcode.  And it eliminates a lot of just plain redundant error-handling
      code.
      
      In service of all this, export fd.c's formerly-static function
      ReadDirExtended, so that external callers can make use of the coding
      pattern
      
      	dir = AllocateDir(path);
      	while ((de = ReadDirExtended(dir, path, LOG)) != NULL)
      
      if they'd like to treat directory-open failures as mere LOG conditions
      rather than errors.  Also fix FreeDir to be a no-op if we reach it
      with dir == NULL, as such a coding pattern would cause.
      
      Then, remove code at many call sites that was throwing an error or log
      message for AllocateDir failure, as ReadDir or ReadDirExtended can handle
      that job just fine.  Aside from being a net code savings, this gets rid of
      a lot of not-quite-up-to-snuff reports, as mentioned above.  (In some
      places these changes result in replacing a custom error message such as
      "could not open tablespace directory" with more generic wording "could not
      open directory", but it was agreed that the custom wording buys little as
      long as we report the directory name.)  In some other call sites where we
      can't just remove code, change the error reports to be fully
      project-style-compliant.
      
      Also reorder code in restoreTwoPhaseData that was acquiring a lock
      between AllocateDir and ReadDir; in the unlikely but surely not
      impossible case that LWLockAcquire changes errno, AllocateDir failures
      would be misreported.  There is no great value in opening the directory
      before acquiring TwoPhaseStateLock, so just do it in the other order.
      
      Also fix CheckXLogRemoved to guarantee that it preserves errno,
      as quite a number of call sites are implicitly assuming.  (Again,
      it's unlikely but I think not impossible that errno could change
      during a SpinLockAcquire.  If so, this function was broken for its
      own purposes as well as breaking callers.)
      
      And change a few places that were using not-per-project-style messages,
      such as "could not read directory" when "could not open directory" is
      more correct.
      
      Back-patch the exporting of ReadDirExtended, in case we have occasion
      to back-patch some fix that makes use of it; it's not needed right now
      but surely making it global is pretty harmless.  Also back-patch the
      restoreTwoPhaseData and CheckXLogRemoved fixes.  The rest of this is
      essentially cosmetic and need not get back-patched.
      
      Michael Paquier, with a bit of additional work by me
      
      Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
      2a11b188
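The open/read/free coding pattern from this commit message can be imitated with plain POSIX calls. The sketch below assumes a POSIX system; `toy_open_dir`, `toy_read_dir`, and `toy_free_dir` are hypothetical analogues written for illustration, not the fd.c functions, and the log wording is only indicative.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Toy analogue of the open step: may return NULL, leaving errno set. */
static DIR *toy_open_dir(const char *path)
{
    return opendir(path);
}

/*
 * Toy analogue of the read step at "log" severity: a failed open is
 * reported here, using the errno left by opendir(), and ends the loop.
 */
static struct dirent *toy_read_dir(DIR *dir, const char *path)
{
    struct dirent *de;

    if (dir == NULL) {
        fprintf(stderr, "LOG: could not open directory \"%s\": %s\n",
                path, strerror(errno));
        return NULL;
    }
    errno = 0;
    de = readdir(dir);
    if (de == NULL && errno != 0)
        fprintf(stderr, "LOG: could not read directory \"%s\": %s\n",
                path, strerror(errno));
    return de;
}

/* Toy analogue of the free step: a no-op when the open failed. */
static int toy_free_dir(DIR *dir)
{
    return (dir == NULL) ? 0 : closedir(dir);
}

/* Count directory entries; an unopenable directory counts as empty. */
int count_entries(const char *path)
{
    int  n = 0;
    DIR *dir = toy_open_dir(path);

    assert(path != NULL);
    while (toy_read_dir(dir, path) != NULL)
        n++;
    toy_free_dir(dir);
    return n;
}
```

Because the read step reports the open failure, nothing that might change errno may run between the open and the first read, which is the ordering concern the message raises for restoreTwoPhaseData.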
    • Support boolean columns in functional-dependency statistics. · bf2b317f
      Tom Lane authored
      There's no good reason that the multicolumn stats stuff shouldn't work on
      booleans.  But it looked only for "Var = pseudoconstant" clauses, and it
      will seldom find those for boolean Vars, since earlier phases of planning
      will fold "boolvar = true" or "boolvar = false" to just "boolvar" or
      "NOT boolvar" respectively.  Improve dependencies_clauselist_selectivity()
      to recognize such clauses as equivalent to equality restrictions.
      
      This fixes a failure of the extended stats mechanism to apply in a case
      reported by Vitaliy Garnashevich.  It's not a complete solution to his
      problem because the bitmap-scan costing code isn't consulting extended
      stats where it should, but that's surely an independent issue.
      
      In passing, improve some comments, get rid of a NumRelids() test that's
      redundant with the preceding bms_membership() test, and fix
      dependencies_clauselist_selectivity() so that estimatedclauses actually
      is a pure output argument as stated by its API contract.
      
      Back-patch to v10 where this code was introduced.
      
      Discussion: https://postgr.es/m/73a4936d-2814-dc08-ed0c-978f76f435b0@gmail.com
      bf2b317f
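The clause recognition described here can be pictured with a toy classifier that accepts a bare boolean column, or its negation, wherever "column = constant" is accepted. All `Toy` names are hypothetical; real planner clauses are expression trees, not tagged structs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clause shapes after earlier planning phases have run. */
typedef enum {
    TOY_VAR_EQ_CONST,   /* col = constant */
    TOY_BOOL_VAR,       /* boolcol      (folded from boolcol = true)  */
    TOY_NOT_BOOL_VAR,   /* NOT boolcol  (folded from boolcol = false) */
    TOY_OTHER           /* anything else */
} ToyClauseKind;

typedef struct ToyClause {
    ToyClauseKind kind;
    int           attnum;   /* column the clause refers to */
} ToyClause;

/*
 * Returns true, and stores the column number, if the clause can be treated
 * as an equality restriction on a single column.
 */
bool toy_is_equality_on_column(const ToyClause *clause, int *attnum)
{
    assert(clause != NULL && attnum != NULL);
    switch (clause->kind) {
        case TOY_VAR_EQ_CONST:
        case TOY_BOOL_VAR:
        case TOY_NOT_BOOL_VAR:
            *attnum = clause->attnum;
            return true;
        case TOY_OTHER:
            break;
    }
    return false;
}
```

Without the two boolean cases, a restriction on a boolean column would fall through to `TOY_OTHER` and never match the dependency statistics.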
  17. Nov 30, 2017
    • Fix non-GNU makefiles for AIX make. · f8252b64
      Noah Misch authored
      Invoking the Makefile without an explicit target was building every
      possible target instead of just the "all" target.  Back-patch to 9.3
      (all supported versions).
      f8252b64
  18. Nov 29, 2017
  19. Nov 28, 2017
  20. Nov 27, 2017
    • Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE. · a57aa430
      Tom Lane authored
      rewriteTargetListUD's processing is dependent on the relkind of the query's
      target table.  That was fine at the time it was made to act that way, even
      for queries on inheritance trees, because all tables in an inheritance tree
      would necessarily be plain tables.  However, the 9.5 feature addition
      allowing some members of an inheritance tree to be foreign tables broke the
      assumption that rewriteTargetListUD's output tlist could be applied to all
      child tables with nothing more than column-number mapping.  This led to
      visible failures if foreign child tables had row-level triggers, and would
      also break in cases where child tables belonged to FDWs that used methods
      other than CTID for row identification.
      
      To fix, delay running rewriteTargetListUD until after the planner has
      expanded inheritance, so that it is applied separately to the (already
      mapped) tlist for each child table.  We can conveniently call it from
      preprocess_targetlist.  Refactor associated code slightly to avoid the
      need to heap_open the target relation multiple times during
      preprocess_targetlist.  (The APIs remain a bit ugly, particularly around
      the point of which steps scribble on parse->targetList and which don't.
      But avoiding such scribbling would require a change in FDW callback APIs,
      which is more pain than it's worth.)
      
      Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
      we transition from rows providing a CTID to rows that don't.  (That's
      really an independent bug, but it manifests in much the same cases.)
      
      Add a regression test checking one manifestation of this problem, which
      was that row-level triggers on a foreign child table did not work right.
      
      Back-patch to 9.5 where the problem was introduced.
      
      Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat
      
      Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
      a57aa430
    • Fix typo in comment · 4f2d0af1
      Magnus Hagander authored
      Andreas Karlsson
      4f2d0af1
  21. Nov 26, 2017
    • Pad XLogReaderState's main_data buffer more aggressively. · 94fd57df
      Tom Lane authored
      Originally, we palloc'd this buffer just barely big enough to hold the
      largest xlog record seen so far.  It turns out that that can result in
      valgrind complaints, because some compilers will emit code that assumes
      it can safely fetch padding bytes at the end of a struct, and those
      padding bytes were unallocated so far as aset.c was concerned.  We can
      fix that by MAXALIGN'ing the palloc request size, ensuring that it is big
      enough to include any possible padding that might've been omitted from
      the on-disk record.
      
      An additional objection to the original coding is that it could result in
      many repeated palloc cycles, in the worst case where we see a series of
      gradually larger xlog records.  We can ameliorate that cheaply by
      imposing a minimum buffer size that's large enough for most xlog records.
      BLCKSZ/2 was chosen after a bit of discussion.
      
      In passing, remove an obsolete comment in struct xl_heap_new_cid that the
      combocid field is free due to alignment considerations.  Perhaps that was
      true at some point, but it's not now.
      
      Back-patch to 9.5 where this code came in.
      
      Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
      94fd57df
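The sizing rule from this commit message (round the request up to the maximum alignment and never go below half a block) can be written as a short computation. The constants are assumptions for the sketch: an 8-byte maximum alignment and PostgreSQL's default 8192-byte block size. The `TOY_` macros and `toy_main_data_bufsz` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ALIGNOF 8      /* assumed maximum alignment, in bytes */
#define TOY_BLCKSZ  8192   /* assumed block size (the PostgreSQL default) */

/* Round len up to the next multiple of TOY_ALIGNOF (a power of two). */
#define TOY_MAXALIGN(len) \
    (((size_t) (len) + (TOY_ALIGNOF - 1)) & ~((size_t) (TOY_ALIGNOF - 1)))

/* Buffer size to allocate for a record whose main data is record_len bytes. */
size_t toy_main_data_bufsz(size_t record_len)
{
    size_t sz = TOY_MAXALIGN(record_len);

    /* The minimum size avoids repeated reallocation for slowly growing records. */
    if (sz < TOY_BLCKSZ / 2)
        sz = TOY_BLCKSZ / 2;

    assert(sz >= record_len && sz % TOY_ALIGNOF == 0);
    return sz;
}
```

A 1-byte record gets a 4096-byte buffer, and a 4097-byte record gets 4104 bytes, so trailing struct padding always falls inside the allocation.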