  1. May 20, 2016
  2. May 14, 2016
  3. May 13, 2016
    • Tom Lane's avatar
      Ensure plan stability in contrib/btree_gist regression test. · b15ab18e
      Tom Lane authored
      Buildfarm member skink failed with symptoms suggesting that an
      auto-analyze had happened and changed the plan displayed for a
      test query.  Although this is evidently of low probability,
      regression tests that sometimes fail are no fun, so add commands
      to force a bitmap scan to be chosen.
      b15ab18e
  4. May 10, 2016
  5. May 09, 2016
  6. May 07, 2016
    • Tom Lane's avatar
      fded7669
    • Peter Eisentraut's avatar
      Distrust external OpenSSL clients; clear err queue · e3a493ac
      Peter Eisentraut authored
      OpenSSL has an unfortunate tendency to mix per-session state error
      handling with per-thread error handling.  This can cause problems when
      programs that link to libpq with OpenSSL enabled have some other use of
      OpenSSL; without care, one caller of OpenSSL may cause problems for the
      other caller.  Backend code might similarly be affected, for example
      when a third party extension independently uses OpenSSL without taking
      the appropriate precautions.
      
      To fix, don't trust other users of OpenSSL to clear the per-thread error
      queue.  Instead, clear the entire per-thread queue ahead of certain I/O
      operations when it appears that there might be trouble (these I/O
      operations mostly need to call SSL_get_error() to check for success,
      which relies on the queue being empty).  This is slightly aggressive,
      but it's pretty clear that the other callers have a very dubious claim
      to ownership of the per-thread queue.  Do this in both frontend and
      backend code.
      
      Finally, be more careful about clearing our own error queue, so as to
      not cause these problems ourselves.  It's possible that control previously
      did not always reach SSLerrmessage(), where ERR_get_error() was supposed
      to be called to clear the queue's earliest code.  Make sure
      ERR_get_error() is always called, so as to spare other users of OpenSSL
      the possibility of similar problems caused by libpq (as opposed to
      problems caused by a third party OpenSSL library like PHP's OpenSSL
      extension).  Again, do this in both frontend and backend code.
      
      See bug #12799 and https://bugs.php.net/bug.php?id=68276
      
      Based on patches by Dave Vitek and Peter Eisentraut.
      
      From: Peter Geoghegan <pg@bowt.ie>
      e3a493ac
  7. May 06, 2016
    • Tom Lane's avatar
      Fix possible read past end of string in to_timestamp(). · 11247dd9
      Tom Lane authored
      to_timestamp() handles the TH/th format codes by advancing over two input
      characters, whatever those are.  It failed to notice whether there were
      two characters available to be skipped, making it possible to advance
      the pointer past the end of the input string and keep on parsing.
      A similar risk existed in the handling of "Y,YYY" format: it would advance
      over three characters after the "," whether or not three characters were
      available.
      
      In principle this might be exploitable to disclose contents of server
      memory.  But the security team concluded that it would be very hard to use
      that way, because the parsing loop would stop upon hitting any zero byte,
      and TH/th format codes can't be consecutive --- they have to follow some
      other format code, which would have to match whatever data is there.
      So it seems impractical to examine memory very much beyond the end of the
      input string via this bug; and the input string will always be in local
      memory not in disk buffers, making it unlikely that anything very
      interesting is close to it in a predictable way.  So this doesn't quite
      rise to the level of needing a CVE.
      
      Thanks to Wolf Roediger for reporting this bug.
      11247dd9
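      The bounds problem described above can be illustrated with a minimal
      sketch.  This is not the actual to_timestamp() code: skip_chars() is a
      hypothetical helper, shown only to make the idea concrete that a parser
      should never advance past the string terminator, no matter how many
      characters the format code wants to skip.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL's code): advance *s over at most
 * `want` characters of a NUL-terminated string, stopping early if the
 * terminator is reached.  Returns the number of characters skipped, so
 * the caller can tell whether the input was shorter than expected.
 */
static size_t
skip_chars(const char **s, size_t want)
{
    size_t      skipped = 0;

    assert(s != NULL && *s != NULL);

    while (skipped < want && **s != '\0')
    {
        (*s)++;
        skipped++;
    }
    return skipped;
}
```

      A caller handling a "TH"-style suffix would ask for two characters and
      compare the return value against 2; a caller handling "Y,YYY" would ask
      for three after the comma.  Either way the pointer stops at the
      terminator instead of running into whatever memory follows.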
    • Tom Lane's avatar
      Update time zone data files to tzdata release 2016d. · 29d154e3
      Tom Lane authored
      DST law changes in Russia (Magadan, Tomsk regions) and Venezuela.
      Historical corrections for Russia.  There are new zone names Europe/Kirov
      and Asia/Tomsk reflecting the fact that these regions now have different
      time zone histories from adjacent regions.
      29d154e3
  8. May 04, 2016
  9. May 02, 2016
    • Tom Lane's avatar
      Fix configure's incorrect version tests for flex and perl. · 28e02b8f
      Tom Lane authored
      awk's equality-comparison operator is "==" not "=".  We got this right
      in many places, but not in configure's checks for supported version
      numbers of flex and perl.  It hadn't been noticed because unsupported
      versions are so old as to be basically extinct in the wild, and because
      the only consequence is whether or not a WARNING flies by during
      configure.
      
      Daniel Gustafsson noted the problem with respect to the test for flex;
      I found the other by reviewing other awk calls.
      28e02b8f
    • Heikki Linnakangas's avatar
      Remove unused macros. · 7ee15c05
      Heikki Linnakangas authored
      CHECK_PAGE_OFFSET_RANGE() has been unused forever.
      CHECK_RELATION_BLOCK_RANGE() has been unused in pgstatindex.c ever since
      bt_page_stats() and bt_page_items() functions were moved from pgstattuple
      to pageinspect module. It still exists in pageinspect/btreefuncs.c.
      
      Daniel Gustafsson
      7ee15c05
  10. Apr 30, 2016
    • Tom Lane's avatar
      Fix mishandling of equivalence-class tests in parameterized plans. · f02cb8c9
      Tom Lane authored
      Given a three-or-more-way equivalence class, such as X.X = Y.Y = Z.Z,
      it was possible for the planner to omit one of the quals needed to
      enforce that all members of the equivalence class are actually equal.
      This only happened in the case of a parameterized join node for two
      of the relations, that is a plan tree like
      
      	Nested Loop
      	  ->  Scan X
      	  ->  Nested Loop
      	    ->  Scan Y
      	    ->  Scan Z
      	          Filter: Z.Z = X.X
      
      The eclass machinery normally expects to apply X.X = Y.Y when those
      two relations are joined, but in this shape of plan tree they aren't
      joined until the top node --- and, if the lower nested loop is marked
      as parameterized by X, the top node will assume that the relevant eclass
      condition(s) got pushed down into the lower node.  On the other hand,
      the scan of Z assumes that it's only responsible for constraining Z.Z
      to match any one of the other eclass members.  So one or another of
      the required quals sometimes fell between the cracks, depending on
      whether consideration of the eclass in get_joinrel_parampathinfo()
      for the lower nested loop chanced to generate X.X = Y.Y or X.X = Z.Z
      as the appropriate constraint there.  If it generated the latter,
      it'd erroneously suppose that the Z scan would take care of matters.
      To fix, force X.X = Y.Y to be generated and applied at that join node
      when this case occurs.
      
      This is *extremely* hard to hit in practice, because various planner
      behaviors conspire to mask the problem; starting with the fact that the
      planner doesn't really like to generate a parameterized plan of the
      above shape.  (It might have been impossible to hit it before we
      tweaked things to allow this plan shape for star-schema cases.)  Many
      thanks to Alexander Kirkouski for submitting a reproducible test case.
      
      The bug can be demonstrated in all branches back to 9.2 where parameterized
      paths were introduced, so back-patch that far.
      f02cb8c9
  11. Apr 28, 2016
    • Tom Lane's avatar
      Adjust DatumGetBool macro, this time for sure. · c563d97c
      Tom Lane authored
      Commit 23a41573 attempted to fix the DatumGetBool macro to ignore bits
      in a Datum that are to the left of the actual bool value.  But it did that
      by casting the Datum to bool; and on compilers that use C99 semantics for
      bool, that ends up being a whole-word test, not a 1-byte test.  This seems
      to be the true explanation for contrib/seg failing in VS2015.  To fix, use
      GET_1_BYTE() explicitly.  I think in the previous patch, I'd had some idea
      of not having to commit to bool being exactly 1 byte wide, but regardless
      of what the compiler's bool is, boolean columns and Datums are certainly
      1 byte wide.
      
      The previous fix was (eventually) back-patched into all active versions,
      so do likewise with this one.
      c563d97c
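      The difference between the two approaches can be sketched as follows.
      This is an illustration under stated assumptions, not the PostgreSQL
      macros themselves: MyDatum and both function names are hypothetical
      stand-ins, and the low-byte mask mirrors what the message says
      GET_1_BYTE() is used for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a word-sized Datum. */
typedef uintptr_t MyDatum;

/*
 * With C99 bool semantics, a cast to bool tests the whole word:
 * any nonzero bit anywhere yields true.
 */
static bool
datum_get_bool_wholeword(MyDatum d)
{
    return (bool) d;
}

/* Looking only at the low-order byte ignores garbage in the upper bits. */
static bool
datum_get_bool_lowbyte(MyDatum d)
{
    return (d & 0xff) != 0;
}

/* A value whose low byte is zero but which has a stray higher bit set. */
static void
demo_high_bit_garbage(void)
{
    MyDatum     d = (MyDatum) 0x100;

    assert(datum_get_bool_wholeword(d));    /* misreads the value as true */
    assert(!datum_get_bool_lowbyte(d));     /* reads the intended false */
}
```

      The two functions agree whenever the upper bits are clean; they differ
      only when a caller has left stray bits to the left of the 1-byte value,
      which is the case the message is concerned with.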
  12. Apr 23, 2016
    • Tom Lane's avatar
      Rename strtoi() to strtoint(). · 0f549128
      Tom Lane authored
      NetBSD has seen fit to invent a libc function named strtoi(), which
      conflicts with the long-established static functions of the same name in
      datetime.c and ecpg's interval.c.  While muttering darkly about intrusions
      on application namespace, we'll rename our functions to avoid the conflict.
      
      Back-patch to all supported branches, since this would affect attempts
      to build any of them on recent NetBSD.
      
      Thomas Munro
      0f549128
  13. Apr 22, 2016
    • Tom Lane's avatar
      Fix planner failure with full join in RHS of left join. · ad2d32b5
      Tom Lane authored
      Given a left join containing a full join in its righthand side, with
      the left join's joinclause referencing only one side of the full join
      (in a non-strict fashion, so that the full join doesn't get simplified),
      the planner could fail with "failed to build any N-way joins" or related
      errors.  This happened because the full join was seen as overlapping the
      left join's RHS, and then recent changes within join_is_legal() caused
      that function to conclude that the full join couldn't validly be formed.
      Rather than try to rejigger join_is_legal() yet more to allow this,
      I think it's better to fix initsplan.c so that the required join order
      is explicit in the SpecialJoinInfo data structure.  The previous coding
      there essentially ignored full joins, relying on the fact that we don't
      flatten them in the joinlist data structure to preserve their ordering.
      That's sufficient to prevent a wrong plan from being formed, but as this
      example shows, it's not sufficient to ensure that the right plan will
      be formed.  We need to work a bit harder to ensure that the right plan
      looks sane according to the SpecialJoinInfos.
      
      Per bug #14105 from Vojtech Rylko.  This was apparently induced by
      commit 8703059c (though now that I've seen it, I wonder whether there
      are related cases that could have failed before that); so back-patch
      to all active branches.  Unfortunately, that patch also went into 9.0,
      so this bug is a regression that won't be fixed in that branch.
      ad2d32b5
  14. Apr 21, 2016
    • Tom Lane's avatar
      Improve TranslateSocketError() to handle more Windows error codes. · b5ebc513
      Tom Lane authored
      The coverage was rather lean for cases that bind() or listen() might
      return.  Add entries for everything that there's a direct equivalent
      for in the set of Unix errnos that elog.c has heard of.
      b5ebc513
    • Tom Lane's avatar
      Remove dead code in win32.h. · d1e4ede7
      Tom Lane authored
      There's no longer a need for the MSVC-version-specific code stanza that
      forcibly redefines errno code symbols, because since commit 73838b52 we're
      unconditionally redefining them in the stanza before this one anyway.
      Now it's merely confusing and ugly, so get rid of it; and improve the
      comment that explains what's going on here.
      
      Although this is just cosmetic, back-patch anyway since I'm intending
      to back-patch some less-cosmetic changes in this same hunk of code.
      d1e4ede7
    • Tom Lane's avatar
      Provide errno-translation wrappers around bind() and listen() on Windows. · 6848827b
      Tom Lane authored
      Fix Windows builds to report something useful rather than "could not bind
      IPv4 socket: No error" when bind() fails.
      
      Back-patch of commits d1b7d487 and 22989a8e.
      
      Discussion: <4065.1452450340@sss.pgh.pa.us>
      6848827b
    • Tom Lane's avatar
      Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK. · c7c145e4
      Tom Lane authored
      When we shoehorned "x op ANY (array)" into the SQL syntax, we created a
      fundamental ambiguity as to the proper treatment of a sub-SELECT on the
      righthand side: perhaps what's meant is to compare x against each row of
      the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar
      sub-SELECT that delivers a single array value whose members should be
      compared against x.  The grammar resolves it as the former case whenever
      the RHS is a select_with_parens, making the latter case hard to reach ---
      but you can get at it, with tricks such as attaching a no-op cast to the
      sub-SELECT.  Parse analysis would throw away the no-op cast, leaving a
      parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr.
      ruleutils.c was not clued in on this fine point, and would naively emit
      "x op ANY ((SELECT ...))", which would be parsed as the first alternative,
      typically leading to errors like "operator does not exist: text = text[]"
      during dump/reload of a view or rule containing such a construct.  To fix,
      emit a no-op cast when dumping such a parsetree.  This might well be
      exactly what the user wrote to get the construct accepted in the first
      place; and even if she got there with some other dodge, it is a valid
      representation of the parsetree.
      
      Per report from Karl Czajkowski.  He mentioned only a case involving
      RLS policies, but actually the problem is very old, so back-patch to
      all supported branches.
      
      Report: <20160421001832.GB7976@moraine.isi.edu>
      c7c145e4
    • Tom Lane's avatar
      Honor PGCTLTIMEOUT environment variable for pg_regress' startup wait. · 1b22368f
      Tom Lane authored
      In commit 2ffa8696 we made pg_ctl recognize an environment variable
      PGCTLTIMEOUT to set the default timeout for starting and stopping the
      postmaster.  However, pg_regress uses pg_ctl only for the "stop" end of
      that; it has bespoke code for starting the postmaster, and that code has
      historically had a hard-wired 60-second timeout.  Further buildfarm
      experience says it'd be a good idea if that timeout were also controlled
      by PGCTLTIMEOUT, so let's make it so.  Like the previous patch, back-patch
      to all active branches.
      
      Discussion: <13969.1461191936@sss.pgh.pa.us>
      1b22368f
  15. Apr 18, 2016
    • Tom Lane's avatar
      Further reduce the number of semaphores used under --disable-spinlocks. · b24f7e28
      Tom Lane authored
      Per discussion, there doesn't seem to be much value in having
      NUM_SPINLOCK_SEMAPHORES set to 1024: under any scenario where you are
      running more than a few backends concurrently, you really had better have a
      real spinlock implementation if you want tolerable performance.  And 1024
      semaphores is a sizable fraction of the system-wide SysV semaphore limit
      on many platforms.  Therefore, reduce this setting's default value to 128
      to make it less likely to cause out-of-semaphores problems.
      b24f7e28
    • Tom Lane's avatar
      Fix --disable-spinlocks in 9.2 and 9.3 branches. · 37f30b25
      Tom Lane authored
      My back-patch of the 9.4-era commit 44cd47c1 into 9.2 and 9.3 fixed
      HPPA builds as expected, but it broke --disable-spinlocks builds, because
      the dummy spinlock is initialized before the underlying semaphore
      infrastructure is alive.  In 9.4 and up this works because of commit
      daa7527a, which decoupled initialization of an slock_t variable
      from access to the actual system semaphore object.  The best solution
      seems to be to back-port that patch, which should be a net win anyway
      because it improves the usability of --disable-spinlocks builds in the
      older branches; and it's been out long enough now to not be worrisome
      from a stability perspective.
      37f30b25
  16. Apr 16, 2016
  17. Apr 15, 2016
    • Tom Lane's avatar
      Sync 9.2 and 9.3 versions of barrier.h with 9.4's version. · d7dbc882
      Tom Lane authored
      We weren't particularly maintaining barrier.h before 9.4, because nothing
      was using it in those branches.  Well, nothing until commit 37de8de9 got
      back-patched.  That broke 9.2 and 9.3 for some non-mainstream platforms
      that we haven't been testing in the buildfarm, including icc on ia64,
      HPPA, and Alpha.
      
      This commit effectively back-patches commits e5592c61, 89779bf2,
      and 747ca669, though I did it just by copying the file (less copyright
      date updates) rather than by cherry-picking those commits.
      
      Per an attempt to run gaur and pademelon over old branches they've
      not been run on since ~2013.
      d7dbc882
    • Andres Freund's avatar
      Remove trailing commas in enums. · b5450859
      Andres Freund authored
      These aren't valid C89. Found thanks to gcc's -Wc90-c99-compat. These
      exist in differing places in most supported branches.
      b5450859
  18. Apr 14, 2016
    • Tom Lane's avatar
      Fix pg_dump so pg_upgrade'ing an extension with simple opfamilies works. · 6bb42d52
      Tom Lane authored
      As reported by Michael Feld, pg_upgrade'ing an installation having
      extensions with operator families that contain just a single operator class
      failed to reproduce the extension membership of those operator families.
      This caused no immediate ill effects, but would create problems when later
      trying to do a plain dump and restore, because the seemingly-not-part-of-
      the-extension operator families would appear separately in the pg_dump
      output, and then would conflict with the families created by loading the
      extension.  This has been broken ever since extensions were introduced,
      and many of the standard contrib extensions are affected, so it's a bit
      astonishing nobody complained before.
      
      The cause of the problem is a perhaps-ill-considered decision to omit
      such operator families from pg_dump's output on the grounds that the
      CREATE OPERATOR CLASS commands could recreate them, and having explicit
      CREATE OPERATOR FAMILY commands would impede loading the dump script into
      pre-8.3 servers.  Whatever the merits of that decision when 8.3 was being
      written, it looks like a poor tradeoff now.  We can fix the pg_upgrade
      problem simply by removing that code, so that the operator families are
      dumped explicitly (and then will be properly made to be part of their
      extensions).
      
      Although this fixes the behavior of future pg_upgrade runs, it does nothing
      to clean up existing installations that may have improperly-linked operator
      families.  Given the small number of complaints to date, maybe we don't
      need to worry about providing an automated solution for that; anyone who
      needs to clean it up can do so with manual "ALTER EXTENSION ADD OPERATOR
      FAMILY" commands, or even just ignore the duplicate-opfamily errors they
      get during a pg_restore.  In any case we need this fix.
      
      Back-patch to all supported branches.
      
      Discussion: <20228.1460575691@sss.pgh.pa.us>
      6bb42d52
  19. Apr 12, 2016
    • Tom Lane's avatar
      Fix freshly-introduced PL/Python portability bug. · 9422b61d
      Tom Lane authored
      It turns out that those PyErr_Clear() calls I removed from plpy_elog.c
      in 7e3bb080 et al were not quite as random as they appeared: they
      mask a Python 2.3.x bug.  (Specifically, it turns out that PyType_Ready()
      can fail if the error indicator is set on entry, and PLy_traceback's fetch
      of frame.f_code may be the first operation in a session that requires the
      "frame" type to be readied.  Ick.)  Put back the clear call, but in a more
      centralized place closer to what it's protecting, and this time with a
      comment warning what it's really for.
      
      Per buildfarm member prairiedog.  Although prairiedog was only failing
      on HEAD, it seems clearly possible for this to occur in older branches
      as well, so back-patch to 9.2 the same as the previous patch.
      9422b61d
  20. Apr 11, 2016
    • Tom Lane's avatar
      Fix access-to-already-freed-memory issue in plpython's error handling. · e1f08ba1
      Tom Lane authored
      PLy_elog() could attempt to access strings that Python had already freed,
      because the strings that PLy_get_spi_error_data() returns are simply
      pointers into storage associated with the error "val" PyObject.  That's
      fine at the instant PLy_get_spi_error_data() returns them, but just after
      that PLy_traceback() intentionally releases the only refcount on that
      object, allowing it to be freed --- so that the strings we pass to
      ereport() are dangling pointers.
      
      In principle this could result in garbage output or a coredump.  In
      practice, I think the risk is pretty low, because there are no Python
      operations between where we decrement that refcount and where we use the
      strings (and copy them into PG storage), and thus no reason for Python
      to recycle the storage.  Still, it's clearly hazardous, and it leads to
      Valgrind complaints when running under a Valgrind that hasn't been
      lobotomized to ignore Python memory allocations.
      
      The code was a mess anyway: we fetched the error data out of Python
      (clearing Python's error indicator) with PyErr_Fetch, examined it, pushed
      it back into Python with PyErr_Restore (re-setting the error indicator),
      then immediately pulled it back out with another PyErr_Fetch.  Just to
      confuse matters even more, there were some gratuitous-and-yet-hazardous
      PyErr_Clear calls in the "examine" step, and we didn't get around to doing
      PyErr_NormalizeException until after the second PyErr_Fetch, making it even
      less clear which object was being manipulated where and whether we still
      had a refcount on it.  (If PyErr_NormalizeException did substitute a
      different "val" object, it's possible that the problem could manifest for
      real, because then we'd be doing assorted Python stuff with no refcount
      on the object we have string pointers into.)
      
      So, rearrange all that into some semblance of sanity, and don't decrement
      the refcount on the Python error objects until the end of PLy_elog().
      In HEAD, I failed to resist the temptation to reformat some messy bits
      from 5c3c3cd0 along the way.
      
      Back-patch as far as 9.2, because the code is substantially the same
      that far back.  I believe that 9.1 has the bug as well; but the code
      around it is rather different and I don't want to take a chance on
      breaking something for what seems a low-probability problem.
      e1f08ba1
  21. Apr 08, 2016
  22. Apr 06, 2016
  23. Apr 04, 2016
    • Tom Lane's avatar
      Fix latent portability issue in pgwin32_dispatch_queued_signals(). · 5496c75d
      Tom Lane authored
      The first iteration of the signal-checking loop would compute sigmask(0)
      which expands to 1<<(-1) which is undefined behavior according to the
      C standard.  The lack of field reports of trouble suggests that it
      evaluates to 0 on all existing Windows compilers, but that's hardly
      something to rely on.  Since signal 0 isn't a queueable signal anyway,
      we can just make the loop iterate from 1 instead, and save a few cycles
      as well as avoiding the undefined behavior.
      
      In passing, avoid evaluating the volatile expression UNBLOCKED_SIGNAL_QUEUE
      twice in a row; there's no reason to waste cycles like that.
      
      Noted by Aleksander Alekseev, though this isn't his proposed fix.
      Back-patch to all supported branches.
      5496c75d
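      As a rough sketch of the loop-bound fix described above (not the
      actual pgwin32 code): MY_SIGNAL_COUNT, my_sigmask() and
      first_pending_signal() are hypothetical names, and the mask is assumed
      to be bit (sig - 1) as the message implies.  Starting the scan at 1
      means the shift count is never negative.

```c
#include <assert.h>

/* Hypothetical signal-table size; valid signal numbers are 1..31 here. */
#define MY_SIGNAL_COUNT 32

/*
 * Bit mask for a signal number.  Signal 0 is rejected up front, because
 * 1 << (0 - 1) would be a shift by a negative count, which is undefined.
 */
static inline unsigned int
my_sigmask(int sig)
{
    assert(sig >= 1 && sig < MY_SIGNAL_COUNT);
    return 1u << (sig - 1);
}

/*
 * Return the lowest-numbered signal set in `queue`, or 0 if none is set.
 * The queue value is passed in once, so a volatile source only needs to
 * be read a single time by the caller.
 */
static int
first_pending_signal(unsigned int queue)
{
    for (int sig = 1; sig < MY_SIGNAL_COUNT; sig++)
    {
        if (queue & my_sigmask(sig))
            return sig;
    }
    return 0;
}
```

      Using an unsigned mask is a choice made for this sketch; it keeps
      every shift in the loop well defined for all signal numbers up to 31.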
  24. Mar 30, 2016
  25. Mar 29, 2016
    • Tom Lane's avatar
      Avoid possibly-unsafe use of Windows' FormatMessage() function. · b4b06931
      Tom Lane authored
      Whenever this function is used with the FORMAT_MESSAGE_FROM_SYSTEM flag,
      it's good practice to include FORMAT_MESSAGE_IGNORE_INSERTS as well.
      Otherwise, if the message contains any %n insertion markers, the function
      will try to fetch argument strings to substitute --- which we are not
      passing, possibly leading to a crash.  This is exactly analogous to the
      rule about not giving printf() a format string you're not in control of.
      
      Noted and patched by Christian Ullrich.
      Back-patch to all supported branches.
      b4b06931
  26. Mar 28, 2016