  1. May 04, 2016
  2. May 02, 2016
    • Fix configure's incorrect version tests for flex and perl. · 28e02b8f
      Tom Lane authored
      awk's equality-comparison operator is "==" not "=".  We got this right
      in many places, but not in configure's checks for supported version
      numbers of flex and perl.  It hadn't been noticed because unsupported
      versions are so old as to be basically extinct in the wild, and because
      the only consequence is whether or not a WARNING flies by during
      configure.
      
      Daniel Gustafsson noted the problem with respect to the test for flex,
      I found the other by reviewing other awk calls.
    • Remove unused macros. · 7ee15c05
      Heikki Linnakangas authored
      CHECK_PAGE_OFFSET_RANGE() has been unused forever.
      CHECK_RELATION_BLOCK_RANGE() has been unused in pgstatindex.c ever since
      bt_page_stats() and bt_page_items() functions were moved from pgstattuple
      to pageinspect module. It still exists in pageinspect/btreefuncs.c.
      
      Daniel Gustafsson
  3. Apr 30, 2016
    • Fix mishandling of equivalence-class tests in parameterized plans. · f02cb8c9
      Tom Lane authored
      Given a three-or-more-way equivalence class, such as X.X = Y.Y = Z.Z,
      it was possible for the planner to omit one of the quals needed to
      enforce that all members of the equivalence class are actually equal.
      This only happened in the case of a parameterized join node for two
      of the relations, that is a plan tree like
      
      	Nested Loop
      	  ->  Scan X
      	  ->  Nested Loop
      	    ->  Scan Y
      	    ->  Scan Z
      	          Filter: Z.Z = X.X
      
      The eclass machinery normally expects to apply X.X = Y.Y when those
      two relations are joined, but in this shape of plan tree they aren't
      joined until the top node --- and, if the lower nested loop is marked
      as parameterized by X, the top node will assume that the relevant eclass
      condition(s) got pushed down into the lower node.  On the other hand,
      the scan of Z assumes that it's only responsible for constraining Z.Z
      to match any one of the other eclass members.  So one or another of
      the required quals sometimes fell between the cracks, depending on
      whether consideration of the eclass in get_joinrel_parampathinfo()
      for the lower nested loop chanced to generate X.X = Y.Y or X.X = Z.Z
      as the appropriate constraint there.  If it generated the latter,
      it'd erroneously suppose that the Z scan would take care of matters.
      To fix, force X.X = Y.Y to be generated and applied at that join node
      when this case occurs.
      
      This is *extremely* hard to hit in practice, because various planner
      behaviors conspire to mask the problem; starting with the fact that the
      planner doesn't really like to generate a parameterized plan of the
      above shape.  (It might have been impossible to hit it before we
      tweaked things to allow this plan shape for star-schema cases.)  Many
      thanks to Alexander Kirkouski for submitting a reproducible test case.
      
      The bug can be demonstrated in all branches back to 9.2 where parameterized
      paths were introduced, so back-patch that far.
  4. Apr 28, 2016
    • Adjust DatumGetBool macro, this time for sure. · c563d97c
      Tom Lane authored
      Commit 23a41573 attempted to fix the DatumGetBool macro to ignore bits
      in a Datum that are to the left of the actual bool value.  But it did that
      by casting the Datum to bool; and on compilers that use C99 semantics for
      bool, that ends up being a whole-word test, not a 1-byte test.  This seems
      to be the true explanation for contrib/seg failing in VS2015.  To fix, use
      GET_1_BYTE() explicitly.  I think in the previous patch, I'd had some idea
      of not having to commit to bool being exactly 1 byte wide, but regardless
      of what the compiler's bool is, boolean columns and Datums are certainly
      1 byte wide.
      
      The previous fix was (eventually) back-patched into all active versions,
      so do likewise with this one.
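
      The difference between the two tests can be sketched as follows. This
      is an illustration only, not the PostgreSQL source: the Datum typedef
      is a stand-in, and the two macro names and the sample value are
      hypothetical. It assumes a compiler where bool is C99's _Bool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum: a pointer-sized unsigned integer. */
typedef uintptr_t Datum;

/* Hypothetical name. Cast-based form: with C99 bool, any nonzero bit
 * anywhere in the word makes the result true. */
#define DATUM_GET_BOOL_BY_CAST(X)  ((bool) (X))

/* Hypothetical name. Mask down to the low-order byte first, so only
 * that byte decides the result. */
#define DATUM_GET_BOOL_BY_BYTE(X)  ((bool) ((((Datum) (X)) & 0x000000ff) != 0))

/* Sample value: low byte is 0 (false), but a higher byte holds garbage. */
static const Datum dirty_false = (Datum) 0x0100;
```

      With this sample value, DATUM_GET_BOOL_BY_CAST(dirty_false) is true
      while DATUM_GET_BOOL_BY_BYTE(dirty_false) is false, which is the
      whole-word versus 1-byte distinction described above.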
  5. Apr 23, 2016
    • Rename strtoi() to strtoint(). · 0f549128
      Tom Lane authored
      NetBSD has seen fit to invent a libc function named strtoi(), which
      conflicts with the long-established static functions of the same name in
      datetime.c and ecpg's interval.c.  While muttering darkly about intrusions
      on application namespace, we'll rename our functions to avoid the conflict.
      
      Back-patch to all supported branches, since this would affect attempts
      to build any of them on recent NetBSD.
      
      Thomas Munro
  6. Apr 22, 2016
    • Fix planner failure with full join in RHS of left join. · ad2d32b5
      Tom Lane authored
      Given a left join containing a full join in its righthand side, with
      the left join's joinclause referencing only one side of the full join
      (in a non-strict fashion, so that the full join doesn't get simplified),
      the planner could fail with "failed to build any N-way joins" or related
      errors.  This happened because the full join was seen as overlapping the
      left join's RHS, and then recent changes within join_is_legal() caused
      that function to conclude that the full join couldn't validly be formed.
      Rather than try to rejigger join_is_legal() yet more to allow this,
      I think it's better to fix initsplan.c so that the required join order
      is explicit in the SpecialJoinInfo data structure.  The previous coding
      there essentially ignored full joins, relying on the fact that we don't
      flatten them in the joinlist data structure to preserve their ordering.
      That's sufficient to prevent a wrong plan from being formed, but as this
      example shows, it's not sufficient to ensure that the right plan will
      be formed.  We need to work a bit harder to ensure that the right plan
      looks sane according to the SpecialJoinInfos.
      
      Per bug #14105 from Vojtech Rylko.  This was apparently induced by
      commit 8703059c (though now that I've seen it, I wonder whether there
      are related cases that could have failed before that); so back-patch
      to all active branches.  Unfortunately, that patch also went into 9.0,
      so this bug is a regression that won't be fixed in that branch.
  7. Apr 21, 2016
    • Improve TranslateSocketError() to handle more Windows error codes. · b5ebc513
      Tom Lane authored
      The coverage was rather lean for cases that bind() or listen() might
      return.  Add entries for everything that there's a direct equivalent
      for in the set of Unix errnos that elog.c has heard of.
    • Remove dead code in win32.h. · d1e4ede7
      Tom Lane authored
      There's no longer a need for the MSVC-version-specific code stanza that
      forcibly redefines errno code symbols, because since commit 73838b52 we're
      unconditionally redefining them in the stanza before this one anyway.
      Now it's merely confusing and ugly, so get rid of it; and improve the
      comment that explains what's going on here.
      
      Although this is just cosmetic, back-patch anyway since I'm intending
      to back-patch some less-cosmetic changes in this same hunk of code.
    • Provide errno-translation wrappers around bind() and listen() on Windows. · 6848827b
      Tom Lane authored
      Fix Windows builds to report something useful rather than "could not bind
      IPv4 socket: No error" when bind() fails.
      
      Back-patch of commits d1b7d487 and 22989a8e.
      
      Discussion: <4065.1452450340@sss.pgh.pa.us>
    • Fix ruleutils.c's dumping of ScalarArrayOpExpr containing an EXPR_SUBLINK. · c7c145e4
      Tom Lane authored
      When we shoehorned "x op ANY (array)" into the SQL syntax, we created a
      fundamental ambiguity as to the proper treatment of a sub-SELECT on the
      righthand side: perhaps what's meant is to compare x against each row of
      the sub-SELECT's result, or perhaps the sub-SELECT is meant as a scalar
      sub-SELECT that delivers a single array value whose members should be
      compared against x.  The grammar resolves it as the former case whenever
      the RHS is a select_with_parens, making the latter case hard to reach ---
      but you can get at it, with tricks such as attaching a no-op cast to the
      sub-SELECT.  Parse analysis would throw away the no-op cast, leaving a
      parsetree with an EXPR_SUBLINK SubLink directly under a ScalarArrayOpExpr.
      ruleutils.c was not clued in on this fine point, and would naively emit
      "x op ANY ((SELECT ...))", which would be parsed as the first alternative,
      typically leading to errors like "operator does not exist: text = text[]"
      during dump/reload of a view or rule containing such a construct.  To fix,
      emit a no-op cast when dumping such a parsetree.  This might well be
      exactly what the user wrote to get the construct accepted in the first
      place; and even if she got there with some other dodge, it is a valid
      representation of the parsetree.
      
      Per report from Karl Czajkowski.  He mentioned only a case involving
      RLS policies, but actually the problem is very old, so back-patch to
      all supported branches.
      
      Report: <20160421001832.GB7976@moraine.isi.edu>
    • Honor PGCTLTIMEOUT environment variable for pg_regress' startup wait. · 1b22368f
      Tom Lane authored
      In commit 2ffa8696 we made pg_ctl recognize an environment variable
      PGCTLTIMEOUT to set the default timeout for starting and stopping the
      postmaster.  However, pg_regress uses pg_ctl only for the "stop" end of
      that; it has bespoke code for starting the postmaster, and that code has
      historically had a hard-wired 60-second timeout.  Further buildfarm
      experience says it'd be a good idea if that timeout were also controlled
      by PGCTLTIMEOUT, so let's make it so.  Like the previous patch, back-patch
      to all active branches.
      
      Discussion: <13969.1461191936@sss.pgh.pa.us>
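
      A minimal sketch of an environment-controlled timeout with a 60-second
      fallback might look like the following. The helper name is
      hypothetical, and treating unparsable or non-positive values as "use
      the default" is an assumption, not necessarily what pg_regress does.

```c
#include <stdlib.h>

/* The historical hard-wired value mentioned in the commit message. */
#define DEFAULT_WAIT_SECONDS 60

/*
 * Hypothetical helper: return the startup wait in seconds.
 * Uses PGCTLTIMEOUT when it is set to a positive integer; otherwise
 * falls back to the default (assumed policy for bad values).
 */
int
startup_wait_seconds(void)
{
    const char *env = getenv("PGCTLTIMEOUT");
    int         secs;

    if (env == NULL)
        return DEFAULT_WAIT_SECONDS;

    secs = atoi(env);
    return (secs > 0) ? secs : DEFAULT_WAIT_SECONDS;
}
```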
  8. Apr 18, 2016
    • Further reduce the number of semaphores used under --disable-spinlocks. · b24f7e28
      Tom Lane authored
      Per discussion, there doesn't seem to be much value in having
      NUM_SPINLOCK_SEMAPHORES set to 1024: under any scenario where you are
      running more than a few backends concurrently, you really had better have a
      real spinlock implementation if you want tolerable performance.  And 1024
      semaphores is a sizable fraction of the system-wide SysV semaphore limit
      on many platforms.  Therefore, reduce this setting's default value to 128
      to make it less likely to cause out-of-semaphores problems.
    • Fix --disable-spinlocks in 9.2 and 9.3 branches. · 37f30b25
      Tom Lane authored
      My back-patch of the 9.4-era commit 44cd47c1 into 9.2 and 9.3 fixed
      HPPA builds as expected, but it broke --disable-spinlocks builds, because
      the dummy spinlock is initialized before the underlying semaphore
      infrastructure is alive.  In 9.4 and up this works because of commit
      daa7527a, which decoupled initialization of an slock_t variable
      from access to the actual system semaphore object.  The best solution
      seems to be to back-port that patch, which should be a net win anyway
      because it improves the usability of --disable-spinlocks builds in the
      older branches; and it's been out long enough now to not be worrisome
      from a stability perspective.
  9. Apr 16, 2016
  10. Apr 15, 2016
    • Sync 9.2 and 9.3 versions of barrier.h with 9.4's version. · d7dbc882
      Tom Lane authored
      We weren't particularly maintaining barrier.h before 9.4, because nothing
      was using it in those branches.  Well, nothing until commit 37de8de9 got
      back-patched.  That broke 9.2 and 9.3 for some non-mainstream platforms
      that we haven't been testing in the buildfarm, including icc on ia64,
      HPPA, and Alpha.
      
      This commit effectively back-patches commits e5592c61, 89779bf2,
      and 747ca669, though I did it just by copying the file (less copyright
      date updates) rather than by cherry-picking those commits.
      
      Per an attempt to run gaur and pademelon over old branches they've
      not been run on since ~2013.
    • Remove trailing commas in enums. · b5450859
      Andres Freund authored
      These aren't valid C89. Found thanks to gcc's -Wc90-c99-compat. These
      exist in differing places in most supported branches.
  11. Apr 14, 2016
    • Fix pg_dump so pg_upgrade'ing an extension with simple opfamilies works. · 6bb42d52
      Tom Lane authored
      As reported by Michael Feld, pg_upgrade'ing an installation having
      extensions with operator families that contain just a single operator class
      failed to reproduce the extension membership of those operator families.
      This caused no immediate ill effects, but would create problems when later
      trying to do a plain dump and restore, because the seemingly-not-part-of-
      the-extension operator families would appear separately in the pg_dump
      output, and then would conflict with the families created by loading the
      extension.  This has been broken ever since extensions were introduced,
      and many of the standard contrib extensions are affected, so it's a bit
      astonishing nobody complained before.
      
      The cause of the problem is a perhaps-ill-considered decision to omit
      such operator families from pg_dump's output on the grounds that the
      CREATE OPERATOR CLASS commands could recreate them, and having explicit
      CREATE OPERATOR FAMILY commands would impede loading the dump script into
      pre-8.3 servers.  Whatever the merits of that decision when 8.3 was being
      written, it looks like a poor tradeoff now.  We can fix the pg_upgrade
      problem simply by removing that code, so that the operator families are
      dumped explicitly (and then will be properly made to be part of their
      extensions).
      
      Although this fixes the behavior of future pg_upgrade runs, it does nothing
      to clean up existing installations that may have improperly-linked operator
      families.  Given the small number of complaints to date, maybe we don't
      need to worry about providing an automated solution for that; anyone who
      needs to clean it up can do so with manual "ALTER EXTENSION ADD OPERATOR
      FAMILY" commands, or even just ignore the duplicate-opfamily errors they
      get during a pg_restore.  In any case we need this fix.
      
      Back-patch to all supported branches.
      
      Discussion: <20228.1460575691@sss.pgh.pa.us>
  12. Apr 12, 2016
    • Fix freshly-introduced PL/Python portability bug. · 9422b61d
      Tom Lane authored
      It turns out that those PyErr_Clear() calls I removed from plpy_elog.c
      in 7e3bb080 et al were not quite as random as they appeared: they
      mask a Python 2.3.x bug.  (Specifically, it turns out that PyType_Ready()
      can fail if the error indicator is set on entry, and PLy_traceback's fetch
      of frame.f_code may be the first operation in a session that requires the
      "frame" type to be readied.  Ick.)  Put back the clear call, but in a more
      centralized place closer to what it's protecting, and this time with a
      comment warning what it's really for.
      
      Per buildfarm member prairiedog.  Although prairiedog was only failing
      on HEAD, it seems clearly possible for this to occur in older branches
      as well, so back-patch to 9.2 the same as the previous patch.
  13. Apr 11, 2016
    • Fix access-to-already-freed-memory issue in plpython's error handling. · e1f08ba1
      Tom Lane authored
      PLy_elog() could attempt to access strings that Python had already freed,
      because the strings that PLy_get_spi_error_data() returns are simply
      pointers into storage associated with the error "val" PyObject.  That's
      fine at the instant PLy_get_spi_error_data() returns them, but just after
      that PLy_traceback() intentionally releases the only refcount on that
      object, allowing it to be freed --- so that the strings we pass to
      ereport() are dangling pointers.
      
      In principle this could result in garbage output or a coredump.  In
      practice, I think the risk is pretty low, because there are no Python
      operations between where we decrement that refcount and where we use the
      strings (and copy them into PG storage), and thus no reason for Python
      to recycle the storage.  Still, it's clearly hazardous, and it leads to
      Valgrind complaints when running under a Valgrind that hasn't been
      lobotomized to ignore Python memory allocations.
      
      The code was a mess anyway: we fetched the error data out of Python
      (clearing Python's error indicator) with PyErr_Fetch, examined it, pushed
      it back into Python with PyErr_Restore (re-setting the error indicator),
      then immediately pulled it back out with another PyErr_Fetch.  Just to
      confuse matters even more, there were some gratuitous-and-yet-hazardous
      PyErr_Clear calls in the "examine" step, and we didn't get around to doing
      PyErr_NormalizeException until after the second PyErr_Fetch, making it even
      less clear which object was being manipulated where and whether we still
      had a refcount on it.  (If PyErr_NormalizeException did substitute a
      different "val" object, it's possible that the problem could manifest for
      real, because then we'd be doing assorted Python stuff with no refcount
      on the object we have string pointers into.)
      
      So, rearrange all that into some semblance of sanity, and don't decrement
      the refcount on the Python error objects until the end of PLy_elog().
      In HEAD, I failed to resist the temptation to reformat some messy bits
      from 5c3c3cd0 along the way.
      
      Back-patch as far as 9.2, because the code is substantially the same
      that far back.  I believe that 9.1 has the bug as well; but the code
      around it is rather different and I don't want to take a chance on
      breaking something for what seems a low-probability problem.
  14. Apr 08, 2016
  15. Apr 06, 2016
  16. Apr 04, 2016
    • Fix latent portability issue in pgwin32_dispatch_queued_signals(). · 5496c75d
      Tom Lane authored
      The first iteration of the signal-checking loop would compute sigmask(0)
      which expands to 1<<(-1) which is undefined behavior according to the
      C standard.  The lack of field reports of trouble suggests that it
      evaluates to 0 on all existing Windows compilers, but that's hardly
      something to rely on.  Since signal 0 isn't a queueable signal anyway,
      we can just make the loop iterate from 1 instead, and save a few cycles
      as well as avoiding the undefined behavior.
      
      In passing, avoid evaluating the volatile expression UNBLOCKED_SIGNAL_QUEUE
      twice in a row; there's no reason to waste cycles like that.
      
      Noted by Aleksander Alekseev, though this isn't his proposed fix.
      Back-patch to all supported branches.
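
      The shape of the fix can be sketched as below. The macro and function
      names and the signal count are hypothetical stand-ins rather than the
      actual Windows port code; the points illustrated are starting the loop
      at 1, so the shift count is never negative, and reading the queue
      expression only once.

```c
/* Hypothetical stand-ins for the real signal count and mask macro. */
#define SKETCH_SIGNAL_COUNT 32
#define SKETCH_SIGMASK(sig) (1 << ((sig) - 1))  /* only valid for sig >= 1 */

/*
 * Return the lowest-numbered signal that is queued and not blocked,
 * or 0 if there is none.
 */
int
first_deliverable_signal(int queue, int blocked_mask)
{
    /* Compute the unblocked set once instead of on every test. */
    int         unblocked = queue & ~blocked_mask;
    int         sig;

    /* Start at 1: signal 0 is not queueable, and 1 << (0 - 1) is undefined. */
    for (sig = 1; sig < SKETCH_SIGNAL_COUNT; sig++)
    {
        if (unblocked & SKETCH_SIGMASK(sig))
            return sig;
    }
    return 0;
}
```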
  17. Mar 30, 2016
  18. Mar 29, 2016
    • Avoid possibly-unsafe use of Windows' FormatMessage() function. · b4b06931
      Tom Lane authored
      Whenever this function is used with the FORMAT_MESSAGE_FROM_SYSTEM flag,
      it's good practice to include FORMAT_MESSAGE_IGNORE_INSERTS as well.
      Otherwise, if the message contains any %n insertion markers, the function
      will try to fetch argument strings to substitute --- which we are not
      passing, possibly leading to a crash.  This is exactly analogous to the
      rule about not giving printf() a format string you're not in control of.
      
      Noted and patched by Christian Ullrich.
      Back-patch to all supported branches.
  19. Mar 28, 2016
  20. Mar 27, 2016
    • Change various Gin*Is* macros to return 0/1. · 290cc21d
      Andres Freund authored
      Returning the direct result of bit arithmetic, in a macro intended to be
      used in a boolean manner, can be problematic if the return value is
      stored in a variable of type 'bool'. If bool is implemented using C99's
      _Bool, that can lead to comparison failures if the variable is then
      compared again with the expression (see ginStepRight() for an example
      that fails), as _Bool forces the result to be 0/1. That happens in some
      configurations of newer MSVC compilers.  It's also problematic when
      storing the result of such an expression in a narrower type.
      
      Several gin macros have been declared in that style since gin's initial
      commit in 8a3631f8.
      
      There are a lot more macros like this, but this is the only one causing
      regression test failures; and I don't want to commit and backpatch a
      larger patch with lots of conflicts just before the next set of minor
      releases.
      
      Discussion: 20150811154237.GD17575@awork2.anarazel.de
      Backpatch: All supported branches
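
      The failure mode can be reproduced in miniature. The flag bit and
      macro names below are hypothetical, not the real gin definitions; the
      sketch assumes bool is C99's _Bool, which stores any nonzero value
      as 1.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_RIGHTMOST (1 << 4)     /* hypothetical flag bit, value 16 */

/* Old style: yields the raw masked bits, here 0 or 16. */
#define IS_RIGHTMOST_RAW(flags)   ((flags) & FLAG_RIGHTMOST)

/* New style: yields exactly 0 or 1. */
#define IS_RIGHTMOST_BOOL(flags)  (((flags) & FLAG_RIGHTMOST) != 0)

/* Store the raw result in a bool, then compare with the raw result again. */
bool
raw_roundtrip_matches(uint16_t flags)
{
    bool        saved = IS_RIGHTMOST_RAW(flags);    /* 16 is stored as 1 */

    return saved == IS_RIGHTMOST_RAW(flags);        /* compares 1 with 16 */
}

/* The same round trip with the normalized macro. */
bool
normalized_roundtrip_matches(uint16_t flags)
{
    bool        saved = IS_RIGHTMOST_BOOL(flags);

    return saved == IS_RIGHTMOST_BOOL(flags);
}
```

      When the flag is set, raw_roundtrip_matches() returns false even
      though nothing changed between the two evaluations, while the
      normalized version returns true.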
  21. Mar 26, 2016
    • Modernize zic's test for valid timezone abbreviations. · 7a68106e
      Tom Lane authored
      We really need to sync all of our IANA-derived timezone code with upstream,
      but that's going to be a large patch and I certainly don't care to shove
      such a thing into stable branches immediately before a release.  As a
      stopgap, copy just the tzcode2016c logic that checks validity of timezone
      abbreviations.  This prevents getting multiple "time zone abbreviation
      differs from POSIX standard" bleats with tzdata 2015b and later.
    • Update time zone data files to tzdata release 2016c. · 96fa3745
      Tom Lane authored
      DST law changes in Azerbaijan, Chile, Haiti, Palestine, and Russia (Altai,
      Astrakhan, Kirov, Sakhalin, Ulyanovsk regions).  Historical corrections
      for Lithuania, Moldova, Russia (Kaliningrad, Samara, Volgograd).
      
      As of 2015b, the keepers of the IANA timezone database started to use
      numeric time zone abbreviations (e.g., "+04") instead of inventing
      abbreviations not found in the wild like "ASTT".  This causes our rather
      old copy of zic to whine "warning: time zone abbreviation differs from
      POSIX standard" several times during "make install".  This warning is
      harmless according to the IANA folk, and I don't see any problems with
      these abbreviations in some simple tests; but it seems like now would be
      a good time to update our copy of the tzcode stuff.  I'll look into that
      soon.
  22. Mar 19, 2016
    • Remove dependency on psed for MSVC builds. · 89bf78a9
      Andrew Dunstan authored
      Modern Perl has removed psed from its core distribution, so it might not
      be readily available on some build platforms. We therefore replace its
      use with a Perl script generated by s2p, which is equivalent to the sed
      script. The latter is retained for non-MSVC builds to avoid creating a
      new hard dependency on Perl for non-Windows tarball builds.
      
      Backpatch to all live branches.
      
      Michael Paquier and me.
  23. Mar 17, 2016
    • Fix "pgbench -C -M prepared". · be6f9ea2
      Tom Lane authored
      This didn't work because when we dropped and re-established a database
      connection, we did not bother to reset session-specific state such as
      the statements-are-prepared flags.
      
      The st->prepared[] array certainly needs to be flushed, and I cleared a
      couple of other fields as well that couldn't possibly retain meaningful
      state for a new connection.
      
      In passing, fix some bogus comments and strange field order choices.
      
      Per report from Robins Tharakan.
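
      The idea of flushing per-session state on reconnect can be sketched
      like this. The struct layout, array sizes, and function name are
      hypothetical and much simpler than the real client state; only the
      prepared[] array is taken from the description above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sizes for the sketch. */
#define MAX_SCRIPTS  4
#define MAX_COMMANDS 8

typedef struct ClientState
{
    int         conn_id;        /* stand-in for a connection handle */
    bool        prepared[MAX_SCRIPTS][MAX_COMMANDS];   /* statement already
                                                        * prepared on this
                                                        * connection? */
    long        txn_count;      /* per-client statistic, kept across
                                 * reconnects */
} ClientState;

/*
 * Switch the client to a new connection.  Prepared statements belong to
 * the old session, so every "prepared" flag must be cleared.
 */
void
reconnect(ClientState *st, int new_conn_id)
{
    st->conn_id = new_conn_id;
    memset(st->prepared, 0, sizeof(st->prepared));
}
```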
  24. Mar 15, 2016
    • Cope if platform declares mbstowcs_l(), but not locale_t, in <xlocale.h>. · e39f86fe
      Tom Lane authored
      Previously, we included <xlocale.h> only if necessary to get the definition
      of type locale_t.  According to notes in PGAC_TYPE_LOCALE_T, this is
      important because on some versions of glibc that file supplies an
      incompatible declaration of locale_t.  (This info may be obsolete, because
      on my RHEL6 box that seems to be the *only* definition of locale_t; but
      there may still be glibc's in the wild for which it's a live concern.)
      
      It turns out though that on FreeBSD and maybe other BSDen, you can get
      locale_t from stdlib.h or locale.h but mbstowcs_l() and friends only from
      <xlocale.h>.  This was leaving us compiling calls to mbstowcs_l() and
      friends with no visible prototype, which causes a warning and could
      possibly cause actual trouble, since it's not declared to return int.
      
      Hence, adjust the configure checks so that we'll include <xlocale.h>
      either if it's necessary to get type locale_t or if it's necessary to
      get a declaration of mbstowcs_l().
      
      Report and patch by Aleksander Alekseev, somewhat whacked around by me.
      Back-patch to all supported branches, since we have been using
      mbstowcs_l() since 9.1.
  25. Mar 14, 2016
    • Add missing NULL terminator to list_SECURITY_LABEL_preposition[]. · 39b3ea71
      Tom Lane authored
      On the machines I tried this on, pressing TAB after SECURITY LABEL led to
      being offered ON and FOR as intended, plus random other keywords (varying
      across machines).  But if you were a bit more unlucky you'd get a crash,
      as reported by nummervet@mail.ru in bug #14019.
      
      Seems to have been an aboriginal error in the SECURITY LABEL patch,
      commit 4d355a83.  Hence, back-patch to all supported versions.
      There's no bug in HEAD, though, thanks to our recent tab-completion
      rewrite.
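
      For illustration, here is a sentinel-terminated keyword list and a
      loop that depends on the sentinel. The array and function names are
      hypothetical, not psql's tab-completion code. Without the trailing
      NULL the loop would keep reading past the end of the array, which
      matches the symptoms described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword list; the trailing NULL marks the end. */
static const char *const preposition_list[] = {"ON", "FOR", NULL};

/*
 * Count the entries that start with the given prefix.  The loop has no
 * length argument: it stops only when it reaches the NULL sentinel.
 */
size_t
count_matches(const char *const *list, const char *prefix)
{
    size_t      n = 0;
    size_t      len = strlen(prefix);

    for (; *list != NULL; list++)
    {
        if (strncmp(*list, prefix, len) == 0)
            n++;
    }
    return n;
}
```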
  26. Mar 10, 2016
    • Avoid crash on old Windows with AVX2-capable CPU for VS2013 builds · 78b59780
      Magnus Hagander authored
      The Visual Studio 2013 CRT generates invalid code when it makes a 64-bit
      build that is later used on a CPU that supports AVX2 instructions using a
      version of Windows before 7SP1/2008R2SP1.
      
      Detect this combination, and in those cases turn off the generation of
      FMA3, per recommendation from the Visual Studio team.
      
      The bug is actually in the CRT shipping with Visual Studio 2013, but
      Microsoft have stated they're only fixing it in newer major versions.
      The fix is therefore conditioned specifically on being built with this
      version of Visual Studio, and not previous or later versions.
      
      Author: Christian Ullrich
    • Avoid unlikely data-loss scenarios due to rename() without fsync. · ce8f4291
      Andres Freund authored
      Renaming a file using rename(2) is not guaranteed to be durable in the face
      of crashes. Use the previously added durable_rename()/durable_link_or_rename()
      in various places where we previously just renamed files.
      
      Most of the changed call sites are arguably not critical, but it seems
      better to err on the side of too much durability.  The most prominent
      known case where the previously missing fsyncs could cause data loss is
      crashes at the end of a checkpoint. After the actual checkpoint has been
      performed, old WAL files are recycled. When they're filled, their
      contents are fdatasynced, but we did not fsync the containing
      directory. An OS/hardware crash in an unfortunate moment could then end
      up leaving that file with its old name, but new content; WAL replay
      would thus not replay it.
      
      Reported-By: Tomas Vondra
      Author: Michael Paquier, Tomas Vondra, Andres Freund
      Discussion: 56583BDD.9060302@2ndquadrant.com
      Backpatch: All supported branches
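
      The general pattern on a POSIX system can be sketched as follows.
      This is not the durable_rename() implementation: the function names
      are hypothetical, error reporting is reduced to a return code, and
      the sketch assumes the old and new names live in the same directory,
      which the caller passes explicitly.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* fsync the file or directory at path; returns 0 on success, -1 on error. */
static int
fsync_path(const char *path)
{
    int         fd = open(path, O_RDONLY);
    int         rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/*
 * Hypothetical helper: rename oldpath to newpath so that the result
 * survives a crash.  dir is the directory containing both names.
 */
int
durable_rename_sketch(const char *oldpath, const char *newpath,
                      const char *dir)
{
    /* Flush the file's contents before it becomes visible under its new name. */
    if (fsync_path(oldpath) != 0)
        return -1;

    if (rename(oldpath, newpath) != 0)
        return -1;

    /* Flush the file again under its new name. */
    if (fsync_path(newpath) != 0)
        return -1;

    /* Flush the directory so the new directory entry itself is persistent. */
    return fsync_path(dir);
}
```

      The final directory fsync is the step the recycled-WAL scenario above
      was missing: without it, the file's new contents can be on disk while
      the directory still records the old name.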