Skip to content
Snippets Groups Projects
  1. May 08, 2017
    • Tom Lane's avatar
      Further patch rangetypes_selfuncs.c's statistics slot management. · 4509b4eb
      Tom Lane authored
      Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8,
      not of the type of the column the statistics are for.
      
      This bug is at least partly the fault of sloppy specification comments
      for get_attstatsslot()/free_attstatsslot(): the type OID they want is that
      of the stavalues entries, not of the underlying column.  (I double-checked
      other callers and they seem to get this right.)  Adjust the comments to be
      more correct.
      
      Per buildfarm.
      
      Security: CVE-2017-7484
      4509b4eb
    • Tom Lane's avatar
      Fix possibly-uninitialized variable. · a199582e
      Tom Lane authored
      Oversight in e2d4ef8d et al (my fault not Peter's).  Per buildfarm.
      
      Security: CVE-2017-7484
      a199582e
    • Noah Misch's avatar
      Match pg_user_mappings limits to information_schema.user_mapping_options. · db215810
      Noah Misch authored
      Both views replace the umoptions field with NULL when the user does not
      meet qualifications to see it.  They used different qualifications, and
      pg_user_mappings documented qualifications did not match its implemented
      qualifications.  Make its documentation and implementation match those
      of user_mapping_options.  One might argue for stronger qualifications,
      but these have long, documented tenure.  pg_user_mappings has always
      exhibited this problem, so back-patch to 9.2 (all supported versions).
      
      Michael Paquier and Feike Steenbergen.  Reviewed by Jeff Janes.
      Reported by Andrew Wheelwright.
      
      Security: CVE-2017-7486
      db215810
    • Peter Eisentraut's avatar
      Translation updates · 769294f3
      Peter Eisentraut authored
      Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
      Source-Git-Hash: 14c4b5cb0f9330a9397159979c48e7076fa856d8
      769294f3
    • Peter Eisentraut's avatar
      Add security checks to selectivity estimation functions · d45cd7c0
      Peter Eisentraut authored
      
      Some selectivity estimation functions run user-supplied operators over
      data obtained from pg_statistic without security checks, which allows
      those operators to leak pg_statistic data without having privileges on
      the underlying tables.  Fix by checking that one of the following is
      satisfied: (1) the user has table or column privileges on the table
      underlying the pg_statistic data, or (2) the function implementing the
      user-supplied operator is leak-proof.  If neither is satisfied, planning
      will proceed as if there are no statistics available.
      
      At least one of these is satisfied in most cases in practice.  The only
      situations that are negatively impacted are user-defined or
      not-leak-proof operators on a security-barrier view.
      
      Reported-by: default avatarRobert Haas <robertmhaas@gmail.com>
      Author: Peter Eisentraut <peter_e@gmx.net>
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      
      Security: CVE-2017-7484
      d45cd7c0
  2. May 07, 2017
    • Stephen Frost's avatar
      RLS: Fix ALL vs. SELECT+UPDATE policy usage · d617c762
      Stephen Frost authored
      When we add the SELECT-privilege based policies to the RLS with check
      options (such as for an UPDATE statement, or when we have INSERT ...
      RETURNING), we need to be sure and use the 'USING' case if the policy is
      actually an 'ALL' policy (which could have both a USING clause and an
      independent WITH CHECK clause).
      
      This could result in policies acting differently when built using ALL
      (when the ALL had both USING and WITH CHECK clauses) and when building
      the policies independently as SELECT and UPDATE policies.
      
      Fix this by adding an explicit boolean to add_with_check_options() to
      indicate when the USING policy should be used, even if the policy has
      both USING and WITH CHECK policies on it.
      
      Reported by: Rod Taylor
      
      Back-patch to 9.5 where RLS was introduced.
      d617c762
  3. May 05, 2017
  4. May 04, 2017
    • Tom Lane's avatar
      Fix pfree-of-already-freed-tuple when rescanning a GiST index-only scan. · 6cfb428b
      Tom Lane authored
      GiST's getNextNearest() function attempts to pfree the previously-returned
      tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older
      branches).  However, if we are rescanning a plan node after ending a
      previous scan early, those tuple pointers could be pointing to garbage,
      because they would be pointing into the scan's pageDataCxt or queueCxt
      which has been reset.  In a debug build this reliably results in a crash,
      although I think it might sometimes accidentally fail to fail in
      production builds.
      
      To fix, clear the pointer field anyplace we reset a context it might
      be pointing into.  This may be overkill --- I think probably only the
      queueCxt case is involved in this bug, so that resetting in gistrescan()
      would be sufficient --- but dangling pointers are generally bad news,
      so let's avoid them.
      
      Another plausible answer might be to just not bother with the pfree in
      getNextNearest().  The reconstructed tuples would go away anyway in the
      context resets, and I'm far from convinced that freeing them a bit earlier
      really saves anything meaningful.  I'll stick with the original logic in
      this patch, but if we find more problems in the same area we should
      consider that approach.
      
      Per bug #14641 from Denis Smirnov.  Back-patch to 9.5 where this
      logic was introduced.
      
      Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org
      6cfb428b
  5. May 03, 2017
  6. Apr 28, 2017
    • Robert Haas's avatar
      Fix VALIDATE CONSTRAINT to consider NO INHERIT attribute. · a0291c33
      Robert Haas authored
      Currently, trying to validate a NO INHERIT constraint on the parent will
      search for the constraint in child tables (where it is not supposed to
      exist), wrongly causing a "constraint does not exist" error.
      
      Amit Langote, per a report from Hans Buschmann.
      
      Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
      a0291c33
    • Andres Freund's avatar
      Don't use on-disk snapshots for exported logical decoding snapshot. · 54270d7e
      Andres Freund authored
      Logical decoding stores historical snapshots on disk, so that logical
      decoding can restart without having to reconstruct a snapshot from
      scratch (for which the resources are not guaranteed to be present
      anymore).  These serialized snapshots were also used when creating a
      new slot via the walsender interface, which can export a "full"
      snapshot (i.e. one that can read all tables, not just catalog ones).
      
      The problem is that the serialized snapshots are only useful for
      catalogs and not for normal user tables.  Thus the use of such a
      serialized snapshot could result in an inconsistent snapshot being
      exported, which could lead to queries returning wrong data.  This
      would only happen if logical slots are created while another logical
      slot already exists.
      
      Author: Petr Jelinek
      Reviewed-By: Andres Freund
      Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
      Backport: 9.4, where logical decoding was introduced.
      54270d7e
  7. Apr 27, 2017
    • Andres Freund's avatar
      Preserve required !catalog tuples while computing initial decoding snapshot. · 47f896b5
      Andres Freund authored
      The logical decoding machinery already preserved all the required
      catalog tuples, which is sufficient in the course of normal logical
      decoding, but did not guarantee that non-catalog tuples were preserved
      during computation of the initial snapshot when creating a slot over
      the replication protocol.
      
      This could cause a corrupted initial snapshot being exported.  The
      time window for issues is usually not terribly large, but on a busy
      server it's perfectly possible to it hit it.  Ongoing decoding is not
      affected by this bug.
      
      To avoid increased overhead for the SQL API, only retain additional
      tuples when a logical slot is being created over the replication
      protocol.  To do so this commit changes the signature of
      CreateInitDecodingContext(), but it seems unlikely that it's being
      used in an extension, so that's probably ok.
      
      In a drive-by fix, fix handling of
      ReplicationSlotsComputeRequiredXmin's already_locked argument, which
      should only apply to ProcArrayLock, not ReplicationSlotControlLock.
      
      Reported-By: Erik Rijkers
      Analyzed-By: Petr Jelinek
      Author: Petr Jelinek, heavily editorialized by Andres Freund
      Reviewed-By: Andres Freund
      Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com
      Backport: 9.4, where logical decoding was introduced.
      47f896b5
  8. Apr 24, 2017
    • Tom Lane's avatar
      Fix postmaster's handling of fork failure for a bgworker process. · dba1f310
      Tom Lane authored
      This corner case didn't behave nicely at all: the postmaster would
      (partially) update its state as though the process had started
      successfully, and be quite confused thereafter.  Fix it to act
      like the worker had crashed, instead.
      
      In passing, refactor so that do_start_bgworker contains all the
      state-change logic for bgworker launch, rather than just some of it.
      
      Back-patch as far as 9.4.  9.3 contains similar logic, but it's just
      enough different that I don't feel comfortable applying the patch
      without more study; and the use of bgworkers in 9.3 was so small
      that it doesn't seem worth the extra work.
      
      transam/parallel.c is still entirely unprepared for the possibility
      of bgworker startup failure, but that seems like material for a
      separate patch.
      
      Discussion: https://postgr.es/m/4905.1492813727@sss.pgh.pa.us
      dba1f310
    • Andrew Gierth's avatar
      Repair crash with unsortable data in grouping sets. · 7be3678a
      Andrew Gierth authored
      Previously the code would generate incorrect results, assertion
      failures, or crashes if given unsortable (but hashable) columns in
      grouping sets.  Handle by throwing an error instead.
      
      Report and patch by Pavan Deolasee (though I changed the error
      wording slightly); regression test by me.
      
      (This affects 9.5 only since the planner was refactored in 9.6.)
      7be3678a
    • Andres Freund's avatar
      Zero padding in replication origin's checkpointed on disk-state. · 81ff04de
      Andres Freund authored
      This seems to be largely cosmetic, avoiding valgrind bleats and the
      like. The uninitialized padding influences the CRC of the on-disk
      entry, but because it's also used when verifying the CRC, that doesn't
      cause spurious failures.  Backpatch nonetheless.
      
      It's a bit unfortunate that contrib/test_decoding/sql/replorigin.sql
      doesn't exercise the checkpoint path, but checkpoints are fairly
      expensive on weaker machines, and we'd have to stop/start for that to
      be meaningful.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de
      Backpatch: 9.5, where replication origins were introduced
      81ff04de
  9. Apr 23, 2017
    • Tom Lane's avatar
      Fix order of arguments to SubTransSetParent(). · a66e01bb
      Tom Lane authored
      ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions)
      mixed up the parent and child XIDs when calling SubTransSetParent to
      record the transactions' relationship in pg_subtrans.
      
      Remarkably, analysis by Simon Riggs suggests that this doesn't lead to
      visible problems (at least, not in non-Assert builds).  That might
      explain why we'd not noticed it before.  Nonetheless, it's surely wrong.
      
      This code was born broken, so back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us
      a66e01bb
  10. Apr 21, 2017
    • Tom Lane's avatar
      Avoid depending on non-POSIX behavior of fcntl(2). · 9f8be754
      Tom Lane authored
      The POSIX standard does not say that the success return value for
      fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1.
      We had several calls that were making the stronger assumption.  Adjust
      them to test specifically for -1 for strict spec compliance.
      
      The standard further leaves open the possibility that the O_NONBLOCK
      flag bit is not the only active one in F_SETFL's argument.  Formally,
      therefore, one ought to get the current flags with F_GETFL and store
      them back with only the O_NONBLOCK bit changed when trying to change
      the nonblock state.  In port/noblock.c, we were doing the full pushup
      in pg_set_block but not in pg_set_noblock, which is just weird.  Make
      both of them do it properly, since they have little business making
      any assumptions about the socket they're handed.  The other places
      where we're issuing F_SETFL are working with FDs we just got from
      pipe(2), so it's reasonable to assume the FDs' properties are all
      default, so I didn't bother adding F_GETFL steps there.
      
      Also, while pg_set_block deserves some points for trying to do things
      right, somebody had decided that it'd be even better to cast fcntl's
      third argument to "long".  Which is completely loony, because POSIX
      clearly says the third argument for an F_SETFL call is "int".
      
      Given the lack of field complaints, these missteps apparently are not
      of significance on any common platforms.  But they're still wrong,
      so back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us
      9f8be754
  11. Apr 17, 2017
    • Tom Lane's avatar
      Always build a custom plan node's targetlist from the path's pathtarget. · 6f0f98bb
      Tom Lane authored
      We were applying the use_physical_tlist optimization to all relation
      scan plans, even those implemented by custom scan providers.  However,
      that's a bad idea for a couple of reasons.  The custom provider might
      be unable to provide columns that it hadn't expected to be asked for
      (for example, the custom scan might depend on an index-only scan).
      Even more to the point, there's no good reason to suppose that this
      "optimization" is a win for a custom scan; whatever the custom provider
      is doing is likely not based on simply returning physical heap tuples.
      (As a counterexample, if the custom scan is an interface to a column store,
      demanding all columns would be a huge loss.)  If it is a win, the custom
      provider could make that decision for itself and insert a suitable
      pathtarget into the path, anyway.
      
      Per discussion with Dmitry Ivanov.  Back-patch to 9.5 where custom scan
      support was introduced.  The argument that the custom provider can adjust
      the behavior by changing the pathtarget only applies to 9.6+, but on
      balance it seems more likely that use_physical_tlist will hurt custom
      scans than help them.
      
      Discussion: https://postgr.es/m/e29ddd30-8ef9-4da5-a50b-2bb7b8c7198d@postgrespro.ru
      6f0f98bb
    • Peter Eisentraut's avatar
      Fix compiler warning · b6e6ae1d
      Peter Eisentraut authored
      Introduced by 087e696f, happens with gcc
      4.7.2.
      b6e6ae1d
  12. Apr 15, 2017
    • Tom Lane's avatar
      Provide a way to control SysV shmem attach address in EXEC_BACKEND builds. · bbd4a1b6
      Tom Lane authored
      In standard non-Windows builds, there's no particular reason to care what
      address the kernel chooses to map the shared memory segment at.  However,
      when building with EXEC_BACKEND, there's a risk that the chosen address
      won't be available in all child processes.  Linux with ASLR enabled (which
      it is by default) seems particularly at risk because it puts shmem segments
      into the same area where it maps shared libraries.  We can work around
      that by specifying a mapping address that's outside the range where
      shared libraries could get mapped.  On x86_64 Linux, 0x7e0000000000
      seems to work well.
      
      This is only meant for testing/debugging purposes, so it doesn't seem
      necessary to go as far as providing a GUC (or any user-visible
      documentation, though we might change that later).  Instead, it's just
      controlled by setting an environment variable PG_SHMEM_ADDR to the
      desired attach address.
      
      Back-patch to all supported branches, since the point here is to
      remove intermittent buildfarm failures on EXEC_BACKEND animals.
      Owners of affected animals will need to add a suitable setting of
      PG_SHMEM_ADDR to their build_env configuration.
      
      Discussion: https://postgr.es/m/7036.1492231361@sss.pgh.pa.us
      bbd4a1b6
  13. Apr 13, 2017
    • Tom Lane's avatar
      Fix regexport.c to behave sanely with lookaround constraints. · 67665a71
      Tom Lane authored
      regexport.c thought it could just ignore LACON arcs, but the correct
      behavior is to treat them as satisfiable while consuming zero input
      (rather reminiscently of commit 9f1e642d).  Otherwise, the emitted
      simplified-NFA representation may contain no paths leading from initial
      to final state, which unsurprisingly confuses pg_trgm, as seen in
      bug #14623 from Jeff Janes.
      
      Since regexport's output representation has no concept of an arc that
      consumes zero input, recurse internally to find the next normal arc(s)
      after any LACON transitions.  We'd be forced into changing that
      representation if a LACON could be the last arc reaching the final
      state, but fortunately the regex library never builds NFAs with such
      a configuration, so there always is a next normal arc.
      
      Back-patch to 9.3 where this logic was introduced.
      
      Discussion: https://postgr.es/m/20170413180503.25948.94871@wrigleys.postgresql.org
      67665a71
  14. Apr 06, 2017
    • Heikki Linnakangas's avatar
      Remove dead code and fix comments in fast-path function handling. · 39cedd8d
      Heikki Linnakangas authored
      HandleFunctionRequest() is no longer responsible for reading the protocol
      message from the client, since commit 2b3a8b20. Fix the outdated
      comments.
      
      HandleFunctionRequest() now always returns 0, because the code that used
      to return EOF was moved in 2b3a8b20. Therefore, the caller no longer
      needs to check the return value.
      
      Reported by Andres Freund. Backpatch to all supported versions, even though
      this doesn't have any user-visible effect, to make backporting future
      patches in this area easier.
      
      Discussion: https://www.postgresql.org/message-id/20170405010525.rt5azbya5fkbhvrx@alap3.anarazel.de
      39cedd8d
    • Tom Lane's avatar
      Fix integer-overflow problems in interval comparison. · d68a2b20
      Tom Lane authored
      When using integer timestamps, the interval-comparison functions tried
      to compute the overall magnitude of an interval as an int64 number of
      microseconds.  As reported by Frazer McLean, this overflows for intervals
      exceeding about 296000 years, which is bad since we nominally allow
      intervals many times larger than that.  That results in wrong comparison
      results, and possibly in corrupted btree indexes for columns containing
      such large interval values.
      
      To fix, compute the magnitude as int128 instead.  Although some compilers
      have native support for int128 calculations, many don't, so create our
      own support functions that can do 128-bit addition and multiplication
      if the compiler support isn't there.  These support functions are designed
      with an eye to allowing the int128 code paths in numeric.c to be rewritten
      for use on all platforms, although this patch doesn't do that, or even
      provide all the int128 primitives that will be needed for it.
      
      Back-patch as far as 9.4.  Earlier releases did not guard against overflow
      of interval values at all (commit 146604ec fixed that), so it seems not
      very exciting to worry about overly-large intervals for them.
      
      Before 9.6, we did not assume that unreferenced "static inline" functions
      would not draw compiler warnings, so omit functions not directly referenced
      by timestamp.c, the only present consumer of int128.h.  (We could have
      omitted these functions in HEAD too, but since they were written and
      debugged on the way to the present patch, and they look likely to be needed
      by numeric.c, let's keep them in HEAD.)  I did not bother to try to prevent
      such warnings in a --disable-integer-datetimes build, though.
      
      Before 9.5, configure will never define HAVE_INT128, so the part of
      int128.h that exploits a native int128 implementation is dead code in the
      9.4 branch.  I didn't bother to remove it, thinking that keeping the file
      looking similar in different branches is more useful.
      
      In HEAD only, add a simple test harness for int128.h in src/tools/.
      
      In back branches, this does not change the float-timestamps code path.
      That's not subject to the same kind of overflow risk, since it computes
      the interval magnitude as float8.  (No doubt, when this code was originally
      written, overflow was disregarded for exactly that reason.)  There is a
      precision hazard instead :-(, but we'll avert our eyes from that question,
      since no complaints have been reported and that code's deprecated anyway.
      
      Kyotaro Horiguchi and Tom Lane
      
      Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com
      d68a2b20
  15. Apr 01, 2017
  16. Mar 28, 2017
  17. Mar 26, 2017
    • Tom Lane's avatar
      Fix unportable disregard of alignment requirements in RADIUS code. · 24fc43d4
      Tom Lane authored
      The compiler is entitled to store a char[] local variable with no
      particular alignment requirement.  Our RADIUS code cavalierly took such
      a local variable and cast its address to a struct type that does have
      alignment requirements.  On an alignment-picky machine this would lead
      to bus errors.  To fix, declare the local variable honestly, and then
      cast its address to char * for use in the I/O calls.
      
      Given the lack of field complaints, there must be very few if any
      people affected; but nonetheless this is a clear portability issue,
      so back-patch to all supported branches.
      
      Noted while looking at a Coverity complaint in the same code.
      24fc43d4
  18. Mar 24, 2017
  19. Mar 17, 2017
    • Heikki Linnakangas's avatar
      Fix and simplify check for whether we're running as Windows service. · 96fd76dd
      Heikki Linnakangas authored
      If the process token contains SECURITY_SERVICE_RID, but it has been
      disabled by the SE_GROUP_USE_FOR_DENY_ONLY attribute, win32_is_service()
      would incorrectly report that we're running as a service. That situation
      arises, e.g. if postmaster is launched with a restricted security token,
      with the "Log in as Service" privilege explicitly removed.
      
      Replace the broken code with CheckProcessTokenMembership(), which does
      this correctly. Also replace similar code in win32_is_admin(), even
      though it got this right, for simplicity and consistency.
      
      Per bug #13755, reported by Breen Hagan. Back-patch to all supported
      versions. Patch by Takayuki Tsunakawa, reviewed by Michael Paquier.
      
      Discussion: https://www.postgresql.org/message-id/20151104062315.2745.67143%40wrigleys.postgresql.org
      96fd76dd
  20. Mar 16, 2017
  21. Mar 14, 2017
  22. Mar 08, 2017
    • Tom Lane's avatar
      Use doubly-linked block lists in aset.c to reduce large-chunk overhead. · 50a9d714
      Tom Lane authored
      Large chunks (those too large for any palloc freelist) are managed as
      separate blocks.  Formerly, realloc'ing or pfree'ing such a chunk required
      O(N) time in a context with N blocks, since we had to traipse down the
      singly-linked block list to locate the block's predecessor before we could
      fix the list links.  This can result in O(N^2) runtime in situations where
      large numbers of such chunks are manipulated within one context.  Cases
      like that were not foreseen in the original design of aset.c, and indeed
      didn't arise until fairly recently.  But such problems can now occur in
      reorderbuffer.c and in hash joining, both of which make repeated large
      requests without scaling up their request size as they do so, and which
      will free their requests in not-necessarily-LIFO order.
      
      To fix, change the block list from singly-linked to doubly-linked.
      This adds another 4 or 8 bytes to ALLOC_BLOCKHDRSZ, but that doesn't
      seem like unacceptable overhead, since aset.c's blocks are normally
      8K or more, and never less than 1K in current practice.
      
      In passing, get rid of some redundant AllocChunkGetPointer() calls in
      AllocSetRealloc (the compiler might be smart enough to optimize these
      away anyway, but no need to assume that) and improve AllocSetCheck's
      checking of block header fields.
      
      Back-patch to 9.4 where reorderbuffer.c appeared.  We could take this
      further back, but currently there's no evidence that it would be useful.
      
      Discussion: https://postgr.es/m/CAMkU=1x1hvue1XYrZoWk_omG0Ja5nBvTdvgrOeVkkeqs71CV8g@mail.gmail.com
      50a9d714
  23. Mar 06, 2017
    • Tom Lane's avatar
      Avoid dangling pointer to relation name in RLS code path in DoCopy(). · 420d9ec0
      Tom Lane authored
      With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE,
      and would sometimes fail without that, because it used the relation name
      directly from the relcache as part of the parsetree it's building.  That
      becomes a potentially-dangling pointer as soon as the relcache entry is
      closed, a bit further down.  Typical symptom if the relcache entry chanced
      to get cleared would be "relation does not exist" error with a garbage
      relation name, or possibly a core dump; but if you were really truly
      unlucky, the COPY might copy from the wrong table.
      
      Per report from Andrew Dunstan that regression tests fail with
      -DRELCACHE_FORCE_RELEASE.  The core tests now pass for me (but have
      not tried "make check-world" yet).
      
      Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
      420d9ec0
  24. Mar 04, 2017
    • Tom Lane's avatar
      In rebuild_relation(), don't access an already-closed relcache entry. · 807df31d
      Tom Lane authored
      This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by
      Andrew Dunstan, and could sometimes fail in normal operation, resulting
      in a wrong persistence value being used for the transient table.
      It's not immediately clear to me what effects that might have beyond
      the risk of a crash while accessing OldHeap->rd_rel->relpersistence,
      but it's probably not good.
      
      Bug introduced by commit f41872d0, and made substantially worse by
      commit 85b506bb, which added a second such access significantly
      later than the heap_close.  I doubt the first reference could fail
      in a production scenario, but the second one definitely could.
      
      Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
      807df31d
  25. Feb 28, 2017
  26. Feb 22, 2017
  27. Feb 21, 2017
    • Tom Lane's avatar
      Fix sloppy handling of corner-case errors in fd.c. · ff1b032a
      Tom Lane authored
      Several places in fd.c had badly-thought-through handling of error returns
      from lseek() and close().  The fact that those would seldom fail on valid
      FDs is probably the reason we've not noticed this up to now; but if they
      did fail, we'd get quite confused.
      
      LruDelete and LruInsert actually just Assert'd that lseek never fails,
      which is pretty awful on its face.
      
      In LruDelete, we indeed can't throw an error, because that's likely to get
      called during error abort and so throwing an error would probably just lead
      to an infinite loop.  But by the same token, throwing an error from the
      close() right after that was ill-advised, not to mention that it would've
      left the LRU state corrupted since we'd already unlinked the VFD from the
      list.  I also noticed that really, most of the time, we should know the
      current seek position and it shouldn't be necessary to do an lseek here at
      all.  As patched, if we don't have a seek position and an lseek attempt
      doesn't give us one, we'll close the file but then subsequent re-open
      attempts will fail (except in the somewhat-unlikely case that a
      FileSeek(SEEK_SET) call comes between and allows us to re-establish a known
      target seek position).  This isn't great but it won't result in any state
      corruption.
      
      Meanwhile, having an Assert instead of an honest test in LruInsert is
      really dangerous: if that lseek failed, a subsequent read or write would
      read or write from the start of the file, not where the caller expected,
      leading to data corruption.
      
      In both LruDelete and FileClose, if close() fails, just LOG that and mark
      the VFD closed anyway.  Possibly leaking an FD is preferable to getting
      into an infinite loop or corrupting the VFD list.  Besides, as far as I can
      tell from the POSIX spec, it's unspecified whether or not the file has been
      closed, so treating it as still open could be the wrong thing anyhow.
      
      I also fixed a number of other places that were being sloppy about
      behaving correctly when the seekPos is unknown.
      
      Also, I changed FileSeek to return -1 with EINVAL for the cases where it
      detects a bad offset, rather than throwing a hard elog(ERROR).  It seemed
      pretty inconsistent that some bad-offset cases would get a failure return
      while others got elog(ERROR).  It was missing an offset validity check for
      the SEEK_CUR case on a closed file, too.
      
      Back-patch to all supported branches, since all this code is fundamentally
      identical in all of them.
      
      Discussion: https://postgr.es/m/2982.1487617365@sss.pgh.pa.us
      ff1b032a
  28. Feb 15, 2017
    • Tom Lane's avatar
      Make sure that hash join's bulk-tuple-transfer loops are interruptible. · 96ba1764
      Tom Lane authored
      The loops in ExecHashJoinNewBatch(), ExecHashIncreaseNumBatches(), and
      ExecHashRemoveNextSkewBucket() are all capable of iterating over many
      tuples without ever doing a CHECK_FOR_INTERRUPTS, so that the backend
      might fail to respond to SIGINT or SIGTERM for an unreasonably long time.
      Fix that.  In the case of ExecHashJoinNewBatch(), it seems useful to put
      the added CHECK_FOR_INTERRUPTS into ExecHashJoinGetSavedTuple() rather
      than directly in the loop, because that will also ensure that both
      principal code paths through ExecHashJoinOuterGetTuple() will do a
      CHECK_FOR_INTERRUPTS, which seems like a good idea to avoid surprises.
      
      Back-patch to all supported branches.
      
      Tom Lane and Thomas Munro
      
      Discussion: https://postgr.es/m/6044.1487121720@sss.pgh.pa.us
      96ba1764
    • Tom Lane's avatar
      Fix YA unwanted behavioral difference with operator_precedence_warning. · 2b47e29f
      Tom Lane authored
      Jeff Janes noted that the error cursor position shown for some errors
      would vary when operator_precedence_warning is turned on.  We'd prefer
      that option to have no undocumented effects, so this isn't desirable.
      To fix, make sure that an AEXPR_PAREN node has the same exprLocation
      as its child node.
      
      (Note: it would be a little cheaper to use @2 here instead of an
      exprLocation call, but there are cases where that wouldn't produce
      the identical answer, so don't do it like that.)
      
      Back-patch to 9.5 where this feature was introduced.
      
      Discussion: https://postgr.es/m/CAMkU=1ykK+VhhcQ4Ky8KBo9FoaUJH3f3rDQB8TkTXi-ZsBRUkQ@mail.gmail.com
      2b47e29f
Loading