  1. Jun 15, 2017
  2. May 15, 2017
  3. Apr 25, 2017
  4. Apr 24, 2017
    • Use pselect(2) not select(2), if available, to wait in postmaster's loop. · 81069a9e
      Tom Lane authored
      Traditionally we've unblocked signals, called select(2), and then blocked
      signals again.  The code expects that the select() will be cancelled with
      EINTR if an interrupt occurs; but there's a race condition, which is that
      an already-pending signal will be delivered as soon as we unblock, and then
      when we reach select() there will be nothing preventing it from waiting.
      This can result in a long delay before we perform any action that
      ServerLoop was supposed to have taken in response to the signal.  As with
      the somewhat-similar symptoms fixed by commit 89390208, the main practical
      problem is slow launching of parallel workers.  The window for trouble is
      usually pretty short, corresponding to one iteration of ServerLoop; but
      it's not negligible.
      
      To fix, use pselect(2) in place of select(2) where available, as that's
      designed to solve exactly this problem.  Where not available, we continue
      to use the old way, and are no worse off than before.
      
      pselect(2) has been required by POSIX since about 2001, so most modern
      platforms should have it.  A bigger portability issue is that some
      implementations are said to be non-atomic, ie pselect() isn't really
      any different from unblock/select/reblock.  Still, we're no worse off
      than before on such a platform.
      
      There is talk of rewriting the postmaster to use a WaitEventSet and
      not do signal response work in signal handlers, at which point this
      could be reverted, since we'd be using a self-pipe to solve the race
      condition.  But that's not happening before v11 at the earliest.
      
      Back-patch to 9.6.  The problem exists much further back, but the
      worst symptom arises only in connection with parallel query, so it
      does not seem worth taking any portability risks in older branches.
      
      Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
      81069a9e
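The race described above, and the self-pipe alternative the message mentions, can be illustrated outside the postmaster. The sketch below is not PostgreSQL code. Python's standard library does not expose pselect(2), so it demonstrates the self-pipe approach instead, via `signal.set_wakeup_fd`. The function name `wait_with_self_pipe` and the choice of SIGUSR1 are illustrative assumptions; the code is POSIX-only and must run in the main thread.

```python
import os
import select
import signal


def wait_with_self_pipe(timeout):
    """Return True if SIGUSR1 was seen, even if it arrived before the wait."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)  # set_wakeup_fd needs a non-blocking fd
    # A Python-level handler must be installed so the signal is not fatal.
    old_handler = signal.signal(signal.SIGUSR1, lambda signum, frame: None)
    old_wakeup = signal.set_wakeup_fd(write_fd)
    try:
        # Simulate the problematic ordering: the signal is delivered
        # before we start waiting.
        os.kill(os.getpid(), signal.SIGUSR1)
        # The signal left a byte in the pipe, so select() returns at once
        # instead of sleeping for the full timeout.
        readable, _, _ = select.select([read_fd], [], [], timeout)
        return bool(readable)
    finally:
        signal.set_wakeup_fd(old_wakeup)
        if old_handler is not None:
            signal.signal(signal.SIGUSR1, old_handler)
        os.close(read_fd)
        os.close(write_fd)
```

Because the wakeup byte stays in the pipe until it is read, a signal that arrives before select() starts still ends the wait immediately, which is the property the unblock/select/reblock sequence lacks.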
    • Don't include sys/poll.h anymore. · b182a4ae
      Andres Freund authored
      poll.h is mandated by Single Unix Spec v2, the usual baseline for
      postgres on unix.  None of the unixoid buildfarm animals has
      sys/poll.h but not poll.h.  Therefore there's not much point in testing
      for sys/poll.h's existence and including it optionally.
      
      Author: Andres Freund, per suggestion from Tom Lane
      Discussion: https://postgr.es/m/20505.1492723662@sss.pgh.pa.us
      b182a4ae
  5. Apr 07, 2017
    • Remove use of Jade and DSSSL · 510074f9
      Peter Eisentraut authored
      All documentation is now built using XSLT.  Remove all references to
      Jade, DSSSL, also JadeTex and some other outdated tooling.
      
      For chunked HTML builds, this changes nothing, but removes the
      transitional "oldhtml" target.  The single-page HTML build is ported
      over to XSLT.  For PDF builds, this removes the JadeTex builds and moves
      the FOP builds in their place.
      510074f9
  6. Apr 05, 2017
  7. Mar 29, 2017
    • Cast result of copyObject() to correct type · 4cb82469
      Peter Eisentraut authored
      
      copyObject() is declared to return void *, which allows easily assigning
      the result independent of the input, but it loses all type checking.
      
      If the compiler supports typeof or something similar, cast the result to
      the input type.  This creates a greater amount of type safety.  In some
      cases, where the result is assigned to a generic type such as Node * or
      Expr *, new casts are now necessary, but in general casts are now
      unnecessary in the normal case and indicate that something unusual is
      happening.
      
      Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
      4cb82469
  8. Mar 23, 2017
    • ICU support · eccfef81
      Peter Eisentraut authored
      
      Add a column collprovider to pg_collation that determines which library
      provides the collation data.  The existing choices are default and libc,
      and this adds an icu choice, which uses the ICU4C library.
      
      The pg_locale_t type is changed to a union that contains the
      provider-specific locale handles.  Users of locale information are
      changed to look into that struct for the appropriate handle to use.
      
      Also add a collversion column that records the version of the collation
      when it is created, and check at run time whether it is still the same.
      This detects potentially incompatible library upgrades that can corrupt
      indexes and other structures.  This is currently only supported by
      ICU-provided collations.
      
      initdb initializes the default collation set as before from the `locale
      -a` output but also adds all available ICU locales with a "-x-icu"
      appended.
      
      Currently, ICU-provided collations can only be explicitly named
      collations.  The global database locales are still always libc-provided.
      
      ICU support is enabled by configure --with-icu.
      
      Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
      Reviewed-by: Andreas Karlsson <andreas@proxel.se>
      eccfef81
  9. Mar 20, 2017
  10. Feb 26, 2017
    • Remove some configure header-file checks that we weren't really using. · 2bd7f857
      Tom Lane authored
      We had some AC_CHECK_HEADER tests that were really wastes of cycles,
      because the code proceeded to #include those headers unconditionally
      anyway, in all or a large majority of cases.  The lack of complaints
      shows that those headers are available on every platform of interest,
      so we might as well let configure run a bit faster by not probing
      those headers at all.
      
      I suspect that some of the tests I left alone are equally useless, but
      since all the existing #includes of the remaining headers are properly
      guarded, I didn't touch them.
      2bd7f857
  11. Feb 23, 2017
    • De-support floating-point timestamps. · b6aa17e0
      Tom Lane authored
      Per discussion, the time has come to do this.  The handwriting has been
      on the wall at least since 9.0 that this would happen someday, whenever
      it got to be too much of a burden to support the float-timestamp option.
      The triggering factor now is the discovery that there are multiple bugs
      in the code that attempts to implement use of integer timestamps in the
      replication protocol even when the server is built for float timestamps.
      The internal float timestamps leak into the protocol fields in places.
      While we could fix the identified bugs, there's a very high risk of
      introducing more.  Trying to build a wall that would positively prevent
      mixing integer and float timestamps is more complexity than we want to
      undertake to maintain a long-deprecated option.  The fact that these
      bugs weren't found through testing also indicates a lack of interest
      in float timestamps.
      
      This commit disables configure's --disable-integer-datetimes switch
      (it'll still accept --enable-integer-datetimes, though), removes direct
      references to USE_INTEGER_DATETIMES, and removes discussion of float
      timestamps from the user documentation.  A considerable amount of code is
      rendered dead by this, but removing that will occur as separate mop-up.
      
      Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
      b6aa17e0
  12. Feb 21, 2017
    • Reject too-old Python versions a bit sooner. · 4e5ce3c1
      Tom Lane authored
      Commit 04aad401 added this check after the search for a Python shared
      library, which seems to me to be a pretty unfriendly ordering.  The
      search might fail for what are basically version-related reasons, and
      in such a case it'd be better to say "your Python is too old" than
      "could not find shared library for Python".
      4e5ce3c1
    • Drop support for Python 2.3 · 04aad401
      Peter Eisentraut authored
      There is no specific reason for this right now, but keeping support for
      old Python versions around indefinitely increases the maintenance
      burden.  The oldest supported Python version is now Python 2.4, which is
      still shipped in RHEL/CentOS 5 by default.
      
      In configure, add a check for the required Python version and give a
      friendly error message for an old version, instead of relying on an
      obscure build error later on.
      04aad401
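The check described above amounts to comparing the interpreter's major.minor version against a minimum and failing early with a readable message. A minimal sketch of that logic follows; the helper name `require_python` and the message wording are assumptions for illustration and are not taken from configure.

```python
import sys

MIN_PYTHON = (2, 4)  # oldest supported version according to this commit


def require_python(version_info, minimum=MIN_PYTHON):
    """Raise a readable error if version_info is older than minimum."""
    found = tuple(version_info[:2])
    if found < minimum:
        raise RuntimeError(
            "Python version %d.%d is too old (version %d.%d or later is required)"
            % (found + minimum)
        )
    return found


# Passes on any interpreter new enough to run this sketch.
require_python(sys.version_info)
```

Running the version test before any library search means a too-old interpreter is reported as such, rather than as a missing shared library.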
  13. Feb 06, 2017
  14. Jan 03, 2017
  15. Jan 02, 2017
    • Use clock_gettime(), if available, in instr_time measurements. · 1d63f7d2
      Tom Lane authored
      The advantage of clock_gettime() is that the API allows the result to
      be precise to nanoseconds, not just microseconds as in gettimeofday().
      Now that it's routinely possible to do tens of plan node executions
      in 1us, we really need more precision than gettimeofday() can offer
      for EXPLAIN ANALYZE to accumulate statistics with.
      
      Some research shows that clock_gettime() is available on pretty nearly
      every modern Unix-ish platform, and as far as I have been able to test,
      it has about the same execution time as gettimeofday(), so there's no
      loss in switching over.  (By the same token, this doesn't do anything
      to fix the fact that we really wish clock readings were faster.  But
      there's enough win here to justify changing anyway.)
      
      A small side benefit is that on most platforms, we can use CLOCK_MONOTONIC
      instead of CLOCK_REALTIME and thereby render EXPLAIN impervious to
      concurrent resets of the system clock.  (This means that code must not
      assume that the contents of struct instr_time have any well-defined
      interpretation as timestamps, but really that was true before.)
      
      Some platforms offer nonstandard clock IDs that might be of interest.
      This patch knows we should use CLOCK_MONOTONIC_RAW on macOS, because it
      provides more precision and is faster to read than their CLOCK_MONOTONIC.
      If there turn out to be many more cases where we need special rules, it
      might be appropriate to handle the selection of clock ID in configure,
      but for the moment that doesn't seem worth the trouble.
      
      Discussion: https://postgr.es/m/31856.1400021891@sss.pgh.pa.us
      1d63f7d2
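The "use clock_gettime() if available, else fall back" pattern can be sketched in a few lines. This is an illustration in Python, not the instr_time code: `read_clock_ns` and `time_call` are hypothetical helper names, and the fallback uses `time.time()` truncated to microseconds to mimic the resolution of gettimeofday().

```python
import time


def read_clock_ns():
    """Read a monotonic clock in nanoseconds, or fall back to wall-clock time."""
    if hasattr(time, "clock_gettime_ns") and hasattr(time, "CLOCK_MONOTONIC"):
        # Monotonic: unaffected by concurrent resets of the system clock.
        return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    # Fallback with microsecond resolution only.
    return int(time.time() * 1_000_000) * 1_000


def time_call(func, *args):
    """Return (result, elapsed_ns) for a single call of func(*args)."""
    start = read_clock_ns()
    result = func(*args)
    return result, read_clock_ns() - start
```

As the commit notes, values read this way are only meaningful as differences; they should not be interpreted as timestamps.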
  16. Dec 12, 2016
  17. Dec 07, 2016
  18. Dec 05, 2016
    • Fix typo in new message in configure. · 44a977f5
      Heikki Linnakangas authored
      Remove spurious "of", and reformat to fit on an 80-character-wide line.
      44a977f5
    • Replace PostmasterRandom() with a stronger source, second attempt. · fe0a0b59
      Heikki Linnakangas authored
      This adds a new routine, pg_strong_random() for generating random bytes,
      for use in both frontend and backend. At the moment, it's only used in
      the backend, but the upcoming SCRAM authentication patches need strong
      random numbers in libpq as well.
      
      pg_strong_random() is based on, and replaces, the existing implementation
      in pgcrypto. It can acquire strong random numbers from a number of sources,
      depending on what's available:
      
      - OpenSSL RAND_bytes(), if built with OpenSSL
      - On Windows, the native cryptographic functions are used
      - /dev/urandom
      
      Unlike the current pgcrypto function, the source is chosen by configure.
      That makes it easier to test different implementations, and ensures that
      we don't accidentally fall back to a less secure implementation, if the
      primary source fails. All of those methods are quite reliable; it would be
      pretty surprising for them to fail, so we'd rather find out by failing
      hard.
      
      If no strong random source is available, we fall back to using erand48(),
      seeded from current timestamp, like PostmasterRandom() was. That isn't
      cryptographically secure, but allows us to still work on platforms that
      don't have any of the above stronger sources. Because it's not very secure,
      the built-in implementation is only used if explicitly requested with
      --disable-strong-random.
      
      This replaces the more complicated Fortuna algorithm we used to have in
      pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
      so it doesn't seem worth the maintenance effort to keep that. pgcrypto
      functions that require strong random numbers will be disabled with
      --disable-strong-random.
      
      Original patch by Magnus Hagander, tons of further work by Michael Paquier
      and me.
      
      Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
      Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
      fe0a0b59
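The policy described above (pick one source up front, fail hard if it breaks, and use a weak generator only when explicitly requested) can be sketched as follows. This is not pg_strong_random(): the function name `random_bytes` and the `strong` flag are assumptions standing in for the configure-time choice, and `os.urandom` stands in for the platform source.

```python
import os
import random
import time


def random_bytes(n, strong=True):
    """Return n random bytes from the source selected by the caller."""
    if strong:
        # Deliberately no try/except: if the OS source fails, the error
        # propagates instead of silently degrading to a weaker generator.
        return os.urandom(n)
    # Weak generator seeded from the current time. NOT cryptographically
    # secure; only reached when the caller explicitly opts out.
    rng = random.Random(time.time())
    return bytes(rng.randrange(256) for _ in range(n))
```

Making the weak path an explicit opt-in mirrors the role of --disable-strong-random: nothing falls back to it by accident.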
  19. Oct 11, 2016
    • Remove "sco" and "unixware" ports. · 2b860f52
      Tom Lane authored
      SCO OpenServer and SCO UnixWare are more or less dead platforms.
      We have never had a buildfarm member testing the "sco" port, and
      the last "unixware" member was last heard from in 2012, so it's
      fair to doubt that the code even compiles anymore on either one.
      Remove both ports.  We can always undo this if someone shows up
      with an interest in maintaining and testing these platforms.
      
      Discussion: <17177.1476136994@sss.pgh.pa.us>
      2b860f52
  20. Oct 10, 2016
    • Use unnamed POSIX semaphores, if available, on Linux and FreeBSD. · ecb0d20a
      Tom Lane authored
      We've had support for using unnamed POSIX semaphores instead of System V
      semaphores for quite some time, but it was not used by default on any
      platform.  Since many systems have rather small limits on the number of
      SysV semaphores allowed, it seems desirable to switch to POSIX semaphores
      where they're available and don't create performance or kernel resource
      problems.  Experimentation by me shows that unnamed POSIX semaphores
      are at least as good as SysV semaphores on Linux, and we previously had
      a report from Maksym Sobolyev that FreeBSD is significantly worse with
      SysV semaphores than POSIX ones.  So adjust those two platforms to use
      unnamed POSIX semaphores, if configure can find the necessary library
      functions.  If this goes well, we may switch other platforms as well,
      but it would be advisable to test them individually first.
      
      It's not currently contemplated that we'd encourage users to select
      a semaphore API for themselves, but anyone who wants to experiment
      can add PREFERRED_SEMAPHORES=UNNAMED_POSIX (or NAMED_POSIX, or SYSV)
      to their configure command line to do so.
      
      I also tweaked configure to report which API it's selected, mainly
      so that we can tell that from buildfarm reports.
      
      I did not touch the user documentation's discussion about semaphores;
      that will need some adjustment once the dust settles.
      
      Discussion: <8536.1475704230@sss.pgh.pa.us>
      ecb0d20a
  21. Oct 04, 2016
    • Improve (I hope) our autolocation of the Python shared library. · 46ddbbb1
      Tom Lane authored
      Older versions of Python produce garbage (or at least useless) values of
      get_config_vars('LDLIBRARY').  Newer versions produce garbage (or at least
      useless) values of get_config_vars('SO'), which was defeating our configure
      logic that attempted to identify where the Python shlib really is.  The net
      result, at least with a stock Python 3.5 installation on macOS, was that
      we were linking against a static library in the mistaken belief that it was
      a shared library.  This managed to work, if you count statically absorbing
      libpython into plpython.so as working.  But it no longer works as of commit
      d51924be, because now we get separate static copies of libpython in
      plpython.so and hstore_plpython.so, and those can't interoperate on the
      same data.  There are some other infelicities like assuming that nobody
      ever installs a private version of Python on a macOS machine.
      
      Hence, forget about looking in $python_configdir for the Python shlib;
      as far as I can tell no version of Python has ever put one there, and
      certainly no currently-supported version does.  Also, rather than relying
      on get_config_vars('SO'), just try all the possibilities for shlib
      extensions.  Also, rather than trusting Py_ENABLE_SHARED, believe we've
      found a shlib only if it has a recognized extension.  Last, explicitly
      cope with the possibility that the shlib is really in /usr/lib and
      $python_libdir is a red herring --- this is the actual situation on older
      macOS, but we were only accidentally working with it.
      
      Discussion: <5300.1475592228@sss.pgh.pa.us>
      46ddbbb1
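The search strategy described above (try every plausible shared-library extension, and accept a file only if it has one) can be sketched like this. The extension list, the `lib` prefix convention and the helper name `find_python_shlib` are assumptions made for illustration; the real logic lives in configure and is written in shell.

```python
import os

# Assumed set of shared-library suffixes; the real list is in configure.
SHLIB_EXTENSIONS = (".so", ".dylib", ".sl", ".dll")


def find_python_shlib(lib_name, search_dirs):
    """Return the first lib<lib_name><ext> found in search_dirs, else None."""
    for directory in search_dirs:
        for ext in SHLIB_EXTENSIONS:
            candidate = os.path.join(directory, "lib" + lib_name + ext)
            if os.path.isfile(candidate):
                return candidate
    # A static archive such as libpython3.5.a never matches, so it cannot
    # be mistaken for a shared library.
    return None
```

A caller would pass the Python library directory first and a system directory such as /usr/lib second, matching the fallback order the commit describes.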
  22. Sep 25, 2016
    • Refer to OS X as "macOS", except for the port name which is still "darwin". · da6c4f6c
      Tom Lane authored
      We weren't terribly consistent about whether to call Apple's OS "OS X"
      or "Mac OS X", and the former is probably confusing to people who aren't
      Apple users.  Now that Apple has rebranded it "macOS", follow their lead
      to establish a consistent naming pattern.  Also, avoid the use of the
      ancient project name "Darwin", except as the port code name which does not
      seem desirable to change.  (In short, this patch touches documentation and
      comments, but no actual code.)
      
      I didn't touch contrib/start-scripts/osx/, either.  I suspect those are
      obsolete and due for a rewrite, anyway.
      
      I dithered about whether to apply this edit to old release notes, but
      those were responsible for quite a lot of the inconsistencies, so I ended
      up changing them too.  Anyway, Apple's being ahistorical about this,
      so why shouldn't we be?
      da6c4f6c
  23. Sep 15, 2016
    • Fix building with LibreSSL. · 5c6df67e
      Heikki Linnakangas authored
      LibreSSL defines OPENSSL_VERSION_NUMBER to claim that it is version 2.0.0,
      but it doesn't have the functions added in OpenSSL 1.1.0. Add autoconf
      checks for the individual functions we need, and stop relying on
      OPENSSL_VERSION_NUMBER.
      
      Backport to 9.5 and 9.6, like the patch that broke this. In the
      back-branches, there are still a few OPENSSL_VERSION_NUMBER checks left,
      to check for OpenSSL 0.9.8 or 0.9.7. I left them as they were - LibreSSL
      has all those functions, so they work as intended.
      
      Per buildfarm member curculio.
      
      Discussion: <2442.1473957669@sss.pgh.pa.us>
      5c6df67e
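The reason for probing individual functions rather than trusting a version number can be shown with two stand-in "libraries". This is a toy model, not the autoconf check: the `SimpleNamespace` objects are hypothetical stand-ins, and only the symbol names and the version values (0x10100000 for OpenSSL 1.1.0, 0x20000000 for LibreSSL) reflect the real libraries.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the two libraries.
openssl_110 = SimpleNamespace(
    OPENSSL_VERSION_NUMBER=0x10100000,
    OPENSSL_init_ssl=lambda: "OPENSSL_init_ssl",
)
libressl = SimpleNamespace(
    OPENSSL_VERSION_NUMBER=0x20000000,  # claims 2.0.0
    SSL_library_init=lambda: "SSL_library_init",
)


def init_by_version(lib):
    """Fragile: assumes a high version number implies the 1.1.0 API."""
    if lib.OPENSSL_VERSION_NUMBER >= 0x10100000:
        return lib.OPENSSL_init_ssl()
    return lib.SSL_library_init()


def init_by_feature(lib):
    """Robust: probe for the function itself, as a configure check would."""
    if hasattr(lib, "OPENSSL_init_ssl"):
        return lib.OPENSSL_init_ssl()
    return lib.SSL_library_init()
```

Here `init_by_version(libressl)` raises AttributeError because the claimed version promises a function that is absent, while `init_by_feature` picks the call that actually exists.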
    • Support OpenSSL 1.1.0. · 593d4e47
      Heikki Linnakangas authored
      Changes needed to build at all:
      
      - Check for SSL_new in configure, now that SSL_library_init is a macro.
      - Do not access struct members directly. This includes some new code in
        pgcrypto, to use the resource owner mechanism to ensure that we don't
        leak OpenSSL handles, now that we can't embed them in other structs
        anymore.
      - RAND_SSLeay() -> RAND_OpenSSL()
      
      Changes that were needed to silence deprecation warnings, but were not
      strictly necessary:
      
      - RAND_pseudo_bytes() -> RAND_bytes().
      - SSL_library_init() and OpenSSL_config() -> OPENSSL_init_ssl()
      - ASN1_STRING_data() -> ASN1_STRING_get0_data()
      - DH_generate_parameters() -> DH_generate_parameters_ex()
      - Locking callbacks are not needed with OpenSSL 1.1.0 anymore. (Good
        riddance!)
      
      Also replace references to SSLEAY_VERSION_NUMBER with OPENSSL_VERSION_NUMBER,
      for the sake of consistency. OPENSSL_VERSION_NUMBER has existed since time
      immemorial.
      
      Fix SSL test suite to work with OpenSSL 1.1.0. CA certificates must have
      the "CA:true" basic constraint extension now, or OpenSSL will refuse them.
      Regenerate the test certificates with that. The "openssl" binary, used to
      generate the certificates, is also now more picky, and throws an error
      if an X509 extension is specified in "req_extensions", but that section
      is empty.
      
      Backpatch to all supported branches, per popular demand. In back-branches,
      we still support OpenSSL 0.9.7 and above. OpenSSL 0.9.6 should still work
      too, but I didn't test it. In master, we only support 0.9.8 and above.
      
      Patch by Andreas Karlsson, with additional changes by me.
      
      Discussion: <20160627151604.GD1051@msg.df7cb.de>
      593d4e47
  24. Aug 15, 2016
    • Stamp HEAD as 10devel. · ca9112a4
      Tom Lane authored
      This is a good bit more complicated than the average new-version stamping
      commit, because it includes various adjustments in pursuit of changing
      from three-part to two-part version numbers.  It's likely some further
      work will be needed around that change; but this is enough to get through
      the regression tests, at least in Unix builds.
      
      Peter Eisentraut and Tom Lane
      ca9112a4
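One place where the move from three-part to two-part version numbers shows up is the numeric version code. The sketch below assumes the PG_VERSION_NUM convention: major*10000 + minor*100 + patch before version 10, and major*10000 + minor from version 10 on. The helper name `version_num` is hypothetical and the code is not part of this commit.

```python
import re


def version_num(version):
    """Convert a version string such as '9.6.3' or '10devel' to an integer."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    first, second, third = (int(g) if g is not None else 0 for g in match.groups())
    if first >= 10:
        # Two-part scheme: the first number alone is the major version.
        return first * 10000 + second
    # Three-part scheme: the first two numbers form the major version.
    return first * 10000 + second * 100 + third
```

Under these assumptions "9.6.3" maps to 90603 and "10devel" maps to 100000.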
  25. Aug 08, 2016
  26. Jul 18, 2016
  27. Jun 20, 2016
  28. May 09, 2016
  29. Apr 08, 2016
    • Add BSD authentication method. · 34c33a1f
      Tom Lane authored
      Create a "bsd" auth method that works the same as "password" so far as
      clients are concerned, but calls the BSD Authentication service to
      check the password.  This is currently only available on OpenBSD.
      
      Marisa Emerson, reviewed by Thomas Munro
      34c33a1f
  30. Mar 21, 2016
    • Introduce WaitEventSet API. · 98a64d0b
      Andres Freund authored
      Commit ac1d7945 ("Make idle backends exit if the postmaster dies.")
      introduced a regression on, at least, large Linux systems. Constantly
      adding the same postmaster_alive_fds to the OS's internal data structures
      for implementing poll/select can cause significant contention, leading
      to a performance regression of nearly 3x in one example.
      
      This can be avoided by using e.g. Linux's epoll, which avoids having to
      add/remove file descriptors to the wait data structures at a high rate.
      Unfortunately the current latch interface makes it hard to allocate any
      persistent per-backend resources.
      
      Replace, with a backward compatibility layer, WaitLatchOrSocket with a
      new WaitEventSet API. Users can allocate such a Set across multiple
      calls, and add more than one file-descriptor to wait on. The latter has
      been added because there are upcoming postgres features where that will be
      helpful.
      
      In addition to the previously existing poll(2), select(2), and
      WaitForMultipleObjects() implementations, also provide an epoll_wait(2)
      based implementation to address the aforementioned performance
      problem. Epoll is only available on Linux, but that is the most likely
      OS for machines large enough (four sockets) to reproduce the problem.
      
      To actually address the aforementioned regression, create and use a
      long-lived WaitEventSet for FE/BE communication.  There are additional
      places that would benefit from a long-lived set, but that's a task for
      another day.
      
      Thanks to Amit Kapila, who helped make the Windows code I blindly wrote
      actually work.
      
      Reported-By: Dmitry Vasilyev
      Discussion: CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com
      Discussion: 20160114143931.GG10941@awork2.anarazel.de
      98a64d0b
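The idea of a long-lived wait set (register descriptors once, then wait on them many times) can be illustrated with Python's `selectors` module, whose default selector uses epoll on Linux. This is an analogy, not the WaitEventSet API: the `WaitSet` class and its method names are hypothetical.

```python
import selectors
import socket


class WaitSet:
    """Register sockets once, then wait on the same set repeatedly."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def add(self, sock, label):
        # Registration happens once, not on every wait.
        self._selector.register(sock, selectors.EVENT_READ, data=label)

    def wait(self, timeout):
        """Return the labels of all sockets that are readable."""
        return [key.data for key, _events in self._selector.select(timeout)]

    def close(self):
        self._selector.close()
```

A caller would create one `WaitSet` per connection, `add` the client socket once, and then call `wait` in a loop; the kernel-side registration is reused across calls instead of being rebuilt each time.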
    • Combine win32 and unix latch implementations. · 72e2d21c
      Andres Freund authored
      Previously latches for windows and unix had been implemented in
      different files. A later patch introduces an expanded wait
      infrastructure; keeping the implementations separate would introduce too
      much duplication.
      
      This basically just moves the functions, without too much change. The
      reason to keep this move separate is that it allows blame to continue
      working a little less badly, and makes review a tiny bit easier.
      
      Discussion: 20160114143931.GG10941@awork2.anarazel.de
      72e2d21c
  31. Mar 15, 2016
    • Cope if platform declares mbstowcs_l(), but not locale_t, in <xlocale.h>. · 0e9b8998
      Tom Lane authored
      Previously, we included <xlocale.h> only if necessary to get the definition
      of type locale_t.  According to notes in PGAC_TYPE_LOCALE_T, this is
      important because on some versions of glibc that file supplies an
      incompatible declaration of locale_t.  (This info may be obsolete, because
      on my RHEL6 box that seems to be the *only* definition of locale_t; but
      there may still be glibc's in the wild for which it's a live concern.)
      
      It turns out though that on FreeBSD and maybe other BSDen, you can get
      locale_t from stdlib.h or locale.h but mbstowcs_l() and friends only from
      <xlocale.h>.  This was leaving us compiling calls to mbstowcs_l() and
      friends with no visible prototype, which causes a warning and could
      possibly cause actual trouble, since it's not declared to return int.
      
      Hence, adjust the configure checks so that we'll include <xlocale.h>
      either if it's necessary to get type locale_t or if it's necessary to
      get a declaration of mbstowcs_l().
      
      Report and patch by Aleksander Alekseev, somewhat whacked around by me.
      Back-patch to all supported branches, since we have been using
      mbstowcs_l() since 9.1.
      0e9b8998
  32. Mar 14, 2016
    • Teach the configure script to validate its --with-pgport argument. · bf53d5c2
      Tom Lane authored
      Previously, configure would take any string, including an empty string,
      leading to obscure compile failures in guc.c.  It seems worth expending
      a few lines of code to ensure that the argument is a decimal number
      between 1 and 65535.
      
      Report and patch by Jim Nasby; reviews by Alex Shulgin, Peter Eisentraut,
      Ivan Kartyshov
      bf53d5c2
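The validation rule stated above (a decimal number between 1 and 65535, with the empty string rejected) is small enough to sketch directly. The real check is shell code in configure; the function name `is_valid_port` is an assumption, and this sketch does not try to reproduce any additional rules configure may apply.

```python
def is_valid_port(arg):
    """Return True if arg is a decimal number between 1 and 65535."""
    # isdigit() is False for the empty string, signs, spaces and letters;
    # isascii() excludes non-ASCII digit characters.
    if not (arg.isascii() and arg.isdigit()):
        return False
    return 1 <= int(arg) <= 65535
```

Rejecting bad input here gives an immediate, specific error instead of an obscure compile failure later in the build.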
  33. Feb 03, 2016
  34. Jan 08, 2016
    • Revert "Blind attempt at a Cygwin fix" · 46317211
      Alvaro Herrera authored
      This reverts commit e9282e95, which blew
      up in a pretty spectacular way.  Re-introduce the original code while we
      search for a real fix.
      46317211
    • Blind attempt at a Cygwin fix · e9282e95
      Alvaro Herrera authored
      Further portability fix for a9676139.  Mingw- and MSVC-based builds
      appear to be working fine, but Cygwin needs an extra tweak whereby the
      new win32security.c file is explicitly added to the list of files to
      build in pgport, per Cygwin members brolga and lorikeet.
      
      Author: Michael Paquier
      e9282e95