  1. Feb 12, 2013
    • Alvaro Herrera's avatar
      Create libpgcommon, and move pg_malloc et al to it · 8396447c
      Alvaro Herrera authored
      libpgcommon is a new static library to allow sharing code among the
      various frontend programs and backend; this lets us eliminate duplicate
      implementations of common routines.  We avoid libpgport, because that's
      intended as a place for porting issues; per discussion, it seems better
      to keep them separate.
      
      The first use case, and the only one implemented by this patch, is
      pg_malloc and friends, which many frontend programs were already using.
      
      At the same time, we can use this to provide palloc emulation functions
      for the frontend; this way, some palloc-using files in the backend can
      also be used by the frontend cleanly.  To do this, we change palloc() in
      the backend to be a function instead of a macro on top of
      MemoryContextAlloc().  This was previously believed to cause loss of
      performance, but this implementation has been tweaked by Tom and Andres
      so that on modern compilers it provides a slight improvement over the
      previous one.
      
      This lets us clean up some places that were already doing this with
      localized hacks.
      
      Most of the pg_malloc/palloc changes in this patch were authored by
      Andres Freund.  Zoltán Böszörményi also independently provided a similar
      patch.  The libpgcommon infrastructure was authored by Álvaro.
      8396447c
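
      A minimal sketch of the idea (not the committed code): an exit-on-failure
      pg_malloc for frontend use, with palloc emulated on top of it so that
      palloc-using backend files can also be compiled into frontend programs.

          /* Illustrative sketch only; the names match the commit's
           * description, but the bodies are simplified. */
          #include <stdio.h>
          #include <stdlib.h>

          void *
          pg_malloc(size_t size)
          {
              void       *ptr = malloc(size);

              if (ptr == NULL)
              {
                  fprintf(stderr, "out of memory\n");
                  exit(EXIT_FAILURE);
              }
              return ptr;
          }

          /* Frontend emulation: palloc simply forwards to pg_malloc. */
          void *
          palloc(size_t size)
          {
              return pg_malloc(size);
          }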
  2. Feb 08, 2013
    • Tom Lane's avatar
      Make contrib/btree_gist's GiST penalty function a bit saner. · 9221f9d4
      Tom Lane authored
      The previous coding supposed that the first differing bytes in two varlena
      datums must have the same sign difference as their overall comparison
      result.  This is obviously bogus for text strings in non-C locales, and
      probably wrong for numeric, and even for bytea I think it was wrong on
      machines where char is signed.  When the assumption failed, the function
      could deliver a zero or negative penalty in situations where such a result
      is quite ridiculous, leading the core GiST code to make very bad page-split
      decisions.
      
      To fix, take the absolute values of the byte-level differences.  Also,
      switch the code to using unsigned char not just char, so that the behavior
      will be consistent whether char is signed or not.
      
      Per investigation of a trouble report from Tomas Vondra.  Back-patch to all
      supported branches.
      9221f9d4
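
      The fix can be illustrated with a small sketch (not the actual
      btree_gist code): compare bytes as unsigned char and take the absolute
      value of the first difference, so the penalty can never come out
      negative regardless of whether char is signed.

          #include <stdlib.h>

          /* Non-negative "distance" based on the first differing byte. */
          static int
          first_byte_penalty(const unsigned char *a, const unsigned char *b,
                             size_t len)
          {
              for (size_t i = 0; i < len; i++)
              {
                  if (a[i] != b[i])
                      return abs((int) a[i] - (int) b[i]);    /* always >= 0 */
              }
              return 0;               /* equal prefixes give zero penalty */
          }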
    • Tom Lane's avatar
      Fix erroneous range-union logic for varlena types in contrib/btree_gist. · 94f565dc
      Tom Lane authored
      gbt_var_bin_union() failed to do the right thing when the existing range
      needed to be widened at both ends rather than just one end.  This could
      result in an invalid index in which keys that are present would not be
      found by searches, because the searches would not think they need to
      descend to the relevant leaf pages.  This error affected all the varlena
      datatypes supported by btree_gist (text, bytea, bit, numeric).
      
      Per investigation of a trouble report from Tomas Vondra.  (There is also
      an issue in gbt_var_penalty(), but that should only result in inefficiency
      not wrong answers.  I'm committing this separately so that we have a git
      state in which it can be tested that bad penalty results don't produce
      invalid indexes.)  Back-patch to all supported branches.
      94f565dc
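
      For illustration only (not the contrib code), a correct union must widen
      the lower and upper bounds independently, since a new key can fall
      outside the existing range at both ends:

          typedef struct
          {
              int         lower;
              int         upper;
          } Range;

          static void
          range_union(Range *r, int key_lower, int key_upper)
          {
              if (key_lower < r->lower)
                  r->lower = key_lower;   /* widen at the low end */
              if (key_upper > r->upper)
                  r->upper = key_upper;   /* ...and, independently, at the high end */
          }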
  3. Jan 31, 2013
    • Alvaro Herrera's avatar
      pgrowlocks: fix bogus lock strength output · 77a3082f
      Alvaro Herrera authored
      Per report from digoal@126.com
      77a3082f
    • Tatsuo Ishii's avatar
      Add --aggregate-interval option. · 6a651d85
      Tatsuo Ishii authored
      The new option specifies the length of the aggregation interval (in
      seconds) and may be used only together with -l.  With this option, the
      log contains a per-interval summary (number of transactions, min/max
      latency, and two additional fields useful for variance estimation).
      
      Patch contributed by Tomas Vondra, reviewed by Pavel Stehule.  Slight
      change by Tatsuo Ishii, as suggested by Robert Haas, to emit an error
      message indicating that the option is not currently supported on
      Windows.
      6a651d85
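
      A rough sketch (not pgbench's implementation) of the per-interval
      aggregation described above; the struct and function names are
      hypothetical, and the "two additional fields" are taken here to be the
      sum and sum of squares of latencies, which suffice to estimate variance.

          typedef struct
          {
              long        cnt;        /* number of transactions in the interval */
              double      min_lat;
              double      max_lat;
              double      sum_lat;    /* for the mean */
              double      sum_lat2;   /* sum of squares, for the variance */
          } AggInterval;

          static void
          agg_add(AggInterval *agg, double latency)
          {
              if (agg->cnt == 0 || latency < agg->min_lat)
                  agg->min_lat = latency;
              if (agg->cnt == 0 || latency > agg->max_lat)
                  agg->max_lat = latency;
              agg->sum_lat += latency;
              agg->sum_lat2 += latency * latency;
              agg->cnt++;
          }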
  4. Jan 29, 2013
    • Heikki Linnakangas's avatar
      Allow pgbench to use a scale larger than 21474. · 89d00cbe
      Heikki Linnakangas authored
      Beyond 21474, the number of accounts exceeds the range of int4. Change the
      initialization code to use bigint for account id columns when scale is large
      enough, and switch to using int64s for the variables in pgbench code. The
      threshold where we switch to bigints is set at 20000, because that's easier
      to remember and document than 21474, and ensures that there is some headroom
      when int4s are used.
      
      Greg Smith, with various changes by Euler Taveira de Oliveira, Gurjeet
      Singh and Satoshi Nagayasu.
      89d00cbe
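
      The threshold is easy to verify with a few lines of C (illustrative
      only; the scale value is hypothetical): pgbench creates scale * 100000
      accounts, and INT_MAX / 100000 is 21474.

          #include <limits.h>
          #include <stdint.h>
          #include <stdio.h>

          int
          main(void)
          {
              int         scale = 25000;      /* hypothetical large scale */
              int64_t     naccounts = (int64_t) scale * 100000;

              printf("int4 overflows above a scale of %d\n", INT_MAX / 100000);
              printf("accounts at scale %d: %lld\n", scale, (long long) naccounts);
              printf("use bigint columns: %s\n", scale >= 20000 ? "yes" : "no");
              return 0;
          }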
  5. Jan 23, 2013
    • Alvaro Herrera's avatar
      Improve concurrency of foreign key locking · 0ac5ad51
      Alvaro Herrera authored
      This patch introduces two additional lock modes for tuples: "SELECT FOR
      KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
      other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
      FOR UPDATE".  UPDATE commands that do not modify the values stored in
      the columns that are part of the key of the tuple now grab a SELECT FOR
      NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
      with tuple locks of the FOR KEY SHARE variety.
      
      Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
      means the concurrency improvement applies to them, which is the whole
      point of this patch.
      
      The added tuple lock semantics require some rejiggering of the multixact
      module, so that the locking level that each transaction is holding can
      be stored alongside its Xid.  Also, multixacts now need to persist
      across server restarts and crashes, because they can now represent not
      only tuple locks, but also tuple updates.  This means we need more
      careful tracking of lifetime of pg_multixact SLRU files; since they now
      persist longer, we require more infrastructure to figure out when they
      can be removed.  pg_upgrade also needs to be careful to copy
      pg_multixact files over from the old server to the new, or at least part
      of multixact.c state, depending on the versions of the old and new
      servers.
      
      Tuple time qualification rules (HeapTupleSatisfies routines) need to be
      careful not to consider tuples with the "is multi" infomask bit set as
      being only locked; they might need to look up MultiXact values (i.e.
      possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
      whereas they previously were assured to only use information readily
      available from the tuple header.  This is considered acceptable, because
      the extra I/O would involve cases that would previously cause some
      commands to block waiting for concurrent transactions to finish.
      
      Another important change is the fact that locking tuples that have
      previously been updated causes the future versions to be marked as
      locked, too; this is essential for correctness of foreign key checks.
      This also causes additional WAL-logging (there was previously a single
      WAL record for a locked tuple; now there are as many as there are
      updated copies of the tuple).
      
      With all this in place, contention related to tuples being checked by
      foreign key rules should be much reduced.
      
      As a bonus, an old misbehavior has been fixed: if a subtransaction
      grabbed a stronger tuple lock than the parent (sub)transaction held on a
      given tuple and later aborted, the weaker lock was lost.
      
      Many new spec files were added for the isolation tester framework, to
      ensure overall behavior is sane.  There's probably room for several more
      tests.
      
      There were several reviewers of this patch; in particular, Noah Misch
      and Andres Freund spent considerable time on it.  The original idea for the
      patch came from Simon Riggs, after a problem report by Joel Jacobson.
      Most code is from me, with contributions from Marti Raudsepp, Alexander
      Shulgin, Noah Misch and Andres Freund.
      
      This patch was discussed in several pgsql-hackers threads; the most
      important start at the following message-ids:
      	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
      	1290721684-sup-3951@alvh.no-ip.org
      	1294953201-sup-2099@alvh.no-ip.org
      	1320343602-sup-2290@alvh.no-ip.org
      	1339690386-sup-8927@alvh.no-ip.org
      	4FE5FF020200002500048A3D@gw.wicourts.gov
      	4FEAB90A0200002500048B7D@gw.wicourts.gov
      0ac5ad51
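
      A small libpq sketch of the new lock modes in action (illustrative only;
      the connection string and the "parent" table are hypothetical).  The FOR
      KEY SHARE lock taken here, as foreign key triggers now do, no longer
      conflicts with a concurrent UPDATE that leaves the key columns alone,
      because such an UPDATE now takes only FOR NO KEY UPDATE.

          #include <stdio.h>
          #include <libpq-fe.h>

          int
          main(void)
          {
              PGconn     *conn = PQconnectdb("dbname=postgres");
              PGresult   *res;

              if (PQstatus(conn) != CONNECTION_OK)
              {
                  fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
                  PQfinish(conn);
                  return 1;
              }

              PQclear(PQexec(conn, "BEGIN"));

              /* Lock the referenced row the way FK triggers now do. */
              res = PQexec(conn, "SELECT id FROM parent WHERE id = 1 FOR KEY SHARE");
              if (PQresultStatus(res) != PGRES_TUPLES_OK)
                  fprintf(stderr, "lock failed: %s", PQerrorMessage(conn));
              PQclear(res);

              /*
               * A concurrent session running
               *     UPDATE parent SET note = 'x' WHERE id = 1;
               * is not blocked, since it does not change the key column.
               */
              PQclear(PQexec(conn, "COMMIT"));
              PQfinish(conn);
              return 0;
          }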
    • Bruce Momjian's avatar
      pg_upgrade: remove --single-transaction usage · 861ad67b
      Bruce Momjian authored
      With AtEOXact applied, --single-transaction makes pg_restore slower, and
      has the potential to require lock table configuration, so remove the
      argument.
      
      Per suggestion from Tom.
      861ad67b
  6. Jan 14, 2013
    • Tom Lane's avatar
      Improve handling of ereport(ERROR) and elog(ERROR). · b853eb97
      Tom Lane authored
      In commit 71450d7f, we added code to inform
      suitably-intelligent compilers that ereport() doesn't return if the elevel
      is ERROR or higher.  This patch extends that to elog(), and also fixes a
      double-evaluation hazard that the previous commit created in ereport(),
      as well as reducing the emitted code size.
      
      The elog() improvement requires the compiler to support __VA_ARGS__, which
      should be available in just about anything nowadays since it's required by
      C99.  But our minimum language baseline is still C89, so add a configure
      test for that.
      
      The previous commit assumed that ereport's elevel could be evaluated twice,
      which isn't terribly safe --- there are already counterexamples in xlog.c.
      On compilers that have __builtin_constant_p, we can use that to protect the
      second test, since there's no possible optimization gain if the compiler
      doesn't know the value of elevel.  Otherwise, use a local variable inside
      the macros to prevent double evaluation.  The local-variable solution is
      inferior because (a) it leads to useless code being emitted when elevel
      isn't constant, and (b) it increases the optimization level needed for the
      compiler to recognize that subsequent code is unreachable.  But it seems
      better than not teaching non-gcc compilers about unreachability at all.
      
      Lastly, if the compiler has __builtin_unreachable(), we can use that
      instead of abort(), resulting in a noticeable code savings since no
      function call is actually emitted.  However, it seems wise to do this only
      in non-assert builds.  In an assert build, continue to use abort(), so that
      the behavior will be predictable and debuggable if the "impossible"
      happens.
      
      These changes involve making the ereport and elog macros emit do-while
      statement blocks not just expressions, which forces small changes in
      a few call sites.
      
      Andres Freund, Tom Lane, Heikki Linnakangas
      b853eb97
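
      A simplified sketch of the technique (not the real ereport/elog macros),
      assuming a GCC-like compiler that provides __builtin_constant_p and
      __builtin_unreachable; the helper function and level values are
      hypothetical.

          #include <stdarg.h>
          #include <stdio.h>
          #include <stdlib.h>

          #define WARNING     19
          #define ERROR       20

          static void
          my_elog(int level, const char *fmt,...)
          {
              va_list     args;

              va_start(args, fmt);
              vfprintf(stderr, fmt, args);
              va_end(args);
              fputc('\n', stderr);
              if (level >= ERROR)
                  exit(1);            /* does not return for ERROR or higher */
          }

          /*
           * A do-while block, not an expression.  __builtin_constant_p guards
           * the second look at elevel, so elevel is never evaluated twice when
           * it isn't a compile-time constant; __builtin_unreachable() marks the
           * following code as dead without emitting any function call.
           */
          #define elog(elevel, ...) \
              do { \
                  my_elog(elevel, __VA_ARGS__); \
                  if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
                      __builtin_unreachable(); \
              } while (0)

          int
          main(void)
          {
              elog(WARNING, "value is %d", 42);
              return 0;
          }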
  7. Jan 12, 2013
    • Andrew Dunstan's avatar
      Extend and improve use of EXTRA_REGRESS_OPTS. · 4ae5ee6c
      Andrew Dunstan authored
      This is now used by ecpg tests, and not clobbered by pg_upgrade
      tests. This change won't affect anything that doesn't set this
      environment variable, but will enable the buildfarm to control
      exactly what port regression test installs will be running on,
      and thus to detect possible rogue postmasters more easily.
      
      Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
      4ae5ee6c
  8. Jan 07, 2013
    • Tatsuo Ishii's avatar
      Add new "-q" logging option (quiet mode) while in initialize mode · cf03ff6c
      Tatsuo Ishii authored
      (-i), producing only one progress message every 5 seconds along with
      elapsed time and estimated remaining time.  Also add elapsed time and
      estimated remaining time to the default logging (which prints one
      message every 100000 rows).
      Patch contributed by Tomas Vondra, reviewed by Jeevan Chalke and
      Tatsuo Ishii.
      cf03ff6c
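
      An illustrative sketch (not pgbench's code) of the kind of progress
      message described: at most one report every 5 seconds, with elapsed time
      and a remaining-time estimate extrapolated from the current rate.

          #include <stdio.h>
          #include <time.h>

          static void
          maybe_report(long done, long total, time_t start, time_t *last_report)
          {
              time_t      now = time(NULL);

              if (done > 0 && now - *last_report >= 5)
              {
                  double      elapsed = difftime(now, start);
                  double      remaining = elapsed * (total - done) / done;

                  printf("%ld of %ld rows done (elapsed %.1f s, remaining %.1f s)\n",
                         done, total, elapsed, remaining);
                  *last_report = now;
              }
          }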
  9. Jan 04, 2013
    • Tom Lane's avatar
      Prevent creation of postmaster's TCP socket during pg_upgrade testing. · 78a5e738
      Tom Lane authored
      On non-Windows machines, we use the Unix socket for connections to test
      postmasters, so there is no need to create a TCP socket.  Furthermore,
      doing so causes failures due to port conflicts if two builds are carried
      out concurrently on one machine.  (If the builds are done in different
      chroots, which is standard practice at least in Red Hat distros, there
      is no risk of conflict on the Unix socket.)  Suppressing the TCP socket
      by setting listen_addresses to empty has long been standard practice
      for pg_regress, and pg_upgrade knows about this too ... but pg_upgrade's
      test.sh didn't get the memo.
      
      Back-patch to 9.2, and also sync the 9.2 version of the script with HEAD
      as much as practical.
      78a5e738
  10. Dec 11, 2012
    • Bruce Momjian's avatar
      Fix pg_upgrade for invalid indexes · e95c4bd1
      Bruce Momjian authored
      All versions of pg_upgrade upgraded invalid indexes caused by CREATE
      INDEX CONCURRENTLY failures and marked them as valid.  The patch adds a
      check to all pg_upgrade versions and throws an error during upgrade or
      --check.
      
      Backpatch to 9.2, 9.1, 9.0.  Patch slightly adjusted.
      e95c4bd1
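
      The check can be approximated with a catalog query; here is a hedged
      libpq sketch (not pg_upgrade's code, and the connection string is
      hypothetical) that lists indexes left invalid by failed CREATE INDEX
      CONCURRENTLY runs.

          #include <stdio.h>
          #include <libpq-fe.h>

          int
          main(void)
          {
              PGconn     *conn = PQconnectdb("dbname=postgres");
              PGresult   *res;

              if (PQstatus(conn) != CONNECTION_OK)
              {
                  fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
                  PQfinish(conn);
                  return 1;
              }

              res = PQexec(conn,
                           "SELECT n.nspname, c.relname "
                           "FROM pg_index i "
                           "JOIN pg_class c ON c.oid = i.indexrelid "
                           "JOIN pg_namespace n ON n.oid = c.relnamespace "
                           "WHERE NOT i.indisvalid");

              if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
              {
                  fprintf(stderr, "invalid indexes found; fix them before upgrading:\n");
                  for (int i = 0; i < PQntuples(res); i++)
                      fprintf(stderr, "  %s.%s\n",
                              PQgetvalue(res, i, 0), PQgetvalue(res, i, 1));
              }
              PQclear(res);
              PQfinish(conn);
              return 0;
          }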
    • Andrew Dunstan's avatar
      Add mode where contrib installcheck runs each module in a separately named database. · ad69bd05
      Andrew Dunstan authored
      Normally each module is tested in a database named contrib_regression,
      which is dropped and recreated at the beginning of each pg_regress run.
      This new mode, enabled by adding USE_MODULE_DB=1 to the make command
      line, runs most modules in a database with the module name embedded in
      it.
      
      This will make testing pg_upgrade on clusters with the contrib modules
      a lot easier.
      
      Second attempt at this, this time accommodating make versions older
      than 3.82.
      
      Still to be done: adapt to the MSVC build system.
      
      Backpatch to 9.0, which is the earliest version it is reasonably
      possible to test upgrading from.
      ad69bd05
    • Bruce Momjian's avatar
      Fix pg_upgrade -O/-o options · acdb8c22
      Bruce Momjian authored
      Fix the previous commit, which added synchronous_commit=off but broke
      -O/-o due to a missing space in argument passing.
      
      Backpatch to 9.2.
      acdb8c22
  11. Dec 07, 2012
    • Bruce Momjian's avatar
      Improve pg_upgrade's status display · 6dd95845
      Bruce Momjian authored
      Pg_upgrade displays file names during copy and database names during
      dump/restore.  Andrew Dunstan identified three bugs:
      
      *  long file names were being truncated to 60 _leading_ characters, which
         often do not change for long file names
      
      *  file names were truncated to 60 characters in log files
      
      *  carriage returns were being output to log files
      
      This commit fixes these --- it prints 60 _trailing_ characters to the
      status display, and full path names without carriage returns to log
      files.  It also suppresses status output to the log file unless verbose
      mode is used.
      6dd95845
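
      A small sketch of the "trailing" truncation (not the committed code; the
      width constant and function name are hypothetical): print the last 60
      characters of a long path, since that is the part that actually varies.

          #include <stdio.h>
          #include <string.h>

          #define MESSAGE_WIDTH 60

          static void
          print_status(const char *path)
          {
              size_t      len = strlen(path);

              if (len > MESSAGE_WIDTH)
                  printf("...%s\r", path + len - MESSAGE_WIDTH);  /* trailing part */
              else
                  printf("%-*s\r", MESSAGE_WIDTH, path);
              fflush(stdout);
          }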
  12. Dec 06, 2012
    • Alvaro Herrera's avatar
      Background worker processes · da07a1e8
      Alvaro Herrera authored
      Background workers are postmaster subprocesses that run arbitrary
      user-specified code.  They can request shared memory access as well as
      backend database connections; or they can just use plain libpq frontend
      database connections.
      
      Modules listed in shared_preload_libraries can register background
      workers in their _PG_init() function; this is early enough that it's not
      necessary to provide an extra GUC option, because the necessary extra
      resources can be allocated early on.  Modules can install more than one
      bgworker, if necessary.
      
      Care is taken that these extra processes do not interfere with other
      postmaster tasks: only one such process is started on each ServerLoop
      iteration.  This means that even if a large number of them are waiting
      to be started, the postmaster is still able to quickly service external
      connection requests.  Also, the shutdown sequence should not be impacted
      by a worker process that is reasonably well behaved (i.e. promptly
      responds to termination signals).
      
      The current implementation lets worker processes specify their start
      time, i.e. at what point in the server startup process they are to be
      started: right after postmaster start (in which case they mustn't ask
      for shared memory access), when consistent state has been reached
      (useful during recovery in a HOT standby server), or when recovery has
      terminated (i.e. when normal backends are allowed).
      
      In case of a bgworker crash, actions to take depend on registration
      data: if shared memory was requested, then all other connections are
      taken down (as well as other bgworkers), just as if it were a regular
      backend crashing.  The bgworker itself is restarted, too, within a
      configurable timeframe (which can be configured to be never).
      
      More features to add to this framework can be imagined without much
      effort, and have been discussed, but this seems good enough as a useful
      unit already.
      
      An elementary sample module is supplied.
      
      Author: Álvaro Herrera
      
      This patch is loosely based on prior patches submitted by KaiGai Kohei,
      and unsubmitted code by Simon Riggs.
      
      Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
      Heikki Linnakangas, Simon Riggs, Amit Kapila
      da07a1e8
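
      A hedged sketch of registering a worker from _PG_init(), in the spirit
      of the sample module mentioned above.  Field and constant names follow
      the 9.3/9.4-era API from memory and changed in later releases
      (PostgreSQL 10 replaced bgw_main with bgw_library_name and
      bgw_function_name), so check bgworker.h for the server version in use.

          #include "postgres.h"
          #include "fmgr.h"
          #include "postmaster/bgworker.h"
          #include "storage/ipc.h"

          PG_MODULE_MAGIC;

          void        _PG_init(void);

          static void
          my_worker_main(Datum main_arg)
          {
              /* arbitrary user-specified code runs here */
              proc_exit(0);
          }

          void
          _PG_init(void)
          {
              BackgroundWorker worker;

              MemSet(&worker, 0, sizeof(worker));
              worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
                  BGWORKER_BACKEND_DATABASE_CONNECTION;
              worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
              worker.bgw_restart_time = 60;   /* seconds; BGW_NEVER_RESTART to disable */
              worker.bgw_main = my_worker_main;
              snprintf(worker.bgw_name, BGW_MAXLEN, "my sample worker");

              RegisterBackgroundWorker(&worker);
          }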
  13. Dec 02, 2012
    • Andrew Dunstan's avatar
      Add mode where contrib installcheck runs each module in a separately named database. · e2b3c21b
      Andrew Dunstan authored
      Normally each module is tested in a database named contrib_regression,
      which is dropped and recreated at the beginning of each pg_regress run.
      This mode, enabled by adding USE_MODULE_DB=1 to the make command line,
      runs most modules in a database with the module name embedded in it.
      
      This will make testing pg_upgrade on clusters with the contrib modules
      a lot easier.
      
      Still to be done: adapt to the MSVC build system.
      
      Backpatch to 9.0, which is the earliest version it is reasonably possible
      to test upgrading from.
      e2b3c21b