  1. Oct 24, 2014
  2. Sep 25, 2014
  3. Aug 09, 2014
  4. Jun 05, 2014
      Add defenses against running with a wrong selection of LOBLKSIZE. · 5f93c378
      Tom Lane authored
      It's critical that the backend's idea of LOBLKSIZE match the way data has
      actually been divided up in pg_largeobject.  While we don't provide any
      direct way to adjust that value, doing so is a one-line source code change
      and various people have expressed interest recently in changing it.  So,
      just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in
      pg_control and cross-check that the backend's compiled-in setting matches
      the on-disk data.
      
      Also tweak the code in inv_api.c so that fetches from pg_largeobject
      explicitly verify that the length of the data field is not more than
      LOBLKSIZE.  Formerly we just had Asserts() for that, which is no protection
      at all in production builds.  In some of the call sites an overlength data
      value would translate directly to a security-relevant stack clobber, so it
      seems worth one extra runtime comparison to be sure.
      
      In the back branches, we can't change the contents of pg_control; but we
      can still make the extra checks in inv_api.c, which will offer some amount
      of protection against running with the wrong value of LOBLKSIZE.
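The difference between an Assert() and a real runtime check can be sketched as follows. This is a minimal illustration, not the code in inv_api.c: `copy_lo_chunk` is a hypothetical helper, and `LOBLKSIZE` is assumed here to be 2048 (BLCKSZ / 4 with the default 8 kB block size).

```c
#include <stddef.h>
#include <string.h>

/* Assumed value: BLCKSZ / 4 with the default 8 kB block size. */
#define LOBLKSIZE 2048

/*
 * Hypothetical helper (not a PostgreSQL function): copy one large-object
 * chunk into a fixed-size buffer.  Returns 0 on success, or -1 if the
 * stored length exceeds LOBLKSIZE.
 */
static int
copy_lo_chunk(char dst[LOBLKSIZE], const char *src, size_t len)
{
    /*
     * Runtime comparison: unlike an Assert(), this is still compiled into
     * production builds, so an overlength chunk cannot overrun dst.
     */
    if (len > LOBLKSIZE)
        return -1;

    memcpy(dst, src, len);
    return 0;
}
```

A caller would treat the -1 result as a data-corruption error rather than continuing with the chunk.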
  5. May 28, 2014
  6. May 06, 2014
      pgindent run for 9.4 · 0a783200
      Bruce Momjian authored
      This includes removing tabs after periods in C comments, which was
      applied to back branches, so this change should not affect backpatching.
  7. Mar 21, 2014
  8. Mar 13, 2014
  9. Feb 15, 2014
      Centralize getopt-related declarations in a new header file pg_getopt.h. · 60ff2fdd
      Tom Lane authored
      We used to have externs for getopt() and its API variables scattered
      all over the place.  Now that we find we're going to need to tweak the
      variable declarations for Cygwin, it seems like a good idea to have
      just one place to tweak.
      
      In this commit, the variables are declared "#ifndef HAVE_GETOPT_H".
      That may or may not work everywhere, but we'll soon find out.
      
      Andres Freund
  10. Jan 07, 2014
  11. Dec 20, 2013
  12. Dec 13, 2013
      Add GUC to enable WAL-logging of hint bits, even with checksums disabled. · 50e54709
      Heikki Linnakangas authored
      WAL-logging of hint bit updates is useful to tools that want to examine
      which pages have been modified. In particular, this is required to make
      the pg_rewind tool safe (without checksums).
      
      This can also be used to test how much extra WAL-logging would occur if
      you enabled checksums, without actually enabling them (which you can't
      currently do without re-initdb'ing).
      
      Sawada Masahiko, docs by Samrat Revagade. Reviewed by Dilip Kumar, with
      further changes by me.
  13. Dec 12, 2013
  14. Jul 07, 2013
  15. Jul 04, 2013
      Add new GUC, max_worker_processes, limiting number of bgworkers. · 6bc8ef0b
      Robert Haas authored
      In 9.3, there's no particular limit on the number of bgworkers;
      instead, we just count up the number that are actually registered,
      and use that to set MaxBackends.  However, that approach causes
      problems for Hot Standby, which needs both MaxBackends and the
      size of the lock table to be the same on the standby as on the
      master, yet it may not be desirable to run the same bgworkers in
      both places.  9.3 handles that by failing to notice the problem,
      which will probably work fine in nearly all cases anyway, but is
      not theoretically sound.
      
      A further problem with simply counting the number of registered
      workers is that new workers can't be registered without a
      postmaster restart.  This is inconvenient for administrators,
      since bouncing the postmaster causes an interruption of service.
      Moreover, there are a number of applications for background
      processes where, by necessity, the background process must be
      started on the fly (e.g. parallel query).  While this patch
      doesn't actually make it possible to register new background
      workers after startup time, it's a necessary prerequisite.
      
      Patch by me.  Review by Michael Paquier.
  16. Jun 27, 2013
  17. May 29, 2013
  18. Apr 30, 2013
  19. Mar 22, 2013
      Allow I/O reliability checks using 16-bit checksums · 96ef3b8f
      Simon Riggs authored
      Checksums are set immediately prior to flush out of shared buffers
      and checked when pages are read in again. Hint bit setting will
      require full page write when block is dirtied, which causes various
      infrastructure changes. Extensive comments, docs and README.
      
      WARNING message thrown if checksum fails on non-all zeroes page;
      ERROR thrown but can be disabled with ignore_checksum_failure = on.
      
      Feature enabled by an initdb option, since transition from option off
      to option on is long and complex and has not yet been implemented.
      Default is not to use checksums.
      
      Checksum used is WAL CRC-32 truncated to 16-bits.
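Truncating a 32-bit CRC to 16 bits can be sketched as below. This is an illustration only: it uses the common reflected CRC-32 polynomial (0xEDB88320) and keeps the low 16 bits, both of which are assumptions rather than details taken from this commit. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-at-a-time CRC-32 using the common reflected polynomial 0xEDB88320.
 * Whether this matches the WAL CRC exactly is an assumption.
 */
static uint32_t
crc32_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
        {
            /* Shift right; apply the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Hypothetical helper: reduce the CRC to a 16-bit checksum by keeping
 * the low 16 bits (which half is kept is an assumption).
 */
static uint16_t
checksum16(const unsigned char *data, size_t len)
{
    return (uint16_t) (crc32_bitwise(data, len) & 0xFFFFu);
}
```

A 16-bit checksum lets roughly one random corruption in 65536 go undetected, which is the price of squeezing the value into a small page-header field.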
      
      Simon Riggs, Jeff Davis, Greg Smith
      Wide input and assistance from many community members. Thank you.
  20. Mar 17, 2013
  21. Feb 12, 2013
      Create libpgcommon, and move pg_malloc et al to it · 8396447c
      Alvaro Herrera authored
      libpgcommon is a new static library to allow sharing code among the
      various frontend programs and backend; this lets us eliminate duplicate
      implementations of common routines.  We avoid libpgport, because that's
      intended as a place for porting issues; per discussion, it seems better
      to keep them separate.
      
      The first use case, and the only one implemented by this patch, is pg_malloc
      and friends, which many frontend programs were already using.
      
      At the same time, we can use this to provide palloc emulation functions
      for the frontend; this way, some palloc-using files in the backend can
      also be used by the frontend cleanly.  To do this, we change palloc() in
      the backend to be a function instead of a macro on top of
      MemoryContextAlloc().  This was previously believed to cause loss of
      performance, but this implementation has been tweaked by Tom and Andres
      so that on modern compilers it provides a slight improvement over the
      previous one.
      
      This lets us clean up some places that were already doing this with
      localized hacks.
      
      Most of the pg_malloc/palloc changes in this patch were authored by
      Andres Freund. Zoltán Böszörményi also independently provided a form of
      that.  libpgcommon infrastructure was authored by Álvaro.
  22. Feb 11, 2013
      Support unlogged GiST index. · 62401db4
      Heikki Linnakangas authored
      The reason this wasn't supported before was that GiST indexes need an
      increasing sequence to detect concurrent page-splits. In a regular WAL-
      logged GiST index, the LSN of the page-split record is used for that
      purpose, and in a temporary index, we can get away with a backend-local
      counter. Neither of those methods works for an unlogged relation.
      
      To provide such an increasing sequence of numbers, create a "fake LSN"
      counter that is saved and restored across shutdowns. On recovery, unlogged
      relations are blown away, so the counter doesn't need to survive that
      either.
      
      Jeevan Chalke, based on discussions with Robert Haas, Tom Lane and me.
      Include previous TLI in end-of-recovery and shutdown checkpoint records. · 7803e932
      Heikki Linnakangas authored
      This isn't used for anything but a sanity check at the moment, but it could
      be highly valuable for debugging purposes. It could also be used to recreate
      timeline history by traversing WAL, which seems useful.
  23. Jan 23, 2013
      Improve concurrency of foreign key locking · 0ac5ad51
      Alvaro Herrera authored
      This patch introduces two additional lock modes for tuples: "SELECT FOR
      KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
      other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
      FOR UPDATE".  UPDATE commands that do not modify the values stored in
      the columns that are part of the key of the tuple now grab a SELECT FOR
      NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
      with tuple locks of the FOR KEY SHARE variety.
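The relationship between the four tuple lock modes can be sketched as a conflict matrix. The type and function names below are hypothetical. Only the fact that FOR KEY SHARE and FOR NO KEY UPDATE do not block each other comes from this commit message; the remaining entries follow the row-level lock conflict table in the PostgreSQL documentation.

```c
#include <stdbool.h>

/* Hypothetical names for the four tuple lock strengths, weakest first. */
typedef enum
{
    ROWLOCK_KEY_SHARE,
    ROWLOCK_SHARE,
    ROWLOCK_NO_KEY_UPDATE,
    ROWLOCK_UPDATE
} RowLockMode;

/* conflict_table[held][wanted] is true when the request must wait. */
static const bool conflict_table[4][4] = {
    /* held KEY SHARE     */ {false, false, false, true},
    /* held SHARE         */ {false, false, true,  true},
    /* held NO KEY UPDATE */ {false, true,  true,  true},
    /* held UPDATE        */ {true,  true,  true,  true},
};

/* Returns true if a lock of mode "wanted" must wait for "held". */
static bool
row_locks_conflict(RowLockMode held, RowLockMode wanted)
{
    return conflict_table[held][wanted];
}
```

The first row is the one that matters for foreign keys: a FOR KEY SHARE lock taken by a foreign key trigger waits only for FOR UPDATE, so an UPDATE that leaves the key columns alone does not block it.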
      
      Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
      means the concurrency improvement applies to them, which is the whole
      point of this patch.
      
      The added tuple lock semantics require some rejiggering of the multixact
      module, so that the locking level that each transaction is holding can
      be stored alongside its Xid.  Also, multixacts now need to persist
      across server restarts and crashes, because they can now represent not
      only tuple locks, but also tuple updates.  This means we need more
      careful tracking of lifetime of pg_multixact SLRU files; since they now
      persist longer, we require more infrastructure to figure out when they
      can be removed.  pg_upgrade also needs to be careful to copy
      pg_multixact files over from the old server to the new, or at least part
      of multixact.c state, depending on the versions of the old and new
      servers.
      
      Tuple time qualification rules (HeapTupleSatisfies routines) need to be
      careful not to consider tuples with the "is multi" infomask bit set as
      being only locked; they might need to look up MultiXact values (i.e.
      possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
      whereas they previously were assured to only use information readily
      available from the tuple header.  This is considered acceptable, because
      the extra I/O would involve cases that would previously cause some
      commands to block waiting for concurrent transactions to finish.
      
      Another important change is the fact that locking tuples that have
      previously been updated causes the future versions to be marked as
      locked, too; this is essential for correctness of foreign key checks.
      This also causes additional WAL-logging: previously there was a single
      WAL record for a locked tuple; now there is one for each updated copy
      of the tuple that exists.
      
      With all this in place, contention related to tuples being checked by
      foreign key rules should be much reduced.
      
      As a bonus, the old behavior that a subtransaction grabbing a stronger
      tuple lock than the parent (sub)transaction held on a given tuple and
      later aborting caused the weaker lock to be lost, has been fixed.
      
      Many new spec files were added for isolation tester framework, to ensure
      overall behavior is sane.  There's probably room for several more tests.
      
      There were several reviewers of this patch; in particular, Noah Misch
      and Andres Freund spent considerable time on it.  Original idea for the
      patch came from Simon Riggs, after a problem report by Joel Jacobson.
      Most code is from me, with contributions from Marti Raudsepp, Alexander
      Shulgin, Noah Misch and Andres Freund.
      
      This patch was discussed in several pgsql-hackers threads; the most
      important start at the following message-ids:
      	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
      	1290721684-sup-3951@alvh.no-ip.org
      	1294953201-sup-2099@alvh.no-ip.org
      	1320343602-sup-2290@alvh.no-ip.org
      	1339690386-sup-8927@alvh.no-ip.org
      	4FE5FF020200002500048A3D@gw.wicourts.gov
      	4FEAB90A0200002500048B7D@gw.wicourts.gov
  24. Jan 01, 2013
  25. Dec 04, 2012
      Track the timeline associated with minRecoveryPoint, for more sanity checks. · 5ce108bf
      Heikki Linnakangas authored
      This allows recovery to notice certain incorrect recovery scenarios.
      If a server has recovered to point X on timeline 5, and you restart
      recovery, it better be on timeline 5 when it reaches point X again, not on
      some timeline with a higher ID. This can happen e.g. if a standby server
      is shut down, a new timeline appears in the WAL archive, and the standby
      server is restarted. It will try to follow the new timeline, which is wrong
      because some WAL on the old timeline was already replayed before shutdown.
      
      Requires an initdb (or at least pg_resetxlog), because this adds a field to
      the control file.
  26. Nov 22, 2012
      Fix pg_resetxlog to use correct path to postmaster.pid. · 455b8887
      Tom Lane authored
      Since we've already chdir'd into the data directory, the file should
      be referenced as just "postmaster.pid", without prefixing the directory
      path.  This is harmless in the normal case where an absolute PGDATA path
      is used, but quite dangerous if a relative path is specified, since the
      program might then fail to notice an active postmaster.
      
      Reported by Hari Babu.  This got broken in my commit
      eb5949d1, so patch all active versions.
  27. Jun 26, 2012
      Fix pg_upgrade, broken by the xlogid/segno -> 64-bit int refactoring. · 038f3a05
      Heikki Linnakangas authored
      The xlogid + segno representation of a particular WAL segment doesn't make
      much sense in pg_resetxlog anymore, now that we don't use that anywhere
      else. Use the WAL filename instead, since that's a convenient way to name a
      particular WAL segment.
      
      I did this partially for pg_resetxlog in the original xlogid/segno -> uint64
      patch, but I neglected pg_upgrade and the docs. This should now be more
      complete.
  28. Jun 25, 2012
  29. Jun 24, 2012
      Replace XLogRecPtr struct with a 64-bit integer. · 0ab9d1c4
      Heikki Linnakangas authored
      This simplifies code that needs to do arithmetic on XLogRecPtrs.
      
      To avoid changing on-disk format of data pages, the LSN on data pages is
      still stored in the old format. That should keep pg_upgrade happy. However,
      we have XLogRecPtrs embedded in the control file, and in the structs that
      are sent over the replication protocol, so this change breaks compatibility
      of pg_basebackup and server. I didn't do anything about this in this patch;
      per discussion on -hackers, the right thing to do would be to change the
      replication protocol to be architecture-independent, so that you could use
      a newer version of pg_receivexlog, for example, against an older server
      version.
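Why a plain 64-bit integer simplifies arithmetic can be sketched as follows. The field names `xlogid` and `xrecoff` are those of the old struct; the type name `OldXLogRecPtr` and the helper functions are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Old two-part representation (type name is hypothetical). */
typedef struct
{
    uint32_t xlogid;   /* high 32 bits: logical log file number */
    uint32_t xrecoff;  /* low 32 bits: byte offset within that file */
} OldXLogRecPtr;

/* New representation: a single 64-bit byte position. */
typedef uint64_t XLogRecPtr;

/* Hypothetical helper: pack the two halves into one 64-bit value. */
static XLogRecPtr
xlogrecptr_from_parts(OldXLogRecPtr p)
{
    return ((uint64_t) p.xlogid << 32) | p.xrecoff;
}

/* Hypothetical helper: split a 64-bit value back into two halves. */
static OldXLogRecPtr
xlogrecptr_to_parts(XLogRecPtr ptr)
{
    OldXLogRecPtr p;

    p.xlogid = (uint32_t) (ptr >> 32);
    p.xrecoff = (uint32_t) ptr;
    return p;
}

/* With a 64-bit integer, a distance is a plain subtraction. */
static uint64_t
xlog_bytes_between(XLogRecPtr from, XLogRecPtr to)
{
    return to - from;
}
```

With the struct form, the same subtraction needs explicit borrow handling between the two fields; with the integer form, comparisons and offsets use ordinary operators.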
      Allow WAL record header to be split across pages. · 061e7efb
      Heikki Linnakangas authored
      This saves a few bytes of WAL space, but the real motivation is to make it
      predictable how much WAL space a record requires, as it no longer depends
      on whether we need to waste the last few bytes at end of WAL page because
      the header doesn't fit.
      
      The total length field of WAL record, xl_tot_len, is moved to the beginning
      of the WAL record header, so that it is still always found on the first page
      where a WAL record begins.
      
      Bump WAL version number again as this is an incompatible change.
      Don't waste the last segment of each 4GB logical log file. · dfda6eba
      Heikki Linnakangas authored
      The comments claimed that wasting the last segment made it easier to do
      calculations with XLogRecPtrs, because you don't have problems representing
      last-byte-position-plus-1 that way. In my experience, however, it only made
      things more complicated, because there were two ways to represent the
      boundary at the beginning of a logical log file: xlogid = n+1 and xrecoff = 0,
      or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were
      picky about which representation was used.
      
      Also, use a 64-bit segment number instead of the log/seg combination, to
      point to a certain WAL segment. We assume that all platforms have a working
      64-bit integer type nowadays.
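The segment arithmetic can be sketched as below, assuming the default 16 MB value for XLOG_SEG_SIZE. The function names are hypothetical, and the "old" count simply illustrates what skipping the last segment of each 4GB logical file amounts to.

```c
#include <stdint.h>

/* Assumed default WAL segment size: 16 MB. */
#define XLOG_SEG_SIZE (UINT64_C(16) * 1024 * 1024)

/* Hypothetical helper: 64-bit segment number containing a WAL position. */
static uint64_t
segno_for_position(uint64_t ptr)
{
    return ptr / XLOG_SEG_SIZE;
}

/* Usable segments per 4GB logical file when the last one is skipped. */
static uint64_t
segs_per_logical_file_old(void)
{
    return UINT64_C(0xFFFFFFFF) / XLOG_SEG_SIZE;
}

/* Segments per 4GB of WAL when nothing is skipped. */
static uint64_t
segs_per_logical_file_new(void)
{
    return (UINT64_C(1) << 32) / XLOG_SEG_SIZE;
}
```

With 16 MB segments this gives 255 usable segments per logical file under the old scheme and 256 under the new one, and a position exactly 4GB into the WAL maps to segment number 256 with no second representation to worry about.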
      
      This is an incompatible change in WAL format, so bumping WAL version number.
  30. Jun 18, 2012
  31. Jun 07, 2012
  32. May 18, 2012
  33. Jan 25, 2012
  34. Jan 02, 2012
  35. Aug 17, 2011