  1. Dec 23, 2010
    • Rewrite the GiST insertion logic so that we don't need the post-recovery · 9de3aa65
      Heikki Linnakangas authored
      cleanup stage to finish incomplete inserts or splits anymore. There were two
      reasons for the cleanup step:
      
      1. When a new tuple was inserted to a leaf page, the downlink in the parent
      needed to be updated to contain (i.e., to be consistent with) the new key.
      Updating the parent in turn might require recursively updating the parent of
      the parent. We now handle that by updating the parent while traversing down
      the tree, so that when we insert the leaf tuple, all the parents are already
      consistent with the new key, and the tree is consistent at every step.
      
      2. When a page is split, we need to insert the downlink for the new right
      page(s), and update the downlink for the original page to not include keys
      that moved to the right page(s). We now handle that by setting a new flag,
      F_FOLLOW_RIGHT, on the non-rightmost pages in the split. When that flag is
      set, scans always follow the rightlink, regardless of the NSN mechanism used
      to detect concurrent page splits. That way the tree is consistent right after
      split, even though the downlink is still missing. This is very similar to the
      way B-tree splits are handled. When the downlink is inserted in the parent,
      the flag is cleared. To keep the insertion algorithm simple, when an
      insertion sees an incomplete split, indicated by the F_FOLLOW_RIGHT flag, it
      finishes the split before doing anything else.
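
The scan rule in point 2 can be sketched with a toy model. This is not PostgreSQL code: the `Page` class, the integer "LSNs", and the assumption that the NSN check compares a page's NSN against the parent LSN the scan saw are illustrative simplifications.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Page:
    keys: List[int] = field(default_factory=list)
    rightlink: Optional[int] = None  # block number of the right sibling, if any
    follow_right: bool = False       # toy stand-in for F_FOLLOW_RIGHT
    nsn: int = 0                     # toy stand-in for the page's NSN


def scan_from_downlink(pages: Dict[int, Page], start: int, parent_lsn: int) -> List[int]:
    """Collect the keys reachable from one downlink, chasing rightlinks when required."""
    found: List[int] = []
    blkno: Optional[int] = start
    while blkno is not None:
        page = pages[blkno]
        found.extend(page.keys)
        # Follow the rightlink if the split is incomplete (flag set) or if the
        # page was split after the scan read the parent (assumed NSN check).
        if page.follow_right or page.nsn > parent_lsn:
            blkno = page.rightlink
        else:
            blkno = None
    return found


# Block 1 has just been split into blocks 1 and 2; the parent has no downlink for 2 yet.
pages = {
    1: Page(keys=[10, 20], rightlink=2, follow_right=True),
    2: Page(keys=[30, 40]),
}
during_split = scan_from_downlink(pages, start=1, parent_lsn=100)

# Finishing the split: the downlink is inserted at made-up LSN 120 and the flag is cleared.
pages[1].follow_right = False
pages[1].nsn = 120
stale_parent_view = scan_from_downlink(pages, start=1, parent_lsn=100)
fresh_parent_view = scan_from_downlink(pages, start=1, parent_lsn=120)
```

In this model a scan that read the parent before the downlink was inserted still reaches block 2 through the rightlink, while a scan that read the updated parent stops at block 1 and would reach block 2 through its own downlink.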
      
      These changes allow removing the whole "invalid tuple" mechanism, but I
      retained the scan code to still follow invalid tuples correctly. While we
      don't create any such tuples anymore, we want to handle them gracefully in
      case you pg_upgrade a GiST index that has them. If we encounter any on an
      insert, though, we just throw an error saying that you need to REINDEX.
      
      The issue that got me into doing this is that if you did a checkpoint while
      an insert or split was in progress, and the checkpoint finishes quickly so
      that there is no WAL record related to the insert between RedoRecPtr and the
      checkpoint record, recovery from that checkpoint would not know to finish
      the incomplete insert. IOW, we have the same issue we solved with the
      rm_safe_restartpoint mechanism during normal operation too. It's highly
      unlikely to happen in practice, and this fix is far too large to backpatch,
      so we're just going to live with it in previous versions, but this refactoring
      fixes it going forward.
      
      With this patch, you don't get the annoying
      'index "FOO" needs VACUUM or REINDEX to finish crash recovery' notices
      anymore if you crash at an unfortunate moment.
    • Document that BBUs do not allow partial page writes to be safely turned · 7a1ca897
      Bruce Momjian authored
      off unless they guarantee that all writes to the BBU arrive in 8kB chunks.
      
      Per discussion with Greg Smith
  2. Dec 22, 2010
  3. Dec 20, 2010
  4. Dec 19, 2010
    • Support for collecting crash dumps on Windows · dcb09b59
      Magnus Hagander authored
      Add support for collecting "minidump" style crash dumps on
      Windows, by setting up an exception handling filter. Crash
      dumps will be generated in PGDATA/crashdumps if the directory
      is created (the existence of the directory is used as on/off
      switch for the generation of the dumps).
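
The "directory as on/off switch" idea can be illustrated with a small sketch. The function name and the dump file name below are hypothetical; only the `crashdumps` directory under PGDATA comes from the commit message.

```python
import os
import tempfile
from typing import Optional


def crash_dump_target(pgdata: str, pid: int) -> Optional[str]:
    """Return where a dump would be written, or None when dumps are switched off."""
    dump_dir = os.path.join(pgdata, "crashdumps")
    if not os.path.isdir(dump_dir):
        return None  # directory absent: dump generation is disabled
    return os.path.join(dump_dir, f"example-{pid}.dmp")  # hypothetical file name


with tempfile.TemporaryDirectory() as pgdata:
    disabled = crash_dump_target(pgdata, 1234)
    os.mkdir(os.path.join(pgdata, "crashdumps"))
    enabled = crash_dump_target(pgdata, 1234)
```

No configuration setting is involved in this model: creating or removing the directory is the whole switch.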
      
      Craig Ringer and Magnus Hagander
  5. Dec 17, 2010
  6. Dec 16, 2010
  7. Dec 15, 2010
  8. Dec 14, 2010
  9. Dec 13, 2010
  10. Dec 11, 2010
  11. Dec 09, 2010
    • Force default wal_sync_method to be fdatasync on Linux. · 576477e7
      Tom Lane authored
      Recent versions of the Linux system header files cause xlogdefs.h to
      believe that open_datasync should be the default sync method, whereas
      formerly fdatasync was the default on Linux.  open_datasync is a bad
      choice, first because it doesn't actually outperform fdatasync (in fact
      the reverse), and second because we try to use O_DIRECT with it, causing
      failures on certain filesystems (e.g., ext4 with data=journal option).
      This part of the patch is largely per a proposal from Marti Raudsepp.
      More extensive changes are likely to follow in HEAD, but this is as much
      change as we want to back-patch.
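
For readers unfamiliar with the two sync methods, the difference can be sketched roughly as follows, assuming a Linux system where Python exposes `os.fdatasync` and `os.O_DSYNC`. This is not PostgreSQL's WAL code, and it leaves out the O_DIRECT flag mentioned above.

```python
import os
import tempfile


def write_then_fdatasync(path: str, payload: bytes) -> None:
    """fdatasync style: an ordinary write, then an explicit flush call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, payload)
        os.fdatasync(fd)  # flush file data to stable storage
    finally:
        os.close(fd)


def write_with_o_dsync(path: str, payload: bytes) -> None:
    """open_datasync style: the file is opened so that each write is synchronous."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        os.write(fd, payload)  # returns only once the data has been flushed
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "fdatasync.bin")
    path_b = os.path.join(tmp, "o_dsync.bin")
    write_then_fdatasync(path_a, b"wal record")
    write_with_o_dsync(path_b, b"wal record")
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        contents = (fa.read(), fb.read())
```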
      
      Also clean up confusing code and incorrect documentation surrounding the
      fsync_writethrough option.  Those changes shouldn't result in any actual
      behavioral change, but I chose to back-patch them anyway to keep the
      branches looking similar in this area.
      
      In 9.0 and HEAD, also do some copy-editing on the WAL Reliability
      documentation section.
      
      Back-patch to all supported branches, since any of them might get used
      on modern Linux versions.
  12. Dec 08, 2010
  13. Dec 04, 2010
  14. Dec 03, 2010
    • Create core infrastructure for KNNGIST. · d583f10b
      Tom Lane authored
      This is a heavily revised version of builtin_knngist_core-0.9.  The
      ordering operators are no longer mixed in with actual quals, which would
      have confused not only humans but significant parts of the planner.
      Instead, ordering operators are carried separately throughout planning and
      execution.
      
      Since the API for ambeginscan and amrescan functions had to be changed
      anyway, this commit takes the opportunity to rationalize that a bit.
      RelationGetIndexScan no longer forces a premature index_rescan call;
      instead, callers of index_beginscan must call index_rescan too.  Aside from
      making the AM-side initialization logic a bit less peculiar, this has the
      advantage that we do not make a useless extra amrescan call when there are
      runtime key values.  AMs formerly could not assume that the key values
      passed to amrescan were actually valid; now they can.
      
      Teodor Sigaev and Tom Lane
  15. Nov 29, 2010
  16. Nov 27, 2010
    • Point out in default_tablespace's description that CREATE DATABASE ignores it. · c623365f
      Tom Lane authored
      Per gripe from Andreas Scherbaum.
    • New contrib module, auth_delay. · fe7a32fc
      Robert Haas authored
      KaiGai Kohei, with a few changes by me.
    • Tom Lane's avatar
      d53c1255
    • Rewrite PQping to be more like what we agreed to last week. · db96e1cc
      Tom Lane authored
      Basically, we want to distinguish all cases where the connection was
      not made from those where it was.  A convenient proxy for this is to
      see if we got a message with a SQLSTATE code back from the postmaster.
      This presumes that the postmaster will always send us a SQLSTATE in
      a failure message, which is true for 7.4 and later postmasters in
      every case except fork failure.  (We could possibly complicate the
      postmaster code to do something about that, but it seems not worth
      the trouble, especially since pg_ctl's response for that case should
      be to keep waiting anyway.)
      
      If we did get a SQLSTATE from the postmaster, there are basically only
      two cases, as per last week's discussion: ERRCODE_CANNOT_CONNECT_NOW
      and everything else.  Any other error code implies that the postmaster
      is in principle willing to accept connections, it just didn't like or
      couldn't handle this particular request.  We want to make a special
      case for ERRCODE_CANNOT_CONNECT_NOW so that "pg_ctl start -w" knows
      it should keep waiting.
      
      In passing, pick names for the enum constants that are a tad less
      likely to present collision hazards in future.
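
The classification described above can be modeled in a few lines. This is a sketch of the decision logic only, not libpq's implementation; the enum and function names are invented for illustration. `57P03` is the SQLSTATE value of ERRCODE_CANNOT_CONNECT_NOW.

```python
from enum import Enum
from typing import Optional

CANNOT_CONNECT_NOW = "57P03"  # SQLSTATE for ERRCODE_CANNOT_CONNECT_NOW


class PingResult(Enum):
    OK = "accepting connections"
    REJECT = "alive but not accepting connections yet"
    NO_RESPONSE = "no reply from a postmaster"


def classify_ping(connected: bool, sqlstate: Optional[str]) -> PingResult:
    """Map the outcome of one connection attempt to a ping result."""
    if connected:
        return PingResult.OK
    if sqlstate is None:
        # No SQLSTATE came back, so no postmaster answered the request.
        return PingResult.NO_RESPONSE
    if sqlstate == CANNOT_CONNECT_NOW:
        return PingResult.REJECT
    # Any other SQLSTATE: the postmaster is up, it just refused this request.
    return PingResult.OK


def should_keep_waiting(result: PingResult) -> bool:
    """Assumed wait-loop policy for a 'pg_ctl start -w' style caller."""
    return result is not PingResult.OK
```

For example, a failed password check (SQLSTATE `28P01`) still counts as a running server in this model, because the postmaster answered with an error code other than `57P03`.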
  17. Nov 26, 2010
    • Add more ALTER <object> .. SET SCHEMA commands. · 55109313
      Robert Haas authored
      This adds support for changing the schema of a conversion, operator,
      operator class, operator family, text search configuration, text search
      dictionary, text search parser, or text search template.
      
      Dimitri Fontaine, with assorted corrections and other kibitzing.
  18. Nov 25, 2010
  19. Nov 24, 2010
    • When reporting the server as not responding, if the hostname was · ba11258c
      Bruce Momjian authored
      supplied, also print the IP address.  This allows IPv4 and IPv6 failures
      to be distinguished.  Also useful when a hostname resolves to multiple
      IP addresses.
      
      Also, remove use of inet_ntoa() and use our own inet_net_ntop() in all
      places, including in libpq, because it is thread-safe.
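
As a rough illustration of why printing the address helps, the sketch below formats the same hostname with an IPv4 and an IPv6 address. It uses Python's standard `socket.inet_ntop`, not PostgreSQL's own `inet_net_ntop()`, and the message wording and hostname are made up.

```python
import socket


def describe_target(hostname: str, family: int, packed_addr: bytes, port: int) -> str:
    """Render the hostname together with the numeric address actually tried."""
    ip_text = socket.inet_ntop(family, packed_addr)
    return f'host "{hostname}" ({ip_text}), port {port}'


# One hostname, two candidate addresses: IPv4 loopback and IPv6 loopback.
v4 = describe_target("db.example.com", socket.AF_INET, bytes([127, 0, 0, 1]), 5432)
v6 = describe_target("db.example.com", socket.AF_INET6, bytes(15) + b"\x01", 5432)
```

With the address in the message, a failure against `127.0.0.1` can be told apart from a failure against `::1` even though both came from the same hostname.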
    • Create the system catalog infrastructure needed for KNNGIST. · 725d52d0
      Tom Lane authored
      This commit adds columns amoppurpose and amopsortfamily to pg_amop, and
      column amcanorderbyop to pg_am.  For the moment all the entries in
      amcanorderbyop are "false", since the underlying support isn't there yet.
      
      Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with
      [ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new
      columns of pg_amop to be populated, and create pg_dump support for dumping
      that information.
      
      I also added some documentation, although it's perhaps a bit premature
      given that the feature doesn't do anything useful yet.
      
      Teodor Sigaev, Robert Haas, Tom Lane
  20. Nov 23, 2010
  21. Nov 21, 2010
    • Add new SQL function, format(text). · 75048707
      Robert Haas authored
      Currently, three conversion format specifiers are supported: %s for a
      string, %L for an SQL literal, and %I for an SQL identifier.  The latter
      two are deliberately designed not to overlap with what sprintf() already
      supports, in case we want to add more of sprintf()'s functionality here
      later.
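
The three specifiers can be approximated in a short model. This is not the server's implementation: `sql_format` is a hypothetical name, the quoting helpers are simplified (reserved words are not checked, NULL is not handled), and arguments are consumed strictly left to right.

```python
import re
from typing import Iterator


def quote_literal(value: str) -> str:
    """Quote a string as an SQL literal (simplified)."""
    escaped = value.replace("'", "''")
    if "\\" in escaped:
        # Backslashes are doubled and the literal is marked as an escape string.
        return "E'" + escaped.replace("\\", "\\\\") + "'"
    return "'" + escaped + "'"


def quote_ident(name: str) -> str:
    """Quote a string as an SQL identifier only when it needs quoting (simplified)."""
    if re.fullmatch(r"[a-z_][a-z0-9_]*", name):
        return name  # simplification: reserved words are not checked
    return '"' + name.replace('"', '""') + '"'


def sql_format(template: str, *args: str) -> str:
    """Expand %s, %L and %I in template, consuming args left to right."""
    remaining: Iterator[str] = iter(args)

    def expand(match: "re.Match[str]") -> str:
        spec = match.group(1)
        if spec not in "sLI":
            raise ValueError(f"unsupported conversion: %{spec}")
        try:
            value = next(remaining)
        except StopIteration:
            raise ValueError("too few arguments for format") from None
        if spec == "s":
            return value
        return quote_literal(value) if spec == "L" else quote_ident(value)

    return re.sub(r"%(.)", expand, template)


query = sql_format("UPDATE %I SET note = %L WHERE owner = %s", "My Table", "it's", "42")
```

Here `%I` wraps `My Table` in double quotes, `%L` doubles the embedded single quote, and `%s` inserts its argument unchanged, which is why `%s` alone is unsafe for untrusted values when building dynamic SQL.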
      
      Patch by Pavel Stehule, heavily revised by me.  Reviewed by Jeff Janes
      and, in earlier versions, by Itagaki Takahiro and Tom Lane.
  22. Nov 18, 2010