Skip to content
Snippets Groups Projects
  1. Feb 09, 2013
  2. Feb 07, 2013
    • Tom Lane's avatar
      Repair bugs in GiST page splitting code for multi-column indexes. · 166d534f
      Tom Lane authored
      When considering a non-last column in a multi-column GiST index,
      gistsplit.c tries to improve on the split chosen by the opclass-specific
      pickSplit function by considering penalties for the next column.  However,
      there were two bugs in this code: it failed to recompute the union keys for
      the leftmost index columns, even though these might well change after
      reassigning tuples; and it included the old union keys in the recomputation
      for the columns it did recompute, so that those keys couldn't get smaller
      even if they should.  The first problem could result in an invalid index
      in which searches wouldn't find index entries that are in fact present;
      the second would make the index less efficient to search.
      
      Both of these errors were caused by misuse of gistMakeUnionItVec, whose
      API was designed in a way that just begged such errors to be made.  There
      is no situation in which it's safe or useful to compute the union keys for
      a subset of the index columns, and there is no caller that wants any
      previous union keys to be included in the computation; so the undocumented
      choice to treat the union keys as in/out rather than pure output parameters
      is a waste of code as well as being dangerous.
      
      Hence, rather than just making a minimal patch, I've changed the API of
      gistMakeUnionItVec to remove the "startkey" parameter (it now always
      processes all index columns) and treat the attr/isnull arrays as purely
      output parameters.
      
      In passing, also get rid of a couple of unnecessary and dangerous uses
      of static variables in gistutil.c.  It's remarkable that the one in
      gistMakeUnionKey hasn't given us portability troubles before now, because
      in addition to posing a re-entrancy hazard, it was unsafely assuming that
      a static char[] array would have at least Datum alignment.
      
      Per investigation of a trouble report from Tomas Vondra.  (There are also
      some bugs in contrib/btree_gist to be fixed, but that seems like material
      for a separate patch.)  Back-patch to all supported branches.
      166d534f
  3. Jan 01, 2013
  4. Jun 10, 2012
  5. Jan 30, 2012
  6. Jan 02, 2012
  7. Apr 23, 2011
  8. Jan 01, 2011
  9. Dec 23, 2010
    • Heikki Linnakangas's avatar
      Rewrite the GiST insertion logic so that we don't need the post-recovery · 9de3aa65
      Heikki Linnakangas authored
      cleanup stage to finish incomplete inserts or splits anymore. There was two
      reasons for the cleanup step:
      
      1. When a new tuple was inserted to a leaf page, the downlink in the parent
      needed to be updated to contain (ie. to be consistent with) the new key.
      Updating the parent in turn might require recursively updating the parent of
      the parent. We now handle that by updating the parent while traversing down
      the tree, so that when we insert the leaf tuple, all the parents are already
      consistent with the new key, and the tree is consistent at every step.
      
      2. When a page is split, we need to insert the downlink for the new right
      page(s), and update the downlink for the original page to not include keys
      that moved to the right page(s). We now handle that by setting a new flag,
      F_FOLLOW_RIGHT, on the non-rightmost pages in the split. When that flag is
      set, scans always follow the rightlink, regardless of the NSN mechanism used
      to detect concurrent page splits. That way the tree is consistent right after
      split, even though the downlink is still missing. This is very similar to the
      way B-tree splits are handled. When the downlink is inserted in the parent,
      the flag is cleared. To keep the insertion algorithm simple, when an
      insertion sees an incomplete split, indicated by the F_FOLLOW_RIGHT flag, it
      finishes the split before doing anything else.
      
      These changes allow removing the whole "invalid tuple" mechanism, but I
      retained the scan code to still follow invalid tuples correctly. While we
      don't create any such tuples anymore, we want to handle them gracefully in
      case you pg_upgrade a GiST index that has them. If we encounter any on an
      insert, though, we just throw an error saying that you need to REINDEX.
      
      The issue that got me into doing this is that if you did a checkpoint while
      an insert or split was in progress, and the checkpoint finishes quickly so
      that there is no WAL record related to the insert between RedoRecPtr and the
      checkpoint record, recovery from that checkpoint would not know to finish
      the incomplete insert. IOW, we have the same issue we solved with the
      rm_safe_restartpoint mechanism during normal operation too. It's highly
      unlikely to happen in practice, and this fix is far too large to backpatch,
      so we're just going to live with in previous versions, but this refactoring
      fixes it going forward.
      
      With this patch, you don't get the annoying
      'index "FOO" needs VACUUM or REINDEX to finish crash recovery' notices
      anymore if you crash at an unfortunate moment.
      9de3aa65
  10. Sep 20, 2010
  11. Jan 02, 2010
  12. Jun 24, 2009
  13. Jun 11, 2009
  14. Jun 10, 2009
  15. Apr 06, 2009
  16. Jan 01, 2009
  17. Jun 19, 2008
  18. Jan 01, 2008
  19. Jan 05, 2007
  20. Oct 04, 2006
  21. Jul 14, 2006
  22. Jun 28, 2006
Loading