  1. Aug 17, 2012
    • Check LIBXML_VERSION instead of testing in configure script. · 33f40976
      Tom Lane authored
      We had put a test for libxml2's xmlStructuredErrorContext variable in
      configure, but of course that doesn't work on Windows builds.  The next
      best alternative seems to be to test the LIBXML_VERSION symbol provided
      by xmlversion.h.
      
      Per report from Talha Bin Rizwan, though this fixes it in a different way
      than his proposed patch.
      33f40976
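A minimal sketch of the version-symbol approach described in the commit above. The cutoff value, the stand-in definition of LIBXML_VERSION, and the HAVE_XMLSTRUCTUREDERRORCONTEXT macro name are assumptions made for illustration; the commit message only says that LIBXML_VERSION from xmlversion.h is tested.

```c
#include <stdbool.h>

/*
 * Stand-in so this sketch compiles without libxml2 installed.  In a real
 * build the symbol comes from <libxml/xmlversion.h>.
 */
#ifndef LIBXML_VERSION
#define LIBXML_VERSION 20900	/* assumed value, i.e. libxml2 2.9.0 */
#endif

/*
 * Assumed cutoff for the first release that provides
 * xmlStructuredErrorContext.  Verify against the libxml2 history before
 * relying on it.
 */
#define ASSUMED_MIN_VERSION 20704

/* Hypothetical feature macro, decided at compile time on every platform. */
#if LIBXML_VERSION >= ASSUMED_MIN_VERSION
#define HAVE_XMLSTRUCTUREDERRORCONTEXT 1
#endif

/* Report at run time which branch the preprocessor selected. */
bool
have_structured_error_context(void)
{
#ifdef HAVE_XMLSTRUCTUREDERRORCONTEXT
	return true;
#else
	return false;
#endif
}
```

Because the decision is made by the preprocessor from a header symbol, it behaves the same whether or not a configure script ran, which is the property the commit needs for Windows builds.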
  2. Aug 16, 2012
    • Allow create_index_paths() to consider multiple join bitmapscan paths. · 26893906
      Tom Lane authored
      In the initial cut at the "parameterized paths" feature, I'd simplified
      create_index_paths() to the point where it would only generate a single
      parameterized bitmap path per relation.  Experimentation with an example
      supplied by Josh Berkus convinces me that that's not good enough: we really
      need to consider a bitmap path for each possible outer relation.  Otherwise
      we have regressions relative to pre-9.2 versions, in which the planner
      picks a plain indexscan where it should have used a bitmap scan in queries
      involving three or more tables.  Indeed, after fixing this, several queries
      in the regression tests show improved plans as a result of using bitmap not
      plain indexscans.
      26893906
    • Fix GiST buffering build bug, which caused "failed to re-find parent" errors. · d6524697
      Heikki Linnakangas authored
      We use a hash table to track the parents of inner pages, but when inserting
      to a leaf page, the caller of gistbufferinginserttuples() must pass a
      correct block number of the leaf's parent page. Before gistProcessItup()
      descends to a child page, it checks if the downlink needs to be adjusted to
      accommodate the new tuple, and updates the downlink if necessary. However,
      updating the downlink might require splitting the page, which might move the
      downlink to a page to the right. gistProcessItup() doesn't realize that, so
      when it descends to the leaf page, it might pass an out-of-date parent block
      number as a result. Fix that by returning the block a tuple was inserted to
      from gistbufferinginserttuples().
      
      This fixes the bug reported by Zdeněk Jílovec.
      d6524697
    • Fix rescan logic in nodeCtescan. · caf97eb7
      Tom Lane authored
      The previous coding essentially assumed that nodes would be rescanned in
      the same order they were initialized in; or at least that the "leader" of
      a group of CTEscans would be rescanned before any others were required to
      execute.  Unfortunately, that isn't even a little bit true.  It's possible
      to devise queries in which the leader isn't rescanned until other CTEscans
      on the same CTE have run to completion, or even in which the leader never
      gets a rescan call at all.
      
      The fix makes the leader specially responsible only for initial creation
      and final destruction of the tuplestore; rescan resets are now a
      symmetrically shared responsibility.  This means that we might reset the
      tuplestore multiple times when restarting a plan subtree containing
      multiple CTEscans; but resetting an already-empty tuplestore is cheap
      enough that that doesn't seem like a problem.
      
      Per report from Adam Mackler; the new regression test cases are based on
      his example query.
      
      Back-patch to 8.4 where CTE scans were introduced.
      caf97eb7
  3. Aug 15, 2012
    • Disallow extensions from owning the schema they are assigned to. · 82634a88
      Tom Lane authored
      This situation creates a dependency loop that confuses pg_dump and probably
      other things.  Moreover, since the mental model is that the extension
      "contains" schemas it owns, but "is contained in" its extschema (even
      though neither is strictly true), having both true at once is confusing for
      people too.  So prevent the situation from being set up.
      
      Reported and patched by Thom Brown.  Back-patch to 9.1 where extensions
      were added.
      82634a88
    • Resurrect the "last ditch" code path in join_search_one_level(). · 43ccd309
      Tom Lane authored
      This essentially reverts commit e54b10a6,
      in which I'd decided that the "last ditch" join logic was useless.  The
      folly of that is now exposed by a report from Pavel Stehule: although the
      function should always find at least one join in a self-contained join
      problem, it can still fail to do so in a sub-problem created by artificial
      from_collapse_limit or join_collapse_limit constraints.  Adjust the
      comments to describe this, and simplify the code a bit to match the new
      coding of the earlier loop in the function.
      
      I'm not terribly happy about this: I still subscribe to the opinion stated
      in the previous commit message that the "last ditch" code can obscure logic
      bugs elsewhere.  But the alternative seems to be to complicate the earlier
      tests for does-this-relation-have-a-join-clause to the point where they can
      tell whether the join clauses link outside the current join sub-problem.
      And that looks messy, slow, and possibly a source of bugs in itself.
      In any case, now is not the time to be inserting experimental code into
      9.2, so let's just go back to the time-tested solution.
      43ccd309
    • Prevent access to external files/URLs via XML entity references. · aa2bc1f2
      Tom Lane authored
      xml_parse() would attempt to fetch external files or URLs as needed to
      resolve DTD and entity references in an XML value, thus allowing
      unprivileged database users to attempt to fetch data with the privileges
      of the database server.  While the external data wouldn't get returned
      directly to the user, portions of it could be exposed in error messages
      if the data didn't parse as valid XML; and in any case the mere ability
      to check existence of a file might be useful to an attacker.
      
      The ideal solution to this would still allow fetching of references that
      are listed in the host system's XML catalogs, so that documents can be
      validated according to installed DTDs.  However, doing that with the
      available libxml2 APIs appears complex and error-prone, so we're not going
      to risk it in a security patch that necessarily hasn't gotten wide review.
      So this patch merely shuts off all access, causing any external fetch to
      silently expand to an empty string.  A future patch may improve this.
      
      In HEAD and 9.2, also suppress warnings about undefined entities, which
      would otherwise occur as a result of not loading referenced DTDs.  Previous
      branches don't show such warnings anyway, due to different error handling
      arrangements.
      
      Credit to Noah Misch for first reporting the problem, and for much work
      towards a solution, though this simplistic approach was not his preference.
      Also thanks to Daniel Veillard for consultation.
      
      Security: CVE-2012-3489
      aa2bc1f2
  4. Aug 14, 2012
  5. Aug 11, 2012
    • Fix dependencies generated during ALTER TABLE ADD CONSTRAINT USING INDEX. · ca07d7eb
      Tom Lane authored
      This command generated new pg_depend entries linking the index to the
      constraint and the constraint to the table, which match the entries made
      when a unique or primary key constraint is built de novo.  However, it did
      not bother to get rid of the entries linking the index directly to the
      table.  We had considered the issue when the ADD CONSTRAINT USING INDEX
      patch was written, and concluded that we didn't need to get rid of the
      extra entries.  But this is wrong: ALTER COLUMN TYPE wasn't expecting such
      redundant dependencies to exist, as reported by Hubert Depesz Lubaczewski.
      On reflection it seems rather likely to break other things as well, since
      there are many bits of code that crawl pg_depend for one purpose or
      another, and most of them are pretty naive about what relationships they're
      expecting to find.  Fortunately it's not that hard to get rid of the extra
      dependency entries, so let's do that.
      
      Back-patch to 9.1, where ALTER TABLE ADD CONSTRAINT USING INDEX was added.
      ca07d7eb
  6. Aug 10, 2012
  7. Aug 09, 2012
  8. Aug 08, 2012
    • Fix typo in comment · 7c055d64
      Alvaro Herrera authored
      7c055d64
    • Fix minor bug in XLogFileRead() that accidentally worked. · df09dbbc
      Simon Riggs authored
      Cascading replication copied the incoming file into pg_xlog but
      didn't set the path correctly, so the first attempt to open the file
      failed, causing it to loop around and look for the file in pg_xlog.
      So the earlier coding worked, but accidentally rather than by design.
      
      Spotted by Fujii Masao; fix by Fujii Masao and Simon Riggs.
      df09dbbc
    • Fix TwoPhaseGetDummyBackendId(). · 5cf2307c
      Tom Lane authored
      This was broken in commit ed0b409d,
      which revised the GlobalTransactionData struct to not include the
      associated PGPROC as its first member, but overlooked one place where
      a cast was used in reliance on that equivalence.
      
      The most effective way of fixing this seems to be to create a new function
      that looks up the GlobalTransactionData struct given the XID, and make
      both TwoPhaseGetDummyBackendId and TwoPhaseGetDummyProc rely on that.
      
      Per report from Robert Ross.
      5cf2307c
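The shape of the fix above can be sketched as follows: both accessors go through one lookup by XID instead of casting one struct pointer to another. All type names, field names, and sample values here are hypothetical, and returning -1 or NULL for an unknown XID is a simplification chosen for this sketch, not the behavior of the real functions.

```c
#include <stddef.h>

typedef unsigned int TransactionId;

/* Hypothetical stand-in for the per-transaction dummy process entry. */
typedef struct DummyProcSketch
{
	int			backend_id;
} DummyProcSketch;

/*
 * Hypothetical layout after the struct revision: the process entry is no
 * longer the first member, so a pointer cast between the two types is wrong.
 */
typedef struct GxactSketch
{
	TransactionId xid;
	int			proc_index;		/* index into procs[] */
} GxactSketch;

#define N_GXACTS 3

static GxactSketch gxacts[N_GXACTS] = {{100, 2}, {200, 0}, {300, 1}};
static DummyProcSketch procs[N_GXACTS] = {{10}, {11}, {12}};

/* Single lookup routine shared by both accessors. */
static GxactSketch *
lookup_gxact(TransactionId xid)
{
	for (int i = 0; i < N_GXACTS; i++)
	{
		if (gxacts[i].xid == xid)
			return &gxacts[i];
	}
	return NULL;
}

/* Returns -1 when the XID is unknown (a simplification for this sketch). */
int
get_dummy_backend_id(TransactionId xid)
{
	GxactSketch *gxact = lookup_gxact(xid);

	return gxact ? procs[gxact->proc_index].backend_id : -1;
}

/* Returns NULL when the XID is unknown (a simplification for this sketch). */
DummyProcSketch *
get_dummy_proc(TransactionId xid)
{
	GxactSketch *gxact = lookup_gxact(xid);

	return gxact ? &procs[gxact->proc_index] : NULL;
}
```

Routing both accessors through the same lookup means a later change to the struct layout has only one place to break.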
  9. Aug 07, 2012
  10. Aug 06, 2012
  11. Aug 03, 2012
    • Fix bugs with parsing signed hh:mm and hh:mm:ss fields in interval input. · 225fe68c
      Tom Lane authored
      DecodeInterval() failed to honor the "range" parameter (the special SQL
      syntax for indicating which fields appear in the literal string) if the
      time was signed.  This seems inappropriate, so make it work like the
      not-signed case.  The inconsistency was introduced in my commit
      f867339c, which as noted in its log message
      was only really focused on making SQL-compliant literals work per spec.
      Including a sign here is not per spec, but if we're going to allow it
      then it's reasonable to expect it to work like the not-signed case.
      
      Also, remove bogus setting of tmask, which caused subsequent processing to
      think that what had been given was a timezone and not an hh:mm(:ss) field,
      thus confusing checks for redundant fields.  This seems to be an aboriginal
      mistake in Lockhart's commit 2cf16424.
      
      Add regression test cases to illustrate the changed behaviors.
      
      Back-patch as far as 8.4, where support for spec-compliant interval
      literals was added.
      
      Range problem reported and diagnosed by Amit Kapila, tmask problem by me.
      225fe68c
    • Improve underdocumented btree_xlog_delete_get_latestRemovedXid() code. · 11de73b2
      Tom Lane authored
      As noted by Noah Misch, btree_xlog_delete_get_latestRemovedXid is
      critically dependent on the assumption that it's examining a consistent
      state of the database.  This was undocumented though, so the
      seemingly-unrelated check for no active HS sessions might be thought to be
      merely an optional optimization.  Improve comments, and add an explicit
      check of reachedConsistency just to be sure.
      
      This function returns InvalidTransactionId (thereby killing all HS
      transactions) in several cases that are not nearly unlikely enough for my
      taste.  This commit doesn't attempt to fix those deficiencies, just
      document them.
      
      Back-patch to 9.2, not from any real functional need but just to keep the
      branches more closely synced to simplify possible future back-patching.
      11de73b2
    • In SPGiST replay, do conflict resolution before modifying the page. · dd6947aa
      Tom Lane authored
      In yesterday's commit 962e0cc7, I added the
      ResolveRecoveryConflictWithSnapshot call in the wrong place.  I correctly
      put it before spgRedoVacuumRedirect itself would modify the index page ---
      but not before RestoreBkpBlocks, so replay of a record with a full-page
      image would modify the page before kicking off any conflicting HS
      transactions.  Oops.
      dd6947aa
  12. Aug 02, 2012
    • Translation updates · 095bcf93
      Peter Eisentraut authored
      095bcf93
    • Fix race conditions associated with SPGiST redirection tuples. · 7f7c93f8
      Tom Lane authored
      The correct test for whether a redirection tuple is removable is whether
      tuple's xid < RecentGlobalXmin, not OldestXmin; the previous coding
      failed to protect index searches being done in concurrent transactions that
      have no XID.  This mirrors the recent fix in btree's page recycling logic
      made in commit d3abbbeb.
      
      Also, WAL-log the newest XID of any removed redirection tuple on an index
      page, and apply ResolveRecoveryConflictWithSnapshot during InHotStandby WAL
      replay.  This protects against concurrent Hot Standby transactions possibly
      needing to see the redirection tuple(s).
      
      Per my query of 2012-03-12 and subsequent discussion.
      7f7c93f8
  13. Jul 31, 2012
    • Fix WITH attached to a nested set operation (UNION/INTERSECT/EXCEPT). · 3786b9b4
      Tom Lane authored
      Parse analysis neglected to cover the case of a WITH clause attached to an
      intermediate-level set operation; it only handled WITH at the top level
      or WITH attached to a leaf-level SELECT.  Per report from Adam Mackler.
      
      In HEAD, I rearranged the order of SelectStmt's fields to put withClause
      with the other fields that can appear on non-leaf SelectStmts.  In back
      branches, leave it alone to avoid a possible ABI break for third-party
      code.
      
      Back-patch to 8.4 where WITH support was added.
      3786b9b4
    • Fix syslogger so that log_truncate_on_rotation works in the first rotation. · 63aba79c
      Tom Lane authored
      In the original coding of the log rotation stuff, we did not bother to make
      the truncation logic work for the very first rotation after postmaster
      start (or after a syslogger crash and restart).  It just always appended
      in that case.  It did not seem terribly important at the time, but we've
      recently had two separate complaints from people who expected it to work
      unsurprisingly.  (Both users tend to restart the postmaster about as often
      as a log rotation is configured to happen, which is maybe not typical use,
      but still...)  Since the initial log file is opened in the postmaster,
      fixing this requires passing down some more state to the syslogger child
      process.
      
      It's always been like this, so back-patch to all supported branches.
      63aba79c
  14. Jul 26, 2012
    • Only allow autovacuum to be auto-canceled by a directly blocked process. · 07399f44
      Tom Lane authored
      In the original coding of the autovacuum cancel feature, commit
      acac68b2, an autovacuum process was
      considered a target for cancellation if it was found to hard-block any
      process examined in the deadlock search.  This patch tightens the test so
      that the autovacuum must directly hard-block the current process.  This
      should make the behavior more predictable in general, and in particular
      it ensures that an autovacuum will not be canceled with less than
      deadlock_timeout grace period.  In the old coding, it was possible for an
      autovacuum to be canceled almost instantly, given unfortunate timing of two
      or more other processes' lock attempts.
      
      This also justifies the logging methodology in the recent commit
      d7318d43; without this restriction, that
      patch isn't providing enough information to see the connection of the
      canceling process to the autovacuum.  Like that one, patch all the way
      back.
      07399f44
    • Log a better message when canceling autovacuum. · a5bca248
      Robert Haas authored
      The old message was at DEBUG2, so typically it didn't show up in the
      log at all.  As a result, in most cases where autovacuum was canceled,
      the only information that was logged was the table being vacuumed,
      with no indication as to what problem caused the cancel.  Crank up
      the level to LOG and add some more details to assist with debugging.
      
      Back-patch all the way, per discussion on pgsql-hackers.
      a5bca248
  15. Jul 25, 2012
    • Fix longstanding crash-safety bug with newly-created-or-reset sequences. · a4a7eb37
      Tom Lane authored
      If a crash occurred immediately after the first nextval() call for a serial
      column, WAL replay would restore the sequence to a state in which it
      appeared that no nextval() had been done, thus allowing the first sequence
      value to be returned again by the next nextval() call; as reported in
      bug #6748 from Xiangming Mei.
      
      More generally, the problem would occur if an ALTER SEQUENCE was executed
      on a freshly created or reset sequence.  (The manifestation with serial
      columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step
      to serial column creation.)  The cause is that sequence creation attempted
      to save one WAL entry by writing out a WAL record that made it appear that
      the first nextval() had already happened (viz, with is_called = true),
      while marking the sequence's in-database state with log_cnt = 1 to show
      that the first nextval() need not emit a WAL record.  However, ALTER
      SEQUENCE would emit a new WAL entry reflecting the actual in-database state
      (with is_called = false).  Then, nextval would allocate the first sequence
      value and set is_called = true, but it would trust the log_cnt value and
      not emit any WAL record.  A crash at this point would thus restore the
      sequence to its post-ALTER state, causing the next nextval() call to return
      the first sequence value again.
      
      To fix, get rid of the idea of logging an is_called status different from
      reality.  This means that the first nextval-driven WAL record will happen
      at the first nextval call not the second, but the marginal cost of that is
      pretty negligible.  In addition, make sure that ALTER SEQUENCE resets
      log_cnt to zero in any case where it touches sequence parameters that
      affect future nextval results.  This will result in some user-visible
      changes in the contents of a sequence's log_cnt column, as reflected in the
      patch's regression test changes; but no application should be depending on
      that anyway, since it was already true that log_cnt changes rather
      unpredictably depending on checkpoint timing.
      
      In addition, make some basically-cosmetic improvements to get rid of
      sequence.c's undesirable intimacy with page layout details.  It was always
      really trying to WAL-log the contents of the sequence tuple, so we should
      have it do that directly using a HeapTuple's t_data and t_len, rather than
      backing into it with some magic assumptions about where the tuple would be
      on the sequence's page.
      
      Back-patch to all supported branches.
      a4a7eb37
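The failure sequence described above can be replayed with a toy model. Everything here is a simplified assumption for illustration: the struct and function names are hypothetical, the sequence always starts at 1 and increments by 1, and the real nextval() logic of logging a batch of values ahead of time is reduced to "emit a WAL record whenever log_cnt is zero".

```c
#include <stdbool.h>

/* Toy sequence tuple: only the three fields the commit message discusses. */
typedef struct SeqStateSketch
{
	long		last_value;
	bool		is_called;
	int			log_cnt;
} SeqStateSketch;

typedef struct SeqSimSketch
{
	SeqStateSketch db;			/* in-database state, lost on crash */
	SeqStateSketch wal;			/* state in the most recent WAL record */
	bool		legacy;			/* true = model the pre-fix behavior */
} SeqSimSketch;

void
seq_create(SeqSimSketch *s, bool legacy)
{
	s->legacy = legacy;
	s->db.last_value = 1;
	s->db.is_called = false;
	s->db.log_cnt = legacy ? 1 : 0; /* old code: "first nextval needs no WAL" */
	s->wal = s->db;
	s->wal.log_cnt = 0;
	if (legacy)
		s->wal.is_called = true;	/* old code logged a state unlike reality */
}

void
seq_alter(SeqSimSketch *s)
{
	if (!s->legacy)
		s->db.log_cnt = 0;		/* the fix: force the next nextval to log */
	s->wal = s->db;				/* ALTER logs the actual in-database state */
	s->wal.log_cnt = 0;
}

long
seq_nextval(SeqSimSketch *s)
{
	if (!s->db.is_called)
		s->db.is_called = true; /* first call returns last_value as-is */
	else
		s->db.last_value++;

	if (s->db.log_cnt > 0)
		s->db.log_cnt--;		/* trust log_cnt, emit no WAL record */
	else
		s->wal = s->db;			/* emit a WAL record for the new state */

	return s->db.last_value;
}

/* Crash followed by WAL replay: the in-database state reverts to the WAL. */
void
seq_crash_recover(SeqSimSketch *s)
{
	s->db = s->wal;
	s->db.log_cnt = 0;
}
```

Running create, alter, nextval, crash, nextval in legacy mode returns 1 twice in this model; with the fix modeled, the second call returns 2.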
  16. Jul 24, 2012
    • Change syntax of new CHECK NO INHERIT constraints · 68043258
      Alvaro Herrera authored
      The initially implemented syntax, "CHECK NO INHERIT (expr)", was not
      deemed very good, so switch to "CHECK (expr) NO INHERIT" instead.  This
      way it looks similar to SQL-standard-compliant constraint attributes.
      
      Backport to 9.2, where the new syntax and feature were introduced.
      
      Per discussion.
      68043258
  17. Jul 21, 2012
    • Account for SRFs in targetlists in planner rowcount estimates. · 641054ad
      Tom Lane authored
      We made use of the ROWS estimate for set-returning functions used in FROM,
      but not for those used in SELECT targetlists; which is a bit of an
      oversight considering there are common usages that require the latter
      approach.  Improve that.  (I had initially thought it might be worth
      folding this into cost_qual_eval, but after investigation concluded that
      that wouldn't be very helpful, so just do it separately.)  Per complaint
      from David Johnston.
      
      Back-patch to 9.2, but not further, for fear of destabilizing plan choices
      in existing releases.
      641054ad
  18. Jul 20, 2012
    • connoinherit may be true only for CHECK constraints · d721f208
      Alvaro Herrera authored
      The code was setting it true for other constraints, which is
      bogus.  Doing so caused bogus catalog entries for such constraints, and
      in particular caused an error to be raised when trying to drop a
      constraint of a type other than CHECK from a table that has children,
      as reported in bug #6712.
      
      In 9.2, additionally ignore connoinherit=true for other constraint
      types, to avoid having to force initdb; existing databases might already
      contain bogus catalog entries.
      
      Includes a catversion bump (in HEAD only).
      
      Bug report from Miroslav Šulc
      Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch.
      d721f208
    • Fix whole-row Var evaluation to cope with resjunk columns (again). · d7991a13
      Tom Lane authored
      When a whole-row Var is reading the result of a subquery, we need it to
      ignore any "resjunk" columns that the subquery might have evaluated for
      GROUP BY or ORDER BY purposes.  We've hacked this area before, in commit
      68e40998, but that fix only covered
      whole-row Vars of named composite types, not those of RECORD type; and it
      was mighty klugy anyway, since it just assumed without checking that any
      extra columns in the result must be resjunk.  A proper fix requires getting
      hold of the subquery's targetlist so we can actually see which columns are
      resjunk (whereupon we can use a JunkFilter to get rid of them).  So bite
      the bullet and add some infrastructure to make that possible.
      
      Per report from Andrew Dunstan and additional testing by Merlin Moncure.
      Back-patch to all supported branches.  In 8.3, also back-patch commit
      292176a1, which for some reason I had
      not done at the time, but it's a prerequisite for this change.
      d7991a13
    • Rethink checkpointer's fsync-request table representation. · e3981da3
      Tom Lane authored
      Instead of having one hash table entry per relation/fork/segment, just have
      one per relation, and use bitmapsets to represent which specific segments
      need to be fsync'd.  This eliminates the need to scan the whole hash table
      to implement FORGET_RELATION_FSYNC, which fixes the O(N^2) behavior
      recently demonstrated by Jeff Janes for cases involving lots of TRUNCATE or
      DROP TABLE operations during a single checkpoint cycle.  Per an idea from
      Robert Haas.
      
      (FORGET_DATABASE_FSYNC still sucks, but since dropping a database is a
      pretty expensive operation anyway, we'll live with that.)
      
      In passing, improve the delayed-unlink code: remove the pass over the list
      in mdpreckpt, since it wasn't doing anything for us except supporting a
      useless Assert in mdpostckpt, and fix mdpostckpt so that it will absorb
      fsync requests every so often when clearing a large backlog of deletion
      requests.
      e3981da3
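A minimal sketch of the representation change described above: one table entry per relation, with a bitmap recording which segments need an fsync. The names are hypothetical, forks are ignored, a fixed array with a linear search stands in for the hash table, and a single 64-bit word stands in for a bitmapset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RELS 8				/* toy capacity, stands in for a hash table */
#define MAX_SEGS 64				/* one bit per segment in a 64-bit word */

typedef struct PendingRelSketch
{
	bool		in_use;
	uint32_t	relid;
	uint64_t	segments;		/* bit n set => segment n needs fsync */
} PendingRelSketch;

static PendingRelSketch pending[MAX_RELS];

/* Find the entry for relid, optionally creating it in a free slot. */
static PendingRelSketch *
find_rel(uint32_t relid, bool create)
{
	PendingRelSketch *free_slot = NULL;

	for (int i = 0; i < MAX_RELS; i++)
	{
		if (pending[i].in_use && pending[i].relid == relid)
			return &pending[i];
		if (!pending[i].in_use && free_slot == NULL)
			free_slot = &pending[i];
	}
	if (!create || free_slot == NULL)
		return NULL;
	free_slot->in_use = true;
	free_slot->relid = relid;
	free_slot->segments = 0;
	return free_slot;
}

/* Record that one segment of a relation needs an fsync. */
bool
remember_fsync(uint32_t relid, unsigned segno)
{
	PendingRelSketch *entry;

	if (segno >= MAX_SEGS)
		return false;
	entry = find_rel(relid, true);
	if (entry == NULL)
		return false;
	entry->segments |= UINT64_C(1) << segno;
	return true;
}

/* Forgetting a relation drops one entry; no per-segment entries to scan. */
void
forget_relation(uint32_t relid)
{
	PendingRelSketch *entry = find_rel(relid, false);

	if (entry != NULL)
		entry->in_use = false;
}

/* Count the segments still pending for a relation. */
int
pending_segments(uint32_t relid)
{
	PendingRelSketch *entry = find_rel(relid, false);
	int			count = 0;

	if (entry != NULL)
	{
		for (uint64_t bits = entry->segments; bits != 0; bits >>= 1)
			count += (int) (bits & 1);
	}
	return count;
}
```

With a real hash table keyed by relation, forget_relation() becomes a single lookup and delete, which is what removes the need to scan every entry for each dropped or truncated table.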
  19. Jul 19, 2012
  20. Jul 18, 2012
    • Refactor the way code is shared between some range type functions. · 79c49131
      Heikki Linnakangas authored
      Functions like range_eq, range_before etc. are exposed at the SQL-level, but
      they're also used internally by the GiST consistent support function. The
      code sharing was done by a hack, TrickFunctionCall2, which relied on the
      knowledge that all the functions used fn_extra the same way. This commit
      splits the functions into internal versions that take a TypeCacheEntry as
      argument, and thin wrappers to expose the functions at the SQL-level. The
      internal versions can then be called directly and in a less hacky way from
      the GiST consistent function.
      
      This is just cosmetic, but backpatch to 9.2 anyway, to avoid having a
      different version of this code in the 9.2 branch. That would make
      backpatching fixes in this area more difficult.
      
      Alexander Korotkov
      79c49131
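The split into internal versions and thin wrappers might look roughly like this. The struct layouts, the integer-only ranges, and every function name other than range_before are hypothetical; empty ranges and bound inclusivity flags are left out of this sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a type cache entry: just a comparison function. */
typedef struct TypeCacheSketch
{
	int			(*cmp) (int a, int b);
} TypeCacheSketch;

/* Toy half-open integer range [lower, upper). */
typedef struct RangeSketch
{
	int			lower;
	int			upper;
} RangeSketch;

static int
int_cmp(int a, int b)
{
	return (a > b) - (a < b);
}

static const TypeCacheSketch int_typcache = {int_cmp};

/* Stand-in for whatever lookup the SQL-level wrapper would perform. */
const TypeCacheSketch *
lookup_int_typcache(void)
{
	return &int_typcache;
}

/* Internal version: takes the type cache entry as an explicit argument. */
bool
range_before_internal(const TypeCacheSketch *typcache,
					  const RangeSketch *r1, const RangeSketch *r2)
{
	/* For half-open ranges, r1 is before r2 when r1.upper <= r2.lower. */
	return typcache->cmp(r1->upper, r2->lower) <= 0;
}

/* Thin wrapper, playing the role of the SQL-callable function. */
bool
range_before(const RangeSketch *r1, const RangeSketch *r2)
{
	return range_before_internal(lookup_int_typcache(), r1, r2);
}

/* An index-side caller can use the internal version directly. */
bool
index_key_is_before(const TypeCacheSketch *typcache,
					const RangeSketch *key, const RangeSketch *query)
{
	return range_before_internal(typcache, key, query);
}
```

The index-side caller passes the type cache entry it already holds, so nothing depends on two functions happening to use the same per-call cache slot in the same way.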
    • Fix statistics breakage from bgwriter/checkpointer process split. · 1e9326d6
      Tom Lane authored
      ForwardFsyncRequest() supposed that it could only be called in regular
      backends, which used to be true; but since the splitup of bgwriter and
      checkpointer, it is also called in the bgwriter.  We do not want to count
      such calls in pg_stat_bgwriter.buffers_backend statistics, so fix things
      so that they aren't.
      
      (It's worth noting here that this implies an alarmingly large increase in
      the expected amount of cross-process fsync request traffic, which may well
      mean that the process splitup was not such a hot idea.)
      1e9326d6
    • Fix management of pendingOpsTable in auxiliary processes. · d843589e
      Tom Lane authored
      mdinit() was misusing IsBootstrapProcessingMode() to decide whether to
      create an fsync pending-operations table in the current process.  This led
      to creating a table not only in the startup and checkpointer processes as
      intended, but also in the bgwriter process, not to mention other auxiliary
      processes such as walwriter and walreceiver.  Creation of the table in the
      bgwriter is fatal, because it absorbs fsync requests that should have gone
      to the checkpointer; instead they just sit in bgwriter local memory and are
      never acted on.  So writes performed by the bgwriter were not being fsync'd
      which could result in data loss after an OS crash.  I think there is no
      live bug with respect to walwriter and walreceiver because those never
      perform any writes of shared buffers; but the potential is there for
      future breakage in those processes too.
      
      To fix, make AuxiliaryProcessMain() export the current process's
      AuxProcType as a global variable, and then make mdinit() test directly for
      the types of aux process that should have a pendingOpsTable.  Having done
      that, we might as well also get rid of the random bool flags such as
      am_walreceiver that some of the aux processes had grown.  (Note that we
      could not have fixed the bug by examining those variables in mdinit(),
      because it's called from BaseInit() which is run by AuxiliaryProcessMain()
      before entering any of the process-type-specific code.)
      
      Back-patch to 9.2, where the problem was introduced by the split-up of
      bgwriter and checkpointer processes.  The bogus pendingOpsTable exists
      in walwriter and walreceiver processes in earlier branches, but absent
      any evidence that it causes actual problems there, I'll leave the older
      branches alone.
      d843589e
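A rough sketch of the fix described above: the process type is stored in a global before any shared initialization runs, and the storage layer tests that type directly. The enum values, variable name, and function names are hypothetical, and only the two process types the commit message names as intended owners of the table are treated as needing one.

```c
#include <stdbool.h>

/* Hypothetical process-type enum; the real one may have other members. */
typedef enum AuxProcSketch
{
	SKETCH_NOT_AUX = -1,
	SKETCH_STARTUP,
	SKETCH_BGWRITER,
	SKETCH_CHECKPOINTER,
	SKETCH_WALWRITER,
	SKETCH_WALRECEIVER
} AuxProcSketch;

/* Exported global, set before any process-type-specific code runs. */
AuxProcSketch my_aux_proc_type = SKETCH_NOT_AUX;

/* Decide from the process type whether to create the pending-ops table. */
bool
needs_pending_ops_table(void)
{
	return my_aux_proc_type == SKETCH_STARTUP ||
		my_aux_proc_type == SKETCH_CHECKPOINTER;
}

/*
 * Stand-in for the auxiliary process entry point: record the type first,
 * then run the shared initialization that consults it.  Returns whether
 * this process would own a pending-ops table.
 */
bool
aux_process_main(AuxProcSketch type)
{
	my_aux_proc_type = type;
	return needs_pending_ops_table();
}
```

Setting the global first is the important ordering: per-process flags assigned later, inside the type-specific code, would not yet be set when the shared initialization runs.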
  21. Jul 17, 2012
    • Improve coding around the fsync request queue. · 4abcce8c
      Tom Lane authored
      In all branches back to 8.3, this patch fixes a questionable assumption in
      CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are
      no uninitialized pad bytes in the request queue structs.  This would only
      cause trouble if (a) there were such pad bytes, which could happen in 8.4
      and up if the compiler makes enum ForkNumber narrower than 32 bits, but
      otherwise would require not-currently-planned changes in the widths of
      other typedefs; and (b) the kernel has not uniformly initialized the
      contents of shared memory to zeroes.  Still, it seems a tad risky, and we
      can easily remove any risk by pre-zeroing the request array for ourselves.
      In addition to that, we need to establish a coding rule that struct
      RelFileNode can't contain any padding bytes, since such structs are copied
      into the request array verbatim.  (There are other places that are assuming
      this anyway, it turns out.)
      
      In 9.1 and up, the risk was a bit larger because we were also effectively
      assuming that struct RelFileNodeBackend contained no pad bytes, and with
      fields of different types in there, that would be much easier to break.
      However, there is no good reason to ever transmit fsync or delete requests
      for temp files to the bgwriter/checkpointer, so we can revert the request
      structs to plain RelFileNode, getting rid of the padding risk and saving
      some marginal number of bytes and cycles in fsync queue manipulation while
      we are at it.  The savings might be more than marginal during deletion of
      a temp relation, because the old code transmitted an entirely useless but
      nonetheless expensive-to-process ForgetRelationFsync request to the
      background process, and also had the background process perform the file
      deletion even though that can safely be done immediately.
      
      In addition, make some cleanup of nearby comments and small improvements to
      the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.
      4abcce8c
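The pad-byte hazard above can be illustrated with a struct that is compared bytewise. The struct and function names are hypothetical. Note the hedge: zeroing first makes the comparison reliable on mainstream compilers in practice, but the C standard leaves padding contents unspecified after a member store, so keeping such structs free of padding is the more robust rule.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical request struct.  The narrow first field means compilers
 * typically insert padding bytes before segno.
 */
typedef struct RequestSketch
{
	char		fork;
	unsigned int segno;
} RequestSketch;

/* Zero the whole struct, padding included, before filling in the fields. */
void
init_request(RequestSketch *req, char fork, unsigned int segno)
{
	memset(req, 0, sizeof(*req));
	req->fork = fork;
	req->segno = segno;
}

/*
 * Bytewise comparison, as used when requests are treated as opaque keys.
 * This is only meaningful if any padding bytes hold the same values in
 * both operands.
 */
bool
requests_equal(const RequestSketch *a, const RequestSketch *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the memset, two requests with identical field values could compare unequal whenever their padding bytes happened to differ.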
  22. Jul 16, 2012
    • Avoid pre-determining index names during CREATE TABLE LIKE parsing. · 3727240d
      Tom Lane authored
      Formerly, when trying to copy both indexes and comments, CREATE TABLE LIKE
      had to pre-assign names to indexes that had comments, because it made up an
      explicit CommentStmt command to apply the comment and so it had to know the
      name for the index.  This creates bad interactions with other indexes, as
      shown in bug #6734 from Daniele Varrazzo: the preassignment logic couldn't
      take any other indexes into account so it could choose a conflicting name.
      
      To fix, add a field to IndexStmt that allows it to carry a comment to be
      assigned to the new index.  (This isn't a user-exposed feature of CREATE
      INDEX, only an internal option.)  Now we don't need preassignment of index
      names in any situation.
      
      I also took the opportunity to refactor DefineIndex to accept the IndexStmt
      as such, rather than passing all its fields individually in a mile-long
      parameter list.
      
      Back-patch to 9.2, but no further, because it seems too dangerous to change
      IndexStmt or DefineIndex's API in released branches.  The bug exists back
      to 9.0 where CREATE TABLE LIKE grew the ability to copy comments, but given
      the lack of prior complaints we'll just let it go unfixed before 9.2.
      3727240d