Commits · ea56ed9a1e2b5164b02f4a030fb327346612b2d9 · Jakob Huber / postgres-lambda-diff

Aug 02, 2012

Replace libpq's "row processor" API with a "single row" mode. · ea56ed9a

Tom Lane authored 12 years ago

After taking a while to digest the row-processor feature that was added to
libpq in commit 92785dac, we've concluded
it is over-complicated and too hard to use. Leave the core infrastructure
changes in place (that is, there's still a row processor function inside
libpq), but remove the exposed API pieces, and instead provide a "single
row" mode switch that causes PQgetResult to return one row at a time in
separate PGresult objects.

This approach incurs more overhead than proper use of a row processor
callback would, since construction of a PGresult per row adds extra cycles.
However, it is far easier to use and harder to break. The single-row mode
still affords applications the primary benefit that the row processor API
was meant to provide, namely not having to accumulate large result sets in
memory before processing them. Preliminary testing suggests that we can
probably buy back most of the extra cycles by micro-optimizing construction
of the extra results, but that task will be left for another day.

Marko Kreen

ea56ed9a
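
As a rough illustration of the trade-off described in this commit, the following toy model contrasts the default behaviour (one result object holding every row) with single-row mode (one result object per row, followed by a zero-row terminator). It is only a sketch: `ToyConnection`, `ToyResult` and `drain` are hypothetical stand-ins written for this example, not libpq bindings. Only the status names `PGRES_SINGLE_TUPLE` and `PGRES_TUPLES_OK` are borrowed from libpq, and the toy keeps the input rows in a Python list purely for convenience.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

# Status names borrowed from libpq; everything else here is a toy stand-in.
PGRES_TUPLES_OK = "PGRES_TUPLES_OK"
PGRES_SINGLE_TUPLE = "PGRES_SINGLE_TUPLE"

Row = Tuple[object, ...]


@dataclass
class ToyResult:
    status: str
    rows: List[Row]


class ToyConnection:
    """Hypothetical stand-in for a connection with one query in flight."""

    def __init__(self, rows, single_row_mode=False):
        self._results = self._make_results(list(rows), single_row_mode)

    @staticmethod
    def _make_results(rows, single_row_mode) -> Iterator[ToyResult]:
        if single_row_mode:
            # One result object per row, then a zero-row terminator.
            for row in rows:
                yield ToyResult(PGRES_SINGLE_TUPLE, [row])
            yield ToyResult(PGRES_TUPLES_OK, [])
        else:
            # Default behaviour: the whole result set in one object.
            yield ToyResult(PGRES_TUPLES_OK, rows)

    def get_result(self) -> Optional[ToyResult]:
        """Return the next result object, or None when the query is done."""
        return next(self._results, None)


def drain(conn):
    """Return (result objects seen, total rows, most rows held by one result)."""
    results = total_rows = peak_rows = 0
    while True:
        res = conn.get_result()
        if res is None:
            break
        results += 1
        total_rows += len(res.rows)
        peak_rows = max(peak_rows, len(res.rows))
    return results, total_rows, peak_rows
```

In this model the per-row mode builds many more result objects (the overhead the commit message mentions), but the caller never holds more than one row at a time.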

Aug 01, 2012

Add documentation cross-reference for JSON functions. · f6fb9f10

Tom Lane authored 12 years ago

Thom Brown

f6fb9f10

Jul 31, 2012

Fix WITH attached to a nested set operation (UNION/INTERSECT/EXCEPT). · 3786b9b4

Tom Lane authored 12 years ago

Parse analysis neglected to cover the case of a WITH clause attached to an
intermediate-level set operation; it only handled WITH at the top level
or WITH attached to a leaf-level SELECT.  Per report from Adam Mackler.

In HEAD, I rearranged the order of SelectStmt's fields to put withClause
with the other fields that can appear on non-leaf SelectStmts.  In back
branches, leave it alone to avoid a possible ABI break for third-party
code.

Back-patch to 8.4 where WITH support was added.

3786b9b4

Fix syslogger so that log_truncate_on_rotation works in the first rotation. · 63aba79c

Tom Lane authored 12 years ago

In the original coding of the log rotation stuff, we did not bother to make
the truncation logic work for the very first rotation after postmaster
start (or after a syslogger crash and restart). It just always appended
in that case. It did not seem terribly important at the time, but we've
recently had two separate complaints from people who expected it to work
unsurprisingly. (Both users tend to restart the postmaster about as often
as a log rotation is configured to happen, which is maybe not typical use,
but still...) Since the initial log file is opened in the postmaster,
fixing this requires passing down some more state to the syslogger child
process.

It's always been like this, so back-patch to all supported branches.

63aba79c

pg_basebackup: stylistic adjustments · 65f33352

Alvaro Herrera authored 12 years ago

The most user-visible part of this is to change the long options
--statusint and --noloop to --status-interval and --no-loop,
respectively, per discussion.

Also, consistently enclose file names in double quotes, per our
conventions; and consistently use the term "transaction log file" to
talk about WAL segments.  (Someday we may need to go over this
terminology and make it consistent across the whole source code.)

Finally, reflow the code to better fit in 80 columns, and have pgindent
fix it up some more.

65f33352

Fix memory and file descriptor leaks in pg_receivexlog/pg_basebackup · 776bdc4c

Alvaro Herrera authored 12 years ago

When the internal loop mode was added, freeing memory and closing
file descriptors before returning became important, and a few cases
in the code missed that.

This is a backpatch of commit 058a050e to the 9.2 branch, which seems to
have been neglected (in error, because the bugs it fixes were introduced
in commit 16282ae6 which is present in both master and 9.2).

Fujii Masao

776bdc4c

Jul 30, 2012

Now that the diskchecker.pl author has updated the download link on his website, revert the separate link to the download git repository. · 99dd2a39

Bruce Momjian authored 12 years ago

Backpatch from 9.0 to current.

99dd2a39

Jul 28, 2012

Improve reporting of error situations in find_other_exec(). · 62d69045

Tom Lane authored 12 years ago

This function suppressed any stderr output from the called program, which
is unnecessary in the normal case and unhelpful in error cases. It also
gave a rather opaque message along the lines of "fgets failure: Success"
in case the called program failed to return anything on stdout. Since
we've seen multiple reports of people not understanding what's wrong when
pg_ctl reports this, improve the message.

Back-patch to all active branches.

62d69045

Jul 27, 2012

Update doc mention of diskchecker.pl to add URL for script; retain URL for description. · 9ebe82bd

Bruce Momjian authored 12 years ago

Patch to 9.0 and later, where script is mentioned.

9ebe82bd

Jul 26, 2012

Document that the pg_upgrade user of rsync might want to skip some files, like postmaster.pid. · 54e3c4af

Bruce Momjian authored 12 years ago

Backpatch to 9.2.

54e3c4af

Only allow autovacuum to be auto-canceled by a directly blocked process. · 07399f44

Tom Lane authored 12 years ago

In the original coding of the autovacuum cancel feature, commit
acac68b2, an autovacuum process was
considered a target for cancellation if it was found to hard-block any
process examined in the deadlock search.  This patch tightens the test so
that the autovacuum must directly hard-block the current process.  This
should make the behavior more predictable in general, and in particular
it ensures that an autovacuum will not be canceled with less than
deadlock_timeout grace period.  In the old coding, it was possible for an
autovacuum to be canceled almost instantly, given unfortunate timing of two
or more other processes' lock attempts.

This also justifies the logging methodology in the recent commit
d7318d43; without this restriction, that
patch isn't providing enough information to see the connection of the
canceling process to the autovacuum.  Like that one, patch all the way
back.

07399f44
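
The difference between the old and the tightened test can be sketched with a toy wait-for graph. This is a simplified model under stated assumptions, not the lock manager's actual deadlock search: `blocked_by` (a mapping from each waiting process to the processes that hard-block it), `examined_processes`, `should_cancel_old` and `should_cancel_new` are all hypothetical names invented for the illustration.

```python
def examined_processes(blocked_by, start):
    """Processes a search starting at `start` would visit by following
    hard-block edges (a stand-in for the deadlock search)."""
    seen, stack = set(), [start]
    while stack:
        proc = stack.pop()
        if proc in seen:
            continue
        seen.add(proc)
        stack.extend(blocked_by.get(proc, ()))
    return seen


def should_cancel_old(blocked_by, current, autovac):
    """Old rule: cancel if autovacuum hard-blocks ANY examined process."""
    return any(
        autovac in blocked_by.get(proc, ())
        for proc in examined_processes(blocked_by, current)
    )


def should_cancel_new(blocked_by, current, autovac):
    """New rule: cancel only if autovacuum directly hard-blocks `current`."""
    return autovac in blocked_by.get(current, ())


# Process A waits on B, and B waits on the autovacuum worker.
blocked_by = {"A": {"B"}, "B": {"autovac"}}
```

With this graph, a search run on behalf of A cancels the autovacuum under the old rule even though A is not waiting on it directly. Under the new rule only B's own search, which starts after B's deadlock_timeout has elapsed, can do so.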

Log a better message when canceling autovacuum. · a5bca248

Robert Haas authored 12 years ago

The old message was at DEBUG2, so typically it didn't show up in the
log at all. As a result, in most cases where autovacuum was canceled,
the only information that was logged was the table being vacuumed,
with no indication as to what problem caused the cancel. Crank up
the level to LOG and add some more details to assist with debugging.

Back-patch all the way, per discussion on pgsql-hackers.

a5bca248

Simplify pg_upgrade's handling when returning directory listings. · ba98239d

Bruce Momjian authored 12 years ago

Backpatch to 9.2.

ba98239d

Jul 25, 2012

Fix longstanding crash-safety bug with newly-created-or-reset sequences. · a4a7eb37

Tom Lane authored 12 years ago

If a crash occurred immediately after the first nextval() call for a serial
column, WAL replay would restore the sequence to a state in which it
appeared that no nextval() had been done, thus allowing the first sequence
value to be returned again by the next nextval() call; as reported in
bug #6748 from Xiangming Mei.

More generally, the problem would occur if an ALTER SEQUENCE was executed
on a freshly created or reset sequence. (The manifestation with serial
columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step
to serial column creation.) The cause is that sequence creation attempted
to save one WAL entry by writing out a WAL record that made it appear that
the first nextval() had already happened (viz, with is_called = true),
while marking the sequence's in-database state with log_cnt = 1 to show
that the first nextval() need not emit a WAL record. However, ALTER
SEQUENCE would emit a new WAL entry reflecting the actual in-database state
(with is_called = false). Then, nextval would allocate the first sequence
value and set is_called = true, but it would trust the log_cnt value and
not emit any WAL record. A crash at this point would thus restore the
sequence to its post-ALTER state, causing the next nextval() call to return
the first sequence value again.

To fix, get rid of the idea of logging an is_called status different from
reality. This means that the first nextval-driven WAL record will happen
at the first nextval call not the second, but the marginal cost of that is
pretty negligible. In addition, make sure that ALTER SEQUENCE resets
log_cnt to zero in any case where it touches sequence parameters that
affect future nextval results. This will result in some user-visible
changes in the contents of a sequence's log_cnt column, as reflected in the
patch's regression test changes; but no application should be depending on
that anyway, since it was already true that log_cnt changes rather
unpredictably depending on checkpoint timing.

In addition, make some basically-cosmetic improvements to get rid of
sequence.c's undesirable intimacy with page layout details. It was always
really trying to WAL-log the contents of the sequence tuple, so we should
have it do that directly using a HeapTuple's t_data and t_len, rather than
backing into it with some magic assumptions about where the tuple would be
on the sequence's page.

Back-patch to all supported branches.

a4a7eb37
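
The failure sequence described in this commit can be replayed with a small model. This is a deliberately simplified sketch, not the code in sequence.c: `ToySequence`, its methods and `values_around_crash` are hypothetical, WAL is modelled as a list of logged states, and recovery simply restores the last logged state. The constant `SEQ_LOG_VALS = 32` stands for the number of values pre-logged per WAL record and is an assumption of the model.

```python
from dataclasses import dataclass, replace

SEQ_LOG_VALS = 32  # assumed number of values covered by one WAL record


@dataclass
class SeqState:
    last_value: int
    is_called: bool
    log_cnt: int


class ToySequence:
    def __init__(self, fixed):
        self.fixed = fixed
        self.wal = []
        if fixed:
            # New behaviour: log exactly what is stored, with log_cnt = 0.
            self.state = SeqState(1, False, 0)
            self._log(self.state)
        else:
            # Old behaviour: WAL claims the first nextval() already happened,
            # while the stored state says otherwise and has log_cnt = 1.
            self.state = SeqState(1, False, 1)
            self._log(SeqState(1, True, 0))

    def _log(self, state):
        self.wal.append(replace(state))  # store a copy

    def alter(self):
        """ALTER SEQUENCE: log the actual stored state."""
        if self.fixed:
            self.state.log_cnt = 0  # the fix: force the next nextval() to log
        self._log(self.state)

    def nextval(self):
        s = self.state
        value = s.last_value + 1 if s.is_called else s.last_value
        if s.log_cnt == 0:
            # Log a state far enough ahead to cover the next SEQ_LOG_VALS calls.
            self._log(SeqState(value + SEQ_LOG_VALS, True, 0))
            s.log_cnt = SEQ_LOG_VALS
        else:
            s.log_cnt -= 1  # trust log_cnt and skip the WAL record
        s.last_value, s.is_called = value, True
        return value

    def crash_and_recover(self):
        """Restore the sequence from the last WAL record."""
        self.state = replace(self.wal[-1], log_cnt=0)


def values_around_crash(fixed):
    """Create, ALTER, nextval(), crash, nextval(); return both values."""
    seq = ToySequence(fixed)
    seq.alter()
    before = seq.nextval()
    seq.crash_and_recover()
    return before, seq.nextval()
```

In the old model the first value is handed out twice across the crash. In the fixed model the value after recovery skips ahead, which is harmless for a sequence.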

Document that pg_basebackup will create its output directory · 7332aa6c

Peter Eisentraut authored 12 years ago

7332aa6c

Add translator comments to module names · 408e82c2

Alvaro Herrera authored 12 years ago

408e82c2

Jul 24, 2012

Change syntax of new CHECK NO INHERIT constraints · 68043258

Alvaro Herrera authored 12 years ago

The initially implemented syntax, "CHECK NO INHERIT (expr)", was not
deemed very good, so switch to "CHECK (expr) NO INHERIT" instead.  This
way it looks similar to SQL-standard-compliant constraint attributes.

Backport to 9.2 where the new syntax and feature were introduced.

Per discussion.

68043258

Jul 22, 2012

Fix name collision between concurrent regression tests. · d86fb72c

Tom Lane authored 12 years ago

Commit f5bcd398 introduced a test using
a table named "circles" in inherit.sql.  Unfortunately, the concurrently
executed constraints test was already using that table name, so the
parallel regression tests would sometimes fail.  Rename table to dodge
the problem.  Per buildfarm.

d86fb72c

Jul 21, 2012

Account for SRFs in targetlists in planner rowcount estimates. · 641054ad

Tom Lane authored 12 years ago

We made use of the ROWS estimate for set-returning functions used in FROM,
but not for those used in SELECT targetlists; which is a bit of an
oversight considering there are common usages that require the latter
approach.  Improve that.  (I had initially thought it might be worth
folding this into cost_qual_eval, but after investigation concluded that
that wouldn't be very helpful, so just do it separately.)  Per complaint
from David Johnston.

Back-patch to 9.2, but not further, for fear of destabilizing plan choices
in existing releases.

641054ad

Jul 20, 2012

Remove now unneeded results file for disabled prepared transactions case. · c6c6f219

Andrew Dunstan authored 12 years ago

c6c6f219

Remove prepared transactions from main isolation test schedule. · a8f0f98f

Andrew Dunstan authored 12 years ago

There is no point in running this test when prepared transactions are disabled,
which is the default. New make targets that include the test are provided. This
avoids wasting cycles on buildfarm machines.

Backpatch to 9.1 where these tests were introduced.

a8f0f98f

pg_dump: Simplify mkdir() error checking · 1247ebff

Peter Eisentraut authored 12 years ago

mkdir() can check for errors itself.  We don't need to code that
ourselves again.

1247ebff

connoinherit may be true only for CHECK constraints · d721f208

Alvaro Herrera authored 12 years ago

The code was setting it true for other constraints, which is
bogus.  Doing so caused bogus catalog entries for such constraints, and
in particular caused an error to be raised when trying to drop a
constraint of types other than CHECK from a table that has children,
such as reported in bug #6712.

In 9.2, additionally ignore connoinherit=true for other constraint
types, to avoid having to force initdb; existing databases might already
contain bogus catalog entries.

Includes a catversion bump (in HEAD only).

Bug report from Miroslav Šulc
Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch.

d721f208

Fix whole-row Var evaluation to cope with resjunk columns (again). · d7991a13

Tom Lane authored 12 years ago

When a whole-row Var is reading the result of a subquery, we need it to
ignore any "resjunk" columns that the subquery might have evaluated for
GROUP BY or ORDER BY purposes. We've hacked this area before, in commit
68e40998, but that fix only covered
whole-row Vars of named composite types, not those of RECORD type; and it
was mighty klugy anyway, since it just assumed without checking that any
extra columns in the result must be resjunk. A proper fix requires getting
hold of the subquery's targetlist so we can actually see which columns are
resjunk (whereupon we can use a JunkFilter to get rid of them). So bite
the bullet and add some infrastructure to make that possible.

Per report from Andrew Dunstan and additional testing by Merlin Moncure.
Back-patch to all supported branches. In 8.3, also back-patch commit
292176a1, which for some reason I had
not done at the time, but it's a prerequisite for this change.

d7991a13

Rethink checkpointer's fsync-request table representation. · e3981da3

Tom Lane authored 12 years ago

Instead of having one hash table entry per relation/fork/segment, just have
one per relation, and use bitmapsets to represent which specific segments
need to be fsync'd.  This eliminates the need to scan the whole hash table
to implement FORGET_RELATION_FSYNC, which fixes the O(N^2) behavior
recently demonstrated by Jeff Janes for cases involving lots of TRUNCATE or
DROP TABLE operations during a single checkpoint cycle.  Per an idea from
Robert Haas.

(FORGET_DATABASE_FSYNC still sucks, but since dropping a database is a
pretty expensive operation anyway, we'll live with that.)

In passing, improve the delayed-unlink code: remove the pass over the list
in mdpreckpt, since it wasn't doing anything for us except supporting a
useless Assert in mdpostckpt, and fix mdpostckpt so that it will absorb
fsync requests every so often when clearing a large backlog of deletion
requests.

e3981da3
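
The change in table shape described in this commit can be sketched as follows. Both classes are hypothetical toy models written for this illustration, not the checkpointer's data structures: Python integers stand in for bitmapsets, and the value returned by `forget_relation` is simply the number of table entries the operation had to touch.

```python
class PerSegmentTable:
    """Old shape: one entry per (relation, fork, segment)."""

    def __init__(self):
        self.entries = set()

    def remember(self, rel, fork, seg):
        self.entries.add((rel, fork, seg))

    def forget_relation(self, rel):
        # Every entry in the table has to be inspected.
        touched = len(self.entries)
        self.entries = {e for e in self.entries if e[0] != rel}
        return touched


class PerRelationTable:
    """New shape: one entry per relation, with a segment bitmap per fork."""

    def __init__(self):
        self.entries = {}  # rel -> {fork: int used as a bitmap of segments}

    def remember(self, rel, fork, seg):
        forks = self.entries.setdefault(rel, {})
        forks[fork] = forks.get(fork, 0) | (1 << seg)

    def pending_segments(self, rel, fork):
        bitmap = self.entries.get(rel, {}).get(fork, 0)
        return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

    def forget_relation(self, rel):
        # A single lookup removes all forks and segments of the relation.
        self.entries.pop(rel, None)
        return 1


def fill(table, relations=100, segments=3):
    """Queue fsync requests for `segments` segments of each relation."""
    for rel in range(relations):
        for seg in range(segments):
            table.remember(rel, "main", seg)
    return table
```

Dropping N relations against the old shape touches on the order of N times the table size, which is the quadratic behaviour the commit message refers to. The per-relation shape touches one entry per drop.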

Jul 19, 2012

Send only one FORGET_RELATION_FSYNC request when dropping a relation. · 2bc30516

Tom Lane authored 12 years ago

We were sending one per fork, but a little bit of refactoring allows us
to send just one request with forknum == InvalidForkNumber. This not only
reduces pressure on the shared-memory request queue, but saves repeated
traversals of the checkpointer's hash table.

2bc30516

Jul 18, 2012

Refactor the way code is shared between some range type functions. · 79c49131

Heikki Linnakangas authored 12 years ago

Functions like range_eq, range_before etc. are exposed at the SQL-level, but
they're also used internally by the GiST consistent support function. The
code sharing was done by a hack, TrickFunctionCall2, which relied on the
knowledge that all the functions used fn_extra the same way. This commit
splits the functions into internal versions that take a TypeCacheEntry as
argument, and thin wrappers to expose the functions at the SQL-level. The
internal versions can then be called directly and in a less hacky way from
the GiST consistent function.

This is just cosmetic, but backpatch to 9.2 anyway, to avoid having a
different version of this code in the 9.2 branch. That would make
backpatching fixes in this area more difficult.

Alexander Korotkov

79c49131

Fix statistics breakage from bgwriter/checkpointer process split. · 1e9326d6

Tom Lane authored 12 years ago

ForwardFsyncRequest() supposed that it could only be called in regular
backends, which used to be true; but since the splitup of bgwriter and
checkpointer, it is also called in the bgwriter.  We do not want to count
such calls in pg_stat_bgwriter.buffers_backend statistics, so fix things
so that they aren't.

(It's worth noting here that this implies an alarmingly large increase in
the expected amount of cross-process fsync request traffic, which may well
mean that the process splitup was not such a hot idea.)

1e9326d6

Fix management of pendingOpsTable in auxiliary processes. · d843589e

Tom Lane authored 12 years ago

mdinit() was misusing IsBootstrapProcessingMode() to decide whether to
create an fsync pending-operations table in the current process. This led
to creating a table not only in the startup and checkpointer processes as
intended, but also in the bgwriter process, not to mention other auxiliary
processes such as walwriter and walreceiver. Creation of the table in the
bgwriter is fatal, because it absorbs fsync requests that should have gone
to the checkpointer; instead they just sit in bgwriter local memory and are
never acted on. So writes performed by the bgwriter were not being fsync'd
which could result in data loss after an OS crash. I think there is no
live bug with respect to walwriter and walreceiver because those never
perform any writes of shared buffers; but the potential is there for
future breakage in those processes too.

To fix, make AuxiliaryProcessMain() export the current process's
AuxProcType as a global variable, and then make mdinit() test directly for
the types of aux process that should have a pendingOpsTable. Having done
that, we might as well also get rid of the random bool flags such as
am_walreceiver that some of the aux processes had grown. (Note that we
could not have fixed the bug by examining those variables in mdinit(),
because it's called from BaseInit() which is run by AuxiliaryProcessMain()
before entering any of the process-type-specific code.)

Back-patch to 9.2, where the problem was introduced by the split-up of
bgwriter and checkpointer processes. The bogus pendingOpsTable exists
in walwriter and walreceiver processes in earlier branches, but absent
any evidence that it causes actual problems there, I'll leave the older
branches alone.

d843589e

Get rid of useless global variable in pg_upgrade. · ebd9e26d

Tom Lane authored 12 years ago

Since the scandir() emulation was taken out of pg_upgrade, there's
no longer any need for scandir_file_pattern to exist as a global
variable.  Replace it with a local in the one remaining function
that was making use of it.

ebd9e26d

Improve pg_upgrade's load_directory() function. · 3d929dc7

Tom Lane authored 12 years ago

Error out on out-of-memory, rather than returning -1, which the sole
existing caller wasn't checking for anyway.  There doesn't seem to be
any use-case for making the caller check for failure here.

Detect failure return from readdir().

Use a less platform-dependent method of calculating the entrysize.
It's possible, but not yet confirmed, that this explains bug #6733,
in which Mike Wilson reports a pg_upgrade crash that did not occur
in 9.1.  (Note that load_directory is effectively new code in 9.2,
at least on platforms that have scandir().)

Fix up comments, avoid uselessly using two counters, reduce the number
of realloc calls to something sane.

3d929dc7

Jul 17, 2012

Improve coding around the fsync request queue. · 4abcce8c

Tom Lane authored 12 years ago

In all branches back to 8.3, this patch fixes a questionable assumption in
CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are
no uninitialized pad bytes in the request queue structs. This would only
cause trouble if (a) there were such pad bytes, which could happen in 8.4
and up if the compiler makes enum ForkNumber narrower than 32 bits, but
otherwise would require not-currently-planned changes in the widths of
other typedefs; and (b) the kernel has not uniformly initialized the
contents of shared memory to zeroes. Still, it seems a tad risky, and we
can easily remove any risk by pre-zeroing the request array for ourselves.
In addition to that, we need to establish a coding rule that struct
RelFileNode can't contain any padding bytes, since such structs are copied
into the request array verbatim. (There are other places that are assuming
this anyway, it turns out.)

In 9.1 and up, the risk was a bit larger because we were also effectively
assuming that struct RelFileNodeBackend contained no pad bytes, and with
fields of different types in there, that would be much easier to break.
However, there is no good reason to ever transmit fsync or delete requests
for temp files to the bgwriter/checkpointer, so we can revert the request
structs to plain RelFileNode, getting rid of the padding risk and saving
some marginal number of bytes and cycles in fsync queue manipulation while
we are at it. The savings might be more than marginal during deletion of
a temp relation, because the old code transmitted an entirely useless but
nonetheless expensive-to-process ForgetRelationFsync request to the
background process, and also had the background process perform the file
deletion even though that can safely be done immediately.

In addition, make some cleanup of nearby comments and small improvements to
the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.

4abcce8c
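
The padding hazard described in this commit can be reproduced with a toy struct. This is a sketch under stated assumptions: `ToyRequest` is a hypothetical layout chosen so that typical platforms insert pad bytes after the one-byte field, and the byte fillers 0xAA and 0x55 simulate memory that was not zeroed beforehand. It is not the real request struct.

```python
import ctypes


class ToyRequest(ctypes.Structure):
    # Hypothetical layout: a 1-byte field between two 4-byte fields, so
    # typical ABIs add 3 pad bytes after `fork` to align `segno`.
    _fields_ = [
        ("rel", ctypes.c_uint32),
        ("fork", ctypes.c_uint8),
        ("segno", ctypes.c_uint32),
    ]


SIZE = ctypes.sizeof(ToyRequest)


def fill_request(buf, rel, fork, segno):
    """Write the three fields into `buf` and return the raw bytes."""
    req = ToyRequest.from_buffer(buf)
    req.rel, req.fork, req.segno = rel, fork, segno
    return bytes(buf)


def raw_images(prezero):
    """Build the same logical request in two differently 'dirty' buffers."""
    images = []
    for filler in (0xAA, 0x55):
        # Pre-zeroing replaces the simulated garbage with zero bytes.
        buf = bytearray(SIZE) if prezero else bytearray([filler]) * SIZE
        images.append(fill_request(buf, rel=7, fork=1, segno=3))
    return images
```

Without pre-zeroing, the two images hold identical field values but differ in their pad bytes, so any comparison or hashing of the raw bytes treats them as different requests. With pre-zeroing they are byte-for-byte equal.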

Show step titles in the pg_upgrade man page · 3d03c97a

Peter Eisentraut authored 12 years ago

The upstream XSLT stylesheets missed that case.

found by Álvaro Herrera

3d03c97a

Remove recently added PL/Perl encoding tests · 82b7faa3

Alvaro Herrera authored 12 years ago

These only pass cleanly on UTF8 and SQL_ASCII encodings, besides the
Japanese encoding in which they were originally written, which is clearly
not good enough.  Since the functionality they test has not ever been
tested from PL/Perl, the best answer seems to be to remove the new tests
completely.

Per buildfarm results and ensuing discussion.

82b7faa3

Jul 16, 2012

Avoid pre-determining index names during CREATE TABLE LIKE parsing. · 3727240d

Tom Lane authored 12 years ago

Formerly, when trying to copy both indexes and comments, CREATE TABLE LIKE
had to pre-assign names to indexes that had comments, because it made up an
explicit CommentStmt command to apply the comment and so it had to know the
name for the index. This creates bad interactions with other indexes, as
shown in bug #6734 from Daniele Varrazzo: the preassignment logic couldn't
take any other indexes into account so it could choose a conflicting name.

To fix, add a field to IndexStmt that allows it to carry a comment to be
assigned to the new index. (This isn't a user-exposed feature of CREATE
INDEX, only an internal option.) Now we don't need preassignment of index
names in any situation.

I also took the opportunity to refactor DefineIndex to accept the IndexStmt
as such, rather than passing all its fields individually in a mile-long
parameter list.

Back-patch to 9.2, but no further, because it seems too dangerous to change
IndexStmt or DefineIndex's API in released branches. The bug exists back
to 9.0 where CREATE TABLE LIKE grew the ability to copy comments, but given
the lack of prior complaints we'll just let it go unfixed before 9.2.

3727240d

Jul 15, 2012

Prevent corner-case core dump in rfree(). · 1116c9d1

Tom Lane authored 12 years ago

rfree() failed to cope with the case that pg_regcomp() had initialized the
regex_t struct but then failed to allocate any memory for re->re_guts (ie,
the first malloc call in pg_regcomp() failed). It would try to touch the
guts struct anyway, and thus dump core. This is a sufficiently narrow
corner case that it's not surprising it's never been seen in the field;
but still a bug is a bug, so patch all active branches.

Noted while investigating whether we need to call pg_regfree after a
failure return from pg_regcomp. Other than this bug, it turns out we
don't, so adjust comments appropriately.

1116c9d1

Jul 14, 2012

Add link to PEP 394 regarding python2 vs python3 naming · eb972f3e

Peter Eisentraut authored 12 years ago

eb972f3e

Jul 12, 2012

Fix walsender processes to establish a SIGALRM handler. · 0bf8eb2a

Tom Lane authored 12 years ago

Walsenders must have working SIGALRM handling during InitPostgres,
but they set the handler to SIG_IGN so that nothing would happen
if a timeout was reached.  This could result in two failure modes:

* If a walsender participated in a deadlock during its authentication
transaction, and was the last to wait in the deadly embrace, the deadlock
would not get cleared automatically.  This would require somebody to be
trying to take out AccessExclusiveLock on multiple system catalogs, so
it's not very probable.

* If a client failed to respond to a walsender's authentication challenge,
the intended disconnect after AuthenticationTimeout wouldn't happen, and
the walsender would wait indefinitely for the client.

For the moment, fix in back branches only, since this is fixed in a
different way in the timeout-infrastructure patch that's awaiting
application to HEAD.  If we choose not to apply that, then we'll need
to do this in HEAD as well.

0bf8eb2a

Jul 11, 2012

Document that Log-Shipping Standby Servers cannot be upgraded by pg_upgrade. · 4810ebe1

Bruce Momjian authored 12 years ago

Backpatch to 9.2.

4810ebe1

Back-patch fix for extraction of fixed prefixes from regular expressions. · 18c8dc32

Tom Lane authored 12 years ago

Back-patch of commits 628cbb50 and c6aae304. This has been broken since 7.3, so back-patch to all supported branches.

18c8dc32