Commits · 60a3dffb724c49c60d9ba921929bfa49ec21dd00 · Jakob Huber / postgres-lambda-diff

May 10, 2012

Heikki Linnakangas authored 12 years ago

Multi-insert records observe XLOG_HEAP_INIT_PAGE flag too, as Andres Freund
pointed out.

60a3dffb

Improve control logic for bgwriter hibernation mode. · 6308ba05

Tom Lane authored 12 years ago

Commit 6d90eaaa added a hibernation mode
to the bgwriter to reduce the server's idle-power consumption. However,
its interaction with the detailed behavior of BgBufferSync's feedback
control loop wasn't very well thought out. That control loop depends
primarily on the rate of buffer allocation, not the rate of buffer
dirtying, so the hibernation mode has to be designed to operate only when
no new buffer allocations are happening. Also, the check for whether the
system is effectively idle was not quite right and would fail to detect
a constant low level of activity, thus allowing the bgwriter to go into
hibernation mode in a way that would let the cycle time vary quite a bit,
possibly further confusing the feedback loop. To fix, move the wakeup
support from MarkBufferDirty and SetBufferCommitInfoNeedsSave into
StrategyGetBuffer, and prevent the bgwriter from entering hibernation mode
unless no buffer allocations have happened recently.

In addition, fix the delaying logic to remove the problem of possibly not
responding to signals promptly, which was basically caused by trying to use
the process latch's is_set flag for multiple purposes. I can't prove it
but I'm suspicious that that hack was responsible for the intermittent
"postmaster does not shut down" failures we've been seeing in the buildfarm
lately. In any case it did nothing to improve the readability or
robustness of the code.

In passing, express the hibernation sleep time as a multiplier on
BgWriterDelay, not a constant. I'm not sure whether there's any value in
exposing the longer sleep time as an independently configurable setting,
but we can at least make it act like this for little extra code.
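
The sleep policy described above can be sketched roughly as follows. This is a simplified illustration, not the code from bgwriter.c: the function name, its two inputs, and the factor of 50 are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration; the message only says "a multiplier". */
#define HIBERNATE_FACTOR 50

/*
 * Return how long (in ms) the background writer should sleep next.
 * It takes the long sleep only when the last cleaning scan found nothing
 * to do AND no buffer allocations happened recently.
 */
long
bgwriter_sleep_ms(long bgwriter_delay_ms, bool scan_found_nothing,
                  unsigned recent_buffer_allocs)
{
    assert(bgwriter_delay_ms > 0);

    if (scan_found_nothing && recent_buffer_allocs == 0)
        return bgwriter_delay_ms * HIBERNATE_FACTOR;    /* hibernate */
    return bgwriter_delay_ms;                           /* normal cycle */
}
```

Expressing the long sleep as a multiple of the base delay means the two values scale together when the base delay is changed.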

6308ba05

May 09, 2012

Rename BgWriterShmem/Request to CheckpointerShmem/Request · 8f28789b
Simon Riggs authored 12 years ago

8f28789b
Rename BgWriterCommLock to CheckpointerCommLock · bbd3ec9d
Simon Riggs authored 12 years ago

bbd3ec9d

Fix an issue in recent walwriter hibernation patch. · acd4c7d5

Tom Lane authored 12 years ago

Users of asynchronous-commit mode expect there to be a guaranteed maximum
delay before an async commit's WAL records get flushed to disk. The
original version of the walwriter hibernation patch broke that. Add an
extra shared-memory flag to allow async commits to kick the walwriter out
of hibernation mode, without adding any noticeable overhead in cases where
no action is needed.

acd4c7d5

Reduce idle power consumption of walwriter and checkpointer processes. · 5461564a

Tom Lane authored 12 years ago

This patch modifies the walwriter process so that, when it has not found
anything useful to do for many consecutive wakeup cycles, it extends its
sleep time to reduce the server's idle power consumption. It reverts to
normal as soon as it's done any successful flushes. It's still true that
during any async commit, backends check for completed, unflushed pages of
WAL and signal the walwriter if there are any; so that in practice the
walwriter can get awakened and returned to normal operation sooner than the
sleep time might suggest.

Also, improve the checkpointer so that it uses a latch and a computed delay
time to not wake up at all except when it has something to do, replacing a
previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for
reducing the server's power consumption when idle.

In passing, get rid of the dedicated latch for signaling the walwriter in
favor of using its procLatch, since that comports better with possible
generic signal handlers using that latch. Also, fix a pre-existing bug
with failure to save/restore errno in walwriter's signal handlers.
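
For readers unfamiliar with the errno issue: a signal handler can interrupt code that is about to inspect errno, so the handler should leave errno as it found it. A generic sketch of the idiom, with hypothetical names and a deliberately failing close() standing in for real handler work:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Flag set by the handler; the name is hypothetical. */
static volatile sig_atomic_t got_wakeup = 0;

static void
wakeup_handler(int signo)
{
    int save_errno = errno;     /* remember the interrupted code's errno */

    (void) signo;
    got_wakeup = 1;
    (void) close(-1);           /* stand-in for handler work; sets EBADF */

    errno = save_errno;         /* put it back before returning */
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on failure. */
int
install_wakeup_handler(void)
{
    return (signal(SIGUSR1, wakeup_handler) == SIG_ERR) ? -1 : 0;
}
```

Without the save and restore, the interrupted code would see EBADF from the handler instead of its own error value.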

Peter Geoghegan, somewhat simplified by Tom

5461564a

May 07, 2012

Remove strdup, strtol, strtoul from libpgport · 3284e03d

Peter Eisentraut authored 12 years ago

These should not be needed anymore, at least after the recent port
removals.  So let's see whether we can do without them.

3284e03d

May 04, 2012

Overdue code review for transaction-level advisory locks patch. · 71b9549d

Tom Lane authored 12 years ago

Commit 62c7bd31 had assorted problems, most
visibly that it broke PREPARE TRANSACTION in the presence of session-level
advisory locks (which should be ignored by PREPARE), as per a recent
complaint from Stephen Rees. More abstractly, the patch made the
LockMethodData.transactional flag not merely useless but outright
dangerous, because in point of fact that flag no longer tells you anything
at all about whether a lock is held transactionally. This fix therefore
removes that flag altogether. We now rely entirely on the convention
already in use in lock.c that transactional lock holds must be owned by
some ResourceOwner, while session holds are never so owned. Setting the
locallock struct's owner link to NULL thus denotes a session hold, and
there is no redundant marker for that.
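
The ownership convention can be illustrated with a small sketch. The structs below are heavily simplified stand-ins, not the definitions in lock.c, and the second function only models the "both types of hold on one object" restriction that the next paragraph of this message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real structs carry far more state. */
typedef struct ResourceOwnerData
{
    const char *name;
} ResourceOwnerData;

typedef struct LockOwnerEntry
{
    const ResourceOwnerData *owner;     /* NULL means a session-level hold */
    long                     nlocks;
} LockOwnerEntry;

/* A hold is session-level exactly when it has no owning ResourceOwner. */
bool
lock_hold_is_session_level(const LockOwnerEntry *entry)
{
    assert(entry != NULL);
    return entry->owner == NULL;
}

/* Model of the PREPARE rule: refuse if one object has both kinds of hold. */
bool
holds_can_be_prepared(const LockOwnerEntry *owners, size_t n)
{
    bool have_session = false;
    bool have_xact = false;

    for (size_t i = 0; i < n; i++)
    {
        if (lock_hold_is_session_level(&owners[i]))
            have_session = true;
        else
            have_xact = true;
    }
    return !(have_session && have_xact);
}
```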

PREPARE TRANSACTION now works again when there are session-level advisory
locks, and it is also able to transfer transactional advisory locks to the
prepared transaction, but for implementation reasons it throws an error if
we hold both types of lock on a single lockable object. Perhaps it will be
worth improving that someday.

Assorted other minor cleanup and documentation editing, as well.

Back-patch to 9.1, except that in the 9.1 branch I did not remove the
LockMethodData.transactional flag for fear of causing an ABI break for
any external code that might be examining those structs.

71b9549d

May 03, 2012

Remove BSD/OS (BSDi) port. · ebcaa5fc

Bruce Momjian authored 12 years ago

There are no known users upgrading to Postgres 9.2, and perhaps no
existing users either.

ebcaa5fc

May 02, 2012

Add missing parenthesis in comment. · 8e0c5195
Robert Haas authored 12 years ago

8e0c5195

Avoid repeated CLOG access from heap_hot_search_buffer. · 00381104

Robert Haas authored 12 years ago

At the time we check whether the tuple is dead to all running
transactions, we've already verified that it isn't visible to our
scan, setting hint bits if appropriate.  So there's no need to
recheck CLOG for the all-dead test we do just a moment later.
So, add HeapTupleIsSurelyDead() to test the appropriate condition
under the assumption that all relevant hint bits are already set.

Review by Tom Lane.

00381104

More duplicate word removal. · e01e66f8
Robert Haas authored 12 years ago

e01e66f8
Remove duplicate words in comments. · f291ccd4
Heikki Linnakangas authored 12 years ago

Found these with grep -r "for for ".

f291ccd4

May 01, 2012

Remove dead ports · f2f9439f

Peter Eisentraut authored 12 years ago

Remove the following ports:

- dgux
- nextstep
- sunos4
- svr4
- ultrix4
- univel

These are obsolete and not worth rescuing.  In most cases, there is
circumstantial evidence that they wouldn't work anymore anyway.

f2f9439f

Apr 30, 2012

Converge all SQL-level statistics timing values to float8 milliseconds. · 809e7e21

Tom Lane authored 12 years ago

This patch adjusts the core statistics views to match the decision already
taken for pg_stat_statements, that values representing elapsed time should
be represented as float8 and measured in milliseconds.  By using float8,
we are no longer tied to a specific maximum precision of timing data.
(Internally, it's still microseconds, but we could now change that without
needing changes at the SQL level.)

The columns affected are
pg_stat_bgwriter.checkpoint_write_time
pg_stat_bgwriter.checkpoint_sync_time
pg_stat_database.blk_read_time
pg_stat_database.blk_write_time
pg_stat_user_functions.total_time
pg_stat_user_functions.self_time
pg_stat_xact_user_functions.total_time
pg_stat_xact_user_functions.self_time

The first four of these are new in 9.2, so there is no compatibility issue
from changing them.  The others require a release note comment that they
are now double precision (and can show a fractional part) rather than
bigint as before; also their underlying statistics functions now match
the column definitions, instead of returning bigint microseconds.
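
The conversion at the SQL boundary amounts to a division. A minimal sketch, assuming the internal counter is a 64-bit microsecond count (the function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Convert an internal microsecond counter to float8-style milliseconds. */
double
usec_to_msec(int64_t usec)
{
    assert(usec >= 0);
    return (double) usec / 1000.0;
}
```

Because the result is a double, sub-millisecond detail survives as a fractional part, and the internal unit could later change without altering the SQL-level type.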

809e7e21

Mark ReThrowError() with attribute noreturn · 26471a51
Peter Eisentraut authored 12 years ago

All related functions were already so marked.

26471a51

Rename I/O timing statistics columns to blk_read_time and blk_write_time. · 1dd89ead

Tom Lane authored 12 years ago

This seems more consistent with the pre-existing choices for names of
other statistics columns.  Rename assorted internal identifiers to match.

1dd89ead

Apr 29, 2012

Rename track_iotiming GUC to track_io_timing. · 309c6474
Tom Lane authored 12 years ago

This spelling seems significantly more readable to me.

309c6474

Change return type of ExceptionalCondition to void and mark it noreturn · 81107282

Peter Eisentraut authored 12 years ago

In ancient times, it was thought that this wouldn't work because of
TrapMacro/AssertMacro, but changing those to use a comma operator
appears to work without compiler warnings.
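
The comma-operator trick can be shown in isolation. This sketch is not the TrapMacro/AssertMacro definition; the macro and function names are hypothetical, and C11's _Noreturn stands in for the compiler attribute.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reporting function: returns void and never returns. */
static _Noreturn void
report_failed_condition(const char *cond, const char *file, int line)
{
    fprintf(stderr, "failed condition \"%s\" at %s:%d\n", cond, file, line);
    abort();
}

/*
 * Expression-context check. A void call cannot be an operand of ?: next to
 * an int, so the comma operator pairs it with a dummy 0 to give the branch
 * a value.
 */
#define CHECK_EXPR(cond) \
    ((cond) ? 1 : (report_failed_condition(#cond, __FILE__, __LINE__), 0))

/* Example use inside an expression. */
int
checked_div(int num, int den)
{
    return (void) CHECK_EXPR(den != 0), num / den;
}
```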

81107282

Apr 27, 2012

Prevent index-only scans from returning wrong answers under Hot Standby. · 3424bff9

Robert Haas authored 12 years ago

The alternative of disallowing index-only scans in HS operation was
discussed, but the consensus was that it was better to treat marking
a page all-visible as a recovery conflict for snapshots that could still
fail to see XIDs on that page. We may in the future try to soften this,
so that we simply force index scans to do heap fetches in cases where
this may be an issue, rather than throwing a hard conflict.

3424bff9

Apr 26, 2012

Fix planner's handling of RETURNING lists in writable CTEs. · 9fa82c98

Tom Lane authored 12 years ago

setrefs.c failed to do "rtoffset" adjustment of Vars in RETURNING lists,
which meant they were left with the wrong varnos when the RETURNING list
was in a subquery. That was never possible before writable CTEs, of
course, but now it's broken. The executor fails to notice any problem
because ExecEvalVar just references the ecxt_scantuple for any normal
varno; but EXPLAIN breaks when the varno is wrong, as illustrated in a
recent complaint from Bartosz Dmytrak.

Since the eventual rtoffset of the subquery is not known at the time
we are preparing its plan node, the previous scheme of executing
set_returning_clause_references() at that time cannot handle this
adjustment. Fortunately, it turns out that we don't really need to do it
that way, because all the needed information is available during normal
setrefs.c execution; we just have to dig it out of the ModifyTable node.
So, do that, and get rid of the kluge of early setrefs processing of
RETURNING lists. (This is a little bit of a cheat in the case of inherited
UPDATE/DELETE, because we are not passing a "root" struct that corresponds
exactly to what the subplan was built with. But that doesn't matter, and
anyway this is less ugly than early setrefs processing was.)

Back-patch to 9.1, where the problem became possible to hit.

9fa82c98

Apr 25, 2012

Remove prototype for nonexistent function. · ca1e1a8d
Robert Haas authored 12 years ago

ca1e1a8d

Apr 24, 2012

Lots of doc corrections. · 5d4b60f2

Robert Haas authored 12 years ago

Josh Kupershmidt

5d4b60f2

Apr 21, 2012

Recast "ONLY" column CHECK constraints as NO INHERIT · 09ff76fc

Alvaro Herrera authored 12 years ago

The original syntax wasn't universally loved, and it didn't allow its
usage in CREATE TABLE, only ALTER TABLE.  It now works everywhere, and
it also allows using ALTER TABLE ONLY to add an uninherited CHECK
constraint, per discussion.

The pg_constraint column has accordingly been renamed connoinherit.

This commit partly reverts some of the changes in
61d81bd2, particularly some pg_dump and
psql bits, because now pg_get_constraintdef includes the necessary NO
INHERIT within the constraint definition.

Author: Nikhil Sontakke
Some tweaks by me

09ff76fc

Apr 19, 2012

Revise parameterized-path mechanism to fix assorted issues. · 5b7b5518

Tom Lane authored 12 years ago

This patch adjusts the treatment of parameterized paths so that all paths
with the same parameterization (same set of required outer rels) for the
same relation will have the same rowcount estimate. We cache the rowcount
estimates to ensure that property, and hopefully save a few cycles too.
Doing this makes it practical for add_path_precheck to operate without
a rowcount estimate: it need only assume that paths with different
parameterizations never dominate each other, which is close enough to
true anyway for coarse filtering, because normally a more-parameterized
path should yield fewer rows thanks to having more join clauses to apply.

In add_path, we do the full nine yards of comparing rowcount estimates
along with everything else, so that we can discard parameterized paths that
don't actually have an advantage. This fixes some issues I'd found with
add_path rejecting parameterized paths on the grounds that they were more
expensive than not-parameterized ones, even though they yielded many fewer
rows and hence would be cheaper once subsequent joining was considered.

To make the same-rowcounts assumption valid, we have to require that any
parameterized path enforce *all* join clauses that could be obtained from
the particular set of outer rels, even if not all of them are useful for
indexing. This is required at both base scans and joins. It's a good
thing anyway since the net impact is that join quals are checked at the
lowest practical level in the join tree. Hence, discard the original
rather ad-hoc mechanism for choosing parameterization joinquals, and build
a better one that has a more principled rule for when clauses can be moved.
The original rule was actually buggy anyway for lack of knowledge about
which relations are part of an outer join's outer side; getting this right
requires adding an outer_relids field to RestrictInfo.

5b7b5518

Apr 18, 2012

Finish rename of FastPathStrongLocks to FastPathStrongRelationLocks. · 4a6fab03

Robert Haas authored 12 years ago

Commit 8e5ac74c tried to do this renaming,
but I relied on gcc to tell me where I needed to make changes, instead of
grep.

Noted by Jeff Davis.

4a6fab03

Tighten up error recovery for fast-path locking. · 53c5b869

Robert Haas authored 12 years ago

The previous code could cause a backend crash after BEGIN; SAVEPOINT a;
LOCK TABLE foo (interrupted by ^C or statement timeout); ROLLBACK TO
SAVEPOINT a; LOCK TABLE foo, and might have leaked strong-lock counts
in other situations.

Report by Zoltán Böszörményi; patch review by Jeff Davis.

53c5b869

Apr 14, 2012

pg_size_pretty(numeric) · 4a2d7ad7

Robert Haas authored 12 years ago

The output of the new pg_xlog_location_diff function is of type numeric,
since it could theoretically overflow an int8 due to signedness; this
provides a convenient way to format such values.
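
The signedness problem is easy to demonstrate. The sketch below assumes, purely for illustration, that a WAL location is treated as an unsigned 64-bit byte position; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Report whether the signed difference a - b of two unsigned 64-bit
 * positions is representable as an int64_t.
 */
bool
location_diff_fits_int64(uint64_t a, uint64_t b)
{
    if (a >= b)
        return (a - b) <= (uint64_t) INT64_MAX;

    /* Negative result: the magnitude may be as large as 2^63. */
    return (b - a) <= (uint64_t) INT64_MAX + 1;
}
```

The difference of two such positions spans almost twice the range of a signed 64-bit integer, which is why an arbitrary-precision result type is the safe choice.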

Fujii Masao, with some beautification by me.

4a2d7ad7

Apr 13, 2012

Rename bytea_agg to string_agg and add delimiter argument · c0cc526e

Peter Eisentraut authored 12 years ago

Per mailing list discussion, we would like to keep the bytea functions
parallel to the text functions, so rename bytea_agg to string_agg,
which already exists for text.

Also, to satisfy the rule that we don't want aggregate functions of
the same name with a different number of arguments, add a delimiter
argument, just like string_agg for text already has.

c0cc526e

Apr 08, 2012

Do stack-depth checking in all postmaster children. · ef3883d1

Heikki Linnakangas authored 12 years ago

We used to only initialize the stack base pointer when starting up a regular
backend, not in other processes. In particular, autovacuum workers can run
arbitrary user code, and without stack-depth checking, infinite recursion
in e.g. an index expression will bring down the whole cluster.

The comment about PL/Java using set_stack_base() is not yet true. As the
code stands, PL/Java still modifies the stack_base_ptr variable directly.
However, it's been discussed in the PL/Java mailing list that it should be
changed to use the function, because PL/Java is currently oblivious to the
register stack used on Itanium. There's another issue with PL/Java, namely
that the stack base pointer it sets is not really the base of the stack, it
could be something close to the bottom of the stack. That's a separate issue
that might need some further changes to this code, but that's a different
story.
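
For context, a stack-depth check of this kind compares the address of a local variable against a base address recorded at process start. The following is a generic sketch with hypothetical names, not the PostgreSQL implementation; it stores addresses as integers and ignores the Itanium register stack mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address recorded near process start; 0 means "not yet known". */
static uintptr_t stack_base_addr = 0;

/* Absolute distance between two addresses; the stack may grow either way. */
size_t
stack_distance(uintptr_t base, uintptr_t here)
{
    return (size_t) (base > here ? base - here : here - base);
}

/* Record the current stack position as the base. */
void
remember_stack_base(void)
{
    char marker = 0;

    stack_base_addr = (uintptr_t) &marker;
}

/* Return true if the stack has grown past max_depth_bytes from the base. */
bool
stack_is_too_deep(size_t max_depth_bytes)
{
    char marker = 0;

    if (stack_base_addr == 0)
        return false;           /* base never initialized: cannot check */
    return stack_distance(stack_base_addr, (uintptr_t) &marker) > max_depth_bytes;
}
```

The early return shows why initialization matters: a process that never records its base gets no protection at all, which is the gap this commit closes for non-backend processes.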

Backpatch to all supported releases.

ef3883d1

Apr 06, 2012

Dept of second thoughts: improve the API for AnalyzeForeignTable. · cea49fe8

Tom Lane authored 12 years ago

If we make the initially-called function return the table physical-size
estimate, acquire_inherited_sample_rows will be able to use that to
allocate numbers of samples among child tables, when the day comes that
we want to support foreign tables in inheritance trees.
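
The shape of such an API can be sketched as follows: the first call decides whether the table can be analyzed, hands back a sampling callback, and reports a size estimate in pages. All types and names here are simplified, hypothetical stand-ins, and the 8192-byte page size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BLOCK_SIZE 8192  /* assumed page size */

/* Hypothetical foreign table backed by a file of known size. */
typedef struct ForeignTableSketch
{
    size_t size_bytes;
} ForeignTableSketch;

/* Callback type: collect up to targrows sample rows, return how many. */
typedef int (*AcquireSampleRowsSketch) (const ForeignTableSketch *table,
                                        int targrows);

/* Placeholder sampler that collects nothing. */
static int
sample_nothing(const ForeignTableSketch *table, int targrows)
{
    (void) table;
    (void) targrows;
    return 0;
}

/*
 * Initially-called function: returns false if the table cannot be
 * analyzed; otherwise sets the sampling callback and the size estimate.
 */
bool
analyze_foreign_table_sketch(const ForeignTableSketch *table,
                             AcquireSampleRowsSketch *func,
                             unsigned long *totalpages)
{
    assert(table != NULL && func != NULL && totalpages != NULL);

    if (table->size_bytes == 0)
        return false;           /* sketch choice: skip empty tables */

    *func = sample_nothing;
    *totalpages = (unsigned long)
        ((table->size_bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE);
    return true;
}
```

Returning the size estimate up front lets a caller apportion sample rows among several tables before any sampling callback runs.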

cea49fe8

Allow statistics to be collected for foreign tables. · 263d9de6

Tom Lane authored 12 years ago

ANALYZE now accepts foreign tables and allows the table's FDW to control
how the sample rows are collected. (But only manual ANALYZEs will touch
foreign tables, for the moment, since among other things it's not very
clear how to handle remote permissions checks in an auto-analyze.)

contrib/file_fdw is extended to support this.

Etsuro Fujita, reviewed by Shigeru Hanada, some further tweaking by me.

263d9de6

Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock · 8cb53654
Simon Riggs authored 12 years ago

8cb53654
checkopint -> checkpoint · 21cc5296
Robert Haas authored 12 years ago

Report by Guillaume Lelarge.

21cc5296

Apr 05, 2012

Publish checkpoint timing information to pg_stat_bgwriter. · b736aef2
Robert Haas authored 12 years ago

Greg Smith, Peter Geoghegan, and Robert Haas

b736aef2

Expose track_iotiming data via the statistics collector. · 64482890

Robert Haas authored 12 years ago

Ants Aasma's original patch to add timing information for buffer I/O
requests exposed this data at the relation level, which was judged too
costly. I've here exposed it at the database level instead.

64482890

Apr 03, 2012

Add support for renaming domain constraints · 38b9693f
Peter Eisentraut authored 12 years ago

38b9693f

Mar 31, 2012

Add PGDLLIMPORT to ScanKeywords and NumScanKeywords. · 5e83854d

Tom Lane authored 12 years ago

Per buildfarm, this is now needed by contrib/pg_stat_statements.

5e83854d

Mar 29, 2012

Inherit max_safe_fds to child processes in EXEC_BACKEND mode. · 5762a4d9

Heikki Linnakangas authored 13 years ago

Postmaster sets max_safe_fds by testing how many file descriptors it
can open, and that is normally inherited by all child processes at fork().
Not so on EXEC_BACKEND, i.e. Windows, however. Because of that, we
effectively ignored max_files_per_process on Windows, and always assumed
a conservative default of 32 simultaneous open files. That could have an
impact on performance, if you need to access a lot of different files
in a query. After this patch, the value is passed to child processes by
save/restore_backend_variables() among many other global variables.

It has been like this forever, but given the lack of complaints about it,
I'm not backpatching this.

5762a4d9

Remove now redundant pgpipe code. · d2c1740d
Andrew Dunstan authored 13 years ago

d2c1740d