Commits · e01e66f808fbd161b2714eab34bb9e9d0db0db53 · Jakob Huber / postgres-lambda-diff

Jan 30, 2012

Various minor comment changes from bgwriter to checkpointer. · 73f617f1

Simon Riggs authored 13 years ago

73f617f1

Jan 24, 2012

Add new replication mode synchronous_commit = 'write'. · 443b4821

Simon Riggs authored 13 years ago

Replication occurs only to memory on the standby, not to disk,
so it provides additional performance if the user wishes to
reduce the durability level slightly. Adds the concept of multiple
independent sync rep queues.

Fujii Masao and Simon Riggs

443b4821

Jan 02, 2012

Update copyright notices for year 2012. · e126958c

Bruce Momjian authored 13 years ago

e126958c

Dec 31, 2011

Send new protocol keepalive messages to standby servers. · 64233902

Simon Riggs authored 13 years ago

Allows streaming replication users to calculate transfer latency
and apply delay via internal functions. No external functions yet.

64233902

Sep 14, 2011

Split walsender.h in public/private headers · 86822df9

Alvaro Herrera authored 13 years ago

This dramatically cuts the number of headers that the public one pulls
into whatever includes it.

86822df9

Sep 09, 2011

Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h. · a7801b62

Tom Lane authored 13 years ago

As per my recent proposal, this refactors things so that these typedefs and
macros are available in a header that can be included in frontend-ish code.
I also changed various headers that were undesirably including
utils/timestamp.h to include datatype/timestamp.h instead.  Unsurprisingly,
this showed that half the system was getting utils/timestamp.h by way of
xlog.h.

No actual code changes here, just header refactoring.

a7801b62

Sep 04, 2011

Clean up the #include mess a little. · 1609797c

Tom Lane authored 13 years ago

walsender.h should depend on xlog.h, not vice versa. (Actually, the
inclusion was circular until a couple hours ago, which was even sillier;
but Bruce broke it in the expedient rather than logically correct
direction.) Because of that poor decision, plus blind application of
pgrminclude, we had a situation where half the system was depending on
xlog.h to include such unrelated stuff as array.h and guc.h. Clean up
the header inclusion, and manually revert a lot of what pgrminclude had
done so things build again.

This episode reinforces my feeling that pgrminclude should not be run
without adult supervision. Inclusion changes in header files in particular
need to be reviewed with great care. More generally, it'd be good if we
had a clearer notion of module layering to dictate which headers can sanely
include which others ... but that's a big task for another day.

1609797c

walsender.h doesn't need xlog.h, per Tom. · 5bce637a

Bruce Momjian authored 13 years ago

5bce637a

Move AllowCascadeReplication() define from xlog.h to replication include · 85e6e166

Bruce Momjian authored 13 years ago

file.

Per suggestion from Alvaro.

85e6e166

Sep 01, 2011

Remove unnecessary #include references, per pgrminclude script. · 6416a82a

Bruce Momjian authored 13 years ago

6416a82a

Aug 11, 2011

Remove wal_sender_delay GUC, because it's no longer useful. · cff75130

Tom Lane authored 13 years ago

The latch infrastructure is now capable of detecting all cases where the
walsender loop needs to wake up, so there is no reason to have an arbitrary
timeout.

Also, modify the walsender loop logic to follow the standard pattern of
ResetLatch, test for work to do, WaitLatch.  The previous coding was both
hard to follow and buggy: it would sometimes busy-loop despite having
nothing available to do, eg between receipt of a signal and the next time
it was caught up with new WAL, and it also had interesting choices like
deciding to update to WALSNDSTATE_STREAMING on the strength of information
known to be obsolete.
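
The "ResetLatch, test for work to do, WaitLatch" ordering described above can be illustrated with a toy, single-threaded model. This is only a sketch: `ToyLatch` and the `toy_*` functions are hypothetical stand-ins, not PostgreSQL's latch API, and the "wait" merely reports whether a real wait would return immediately.

```c
#include <stdbool.h>

/* Toy stand-in for a latch: a flag that a waker sets and the owner resets. */
typedef struct ToyLatch
{
    bool is_set;
} ToyLatch;

void toy_set_latch(ToyLatch *l)   { l->is_set = true; }
void toy_reset_latch(ToyLatch *l) { l->is_set = false; }

/* Returns true if a wait would return at once (latch already set). */
bool toy_wait_latch(const ToyLatch *l) { return l->is_set; }

/*
 * One loop iteration in the order: reset, test for work, wait.
 * *pending is the queued work, *done counts work performed, and
 * arrive_during_work simulates a waker queuing more work (and setting
 * the latch) while this iteration is busy. Because the reset happened
 * first, that wakeup is not lost.
 * Returns true if the loop would iterate again without sleeping.
 */
bool toy_loop_iteration(ToyLatch *l, int *pending, int *done,
                        int arrive_during_work)
{
    toy_reset_latch(l);             /* 1. clear the latch first */

    *done += *pending;              /* 2. do all work visible right now */
    *pending = 0;

    if (arrive_during_work > 0)     /* simulated concurrent waker */
    {
        *pending += arrive_during_work;
        toy_set_latch(l);
    }

    return toy_wait_latch(l);       /* 3. wait; returns at once if set */
}
```

In this model, resetting the latch after the work test instead of before it could erase a wakeup that arrived in between, which is the kind of ordering problem the standard pattern is meant to avoid.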

cff75130

Aug 10, 2011

Change the autovacuum launcher to use WaitLatch instead of a poll loop. · 4dab3d5a

Tom Lane authored 13 years ago

In pursuit of this (and with the expectation that WaitLatch will be needed
in more places), convert the latch field that was already added to PGPROC
for sync rep into a generic latch that is activated for all PGPROC-owning
processes, and change many of the standard backend signal handlers to set
that latch when a signal happens. This will allow WaitLatch callers to be
wakened properly by these signals.

In passing, fix a whole bunch of signal handlers that had been hacked to do
things that might change errno, without adding the necessary save/restore
logic for errno. Also make some minor fixes in unix_latch.c, and clean
up bizarre and unsafe scheme for disowning the process's latch. Much of
this has to be back-patched into 9.1.
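
The errno save/restore logic mentioned above can be sketched as follows. The handler and flag names are hypothetical and not taken from the PostgreSQL source; the call to `close(-1)` is only a stand-in for handler work that happens to change errno. The sketch assumes a POSIX system.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Flag polled by the main loop; sig_atomic_t is safe to write in a handler. */
volatile sig_atomic_t toy_got_signal = 0;

/*
 * Hypothetical signal handler. Work done inside the handler may clobber
 * errno, so it saves errno on entry and restores it on exit. Without
 * this, the interrupted code could observe a wrong errno value right
 * after a failed system call.
 */
void toy_signal_handler(int signo)
{
    int save_errno = errno;

    (void) signo;
    toy_got_signal = 1;

    /* Stand-in for work that changes errno: this fails with EBADF. */
    (void) close(-1);

    errno = save_errno;
}
```

Calling the handler directly with errno preset shows that the value seen by the interrupted code survives the handler.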

Peter Geoghegan, with additional work by Tom

4dab3d5a

Aug 06, 2011

Clean up ill-advised attempt to invent a private set of Node tags. · 05e83968

Tom Lane authored 13 years ago

Somebody thought it'd be cute to invent a set of Node tag numbers that were
defined independently of, and indeed conflicting with, the main tag-number
list. While this accidentally failed to fail so far, it would certainly
lead to trouble as soon as anyone wanted to, say, apply copyObject to these
node types. Clang was already complaining about the use of makeNode on
these tags, and I think quite rightly so. Fix by pushing these node
definitions into the mainstream, including putting replnodes.h where it
belongs.

05e83968

Jul 19, 2011

Cascading replication feature for streaming log-based replication. · 52861058

Simon Riggs authored 13 years ago

Standby servers can now have WALSender processes, which can work with
either WALReceiver or archive_commands to pass data. Fully updated
docs, including new conceptual terms of sending server, upstream and
downstream servers. WALSenders are terminated on promotion to master.

Fujii Masao, review, rework and doc rewrite by Simon Riggs

52861058

Apr 10, 2011

pgindent run before PG 9.1 beta 1. · bf50caf1

Bruce Momjian authored 13 years ago

bf50caf1

Apr 07, 2011

Revise the API for GUC variable assign hooks. · 2594cf0e

Tom Lane authored 13 years ago

The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment. And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server". There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead). This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
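
The split between check hooks and assign hooks can be sketched with a toy setting. All names here are hypothetical, the signatures are not those used by guc.c, and the auto-tune value and minimum are assumed purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical setting: a buffer count where -1 means "auto-tune". */
int toy_buffers = 8;

/*
 * Check hook: allowed to fail. It validates the proposed value and
 * rewrites it into canonical form without touching the live variable.
 */
bool toy_check_buffers(int *newval)
{
    if (*newval == -1)
        *newval = 16;           /* canonicalize "auto" (assumed value) */
    return *newval >= 4;        /* reject values below an assumed minimum */
}

/* Assign hook: must not fail; it only installs an already-checked value. */
void toy_assign_buffers(int newval)
{
    toy_buffers = newval;
}

/* Driver: run the check first, and assign only if the check passed. */
bool toy_set_buffers(int requested)
{
    if (!toy_check_buffers(&requested))
        return false;
    toy_assign_buffers(requested);
    return true;
}
```

Because the check hook can be called on its own, a caller can learn the canonical form of a value (here, that -1 means 16) without performing an assignment.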

2594cf0e

Apr 05, 2011

Avoid assuming there will be only 3 states for synchronous_commit. · 88f32b7c

Simon Riggs authored 13 years ago

Also avoid hardcoding the current default state under the name "on";
replace it with a meaningful name that reflects its behaviour.
Coding only, no change in behaviour.

88f32b7c

Apr 04, 2011

Merge synchronous_replication setting into synchronous_commit. · 240067b3

Robert Haas authored 13 years ago

This means one less thing to configure when setting up synchronous
replication, and also avoids some ambiguity around what the behavior
should be when the settings of these variables conflict.

Fujii Masao, with additional hacking by me.

240067b3

Mar 30, 2011

Automatically terminate replication connections that are idle for more · 754baa21

Heikki Linnakangas authored 14 years ago

than replication_timeout (a new GUC) milliseconds. The TCP timeout is often
too long; you want the master to notice a dead connection much sooner.
People complained about that in 9.0 too, but with synchronous replication
it's even more important to notice dead connections promptly.
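
A minimal sketch of the idle check, assuming the sender records the time of the last message received from the standby. The function and parameter names are hypothetical, and treating a non-positive timeout as "disabled" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the connection has been idle for at least timeout_ms.
 * All times are in milliseconds. A timeout of zero or less disables the
 * check (assumed convention for this sketch).
 */
bool toy_connection_timed_out(int64_t now_ms, int64_t last_reply_ms,
                              int64_t timeout_ms)
{
    if (timeout_ms <= 0)
        return false;
    return now_ms - last_reply_ms >= timeout_ms;
}
```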

Fujii Masao and Heikki Linnakangas

754baa21

Mar 17, 2011

Fix various possible problems with synchronous replication. · 9a56dc33

Robert Haas authored 14 years ago

1. Don't ignore query cancel interrupts.  Instead, if the user asks to
cancel the query after we've already committed it, but before it's on
the standby, just emit a warning and let the COMMIT finish.

2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown).
Instead, emit a warning message and close the connection without
acknowledging the commit.  Other backends will still see the effect of
the commit, but there's no getting around that; it's too late to abort
at this point, and ignoring die interrupts altogether doesn't seem like
a good idea.

3. If synchronous_standby_names becomes empty, wake up all backends
waiting for synchronous replication to complete.  Without this, someone
attempting to shut synchronous replication off could easily wedge the
entire system instead.

4. Avoid depending on the assumption that if a walsender updates
MyProc->syncRepState, we'll see the change even if we read it without
holding the lock.  The window for this appears to be quite narrow (and
probably doesn't exist at all on machines with strong memory ordering)
but protecting against it is practically free, so do that.

5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and
doesn't actually do anything.

There's still some further work needed here to make the behavior of fast
shutdown plausible, but that looks complex, so I'm leaving it for a
separate commit.  Review by Fujii Masao.

9a56dc33

Mar 10, 2011

More synchronous replication tweaks. · b8bb8dbf

Robert Haas authored 14 years ago

SyncRepRequested() must check not only the value of the
synchronous_replication GUC but also whether max_wal_senders > 0.
Otherwise, we might end up waiting for sync rep even when there's no
possibility of a standby ever managing to connect. There are some
existing cross-checks to prevent this, but they're not quite sufficient:
the user can start the server with max_wal_senders=0,
synchronous_standby_names='', and synchronous_replication=off and then
subsequently make synchronous_standby_names not empty using pg_ctl reload,
and then SET synchronous_replication=on, leading to an indefinite hang.
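
The combined condition can be sketched as a small predicate. The struct and function below are hypothetical illustrations named after the settings in the message above, not the actual SyncRepRequested() definition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two settings discussed above. */
typedef struct ToySettings
{
    bool synchronous_replication;
    int  max_wal_senders;
} ToySettings;

/*
 * Wait for sync rep only if it was requested AND a standby could ever
 * connect. Checking the first condition alone allows the indefinite
 * hang described above when max_wal_senders is 0.
 */
bool toy_sync_rep_requested(const ToySettings *s)
{
    return s->synchronous_replication && s->max_wal_senders > 0;
}
```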

Along the way, rename the global variable for the synchronous_replication
GUC to match the name of the GUC itself, for clarity.

Report by Fujii Masao, though I didn't use his patch.

b8bb8dbf

Remove obsolete comment. · e397d2ee

Robert Haas authored 14 years ago

In earlier versions of the sync rep patch, waiters removed themselves from
the queue, but now walsender removes them before doing the wakeup.

Report by Fujii Masao.

e397d2ee

Minor sync rep corrections. · 64360987

Robert Haas authored 14 years ago

Fujii Masao, with a bit of additional wordsmithing by me.

64360987

Cleanup copyright years and file names in the header comments of some files. · 2d8de0a5

Itagaki Takahiro authored 14 years ago

2d8de0a5

Mar 07, 2011

Add new files for syncrep missed in previous commit · 966fb05b

Simon Riggs authored 14 years ago

966fb05b

Mar 06, 2011

Efficient transaction-controlled synchronous replication. · a8a8a3e0

Simon Riggs authored 14 years ago

If a standby is broadcasting reply messages and we have named
one or more standbys in synchronous_standby_names then allow
users who set synchronous_replication to wait for commit, which
then provides strict data integrity guarantees. Design avoids
sending and receiving transaction state information so minimises
bookkeeping overheads. We synchronize with the highest priority
standby that is connected and ready to synchronize. Other standbys
can be defined to take over in case of standby failure.
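
Choosing "the highest priority standby that is connected and ready" can be sketched as a simple scan. The struct layout is hypothetical, and the convention that a lower number means higher priority (with 0 meaning "not a candidate") is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one standby as seen by the primary. */
typedef struct ToyStandby
{
    const char *name;
    int         priority;            /* 1 = highest; 0 = not a candidate */
    bool        connected_and_ready;
} ToyStandby;

/*
 * Return the index of the connected, ready standby with the highest
 * priority, or -1 if there is none. If the current choice disconnects,
 * calling this again picks the next standby in priority order.
 */
int toy_choose_sync_standby(const ToyStandby *s, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++)
    {
        if (!s[i].connected_and_ready || s[i].priority <= 0)
            continue;
        if (best < 0 || s[i].priority < s[best].priority)
            best = (int) i;
    }
    return best;
}
```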

This version has very strict behaviour; more relaxed options
may be added at a later date.

Simon Riggs and Fujii Masao, with reviews by Yeb Havinga, Jaime
Casanova, Heikki Linnakangas and Robert Haas, plus the assistance
of many other design reviewers.

a8a8a3e0

Mar 01, 2011

Change pg_last_xlog_receive_location() not to move backwards. That makes · 6eba5a7c

Heikki Linnakangas authored 14 years ago

it a lot more useful for determining which standby is most up-to-date,
for example. There were long discussions on whether overwriting
existing WAL makes sense to begin with, and whether we should do some more
extensive variable renaming, but this change nevertheless seems quite
uncontroversial.
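
A value that never moves backwards can be modelled as a high-water mark. The tracker below is a hypothetical illustration that uses a plain 64-bit position rather than PostgreSQL's WAL location type.

```c
#include <stdint.h>

/* Hypothetical tracker: remembers the furthest position ever received. */
typedef struct ToyRecvTracker
{
    uint64_t high_water;
} ToyRecvTracker;

/*
 * Record a newly received position and return the value to report.
 * If streaming restarts from an earlier position, the reported value
 * stays at the previous maximum instead of going backwards.
 */
uint64_t toy_note_received(ToyRecvTracker *t, uint64_t pos)
{
    if (pos > t->high_water)
        t->high_water = pos;
    return t->high_water;
}
```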

Fujii Masao, reviewed by Jeff Janes, Robert Haas, Stephen Frost.

6eba5a7c

Feb 18, 2011

Separate messages for standby replies and hot standby feedback. · 06828c5f

Simon Riggs authored 14 years ago

Allow messages to be sent at different times, and greatly reduce
the frequency of hot standby feedback. Refactor to allow additional
message types.

06828c5f

Feb 16, 2011

Hot Standby feedback for avoidance of cleanup conflicts on standby. · bca8b7f1

Simon Riggs authored 14 years ago

Standby optionally sends back information about oldestXmin of queries
which is then checked and applied to the WALSender's proc->xmin.
GetOldestXmin() is modified slightly to agree with GetSnapshotData(),
so that all backends on primary include WALSender within their snapshots.
Note this does nothing to change the snapshot xmin on either master or
standby. Feedback piggybacks on the standby reply message.
vacuum_defer_cleanup_age is no longer used on standby, though the parameter
still exists on primary, since some use cases still exist.

Simon Riggs, review comments from Fujii Masao, Heikki Linnakangas, Robert Haas

bca8b7f1

Feb 10, 2011

Send status updates back from standby server to master, indicating how far · b186523f

Heikki Linnakangas authored 14 years ago

the standby has written, flushed, and applied the WAL. At the moment, this
is for informational purposes only; the values are only shown in the
pg_stat_replication system view, but in the future they will also be needed
for synchronous replication.

Extracted from Simon Riggs' synchronous replication patch by Robert Haas, with
some tweaking by me.

b186523f

Jan 30, 2011

Add option to include WAL in base backup · 507069de

Magnus Hagander authored 14 years ago

When included, this makes the base backup a complete working
"clone" of the initial database, ready to have a postmaster
started against it without the need to set up any log archiving
or similar.

Magnus Hagander, reviewed by Fujii Masao and Heikki Linnakangas

507069de

Jan 23, 2011

Make walsender options order-independent · e5487f65

Magnus Hagander authored 14 years ago

While doing this, also move base backup options into
a struct instead of increasing the number of parameters
to multiple functions for each new option.

e5487f65

Add pg_basebackup tool for streaming base backups · 048d148f

Magnus Hagander authored 14 years ago

This tool makes it possible to do the pg_start_backup/
copy files/pg_stop_backup step in a single command.

There are still some steps to be done before this is a
complete backup solution, such as the ability to stream
the required WAL logs, but it's still usable, and
could do with some buildfarm coverage.

In passing, make the checkpoint request optionally
fast instead of hardcoding it.

Magnus Hagander, reviewed by Fujii Masao and Dimitri Fontaine

048d148f

Jan 14, 2011

Use a lexer and grammar for parsing walsender commands · fcd810c6

Magnus Hagander authored 14 years ago

Makes it easier to parse mainly the BASE_BACKUP command
with its options, and avoids having to manually deal
with quoted identifiers in the label (previously broken),
and makes it easier to add new commands and options in
the future.

In passing, refactor the case statement in the walsender
to put each command in its own function.

fcd810c6

Exit from base backups when shutdown is requested · 688423d0

Magnus Hagander authored 14 years ago

When the exit waits until the whole backup completes, it may take
a very long time.

In passing, add back an error check in the main loop so we detect
clients that disconnect much earlier if the backup is large.

688423d0

Jan 11, 2011

Track walsender state in shared memory and expose in pg_stat_replication · 4c8e20f8

Magnus Hagander authored 14 years ago

4c8e20f8

Jan 10, 2011

Backend support for streaming base backups · 0eb59c45

Magnus Hagander authored 14 years ago

Add BASE_BACKUP command to walsender, allowing it to stream a
base backup to the client (in tar format). The syntax is still
far from ideal; that will be fixed in the switch to use a proper
grammar for walsender.

No client included yet, will come as a separate commit.

Magnus Hagander and Heikki Linnakangas

0eb59c45

Jan 07, 2011

New system view pg_stat_replication displays activity of wal sender processes. · a755ea33

Itagaki Takahiro authored 14 years ago

Itagaki Takahiro and Simon Riggs.

a755ea33

Jan 01, 2011

Stamp copyrights for year 2011. · 5d950e3b

Bruce Momjian authored 14 years ago

5d950e3b

Dec 11, 2010

Allow bidirectional copy messages in streaming replication mode. · d3d41469

Robert Haas authored 14 years ago

Fujii Masao. Review by Alvaro Herrera, Tom Lane, and myself.

d3d41469