Commits · 9de3aa65f01fb51cbc725e8508ea233e4e92c46c · Jakob Huber / postgres-lambda-diff

Dec 23, 2010

Rewrite the GiST insertion logic so that we don't need the post-recovery · 9de3aa65

Heikki Linnakangas authored 14 years ago

cleanup stage to finish incomplete inserts or splits anymore. There were two
reasons for the cleanup step:

1. When a new tuple was inserted into a leaf page, the downlink in the parent
needed to be updated to contain (i.e., to be consistent with) the new key.
Updating the parent in turn might require recursively updating the parent of
the parent. We now handle that by updating the parent while traversing down
the tree, so that when we insert the leaf tuple, all the parents are already
consistent with the new key, and the tree is consistent at every step.

2. When a page is split, we need to insert the downlink for the new right
page(s), and update the downlink for the original page to not include keys
that moved to the right page(s). We now handle that by setting a new flag,
F_FOLLOW_RIGHT, on the non-rightmost pages in the split. When that flag is
set, scans always follow the rightlink, regardless of the NSN mechanism used
to detect concurrent page splits. That way the tree is consistent right after
split, even though the downlink is still missing. This is very similar to the
way B-tree splits are handled. When the downlink is inserted in the parent,
the flag is cleared. To keep the insertion algorithm simple, when an
insertion sees an incomplete split, indicated by the F_FOLLOW_RIGHT flag, it
finishes the split before doing anything else.
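
The scan-side rule from point 2 can be illustrated with a toy model. This is a minimal Python sketch, not PostgreSQL's C code: the `Page` class, its fields, and the helpers `must_follow_rightlink` and `scan_chain` are hypothetical names. The NSN comparison against the LSN seen in the parent is an assumption about how the existing mechanism works; the commit message only states that the flag overrides it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    keys: list
    nsn: int = 0                        # sequence number stamped by a split (assumed model)
    follow_right: bool = False          # models the F_FOLLOW_RIGHT flag
    rightlink: Optional["Page"] = None  # right sibling created by a split


def must_follow_rightlink(page, parent_lsn):
    """Decide whether a scan has to continue to the right sibling."""
    if page.rightlink is None:
        return False
    # The flag forces the move right; otherwise fall back to the NSN check.
    return page.follow_right or page.nsn > parent_lsn


def scan_chain(page, parent_lsn):
    """Collect keys from a page plus every right sibling the scan must visit."""
    found = []
    while page is not None:
        found.extend(page.keys)
        page = page.rightlink if must_follow_rightlink(page, parent_lsn) else None
    return found


# A split whose downlink has not been inserted in the parent yet.
right = Page(keys=[3, 4])
left = Page(keys=[1, 2], follow_right=True, rightlink=right)
keys_seen = scan_chain(left, parent_lsn=100)
```

In this model the scan still reaches the keys that moved to the right page although the parent has no downlink for it. Clearing `follow_right`, as happens once the downlink is inserted, leaves only the NSN check in effect.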

These changes allow removing the whole "invalid tuple" mechanism, but I
retained the scan code to still follow invalid tuples correctly. While we
don't create any such tuples anymore, we want to handle them gracefully in
case you pg_upgrade a GiST index that has them. If we encounter any on an
insert, though, we just throw an error saying that you need to REINDEX.

The issue that got me into doing this is that if you did a checkpoint while
an insert or split was in progress, and the checkpoint finishes quickly so
that there is no WAL record related to the insert between RedoRecPtr and the
checkpoint record, recovery from that checkpoint would not know to finish
the incomplete insert. IOW, we have the same issue we solved with the
rm_safe_restartpoint mechanism during normal operation too. It's highly
unlikely to happen in practice, and this fix is far too large to backpatch,
so we're just going to live with it in previous versions, but this refactoring
fixes it going forward.

With this patch, you don't get the annoying
'index "FOO" needs VACUUM or REINDEX to finish crash recovery' notices
anymore if you crash at an unfortunate moment.

9de3aa65

Document that BBU's do not allow partial page writes to be safely turned · 7a1ca897
Bruce Momjian authored 14 years ago
```
off unless they guarantee that all writes to the BBU arrive in 8kB chunks.

Per discussion with Greg Smith
```
7a1ca897

Dec 22, 2010

Typo fix. · 2a0f13a7
Robert Haas authored 14 years ago
```
Noted by Thom Brown.
```
2a0f13a7
Wording improvements for pg_ctl manual page. · 28d5c565
Bruce Momjian authored 14 years ago

28d5c565

Add PQlibVersion() function to libpq · de9a4c27

Magnus Hagander authored 14 years ago

This function is like the PQserverVersion() function except
it returns the version of libpq, making it possible for a client
program or driver to determine which version of libpq is in
use at runtime, and not just at link time.
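
As an illustration of what a client might do with the returned integer, here is a minimal Python sketch. It assumes the value uses the same numeric encoding as PQserverVersion() for releases of this era (for example 90100 for 9.1.0); the helper name `decode_pre10_version` is hypothetical and not part of libpq.

```python
def decode_pre10_version(version_num):
    """Split an integer such as 90100 into (9, 1, 0).

    Assumes the three-part numbering used before PostgreSQL 10.
    """
    first, rest = divmod(version_num, 10000)
    second, minor = divmod(rest, 100)
    return first, second, minor


libpq_version = decode_pre10_version(90100)
```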

Suggested by Harald Armin Massa and several others.

de9a4c27

Dec 20, 2010
- Fix typo · f9e9763b
  Alvaro Herrera authored 14 years ago
  
  Jaime Casanova
  f9e9763b
Dec 19, 2010

Support for collecting crash dumps on Windows · dcb09b59

Magnus Hagander authored 14 years ago

Add support for collecting "minidump" style crash dumps on
Windows, by setting up an exception handling filter. Crash
dumps will be generated in PGDATA/crashdumps if the directory
is created (the existence of the directory is used as an on/off
switch for the generation of the dumps).
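
The directory-as-switch idea can be sketched in a few lines. This is only an illustration in Python, not the Windows exception-filter code from the patch; the function name `crash_dumps_enabled` is hypothetical.

```python
import os


def crash_dumps_enabled(pgdata):
    """Dump collection is considered 'on' only if PGDATA/crashdumps exists."""
    return os.path.isdir(os.path.join(pgdata, "crashdumps"))
```

Creating the directory turns the feature on and removing it turns the feature off, with no configuration setting involved.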

Craig Ringer and Magnus Hagander

dcb09b59

Dec 17, 2010
- Waiting for complete startup is now a well-defined operation. · df142bf8
  Robert Haas authored 14 years ago
  
  Per report from Fujii Masao, and subsequent discussion.
  df142bf8
Dec 16, 2010
- Some copy editing of pg_read_binary_file() patch. · 290f1603
  Robert Haas authored 14 years ago
  
  290f1603
- Document timestamptz a little better. · afc8f47b
  Robert Haas authored 14 years ago
  
  afc8f47b
Dec 15, 2010
- Add pg_read_binary_file() and whole-file-at-once versions of pg_read_file(). · 03db44ea
  Itagaki Takahiro authored 14 years ago
  
  One use of the binary version is to read files in an encoding that differs from the server encoding. Dimitri Fontaine and Itagaki Takahiro.
  03db44ea
- Use "upgrade" in preference over "migrate" in pg_upgrade messages and · 16b5e08d
  Bruce Momjian authored 14 years ago
  
  documentation. (Many were left over from the old pg_migrator naming.)
  16b5e08d
Dec 14, 2010
- Update release notes for releases 9.0.2, 8.4.6, 8.3.13, 8.2.19, and 8.1.23. · f9224c8e
  Tom Lane authored 14 years ago
  
  f9224c8e
Dec 13, 2010
- Remove recently reintroduced CVS keyword · 843a490f
  Peter Eisentraut authored 14 years ago
  
  843a490f
- Document replacement of pg_class.relistemp with relpersistence. · d26849ee
  Robert Haas authored 14 years ago
  
  Noted by Tom Lane.
  d26849ee
Dec 11, 2010
- Allow bidirectional copy messages in streaming replication mode. · d3d41469
  Robert Haas authored 14 years ago
  
  Fujii Masao. Review by Alvaro Herrera, Tom Lane, and myself.
  d3d41469
- Minor documentation cleanup. · 1490946c
  Robert Haas authored 14 years ago
  
  Fujii Masao
  1490946c
Dec 09, 2010

Force default wal_sync_method to be fdatasync on Linux. · 576477e7

Tom Lane authored 14 years ago

Recent versions of the Linux system header files cause xlogdefs.h to
believe that open_datasync should be the default sync method, whereas
formerly fdatasync was the default on Linux.  open_datasync is a bad
choice, first because it doesn't actually outperform fdatasync (in fact
the reverse), and second because we try to use O_DIRECT with it, causing
failures on certain filesystems (e.g., ext4 with data=journal option).
This part of the patch is largely per a proposal from Marti Raudsepp.
More extensive changes are likely to follow in HEAD, but this is as much
change as we want to back-patch.

Also clean up confusing code and incorrect documentation surrounding the
fsync_writethrough option.  Those changes shouldn't result in any actual
behavioral change, but I chose to back-patch them anyway to keep the
branches looking similar in this area.

In 9.0 and HEAD, also do some copy-editing on the WAL Reliability
documentation section.

Back-patch to all supported branches, since any of them might get used
on modern Linux versions.

576477e7

Dec 08, 2010

Optimize commit_siblings in two ways to improve group commit. · e620ee35

Simon Riggs authored 14 years ago

First, avoid scanning the whole ProcArray once we know there
are at least commit_siblings active; second, skip the check
altogether if commit_siblings = 0.
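
Both optimizations can be shown in a small sketch. This is a Python illustration under assumptions, not the ProcArray code: `enough_active_siblings` is a hypothetical name, and the backends are modelled as a plain iterable of booleans.

```python
def enough_active_siblings(backends_active, commit_siblings):
    """Return True once at least `commit_siblings` backends are active."""
    if commit_siblings == 0:
        return True  # second optimization: skip the check altogether
    count = 0
    for active in backends_active:
        if active:
            count += 1
            if count >= commit_siblings:
                return True  # first optimization: stop scanning early
    return False


# Passing an iterator shows that the scan stops before reaching the end.
backends = iter([True, False, True, True, True])
reached = enough_active_siblings(backends, 2)
unscanned = list(backends)
```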

Greg Smith

e620ee35

Dec 04, 2010
- Add KNNGIST support to contrib/pg_trgm. · b525bf77
  Tom Lane authored 14 years ago
  
  Teodor Sigaev, with some revision by Tom
  b525bf77
- Add external documentation for KNNGIST. · b576757d
  Tom Lane authored 14 years ago
  
  b576757d
Dec 03, 2010

Clarify that LOCK TABLE requires a table-level privilege. · c0a4d3e0
Robert Haas authored 14 years ago

c0a4d3e0

Create core infrastructure for KNNGIST. · d583f10b

Tom Lane authored 14 years ago

This is a heavily revised version of builtin_knngist_core-0.9. The
ordering operators are no longer mixed in with actual quals, which would
have confused not only humans but significant parts of the planner.
Instead, ordering operators are carried separately throughout planning and
execution.

Since the API for ambeginscan and amrescan functions had to be changed
anyway, this commit takes the opportunity to rationalize that a bit.
RelationGetIndexScan no longer forces a premature index_rescan call;
instead, callers of index_beginscan must call index_rescan too. Aside from
making the AM-side initialization logic a bit less peculiar, this has the
advantage that we do not make a useless extra amrescan call when there are
runtime key values. AMs formerly could not assume that the key values
passed to amrescan were actually valid; now they can.

Teodor Sigaev and Tom Lane

d583f10b

Nov 29, 2010
- Be consistent about writing "[, ...]" instead of "[,...]" in the docs. · 3c42efce
  Heikki Linnakangas authored 14 years ago
  
  Christoph Berg.
  3c42efce
Nov 27, 2010

Point out in default_tablespace's description that CREATE DATABASE ignores it. · c623365f
Tom Lane authored 14 years ago
```
Per gripe from Andreas Scherbaum.
```
c623365f
New contrib module, auth_delay. · fe7a32fc
Robert Haas authored 14 years ago
```
KaiGai Kohei, with a few changes by me.
```
fe7a32fc
A bit more wordsmithing on the PQping documentation. · d53c1255
Tom Lane authored 14 years ago

d53c1255

Rewrite PQping to be more like what we agreed to last week. · db96e1cc

Tom Lane authored 14 years ago

Basically, we want to distinguish all cases where the connection was
not made from those where it was. A convenient proxy for this is to
see if we got a message with a SQLSTATE code back from the postmaster.
This presumes that the postmaster will always send us a SQLSTATE in
a failure message, which is true for 7.4 and later postmasters in
every case except fork failure. (We could possibly complicate the
postmaster code to do something about that, but it seems not worth
the trouble, especially since pg_ctl's response for that case should
be to keep waiting anyway.)

If we did get a SQLSTATE from the postmaster, there are basically only
two cases, as per last week's discussion: ERRCODE_CANNOT_CONNECT_NOW
and everything else. Any other error code implies that the postmaster
is in principle willing to accept connections; it just didn't like or
couldn't handle this particular request. We want to make a special
case for ERRCODE_CANNOT_CONNECT_NOW so that "pg_ctl start -w" knows
it should keep waiting.
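
The decision rule described above can be written out as a small sketch. This is a Python illustration, not the libpq implementation: `classify_ping` and its string results are hypothetical stand-ins for the real enum constants. The SQLSTATE value 57P03 is the code behind ERRCODE_CANNOT_CONNECT_NOW.

```python
from typing import Optional

CANNOT_CONNECT_NOW = "57P03"  # SQLSTATE for ERRCODE_CANNOT_CONNECT_NOW


def classify_ping(connected: bool, sqlstate: Optional[str]) -> str:
    """Map a connection attempt to a coarse server status."""
    if connected:
        return "accepting"
    if sqlstate is None:
        # No SQLSTATE came back, so no postmaster answered the request.
        return "no_response"
    if sqlstate == CANNOT_CONNECT_NOW:
        # The server is up but not ready; "pg_ctl start -w" should keep waiting.
        return "rejecting"
    # Any other error means the postmaster is willing to accept connections
    # in principle and only disliked this particular request.
    return "accepting"
```

For example, a failed password check still produces a SQLSTATE, so under this rule the server counts as accepting connections.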

In passing, pick names for the enum constants that are a tad less
likely to present collision hazards in future.

db96e1cc

Nov 26, 2010

Add more ALTER <object> .. SET SCHEMA commands. · 55109313

Robert Haas authored 14 years ago

This adds support for changing the schema of a conversion, operator,
operator class, operator family, text search configuration, text search
dictionary, text search parser, or text search template.

Dimitri Fontaine, with assorted corrections and other kibitzing.

55109313

Nov 25, 2010

Add PQping and PQpingParams to libpq to allow detection of the server's · afd7d9ad

Bruce Momjian authored 14 years ago

status, including a status where the server is running but refuses a
postgres connection.

Have pg_ctl use this new function.  This fixes the case where pg_ctl
reports that the server is not running (cannot connect) but in fact it
is running.

afd7d9ad

Document that a CHECKPOINT before taking a file system snapshot can · 7276ab58
Bruce Momjian authored 14 years ago
```
reduce recovery time.
```
7276ab58

Nov 24, 2010

When reporting the server as not responding, if the hostname was · ba11258c

Bruce Momjian authored 14 years ago

supplied, also print the IP address.  This allows IPv4 and IPv6 failures
to be distinguished.  Also useful when a hostname resolves to multiple
IP addresses.

Also, remove use of inet_ntoa() and use our own inet_net_ntop() in all
places, including in libpq, because it is thread-safe.

ba11258c

Create the system catalog infrastructure needed for KNNGIST. · 725d52d0

Tom Lane authored 14 years ago

This commit adds columns amoppurpose and amopsortfamily to pg_amop, and
column amcanorderbyop to pg_am.  For the moment all the entries in
amcanorderbyop are "false", since the underlying support isn't there yet.

Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with
[ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new
columns of pg_amop to be populated, and create pg_dump support for dumping
that information.

I also added some documentation, although it's perhaps a bit premature
given that the feature doesn't do anything useful yet.

Teodor Sigaev, Robert Haas, Tom Lane

725d52d0

Nov 23, 2010

Add index entries for more functions · 4fc09ad0

Peter Eisentraut authored 14 years ago

Also, move index entries into the tables, closer to the function description,
for easier editing in the future.  Re-sort some tables to be more alphabetical.
Remove the entries for count, max, min, and sum in the tutorial area, because
that was felt to be confusing.

Thom Brown

4fc09ad0

Propagate ALTER TYPE operations to typed tables · f2a42783

Peter Eisentraut authored 14 years ago

This adds RESTRICT/CASCADE flags to ALTER TYPE ... ADD/DROP/ALTER/
RENAME ATTRIBUTE to control whether to alter typed tables as well.

f2a42783

Remove useless whitespace at end of lines · fc946c39
Peter Eisentraut authored 14 years ago

fc946c39

Nov 21, 2010

Add new SQL function, format(text). · 75048707

Robert Haas authored 14 years ago

Currently, three conversion format specifiers are supported: %s for a
string, %L for an SQL literal, and %I for an SQL identifier.  The latter
two are deliberately designed not to overlap with what sprintf() already
supports, in case we want to add more of sprintf()'s functionality here
later.
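
To make the three specifiers concrete, here is a rough Python model of their behaviour. It is a simplified sketch, not the server's implementation: `sketch_format`, `quote_literal` and `quote_ident` are local hypothetical helpers, identifiers are always double-quoted here, and backslashes and NULLs are not handled.

```python
import re


def quote_literal(value):
    """Quote as an SQL literal by doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def quote_ident(name):
    """Quote as an SQL identifier by doubling embedded double quotes."""
    return '"' + str(name).replace('"', '""') + '"'


def sketch_format(template, *args):
    """Substitute %s, %L and %I in order; other text is left untouched."""
    remaining = iter(args)
    converters = {"s": str, "L": quote_literal, "I": quote_ident}

    def substitute(match):
        return converters[match.group(1)](next(remaining))

    return re.sub(r"%([sLI])", substitute, template)


query = sketch_format(
    "SELECT %s FROM %I WHERE name = %L", "count(*)", "my table", "O'Reilly"
)
```

The point of %I and %L is that dynamically built SQL gets identifiers and literals quoted by the function rather than by hand-written string concatenation.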

Patch by Pavel Stehule, heavily revised by me.  Reviewed by Jeff Janes
and, in earlier versions, by Itagaki Takahiro and Tom Lane.

75048707

Nov 18, 2010
- Add pg_describe_object function · 6cc2deb8
  Alvaro Herrera authored 14 years ago
  
  This function is useful to obtain textual descriptions of objects as stored in pg_depend.
  6cc2deb8
- Minor corrections to dummy_seclabel documentation. · 1fc2d60d
  Robert Haas authored 14 years ago
  
  Problems noted by Thom Brown.
  1fc2d60d
- Document the dummy_seclabel contrib module. · 45768d10
  Robert Haas authored 14 years ago
  
  KaiGai Kohei, with editing and markup fixes by me.
  45768d10