- Feb 12, 2013
-
-
Alvaro Herrera authored
libpgcommon is a new static library to allow sharing code among the various frontend programs and backend; this lets us eliminate duplicate implementations of common routines. We avoid libpgport, because that's intended as a place for porting issues; per discussion, it seems better to keep them separate.

The first use case, and the only one implemented by this patch, is pg_malloc and friends, which many frontend programs were already using.

At the same time, we can use this to provide palloc emulation functions for the frontend; this way, some palloc-using files in the backend can also be used by the frontend cleanly. To do this, we change palloc() in the backend to be a function instead of a macro on top of MemoryContextAlloc(). This was previously believed to cause a loss of performance, but this implementation has been tweaked by Tom and Andres so that on modern compilers it provides a slight improvement over the previous one. This lets us clean up some places that were already working around this with localized hacks.

Most of the pg_malloc/palloc changes in this patch were authored by Andres Freund. Zoltán Böszörményi also independently provided a form of that. libpgcommon infrastructure was authored by Álvaro.
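For illustration, a minimal sketch of the exit-on-failure allocator pattern that pg_malloc and the palloc emulation give frontend code; the function body here is an assumption for demonstration, not the actual libpgcommon source:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Exit-on-failure allocator in the style of pg_malloc: frontend
 * programs get malloc semantics without checking every call site. */
static void *
pg_malloc_sketch(size_t size)
{
    /* Avoid the unportable behavior of malloc(0). */
    void *ptr = malloc(size ? size : 1);

    if (ptr == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    return ptr;
}

int
main(void)
{
    char *buf = pg_malloc_sketch(32);   /* no NULL check needed here */

    strcpy(buf, "allocated");
    printf("%s\n", buf);
    free(buf);
    return 0;
}
```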
-
Peter Eisentraut authored
-
- Feb 08, 2013
-
-
Tom Lane authored
The previous coding supposed that the first differing bytes in two varlena datums must have the same sign difference as their overall comparison result. This is obviously bogus for text strings in non-C locales, and probably wrong for numeric, and even for bytea I think it was wrong on machines where char is signed. When the assumption failed, the function could deliver a zero or negative penalty in situations where such a result is quite ridiculous, leading the core GiST code to make very bad page-split decisions. To fix, take the absolute values of the byte-level differences. Also, switch the code to using unsigned char not just char, so that the behavior will be consistent whether char is signed or not. Per investigation of a trouble report from Tomas Vondra. Back-patch to all supported branches.
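A sketch of the byte-level fix described above, assuming a simplified penalty helper (the real gbt_var_penalty operates on GiST entries, not raw buffers):

```c
#include <stdio.h>
#include <stdlib.h>

/* Derive a penalty from the first differing byte of two keys.  Using
 * unsigned char and the absolute difference keeps the result
 * non-negative regardless of locale, datatype, or whether plain char
 * is signed on this machine. */
static int
byte_penalty(const unsigned char *a, const unsigned char *b, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        if (a[i] != b[i])
            return abs((int) a[i] - (int) b[i]);    /* always positive */
    }
    return 0;               /* identical prefixes: zero penalty is right */
}

int
main(void)
{
    /* 0x10 vs 0xF0: a signed-char subtraction would go negative. */
    const unsigned char a[] = {0x10, 0x20};
    const unsigned char b[] = {0xF0, 0x20};

    printf("penalty = %d\n", byte_penalty(a, b, 2));
    return 0;
}
```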
-
Tom Lane authored
gbt_var_bin_union() failed to do the right thing when the existing range needed to be widened at both ends rather than just one end. This could result in an invalid index in which keys that are present would not be found by searches, because the searches would not think they need to descend to the relevant leaf pages. This error affected all the varlena datatypes supported by btree_gist (text, bytea, bit, numeric). Per investigation of a trouble report from Tomas Vondra. (There is also an issue in gbt_var_penalty(), but that should only result in inefficiency not wrong answers. I'm committing this separately so that we have a git state in which it can be tested that bad penalty results don't produce invalid indexes.) Back-patch to all supported branches.
-
- Feb 06, 2013
-
-
Alvaro Herrera authored
The wording changes applied in 0ac5ad51 were universally disliked. Per gripe from Andrew Dunstan
-
- Jan 31, 2013
-
-
Alvaro Herrera authored
Per report from digoal@126.com
-
Tatsuo Ishii authored
The new option specifies the length of the aggregation interval (in seconds). It may be used only together with -l. With this option, the log contains a per-interval summary (number of transactions, min/max latency, and two additional fields useful for variance estimation). Patch contributed by Tomas Vondra, reviewed by Pavel Stehule. Slight change by Tatsuo Ishii, suggested by Robert Haas, to emit an error message indicating that the option is not currently supported on Windows.
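A sketch of the per-interval accumulator such a log implies: count, min/max latency, plus sum and sum of squares from which variance can later be derived. Struct and field names here are illustrative, not pgbench's:

```c
#include <stdio.h>

/* Per-interval statistics in the spirit of pgbench's aggregated log. */
typedef struct
{
    long    cnt;        /* number of transactions */
    double  min_lat;    /* minimum latency seen */
    double  max_lat;    /* maximum latency seen */
    double  sum;        /* for the mean */
    double  sum2;       /* for the variance */
} AggInterval;

static void
agg_add(AggInterval *agg, double latency)
{
    if (agg->cnt == 0 || latency < agg->min_lat)
        agg->min_lat = latency;
    if (agg->cnt == 0 || latency > agg->max_lat)
        agg->max_lat = latency;
    agg->cnt++;
    agg->sum += latency;
    agg->sum2 += latency * latency;
}

int
main(void)
{
    AggInterval agg = {0};
    double samples[] = {1.2, 3.4, 2.2};

    for (int i = 0; i < 3; i++)
        agg_add(&agg, samples[i]);

    /* variance = E[x^2] - E[x]^2 */
    double mean = agg.sum / agg.cnt;
    double var = agg.sum2 / agg.cnt - mean * mean;

    printf("n=%ld min=%.1f max=%.1f mean=%.2f var=%.2f\n",
           agg.cnt, agg.min_lat, agg.max_lat, mean, var);
    return 0;
}
```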
-
- Jan 29, 2013
-
-
Heikki Linnakangas authored
Beyond 21474, the number of accounts exceeds the range for int4. Change the initialization code to use bigint for account id columns when the scale is large enough, and switch to using int64 for the variables in pgbench code. The threshold where we switch to bigints is set at 20000, because that's easier to remember and document than 21474, and it ensures that there is some headroom when int4s are used. Greg Smith, with various changes by Euler Taveira de Oliveira, Gurjeet Singh and Satoshi Nagayasu.
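The arithmetic behind the thresholds, as a small sketch (macro names invented for illustration): each scale unit adds 100,000 pgbench_accounts rows, so int4's maximum of 2,147,483,647 holds account ids only up to scale 21474 (21474 × 100,000 = 2,147,400,000; one more scale unit overflows).

```c
#include <stdint.h>
#include <stdio.h>

#define ACCOUNTS_PER_SCALE    100000    /* rows per pgbench scale unit */
#define SCALE_INT4_THRESHOLD  20000     /* the round cutoff the commit picks */

int
main(void)
{
    int64_t scale = 25000;
    int64_t naccounts = scale * ACCOUNTS_PER_SCALE;

    /* 21475 * 100,000 = 2,147,500,000 > INT32_MAX, hence the switch. */
    if (scale > SCALE_INT4_THRESHOLD)
        printf("scale %lld: use bigint columns (%lld accounts)\n",
               (long long) scale, (long long) naccounts);
    else
        printf("scale %lld: int columns are safe\n", (long long) scale);
    return 0;
}
```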
-
- Jan 24, 2013
-
-
Bruce Momjian authored
If the postmaster.pid lock file exists, try starting/stopping the cluster to check if the lock file is valid. Per request from Tom.
-
Alvaro Herrera authored
This makes 9.3 -> 9.3 upgrades work when they cross the commit that added persistent multixacts; early 9.3 pg_controldata did not have the required oldestMultiXact line, and so would fail to upgrade. Per Bruce Momjian
-
Alvaro Herrera authored
-
Bruce Momjian authored
When pg_upgrade can't find required pg_controldata information, report _which_ cluster is failing, with this message: The %s cluster lacks some required control information:
-
- Jan 23, 2013
-
-
Alvaro Herrera authored
This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with the already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch.

The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of the lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas previously they were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This also causes additional WAL-logging: there was previously a single WAL record for a locked tuple, but now there is one for each updated copy of the tuple.

With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old misbehavior whereby a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple, and later aborting, caused the weaker lock to be lost, has been fixed.

Many new spec files were added for the isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time on it. The original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids:
AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
1290721684-sup-3951@alvh.no-ip.org
1294953201-sup-2099@alvh.no-ip.org
1320343602-sup-2290@alvh.no-ip.org
1339690386-sup-8927@alvh.no-ip.org
4FE5FF020200002500048A3D@gw.wicourts.gov
4FEAB90A0200002500048B7D@gw.wicourts.gov
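For reference, the conflict behavior the first paragraph describes can be summarized as a matrix; the sketch below encodes it in plain C for illustration (the real table lives in the server's tuple-locking code, not in this form):

```c
#include <stdbool.h>
#include <stdio.h>

/* The four tuple lock modes after this commit, weakest to strongest. */
typedef enum
{
    LOCK_KEY_SHARE,     /* SELECT FOR KEY SHARE (new) */
    LOCK_SHARE,         /* SELECT FOR SHARE */
    LOCK_NO_KEY_UPDATE, /* SELECT FOR NO KEY UPDATE (new) */
    LOCK_UPDATE         /* SELECT FOR UPDATE */
} TupleLockMode;

/* conflicts[requested][held]: true means the requester must wait. */
static const bool conflicts[4][4] = {
    /*                   KEY_SHARE SHARE  NO_KEY_UPD UPDATE */
    /* KEY_SHARE */      {false,   false, false,     true},
    /* SHARE */          {false,   false, true,      true},
    /* NO_KEY_UPDATE */  {false,   true,  true,      true},
    /* UPDATE */         {true,    true,  true,      true},
};

int
main(void)
{
    /* The key property: the two new modes do not block each other, so
     * FK checks (KEY SHARE) run alongside non-key UPDATEs. */
    printf("KEY SHARE vs NO KEY UPDATE conflict? %s\n",
           conflicts[LOCK_KEY_SHARE][LOCK_NO_KEY_UPDATE] ? "yes" : "no");
    return 0;
}
```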
-
Bruce Momjian authored
With AtEOXact applied, --single-transaction makes pg_restore slower, and has the potential to require lock table configuration, so remove the argument. Per suggestion from Tom.
-
- Jan 18, 2013
-
-
Bruce Momjian authored
If the cluster alignments don't match, output this suggestion: Likely one cluster is a 32-bit install, the other 64-bit
-
- Jan 14, 2013
-
-
Tom Lane authored
In commit 71450d7f, we added code to inform suitably-intelligent compilers that ereport() doesn't return if the elevel is ERROR or higher. This patch extends that to elog(), and also fixes a double-evaluation hazard that the previous commit created in ereport(), as well as reducing the emitted code size.

The elog() improvement requires the compiler to support __VA_ARGS__, which should be available in just about anything nowadays since it's required by C99. But our minimum language baseline is still C89, so add a configure test for that.

The previous commit assumed that ereport's elevel could be evaluated twice, which isn't terribly safe --- there are already counterexamples in xlog.c. On compilers that have __builtin_constant_p, we can use that to protect the second test, since there's no possible optimization gain if the compiler doesn't know the value of elevel. Otherwise, use a local variable inside the macros to prevent double evaluation. The local-variable solution is inferior because (a) it leads to useless code being emitted when elevel isn't constant, and (b) it increases the optimization level needed for the compiler to recognize that subsequent code is unreachable. But it seems better than not teaching non-gcc compilers about unreachability at all.

Lastly, if the compiler has __builtin_unreachable(), we can use that instead of abort(), resulting in a noticeable code savings since no function call is actually emitted. However, it seems wise to do this only in non-assert builds. In an assert build, continue to use abort(), so that the behavior will be predictable and debuggable if the "impossible" happens.

These changes involve making the ereport and elog macros emit do-while statement blocks not just expressions, which forces small changes in a few call sites.

Andres Freund, Tom Lane, Heikki Linnakangas
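A toy model of the technique (GCC/Clang only, and deliberately not the real macros): a do-while block, a local variable against double evaluation, and an unreachability hint guarded by __builtin_constant_p so elevel is only re-tested when that is free and safe:

```c
#include <stdio.h>
#include <stdlib.h>

#define ERROR 20

/* Stand-in for the server's error-reporting tail: at ERROR or above,
 * control never returns (the real code longjmps or aborts). */
static void
errfinish_demo(int elevel)
{
    if (elevel >= ERROR)
        exit(1);
}

/* __builtin_constant_p(elevel) does not evaluate its argument; if
 * elevel isn't a compile-time constant it yields 0 and the &&
 * short-circuits, so elevel is never evaluated twice at runtime. */
#define elog_demo(elevel, ...) \
    do { \
        const int elevel_ = (elevel);       /* single evaluation */ \
        fprintf(stderr, __VA_ARGS__);       /* C99 __VA_ARGS__ */ \
        errfinish_demo(elevel_); \
        if (__builtin_constant_p(elevel) && (elevel) >= ERROR) \
            __builtin_unreachable();        /* no return path here */ \
    } while (0)

int
main(void)
{
    elog_demo(ERROR, "something terrible happened\n");
    return 0;       /* the compiler can see this is dead code */
}
```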
-
- Jan 12, 2013
-
-
Andrew Dunstan authored
This is now used by ecpg tests, and not clobbered by pg_upgrade tests. This change won't affect anything that doesn't set this environment variable, but will enable the buildfarm to control exactly what port regression test installs will be running on, and thus to detect possible rogue postmasters more easily. Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
-
- Jan 09, 2013
-
-
Bruce Momjian authored
This patch implements parallel copying/linking of files by tablespace using the --jobs option in pg_upgrade.
-
- Jan 07, 2013
-
-
Tatsuo Ishii authored
The new option enables quiet logging in initialization mode (-i), producing only one progress message per 5 seconds, along with elapsed time and estimated remaining time. Also add elapsed time and estimated remaining time to the default logging (which prints one message every 100000 rows). Patch contributed by Tomas Vondra, reviewed by Jeevan Chalke and Tatsuo Ishii.
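Such progress messages typically use the simple linear estimate; a sketch, assuming pgbench's exact formula is equivalent (remaining rows load at the observed rate):

```c
#include <stdio.h>

/* Linear remaining-time estimate: elapsed * (total - done) / done. */
static double
estimate_remaining(double elapsed_sec, long rows_done, long rows_total)
{
    if (rows_done <= 0)
        return 0.0;             /* nothing observed yet */
    return elapsed_sec * (double) (rows_total - rows_done) / rows_done;
}

int
main(void)
{
    /* 12 s to load 400,000 of 1,000,000 rows -> about 18 s to go. */
    printf("remaining: %.1f s\n",
           estimate_remaining(12.0, 400000, 1000000));
    return 0;
}
```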
-
- Jan 04, 2013
-
-
Tom Lane authored
On non-Windows machines, we use the Unix socket for connections to test postmasters, so there is no need to create a TCP socket. Furthermore, doing so causes failures due to port conflicts if two builds are carried out concurrently on one machine. (If the builds are done in different chroots, which is standard practice at least in Red Hat distros, there is no risk of conflict on the Unix socket.) Suppressing the TCP socket by setting listen_addresses to empty has long been standard practice for pg_regress, and pg_upgrade knows about this too ... but pg_upgrade's test.sh didn't get the memo. Back-patch to 9.2, and also sync the 9.2 version of the script with HEAD as much as practical.
-
- Jan 03, 2013
-
-
Bruce Momjian authored
Adjust pg_upgrade page conversion functions (which are not used) to return void so transfer_all_new_dbs can return void.
-
- Jan 01, 2013
-
-
Bruce Momjian authored
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
-
- Dec 27, 2012
-
-
Bruce Momjian authored
Add pg_upgrade --jobs, which allows parallel dump/restore of databases, which improves performance.
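On Unix, this kind of --jobs parallelism usually takes the fork-and-reap shape sketched below (illustrative, not pg_upgrade's actual code): keep up to n_jobs children running, each handling one database, reaping one before starting the next.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void
run_jobs(int n_tasks, int n_jobs)
{
    int active = 0;

    for (int i = 0; i < n_tasks; i++)
    {
        pid_t pid;

        if (active >= n_jobs)
        {
            wait(NULL);         /* block until any child finishes */
            active--;
        }

        pid = fork();
        if (pid < 0)
        {
            perror("fork");
            exit(1);
        }
        if (pid == 0)
        {
            printf("pid %d: dump/restore database %d\n", getpid(), i);
            _exit(0);           /* child does one task and exits */
        }
        active++;
    }

    while (active-- > 0)        /* reap the stragglers */
        wait(NULL);
}

int
main(void)
{
    run_jobs(8, 3);             /* 8 databases, 3 parallel jobs */
    return 0;
}
```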
-
- Dec 20, 2012
-
-
Bruce Momjian authored
Because the client encoding might not match the server encoding, pg_upgrade can't allocate NAMEDATALEN bytes for storage of database, relation, and namespace identifiers. Instead pg_strdup() the memory and free it. Also add C comment in initdb.c about safe NAMEDATALEN usage.
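A sketch of the pattern the message refers to: since a name that fits in NAMEDATALEN bytes in the server encoding can grow when re-encoded for the client, duplicate exactly the bytes that arrive, with exit-on-failure semantics (pg_strdup's real body may differ):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* strdup with exit-on-failure: no fixed NAMEDATALEN buffer to overrun. */
static char *
pg_strdup_sketch(const char *s)
{
    char *result = strdup(s);

    if (result == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    return result;
}

int
main(void)
{
    char *relname = pg_strdup_sketch("some_relation_name");

    printf("%s\n", relname);
    free(relname);      /* caller frees, as the message describes */
    return 0;
}
```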
-
Bruce Momjian authored
Add comment stating that constraint and index names must match.
-
- Dec 11, 2012
-
-
Bruce Momjian authored
All versions of pg_upgrade upgraded invalid indexes caused by CREATE INDEX CONCURRENTLY failures and marked them as valid. The patch adds a check to all pg_upgrade versions and throws an error during upgrade or --check. Backpatch to 9.2, 9.1, 9.0. Patch slightly adjusted.
-
Andrew Dunstan authored
Normally each module is tested in a database named contrib_regression, which is dropped and recreated at the beginning of each pg_regress run. This new mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Second attempt at this, this time accommodating make versions older than 3.82. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
-
Bruce Momjian authored
Fix previous commit that added synchronous_commit=off, but broke -O/-o due to missing space in argument passing. Backpatch to 9.2.
-
- Dec 07, 2012
-
-
Bruce Momjian authored
Pg_upgrade displays file names during copy and database names during dump/restore. Andrew Dunstan identified three bugs:

* long file names were being truncated to 60 _leading_ characters, which often do not change for long file names
* file names were truncated to 60 characters in log files
* carriage returns were being output to log files

This commit fixes these --- it prints 60 _trailing_ characters to the status display, and full path names without carriage returns to log files. It also suppresses status output to the log file unless verbose mode is used.
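Trailing truncation is a one-liner; a sketch, with the 60-character width taken from the message (the tail of a path is what distinguishes it, since the leading directories rarely change):

```c
#include <stdio.h>
#include <string.h>

#define MESSAGE_WIDTH 60

/* Return the last MESSAGE_WIDTH characters of a long path. */
static const char *
status_tail(const char *path)
{
    size_t len = strlen(path);

    return (len > MESSAGE_WIDTH) ? path + len - MESSAGE_WIDTH : path;
}

int
main(void)
{
    const char *p =
        "/very/long/old/cluster/base/16384/some_particularly_long_relation_filenode.1";

    printf("copying: %s\n", status_tail(p));
    return 0;
}
```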
-
- Dec 06, 2012
-
-
Alvaro Herrera authored
Background workers are postmaster subprocesses that run arbitrary user-specified code. They can request shared memory access as well as backend database connections; or they can just use plain libpq frontend database connections.

Modules listed in shared_preload_libraries can register background workers in their _PG_init() function; this is early enough that it's not necessary to provide an extra GUC option, because the necessary extra resources can be allocated early on. Modules can install more than one bgworker, if necessary.

Care is taken that these extra processes do not interfere with other postmaster tasks: only one such process is started on each ServerLoop iteration. This means a large number of them could be waiting to be started up while the postmaster is still able to quickly service external connection requests. Also, the shutdown sequence should not be impacted by a worker process that's reasonably well behaved (i.e. promptly responds to termination signals).

The current implementation lets worker processes specify their start time, i.e. at what point in the server startup process they are to be started: right after postmaster start (in which case they mustn't ask for shared memory access), when consistent state has been reached (useful during recovery in a HOT standby server), or when recovery has terminated (i.e. when normal backends are allowed).

In case of a bgworker crash, actions to take depend on registration data: if shared memory was requested, then all other connections are taken down (as well as other bgworkers), just as if it were a regular backend crashing. The bgworker itself is restarted, too, within a configurable timeframe (which can be configured to be never).

More features to add to this framework can be imagined without much effort, and have been discussed, but this seems good enough as a useful unit already. An elementary sample module is supplied.

Author: Álvaro Herrera

This patch is loosely based on prior patches submitted by KaiGai Kohei, and unsubmitted code by Simon Riggs.

Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund, Heikki Linnakangas, Simon Riggs, Amit Kapila
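A sketch of registering a worker from _PG_init(), as described above. The bgworker API's field names and signatures changed between this commit and later releases, so everything below should be checked against the target version's bgworker.h; the worker body is a hypothetical idle loop, and the file must be built as a server module:

```c
#include "postgres.h"

#include "miscadmin.h"
#include "postmaster/bgworker.h"

PG_MODULE_MAGIC;

void _PG_init(void);

/* Hypothetical worker body: real workers set up signal handling
 * before entering their loop, then do actual work. */
static void
sketch_worker_main(void *main_arg)
{
    for (;;)
        pg_usleep(1000000L);    /* placeholder for useful work */
}

void
_PG_init(void)
{
    BackgroundWorker worker;

    memset(&worker, 0, sizeof(worker));
    worker.bgw_name = "sketch worker";
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS;   /* wants shared memory */
    worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker.bgw_restart_time = 60;               /* restart 60s after a crash */
    worker.bgw_main = sketch_worker_main;
    worker.bgw_main_arg = NULL;

    RegisterBackgroundWorker(&worker);
}
```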
-
- Dec 05, 2012
-
-
Heikki Linnakangas authored
Fujii Masao, reviewed by Kyotaro Horiguchi.
-
- Dec 04, 2012
-
-
Bruce Momjian authored
report is clearer.
-
Bruce Momjian authored
executed.
-
Bruce Momjian authored
Add an initdb --sync-only option to sync the data directory to durable storage. Have pg_upgrade use it, and enable server options fsync=off and full_page_writes=off. Document that users turning fsync from off to on should run initdb --sync-only. [ Previous commit was incorrectly applied as a git merge. ]
-
Bruce Momjian authored
-
Bruce Momjian authored
-
Bruce Momjian authored
-
- Dec 03, 2012
-
-
Andrew Dunstan authored
This reverts commit e2b3c21b.
-
- Dec 02, 2012
-
-
Andrew Dunstan authored
Normally each module is tested in a database named contrib_regression, which is dropped and recreated at the beginning of each pg_regress run. This mode, enabled by adding USE_MODULE_DB=1 to the make command line, runs most modules in a database with the module name embedded in it. This will make testing pg_upgrade on clusters with the contrib modules a lot easier. Still to be done: adapt to the MSVC build system. Backpatch to 9.0, which is the earliest version it is reasonably possible to test upgrading from.
-
- Dec 01, 2012
-
-
Bruce Momjian authored
-