- May 08, 2017
-
-
Tom Lane authored
Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8, not of the type of the column the statistics are for. This bug is at least partly the fault of sloppy specification comments for get_attstatsslot()/free_attstatsslot(): the type OID they want is that of the stavalues entries, not of the underlying column. (I double-checked other callers and they seem to get this right.) Adjust the comments to be more correct. Per buildfarm. Security: CVE-2017-7484
-
- Aug 08, 2016
-
-
Tom Lane authored
If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <16143.1470350371@sss.pgh.pa.us>
-
- Jan 06, 2015
-
-
Bruce Momjian authored
Backpatch certain files through 9.0
-
- May 06, 2014
-
-
Bruce Momjian authored
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
-
- Jan 07, 2014
-
-
Bruce Momjian authored
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
-
- May 29, 2013
-
-
Bruce Momjian authored
This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
-
- Mar 14, 2013
-
-
Heikki Linnakangas authored
The estimates are based on the existing lower bound histogram, and a new histogram of range lengths. Bump catversion, because the range length histogram now needs to be present in statistic slot kind 6, or you get an error on @> and <@ queries. (A re-ANALYZE would be enough to fix that, though) Alexander Korotkov, with some refactoring by me.
-
- Jan 01, 2013
-
-
Bruce Momjian authored
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
-
- Aug 27, 2012
-
-
Heikki Linnakangas authored
This enables selectivity estimation of the <<, >>, &<, &> and && operators, as well as the normal inequality operators: <, <=, >=, >. "range @> element" is also supported, but the range-variant @> and <@ operators are not, because they cannot be sensibly estimated with lower and upper bound histograms alone. We would need to make some assumption about the lengths of the ranges for that. Alexander's patch included a separate histogram of lengths for that, but I left that out of the patch for simplicity. Hopefully that will be added as a followup patch. The fraction of empty ranges is also calculated and used in estimation. Alexander Korotkov, heavily modified by me.
-
- Jun 25, 2012
-
-
Peter Eisentraut authored
The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.
-
- Jun 10, 2012
-
-
Bruce Momjian authored
commit-fest.
-
- Mar 04, 2012
-
-
Tom Lane authored
This patch improves selectivity estimation for the array <@, &&, and @> (containment and overlaps) operators. It enables collection of statistics about individual array element values by ANALYZE, and introduces operator-specific estimators that use these stats. In addition, ScalarArrayOpExpr constructs of the forms "const = ANY/ALL (array_column)" and "const <> ANY/ALL (array_column)" are estimated by treating them as variants of the containment operators. Since we still collect scalar-style stats about the array values as a whole, the pg_stats view is expanded to show both these stats and the array-style stats in separate columns. This creates an incompatible change in how stats for tsvector columns are displayed in pg_stats: the stats about lexemes are now displayed in the array-related columns instead of the original scalar-related columns. There are a few loose ends here, notably that it'd be nice to be able to suppress either the scalar-style stats or the array-element stats for columns for which they're not useful. But the patch is in good enough shape to commit for wider testing. Alexander Korotkov, reviewed by Noah Misch and Nathan Boley
-
- Jan 27, 2012
-
-
Peter Eisentraut authored
Those fields only appear in the structs so that genbki.pl can create the BKI bootstrap files for the catalogs. But they are not actually usable from C. So hiding them can prevent coding mistakes, saves stack space, and can help the compiler. In certain catalogs, the first variable-length field has been kept visible after manual inspection. These exceptions are noted in C comments. reviewed by Tom Lane
-
- Jan 02, 2012
-
-
Bruce Momjian authored
-
- Feb 18, 2011
-
-
Tom Lane authored
ts_typanalyze.c computes MCE statistics as fractions of the non-null rows, which seems fairly reasonable, and anyway changing it in released versions wouldn't be a good idea. But then ts_selfuncs.c has to account for that. Failure to do so results in overestimates in columns with a significant fraction of null documents. Back-patch to 8.4 where this stuff was introduced. Jesper Krogh
-
- Jan 01, 2011
-
-
Bruce Momjian authored
-
- Sep 20, 2010
-
-
Magnus Hagander authored
-
- Jan 05, 2010
-
-
Tom Lane authored
pg_attribute, by having genbki.pl derive the information from the various catalog header files. This greatly simplifies modification of the "bootstrapped" catalogs. This patch finally kills genbki.sh and Gen_fmgrtab.sh; we now rely entirely on Perl scripts for those build steps. To avoid creating a Perl build dependency where there was not one before, the output files generated by these scripts are now treated as distprep targets, ie, they will be built and shipped in tarballs. But you will need a reasonably modern Perl (probably at least 5.6) if you want to build from a CVS pull. The changes to the MSVC build process are untested, and may well break --- we'll soon find out from the buildfarm. John Naylor, based on ideas from Robert Haas and others
-
- Jan 02, 2010
-
-
Bruce Momjian authored
-
- Dec 29, 2009
-
-
Tom Lane authored
and teach ANALYZE to compute such stats for tables that have subclasses. Per my proposal of yesterday. autovacuum still needs to be taught about running ANALYZE on parent tables when their subclasses change, but the feature is useful even without that.
-
- Jun 11, 2009
-
-
Bruce Momjian authored
provided by Andrew.
-
- Jan 01, 2009
-
-
Bruce Momjian authored
-
- Sep 19, 2008
-
-
Tom Lane authored
Jan Urbanski
-
- Jul 14, 2008
-
-
Tom Lane authored
on the most common individual lexemes in place of the mostly-useless default behavior of counting duplicate tsvectors. Future work: create selectivity estimation functions that actually do something with these stats. (Some other things we ought to look at doing: using the Lossy Counting algorithm in compute_minimal_stats, and using the element-counting idea for stats on regular arrays.) Jan Urbanski
-
- Mar 27, 2008
-
-
Tom Lane authored
inclusions in src/include/catalog/*.h files. The main idea here is to push function declarations for src/backend/catalog/*.c files into separate headers, rather than sticking them into the corresponding catalog definition file as has been done in the past. This commit only carries out that idea fully for pg_proc, pg_type and pg_conversion, but that's enough for the moment --- if pg_list.h ever becomes unsafe for frontend code to include, we'll need to work a bit more. Zdenek Kotala
-
- Jan 01, 2008
-
-
Bruce Momjian authored
-
- May 08, 2007
-
-
Tom Lane authored
datatype project. Per request from Ale Raza (araza at esri.com).
-
- Jan 05, 2007
-
-
Bruce Momjian authored
back-stamped for this.
-
- Mar 05, 2006
-
-
Bruce Momjian authored
-
- Oct 15, 2005
-
-
Bruce Momjian authored
-
- Apr 14, 2005
-
-
Tom Lane authored
indexes. Extend the macros in include/catalog/*.h to carry the info about hand-assigned OIDs, and adjust the genbki script and bootstrap code to make the relations actually get those OIDs. Remove the small number of RelOid_pg_foo macros that we had in favor of a complete set named like the catname.h and indexing.h macros. Next phase will get rid of internal use of names for looking up catalogs and indexes; but this completes the changes forcing an initdb, so it looks like a good place to commit. Along the way, I made the shared relations (pg_database etc) not be 'bootstrap' relations any more, so as to reduce the number of hardwired entries and simplify changing those relations in future. I'm not sure whether they ever really needed to be handled as bootstrap relations, but it seems to work fine to not do so now.
-
- Dec 31, 2004
-
-
PostgreSQL Daemon authored
Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
-
- Aug 29, 2004
-
-
Bruce Momjian authored
-
Bruce Momjian authored
-
- Feb 24, 2004
-
-
Tom Lane authored
-
- Feb 13, 2004
-
-
Tom Lane authored
coding by Mark Cave-Ayland, some kibitzing by Tom Lane. initdb forced due to new column in pg_type.
-
- Nov 29, 2003
-
-
PostgreSQL Daemon authored
make sure the $Id tags are converted to $PostgreSQL as well ...
-
- Aug 04, 2003
-
-
Bruce Momjian authored
-
Bruce Momjian authored
-
- Mar 23, 2003
-
-
Tom Lane authored
them as arrays of the internal datatype. This requires treating the stavalues columns as 'anyarray' rather than 'text[]', which is not 100% kosher but seems to work fine for the purposes we need for pg_statistic. Perhaps in the future 'anyarray' will be allowed more generally.
-