diff --git a/doc/src/sgml/indexcost.sgml b/doc/src/sgml/indexcost.sgml
new file mode 100644
index 0000000000000000000000000000000000000000..4f9702cf7c94302fa163d50426487cd2f9fb678d
--- /dev/null
+++ b/doc/src/sgml/indexcost.sgml
@@ -0,0 +1,236 @@
+ <chapter>
+ <title>Index Cost Estimation Functions</title>
+
+ <note>
+ <title>Author</title>
+
+ <para>
+ Written by <ulink url="mailto:tgl@sss.pgh.pa.us">Tom Lane</ulink>
+ on 2000-01-24.
+ </para>
+ </note>
+
+<!--
+I have written the attached bit of doco about the new index cost
+estimator procedure definition, but I am not sure where to put it.
+There isn't (AFAICT) any existing documentation about how to make
+a new kind of index, which would be the proper place for it.
+May I impose on you to find/make a place for this and mark it up
+properly?
+
+Also, doc/src/graphics/catalogs.ag needs to be updated, but I have
+no idea how. (The amopselect and amopnpages fields of pg_amop
+are gone; pg_am has a new field amcostestimate.)
+
+ regards, tom lane
+-->
+
+ <para>
+ Every index access method must provide a cost estimation function for
+ use by the planner/optimizer. The procedure OID of this function is
+ given in the <literal>amcostestimate</literal> field of the access
+ method's <literal>pg_am</literal> entry.
+
+ <note>
+ <para>
+ Prior to Postgres 7.0, a different scheme was used for registering
+ index-specific cost estimation functions.
+ </para>
+ </note>
+ </para>
+
+ <para>
+ The amcostestimate function is given a list of WHERE clauses that have
+ been determined to be usable with the index. It must return estimates
+ of the cost of accessing the index and the selectivity of the WHERE
+ clauses (that is, the fraction of main-table tuples that will be
+ retrieved during the index scan).
+ For simple cases, nearly all the
+ work of the cost estimator can be done by calling standard routines
+ in the optimizer; the point of having an amcostestimate function is
+ to allow index access methods to provide index-type-specific knowledge,
+ in case it is possible to improve on the standard estimates.
+ </para>
+
+ <para>
+ Each amcostestimate function must have the signature:
+
+ <programlisting>
+void
+amcostestimate (Query *root,
+                RelOptInfo *rel,
+                IndexOptInfo *index,
+                List *indexQuals,
+                Cost *indexAccessCost,
+                Selectivity *indexSelectivity);
+ </programlisting>
+
+ The first four parameters are inputs:
+
+ <variablelist>
+ <varlistentry>
+ <term>root</term>
+ <listitem>
+ <para>
+ The query being processed.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>rel</term>
+ <listitem>
+ <para>
+ The relation the index is on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>index</term>
+ <listitem>
+ <para>
+ The index itself.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>indexQuals</term>
+ <listitem>
+ <para>
+ List of index qual clauses (implicitly ANDed);
+ a NIL list indicates no qualifiers are available.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ The last two parameters are pass-by-reference outputs:
+
+ <variablelist>
+ <varlistentry>
+ <term>*indexAccessCost</term>
+ <listitem>
+ <para>
+ Set to the cost of index processing.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>*indexSelectivity</term>
+ <listitem>
+ <para>
+ Set to the index selectivity.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ Note that cost estimate functions must be written in C, not in SQL or
+ any available procedural language, because they must access internal
+ data structures of the planner/optimizer.
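+
+ As a rough illustration of this calling convention, the following C sketch
+ fills in both outputs using the generic formulas described later in this
+ chapter. It is not taken from the PostgreSQL sources: the struct
+ definitions are simplified stand-ins so that the fragment compiles on its
+ own, the name sketch_costestimate is hypothetical, the pages and tuples
+ fields of IndexOptInfo and the value of cpu_index_page_weight are
+ assumptions, and clauselist_selec() is replaced by a stub.
+

```c
#include <stddef.h>

/* Simplified stand-ins so the sketch compiles on its own. The real
 * definitions live in the planner headers and are more elaborate. */
typedef double Cost;
typedef double Selectivity;
typedef struct Query { int unused; } Query;
typedef struct RelOptInfo { int unused; } RelOptInfo;

typedef struct IndexOptInfo
{
    double pages;   /* assumed field: index size in pages */
    double tuples;  /* assumed field: index size in tuples */
} IndexOptInfo;

typedef struct List
{
    /* hypothetical: a precomputed value standing in for real qual clauses */
    Selectivity assumed_selectivity;
} List;

/* Stand-in for the user-adjustable optimizer parameter; the value here is
 * arbitrary and chosen only to keep the arithmetic easy to follow. */
static double cpu_index_page_weight = 0.5;

/* Stub replacing the planner's clauselist_selec(). Assumption: a NIL
 * (NULL) qual list restricts nothing, so the selectivity is 1.0. */
static Selectivity
clauselist_selec(Query *root, List *indexQuals)
{
    (void) root;
    return indexQuals ? indexQuals->assumed_selectivity : 1.0;
}

/* Hypothetical estimator: four inputs, two pass-by-reference outputs. */
void
sketch_costestimate(Query *root,
                    RelOptInfo *rel,
                    IndexOptInfo *index,
                    List *indexQuals,
                    Cost *indexAccessCost,
                    Selectivity *indexSelectivity)
{
    double numIndexTuples;
    double numIndexPages;

    (void) rel;     /* not needed by this generic sketch */

    /* Fraction of main-table tuples expected to be retrieved. */
    *indexSelectivity = clauselist_selec(root, indexQuals);

    /* Simplest assumption: index tuples and pages scale with selectivity. */
    numIndexTuples = *indexSelectivity * index->tuples;
    numIndexPages = *indexSelectivity * index->pages;

    /* One unit per page fetched plus a CPU charge per index tuple. */
    *indexAccessCost = numIndexPages + cpu_index_page_weight * numIndexTuples;
}
```

+
+ With a 40-page, 1000-tuple index, a stubbed selectivity of 0.25, and a
+ weight of 0.5, the sketch returns an access cost of
+ 10 + 0.5 * 250 = 135. A real estimator would include the planner headers
+ instead of declaring its own types.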
+ </para>
+
+ <para>
+ The indexAccessCost should be computed in the units used by
+ src/backend/optimizer/path/costsize.c: a disk block fetch has cost 1.0,
+ and the cost of processing one index tuple should usually be taken as
+ cpu_index_page_weight (which is a user-adjustable optimizer parameter).
+ The access cost should include all disk and CPU costs associated with
+ scanning the index itself, but NOT the cost of retrieving or processing
+ the main-table tuples that are identified by the index.
+ </para>
+
+ <para>
+ The indexSelectivity should be set to the estimated fraction of the main
+ table tuples that will be retrieved during the index scan. In the case
+ of a lossy index, this will typically be higher than the fraction of
+ tuples that actually pass the given qual conditions.
+ </para>
+
+ <procedure>
+ <title>Cost Estimation</title>
+ <para>
+ A typical cost estimator will proceed as follows:
+ </para>
+
+ <step>
+ <para>
+ Estimate and return the fraction of main-table tuples that will be visited
+ based on the given qual conditions. In the absence of any index-type-specific
+ knowledge, use the standard optimizer function clauselist_selec():
+
+ <programlisting>
+*indexSelectivity = clauselist_selec(root, indexQuals);
+ </programlisting>
+ </para>
+ </step>
+
+ <step>
+ <para>
+ Estimate the number of index tuples that will be visited during the
+ scan. For many index types this is the same as indexSelectivity times
+ the number of tuples in the index, but it might be more. (Note that the
+ index's size in pages and tuples is available from the IndexOptInfo struct.)
+ </para>
+ </step>
+
+ <step>
+ <para>
+ Estimate the number of index pages that will be retrieved during the scan.
+ This might be just indexSelectivity times the index's size in pages.
+ </para>
+ </step>
+
+ <step>
+ <para>
+ Compute the index access cost as
+
+ <programlisting>
+*indexAccessCost = numIndexPages + cpu_index_page_weight * numIndexTuples;
+ </programlisting>
+ </para>
+ </step>
+ </procedure>
+
+ <para>
+ Examples of cost estimator functions can be found in
+ <filename>src/backend/utils/adt/selfuncs.c</filename>.
+ </para>
+
+ <para>
+ By convention, the <literal>pg_proc</literal> entry for an
+ <literal>amcostestimate</literal> function should show
+
+ <programlisting>
+prorettype = 0
+pronargs = 6
+proargtypes = 0 0 0 0 0 0
+ </programlisting>
+
+ We use zero ("opaque") for all the arguments since none of them have types
+ that are known in pg_type.
+ </para>
+ </chapter>
+
+<!-- Keep this comment at the end of the file
+Local variables:
+mode:sgml
+sgml-omittag:nil
+sgml-shorttag:t
+sgml-minimize-attributes:nil
+sgml-always-quote-attributes:t
+sgml-indent-step:1
+sgml-indent-data:t
+sgml-parent-document:nil
+sgml-default-dtd-file:"./reference.ced"
+sgml-exposed-tags:nil
+sgml-local-catalogs:("/usr/lib/sgml/CATALOG")
+sgml-local-ecat-files:nil
+End:
+-->