From 19dd2fbf7e5784b7d70d2910cb7366e0578ac7a7 Mon Sep 17 00:00:00 2001
From: Bruce Momjian <bruce@momjian.us>
Date: Mon, 4 Sep 2006 20:10:53 +0000
Subject: [PATCH] Add GIN documentation.

Christopher Kings-Lynne
---
 doc/src/sgml/filelist.sgml |   3 +-
 doc/src/sgml/gin.sgml      | 135 +++++++++++++++++++++++++++++++++++++
 doc/src/sgml/xindex.sgml   |  37 +++++++++-
 3 files changed, 173 insertions(+), 2 deletions(-)
 create mode 100644 doc/src/sgml/gin.sgml

diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 3b1cd740057..a5c5f12f10b 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/filelist.sgml,v 1.44 2005/09/12 22:11:38 neilc Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/filelist.sgml,v 1.45 2006/09/04 20:10:53 momjian Exp $ -->
 
 <!entity history SYSTEM "history.sgml">
 <!entity info SYSTEM "info.sgml">
@@ -78,6 +78,7 @@
 <!entity catalogs SYSTEM "catalogs.sgml">
 <!entity geqo SYSTEM "geqo.sgml">
 <!entity gist SYSTEM "gist.sgml">
+<!entity gin SYSTEM "gin.sgml">
 <!entity planstats SYSTEM "planstats.sgml">
 <!entity indexam SYSTEM "indexam.sgml">
 <!entity nls SYSTEM "nls.sgml">
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
new file mode 100644
index 00000000000..4420fcd0ab9
--- /dev/null
+++ b/doc/src/sgml/gin.sgml
@@ -0,0 +1,135 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 1.1 2006/09/04 20:10:53 momjian Exp $ -->
+
+<chapter id="GIN">
+<title>GIN Indexes</title>
+
+ <indexterm>
+ <primary>index</primary>
+ <secondary>GIN</secondary>
+ </indexterm>
+
+<sect1 id="gin-intro">
+ <title>Introduction</title>
+
+ <para>
+ <acronym>GIN</acronym> stands for Generalized Inverted Index. It is
+ an index structure storing a set of (key, posting list) pairs, where
+ 'posting list' is a set of documents in which the key occurs.
+ </para>
+
+ <para>
+ It is generalized in the sense that a <acronym>GIN</acronym> index
+ does not need to be aware of the operation that it accelerates.
+ Instead, it uses custom strategies defined for particular data types.
+ </para>
+
+ <para>
+ One advantage of <acronym>GIN</acronym> is that it allows the development
+ of custom data types with the appropriate access methods, by
+ an expert in the domain of the data type, rather than a database expert.
+ This is much the same advantage as using <acronym>GiST</acronym>.
+ </para>
+
+ <para>
+ The <acronym>GIN</acronym>
+ implementation in <productname>PostgreSQL</productname> is primarily
+ maintained by Teodor Sigaev and Oleg Bartunov, and there is more
+ information on their
+ <ulink url="http://www.sai.msu.su/~megera/oddmuse/index.cgi/Gin">website</ulink>.
+ </para>
+
+</sect1>
+
+<sect1 id="gin-extensibility">
+ <title>Extensibility</title>
+
+ <para>
+ The <acronym>GIN</acronym> interface has a high level of abstraction,
+ requiring the access method implementer to only implement the semantics of
+ the data type being accessed. The <acronym>GIN</acronym> layer itself
+ takes care of concurrency, logging and searching the tree structure.
+ </para>
+
+ <para>
+ All it takes to get a <acronym>GIN</acronym> access method working
+ is to implement four user-defined methods, which define the behavior of
+ keys in the tree. In short, <acronym>GIN</acronym> combines extensibility
+ along with generality, code reuse, and a clean interface.
+ </para>
+
+</sect1>
+
+<sect1 id="gin-implementation">
+ <title>Implementation</title>
+
+ <para>
+ There are four methods that an index operator class for
+ <acronym>GIN</acronym> must provide:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term>compare</term>
+ <listitem>
+ <para>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>extract value</term>
+ <listitem>
+ <para>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>extract query</term>
+ <listitem>
+ <para>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>consistent</term>
+ <listitem>
+ <para>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+</sect1>
+
+<sect1 id="gin-examples">
+ <title>Examples</title>
+
+ <para>
+ The <productname>PostgreSQL</productname> source distribution includes
+ <acronym>GIN</acronym> classes for one-dimensional arrays of all internal
+ types. The following
+ <filename>contrib</> modules also contain <acronym>GIN</acronym>
+ operator classes:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term>intarray</term>
+ <listitem>
+ <para>Enhanced support for int4[]</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>tsearch2</term>
+ <listitem>
+ <para>Support for inverted text indexing. This is much faster for very
+ large, mostly-static sets of documents.
+ </para>
+ </listitem>
+ </varlistentry>
+
+</chapter>
diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml
index 9c52202ea04..35e5137eaec 100644
--- a/doc/src/sgml/xindex.sgml
+++ b/doc/src/sgml/xindex.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.43 2006/03/10 19:10:49 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.44 2006/09/04 20:10:53 momjian Exp $ -->
 
 <sect1 id="xindex">
 <title>Interfacing Extensions To Indexes</title>
@@ -380,6 +380,41 @@
 </tgroup>
 </table>
 
+ <para>
+ GIN indexes require four support functions,
+ shown in <xref linkend="xindex-gin-support-table">.
+ </para>
+
+ <table tocentry="1" id="xindex-gin-support-table">
+ <title>GIN Support Functions</title>
+ <tgroup cols="2">
+ <thead>
+ <row>
+ <entry>Function</entry>
+ <entry>Support Number</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>compare</entry>
+ <entry>1</entry>
+ </row>
+ <row>
+ <entry>extract value</entry>
+ <entry>2</entry>
+ </row>
+ <row>
+ <entry>extract query</entry>
+ <entry>3</entry>
+ </row>
+ <row>
+ <entry>consistent</entry>
+ <entry>4</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
 <para>
 Unlike strategy operators, support functions return whichever data
 type the particular index method expects; for example in the case
-- 
GitLab