index.html

<HTML>
<HEAD>
<TITLE>How PostgreSQL Processes a Query</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#FF0000" VLINK="#A00000" ALINK="#0000FF">
<H1 ALIGN=CENTER>
How PostgreSQL Processes a Query
</H1>
<H2 ALIGN=CENTER>
by Bruce Momjian
</H2>
<P>
A query comes to the backend via data packets arriving through TCP/IP
or Unix Domain sockets.   It is loaded into a string, and passed to
the
<A HREF="../../backend/parser">parser,</A> where the lexical scanner,
<A HREF="../../backend/parser/scan.l">scan.l,</A>
breaks the query up into tokens(words).  The parser
uses
<A HREF="../../backend/parser/gram.y">gram.y</A> and the tokens to
identify the query type, and load the proper query-specific structure,
like <A HREF="../../include/nodes/parsenodes.h">CreateStmt</A> or <A
HREF="../../include/nodes/parsenodes.h">SelectStmt.</A>
<P>
The query is then identified as a <I>Utility</I> query or a more complex
query.  A <I>Utility</I> query is processed by a query-specific function
in <A HREF="../../backend/commands"> commands.</A> A complex query, like
<CODE>SELECT, UPDATE,</CODE> and
<CODE>DELETE</CODE> requires much more handling.
<P>
The parser takes a complex query, and creates a
<A HREF="../../include/nodes/parsenodes.h">Query</A> structure that
contains all the elements used by complex queries.  Query.qual holds the
<CODE>WHERE</CODE> clause qualification, which is filled in by
<A HREF="../../backend/parser/parse_clause.c">transformWhereClause().</A>
Each table referenced in the query is represented by a <A
HREF="../../include/nodes/parsenodes.h"> RangeTableEntry,</A> and they
are linked together to form the <I>range table</I> of the query, which is
generated by <A HREF="../../backend/parser/parse_clause.c">
makeRangeTable().</A>  Query.rtable holds the query's range table.
<P>
Certain queries, like <CODE>SELECT,</CODE> return columns of data.  Other
queries, like <CODE>INSERT</CODE> and <CODE>UPDATE,</CODE> specify the columns
modified by the query.  These column references are converted to <A
HREF="../../include/nodes/primnodes.h">Resdom</A> entries, which are
linked together to make up the <I>target list</I> of the query. The
target list is stored in Query.targetList, which is generated by
<A HREF="../../backend/parser/parse_target.c">transformTargetList().</A>
<P>
Other query elements, like aggregates(<CODE>SUM()</CODE>), <CODE>GROUP BY,</CODE>
and <CODE>ORDER BY</CODE> are also stored in their own Query fields.
<P>
The next step is for the Query to be modified by any <CODE>VIEWS</CODE> or
<CODE>RULES</CODE> that may apply to the query.  This is performed by the <A
HREF="../../backend/rewrite">rewrite</A> system.
<P>
The <A HREF="../../backend/optimizer">optimizer</A> takes the Query
structure and generates an optimal <A
HREF="../..//include/nodes/plannodes.h">Plan,</A> which contains the
operations to be performed to execute the query.  The <A
HREF="../../backend/optimizer/path">path</A> module determines the best
table join order and join type of each table in the RangeTable, using
Query.qual(<CODE>WHERE</CODE> clause) to consider optimal index usage.
<P>
The Plan is then passed to the <A
HREF="../../backend/executor">executor</A> for execution, and the result
returned to the client.
<P>
There are many other modules that support this basic functionality.
They can be accessed by clicking on the flowchart.
<P>
<HR>
<P>
<CENTER>
<EM><BIG>
Click on an item to see more detail or
<A HREF="backend_dirs.html">click</A> to see the full index.
</BIG></EM>
<BR>
<BR>
<IMG src="flow.jpg" usemap="#flowmap" alt="flowchart">
</CENTER>
<MAP name="flowmap">
<AREA COORDS="290,10,450,50" HREF="backend_dirs.html#main">
<AREA COORDS="550,10,710,50" HREF="backend_dirs.html#bootstrap">
<AREA COORDS="290,90,450,130," HREF="backend_dirs.html#postmaster">
<AREA COORDS="550,90,710,130," HREF="backend_dirs.html#libpq">
<AREA COORDS="290,170,450,210" HREF="backend_dirs.html#tcop">
<AREA COORDS="550,170,710,210" HREF="backend_dirs.html#tcop">
<AREA COORDS="290,270,450,310" HREF="backend_dirs.html#parser">
<AREA COORDS="290,350,450,390" HREF="backend_dirs.html#tcop">
<AREA COORDS="290,430,450,470" HREF="backend_dirs.html#optimizer">
<AREA COORDS="290,510,450,550" HREF="backend_dirs.html#optimizer/plan">
<AREA COORDS="290,570,450,630" HREF="backend_dirs.html#executor">
<AREA COORDS="550,350,710,390" HREF="backend_dirs.html#commands">
<AREA COORDS="10,330,170,370" HREF="backend_dirs.html#access">
<AREA COORDS="10,390,170,430" HREF="backend_dirs.html#catalog">
<AREA COORDS="10,450,170,490" HREF="backend_dirs.html#utils">
<AREA COORDS="10,510,170,550" HREF="backend_dirs.html#nodes">
<AREA COORDS="10,570,170,610" HREF="backend_dirs.html#storage">
</MAP>
<BR>
<P>
<HR>
<P>
Another area of interest is the shared memory area, which contains data
accessable to all backends.  It has table recently used data/index
blocks, locks, backend information, and lookup tables for these
structures:
<UL> 
<LI>ShmemIndex - lookup shared memory addresses using structure names
<LI><A HREF="../../include/storage/buf_internals.h">Buffer
Descriptor</A> - control header for buffer cache block
<LI><A HREF="../../include/storage/buf_internals.h">Buffer Block</A> -
data/index buffer cache block
<LI>Shared Buffer Lookup Table - lookup of buffer cache block addresses using
table name and block number(<A HREF="../../include/storage/buf_internals.h">
BufferTag</A>)
<LI>MultiLevelLockTable (ctl) - control structure for
each locking method.  Currently, only multi-level locking is used(<A
HREF="../../include/storage/lock.h">LOCKMETHODCTL</A>).
<LI>MultiLevelLockTable (lock hash) - the <A
HREF="../../include/storage/lock.h">LOCK</A> structure, looked up using
relation, database object ids(<A
HREF="../../include/storage/lock.h">LOCKTAG)</A>.  The lock table structure contains the
lock modes(read/write or shared/exclusive) and circular linked list of backends (<A
HREF="../../include/storage/proc.h">PROC</A> structure pointers) waiting
on the lock.
<LI>MultiLevelLockTable (xid hash) - lookup of LOCK structure address
using transaction id, LOCK address.  It is used to quickly check if the
current transaction already has any locks on a table, rather than having
to search through all the held locks.  It also stores the modes
(read/write) of the locks held by the current transaction.  The returned
<A HREF="../../include/storage/lock.h">XIDLookupEnt</A> structure also
contains a pointer to the backend's PROC.lockQueue.
<LI><A HREF="../../include/storage/proc.h">Proc Header</A> - information
about each backend, including locks held/waiting, indexed by process id
</UL>
Each data structure is created by calling <A
HREF="../../backend/storage/ipc/shmem.c">ShmemInitStruct(),</A> and
the lookups are created by
<A HREF="../../backend/storage/ipc/shmem.c">ShmemInitHash().</A>
<P>
<HR SIZE="2" NOSHADE>
<SMALL>
<ADDRESS>
Maintainer:	Bruce Momjian (<A
HREF="mailto:maillist@candle.pha.pa.us">maillist@candle.pha.pa.us</A>)<BR>
Last updated:		Tue Dec  9 17:56:08 EST 1997
</ADDRESS>
</SMALL>
</BODY>
</HTML>