Skip to content
Snippets Groups Projects
Select Git revision
  • benchmark-tools
  • postgres-lambda
  • master default
  • REL9_4_25
  • REL9_5_20
  • REL9_6_16
  • REL_10_11
  • REL_11_6
  • REL_12_1
  • REL_12_0
  • REL_12_RC1
  • REL_12_BETA4
  • REL9_4_24
  • REL9_5_19
  • REL9_6_15
  • REL_10_10
  • REL_11_5
  • REL_12_BETA3
  • REL9_4_23
  • REL9_5_18
  • REL9_6_14
  • REL_10_9
  • REL_11_4
23 results

gist

  • Clone with SSH
  • Clone with HTTPS
  • user avatar
    Bruce Momjian authored
    rows were removed from the heap by the VACUUM.
    
    Simon Riggs
    bf324946
    History
    $PostgreSQL: pgsql/src/backend/access/gist/README,v 1.3 2005/09/16 14:40:54 teodor Exp $
    
    This directory contains an implementation of GiST indexing for Postgres.
    
    GiST stands for Generalized Search Tree. It was introduced in the seminal paper
    "Generalized Search Trees for Database Systems", 1995, Joseph M. Hellerstein,
    Jeffrey F. Naughton, Avi Pfeffer:
    
        http://www.sai.msu.su/~megera/postgres/gist/papers/gist.ps
    
    and implemented by J. Hellerstein and P. Aoki in an early version of
    PostgreSQL (more details are available from The GiST Indexing Project
    at Berkeley at http://gist.cs.berkeley.edu/). As a "university"
    project it had a limited number of features and was in rare use.
    
    The current implementation of GiST supports:
    
      * Variable length keys
      * Composite keys (multi-key)
      * provides NULL-safe interface to GiST core
      * Concurrency
      * Recovery support via WAL logging
    
    The support for concurrency implemented in PostgreSQL was developed based on 
    the paper "Access Methods for Next-Generation Database Systems" by 
    Marcel Kornaker:
    
        http://www.sai.msu.su/~megera/postgres/gist/papers/concurrency/access-methods-for-next-generation.pdf.gz
    
    The original algorithms were modified in several ways:
    
    * They should be adapted to PostgreSQL conventions. For example, the SEARCH 
      algorithm was considerably changed, because in PostgreSQL function search 
      should return one tuple (next), not all tuples at once. Also, it should 
      release page locks between calls.
    * Since we added support for variable length keys, it's not possible to 
      guarantee enough free space for all keys on pages after splitting. User 
      defined function picksplit doesn't have information about size of tuples 
      (each tuple may contain several keys as in multicolumn index while picksplit
      could work with only one key) and pages.
    * We modified original INSERT algorithm for performance reason. In particular,
      it is now a single-pass algorithm.
    * Since the papers were theoretical, some details were omitted and we
      have to find out ourself how to solve some specific problems.
    
    Because of the above reasons, we have to revised interaction of GiST
    core and PostgreSQL WAL system. Moreover, we encountered (and solved)
    a problem of uncompleted insertions when recovering after crash, which
    was not touched in the paper.
    
    SEARCH ALGORITHM
    
    Function gettuple finds a tuple which satisfies the search
    predicate. It store their state and returns next tuple under
    subsequent calls. Stack contains page, its LSN and LSN of parent page
    and currentposition is saved between calls.
    
    gettuple(search-pred)
    	if ( firsttime )
    		push(stack, [root, 0, 0]) // page, LSN, parentLSN
    		currentposition=0
    	end
    	ptr = top of stack
    	while(true)
    		latch( ptr->page, S-mode )
    		if ( ptr->page->lsn != ptr->lsn ) 
    			ptr->lsn = ptr->page->lsn
    			currentposition=0
    			if ( ptr->parentlsn < ptr->page->nsn )
    				add to stack rightlink
    		else
    			currentposition++
    		end
    
    		while(true)
    			currentposition = find_first_match( currentposition )
    			if ( currentposition is invalid )
    				unlatch( ptr->page )
    				pop stack
    				ptr = top of stack
    				if (ptr is NULL)
    					return NULL
    				break loop
    			else if ( ptr->page is leaf )
    				unlatch( ptr->page )
    				return tuple
    			else 
    				add to stack child page
    			end
    			currentposition++
    		end
    	end
    
    
    INSERT ALGORITHM
    
    INSERT guarantees that the GiST tree remains balanced. User defined key method 
    Penalty is used for choosing a subtree to insert; method PickSplit is used for 
    the node splitting algorithm; method Union is used for propagating changes 
    upward to maintain the tree properties.
    
    NOTICE: We modified original INSERT algorithm for performance reason. In 
    particularly, it is now a single-pass algorithm.
    
    Function findLeaf is used to identify subtree for insertion. Page, in which 
    insertion is proceeded, is locked as well as its parent page. Functions 
    findParent and findPath are used to find parent pages, which could be changed 
    because of concurrent access. Function pageSplit is reccurrent and could split 
    page by more than 2 pages, which could be necessary if keys have different 
    lengths or more than one key are inserted (in such situation, user defined 
    function pickSplit cannot guarantee free space on page).
    
    findLeaf(new-key)
    	push(stack, [root, 0]) //page, LSN
    	while(true)
    		ptr = top of stack
    		latch( ptr->page, S-mode )
    		ptr->lsn = ptr->page->lsn
    		if ( exists ptr->parent AND ptr->parent->lsn < ptr->page->nsn )
    			unlatch( ptr->page )
    			pop stack
    		else if ( ptr->page is not leaf )
    			push( stack, [get_best_child(ptr->page, new-key), 0] )
    			unlatch( ptr->page )
    		else
    			unlatch( ptr->page )
    			latch( ptr->page, X-mode )
    			if ( ptr->page is not leaf )
    				//the only root page can become a non-leaf
    				unlatch( ptr->page )
    			else if ( ptr->parent->lsn < ptr->page->nsn )
    				unlatch( ptr->page )
    				pop stack
    			else
    				return stack
    			end
    		end
    	end
    
    findPath( stack item )
    	push stack, [root, 0, 0] // page, LSN, parent 
    	while( stack )
    		ptr = top of stack
    		latch( ptr->page, S-mode )
    		if ( ptr->parent->page->lsn < ptr->page->nsn )
    			push stack, [ ptr->page->rightlink, 0, ptr->parent ]
    		end
    		for( each tuple on page )
    			if ( tuple->pagepointer == item->page )
    				return stack	
    			else
    				add to stack at the end [tuple->pagepointer,0, ptr]
    			end
    		end
    		unlatch( ptr->page )
    		pop stack
    	end
    	
    findParent( stack item )
    	parent = item->parent
    	latch( parent->page, X-mode )
    	if ( parent->page->lsn != parent->lsn )
    		while(true) 
    			search parent tuple on parent->page, if found the return
    			rightlink = parent->page->rightlink
    			unlatch( parent->page )
    			if ( rightlink is incorrect )
    				break loop
    			end
    			parent->page = rightlink
    			latch( parent->page, X-mode )
    		end
    		newstack = findPath( item->parent )
    		replace part of stack to new one
    		return findParent( item )
    	end
    
    pageSplit(page, allkeys)
    	(lkeys, rkeys) = pickSplit( allkeys )
    	if ( page is root )
    		lpage = new page
    	else
    		lpage = page
    	rpage = new page
    	if ( no space left on rpage )
    		newkeys = pageSplit( rpage, rkeys )
    	else
    		push newkeys, union(rkeys)
    	end
    	if ( no space left on lpage )
    		push newkeys, pageSplit( lpage, lkeys )
    	else
    		push newkeys, union(lkeys)
    	end
    	return newkeys
    
    
    placetopage(page, keysarray)
    	if ( no space left on page )
    		keysarray = pageSplit(page, [ extract_keys(page), keysarray])
    		last page in chain gets old NSN,
    		original and others - new NSN equals to LSN
    		if ( page is root )
    			make new root with keysarray
    		end
    	else
    		put keysarray on page
    		if ( length of keysarray > 1 )
    			keysarray = [ union(keysarray) ]
    		end
    	end
    	
    insert(new-key)
    	stack = findLeaf(new-key)
    	keysarray = [new-key]
    	ptr = top of stack
    	while(true)
    		findParent( ptr ) //findParent latches parent page
    		keysarray = placetopage(ptr->page, keysarray)
    		unlatch( ptr->page )
    		pop stack;
    		ptr = top of stack
    		if (length of keysarray == 1)
    			newboundingkey = union(oldboundingkey, keysarray)
    			if (newboundingkey == oldboundingkey)
    				unlatch ptr->page
    				break loop
    			end
    		end
    	end
    
    Authors:
    	Teodor Sigaev	<teodor@sigaev.ru>
    	Oleg Bartunov   <oleg@sai.msu.su>