Skip to content
Snippets Groups Projects
lobj.sgml 18.50 KiB
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.27 2002/04/18 14:28:14 momjian Exp $
-->

 <chapter id="largeObjects">
  <title id="largeObjects-title">Large Objects</title>

  <indexterm zone="largeobjects"><primary>large object</></>
  <indexterm><primary>BLOB</><see>large object</></>

  <sect1 id="lo-intro">
   <title>Introduction</title>

   <para>
    In <productname>PostgreSQL</productname> releases prior to 7.1,
    the size of any row in the database could not exceed the size of a
    data page.  Since the size of a data page is 8192 bytes (the
    default, which can be raised up to 32768), the upper limit on the
    size of a data value was relatively low. To support the storage of
    larger atomic values, <productname>PostgreSQL</productname>
    provided and continues to provide a large object interface.  This
    interface provides file-oriented access to user data that has been
    declared to be a large object.
   </para>

   <para>
    <productname>POSTGRES 4.2</productname>, the indirect predecessor
    of <productname>PostgreSQL</productname>, supported three standard
    implementations of large objects: as files external to the
    <productname>POSTGRES</productname> server, as external files
    managed by the <productname>POSTGRES</productname> server, and as
    data stored within the <productname>POSTGRES</productname>
    database. This caused considerable confusion among users. As a
    result, only support for large objects as data stored within the
    database is retained in <productname>PostgreSQL</productname>.
    Even though this is slower to access, it provides stricter data
    integrity.  For historical reasons, this storage scheme is
    referred to as <firstterm>Inversion large
    objects</firstterm>. (You will see the term Inversion used
    occasionally to mean the same thing as large object.)  Since
    <productname>PostgreSQL 7.1</productname>, all large objects are
    placed in one system table called
    <classname>pg_largeobject</classname>.
   </para>

   <para>
    <indexterm><primary>TOAST</></>
    <indexterm><primary>sliced bread</><see>TOAST</></indexterm>
    <productname>PostgreSQL 7.1</productname> introduced a mechanism
    (nicknamed <quote><acronym>TOAST</acronym></quote>) that allows
    data rows to be much larger than individual data pages.  This
    makes the large object interface partially obsolete.  One
    remaining advantage of the large object interface is that it
    allows random access to the data, i.e., the ability to read or
    write small chunks of a large value.  It is planned to equip
    <acronym>TOAST</acronym> with such functionality in the future.
   </para>

   <para>
    This section describes the implementation and the programming and
    query language interfaces to <productname>PostgreSQL</productname>
    large object data.  We use the <application>libpq</application> C
    library for the examples in this section, but most programming
    interfaces native to <productname>PostgreSQL</productname> support
    equivalent functionality.  Other interfaces may use the large
    object interface internally to provide generic support for large
    values.  This is not described here.
   </para>

  </sect1>

  <sect1 id="lo-implementation">
   <title>Implementation Features</title>

   <para>
    The large object implementation breaks  large
    objects  up  into  <quote>chunks</quote>  and  stores  the chunks in
    tuples in the database.  A B-tree index guarantees fast
    searches for the correct chunk number when doing random
    access reads and writes.
   </para>
  </sect1>

  <sect1 id="lo-interfaces">
   <title>Interfaces</title>

   <para>
    The  facilities  <productname>PostgreSQL</productname> provides  to
    access large objects,  both  in  the backend as part of user-defined
    functions or the front end as part  of  an  application
    using  the   interface, are described below. For users
    familiar with <productname>POSTGRES 4.2</productname>,
    <productname>PostgreSQL</productname> has a new set of
    functions  providing  a  more  coherent  interface.

    <note>
     <para>
      All large object manipulation <emphasis>must</emphasis> take
      place within an SQL transaction. This requirement is strictly
      enforced as of <productname>PostgreSQL 6.5</>, though it has been an
      implicit requirement in previous versions, resulting in
      misbehavior if ignored.
     </para>
    </note>
   </para>

   <para>
    The  <productname>PostgreSQL</productname>  large  object interface is modeled after
    the <acronym>Unix</acronym>  file-system  interface,  with  analogues  of
    <function>open(2)</function>,  <function>read(2)</function>,
    <function>write(2)</function>,
    <function>lseek(2)</function>, etc.  User 
    functions call these routines to retrieve only the data  of
    interest  from a large object.  For example, if a large
    object type called <type>mugshot</type>  existed  that  stored  
    photographs  of  faces, then a function called <function>beard</function> could
    be declared on <type>mugshot</type> data.  <function>beard</> could look  at  the
    lower third of a photograph, and determine the color of
    the beard that appeared  there,  if  any.   The  entire
    large-object value need not be buffered, or even 
    examined, by the <function>beard</function> function.
    Large objects may be accessed from dynamically-loaded <acronym>C</acronym>
    functions  or  database  client  programs that link the
    library.  <productname>PostgreSQL</productname> provides a set of routines that 
    support opening, reading, writing, closing, and seeking on
    large objects.
   </para>

   <sect2>
    <title>Creating a Large Object</title>

    <para>
     The routine
<synopsis>
Oid lo_creat(PGconn *<replaceable class="parameter">conn</replaceable>, int <replaceable class="parameter">mode</replaceable>)
</synopsis>
     creates a new large  object.  
     <replaceable class="parameter">mode</replaceable>  is  a  bit mask
     describing  several  different  attributes  of  the new
     object.  The symbolic constants listed here are defined
     in the header file <filename>libpq/libpq-fs.h</filename>.
     The access type (read, write, or both) is controlled by
     or'ing together the bits <symbol>INV_READ</symbol>  and
     <symbol>INV_WRITE</symbol>.  The low-order sixteen bits of the mask have
     historically been used at Berkeley to designate the storage  manager  number on which the large object
     should reside.  These
     bits should always be zero now.
     The commands below create a large object:
<programlisting>
inv_oid = lo_creat(INV_READ|INV_WRITE);
</programlisting>
    </para>
   </sect2>

   <sect2>
    <title>Importing a Large Object</title>

    <para>
     To import an operating system file as a large object, call
<synopsis>
Oid lo_import(PGconn *<replaceable class="parameter">conn</replaceable>, const char *<replaceable class="parameter">filename</replaceable>)
</synopsis>
    <replaceable class="parameter">filename</replaceable> 
     specifies the operating system name of
     the file to be imported as a large object.
    </para>
   </sect2>

   <sect2>
    <title>Exporting a Large Object</title>

    <para>
     To export a large object
     into an operating system file, call
<synopsis>
int lo_export(PGconn *<replaceable class="parameter">conn</replaceable>, Oid <replaceable class="parameter">lobjId</replaceable>, const char *<replaceable class="parameter">filename</replaceable>)
</synopsis>
     The <parameter>lobjId</parameter> argument specifies  the  OID  of  the  large
     object  to  export  and the <parameter>filename</parameter> argument specifies
     the operating system name name of the file.
    </para>
   </sect2>

   <sect2>
    <title>Opening an Existing Large Object</title>

    <para>
     To open an existing large object, call
<synopsis>
int lo_open(PGconn *conn, Oid lobjId, int mode)
</synopsis>
     The <parameter>lobjId</parameter> argument specifies  the  OID  of  the  large
     object  to  open.   The  <parameter>mode</parameter>  bits control whether the
     object is opened  for  reading  (<symbol>INV_READ</>),  writing (<symbol>INV_WRITE</symbol>),  or
     both.
     A  large  object cannot be opened before it is created.
     <function>lo_open</function> returns a large object descriptor
     for later use in <function>lo_read</function>, <function>lo_write</function>,
     <function>lo_lseek</function>, <function>lo_tell</function>, and
     <function>lo_close</function>.
</para>
</sect2>

<sect2>
<title>Writing Data to a Large Object</title>

<para>
     The routine
<programlisting>
int lo_write(PGconn *conn, int fd, const char *buf, size_t len)
</programlisting>
     writes <parameter>len</parameter> bytes from <parameter>buf</parameter> to large object <parameter>fd</>.   The <parameter>fd</parameter>
     argument must have been returned by a previous <function>lo_open</function>.
     The number of bytes actually written is  returned.   In
     the event of an error, the return value is negative.
</para>
</sect2>

<sect2>
<title>Reading Data from a Large Object</title>

<para>
     The routine
<programlisting>
int lo_read(PGconn *conn, int fd, char *buf, size_t len)
</programlisting>
     reads <parameter>len</parameter> bytes from large object <parameter>fd</parameter> into <parameter>buf</parameter>. The  <parameter>fd</parameter>
     argument must have been returned by a previous <function>lo_open</function>.
     The number of bytes actually read is returned. In
     the event of an error, the return value is negative.
</para>
</sect2>

<sect2>
<title>Seeking on a Large Object</title>

<para>
     To change the current read or write location on a large
     object, call
<programlisting>
int lo_lseek(PGconn *conn, int fd, int offset, int whence)
</programlisting>
     This routine moves the current location pointer for the
     large object described by <parameter>fd</> to the new location specified 
     by <parameter>offset</>.  The valid values for <parameter>whence</> are
     <symbol>SEEK_SET</>, <symbol>SEEK_CUR</>, and <symbol>SEEK_END</>.
</para>
</sect2>

<sect2>
<title>Closing a Large Object Descriptor</title>

<para>
     A large object may be closed by calling
<programlisting>
int lo_close(PGconn *conn, int fd)
</programlisting>
     where  <parameter>fd</>  is  a  large  object  descriptor returned by
     <function>lo_open</function>.  On success, <function>lo_close</function>
      returns zero.  On error, the return value is negative.
</para>
</sect2>

   <sect2>
    <title>Removing a Large Object</title>

    <para>
     To remove a large object from the database, call
<synopsis>
int lo_unlink(PGconn *<replaceable class="parameter">conn</replaceable>, Oid lobjId)
</synopsis>
     The <parameter>lobjId</parameter> argument specifies  the  OID  of  the  large
     object  to  remove.  In the event of an error, the return value is negative.
    </para>
   </sect2>


</sect1>

<sect1 id="lo-funcs">
<title>Server-side Built-in Functions</title>

<para>
     There  are two built-in registered functions, <function>lo_import</function>
     and <function>lo_export</function> which  are  convenient  for  use
    in  <acronym>SQL</acronym>
     queries.
     Here is an example of their use
<programlisting>
CREATE TABLE image (
    name            text,
    raster          oid
);

INSERT INTO image (name, raster)
    VALUES ('beautiful image', lo_import('/etc/motd'));

SELECT lo_export(image.raster, '/tmp/motd') FROM image
    WHERE name = 'beautiful image';
</programlisting>
</para>
</sect1>

<sect1 id="lo-libpq">
<title>Accessing Large Objects from <application>Libpq</application></title>

<para>
     <xref linkend="lo-example"> is a sample program which shows how the large object  
     interface
     in  <application>libpq</>  can  be used.  Parts of the program are 
     commented out but are left in the source for  the  reader's
     benefit.  This program can be found in
     <filename>src/test/examples/testlo.c</filename> in the source distribution.
     Frontend applications which use the large object interface  
     in  <application>libpq</application>  should   include   the   header   file
     <filename>libpq/libpq-fs.h</filename> and link with the <application>libpq</application> library.
</para>

  <example id="lo-example">
   <title>Large Objects with <application>Libpq</application> Example Program</title>
<programlisting>
/*--------------------------------------------------------------
 *
 * testlo.c--
 *    test using large objects with libpq
 *
 * Copyright (c) 1994, Regents of the University of California
 *
 *--------------------------------------------------------------
 */
#include &lt;stdio.h&gt;
#include &quot;libpq-fe.h&quot;
#include &quot;libpq/libpq-fs.h&quot;

#define BUFSIZE          1024

/*
 * importFile
 *    import file &quot;in_filename&quot; into database as large object &quot;lobjOid&quot;
 *
 */
Oid
importFile(PGconn *conn, char *filename)
{
    Oid         lobjId;
    int         lobj_fd;
    char        buf[BUFSIZE];
    int         nbytes,
                tmp;
    int         fd;

    /*
     * open the file to be read in
     */
    fd = open(filename, O_RDONLY, 0666);
    if (fd &lt; 0)
    {                           /* error */
        fprintf(stderr, &quot;can't open unix file %s\n&quot;, filename);
    }

    /*
     * create the large object
     */
    lobjId = lo_creat(conn, INV_READ | INV_WRITE);
    if (lobjId == 0)
        fprintf(stderr, &quot;can't create large object\n&quot;);

    lobj_fd = lo_open(conn, lobjId, INV_WRITE);

    /*
     * read in from the Unix file and write to the inversion file
     */
    while ((nbytes = read(fd, buf, BUFSIZE)) &gt; 0)
    {
        tmp = lo_write(conn, lobj_fd, buf, nbytes);
        if (tmp &lt; nbytes)
            fprintf(stderr, &quot;error while reading large object\n&quot;);
    }

    (void) close(fd);
    (void) lo_close(conn, lobj_fd);

    return lobjId;
}

void
pickout(PGconn *conn, Oid lobjId, int start, int len)
{
    int         lobj_fd;
    char       *buf;
    int         nbytes;
    int         nread;

    lobj_fd = lo_open(conn, lobjId, INV_READ);
    if (lobj_fd &lt; 0)
    {
        fprintf(stderr, &quot;can't open large object %d\n&quot;,
                lobjId);
    }

    lo_lseek(conn, lobj_fd, start, SEEK_SET);
    buf = malloc(len + 1);

    nread = 0;
    while (len - nread &gt; 0)
    {
        nbytes = lo_read(conn, lobj_fd, buf, len - nread);
        buf[nbytes] = ' ';
        fprintf(stderr, &quot;&gt;&gt;&gt; %s&quot;, buf);
        nread += nbytes;
    }
    free(buf);
    fprintf(stderr, &quot;\n&quot;);
    lo_close(conn, lobj_fd);
}

void
overwrite(PGconn *conn, Oid lobjId, int start, int len)
{
    int         lobj_fd;
    char       *buf;
    int         nbytes;
    int         nwritten;
    int         i;

    lobj_fd = lo_open(conn, lobjId, INV_READ);
    if (lobj_fd &lt; 0)
    {
        fprintf(stderr, &quot;can't open large object %d\n&quot;,
                lobjId);
    }

    lo_lseek(conn, lobj_fd, start, SEEK_SET);
    buf = malloc(len + 1);

    for (i = 0; i &lt; len; i++)
        buf[i] = 'X';
    buf[i] = ' ';

    nwritten = 0;
    while (len - nwritten &gt; 0)
    {
        nbytes = lo_write(conn, lobj_fd, buf + nwritten, len - nwritten);
        nwritten += nbytes;
    }
    free(buf);
    fprintf(stderr, &quot;\n&quot;);
    lo_close(conn, lobj_fd);
}

/*
 * exportFile *    export large object &quot;lobjOid&quot; to file &quot;out_filename&quot;
 *
 */
void
exportFile(PGconn *conn, Oid lobjId, char *filename)
{
    int         lobj_fd;
    char        buf[BUFSIZE];
    int         nbytes,
                tmp;
    int         fd;

    /*
     * create an inversion &quot;object&quot;
     */
    lobj_fd = lo_open(conn, lobjId, INV_READ);
    if (lobj_fd &lt; 0)
    {
        fprintf(stderr, &quot;can't open large object %d\n&quot;,
                lobjId);
    }

    /*
     * open the file to be written to
     */
    fd = open(filename, O_CREAT | O_WRONLY, 0666);
    if (fd &lt; 0)
    {                           /* error */
        fprintf(stderr, &quot;can't open unix file %s\n&quot;,
                filename);
    }

    /*
     * read in from the Unix file and write to the inversion file
     */
    while ((nbytes = lo_read(conn, lobj_fd, buf, BUFSIZE)) &gt; 0)
    {
        tmp = write(fd, buf, nbytes);
        if (tmp &lt; nbytes)
        {
            fprintf(stderr, &quot;error while writing %s\n&quot;,
                    filename);
        }
    }

    (void) lo_close(conn, lobj_fd);
    (void) close(fd);

    return;
}

void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

int
main(int argc, char **argv)
{
    char       *in_filename,
               *out_filename;
    char       *database;
    Oid         lobjOid;
    PGconn     *conn;
    PGresult   *res;

    if (argc != 4)
    {
        fprintf(stderr, &quot;Usage: %s database_name in_filename out_filename\n&quot;,
                argv[0]);
        exit(1);
    }

    database = argv[1];
    in_filename = argv[2];
    out_filename = argv[3];

    /*
     * set up the connection
     */
    conn = PQsetdb(NULL, NULL, NULL, NULL, database);

    /* check to see that the backend connection was successfully made */
    if (PQstatus(conn) == CONNECTION_BAD)
    {
        fprintf(stderr, &quot;Connection to database '%s' failed.\n&quot;, database);
        fprintf(stderr, &quot;%s&quot;, PQerrorMessage(conn));
        exit_nicely(conn);
    }

    res = PQexec(conn, &quot;begin&quot;);
    PQclear(res);

    printf(&quot;importing file %s\n&quot;, in_filename);
/*  lobjOid = importFile(conn, in_filename); */
    lobjOid = lo_import(conn, in_filename);
/*
    printf(&quot;as large object %d.\n&quot;, lobjOid);

    printf(&quot;picking out bytes 1000-2000 of the large object\n&quot;);
    pickout(conn, lobjOid, 1000, 1000);

    printf(&quot;overwriting bytes 1000-2000 of the large object with X's\n&quot;);
    overwrite(conn, lobjOid, 1000, 1000);
*/

    printf(&quot;exporting large object to file %s\n&quot;, out_filename);
/*    exportFile(conn, lobjOid, out_filename); */
    lo_export(conn, lobjOid, out_filename);

    res = PQexec(conn, &quot;end&quot;);
    PQclear(res);
    PQfinish(conn);
    exit(0);
}
</programlisting>
</example>

</sect1>
</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->