Commits · 0c2c061eb0436e696c40832272167273b04d2b73 · Jakob Huber / postgres-lambda-diff

Nov 16, 2007
- Cleanup for new else/comment handling. · 0c2c061e
  Bruce Momjian authored 17 years ago
  
  0c2c061e
- Fix pgindent to properly handle 'else' and single-line comments on the · 7d4c99b4
  Bruce Momjian authored 17 years ago
  
  same line; previous fix was only partial. Re-run pgindent on files that need it.
  7d4c99b4
Mar 25, 2007

Add new encoding EUC_JIS_2004 and SHIFT_JIS_2004, · 75c6519f

Tatsuo Ishii authored 18 years ago

along with new conversions among EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8.
Catalog version has been bumped.

75c6519f

Jan 05, 2007
- Update CVS HEAD for 2007 copyright. Back branches are typically not · 29dccf5f
  Bruce Momjian authored 18 years ago
  
  back-stamped for this.
  29dccf5f
Oct 04, 2006
- pgindent run for 8.2. · f99a569a
  Bruce Momjian authored 18 years ago
  
  f99a569a
May 21, 2006

Change the backend to reject strings containing invalidly-encoded multibyte · c61a2f58

Tom Lane authored 18 years ago

characters in all cases. Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release). Embedded
zero (null) bytes will be rejected as well. The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated. Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.
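
The kind of input check described above can be sketched for one encoding. The following is a minimal, hypothetical UTF-8 validator, not the backend's actual verifier: the function name `utf8_input_is_valid` and its exact rules are assumptions made for illustration. It rejects embedded zero bytes, truncated or malformed sequences, overlong forms, surrogates, and values above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch (not PostgreSQL's verifier): return true only if
 * buf[0..len) is well-formed UTF-8 and contains no embedded zero bytes.
 */
static bool
utf8_input_is_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = buf[i];
        size_t        need;     /* continuation bytes expected after c */
        unsigned long cp;       /* code point decoded so far */

        if (c == 0x00)
            return false;       /* embedded zero (null) byte */
        else if (c < 0x80)
        {
            need = 0;
            cp = c;
        }
        else if ((c & 0xE0) == 0xC0)
        {
            need = 1;
            cp = c & 0x1F;
        }
        else if ((c & 0xF0) == 0xE0)
        {
            need = 2;
            cp = c & 0x0F;
        }
        else if ((c & 0xF8) == 0xF0)
        {
            need = 3;
            cp = c & 0x07;
        }
        else
            return false;       /* stray continuation byte or invalid lead */

        if (need > len - i - 1)
            return false;       /* sequence runs past the end of the input */

        for (size_t k = 1; k <= need; k++)
        {
            unsigned char cc = buf[i + k];

            if ((cc & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        /* Reject overlong forms, surrogates, and out-of-range values. */
        if ((need == 1 && cp < 0x80) ||
            (need == 2 && cp < 0x800) ||
            (need == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF)
            return false;

        i += need + 1;
    }
    return true;
}
```

With this sketch, a lead byte followed directly by an ASCII quote (for example the two bytes 0xC3 0x27) is rejected at input time, so a later escaping step never sees a half-formed character.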

Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.

Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.

c61a2f58

Mar 05, 2006
- Update copyright for 2006. Update scripts. · f2f5b056
  Bruce Momjian authored 19 years ago
  
  f2f5b056
Dec 26, 2005
- More uses of IS_HIGHBIT_SET() macro. · a2384d00
  Bruce Momjian authored 19 years ago
  
  a2384d00
Dec 25, 2005

I have added these macros to c.h: · 261114a2

Bruce Momjian authored 19 years ago

        #define HIGHBIT                 (0x80)
        #define IS_HIGHBIT_SET(ch)      ((unsigned char)(ch) & HIGHBIT)

and removed CSIGNBIT and mapped its uses to HIGHBIT.  I have also added
uses for IS_HIGHBIT_SET where appropriate.  This change is
purely for code clarity.
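
As a small illustration of how the two macros read at a call site, here is a sketch that counts the non-ASCII bytes in a string. The macros are copied from the message above; the helper `count_highbit_bytes` is hypothetical and does not come from the PostgreSQL tree.

```c
#include <stddef.h>

/* Macros as quoted in the commit message above. */
#define HIGHBIT                 (0x80)
#define IS_HIGHBIT_SET(ch)      ((unsigned char)(ch) & HIGHBIT)

/*
 * Hypothetical helper: count the bytes in a NUL-terminated string that
 * have the high bit set, i.e. bytes outside 7-bit ASCII.
 */
static size_t
count_highbit_bytes(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
    {
        if (IS_HIGHBIT_SET(*s))
            n++;
    }
    return n;
}
```

The cast to unsigned char inside the macro means the test behaves the same whether plain char is signed or unsigned on the platform.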

261114a2

Oct 29, 2005
- Message corrections · 07bb9f08
  Peter Eisentraut authored 19 years ago
  
  07bb9f08
Oct 15, 2005
- Standard pgindent run for 8.1. · 1dc34982
  Bruce Momjian authored 19 years ago
  
  1dc34982
Sep 24, 2005
- Suppress signed-vs-unsigned-char warnings. · 88896855
  Tom Lane authored 19 years ago
  
  88896855
Jun 15, 2005
- Support 3 and 4-byte unicode characters. · 59559458
  Bruce Momjian authored 19 years ago
  
  John Hansen
  59559458
Mar 07, 2005

Rename canonical encodings, per Peter: · e3d7de6b

Bruce Momjian authored 20 years ago

	UNICODE => UTF8
	ALT => WIN866
	WIN => WIN1251
	TCVN => WIN1258

The old codes continue to work.
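
One way to keep old names working after a rename is a small alias table consulted before the main lookup. The sketch below is hypothetical: the table contents come from the list above, but `canonical_encoding_name` and the exact, case-sensitive matching are simplifying assumptions, not the backend's real lookup code.

```c
#include <stddef.h>
#include <string.h>

/* Old name -> new canonical name, taken from the list in the message above. */
struct encoding_alias
{
    const char *old_name;
    const char *canonical;
};

static const struct encoding_alias renamed_encodings[] = {
    {"UNICODE", "UTF8"},
    {"ALT", "WIN866"},
    {"WIN", "WIN1251"},
    {"TCVN", "WIN1258"},
};

/*
 * Hypothetical helper: map an old encoding name to its new canonical name.
 * Names that were not renamed are returned unchanged.
 */
static const char *
canonical_encoding_name(const char *name)
{
    size_t n = sizeof(renamed_encodings) / sizeof(renamed_encodings[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(name, renamed_encodings[i].old_name) == 0)
            return renamed_encodings[i].canonical;
    }
    return name;
}
```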

e3d7de6b

Dec 31, 2004

· 2ff50159

PostgreSQL Daemon authored 20 years ago

Tag appropriate files for rc3

Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...

2ff50159

Aug 29, 2004
- Update copyright to 2004. · da9a8649
  Bruce Momjian authored 20 years ago
  
  da9a8649
Nov 29, 2003
- · 969685ad
  PostgreSQL Daemon authored 21 years ago
  
  $Header: -> $PostgreSQL Changes ...
  969685ad
Aug 04, 2003
- Update copyrights to 2003. · f3c3deb7
  Bruce Momjian authored 21 years ago
  
  f3c3deb7
- pgindent run. · 089003fb
  Bruce Momjian authored 21 years ago
  
  089003fb
Jul 25, 2003
- Error message editing in backend/utils (except /adt). · 689eb53e
  Tom Lane authored 21 years ago
  
  689eb53e
Apr 12, 2003

Fix encoding conversion function bug. · 35a09959

Tatsuo Ishii authored 21 years ago

See following posting for more details.

Subject: Re: [HACKERS] [BUGS] Bug #943: Server-Encoding from EUC_TW to UTF-8 doesn't
From: Tatsuo Ishii <t-ishii@sra.co.jp>
To: michael.enke@wincor-nixdorf.com, pgsql-bugs@postgresql.org
Cc: pgsql-hackers@postgresql.org
Date: Sat, 12 Apr 2003 10:51:45 +0900 (JST)

35a09959

Mar 10, 2003
- This patch fixes a bunch of spelling mistakes in comments throughout the · e4704001
  Tom Lane authored 22 years ago
  
  PostgreSQL source code. Neil Conway
  e4704001
Sep 04, 2002
- pgindent run. · e50f52a0
  Bruce Momjian authored 22 years ago
  
  e50f52a0
Aug 14, 2002
- Add Cyrillic and other encodings for encoding conversion. · 969e0246
  Tatsuo Ishii authored 22 years ago
  
  Patches submitted by Kaori Inaba (i-kaori@sra.co.jp).
  969e0246
Jul 19, 2002
- Oops. Too much ifdef out. · 86270024
  Tatsuo Ishii authored 22 years ago
  
  86270024
- Temporarily ifdef out migrating functions to avoid compiler warnings. · 248cbb57
  Tatsuo Ishii authored 22 years ago
  
  248cbb57
Jul 18, 2002

I have committed many support files for CREATE CONVERSION. Default · eb335a03

Tatsuo Ishii authored 22 years ago

conversion procs and conversions are added in initdb. Currently
supported conversions are:

UTF-8(UNICODE) <--> SQL_ASCII, ISO-8859-1 to 16, EUC_JP, EUC_KR,
		    EUC_CN, EUC_TW, SJIS, BIG5, GBK, GB18030, UHC,
		    JOHAB, TCVN

EUC_JP <--> SJIS
EUC_TW <--> BIG5
MULE_INTERNAL <--> EUC_JP, SJIS, EUC_TW, BIG5

Note that the initial contents of the pg_conversion system catalog are
created during initdb. Requiring an initdb would therefore be ideal;
however, it is possible to add them to existing databases by hand. To do
so:

psql -f your_postgresql_install_path/share/conversion_create.sql your_database

So I did not bump up the version in catversion.h.

TODO:
Add more conversion procs
Add [CASCADE|RESTRICT] to DROP CONVERSION
Add tuples to pg_depend
Add regression tests
Write docs
Add SQL99 CONVERT command?
--
Tatsuo Ishii

eb335a03

Jun 13, 2002
- Add GB18030 support. Contributed by Bill Huang <bill_huanghb@ybb.ne.jp> · 14f72b9a
  Tatsuo Ishii authored 22 years ago
  
  (ODBC support has not been committed yet. left for Hiroshi...)
  14f72b9a
Mar 06, 2002

Changes made to elog: · 92288a1c

Bruce Momjian authored 23 years ago

o  Change all current CVS messages of NOTICE to WARNING.  We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.

o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.

o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.

o Remove INFO from the client_min_messages options and add NOTICE.

Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.
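
The routing rules listed above can be sketched as a single predicate. This is a hypothetical illustration only: the enum, its ordering, and the function name `send_to_client` are assumptions and do not reproduce the real elog level constants.

```c
#include <stdbool.h>

/*
 * Hypothetical levels for illustration; the real elog constants and their
 * numeric values are not reproduced here.
 */
enum msg_level
{
    MSG_INFO,
    MSG_NOTICE,
    MSG_WARNING,
    MSG_ERROR
};

/*
 * Decide whether a message goes to the client. INFO always does; other
 * levels are filtered by the client's minimum level, which per the list
 * above can be set to NOTICE but not to INFO.
 */
static bool
send_to_client(enum msg_level level, enum msg_level client_min)
{
    if (level == MSG_INFO)
        return true;
    return level >= client_min;
}
```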

Regression passed.

92288a1c

Mar 05, 2002

> Tatsuo Ishii wrote: · a8bd7e1c

Bruce Momjian authored 23 years ago

> > > > It was made to cope with encoding such as an Asian bloc in 7.2Beta2.
> > > >
> > > > Added ServerEncoding
> > > >         Korean (JOHAB), Thai (WIN874),
> > > >         Vietnamese (TCVN), Arabic (WIN1256)
> > > >
> > > > Added ClientEncoding
> > > >         Simplified Chinese (GBK), Korean (UHC)
> > > >
> > > >
> > > >
> http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2b2.newencoding.diff.tar.gz
> > > > (608K)
> > >
> > > Looks good.  I need some people to review this for me.
> >
> > They look good to me too. The only missing part is
> > documentation. I will ask him to write it up. If he can't, I will
> > do it for him.
> > > The diff is 3mb
> > > but appears to address only additions to multibyte.  I have attached a
> > > list of files it modifies.  Also, look at the sizes of the mb/
> > > directory.  It is getting large:
> > >
> > >   4       ./CVS
> > >   6       ./Unicode/CVS
> > >   3433    ./Unicode
> > >   6197    .
> >
> > Yes. We definitely need the on-the-fly encoding addition capability:
> > i.e. CREATE CHARACTER SET in the future...
> > --
> > Tatsuo Ishii
> >
> >

Address change.

http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2.newencoding.diff.gz

Add PsqlODBC and document ...etc patch.

Eiji Tokuya

a8bd7e1c

Nov 05, 2001
- New pgindent run with fixes suggested by Tom. Patch manually reviewed, · ea08e6cd
  Bruce Momjian authored 23 years ago
  
  initdb/regression tests pass.
  ea08e6cd
Oct 28, 2001
- Another pgindent run. Fixes enum indenting, and improves #endif · 6783b237
  Bruce Momjian authored 23 years ago
  
  spacing. Also adds space for one-line comments.
  6783b237
Oct 25, 2001
- pgindent run on all C files. Java run to follow. initdb/regression · b81844b1
  Bruce Momjian authored 23 years ago
  
  tests pass.
  b81844b1
Oct 16, 2001

Ok, here is the modified encoding table (column1 is the standard name, · cfe01796

Tatsuo Ishii authored 23 years ago

2 is our "official" name, and 3 is alias). If there's no objection, I
will change them.

ASCII		SQL_ASCII
UTF-8		UNICODE		UTF_8
MULE-INTERNAL	MULE_INTERNAL
ISO-8859-1	LATIN1		ISO_8859_1
ISO-8859-2	LATIN2		ISO_8859_2
ISO-8859-3	LATIN3		ISO_8859_3
ISO-8859-4	LATIN4		ISO_8859_4
ISO-8859-5	ISO_8859_5
ISO-8859-6	ISO_8859_6
ISO-8859-7	ISO_8859_7
ISO-8859-8	ISO_8859_8
ISO-8859-9	LATIN5		ISO_8859_9
ISO-8859-10	LATIN6		ISO_8859_10
ISO-8859-13	LATIN7		ISO_8859_13
ISO-8859-14	LATIN8		ISO_8859_14
ISO-8859-15	LATIN9		ISO_8859_15
ISO-8859-16	LATIN10		ISO_8859_16

cfe01796

Oct 11, 2001
- Add support for ISO-8859-6 to 16 · 51053d32
  Tatsuo Ishii authored 23 years ago
  
  51053d32
Sep 25, 2001
- Fix bug in mic2ascii(). It does not behave correctly if non-ASCII · 1b203150
  Tatsuo Ishii authored 23 years ago
  
  chars are in the input.
  1b203150
Sep 22, 2001
- Remove test drivers · 8ebdac0e
  Tatsuo Ishii authored 23 years ago
  
  Also fix comment in conv.c.
  8ebdac0e
Sep 11, 2001
- Implement following item in TODO: · e1de3e08
  Tatsuo Ishii authored 23 years ago
  
  * Reject character sequences that are not valid in their charset
  e1de3e08
Sep 06, 2001

Commit Karel's patch. · 22776711

Tatsuo Ishii authored 23 years ago

-------------------------------------------------------------------
Subject: Re: [PATCHES] encoding names
From: Karel Zak <zakkr@zf.jcu.cz>
To: Peter Eisentraut <peter_e@gmx.net>
Cc: pgsql-patches <pgsql-patches@postgresql.org>
Date: Fri, 31 Aug 2001 17:24:38 +0200

On Thu, Aug 30, 2001 at 01:30:40AM +0200, Peter Eisentraut wrote:
> > 		- convert encoding 'name' to 'id'
>
> I thought we decided not to add functions returning "new" names until we
> know exactly what the new names should be, and pending schema

 OK, the patch no longer adds the functions.

> better
>
>     ...(): encoding name too long

 Fixed.

 I found a new bug in command/variable.c in parse_client_encoding(); probably
nobody ever sees this error:

if (pg_set_client_encoding(encoding))
{
	elog(ERROR, "Conversion between %s and %s is not supported",
                     value, GetDatabaseEncodingName());
}

because pg_set_client_encoding() returns -1 on error and 0 on success.
It's fixed too.
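
A return convention of 0 for success and -1 for error is easy to misread as a boolean, so an explicit comparison at the call site can help. The sketch below is hypothetical: `set_client_encoding_stub` is a stand-in that only mimics the convention described above, and `try_set_client_encoding` is an invented wrapper, not code from the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a setter that follows the convention described
 * above: 0 on success, -1 on error. It accepts only "UTF8" to stay small.
 */
static int
set_client_encoding_stub(const char *name)
{
    return (strcmp(name, "UTF8") == 0) ? 0 : -1;
}

/*
 * Hypothetical wrapper: returns 1 if the encoding was accepted and 0 if it
 * was rejected, comparing against 0 explicitly instead of treating the
 * result as a truth value.
 */
static int
try_set_client_encoding(const char *name)
{
    if (set_client_encoding_stub(name) < 0)
    {
        fprintf(stderr, "conversion to %s is not supported\n", name);
        return 0;
    }
    return 1;
}
```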

 IMHO it can be applied.

		Karel
PS:

    * following files are renamed:

src/utils/mb/Unicode/KOI8_to_utf8.map  -->
        src/utils/mb/Unicode/koi8r_to_utf8.map

src/utils/mb/Unicode/WIN_to_utf8.map  -->
        src/utils/mb/Unicode/win1251_to_utf8.map

src/utils/mb/Unicode/utf8_to_KOI8.map -->
        src/utils/mb/Unicode/utf8_to_koi8r.map

src/utils/mb/Unicode/utf8_to_WIN.map -->
        src/utils/mb/Unicode/utf8_to_win1251.map

   * new file:

src/utils/mb/encname.c

   * removed file:

src/utils/mb/common.c

--
 Karel Zak  <zakkr@zf.jcu.cz>
 http://home.zf.jcu.cz/~zakkr/

 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

22776711

May 28, 2001
- Fix a message error in utf_to_local · e23f8c45
  Tatsuo Ishii authored 23 years ago
  
  e23f8c45