
regex

    Tom Lane authored
    locale-dependent character classification properly when the database encoding
    is UTF8.
    
    The previous coding worked okay in single-byte encodings, or in any case for
    ASCII characters, but failed entirely on multibyte characters.  The fix
    assumes that the <wctype.h> functions use Unicode code points as the wchar_t
    representation for Unicode, i.e., wchar_t matches pg_wchar.
    
    This is only a partial solution, since we're still stupid about non-ASCII
    characters in multibyte encodings other than UTF8.  The practical effect
    of that is limited, however, since those cases are generally Far Eastern
    glyphs for which concepts like case-folding don't apply anyway.  Certainly
    all or nearly all of the field reports of problems have been about UTF8.
    A more general solution would require switching to the platform's wchar_t
    representation for all regex operations, which is possible but would have
    substantial disadvantages.  Let's try this and see if it's sufficient in
    practice.
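
The classification approach described in the message can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names `sketch_wchar` and `sketch_isalpha` and the `encoding_is_utf8` flag are hypothetical, and the sketch relies on the same assumption the message states, namely that the platform's `wchar_t` values are Unicode code points.

```c
#include <ctype.h>
#include <limits.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-in for the regex engine's internal character type. */
typedef unsigned int sketch_wchar;

/*
 * Hypothetical helper: is code point c alphabetic in the current locale?
 * encoding_is_utf8 is an assumed flag supplied by the caller.
 */
int
sketch_isalpha(sketch_wchar c, int encoding_is_utf8)
{
    if (encoding_is_utf8)
    {
        /*
         * Assumption from the commit message: wchar_t holds Unicode code
         * points, so the value can be handed to <wctype.h> directly,
         * provided wchar_t is wide enough to represent it.
         */
        if (sizeof(wchar_t) >= 4 || c <= (sketch_wchar) 0xFFFF)
            return iswalpha((wint_t) c) != 0;
        return 0;
    }

    /*
     * Other encodings: only values that fit in one byte are classified,
     * which is the limitation the message describes as a partial solution.
     */
    if (c <= (sketch_wchar) UCHAR_MAX)
        return isalpha((unsigned char) c) != 0;
    return 0;
}
```

Under these assumptions, a caller would pass the decoded character and the encoding flag, for example `sketch_isalpha(ch, 1)` in a UTF8 database. Results for non-ASCII characters depend on the locale in effect, so none are shown here.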
    0d323425