From 802323535788965f041b4fdaecc16025f289cb44 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 10 Jun 2014 22:48:16 -0400
Subject: [PATCH] Fix ancient encoding error in hungarian.stop.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When we grabbed this file off the Snowball project's website, we mistakenly
supposed that it was in LATIN1 encoding, but evidently it was actually in
LATIN2.  This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
bug #10589 from Zoltán Sörös.  We'd have messed up u-double-acute too,
but there aren't any of those in the file.  Other characters used in the
file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
the problem for so long.
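
As a rough illustration of the misconversion described above, here is a
minimal Python sketch (not part of the patch).  The sample word is one of the
affected stopwords, and the helper name shared_codes is hypothetical, chosen
only for this example:

```python
# The stopword "először" as raw bytes in LATIN2 (ISO 8859-2):
# 0xF5 is o-double-acute, 0xF6 is o-diaeresis.
raw = b"el\xf5sz\xf6r"

# Decoding under the mistaken LATIN1 assumption turns 0xF5 into o-tilde.
as_latin1 = raw.decode("latin-1")

# Decoding as LATIN2 yields the intended o-double-acute.
as_latin2 = raw.decode("iso8859_2")

# u-double-acute (0xFB in LATIN2) would have been damaged the same way.
u_latin1 = b"\xfb".decode("latin-1")
u_latin2 = b"\xfb".decode("iso8859_2")


def shared_codes(chars):
    """Return True if every character has the same byte in LATIN1 and LATIN2."""
    return all(c.encode("latin-1") == c.encode("iso8859_2") for c in chars)


# The other accented letters used in Hungarian share codes in both encodings,
# which is why only o-double-acute stood out.
others_match = shared_codes("áéíóöúü")
```

Only the 0xF5 byte decodes differently here; the 0xF6 byte decodes to the same
character under either encoding.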

The error is not only ours: the Snowball project also was confused about
which encoding is required for Hungarian.  But dealing with that will
require source-code changes that I'm not at all sure we'll wish to
back-patch.  Fixing the stopword file seems reasonably safe to back-patch
however.
---
 src/backend/snowball/stopwords/hungarian.stop | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/backend/snowball/stopwords/hungarian.stop b/src/backend/snowball/stopwords/hungarian.stop
index 94e9f9a0b07..abfd35ce976 100644
--- a/src/backend/snowball/stopwords/hungarian.stop
+++ b/src/backend/snowball/stopwords/hungarian.stop
@@ -55,10 +55,10 @@ ekkor
 el
 elég
 ellen
-elõ
-elõször
-elõtt
-elsõ
+elő
+először
+előtt
+első
 én
 éppen
 ebben
@@ -149,9 +149,9 @@ nincs
 olyan
 ott
 össze
-õ
-õk
-õket
+ő
+ők
+őket
 pedig
 persze
 rá
-- 
GitLab