From 442231d7f71764b8c628044e7ce2225f9aa43b67 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Mon, 6 Feb 2012 15:29:26 -0500
Subject: [PATCH] Fix postmaster to attempt restart after a hot-standby crash.

The postmaster was coded to treat any unexpected exit of the startup
process (i.e., the WAL replay process) as a catastrophic crash, and not try
to restart it. This was OK so long as the startup process could not have
any sibling postmaster children.  However, if a hot-standby backend
crashes, we SIGQUIT the startup process along with everything else, and the
resulting exit is hardly "unexpected".  Treating it as such meant we failed
to restart a standby server after any child crash at all, not only a crash
of the WAL replay process as intended.  Fix this by setting RecoveryError
only when FatalError is not already set.  Back-patch to 9.0, where hot
standby was introduced.
---
 src/backend/postmaster/postmaster.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
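
Note: the decision this patch changes can be sketched in isolation as
below.  This is a minimal model, not the postmaster code: the function
name and its parameters are hypothetical stand-ins for the FatalError
and RecoveryError flags and for the EXIT_STATUS_0() test in reaper(),
and the sketch assumes that FatalError being set means the startup
process was already sent SIGQUIT as part of crash cleanup.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the patched branch in reaper().
 *
 * fatal_error stands in for FatalError (crash cleanup already under way);
 * exitstatus is the startup process's exit status.  The return value says
 * whether RecoveryError would be set, i.e. whether the postmaster would
 * refuse to reinitialize once the remaining children are gone.
 */
static bool
startup_exit_sets_recovery_error(bool fatal_error, int exitstatus)
{
	/* A clean exit never reaches the crash-handling branch. */
	if (exitstatus == 0)
		return false;

	/*
	 * Nonzero exit while cleanup is already in progress: the exit is
	 * most likely the result of our own SIGQUIT, so allow a restart.
	 */
	if (fatal_error)
		return false;

	/* Nonzero exit with no prior crash: treat as catastrophic. */
	return true;
}
```

Before the patch, the equivalent of this function returned true for
every nonzero exit status, regardless of fatal_error.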

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 05db29cb93b..5d7888ade18 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2311,13 +2311,18 @@ reaper(SIGNAL_ARGS)
 			}
 
 			/*
-			 * Any unexpected exit (including FATAL exit) of the startup
-			 * process is treated as a crash, except that we don't want to
-			 * reinitialize.
+			 * After PM_STARTUP, any unexpected exit (including FATAL exit) of
+			 * the startup process is catastrophic, so kill other children,
+			 * and set RecoveryError so we don't try to reinitialize after
+			 * they're gone.  Exception: if FatalError is already set, that
+			 * implies we previously sent the startup process a SIGQUIT, so
+			 * that's probably the reason it died, and we do want to try to
+			 * restart in that case.
 			 */
 			if (!EXIT_STATUS_0(exitstatus))
 			{
-				RecoveryError = true;
+				if (!FatalError)
+					RecoveryError = true;
 				HandleChildCrash(pid, exitstatus,
 								 _("startup process"));
 				continue;
-- 
GitLab