Skip to content

Commit 909b449

Browse files
committed
Fix concurrency issues with WAL segment recycling on Windows
This commit is mostly a revert of aaa3aed, that switched the routine doing the internal renaming of recycled WAL segments to use on Windows a combination of CreateHardLinkA() plus unlink() instead of rename(). As reported by several users of Postgres 13, this is causing concurrency issues when manipulating WAL segments, mostly in the shape of the following error: LOG: could not rename file "pg_wal/000000XX000000YY000000ZZ": Permission denied This moves back to a logic where a single rename() (well, pgrename() for Windows) is used. This issue has proved to be hard to hit when I tested it, facing it only once with an archive_command that was not able to do its work, so it is environment-sensitive. The reporters of this issue have been able to confirm that the situation improved once we switched back to a single rename(). In order to check things, I have provided to the reporters a patched build based on 13.2 with aaa3aed reverted, to test if the error goes away, and an unpatched build of 13.2 to test if the error still showed up (just to make sure that I did not mess up my build process). Extra thanks to Fujii Masao for pointing out what looked like the culprit commit, and to all the reporters for taking the time to test what I have sent them. Reported-by: Andrus, Guy Burgess, Yaroslav Pashinsky, Thomas Trenz Reviewed-by: Tom Lane, Andres Freund Discussion: https://postgr.es/m/3861ff1e-0923-7838-e826-094cc9bef737@hot.ee Discussion: https://postgr.es/m/16874-c3eecd319e36a2bf@postgresql.org Discussion: https://postgr.es/m/095ccf8d-7f58-d928-427c-b17ace23cae6@burgess.co.nz Discussion: https://postgr.es/m/16927-67c570d968c99567%40postgresql.org Discussion: https://postgr.es/m/YFBcRbnBiPdGZvfW@paquier.xyz Backpatch-through: 13
1 parent 8c6eda2 commit 909b449

File tree

2 files changed

+25
-4
lines changed

2 files changed

+25
-4
lines changed

src/backend/storage/file/fd.c

Lines changed: 18 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -801,17 +801,20 @@ durable_unlink(const char *fname, int elevel)
801801
}
802802

803803
/*
804-
* durable_rename_excl -- rename a file in a durable manner, without
805-
* overwriting an existing target file
804+
* durable_rename_excl -- rename a file in a durable manner.
806805
*
807-
* Similar to durable_rename(), except that this routine will fail if the
808-
* target file already exists.
806+
* Similar to durable_rename(), except that this routine tries (but does not
807+
* guarantee) not to overwrite the target file.
809808
*
810809
* Note that a crash in an unfortunate moment can leave you with two links to
811810
* the target file.
812811
*
813812
* Log errors with the caller specified severity.
814813
*
814+
* On Windows, using a hard link followed by unlink() causes concurrency
815+
* issues, while a simple rename() does not cause that, so be careful when
816+
* changing the logic of this routine.
817+
*
815818
* Returns 0 if the operation succeeded, -1 otherwise. Note that errno is not
816819
* valid upon return.
817820
*/
@@ -825,6 +828,7 @@ durable_rename_excl(const char *oldfile, const char *newfile, int elevel)
825828
if (fsync_fname_ext(oldfile, false, false, elevel) != 0)
826829
return -1;
827830

831+
#ifdef HAVE_WORKING_LINK
828832
if (link(oldfile, newfile) < 0)
829833
{
830834
ereport(elevel,
@@ -834,6 +838,16 @@ durable_rename_excl(const char *oldfile, const char *newfile, int elevel)
834838
return -1;
835839
}
836840
unlink(oldfile);
841+
#else
842+
if (rename(oldfile, newfile) < 0)
843+
{
844+
ereport(elevel,
845+
(errcode_for_file_access(),
846+
errmsg("could not rename file \"%s\" to \"%s\": %m",
847+
oldfile, newfile)));
848+
return -1;
849+
}
850+
#endif
837851

838852
/*
839853
* Make change persistent in case of an OS crash, both the new entry and

src/include/pg_config_manual.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,13 @@
135135
#define EXEC_BACKEND
136136
#endif
137137

138+
/*
139+
* Define this if your operating system supports link()
140+
*/
141+
#if !defined(WIN32) && !defined(__CYGWIN__)
142+
#define HAVE_WORKING_LINK 1
143+
#endif
144+
138145
/*
139146
* USE_POSIX_FADVISE controls whether Postgres will attempt to use the
140147
* posix_fadvise() kernel call. Usually the automatic configure tests are

0 commit comments

Comments
 (0)