-
Notifications
You must be signed in to change notification settings - Fork 471
SR-4201: DispatchSourceSignal not working on Linux #231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to re-read doc about signals and linux, the fact that you need to block signals for signalfd to work properly is a major PITA that will cause platform dependance issues with Swift. but I'm not 100% sure we can avoid it anyway.
src/queue.c
Outdated
(void)dispatch_assume_zero(r); | ||
|
||
#if DISPATCH_EVENT_BACKEND_EPOLL | ||
// If there are signals that the user may have reasonably |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that feels like the wrong solution, but I need to think about it more. I hate signals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the better solution is to handle this lazily in the code that creates the unotes for signals: when a dispatch source of type SIGNAL is created for a given signal, making a sigprocmask(SIG_BLOCK, ...)
and document that it does this in the headerdoc for DISPATCH_SOURCE_TYPE_SIGNAL on linux.
Here you're just fishing and there are other signals that are useful that you don't cover which is the part I don't like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which ... icidentally is exactly what the current code is doing so why is it not working again?
I agree. This option feels like a complete kludge, but I'm out of ideas. I hacked together a pretty small C program that shows the same symptoms. I'll attach that to the JIRA entry in case it is useful for thinking about the problem or playing with options. |
@dgrove-oss @MadCoder any updates for this? |
src/queue.c
Outdated
(void)dispatch_assume_zero(r); | ||
|
||
#if DISPATCH_EVENT_BACKEND_EPOLL | ||
// If there are signals that the user may have reasonably |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the better solution is to handle this lazily in the code that creates the unotes for signals: when a dispatch source of type SIGNAL is created for a given signal, making a sigprocmask(SIG_BLOCK, ...)
and document that it does this in the headerdoc for DISPATCH_SOURCE_TYPE_SIGNAL on linux.
Here you're just fishing and there are other signals that are useful that you don't cover which is the part I don't like.
src/queue.c
Outdated
(void)dispatch_assume_zero(r); | ||
|
||
#if DISPATCH_EVENT_BACKEND_EPOLL | ||
// If there are signals that the user may have reasonably |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which ... icidentally is exactly what the current code is doing so why is it not working again?
I think what may be broken is that the sigprocmask() should be done before the signalfd_create() maybe? but what you're doing makes no sense to me: if you only block the signal on the thread that happens to register this signal, it can still be delivered to other threads. I thought on linux sigprocmask() was applying the mask to the process? If not then I don't understand how libraries can use signals safely (which is a joke in itself anyway) in a process that went multi-threaded. |
oh but the manpage says:
doh I'm so so so confused. |
My memory is paged out. I can probably find the C programs I was playing with back in March are refresh it. One issue is that Linux docs flat out says you can't correctly use sigprocmask in a mutli-threaded program. You have to use pthread_sigmask. |
It might work to lazily add signals to the mask being used in dispatch_signal_thread and poke that thread to wake up and reset its pthread_sigmask. |
hmm, actually I think I have a good fix, modulo testing it and making sure it builds and nitpicks like that ;) From 101af4bb8c554c7073605fe100b527a30fbf8e7a Mon Sep 17 00:00:00 2001
From: Pierre Habouzit <phabouzit@apple.com>
Date: Fri, 5 May 2017 10:34:20 -0700
Subject: [PATCH] Fix the signal story on linux
- export _dispatch_sigmask() as the way to block signals in a process
using dispatch
- call it when dispatch_main() is called
- each time dispatch registers a signal handler, register a sigaction that
will catch signals for threads with wrong masks and will fix the thread
configuration then raise the signal again so that it is delivered to the
signalfd as expected.
---
src/event/event_epoll.c | 22 +++++++++++++++++++++-
src/init.c | 23 +++++++++++++++++++++++
src/internal.h | 2 ++
src/queue.c | 40 ++--------------------------------------
4 files changed, 48 insertions(+), 39 deletions(-)
diff --git a/src/event/event_epoll.c b/src/event/event_epoll.c
index 3ebff148..c6c2c592 100644
--- a/src/event/event_epoll.c
+++ b/src/event/event_epoll.c
@@ -111,9 +111,26 @@ _dispatch_muxnote_dispose(dispatch_muxnote_t dmn)
free(dmn);
}
+static void
+_dispatch_muxnote_signal_block_and_raise(int signo)
+{
+ // On linux, for signals to be delivered to the signalfd, signals
+ // must be blocked, else any thread that hasn't them blocked may
+ // receive them. Fix that by lazily noticing, blocking said signal,
+ // and raising the signal again when it happens
+ _dispatch_sigmask();
+ raise(signo);
+}
+
static dispatch_muxnote_t
_dispatch_muxnote_create(dispatch_unote_t du, int16_t events)
{
+ static sigset_t signals_with_unotes;
+ static struct sigaction sa = {
+ .sa_handler = _dispatch_muxnote_signal_block_and_raise,
+ .sa_flags = SA_RESTART,
+ };
+
dispatch_muxnote_t dmn;
struct stat sb;
int fd = du._du->du_ident;
@@ -123,13 +140,16 @@ _dispatch_muxnote_create(dispatch_unote_t du, int16_t events)
switch (filter) {
case EVFILT_SIGNAL:
+ if (!sigismember(&signals_with_unotes, du->du_ident)) {
+ sigaddset(&signals_with_unotes, du->du_ident);
+ sigaction(du->du_ident, &sa, NULL);
+ }
sigemptyset(&sigmask);
sigaddset(&sigmask, du._du->du_ident);
fd = signalfd(-1, &sigmask, SFD_NONBLOCK | SFD_CLOEXEC);
if (fd < 0) {
return NULL;
}
- sigprocmask(SIG_BLOCK, &sigmask, NULL);
break;
case EVFILT_WRITE:
diff --git a/src/init.c b/src/init.c
index c1506422..5490bd91 100644
--- a/src/init.c
+++ b/src/init.c
@@ -74,6 +74,29 @@ dispatch_atfork_child(void)
_dispatch_unsafe_fork = 0;
}
+int
+_dispatch_sigmask(void);
+{
+ sigset_t mask;
+ int r = 0;
+
+ /* Workaround: 6269619 Not all signals can be delivered on any thread */
+ r |= sigfillset(&mask);
+ r |= sigdelset(set, SIGILL);
+ r |= sigdelset(set, SIGTRAP);
+#if HAVE_DECL_SIGEMT
+ r |= sigdelset(set, SIGEMT);
+#endif
+ r |= sigdelset(set, SIGFPE);
+ r |= sigdelset(set, SIGBUS);
+ r |= sigdelset(set, SIGSEGV);
+ r |= sigdelset(set, SIGSYS);
+ r |= sigdelset(set, SIGPIPE);
+ r |= sigdelset(set, SIGPROF);
+ r |= pthread_sigmask(how, set, oset);
+ (void)dispatch_assume_zero(r);
+}
+
#pragma mark -
#pragma mark dispatch_globals
diff --git a/src/internal.h b/src/internal.h
index 500030a9..e6e8ccc5 100644
--- a/src/internal.h
+++ b/src/internal.h
@@ -1086,6 +1086,8 @@ extern int _dispatch_kevent_workloop_disabled;
#endif // DISPATCH_KEVENT_WORKLOOP_FALLBACK
#endif // __OPEN_SOURCE__
+int _dispatch_sigmask(void);
+
/* #includes dependent on internal.h */
#include "object_internal.h"
#include "semaphore_internal.h"
diff --git a/src/queue.c b/src/queue.c
index 07ae855d..3c1eceb5 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -70,7 +70,6 @@ static void _dispatch_worker_thread2(int priority, int options, void *context);
#endif
#if DISPATCH_USE_PTHREAD_POOL
static void *_dispatch_worker_thread(void *context);
-static int _dispatch_pthread_sigmask(int how, sigset_t *set, sigset_t *oset);
#endif
#if DISPATCH_COCOA_COMPAT
@@ -5884,13 +5883,8 @@ _dispatch_worker_thread(void *context)
pqc->dpq_thread_configure();
}
- sigset_t mask;
- int r;
// workaround tweaks the kernel workqueue does for us
- r = sigfillset(&mask);
- (void)dispatch_assume_zero(r);
- r = _dispatch_pthread_sigmask(SIG_BLOCK, &mask, NULL);
- (void)dispatch_assume_zero(r);
+ _dispatch_sigmask();
_dispatch_introspection_thread_add();
const int64_t timeout = 5ull * NSEC_PER_SEC;
@@ -5906,37 +5900,6 @@ _dispatch_worker_thread(void *context)
_dispatch_release(dq); // retained in _dispatch_global_queue_poke_slow
return NULL;
}
-
-int
-_dispatch_pthread_sigmask(int how, sigset_t *set, sigset_t *oset)
-{
- int r;
-
- /* Workaround: 6269619 Not all signals can be delivered on any thread */
-
- r = sigdelset(set, SIGILL);
- (void)dispatch_assume_zero(r);
- r = sigdelset(set, SIGTRAP);
- (void)dispatch_assume_zero(r);
-#if HAVE_DECL_SIGEMT
- r = sigdelset(set, SIGEMT);
- (void)dispatch_assume_zero(r);
-#endif
- r = sigdelset(set, SIGFPE);
- (void)dispatch_assume_zero(r);
- r = sigdelset(set, SIGBUS);
- (void)dispatch_assume_zero(r);
- r = sigdelset(set, SIGSEGV);
- (void)dispatch_assume_zero(r);
- r = sigdelset(set, SIGSYS);
- (void)dispatch_assume_zero(r);
- r = sigdelset(set, SIGPIPE);
- (void)dispatch_assume_zero(r);
- r = sigdelset(set, SIGPROF);
- (void)dispatch_assume_zero(r);
-
- return pthread_sigmask(how, set, oset);
-}
#endif // DISPATCH_USE_PTHREAD_POOL
#if DISPATCH_USE_KEVENT_WORKLOOP && !defined(__OPEN_SOURCE__)
@@ -6430,6 +6393,7 @@ dispatch_main(void)
pthread_key_t dispatch_main_key;
pthread_key_create(&dispatch_main_key, _dispatch_sig_thread);
pthread_setspecific(dispatch_main_key, &dispatch_main_key);
+ _dispatch_sigmask();
#endif
pthread_exit(NULL);
DISPATCH_INTERNAL_CRASH(errno, "pthread_exit() returned");
--
2.12.2
github desn't let me attach the diff easily (I know I should create a pull request but for reasons I can't from where I am so bear with me). the gist of the patch is to expose (in init.c, it has to move away from queue.c) a Then each time we register a source for a signal, if it's new, to fix the configuration of threads that have been pthread_created and don't have the right mask, we have a handler that will apply that sigmask and raise the signal again. it's gross but should do the job. |
@dgrove-oss can I let you have a look at that idea and massage this patch to victory? oh and the _dispatch_sigmask() in dispatch_main() is obviously about trying to fix that guy since clearly he's the broken one most of the time ;) |
(typically I just realized _dispatch_sigmask() doesn't even build you have to |
sure, I'll take a look later this afternoon. |
I couldn't get git apply to do anything reasonable, so I applied via patch and made minor adjustments. Left it on the branch associated with this PR and haven't squashed away the previous commits yet in case we need them. It is tantalizingly close to working....if one is patient and waits a couple of seconds before sending the signal to the process, than all is fine. If one sends the signal too soon, it is missed. |
hmm, actually there are more kinks to work out. Although the signal gets delivered to the DispatchSource, we also get into "signal-based spin loop" because the signal handler re-raises the signal, which causes the signal handler to run again, which re-raises, ad infinitum. |
How can it spin loop? the signal is supposed to be blocked in all threads, and I mask it too on if we ever hit the handler. I'm very very confused. I'm very confused. |
I was trying to see why it seemed like we needed to wait a few seconds before giving the signal. I stuck debug printfs into the code, including: diff --git a/src/event/event_epoll.c b/src/event/event_epoll.c
If run the test program, and send it a single kill -17, in addition to getting the message we want (from the DispatchSource event handler), we get endless output of:
I haven't spent too much time digging into why yet, just noting that my earlier comment about it being almost working was not right. |
Actually instead of raise() we want to pthread_kill() the manager thread I think, I'll try to amend your branch in that direction, it's probably not too hard by:
that should have the desired effect. You have to make sure the manager thread (the one blocking in epoll_wait) also calls _dispatch_sigmask()). If linux has a way to pthread_create() with an initial mask that'd be great too |
yep, that fixes it. Will push shortly. |
LGTM please squash |
The observed bug is that signals sent to the main thread after it called dispatch_main were not triggering the event handler of a DispatchSourceSignal that had been previously created and activated to handle that signal. This patch addresses the problem by: - export _dispatch_sigmask() as the way to block signals in a process using dispatch - call it when dispatch_main() is called - each time dispatch registers a signal handler, register a sigaction that will catch signals for threads with wrong masks and will fix the thread configuration then raise the signal again by directing a pthread_kill to the manager thread so that it is delivered to the signalfd as expected. Patch by Pierre Habouzit <phabouzit@apple.com> from PR-231 with minor compilation fixes by Dave Grove.
2bf0921
to
62741ca
Compare
squashed; ready to be tested. |
@swift-ci please test |
1 similar comment
@swift-ci please test |
this causes a build failure on Darwin, the _dispatch_sigmask() function is declared with int return value but nothing is returned. |
SR-4201: DispatchSourceSignal not working on Linux Signed-off-by: Daniel A. Steffen <dsteffen@apple.com>
The observed bug is that signals sent to the main thread
after it called dispatch_main were not triggering the
event handler of a DispatchSourceSignal that had been
previously created and activated to handle that signal.
If plausible signals are added to the sigsuspend mask,
then they are properly routed to the signalfd and the
DispatchSourceSignal event handler runs as expected.
This commit also changes event_epoll to use pthread_sigmask
to block the signal when creating a signalfd because
the Linux man page for sigprocmask states that sigprocmask's
behavior is undefined for multi-threaded programs and
pthread_sigmask should be used instead.