|
We've seen a number of deadlocks in the tree since thread safety was
introduced. So far, all of these are self-deadlocks, that is, a single
thread acquiring a lock and then attempting to re-acquire the same lock
recursively. When this has happened, the process simply hung, and it was
somewhat difficult to find the cause.
POSIX "error-checking" mutexes check for this specific problem (and
others). This commit switches from other types of mutexes to
error-checking mutexes everywhere that we can, that is, everywhere that
we're not using recursive mutexes. This ought to help find problems more
quickly in the future.
There might be performance advantages to other kinds of mutexes in some
cases. However, the existing mutex type choices were just guesses, so I'd
rather go for easy detection of errors until we know that other mutex
types actually perform better in specific cases. Also, I did a quick
microbenchmark of glibc mutex types on my host and found that the
error checking mutexes weren't any slower than the other types, at least
when the mutex is uncontended.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
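
As an illustration of what an error-checking mutex buys, here is a minimal
sketch using only standard POSIX calls. The helper names
init_errorcheck_mutex() and relock_result() are hypothetical and are not
taken from the tree. The point is that a second lock attempt by the owning
thread returns EDEADLK instead of hanging.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK */
#include <errno.h>
#include <pthread.h>

/* Initializes 'mutex' as a POSIX error-checking mutex.
 * Returns 0 on success, otherwise a positive errno value.
 * (Hypothetical helper, for illustration only.) */
int
init_errorcheck_mutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int error;

    error = pthread_mutexattr_init(&attr);
    if (error) {
        return error;
    }
    error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!error) {
        error = pthread_mutex_init(mutex, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return error;
}

/* Locks 'mutex' twice from the calling thread and returns the result of
 * the second attempt.  With an error-checking mutex the second attempt
 * fails with EDEADLK rather than blocking forever.  The mutex is
 * unlocked again before returning. */
int
relock_result(pthread_mutex_t *mutex)
{
    int error = pthread_mutex_lock(mutex);
    if (error) {
        return error;
    }
    error = pthread_mutex_lock(mutex);   /* Attempted self-deadlock. */
    pthread_mutex_unlock(mutex);
    return error;
}
```

A caller that treats any nonzero return from pthread_mutex_lock() as a
fatal error would then abort at the point of the bug instead of hanging.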
|
|
This commit adds annotations for thread-safety checking. The checks can
be run by passing the -Wthread-safety flag to clang.
Co-authored-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ed Maste <emaste@freebsd.org>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ed Maste <emaste@adaranet.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
On FreeBSD sig_atomic_t is long, which causes the comparison in
fatal_signal_run to be true when no signal has been reported.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
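
The message does not show the comparison itself. One way such a comparison
can go wrong, sketched here purely as an assumption about the shape of the
bug, is copying a "no signal" sentinel of the wide type into an int before
comparing. All names below (wide_sig_atomic_t, stored_sig, and the two
functions) are hypothetical stand-ins.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a platform where sig_atomic_t is long. */
typedef long wide_sig_atomic_t;
#define WIDE_SIG_ATOMIC_MAX LONG_MAX

/* Sentinel value meaning "no signal reported yet". */
static volatile wide_sig_atomic_t stored_sig = WIDE_SIG_ATOMIC_MAX;

/* Problematic check: narrows the value to int first.  Where long is
 * wider than int, the sentinel no longer fits, so the comparison is
 * true even though no signal was recorded. */
bool
signal_pending_narrowed(void)
{
    int sig = (int) stored_sig;
    return sig != WIDE_SIG_ATOMIC_MAX;
}

/* Safer check: keeps the value in its own type. */
bool
signal_pending_exact(void)
{
    wide_sig_atomic_t sig = stored_sig;
    return sig != WIDE_SIG_ATOMIC_MAX;
}
```

On a platform where int and long have the same width, both checks agree,
which is consistent with a problem that appears only on some systems.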
|
|
This makes it easier to diagnose why and when a daemon exited.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Replaced all instances of Nicira Networks(, Inc) with Nicira, Inc.
Feature #10593
Signed-off-by: Raju Subramanian <rsubramanian@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Android appears to lack SIG_ATOMIC_MAX, which is only
used in fatal-signal.c.
Observed when compiling using the Android NDK r6b (Android API level 13).
Patch based on a suggestion by Ben Pfaff.
|
|
If a daemon doesn't start, we need to know why. Being able to
consistently consult the log to find out is helpful.
|
|
In each of the cases converted here, an shash was used simply to maintain
a set of strings, with the shash_nodes' 'data' values set to NULL. This
commit converts them to use sset instead.
|
|
It's kind of odd for VLOG_DEFINE_THIS_MODULE to supply its own semicolon,
so this commit switches to the more common form.
|
|
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
Adding a macro to define the vlog module in use adds a level of
indirection, which makes it easier to change how the vlog module must be
defined. A followup commit needs to do that, so getting these widespread
changes out of the way first should make that commit easier to review.
|
|
This is just a cleanup.
|
|
The fatal-signal library notices and records fatal signals (e.g. SIGTERM)
and terminates the process on the next trip through poll_block(). However,
some special utilities do not always invoke poll_block() promptly, e.g.
"ovs-ofctl monitor" does not call poll_block() as long as OpenFlow messages
are available. These special cases all seem likely to call into functions
that themselves block (those with "_block" in their names). So make a new
rule that such functions should always call fatal_signal_run(), either
directly or through poll_block(). This commit implements and documents
that rule.
Bug #2625.
|
|
Not calling fatal_signal_init() means that the signal handlers don't get
registered, so the process won't clean up on fatal signals. Furthermore,
signal_fds[0] is then 0, which means that fatal_signal_wait() waits on
stdin, so if you are testing a program interactively and accidentally type
something on stdin then that program's CPU usage jumps to 100%.
Since poll_block() calls fatal_signal_wait() this seems like the most
reliable solution.
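
The message does not spell out the fix. Assuming it amounts to initializing
lazily from the wait path, the idea can be sketched as follows. The names
demo_init() and demo_wait_fd() are hypothetical, and handler registration is
omitted. What the sketch shows is that a zero-initialized fd array is never
used before the pipe has been created.

```c
#include <stdbool.h>
#include <unistd.h>

/* Zero-initialized at startup, so signal_fds[0] would name fd 0 (stdin)
 * if it were used before initialization. */
static int signal_fds[2];
static bool inited;

/* Creates the pipe once.  Safe to call repeatedly.  (Hypothetical; the
 * real initialization also registers signal handlers.) */
void
demo_init(void)
{
    if (!inited) {
        if (pipe(signal_fds)) {
            signal_fds[0] = signal_fds[1] = -1;
        }
        inited = true;
    }
}

/* Returns the fd that the poll loop should wait on.  Initializing here
 * means a caller that forgot demo_init() still gets the pipe's read end
 * instead of the zero-initialized value. */
int
demo_wait_fd(void)
{
    demo_init();
    return signal_fds[0];
}
```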
|
|
The main change here is the need to update all of the uses of UNUSED in
the next branch to OVS_UNUSED as it is now spelled on "master".
|
|
Requested by Jean Tourrilhes <jt@hpl.hp.com>.
|
|
Until now, fatal_signal_fork() has simply disabled all the fatal signal
callback hooks. This worked fine, because a daemon process forked only
once and the parent didn't do much before it exited.
But upcoming commits will introduce a --monitor option, which requires
processes to fork multiple times. Sometimes the parent process will fork,
then run for a while, then fork again. It's not good to disable the
hooks in the child process in such a case, because that prevents e.g.
pidfiles from being removed at the child's exit.
So this commit changes the semantics of fatal_signal_fork() so that it
just clears out the hooks. After hooks are cleared, new hooks can be added
and will be executed on process termination in the usual way.
This commit also introduces a cancellation callback function so that a
canceled hook can free resources.
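
The semantics described above can be sketched with a small hook registry.
This is an illustration under assumed names (struct hook, demo_add_hook(),
demo_fork(), demo_run_hooks(), and the counting callbacks), not the
library's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOOKS 32

struct hook {
    void (*hook_cb)(void *aux);     /* Run at process termination. */
    void (*cancel_cb)(void *aux);   /* Run if the hook is discarded. */
    void *aux;
};

static struct hook hooks[MAX_HOOKS];
static size_t n_hooks;

/* Registers a hook.  Returns false if the table is full. */
bool
demo_add_hook(void (*hook_cb)(void *), void (*cancel_cb)(void *), void *aux)
{
    if (n_hooks >= MAX_HOOKS) {
        return false;
    }
    hooks[n_hooks].hook_cb = hook_cb;
    hooks[n_hooks].cancel_cb = cancel_cb;
    hooks[n_hooks].aux = aux;
    n_hooks++;
    return true;
}

/* Meant for the child after fork(): discards inherited hooks, letting
 * each one free its resources, but leaves the registry usable so that
 * the child can add hooks of its own. */
void
demo_fork(void)
{
    for (size_t i = 0; i < n_hooks; i++) {
        if (hooks[i].cancel_cb) {
            hooks[i].cancel_cb(hooks[i].aux);
        }
    }
    n_hooks = 0;
}

/* Runs every registered hook, as would happen at termination. */
void
demo_run_hooks(void)
{
    for (size_t i = 0; i < n_hooks; i++) {
        hooks[i].hook_cb(hooks[i].aux);
    }
}

/* Example callbacks that only count how often they are invoked. */
struct demo_counts {
    int ran;
    int canceled;
};

void
note_run(void *aux)
{
    ((struct demo_counts *) aux)->ran++;
}

void
note_cancel(void *aux)
{
    ((struct demo_counts *) aux)->canceled++;
}
```

With this shape, a hook added before demo_fork() is canceled but never run,
while a hook added afterward runs normally.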
|
|
Rather than running signal hooks directly from the actual signal
handler, simply record the fact that the signal occurred and run
the hook next time around the poll loop. This allows significantly
more freedom as to what can actually be done in the signal hooks.
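
A minimal sketch of this record-then-run pattern, using only standard C
signal facilities. The names pending_sig, demo_install(), demo_signal_run(),
and demo_hook() are hypothetical. The handler does nothing except store the
signal number; the hook runs later from ordinary code, where it is not
limited to async-signal-safe functions.

```c
#include <signal.h>
#include <stdbool.h>

/* Signal number recorded by the handler, or 0 if none is pending. */
static volatile sig_atomic_t pending_sig;

/* Last signal seen by the example hook, for demonstration. */
static int last_hook_sig;

/* The actual signal handler: only records that the signal occurred. */
static void
record_signal(int sig_nr)
{
    pending_sig = sig_nr;
}

/* Installs the recording handler for 'sig_nr'.  Returns true on
 * success. */
bool
demo_install(int sig_nr)
{
    return signal(sig_nr, record_signal) != SIG_ERR;
}

/* Called from the main loop, outside signal context.  If a signal was
 * recorded, clears it, runs 'hook', and returns the signal number;
 * otherwise returns 0.  A real implementation would go on to terminate
 * the process after running its hooks. */
int
demo_signal_run(void (*hook)(int sig_nr))
{
    int sig_nr = pending_sig;
    if (sig_nr) {
        pending_sig = 0;
        hook(sig_nr);
    }
    return sig_nr;
}

/* Example hook: free to do work that would be unsafe in a handler. */
void
demo_hook(int sig_nr)
{
    last_hook_sig = sig_nr;
}
```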
|
|
Suggested by Justin Pettit.
|
|
This is a helper function that combines two actions that callers commonly
wanted. It will have an additional user in an upcoming commit.
|
|
This simplifies the code here and should speed it up, too, when there are
lots of files to unlink on a fatal signal.