I'm running a fairly simple bit of test code using libevent2 with epoll and
openssl bufferevents and I've run into a 100% cpu usage problem.
Looking into it 100% usage was caused by epoll_wait constantly
returning write events on the openssl socket when it shouldn't really have
been looking for write events at all (N_ACTIVE_CALLBACKS() was returning 0
also).
Looking a bit deeper eventbuffer_openssl socket seems to be requesting
that the EV_WRITE event be removed when it should, but the event isn't
actually being removed from epoll.
Continuing to follow this I think I've found a bug in
event_changelist_del.
For evpoll event_del calls event_changelist_del which caches the change
which is then actioned later when evpoll_dispatch is called.
In event_changlist_del there is a check so that if the currently changed
action is an add then the cached action is changed to a no-op rather than a
delete (which makes sense). The problem arises if there are more than
two add or delete operations between calls to dispatch, in this case it's
possible that the delete is turned into a no-op when it shouldn't have
been.
For example starting with the event on, a delete followed by an add and
then another delete results in a no-op when it should have been a delete (I
added a fair bit of debug output that seems to confirm this behaviour).
I've applied a small change that checks the original old_event stored with
the change and only converts the delete to a no-op if the event isn't on in
old_event. This seems to have fixed my problem.
The old code had a bug where the 'exact' flag to 1 in
_evbuffer_read_setup_vecs would never actually make the iov_len field
of the last iovec get truncated. This patch fixes that.
Everybody but Linux documents this as taking an int, and Linux is
very tolerant of getting an int instead. If it weren't, everybody
doing fcntl(fd,F_SETFL,O_NONBLOCK) would break, since the glibc
headers define O_NONBLOCK as an int literal.
It's okay for us to get an EPERM when doing an EPOLL_DEL on an fd; it
just means that before we got a chance to the EPOLL_DEL, we closed the
fd and reopened a new non-socket that wound up having the same fd.
Partial fix for Bug 3019973.
There was previously no lock protecting the signal event's
ev_ncalls/ev_pncalls fields, which were accessed by all of
event_signal_closure, event_add_internal, event_del_internal, and
event_active_nolock. This patch fixes this race by using the
current_event_lock in the same way it's used to prevent
event_del_internal from touching an event that's currently running.
The problem was that the thread doing the notification could block on
write in evthread_notify_base_default while holding the th_base_lock.
The main thread would never drain th_notify_fd[0], since it would need
th_base_lock to actually trigger events.
Although bufferevent operations are threadsafe, sometimes you need
to make sure that a few operations on a single bufferevent will all
be executed with nothing intervening. That's what these functions
are for.
Previously, our autogen.sh script wouldn't tell automake to update
older versions of its copied-in scripts, which would cause problems if
they got sufficiently out-of-date.
(The existing implementation had sanity-checking code for the case where
its argument was NULL, but it erroneously dereferenced it before actually
doing the sanity-check. --nickm)
On all the backends on this little mac laptop, that behavior is to
report a remote socket close as both EV_READ and EV_WRITE.
Historically, we had problem for some of these behaviors on some
backends, so let's make sure that such behaviors don't come back.
The old logic made sense back when buffer.c was an enormous linear
buffer, but it doesn't make any sense for the chain-based
implementation.
This patch also refactors the ioctl{socket}? call into its own function.
The current template...
<HTML><HEAD><TITLE>%s</TITLE>
</HEAD><BODY>
<H1>Method Not Implemented</H1>
Invalid method in request<P>
</BODY></HTML>
is highly confusing. The given title is easily overlooked and the
hard-coded content is just plain wrong in most cases (I really read
this as "the server did not understand the requested HTTP method)
This patch changes the template to include the error reason in the
body as well as in the header, and to infer the proper reason from
the status code whenever the reason argument is NULL.
This patch also removes a redundant evhttp_add_header from
evhttp_send_error; evhttp_send_page already adds a "Connection:
close" header.
The default behavior of test.sh was to suppress all output from
test/regress, and say nothing but OKAY or FAILED. This wasn't so good
for getting bugs reported, since lots of people didn't know to set
TEST_OUTPUT_FILE, or re-run ./test/regress on its own.
Now, when you don't specify an output file for test.sh, it runs
regress with the --quiet option. This option makes the unit tests
only print output on failure, which is what we probably wanted.
If Libevent uses strcpy, even safely, it seems OpenBSD's linker will
complain every time a library links Libevent. It's easier just not to
use the old thing, even when it's safe to do so.
The new options let you specify a maximum deviation of bandwidth used
from expected bandwidth used, and make test-ratelim.c exit with a
nonzero status when those deviations are violated.
This patch also adds a test-ratelim.sh script to run test-ratelim with
a few sensible options for testing.
This attribute tells gcc (and anything else that understands gcc
attributes) that the functions will never return control, and helps
the optimizer a little. With luck, it will also tell
less-than-full-program dataflow analysis tools that they don't need to
worry about any code path that involves calling one of these functions
and then returning.
This patch also forces event_exit() to always exit, no matter what the
user-supplied fatal_callback does. This means that the old unit tests
for the event_err* functions don't work any more, since they assume it
is safe to call event_err* if you've given it a bogus fatal_callback
that doesn't exit. Instead, we have to make the unit tests fork
before calling event_err(), and have the main unit test process wait
for the event_err() test to exit with a sane exit code. On unix,
that's trivial. On windows, let's not bother and just assume that
event_err* works.