Gilad Benjamini noted that we check the error code for deleting a
write-timeout event twice, and the read timeout not at all. This
shouldn't be a bit problem, since it's really hard for a delete to
fail on a timeout-only event, but it's worth fixing.
Fixes bug 3046787
Avi Bab correctly noted as bug 3044479 the fact that any thread
blocking on current_event_lock will do so while holding
th_base_lock, making it impossible for the currently running event's
callback to call any other functions that require th_base_lock.
This patch switches the current_event_lock code to instead use a
condition variable that we wait on if we're trying to mess with
a currently-executing event, and that we signal when we're done
executing a callback if anybody is waiting on it.
The interface from the user's POV is similar to the locking
implementation: either provide a structure full of function
pointers, or just call evthread_use_*_threads() and everything will
be okay.
The internal interface is meant to vaguely resemble pthread_cond_*,
which Windows people will better recognize as *ConditionVariable*.
Though the C standards allow it, it's apparently possible to get MSVC
upset by saying "struct { int field; } (declarator);" instead of
"struct {int field; } declarator;", so let's just not do that.
Bugfix for 3044492
(commit msg by nickm)
As a generated file, it shouldn't get included in our source
distribution. Apparently there is an automake incant for this:
nobase_ even stacks with nodist_ .
If there is an event-config.h in include/event2 (either because we
screwed up packaging like in 2.0.6-rc or because we previously tried
building with mingw and we didn't make distclean in the middle), we
want MSVC to find the one one in WIN32-Code/include/event2 first.
Found by Gilad Benjamini.
When we freed a bufferevent that was in a rate-limiting group and
blocked on IO, the process of freeing it caused it to get removed
from the group. But removing the bufferevent from the group made
its limits get removed, which could make it get un-suspended and in
turn cause its events to get re-added. Since we would then
immediately _free_ the events, this would result in dangling
pointers.
Fixes bug 3041007.
If you were to enable USE_DEBUG and slog through all 700+ MB of
debugging output, you'd find that one of the unit tests failed,
since it tested the debug logging code, but the string it expected
and the string it logged differed by a tab vs 2 spaces.
This patch splits the formerly windows-only case of evutil_socketpair()
into an (internal-use-only) function named evutil_ersatz_socketpair(), and
makes it build and work right on non-Windows hosts.
We need this for convenience to test sendfile on solaris, where socketpair
can't give you an AF_INET pair, and sendfile() won't work on AF_UNIX.
If the rate limit was low enough, then the echo_conns wouldn't finish
inside the 300 msec we allowed for them to close. Instead, count the
number of connections we have, and keep waiting until they are all
closed.
When you're doing rate limiting on an openssl connection, you nearly
always want to limit the number of bytes sent and received over the
wire, not the number of bytes read or written over the secure
transport.
To be fair, when char can be signed, if toupper doesn't take negative
characters, toupper(char) is a very bad idea. So let's just use the
nice safe EVUTIL_TOUPPER instead. (It explicitly only upcases ASCII,
but we only use it for identifiers that we know to be ASCII anyway).
The bufferevent_connect_hostname test was specifying AF_INET, but the
gethostbyname test we were using to see what error to expect was using
PF_UNSPEC, leading to possible divergence of results.
I was running into a problem when using bufferevent_openssl with a
very simple echo server. My server simply bufferevent_read_buffer 'd
data into an evbuffer and then passed that evbuffer straight to
bufferevent_write_buffer.
The problem was every now and again the write would fail for no
apparent reason. I tracked it down to SSL_write being called with the
amount of data to send being 0.
This patch alters do_write in bufferevent_openssl so that it skips
io_vecs with 0 length.
Eventually test-changelist should expand to try more cases, maybe
query the status of the actual changelist somehow, and integrate it
with the rest of the unit tests.
Also, add test-changelist to gitignore.
I'm running a fairly simple bit of test code using libevent2 with epoll and
openssl bufferevents and I've run into a 100% cpu usage problem.
Looking into it 100% usage was caused by epoll_wait constantly
returning write events on the openssl socket when it shouldn't really have
been looking for write events at all (N_ACTIVE_CALLBACKS() was returning 0
also).
Looking a bit deeper eventbuffer_openssl socket seems to be requesting
that the EV_WRITE event be removed when it should, but the event isn't
actually being removed from epoll.
Continuing to follow this I think I've found a bug in
event_changelist_del.
For evpoll event_del calls event_changelist_del which caches the change
which is then actioned later when evpoll_dispatch is called.
In event_changlist_del there is a check so that if the currently changed
action is an add then the cached action is changed to a no-op rather than a
delete (which makes sense). The problem arises if there are more than
two add or delete operations between calls to dispatch, in this case it's
possible that the delete is turned into a no-op when it shouldn't have
been.
For example starting with the event on, a delete followed by an add and
then another delete results in a no-op when it should have been a delete (I
added a fair bit of debug output that seems to confirm this behaviour).
I've applied a small change that checks the original old_event stored with
the change and only converts the delete to a no-op if the event isn't on in
old_event. This seems to have fixed my problem.
The old code had a bug where the 'exact' flag to 1 in
_evbuffer_read_setup_vecs would never actually make the iov_len field
of the last iovec get truncated. This patch fixes that.
Everybody but Linux documents this as taking an int, and Linux is
very tolerant of getting an int instead. If it weren't, everybody
doing fcntl(fd,F_SETFL,O_NONBLOCK) would break, since the glibc
headers define O_NONBLOCK as an int literal.
It's okay for us to get an EPERM when doing an EPOLL_DEL on an fd; it
just means that before we got a chance to the EPOLL_DEL, we closed the
fd and reopened a new non-socket that wound up having the same fd.
Partial fix for Bug 3019973.
There was previously no lock protecting the signal event's
ev_ncalls/ev_pncalls fields, which were accessed by all of
event_signal_closure, event_add_internal, event_del_internal, and
event_active_nolock. This patch fixes this race by using the
current_event_lock in the same way it's used to prevent
event_del_internal from touching an event that's currently running.