This is not a perfect fix, but it's much much better than the
current buggy behavior, which could lead to filtering SSL
connections that just stopped reading.
Based on ideas by Maseeb Abdul Qadir and Mark Ellzey.
Previously, if some sender were generating data to read on an
OpenSSL connection as fast as we could process it, we could easily
wind up looping on an openssl do_read operation without ever
considering other sockets.
The difference between this and the original method in
consider_reading() is that it only loops for a single completed
*frame* instead of looping until fd is drained or an error condition
was triggered.
{Patch split out by nickm}
OpenSSL bufferevents with deferred callbacks enabled under high load will
spinlock in the function consider_reading(). This loop continues until all
data has been read.
Because of this condition; openssl bufferevents will never return back into
event_base_loop() until SSL_read has determined data is no longer ready.
As of yet I have not found a reason why this while loop exists, so this patch
just swaps out while for if.
If needed I can write same code which would trigger this effect; optionally
libevhtp has a test.c program which can be run with the following flags:
./test -s <keyfile.pem>
curl -vvvv -k -d@<HUGE_ASS_FILE> https://127.0.0.1:8081/
The return data will include the number of times the readcb got data and the
length of that read.
Without this patch, you are likely to see a small amount of "bytes read....",
otherwise the "bytes read..." return data should show much more reasonable
numbers.
Original mail:
the logic that handles write watermarks in "bio_bufferevent_write"
is not working. It currently doesn't write any data if the high
watermark is *above* the amount of data to write (i.e. when there
is actually enough room available).
The problem was that we were using openssl's BIO code's shutdown flag
whenever BEV_OPT_CLOSE_ON_FREE was set. This made the BIO close the
socket when it was freed... but it would be freed whenever we did a
setfd on the bufferevent_openssl, even the no-op setfd in
bufferevent_connect.
So instead, we just set the shutdown flag to 0, and handle closing the
fd ourselves.
Spotted by Linus Nordberg
Previously, if BEV_OPT_CLOSE_ON_FREE wasn't set on a
bufferevent_filter or a filtering bufferevent_openssl, when we went
to free the filtering bufferevent, we'd leave the underlying
bufferevent unchanged. That's not so good, since the callbacks are
set to activate stuff in the filtering bufferevent that we're about
to free. Instead, set all the callbacks to NULL.
If an SSL connection becamse disabled or suspended before became open,
it could (under the right circumstances) wind up without ever getting
its write callback disabled.
The most correct fix is probably more subtle, and involves checking
all caseswhen a write callback is enabled or disabled. This fix is
more blunt, and explicitly checks whether the callback should have
been disabled at the end of the callback to prevent infinite looping.
Diagnosed with help from Sebastian Hahn
When handshaking, we listen for reads or writes from the
transport. But when we're connected, we start out with writes enabled
and reads disabled, which means we should not have the transport read
for us.
Previously, whenever writing was disabled on a bufferevent_filter (or
a filtering SSL bufferevent), we would stop writing on the underlying
bufferevent. This would make for trouble, though, since if you
implemented common patterns like "stop writing once data X has been
flushed", your bufferevent filter would disable the underlying
bufferevent after the data was flushed to the underlying bufferevent,
but before actually having it written to the network.
Now, we have filters leave their underlying bufferevents enabled for
reading and writing for reading and writing immediately. They are not
disabled, unless the user wants to disable them, which is now allowed.
To handle the case where we want to choke reading on the underlying
bufferevent because the filter no longer wants to read, we use
bufferevent_suspend_read(). This is analogous to the way that we use
bufferevent_suspend_write() to suspend writing on a filtering
bufferevent when the underlying bufferevent's output buffer has hit
its high watermark.
We were looking at the number of bytes read on the wbio, not in the
rbio. But these are usually different BIOs, and the reading is
supposed to happen on the rbio.
When you're doing rate limiting on an openssl connection, you nearly
always want to limit the number of bytes sent and received over the
wire, not the number of bytes read or written over the secure
transport.
I was running into a problem when using bufferevent_openssl with a
very simple echo server. My server simply bufferevent_read_buffer 'd
data into an evbuffer and then passed that evbuffer straight to
bufferevent_write_buffer.
The problem was every now and again the write would fail for no
apparent reason. I tracked it down to SSL_write being called with the
amount of data to send being 0.
This patch alters do_write in bufferevent_openssl so that it skips
io_vecs with 0 length.
The different bufferevent implementations had different behavior for
their timeouts. Some of them kept re-triggering the timeouts
indefinitely; some disabled the event immediately the first time a
timeout triggered. Some of them made the timeouts only count when
the bufferevent was actively trying to read or write; some did not.
The new behavior is modeled after old socket bufferevents, since
they were here first and their behavior is relatively sane.
Basically, each timeout disables the bufferevent's corresponding
read or write operation when it fires. Timeouts are stopped
whenever we suspend writing or reading, and reset whenever we
unsuspend writing or reading. Calling bufferevent_enable resets a
timeout, as does changing the timeout value.
Most of these should be unable to fail, since adding a timeout
generally always works. Still, it's better not to try to be "too
smart for our own good here."
There are some remaining event_add() calls that I didn't add checks
for; I've marked those with "XXXX" comments.
In many places throughout the code, we called _bufferevent_run_eventcb
without checking whether the eventcb was actually set. This would
work fine when the bufferevent's callbacks were deferred, but
otherwise the code would segfault. Strangely, we always remembered to
check before calling the _bufferevent_run_{read,write}cb functions.
To prevent similar errors in the future, all of
_buferevent_run_{read,write,event}cb now check to make sure the
callback is actually set before invoking or deferring the callback.
This patch also removes the now-redundant checks for {read,write}cb.
The fairness algorithms are not the best, not every bufferevent type
is supported, and some of the locking tricks here are simply absurd.
Still, this code should be a good first step.
OpenSSL has a per-thread error stack, and really doesn't like you
leaving errors on the stack. Rather than discard the errors or force
the user to handle them, this patch pulls them off the openssl stack
and puts them on a stack associated with the bufferevent_openssl. If
the user leaves them on the stack then, it won't affect any other
connections.
This bug was found by Roman Puls. Thanks!
svn:r1481
This makes our interfaces usable from C++, which doesn't believe
you can say "bufferevent_socket_nase(base, -1,
BEV_OPT_CLOSE_ON_FREE|BEV_OPT_DEFER_CALLBACKS)" but which instead
would demand "static_cast<bufferevent_options>(BEV_OPT_CLOSE_ON_FREE|
BEV_OPT_DEFER_CALLBACKS))" for the last argument.
Diagnosis and patch from Chris Davis.
svn:r1456