Fixes data stuck in filters with active watermarks.
* be-filter-data-stuck:
test/be_filter: creating test case for data stuck with active watermarks
be_filter: avoid data stuck under active watermarks
Suppose we have a bufferevent filter attached to a bufferevent socket, and the
read high watermark of the bufferevent filter is set to 4096 bytes.
The socket receives 4343 bytes. Because of the watermark, 4096 bytes are
transferred from the socket's input buffer to the filter's input buffer, and
247 bytes are left in the bufferevent socket.
Suppose that no more data is received through the socket.
At this point the 247 bytes will sit forever in the input buffer of the
bufferevent socket.
This patch solves the issue by registering a read callback on the filter's
input buffer when it reaches its read high watermark and data was left in the
underlying bufferevent's input buffer.
The read callback calls the filter's input-processing function as soon as the
filter's input buffer falls below its read high watermark and there is still
data left in the underlying input buffer. The callback is deregistered as soon
as the filter's input buffer falls below its read high watermark.
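The scenario above can be modelled without libevent. The following is only a toy sketch under stated assumptions, not the be_filter code: `struct toy_filter`, `toy_process_input()`, `toy_socket_receive()` and `toy_drain()` are hypothetical names, buffers are reduced to byte counters, and the registered read callback is reduced to a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a filter stacked on a socket bufferevent (hypothetical names). */
struct toy_filter {
	size_t underlying_len;  /* bytes waiting in the socket's input buffer */
	size_t filter_len;      /* bytes in the filter's input buffer */
	size_t read_high;       /* read high watermark of the filter */
	int refill_cb_armed;    /* stands in for the registered read callback */
};

/* Move as much data as the watermark allows from the socket to the filter. */
static void toy_process_input(struct toy_filter *f)
{
	size_t room, n;

	assert(f->filter_len <= f->read_high);
	room = f->read_high - f->filter_len;
	n = f->underlying_len < room ? f->underlying_len : room;
	f->underlying_len -= n;
	f->filter_len += n;
	/* Arm the callback only if the filter is full and data was left behind. */
	f->refill_cb_armed =
	    (f->filter_len == f->read_high && f->underlying_len > 0);
}

/* The socket delivered n bytes. */
static void toy_socket_receive(struct toy_filter *f, size_t n)
{
	f->underlying_len += n;
	toy_process_input(f);
}

/* The application consumes up to n bytes from the filter's input buffer.
 * with_fix selects whether the armed callback is honoured. */
static void toy_drain(struct toy_filter *f, size_t n, int with_fix)
{
	if (n > f->filter_len)
		n = f->filter_len;
	f->filter_len -= n;
	if (with_fix && f->refill_cb_armed && f->filter_len < f->read_high) {
		/* Callback fires: deregister, then pull the leftover bytes. */
		f->refill_cb_armed = 0;
		toy_process_input(f);
	}
}
```

With a watermark of 4096 and 4343 received bytes, draining the filter with `with_fix` set moves the remaining 247 bytes into the filter; with `with_fix` cleared they stay in the underlying counter, which is the stuck state described above.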
Here's some fun. From `bufferevent.h`:
```
#define BEV_EVENT_READING 0x01 /**< error encountered while reading */
#define BEV_EVENT_WRITING 0x02 /**< error encountered while writing */
```
And from `event.h`:
```
/** Wait for a socket or FD to become readable */
#define EV_READ 0x02
/** Wait for a socket or FD to become writeable */
#define EV_WRITE 0x04
```
Library users have to be very careful to get this right; it turns out, the
library itself got this wrong in the `bufferevent_pair` code. It appears that
in most of the code, only `BEV_EVENT_FINISHED` will indicate whether it's read
or write; on error or timeout, it appears that "both" is assumed and not set in
the callback. I read through all the other places where `BEV_EVENT_FINISHED` is
passed to an event callback; it appears that the pair code is the only spot
that got it wrong.
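To make the trap concrete, here is a small sketch. The four macro values are copied from the headers quoted above (real code would include the libevent headers instead of redefining them); `ev_to_bev_direction()` is a hypothetical helper, not a libevent function.

```c
#include <assert.h>

/* Values copied from the header excerpts quoted above. */
#define EV_READ            0x02
#define EV_WRITE           0x04
#define BEV_EVENT_READING  0x01
#define BEV_EVENT_WRITING  0x02

/* Hypothetical helper: translate an EV_READ/EV_WRITE direction mask into
 * the matching BEV_EVENT_READING/BEV_EVENT_WRITING bits. */
static short ev_to_bev_direction(short ev_flags)
{
	short bev_flags = 0;

	/* Only direction bits are accepted in this sketch. */
	assert((ev_flags & ~(EV_READ | EV_WRITE)) == 0);
	if (ev_flags & EV_READ)
		bev_flags |= BEV_EVENT_READING;
	if (ev_flags & EV_WRITE)
		bev_flags |= BEV_EVENT_WRITING;
	return bev_flags;
}
```

Note that `EV_READ` and `BEV_EVENT_WRITING` share the value 0x02, so passing an `EV_*` flag where a `BEV_EVENT_*` flag is expected silently flips the direction instead of failing.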
azat: add TT_FORK to avoid breaking clean env, and rebase commit message
(copied from #359)
Fixes: #359
@EMPanisset reported a problem (#358) with evbuffer_remove_buffer(), but
I think the problem is actually in evbuffer_add_buffer(), which introduces
this empty chain. All other callers should be safe (except
evbuffer_prepend_buffer(), but it doesn't have this problem).
FWIW, the only API that allows empty chains is evbuffer_add_reference(). We
could add a check there to avoid such issues, but for now I have left it
unfixed, since I think that evbuffer_add_reference() with empty chains can be
used as a barrier (though this can be tricky).
Fixes: regress evbuffer/remove_buffer_with_empty2
v2: introduce/fixes evbuffer/add_buffer_with_empty
Using:
- evbuffer_add()
- evbuffer_add_buffer() -- the one that has problem
- evbuffer_add_reference() -- the only one that allows empty chains to be added
- evbuffer_remove_buffer()
This increases libevent coverage to:
- os:osx
- cmake -DEVENT__DISABLE_MM_REPLACEMENT=ON
- cmake -DEVENT__ENABLE_VERBOSE_DEBUG=ON
- configure --disable-openssl
- configure --disable-thread-support
- configure --disable-malloc-replacement
- fix travis-ci builds under automake >1.11
Possible failures after this patch set (not always; IOW in some builds these
issues aren't real issues):
- some failures but mostly because of timing issues, must be fixed separately.
- https://travis-ci.org/azat/libevent/jobs/129430229 # on brew update
- https://travis-ci.org/azat/libevent/jobs/129430221 # some locking issues
* travis-ci-os-matrix-v2:
automake: define serial-tests only if automake has this option
test/automake: don't use parallel test harness (since automake 1.12)
travis-ci/osx: relink gcc/g++ instead of clang
travis-ci: enable multi-os mode (osx, linux)
travis-ci: increase matrix (--disable-foo)
travis-ci: adjust alignment
Fixes: #356
Travis-CI: https://travis-ci.org/azat/libevent/builds/129430181
Starting from automake 1.12 there is a parallel test harness that redirects all
output to a log file, which the serial test harness doesn't do.
So with the new runner we can get no output for 10 minutes, for example on my
desktop:
$ time make verify VERBOSE=1
PASS: test/test-script.sh
============================================================================
Testsuite summary for libevent 2.1.5-beta
============================================================================
# TOTAL: 1
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
real 25m31.735s
user 0m13.753s
sys 0m7.648s
This means that it will fail on travis-ci, since travis-ci has a 10-minute
timeout. Sure, we could use `travis wait 60` instead, but I think it is
better to fix this by writing the results to the output instead of hacking
around it, so let's always use serial-tests instead of the parallel harness.
And now it works on travis-ci under linux because it has automake 1.11 while
osx has at least 1.12.
Links:
https://docs.travis-ci.com/user/common-build-problems/
https://www.gnu.org/software/automake/manual/html_node/Serial-Test-Harness.html#Serial-Test-Harness
https://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html
CI:
https://travis-ci.org/azat/libevent/jobs/129171497 # ok on linux
https://travis-ci.org/azat/libevent/jobs/129171532 # no output for 10 min on osx
The referenced commit added new *.pc files, and I think it is better to ignore
them all.
Refs: b8d7c6211a965c19c7c5de414135ff13b5fa2476 ("libevent_core and
libevent_extra also deserve a pkgconfig file")
Without this patch:
$ regress --no-fork +listener/error_unlock
listener/error_unlock: [warn] Error from accept() call: Too many open files
[err] ../evthread.c:220: Assertion lock->count == 0 failed in ../evthread.c
Aborted (core dumped)
Fixes: #341
Fixes: listener/error_unlock
* origin/pr/339:
evbuffer_add: Use last_with_datap if set, not last.
test/regress: add tests for evbuffer_add() breakage on empty last chain
Fixes: #335
evbuffer_add() would always put data in the last chain, even if there
was available space in a previous chain, and in doing so it also
failed to update last_with_datap, causing subsequent calls to other
functions that do look at last_with_datap to add data in the middle
of the evbuffer instead of at the end.
Fixes the evbuffer_add() part of issue #335, and the evbuffer/add2 and
evbuffer/add3 tests, and also prevents wasting space available in the
chain pointed to by last_with_datap.
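The ordering problem described above can be shown with a toy two-chain buffer. This is only a sketch under stated assumptions: `struct toy_buf` and all `toy_*` functions are hypothetical and far simpler than the real evbuffer (fixed-size chains, no splitting of data across chains); they only mirror the roles of `last` and `last_with_datap`.

```c
#include <assert.h>
#include <string.h>

#define TOY_CHAIN_CAP 8

/* Toy two-chain buffer (hypothetical; not the real struct evbuffer). */
struct toy_chain {
	char data[TOY_CHAIN_CAP];
	size_t used;
};

struct toy_buf {
	struct toy_chain chains[2];
	int last;            /* index of the last chain in the list */
	int last_with_data;  /* index of the last chain that holds data */
};

static void toy_chain_append(struct toy_chain *c, const char *s, size_t n)
{
	assert(c->used + n <= TOY_CHAIN_CAP);
	memcpy(c->data + c->used, s, n);
	c->used += n;
}

/* Old behaviour as described: always write to the last chain and
 * leave last_with_data untouched. */
static void toy_add_buggy(struct toy_buf *b, const char *s)
{
	toy_chain_append(&b->chains[b->last], s, strlen(s));
}

/* Behaviour after the fix as described: prefer the last_with_data chain. */
static void toy_add_fixed(struct toy_buf *b, const char *s)
{
	size_t n = strlen(s);
	struct toy_chain *c = &b->chains[b->last_with_data];

	if (c->used + n > TOY_CHAIN_CAP) {
		/* No room: use the last chain and record that it holds data. */
		b->last_with_data = b->last;
		c = &b->chains[b->last];
	}
	toy_chain_append(c, s, n);
}

/* Stand-in for the other functions that do look at last_with_datap. */
static void toy_add_at_last_with_data(struct toy_buf *b, const char *s)
{
	toy_chain_append(&b->chains[b->last_with_data], s, strlen(s));
}

/* Concatenate the chains in list order into out (NUL-terminated);
 * out must hold at least 2 * TOY_CHAIN_CAP + 1 bytes. */
static void toy_linearize(const struct toy_buf *b, char *out)
{
	size_t off = 0;

	for (int i = 0; i <= b->last; i++) {
		memcpy(out + off, b->chains[i].data, b->chains[i].used);
		off += b->chains[i].used;
	}
	out[off] = '\0';
}
```

Starting from a first chain holding "abc" and an empty last chain, adding "XY" with `toy_add_buggy()` and then "Z" with `toy_add_at_last_with_data()` leaves "Z" in the middle of the data, while the same sequence with `toy_add_fixed()` keeps the bytes in insertion order.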
The evbuffer/add* tests currently break on 2.0.21, 2.0.22 and 2.1 HEAD
due to issue #335. The evbuffer/reference2 test breaks on 2.0.21 and
2.0.22 due to commit b18c04dd not being applied.
../http.c:589:6: warning: logical not is only applied to the left hand side of this comparison [-Wlogical-not-parentheses]
if (!req->kind == EVHTTP_REQUEST || !REQ_VERSION_ATLEAST(req, 1, 1))
^ ~~
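The warning is about operator precedence: `!x == y` parses as `(!x) == y`, not `!(x == y)`. The sketch below uses a hypothetical enum with non-zero values so that the two readings visibly disagree; whether they disagree in the real code depends on the actual enum values, which are not shown here.

```c
#include <assert.h>

/* Hypothetical stand-in for a request kind; values chosen to be non-zero. */
enum toy_kind { TOY_KIND_A = 1, TOY_KIND_B = 2 };

/* What the compiler actually evaluates for `!kind == TOY_KIND_A`. */
static int is_not_a_as_parsed(enum toy_kind kind)
{
	return (!kind) == TOY_KIND_A;
}

/* What was presumably intended. */
static int is_not_a_intended(enum toy_kind kind)
{
	return kind != TOY_KIND_A;
}

/* Small self-check showing where the two forms differ. */
static void toy_demo(void)
{
	/* !2 is 0, and 0 == 1 is false, so the parsed form never fires here. */
	assert(is_not_a_as_parsed(TOY_KIND_B) == 0);
	assert(is_not_a_intended(TOY_KIND_B) == 1);
}
```

Writing the comparison as `kind != TOY_KIND_A` (or parenthesising it as `!(kind == TOY_KIND_A)`) removes both the ambiguity and the warning.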
From #332:
Here follows a bug report by **Guido Vranken** via the _Tor bug bounty program_. Please credit Guido accordingly.
## Bug report
The DNS code of Libevent contains this rather obvious OOB read:
```c
static char *
search_make_new(const struct search_state *const state, int n, const char *const base_name) {
const size_t base_len = strlen(base_name);
const char need_to_append_dot = base_name[base_len - 1] == '.' ? 0 : 1;
```
If the length of ```base_name``` is 0, then line 3125 (the ```base_name[base_len - 1]``` access) reads 1 byte before the buffer. This will trigger a crash on ASAN-protected builds.
To reproduce:
Build libevent with ASAN:
```
$ CFLAGS='-fomit-frame-pointer -fsanitize=address' ./configure && make -j4
```
Put the attached ```resolv.conf``` and ```poc.c``` in the source directory and then do:
```
$ gcc -fsanitize=address -fomit-frame-pointer poc.c .libs/libevent.a
$ ./a.out
=================================================================
==22201== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60060000efdf at pc 0x4429da bp 0x7ffe1ed47300 sp 0x7ffe1ed472f8
READ of size 1 at 0x60060000efdf thread T0
```
P.S. we can add a check earlier, but since this is very uncommon, I didn't add it.
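As an illustration of the kind of guard that avoids the out-of-bounds read, here is a minimal sketch. `needs_trailing_dot()` is a hypothetical helper, not part of evdns, and treating an empty name as "needs a dot" is an assumption made for the example; it is not necessarily what the actual patch does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a '.' would have to be appended to
 * base_name, 0 if it already ends in one. */
static int needs_trailing_dot(const char *base_name)
{
	size_t base_len;

	assert(base_name != NULL);
	base_len = strlen(base_name);
	if (base_len == 0)
		return 1;  /* assumption: an empty name does not end in '.' */
	/* Safe: base_len >= 1, so the index cannot underflow. */
	return base_name[base_len - 1] != '.';
}
```

The important part is that `base_len - 1` is only used as an index once `base_len` is known to be non-zero.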
Fixes: #332
It seems that the hack of filling the BACKLOG doesn't work on win32, so we
end up waiting in write(), not in connect().
And:
$ time regress http/cancel_server_timeout
- on linux: 10secs
- on win32: 2-5secs
I tried to debug this, but you can't sniff TCP packets (wireshark/rawpcap) on
localhost in windows xp (according to [RAWPCAP] and my testing).
RAWPCAP: http://www.netresec.com/?page=RawCap
http/cancel_by_host_no_ns:
OK ../test/regress_http.c:1384: assert(regress_dnsserver(data->base, &portnum, search_table))
OK ../test/regress_http.c:1387: assert(dns_base)
OK ../test/regress_http.c:1423: assert(evcon)
OK ../test/regress_http.c:1444: assert(evhttp_make_request(evcon, req, EVHTTP_REQ_GET, "/delay") != -1): 0 vs -1
OK ../test/regress_http.c:1455: assert(test_ok == 2): 2 vs 2
OK ../test/regress_http.c:1480: assert(evhttp_make_request(evcon, req, EVHTTP_REQ_GET, "/test") != -1): 0 vs -1
[msg] Nameserver 127.0.0.1:55948 has failed: request timed out.
[msg] All nameservers have failed
OK ../test/regress_http.c:1274: assert(!req)
OK ../test/regress_http.c:1505: assert(evhttp_make_request(evcon, req, EVHTTP_REQ_GET, "/test") != -1): 0 vs -1
OK ../test/regress_http.c:1274: assert(!req)
==19199== Invalid read of size 8
==19199== at 0x4CC285: evdns_cancel_request (evdns.c:2849)
==19199== by 0x4CEDB2: evdns_nameserver_free (evdns.c:4018)
==19199== by 0x4CEF5B: evdns_base_free_and_unlock (evdns.c:4052)
==19199== by 0x4CF13B: evdns_base_free (evdns.c:4088)
==19199== by 0x4617A3: http_cancel_test (regress_http.c:1518)
==19199== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==19199== by 0x490D5A: testcase_run_one (tinytest.c:252)
==19199== by 0x491699: tinytest_main (tinytest.c:434)
==19199== by 0x47E0E0: main (regress_main.c:461)
==19199== Address 0x61e56d0 is 0 bytes inside a block of size 48 free'd
==19199== at 0x4C2AE6B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19199== by 0x4AAFFF: event_mm_free_ (event.c:3516)
==19199== by 0x4C5ADD: request_finished (evdns.c:693)
==19199== by 0x4CEE95: evdns_base_free_and_unlock (evdns.c:4040)
==19199== by 0x4CF13B: evdns_base_free (evdns.c:4088)
==19199== by 0x4617A3: http_cancel_test (regress_http.c:1518)
==19199== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==19199== by 0x490D5A: testcase_run_one (tinytest.c:252)
==19199== by 0x491699: tinytest_main (tinytest.c:434)
==19199== by 0x47E0E0: main (regress_main.c:461)
==19199== Block was alloc'd at
==19199== at 0x4C2BBD5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19199== by 0x4AAEB2: event_mm_calloc_ (event.c:3459)
==19199== by 0x4CAAA2: nameserver_send_probe (evdns.c:2327)
==19199== by 0x4C50FF: nameserver_prod_callback (evdns.c:494)
==19199== by 0x4A564C: event_process_active_single_queue (event.c:1646)
==19199== by 0x4A5B95: event_process_active (event.c:1738)
==19199== by 0x4A6296: event_base_loop (event.c:1961)
==19199== by 0x4A5C1D: event_base_dispatch (event.c:1772)
==19199== by 0x46172C: http_cancel_test (regress_http.c:1507)
==19199== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==19199== by 0x490D5A: testcase_run_one (tinytest.c:252)
==19199== by 0x491699: tinytest_main (tinytest.c:434)
==19199==
This patch set fixes bufferevent handling of http request cancellations (during
connect() and during a dns-request). It survives the tests, including cancel..
with --no-fork, so it should be OK (though I have one patch for the dns layer
pending).
I'm not sure about the cancel.. unit tests on win32; I will fix or disable them
later if they behave differently (plus maybe we should make them skip-by-default?).
Fixes: #333
* bufev-cancellations-v5:
http: set fd to -1 unconditionally, to avoid leaking of DNS requests
test/http: cover NS timed out during request cancellations separately
http: avoid leaking of fd in evhttp_connection_free()
http: get fd from be layer during connection reset
be_sock: cancel in-progress dns requests
evdns: export cancel via callbacks in util (like async lib core/extra issues)
test/http: request cancellation with resolving/{conn,write}-timeouts in progress
Otherwise:
http/cancel_by_host_ns_timeout_inactive_server: [msg] Nameserver 127.0.0.1:37035 has failed: request timed out.
[msg] All nameservers have failed
OK
1 tests ok. (0 skipped)
==26211==
==26211== FILE DESCRIPTORS: 3 open at exit.
==26211== Open file descriptor 2: /dev/pts/47
==26211== <inherited from parent>
==26211==
==26211== Open file descriptor 1: /dev/pts/47
==26211== <inherited from parent>
==26211==
==26211== Open file descriptor 0: /dev/pts/47
==26211== <inherited from parent>
==26211==
==26211==
==26211== HEAP SUMMARY:
==26211== in use at exit: 1,112 bytes in 5 blocks
==26211== total heap usage: 149 allocs, 144 frees, 18,826 bytes allocated
==26211==
==26211== 40 bytes in 1 blocks are indirectly lost in loss record 1 of 5
==26211== at 0x4C2BBD5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26211== by 0x4AAEB2: event_mm_calloc_ (event.c:3459)
==26211== by 0x498F5B: evbuffer_add_cb (buffer.c:3309)
==26211== by 0x4A0EF5: bufferevent_socket_new (bufferevent_sock.c:366)
==26211== by 0x4BFADF: evhttp_connection_base_bufferevent_new (http.c:2375)
==26211== by 0x4BFC8F: evhttp_connection_base_new (http.c:2427)
==26211== by 0x460DAA: http_cancel_test (regress_http.c:1417)
==26211== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==26211== by 0x490D5A: testcase_run_one (tinytest.c:252)
==26211== by 0x491699: tinytest_main (tinytest.c:434)
==26211== by 0x47E0E0: main (regress_main.c:461)
==26211==
==26211== 136 bytes in 1 blocks are indirectly lost in loss record 2 of 5
==26211== at 0x4C2BBD5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26211== by 0x4AAEB2: event_mm_calloc_ (event.c:3459)
==26211== by 0x491FF0: evbuffer_new (buffer.c:365)
==26211== by 0x49A1BE: bufferevent_init_common_ (bufferevent.c:300)
==26211== by 0x4A0E44: bufferevent_socket_new (bufferevent_sock.c:353)
==26211== by 0x4BFADF: evhttp_connection_base_bufferevent_new (http.c:2375)
==26211== by 0x4BFC8F: evhttp_connection_base_new (http.c:2427)
==26211== by 0x460DAA: http_cancel_test (regress_http.c:1417)
==26211== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==26211== by 0x490D5A: testcase_run_one (tinytest.c:252)
==26211== by 0x491699: tinytest_main (tinytest.c:434)
==26211== by 0x47E0E0: main (regress_main.c:461)
==26211==
==26211== 136 bytes in 1 blocks are indirectly lost in loss record 3 of 5
==26211== at 0x4C2BBD5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26211== by 0x4AAEB2: event_mm_calloc_ (event.c:3459)
==26211== by 0x491FF0: evbuffer_new (buffer.c:365)
==26211== by 0x49A1FB: bufferevent_init_common_ (bufferevent.c:305)
==26211== by 0x4A0E44: bufferevent_socket_new (bufferevent_sock.c:353)
==26211== by 0x4BFADF: evhttp_connection_base_bufferevent_new (http.c:2375)
==26211== by 0x4BFC8F: evhttp_connection_base_new (http.c:2427)
==26211== by 0x460DAA: http_cancel_test (regress_http.c:1417)
==26211== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==26211== by 0x490D5A: testcase_run_one (tinytest.c:252)
==26211== by 0x491699: tinytest_main (tinytest.c:434)
==26211== by 0x47E0E0: main (regress_main.c:461)
==26211==
==26211== 536 bytes in 1 blocks are indirectly lost in loss record 4 of 5
==26211== at 0x4C2BBD5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26211== by 0x4AAEB2: event_mm_calloc_ (event.c:3459)
==26211== by 0x4A0E15: bufferevent_socket_new (bufferevent_sock.c:350)
==26211== by 0x4BFADF: evhttp_connection_base_bufferevent_new (http.c:2375)
==26211== by 0x4BFC8F: evhttp_connection_base_new (http.c:2427)
==26211== by 0x460DAA: http_cancel_test (regress_http.c:1417)
==26211== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==26211== by 0x490D5A: testcase_run_one (tinytest.c:252)
==26211== by 0x491699: tinytest_main (tinytest.c:434)
==26211== by 0x47E0E0: main (regress_main.c:461)
==26211==
==26211== 1,112 (264 direct, 848 indirect) bytes in 1 blocks are definitely lost in loss record 5 of 5
==26211== at 0x4C2BBD5: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26211== by 0x4AAEB2: event_mm_calloc_ (event.c:3459)
==26211== by 0x4D0564: evdns_getaddrinfo (evdns.c:4685)
==26211== by 0x4B13BA: evutil_getaddrinfo_async_ (evutil.c:1575)
==26211== by 0x4A139E: bufferevent_socket_connect_hostname (bufferevent_sock.c:524)
==26211== by 0x4C02DB: evhttp_connection_connect_ (http.c:2588)
==26211== by 0x4C04DD: evhttp_make_request (http.c:2643)
==26211== by 0x4615FF: http_cancel_test (regress_http.c:1504)
==26211== by 0x490A78: testcase_run_bare_ (tinytest.c:105)
==26211== by 0x490D5A: testcase_run_one (tinytest.c:252)
==26211== by 0x491699: tinytest_main (tinytest.c:434)
==26211== by 0x47E0E0: main (regress_main.c:461)
==26211==
==26211== LEAK SUMMARY:
==26211== definitely lost: 264 bytes in 1 blocks
==26211== indirectly lost: 848 bytes in 4 blocks
==26211== possibly lost: 0 bytes in 0 blocks
==26211== still reachable: 0 bytes in 0 blocks
==26211== suppressed: 0 bytes in 0 blocks
We close the fd there if BEV_OPT_CLOSE_ON_FREE is not set, and evcon->fd can
be incorrect (not -1), so get the fd from the underlying bufferevent instead
to fix this.
And after this patch the following tests report 0 instead of 2307 fd leaks:
$ valgrind --leak-check=full --show-reachable=yes --track-fds=yes --error-exitcode=1 regress --no-fork http/cancel..
==11299== FILE DESCRIPTORS: 3 open at exit.
And this is stdin/stderr/stdout.
The fd can be something other than -1, in which case we must close it; otherwise we will have problems.
After this patch the following tests report 2307 instead of 2309 fd leaks:
$ valgrind --leak-check=full --show-reachable=yes --track-fds=yes --error-exitcode=1 regress --no-fork http/cancel..
==10853== FILE DESCRIPTORS: 2307 open at exit.