The following code sample will use a freed lock:
evthread_use_pthreads();
...
assert(!bufferevent_pair_new(base, BEV_OPT_THREADSAFE, pair));
...
bufferevent_free(pair[0]); # refcnt == 0 -> unlink
bufferevent_free(pair[1]); # refcnt == 0 -> unlink
...
event_base_free() -> finalizers -> EVTHREAD_FREE_LOCK(bev1->lock)
-> BEV_LOCK(bev2->lock) <-- *already freed*
Whereas if you reverse the order:
bufferevent_free(pair[1]); # refcnt == 0 -> unlink
bufferevent_free(pair[0]); # refcnt == 0 -> unlink
...
event_base_free() -> finalizers -> BEV_LOCK(bev2->lock)/!own_lock/BEV_UNLOCK(bev2->lock)
-> EVTHREAD_FREE_LOCK(bev1->lock) (own_lock)
The reversed order works, but it would be better not to depend on the
order in which the two halves of a pair are freed.
For this fix, we need to make sure that passing too-large inputs to
the evbuffer functions can't make us do bad things with the heap.
Also, lower the maximum chunk size to the smaller of the off_t and
size_t maximums.
This is necessary since otherwise we could get into an infinite loop
if we make a chunk that 'misalign' cannot index into.
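A minimal sketch of the kind of check described above, not the actual
evbuffer code: before growing a buffer, verify that the new total neither
wraps around nor exceeds a maximum chunk size. The names OFF_MAX_SKETCH,
CHUNK_MAX_SKETCH and checked_total_len are hypothetical, and INT64_MAX
stands in for the off_t maximum as an assumption about the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: off_t is a signed 64-bit type on this platform. */
#define OFF_MAX_SKETCH ((uint64_t)INT64_MAX)

/* Hypothetical cap: the smaller of the size_t and off_t maximums. */
#define CHUNK_MAX_SKETCH \
	((uint64_t)SIZE_MAX < OFF_MAX_SKETCH ? \
	    (size_t)SIZE_MAX : (size_t)OFF_MAX_SKETCH)

/* Store cur + add in *out and return 0, or return -1 if the sum would
 * overflow size_t or exceed the cap. */
static int
checked_total_len(size_t cur, size_t add, size_t *out)
{
	assert(out != NULL);
	if (cur > CHUNK_MAX_SKETCH || add > CHUNK_MAX_SKETCH - cur)
		return -1;
	*out = cur + add;
	return 0;
}
```

The comparison is written as a subtraction from the cap so that the check
itself cannot overflow.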
This fixes the following problems in the shared library build:
* visibility=hidden was not enabled for gcc because of incorrect variable name
* test programs that need internal APIs caused link errors
There is a race between manual event_active and natural event
activation. If both happen at the same time on the same FD, they would
both be protected by the same event base lock, except for one line of
code where the fields of struct event are read without any kind of lock.
This commit reads those fields into local variables inside the lock and
then invokes the callback with those local arguments outside the lock.
In 2.0-stable, none of this is inside the lock; in HEAD, only the
callback is read inside the lock. This change reads the callback and all
3 arguments inside the lock before calling it outside the lock.
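The pattern can be sketched as follows. This is only an illustration
under assumptions, not the event.c code: struct ev_sketch, base_lock,
run_callback_sketch and record_cb are hypothetical names, and a plain
pthread mutex stands in for the event base lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of struct event. */
struct ev_sketch {
	void (*cb)(int fd, short what, void *arg);
	int fd;
	short res;
	void *arg;
};

/* Stand-in for the event base lock. */
static pthread_mutex_t base_lock = PTHREAD_MUTEX_INITIALIZER;

/* Snapshot the callback and its three arguments under the lock, then
 * invoke the callback after the lock has been released. */
static void
run_callback_sketch(struct ev_sketch *ev)
{
	void (*cb)(int, short, void *);
	int fd;
	short res;
	void *arg;

	assert(ev != NULL);
	pthread_mutex_lock(&base_lock);
	cb = ev->cb;
	fd = ev->fd;
	res = ev->res;
	arg = ev->arg;
	pthread_mutex_unlock(&base_lock);

	if (cb != NULL)
		cb(fd, res, arg);
}

/* Example callback that records what it was called with. */
struct seen_sketch { int fd; short res; int calls; };

static void
record_cb(int fd, short what, void *arg)
{
	struct seen_sketch *s = arg;
	s->fd = fd;
	s->res = what;
	s->calls++;
}
```

Because the callback only ever sees the local copies, a concurrent writer
that holds the lock cannot change the arguments halfway through the call.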
CMAKE_MODULE_PATH is usually a list rather than a single entry,
especially for projects that contain CMake sub-projects. This patch
replaces CMAKE_MODULE_PATH with a fixed path when locating the `.in`
file.
If some open() is executed between these two close() calls (close(F),
close(F)), then the second one will close the newly opened fd.
Reported-by: xujiezhige@163.com
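One common way to make a repeated close harmless is to invalidate the
descriptor after the first close. The helper below is a hypothetical
sketch of that idea (close_once_sketch is not a libevent function, and
this is not necessarily how the patch itself fixes the problem).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Close *fdp at most once. After the first call the stored descriptor
 * is -1, so a repeated call cannot close an unrelated fd that happens
 * to have been given the same number by a later open(). */
static int
close_once_sketch(int *fdp)
{
	int r = 0;

	assert(fdp != NULL);
	if (*fdp >= 0) {
		r = close(*fdp);
		*fdp = -1;
	}
	return r;
}
```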
In evdns_request_timeout_callback(), when we are giving up, we call
request_finished(), which will free() the req structure. However, we
still need ns from it in order to fail it, so save the pointer to ns
first and call nameserver_failed() on the saved pointer.
Found with valgrind:
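The shape of the fix, reduced to a self-contained sketch. All of the
*_sketch names are hypothetical stand-ins for the evdns structures and
functions mentioned above, and the real functions take more arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced versions of the evdns types. */
struct ns_sketch { int failed; };
struct req_sketch { struct ns_sketch *ns; };

/* Stand-in for request_finished(): releases the request. */
static void
request_finished_sketch(struct req_sketch *req)
{
	free(req);
}

/* Stand-in for nameserver_failed(): marks the nameserver as failed. */
static void
nameserver_failed_sketch(struct ns_sketch *ns)
{
	ns->failed++;
}

static void
give_up_sketch(struct req_sketch *req)
{
	struct ns_sketch *ns;

	assert(req != NULL);
	/* Copy the pointer out first: req must not be read once
	 * request_finished_sketch() has freed it. */
	ns = req->ns;
	request_finished_sketch(req);
	nameserver_failed_sketch(ns);
}
```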
$ valgrind regress dns/retry
==10497== Memcheck, a memory error detector
==10497== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==10497== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==10497== Command: regress dns/retry
==10497==
dns/retry: [forking] ==10498== Invalid read of size 8
==10498== at 0x4C309D: evdns_request_timeout_callback (evdns.c:2179)
==10498== by 0x49EA95: event_process_active_single_queue (event.c:1576)
==10498== by 0x49EFDD: event_process_active (event.c:1668)
==10498== by 0x49F6DD: event_base_loop (event.c:1891)
==10498== by 0x49F063: event_base_dispatch (event.c:1702)
==10498== by 0x44C7F1: dns_retry_test_impl (regress_dns.c:724)
==10498== by 0x44CF60: dns_retry_test (regress_dns.c:749)
==10498== by 0x48A8A1: testcase_run_bare_ (tinytest.c:105)
==10498== by 0x48A94E: testcase_run_forked_ (tinytest.c:189)
==10498== by 0x48AB73: testcase_run_one (tinytest.c:247)
==10498== by 0x48B4C2: tinytest_main (tinytest.c:434)
==10498== by 0x477FC7: main (regress_main.c:459)
==10498== Address 0x6176ef8 is 40 bytes inside a block of size 342 free'd
==10498== at 0x4C29E90: free (vg_replace_malloc.c:473)
==10498== by 0x4A4411: event_mm_free_ (event.c:3443)
==10498== by 0x4BE8C5: request_finished (evdns.c:702)
==10498== by 0x4C3098: evdns_request_timeout_callback (evdns.c:2178)
==10498== by 0x49EA95: event_process_active_single_queue (event.c:1576)
==10498== by 0x49EFDD: event_process_active (event.c:1668)
==10498== by 0x49F6DD: event_base_loop (event.c:1891)
==10498== by 0x49F063: event_base_dispatch (event.c:1702)
==10498== by 0x44C7F1: dns_retry_test_impl (regress_dns.c:724)
==10498== by 0x44CF60: dns_retry_test (regress_dns.c:749)
==10498== by 0x48A8A1: testcase_run_bare_ (tinytest.c:105)
==10498== by 0x48A94E: testcase_run_forked_ (tinytest.c:189)
==10498==
==10498==
==10498== HEAP SUMMARY:
==10498== in use at exit: 0 bytes in 0 blocks
==10498== total heap usage: 83 allocs, 83 frees, 10,020 bytes allocated
==10498==
==10498== All heap blocks were freed -- no leaks are possible
==10498==
==10498== For counts of detected and suppressed errors, rerun with: -v
==10498== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
OK
1 tests ok. (0 skipped)
==10497==
==10497== HEAP SUMMARY:
==10497== in use at exit: 0 bytes in 0 blocks
==10497== total heap usage: 3 allocs, 3 frees, 96 bytes allocated
==10497==
==10497== All heap blocks were freed -- no leaks are possible
==10497==
==10497== For counts of detected and suppressed errors, rerun with: -v
==10497== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Bug was introduced in 97c750d6602517f22a1100f16592b421c38f2a45 ("evdns:
fail ns after we are failing/retrasmitting request").
This will fix some invalid read/write:
==556== Invalid read of size 8
==556== at 0x4E4EEC6: event_queue_remove_timeout (minheap-internal.h:178)
==556== by 0x4E508AA: event_del_nolock_ (event.c:2764)
==556== by 0x4E53535: event_base_loop (event.c:3088)
==556== by 0x406FCFA: dispatch (libcrawl.c:271)
==556== by 0x402863: main (crawler.c:49)
==556== Address 0x68a3f18 is 152 bytes inside a block of size 400 free'd
==556== at 0x4C29C97: free (in /usr/local/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==556== by 0x406F140: renew (libcrawl.c:625)
==556== by 0x4E6CDE9: evhttp_connection_cb_cleanup (http.c:1331)
==556== by 0x4E6E2B2: evhttp_connection_cb (http.c:1424)
==556== by 0x4E4DF2D: bufferevent_writecb (bufferevent_sock.c:310)
==556== by 0x4E52D1D: event_process_active_single_queue (event.c:1584)
==556== by 0x4E53676: event_base_loop (event.c:1676)
==556== by 0x406FCFA: dispatch (libcrawl.c:271)
==556== by 0x402863: main (crawler.c:49)
But this one is probably caused by some earlier invalid write (I guess).
It reproduces 100% of the time during massive crawling (because that
process talks to many different servers), but after spending some time
trying to reproduce it with simple tests/utils I gave up for a few days
(I have a lot of work to do), so I'm sending this patch as a reminder.
Just in case, I've tried the following tests:
- mixing timeouts/retries
- shutting down the http server and bringing it back
- slow dns server for first request
- sleep before accept
- hacking libevent sources to change the behaviour of the http layer (so
it will go into the function which I'm interested in).
When we are failing a request (evdns_request_timeout_callback()), we
delete timeout_event in request_finished(). However, just before the
call to request_finished() (for the failing request) there was a call to
nameserver_failed(), which adds an event for timeout_event. IOW we must
fail the ns after the request, because otherwise timeout_event will not
be active and we will wait forever.
Before this patch the dns/retry_disable_when_inactive test waits
forever; with it, the test passes.