Instead of having a file segment born as one type and stay that way
forever, let them start out unmapped, but map themselves as needed
if they need to get written out on a non-drains_to_fd evbuffer.
The sendfile() implementation for evbuffer_add_file is potentially more
efficient, but it has a problem: you can only use it to send bytes over
a socket using sendfile(). If you are writing bytes via SSL_send() or
via a filter, or if you need to be able to inspect your buffer, it
doesn't work.
As an easy fix, this patch disables the sendfile-based implementation of
evbuffer_add_file on an evbuffer unless the user sets a new
EVBUFFER_FLAG_DRAINS_TO_FD flag on that evbuffer, indicating that the
evbuffer will not be inspected, but only written out via
evbuffer_write(), evbuffer_write_atmost(), or drained with stuff like
evbuffer_drain() or evbuffer_add_buffer(). This flag is off by
default, except for evbuffers used for output on bufferevent_socket.
In the future, it could be interesting to make a best-effort file
segment implementation that tries to send via sendfile, but mmaps on
demand. That's too much complexity for a stable release series, though.
Original message:
Solaris sendfile seems to fail when sending moderately large (<1GB)
files. Not a 32/64 problem, but a buffer problem.
Anyone else ever try this? It is definitely broken in http-server.c.
It seems to be broken in the following way:
When sendfile sends partial data (EAGAIN, would block), "res" is
always -1, rather than the amount sent.
Here's a patch that reads from the "offset" pointer instead to
discover what was sent. This seems to work:
The _internal.pos_in_chain field was uninitialized or set to different
values in different places returning the special "not found" pointer.
Signed-off-by: Nir Soffer <nirsof@gmail.com>
When searching for a CRLF, it would find an LF, then look for a
preceding CR if not at the start of the buffer. That's fine when
we're starting from the beginning of the buffer, but if we're starting
at (say) byte 100, and we have that byte == LF, we shouldn't check for
a CR at byte 99.
Our evbuffer_strchr() function [which was only used for
search_eol(EOL_LF) could give incorrect results if it found its answer
in the first chunk but didn't start searching from the front of the
chunk.
Also, this patch adds unit tests for evbuffer_search_eol, particularly
in those cases that evbuffer_readln() tests didn't exercise.
Instead of mapping enough bytes for each segment, we were failing to
take into account the slop created by rounding the segment position
down to the nearest page.
Should fix bug 3142394 found by Sebastian Hahn.
The old logic was both too eager to realign (it would move a whole
chain to save a byte) and too reluctant to realign (it would only
realign when data would fit into the misaligned portion, without
considering the space at the end of the chain).
The new logic matches that from evbuffer_expand_singlechain: it only
realigns a chain when not much data is to be moved, and there's a
bunch of space to be regained.
Spotted by Yan Lin.
This change has no effect on non-windows platforms, since those
either define off_t to 64-bits, or allow you to decide whether
it should be 64-bits yourself via some LARGEFILE-like macro.
On Windows, however, off_t is always 32-bit, so it's a bad choice
for "file size" or "file offset" values. Instead, I'm adding
an ev_off_t type, and using it in the one place where we used
off_t to mean "the size of a file" or "an offset into a file" in the
API.
This breaks ABI compatibility on Windows.
The writev() call is limited to at most IOV_MAX iovecs (or UIO_MAXIOV,
depending on whom you ask). This isn't a problem anywhere we've
tested except on OpenSolaris, where IOV_MAX was a mere 16.
This patch makes us go from "use up to 128 iovecs when writing" to
"use up to 128 iovecs when writing, or IOV_MAX/UIO_MAXIOV, whichever
is less". This is still wrong if you somehow find a platform that
defines IOV_MAX < UIO_MAXIOV, but I hereby claim that such a platform
is too stupid to worry about for now.
Found by Michael Herf.
Remember, the code
int is_less_than(int a, unsigned b) {
return a < b;
}
is buggy, since the C integer promotion rules basically turn it into
int is_less_than(int a, unsigned b) {
return ((unsigned)a) < b;
}
and we really want something closer to
int is_less_than(int a, unsigned b) {
return a < 0 || ((unsigned)a) < b;
}
.
Suggested by an example from Ralph Castain
- Prevent evbuffer_{add,prepend}_buffer from moving read-pinned chains.
- Fix evbuffer_drain to handle read-pinned chains better.
- Raise the limit on WSABUFs from two to MAX_WSABUFS for overlapped reads.
The old code had a bug where the 'exact' flag to 1 in
_evbuffer_read_setup_vecs would never actually make the iov_len field
of the last iovec get truncated. This patch fixes that.
The old logic made sense back when buffer.c was an enormous linear
buffer, but it doesn't make any sense for the chain-based
implementation.
This patch also refactors the ioctl{socket}? call into its own function.
For future note, opensolaris doesn't have sys/sysctl.h, doesn't like
comparing iov_buf to a chain_space_ptr without a cast, and is (predictably)
unforgiving of dumb syntax errors.
Also, we had accidentally broken the devpoll backend test in configure.in