ruipacheco - 1 month ago
C++ Question

Heisenbug when doing an asynchronous read from socket

I'm calling async_read on a domain socket and my problem is that sometimes I get data and sometimes I don't, even though the remote system always returns data. I seem to be able to read from the socket if I do step-by-step debugging. If I automate unit tests they seem to run too fast for data to be returned, which is odd since the whole purpose of asynchronous methods is to wait for a reply.

I have these as properties of my class:

io_service run_loop;
stream_protocol::socket connection_socket;
datagram_protocol::endpoint domain_socket_ep;
vector<unsigned char>read_buffer;


I do a write:

void operator>>(const vector<unsigned char> input, shared_ptr<Socket>socket) {
asio::async_write(socket->connection_socket, asio::buffer(input), std::bind(&Socket::write_handler, socket, std::placeholders::_1, std::placeholders::_2));
socket->run_loop.reset();
socket->run_loop.run();
}


In the write callback I do a read:

void Socket::write_handler(const std::error_code &ec, const size_t size) noexcept {
const size_t avail = connection_socket.available();
if (!read_buffer.empty()) {
read_buffer.clear();
}
asio::async_read(connection_socket, asio::buffer(read_buffer, avail), std::bind(&Socket::read_handler, shared_from_this(), std::placeholders::_1, std::placeholders::_2));
}


I tried wrapping the read function in a
while(read_buffer.size() < avail)
but that just threw me into an infinite loop.

I'm definitely missing something here; I just can't figure out what, and the fact that this works under step-by-step debugging just makes it worse.

Answer

On the read side of things:

read_buffer is declared as a vector<unsigned char>, so .clear() leaves it with size zero. After that, it is invalid to use asio::buffer(read_buffer, avail) without a prior read_buffer.resize(avail): the buffer can never cover more bytes than the vector currently holds, so the read has nowhere to put the data.
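
To make the sizing point concrete, here is a minimal sketch that uses only the standard library. It relies on the fact that asio::buffer(vec, n) never exposes more than vec.size() bytes; usable_bytes is a hypothetical stand-in that models that clamp, and prepare_read_buffer is a hypothetical helper, neither is part of Asio. Note also that available() called right after the write completes may well return 0 because the reply has not arrived yet, which would fit the "works in the debugger" symptom.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in (not an Asio function): models how many bytes a
// buffer view over `vec`, limited to `max_bytes`, can actually expose.
// The view never extends past the vector's current size().
std::size_t usable_bytes(const std::vector<unsigned char>& vec,
                         std::size_t max_bytes) {
    return std::min(vec.size(), max_bytes);
}

// Hypothetical helper: discard old contents and give the vector exactly
// `wanted` bytes of storage, so a view of `wanted` bytes is fully backed.
void prepare_read_buffer(std::vector<unsigned char>& vec, std::size_t wanted) {
    vec.clear();         // size() is now 0; a view over it would be empty
    vec.resize(wanted);  // size() is now `wanted`, elements are zeroed
    assert(vec.size() == wanted);
}
```

With a vector that has only been cleared, usable_bytes(read_buffer, avail) is 0 whatever avail is, so a read into it completes at once with nothing transferred. After prepare_read_buffer(read_buffer, avail) the same call yields avail.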


On the write side, why do you have

void operator>>(const vector<unsigned char> input, shared_ptr<Socket>socket) {
  asio::async_write(socket->connection_socket, asio::buffer(input),   std::bind(&Socket::write_handler, socket, std::placeholders::_1, std::placeholders::_2));
  socket->run_loop.reset();
  socket->run_loop.run();
}

instead of, e.g.,

void operator>>(const std::vector<unsigned char> input, std::shared_ptr<Socket> socket) {
    boost::system::error_code ec;
    size_t transferred = boost::asio::write(socket->connection_socket, boost::asio::buffer(input), ec);
    socket->write_handler(ec, transferred);
}

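Calling run_loop.run() right after async_write just blocks until the handlers have finished, so nothing is gained from the asynchronous API. If you don't want asynchronous operations, don't use them; the blocking calls are a lot simpler. The following sample would be sane (if you make sure io lives longer than any socket that uses it):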


#include <boost/asio.hpp>
#include <memory>
#include <vector>

struct Socket {
    boost::asio::io_service& svc;
    boost::asio::ip::tcp::socket connection_socket;

    Socket(boost::asio::io_service& svc) : svc(svc), connection_socket(svc) {}

    void write_handler(boost::system::error_code ec, size_t bytes_transferred) {
    }
};

void operator>>(const std::vector<unsigned char> input, std::shared_ptr<Socket> socket) {
    boost::system::error_code ec;
    size_t transferred = boost::asio::write(socket->connection_socket, boost::asio::buffer(input), ec);
    socket->write_handler(ec, transferred);
}

int main(){
    boost::asio::io_service io;
    auto s = std::make_shared<Socket>(io);

    std::vector<unsigned char> v { 100, 'a' };
    v >> s;
}