user1641854 - 4 months ago
Linux Question

edge triggered epoll for unix domain socket

I hit a strange issue where epoll_wait blocks waiting for an EPOLLOUT event on a unix domain socket in edge-triggered mode.

Some details: I use boost ASIO for IPC between two processes, including file descriptor passing.

Here are some strace logs:

25097 16:59:04.273555 epoll_ctl(4, EPOLL_CTL_MOD, 37, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, {u32=40872176, u64=40872176}}) = 0
25097 16:59:04.273588 epoll_wait(4, {{EPOLLOUT, {u32=40872176, u64=40872176}}}, 128, -1) = 1
25097 16:59:04.273617 sendmsg(37, {msg_name(0)=NULL, msg_iov(1)=[{data skipped, 247}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {34, 49}}, msg_flags=0}, MSG_NOSIGNAL) = 247
25097 16:59:04.273671 epoll_ctl(4, EPOLL_CTL_DEL, 34, {0, {u32=0, u64=0}}) = 0
25097 16:59:04.273715 close(34) = 0
25097 16:59:04.273752 close(49) = 0
25097 16:59:04.273801 epoll_wait(4, {{EPOLLOUT, {u32=40872176, u64=40872176}}}, 128, -1) = 1
25097 16:59:04.273848 epoll_wait(4, <unfinished ...>


And I'm blocked in the last epoll_wait call.
My understanding is that, because I'm using edge-triggered mode (EPOLLET), I can certainly block even if the fd is already ready for write operations.

The question is: how can I check whether a unix domain socket is ready for write operations? /proc/net/unix shows nothing interesting.

Answer

My understanding is that, because I'm using edge-triggered mode (EPOLLET), I can certainly block even if the fd is already ready for write operations.

I agree.

The question is: how can I check whether a unix domain socket is ready for write operations?

If you have a kernel image with debugging symbols, you could run

gdb vmlinux /proc/kcore

and with the struct sock address from the Num column of /proc/net/unix

p ((struct sock *)0xaddress)->sk_wmem_alloc

to inspect the committed transmit queue bytes and other fields of the structure, and so see whether the socket's send buffer has space left.
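A rough user-space equivalent of that check needs no kernel symbols. The sketch below is in Python for brevity and assumes Linux; write_state is a hypothetical helper name, not part of any library. It does a zero-timeout, level-triggered poll for POLLOUT and reads the SIOCOUTQ counter (exposed in Python as termios.TIOCOUTQ), which for unix sockets should reflect the same sk_wmem_alloc accounting, i.e. memory charged for data the peer has not read yet rather than an exact payload byte count.

```python
import fcntl
import select
import socket
import struct
import termios


def write_state(sock):
    """Return (writable_now, outq) for a socket, without blocking.

    writable_now: result of a level-triggered poll for POLLOUT with timeout 0.
    outq: value of SIOCOUTQ (same number as TIOCOUTQ on Linux); for unix
          sockets this is buffer memory still charged to the sender.
    """
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLOUT)
    events = poller.poll(0)  # timeout 0: only report the current state
    writable_now = any(ev & select.POLLOUT for _, ev in events)

    raw = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    outq = struct.unpack("i", raw)[0]
    return writable_now, outq


if __name__ == "__main__":
    a, b = socket.socketpair()  # AF_UNIX stream pair on Linux
    print(write_state(a))       # idle socket
    a.send(b"x" * 100)
    print(write_state(a))       # data queued, peer has not read it
    b.recv(100)
    print(write_state(a))       # peer drained the queue
```

Calling such a helper right before the blocking epoll_wait would show whether the socket is writable even though no new event is going to be delivered.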

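But you don't actually need to do that: the next-to-last line of the strace output already shows the EPOLLOUT event, and between it and the epoll_wait in the last line there is no system call that could change the situation, i.e. nothing that produces a new edge. The socket is still writable, but in edge-triggered mode that readiness has already been reported once, so the final epoll_wait blocks until something generates a new event. I think it's unwise to wait edge-triggered here.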
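The behaviour in the strace log can be reproduced in isolation. This is a minimal sketch, assuming Linux and Python's select.epoll wrapper; edge_triggered_demo is a hypothetical name, and a finite timeout is used in place of the -1 from the log so the call returns instead of hanging.

```python
import select
import socket


def edge_triggered_demo(timeout=0.2):
    """Return the number of events seen by three consecutive epoll waits."""
    a, b = socket.socketpair()  # AF_UNIX stream pair on Linux
    a.setblocking(False)
    ep = select.epoll()
    try:
        mask = select.EPOLLOUT | select.EPOLLET
        ep.register(a.fileno(), mask)

        # The socket is writable at registration time, so this is reported once.
        first = ep.poll(timeout)

        # Still writable, but nothing happened in between: no new edge.
        second = ep.poll(timeout)

        # EPOLL_CTL_MOD re-checks the current state and re-arms the event,
        # as in the first two lines of the strace log.
        ep.modify(a.fileno(), mask)
        third = ep.poll(timeout)

        return len(first), len(second), len(third)
    finally:
        ep.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print(edge_triggered_demo())
```

On Linux this should return (1, 0, 1): the second wait times out even though the socket never stopped being writable. With the level-triggered default (no EPOLLET), the second wait would report EPOLLOUT again.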