socket.7: Correct the description of SO_RXQ_OVFL

Two reports that the description of SO_RXQ_OVFL was wrong.

======

Commentary from Tobias:

This bug pertains to the manpage as visible on man7.org right
now.

The socket(7) man page has this paragraph:

       SO_RXQ_OVFL (since Linux 2.6.33)
              Indicates that an unsigned 32-bit value ancillary
              message (cmsg) should be attached to received skbs
              indicating the number of packets dropped by the
              socket between the last received packet and this
              received packet.

The second half is wrong: the counter (internally,
SOCK_SKB_CB(skb)->dropcount is *not* reset after every packet.
That is, it is a proper counter, not a gauge, in monitoring
parlance.

A better version of that paragraph:

       SO_RXQ_OVFL (since Linux 2.6.33)
              Indicates that an unsigned 32-bit value ancillary
              message (cmsg) should be attached to received skbs
              indicating the number of packets dropped by the
              socket since its creation.
======
Commentary from Petr

Generic SO_RXQ_OVFL helpers sock_skb_set_dropcount() and
sock_recv_drops() implements returning of sk->sk_drops (the total
number of dropped packets), although the documentation says the
number of dropped packets since the last received one should be
returned (quoting the current socket.7):

  SO_RXQ_OVFL (since Linux 2.6.33)
  Indicates that an unsigned 32-bit value ancillary message (cmsg)
  should be attached to received skbs indicating the number of packets
  dropped by the socket between the last received packet and this
  received packet.

I assume the documentation needs to be updated, as fixing this in
the code could break programs depending on the current behavior,
although the formerly planned functionality seems to be more
useful.

The problem can be revealed with the following program:

int extract_drop(struct msghdr *msg)
{
        struct cmsghdr *cmsg;
        int rtn;

        for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg,cmsg)) {
                if (cmsg->cmsg_level == SOL_SOCKET &&
                    cmsg->cmsg_type == SO_RXQ_OVFL) {
                        memcpy(&rtn, CMSG_DATA(cmsg), sizeof rtn);
                        return rtn;
                }
        }
        return -1;
}

int main(int argc, char *argv[])
{
        struct sockaddr_in addr = { .sin_family = AF_INET };
        char msg[48*1024], cmsgbuf[256];
        struct iovec iov = { .iov_base = msg, .iov_len = sizeof msg };
        int sk1, sk2, i, one = 1;

        sk1 = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
        sk2 = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);

        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
        addr.sin_port = htons(53333);

        bind(sk1, (struct sockaddr*)&addr, sizeof addr);
        connect(sk2, (struct sockaddr*)&addr, sizeof addr);

        // Kernel doubles this limit, but it accounts also the SKB overhead,
        // but it receives as long as there is at least 1 byte free.
        i = sizeof msg;
        setsockopt(sk1, SOL_SOCKET, SO_RCVBUF, &i, sizeof i);
        setsockopt(sk1, SOL_SOCKET, SO_RXQ_OVFL, &one, sizeof one);

        for (i = 0; i < 4; i++) {
                int rtn;

                send(sk2, msg, sizeof msg, 0);
                send(sk2, msg, sizeof msg, 0);
                send(sk2, msg, sizeof msg, 0);

                do {
                        struct msghdr msghdr = {
                                        .msg_iov = &iov, .msg_iovlen = 1,
                                        .msg_control = &cmsgbuf,
                                        .msg_controllen = sizeof cmsgbuf };
                        rtn = recvmsg(sk1, &msghdr, MSG_DONTWAIT);
                        if (rtn > 0) {
                                printf("rtn: %d drop %d\n", rtn,
                                                extract_drop(&msghdr));
                        } else {
                                printf("rtn: %d\n", rtn);
                        }
                } while (rtn > 0);
        }

        return 0;
}

which prints
  rtn: 49152 drop -1
  rtn: 49152 drop -1
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 1
  rtn: -1
  rtn: 49152 drop 2
  rtn: 49152 drop 2
  rtn: -1
  rtn: 49152 drop 3
  rtn: 49152 drop 3
  rtn: -1
although it should print (according to the documentation):
  rtn: 49152 drop 0
  rtn: 49152 drop 0
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 0
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 0
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 0
  rtn: -1

Reported-by: Petr Malat <oss@malat.biz>
Reported-by: Tobias Klausmann <klausman@schwarzvogel.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
This commit is contained in:
Michael Kerrisk 2017-11-20 12:13:25 +01:00
parent 2f694223f0
commit 4cee582147
1 changed files with 1 additions and 2 deletions

View File

@ -881,8 +881,7 @@ compete to receive datagrams on the same socket.
.\" commit 3b885787ea4112eaa80945999ea0901bf742707f
Indicates that an unsigned 32-bit value ancillary message (cmsg)
should be attached to received skbs indicating
the number of packets dropped by the socket between
the last received packet and this received packet.
the number of packets dropped by the socket since its creation.
.TP
.B SO_SNDBUF
Sets or gets the maximum socket send buffer in bytes.