[ee122] weird MNL behavior

vern at cs.berkeley.edu
Sun Dec 9 16:53:47 PST 2007


> I am using select to wake me up on a timeout or if there is data ready to be
> received.  Towards the end of the transfer, select returns before a timeout
> and FD_ISSET returns true implying that there is data to be read, so I call
> recvfrom which ends up blocking which would imply that there is nothing
> there.

In my experience, this is pretty much always some sort of bug in the use
of select, though such bugs can sometimes be very hard to find.  You've
already taken care of the #1 suspect, which is failing to FD_ZERO or
failing to FD_SET correctly.  Another possibility is that your code is
structured so that something internal has already read from the fd by the
time you read from it in response to select(), so that it no longer has
anything to return; or you're reading with recvfrom and what you've
specified doesn't match the packet that came in.

If you send me your select() loop, I'll try to take a look at it.  However,
I'm not online much today, so I'm not sure if I'll be able to reply before
tomorrow.

> I read online that select does not guarantee that recvfrom won't
> block because the packet may have been corrupted.

That doesn't sound right to me - the kernel should make the integrity checks
prior to analyzing the rest of the header, and it has to do that analysis
in order to figure out which file descriptor to flag as being available
for reading.  (More generally, servers all over the world rely on select()
not causing them to occasionally block waiting to read - so this is code
that has *really* been hammered on.)

> So I changed my socket to
> work in non-blocking mode.

Do try to avoid that.  Like select(), it comes with its own subtle usage
errors, and the combination can be quite confusing.

It's worth stepping through the code executed by MNL for the recv to
see whether in some cases it reads twice or something like that.

		Vern

