[Xorp-hackers] Selector overhead; SelectorList bugs
Bruce Simpson
bms at incunabulum.net
Tue Dec 1 09:34:13 PST 2009
Ben Greear wrote:
> On 12/01/2009 01:32 AM, Bruce Simpson wrote:
>> ...Sounds like a job for good old poll().
>
> I don't see how priority matters in poll() anyway. It will return all
> fds that can have action taken on them..then user-space can process
> those fds however it wants.
The point is: for poll(), the kernel preserves the order of the
struct pollfd[] entries it fills in by the time the syscall has
completed. poll() will at least let you implement I/O priorities, with
some co-operation from the kernel.
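
A minimal sketch of that idea, assuming the caller keeps the pollfd array
sorted with the highest-priority descriptor first. The names
poll_by_priority() and demo_first_serviced() are hypothetical and are not
taken from selector.cc; the demo uses two pipes purely for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: pfds[] is sorted by the caller, highest priority
 * first. poll() fills in revents in place, so walking the array from the
 * front visits ready descriptors in priority order.
 * Returns the number of ready entries, 0 on timeout, -1 on error.
 */
int poll_by_priority(struct pollfd *pfds, nfds_t nfds, int timeout_ms,
                     size_t *ready_idx)
{
    assert(pfds != NULL && ready_idx != NULL);

    int nready = poll(pfds, nfds, timeout_ms);
    if (nready <= 0)
        return nready;

    int found = 0;
    for (nfds_t i = 0; i < nfds; i++) {
        if (pfds[i].revents != 0)
            ready_idx[found++] = (size_t)i;
    }
    return found;
}

/*
 * Demo: both pipes are readable, and the low-priority one became readable
 * first. Returns the array slot that would be serviced first, or -1 on
 * failure.
 */
int demo_first_serviced(void)
{
    int hi[2], lo[2];
    size_t ready[2];
    int first = -1;

    if (pipe(lo) != 0)
        return -1;
    if (pipe(hi) != 0) {
        close(lo[0]);
        close(lo[1]);
        return -1;
    }

    if (write(lo[1], "x", 1) == 1 && write(hi[1], "x", 1) == 1) {
        struct pollfd pfds[2] = {
            { .fd = hi[0], .events = POLLIN },  /* slot 0: high priority */
            { .fd = lo[0], .events = POLLIN },  /* slot 1: low priority */
        };
        if (poll_by_priority(pfds, 2, 0, ready) == 2)
            first = (int)ready[0];
    }

    close(hi[0]);
    close(hi[1]);
    close(lo[0]);
    close(lo[1]);
    return first;
}
```

The order of service here comes only from the position in the array, not
from the descriptor number or from which descriptor became ready first.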
Both select() and poll() cost a copy-in and a copy-out on every call.
The newer alternatives usually cost only a small copy-out once set up.
Any scheme (such as the one in libevent) that just prioritizes the order
in which callbacks are invoked is still at the mercy of how I/O
notification is dispatched. There's a priority queueing scheme in effect
in selector.cc which basically just wraps what select() gives us.
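
As a rough illustration of what such a 'soft' scheme amounts to, the sketch
below reorders an already-collected ready set by an application-assigned
priority before dispatch. This is not the selector.cc code; struct ready_fd,
by_priority() and order_for_dispatch() are hypothetical names, and the
convention that a lower value means higher priority is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one descriptor reported ready by select(). */
struct ready_fd {
    int fd;
    int priority;   /* assumption: lower value is dispatched earlier */
};

/* Order by priority, then by fd so that ties come out deterministically. */
static int by_priority(const void *a, const void *b)
{
    const struct ready_fd *x = a;
    const struct ready_fd *y = b;

    if (x->priority != y->priority)
        return (x->priority > y->priority) - (x->priority < y->priority);
    return (x->fd > y->fd) - (x->fd < y->fd);
}

/* Reorder the ready set in place; callbacks would then run front to back. */
void order_for_dispatch(struct ready_fd *ready, size_t n)
{
    assert(ready != NULL || n == 0);
    qsort(ready, n, sizeof *ready, by_priority);
}
```

Note that this only reorders what the notification call already reported.
A descriptor that was not in the ready set cannot be promoted, which is why
the scheme remains at the mercy of how notification is dispatched.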
select() and epoll don't preserve a caller-chosen order; they will
always return the first match first. I was actually quite surprised to
discover this about epoll, but there you go.
There are situations in the code, as you've seen, where relative I/O
prioritization is an issue; BGP is affected by such. The current 'soft'
priority queue scheme in selector.cc is an attempt to deal with the
situation.
It's a tough call to make in that situation, though -- we probably want
to make sure BGP UPDATE packets are serviced in a timely manner. Without
any pre-emption mechanism, we risk starving other tasks if the BGP
process is saturated by I/O. This is something which selector.cc doesn't
address... and I think this is what Atanu was getting at when we
discussed it many moons ago.
Having said that, it's not a priority to deal with at the moment. The
code seems resilient enough to cope with it, except for the cases you've
uncovered.
>
> One simple optimization might be to check return value of poll/select to
> see how many items have bits set..then you at least know when you can
> stop the loop.
We already do this.
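
For readers unfamiliar with the trick, here is a minimal sketch of it for
select(), assuming only a read set is passed so that the return value
equals the number of ready descriptors. collect_ready() and
demo_early_exit() are hypothetical names and this is not the selector.cc
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Scan descriptors 0..maxfd, stopping as soon as nready set bits have been
 * seen. *examined reports how many descriptors the loop actually tested.
 */
int collect_ready(fd_set *rfds, int maxfd, int nready, int *out,
                  int *examined)
{
    assert(rfds != NULL && out != NULL && examined != NULL);

    int found = 0;
    int fd;
    for (fd = 0; fd <= maxfd && found < nready; fd++) {
        if (FD_ISSET(fd, rfds))
            out[found++] = fd;
    }
    *examined = fd;
    return found;
}

/*
 * Demo: watch two pipes but make only the lower-numbered read end readable.
 * Returns 1 if the scan stopped before reaching maxfd, 0 otherwise.
 */
int demo_early_exit(void)
{
    int a[2], b[2], out[2];
    int examined = 0, ok = 0;

    if (pipe(a) != 0)
        return 0;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return 0;
    }

    int lowfd = a[0] < b[0] ? a[0] : b[0];
    int maxfd = a[0] < b[0] ? b[0] : a[0];
    int wr = a[0] < b[0] ? a[1] : b[1];   /* write end of the lower pipe */

    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(a[0], &rfds);
    FD_SET(b[0], &rfds);
    struct timeval tv = { 0, 0 };         /* do not block */

    if (write(wr, "x", 1) == 1) {
        int nready = select(maxfd + 1, &rfds, NULL, NULL, &tv);
        if (nready == 1) {
            int found = collect_ready(&rfds, maxfd, nready, out, &examined);
            ok = (found == 1 && out[0] == lowfd &&
                  examined == lowfd + 1 && examined <= maxfd);
        }
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return ok;
}
```

When write and exception sets are also passed, select() counts set bits
across all of them, so the stopping condition would have to count bits
rather than descriptors.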
cheers,
BMS