[Xorp-hackers] Selector overhead; SelectorList bugs

Bruce Simpson bms at incunabulum.net
Tue Dec 1 09:34:13 PST 2009


Ben Greear wrote:
> On 12/01/2009 01:32 AM, Bruce Simpson wrote:
>> ...Sounds like a job for good old poll().
>
> I don't see how priority matters in poll() anyway.  It will return all
> fds that can have action taken on them..then user-space can process
> those fds however it wants.

The point is: for poll(), the kernel preserves the order of the
struct pollfd[] members which it returns when the syscall has
completed. poll() will at least let you implement I/O priorities, with
some co-operation from the kernel.
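
As a rough illustration of that point (a sketch only, not XORP code): if
the caller sorts the struct pollfd[] array most-urgent-first, the first
slot with POLLIN set after poll() returns is the most urgent readable
descriptor. The helper name service_most_urgent() and its parameters are
made up for this example.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not XORP code. fds[] must be sorted by the caller,
 * most urgent descriptor first. Reads up to len bytes from the most urgent
 * readable descriptor and returns its index in fds[], or -1 if nothing was
 * ready (or poll()/read() failed).
 */
int service_most_urgent(struct pollfd *fds, nfds_t nfds, int timeout_ms,
                        char *buf, size_t len)
{
    assert(fds != NULL && buf != NULL);

    if (poll(fds, nfds, timeout_ms) <= 0)
        return -1;              /* timeout or error */

    /* revents come back in the same slots the caller filled in, so the
     * first hit in array order is the most urgent ready descriptor. */
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN) {
            if (read(fds[i].fd, buf, len) < 0)
                return -1;
            return (int)i;
        }
    }
    return -1;
}
```

Calling this in a loop services one descriptor per poll(), which always
favours the front of the array; whether that is acceptable depends on how
much starvation of the later entries can be tolerated.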

Both select() and poll() cost a copy-in and a copy-out on every call.
The newer alternatives, such as epoll and kqueue, usually cost only a
small copy-out once set up.

Any scheme (such as exists in libevent) that just prioritizes the order
in which callbacks might be invoked is still at the mercy of how I/O
notification is dispatched. There's a priority queueing scheme in effect
in selector.cc which basically just wraps what select() gives us.
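
To make the idea of a 'soft' priority layer concrete, here is a minimal
sketch, assuming a made-up struct watched with a numeric priority (lower
means more urgent). It is not the selector.cc implementation; it only
shows the general shape: take whatever select() reports, then reorder the
ready set in user space before servicing it.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical record: a lower priority value means more urgent. */
struct watched {
    int fd;
    int priority;
};

static int by_priority(const void *a, const void *b)
{
    const struct watched *x = a;
    const struct watched *y = b;
    return (x->priority > y->priority) - (x->priority < y->priority);
}

/*
 * Checks the n descriptors in w[] without blocking, then services the
 * readable ones in priority order (here "service" is just one read()).
 * Each serviced fd is stored in order_out[] (capacity n), in the order
 * it was handled. Returns the number serviced, or -1 on error.
 */
int service_by_priority(const struct watched *w, int n, int *order_out)
{
    assert(w != NULL && order_out != NULL && n > 0);

    fd_set rfds;
    int maxfd = -1;
    FD_ZERO(&rfds);
    for (int i = 0; i < n; i++) {
        FD_SET(w[i].fd, &rfds);
        if (w[i].fd > maxfd)
            maxfd = w[i].fd;
    }

    struct timeval tv = { 0, 0 };   /* return immediately */
    if (select(maxfd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;

    struct watched *ready = malloc((size_t)n * sizeof *ready);
    if (ready == NULL)
        return -1;

    /* Collect what select() reported, then impose our own order. */
    int count = 0;
    for (int i = 0; i < n; i++)
        if (FD_ISSET(w[i].fd, &rfds))
            ready[count++] = w[i];
    qsort(ready, (size_t)count, sizeof *ready, by_priority);

    char buf[64];
    for (int i = 0; i < count; i++) {
        if (read(ready[i].fd, buf, sizeof buf) < 0) {
            free(ready);
            return -1;
        }
        order_out[i] = ready[i].fd;
    }
    free(ready);
    return count;
}
```

The reordering only applies to descriptors that happened to be ready in
the same select() call, which is why such a scheme remains at the mercy
of how notification is dispatched.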

Unlike poll(), select() and epoll don't let you control the order in
which ready descriptors are reported; they will always return the first
match first. I was actually quite surprised to discover this about
epoll, but there you go.

There are situations in the code, as you've seen, where relative I/O 
prioritization is an issue; BGP is affected by such. The current 'soft' 
priority queue scheme in selector.cc is an attempt to deal with the 
situation.

It's a tough call to make in that situation, though -- we probably want 
to make sure BGP UPDATE packets are serviced in a timely manner. Without 
any pre-emption mechanism, we risk starving other tasks if the BGP 
process is saturated by I/O. This is something which selector.cc doesn't 
address... and I think this is what Atanu was getting at when we 
discussed it many moons ago.

Having said that, it's not a priority to deal with at the moment. The 
code seems resilient enough to cope with it, except for the cases you've 
uncovered.

>
> One simple optimization might be to check return value of poll/select
> to see how many items have bits set..then you at least know when you
> can stop the loop.

We already do this.
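
For readers unfamiliar with the trick, a minimal sketch of the general
idea follows. It is not the selector.cc code; the function name
service_readable() and the examined out-parameter are invented here so
the early exit is observable. select() returns the number of ready
descriptors, so the scan can stop once that many have been handled.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the selector.cc code. Checks the descriptors
 * in *watch (all <= maxfd) without blocking and reads up to 64 bytes from
 * each readable one. *examined receives how many descriptor numbers the
 * loop looked at. Returns the number serviced, or -1 on error.
 */
int service_readable(const fd_set *watch, int maxfd, int *examined)
{
    assert(watch != NULL && examined != NULL);

    fd_set rfds = *watch;           /* select() overwrites its argument */
    struct timeval tv = { 0, 0 };   /* return immediately */
    char buf[64];
    int serviced = 0;

    *examined = 0;
    int remaining = select(maxfd + 1, &rfds, NULL, NULL, &tv);
    if (remaining < 0)
        return -1;

    /* Stop as soon as every descriptor select() counted has been seen. */
    for (int fd = 0; fd <= maxfd && remaining > 0; fd++) {
        (*examined)++;
        if (FD_ISSET(fd, &rfds)) {
            if (read(fd, buf, sizeof buf) < 0)
                return -1;
            serviced++;
            remaining--;
        }
    }
    return serviced;
}
```

The saving is largest when the ready descriptors have low numbers and
maxfd is high; when the only ready descriptor is maxfd itself, the loop
still walks the whole range.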

cheers,
BMS


