[Xorp-hackers] BSR Restart
Samuel Lucas Vaz de Mello
samuellucas at datacom.ind.br
Wed Oct 29 06:50:22 PDT 2008
>> Another issue: if the node is elected as BSR and I add another
>> Cand-RP, it triggers bsr_stop() and bsr_start(), which causes it to
>> lose the Elected state and return to Pending. Although this is not
>> wrong, it would be better to just keep the state (the stop()
>> causes a Cand-RP-ADV with zero holdtime, which also changes the
>> state in remote peers). I was wondering about writing an update
>> method that applies the actions in stop() and start() only to the
>> changed zones. What do you think?
>
> Coincidentally, a couple of days ago I was thinking about the same
> thing when I noticed the bsr_stop()/bsr_start().
>
> If I remember correctly, I used bsr_stop()/bsr_start() because there
> was lots of complexity when I tried to do the incremental update.
> Though, I don't remember whether this applied to the first version
> of the BSR implementation or to the (almost complete) rewrite I had
> to do after I discovered major issues with the first version.
Is the code in CVS the rewritten code?
> Another reason for the bsr_stop()/bsr_start() was that I wanted an
> atomic (i.e., transaction-like) update; otherwise, during
> reconfiguration there will be lots of flux with potentially
> dangerous results (remember that a consistent Cand-RP set across
> all PIM-SM routers is critical).
> At that time there was no easy way to apply some transaction-based
> mechanism to achieve the atomic update, so the
> bsr_stop()/bsr_start() was the simple work-around (e.g., note that
> it is in the etc/templates/pimsm*.tp template files).
>
> If you can find a simple way to do the update atomically without
> bsr_stop()/bsr_start(), then give it a try and see how well it
> works.
> Otherwise, please submit a Bugzilla entry about the issue.
I played around a bit with this.
I tried to stay as close as possible to the stop()/start() approach.
My idea was:
- Add a boolean parameter to bsr_start()/bsr_stop() that is true during restarts.
- Create a bsr_restart() method that calls bsr_stop(true) and then bsr_start(true).
- In stop(true), instead of deleting the whole _active_bsr_zone_list, delete only the non-elected zones. For the elected zones, delete all group prefixes (which contain the RPs).
- In start(true), reuse the existing elected zones and add all configured group prefixes (and RPs).
- In start(true), check whether any elected zone was deleted from the config and remove it.
- After that, put the elected zones back in the PENDING state and expire the BSR timer. The timer then returns them to the ELECTED state, computes the RP-set, and sends BSR messages with the new RP-set.
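
To make the steps above concrete, here is a rough, self-contained C++ sketch of the idea. It is not the attached patch and does not use the real XORP classes: BsrSketch, Zone, ZoneState, config, active, bsr_messages_sent and on_bsr_timer_expired() are all hypothetical names invented for illustration, and the timer is modelled as a direct function call.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the PIM-SM BSR state.
enum class ZoneState { Pending, Elected };

struct Zone {
    ZoneState state = ZoneState::Pending;
    std::vector<std::string> group_prefixes;  // each prefix would carry its RPs
};

class BsrSketch {
public:
    // Configured zones: zone id -> configured group prefixes.
    std::map<std::string, std::vector<std::string>> config;
    // Active zones; plays the role of _active_bsr_zone_list.
    std::map<std::string, Zone> active;
    // Counts simulated BSR messages (assumption: one per timer expiry).
    int bsr_messages_sent = 0;

    void stop(bool is_restart) {
        if (!is_restart) {
            active.clear();  // plain stop: drop every zone
            return;
        }
        for (auto it = active.begin(); it != active.end(); ) {
            if (it->second.state == ZoneState::Elected) {
                it->second.group_prefixes.clear();  // keep zone, drop prefixes
                ++it;
            } else {
                it = active.erase(it);  // non-elected zones are deleted
            }
        }
    }

    void start(bool is_restart) {
        // operator[] reuses a surviving elected zone or creates a Pending one.
        for (const auto& entry : config)
            active[entry.first].group_prefixes = entry.second;
        if (!is_restart)
            return;
        // Remove elected zones that are no longer in the configuration.
        for (auto it = active.begin(); it != active.end(); ) {
            if (config.count(it->first) == 0)
                it = active.erase(it);
            else
                ++it;
        }
        // Put surviving elected zones back to Pending and expire their timer.
        for (auto& entry : active) {
            if (entry.second.state == ZoneState::Elected) {
                entry.second.state = ZoneState::Pending;
                on_bsr_timer_expired(entry.second);
            }
        }
    }

    void restart() {
        stop(true);
        start(true);
    }

    // Stand-in for the BSR timer callback: the zone becomes Elected again and
    // a BSR message carrying the recomputed RP-set would be sent.
    void on_bsr_timer_expired(Zone& zone) {
        assert(zone.state == ZoneState::Pending);
        zone.state = ZoneState::Elected;
        ++bsr_messages_sent;
    }
};
```

In this sketch, a zone that was elected before restart() is elected again as soon as restart() returns and picks up the new group prefixes, while a newly configured zone starts in Pending and waits for its own timer.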
In my tests it seems to work fine.
The patch is attached.
Could you take a look at it and check whether I've forgotten something?
Regards,
- Samuel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Support-BSR-restart-without-losing-BSR-Elected-state.patch
Type: text/x-diff
Size: 17744 bytes
Desc: not available
Url : http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20081029/57932e3f/attachment.bin