[Xorp-hackers] PIM-SM4: Cand-RP and Cand-BSR Inconsistency
Pavlin Radoslavov
pavlin at ICSI.Berkeley.EDU
Mon Oct 27 17:58:53 PDT 2008
Samuel Lucas Vaz de Mello <samuellucas at datacom.ind.br> wrote:
> > Fields _bsr_addr and _bsr_priority are supposed to be the state for
> > the Elected BSR. They matter in the set of so called "active"
> > BsrZone entries, but are not used in the "configured" BsrZone
> > entries.
> >
> > There was a bug in old code re. how the state from the "configured"
> > BsrZone was propagated to the "active" BsrZone.
> > That bug was exposed by my previous change.
> >
> > I just committed a fix to CVS, so please try it and let me know if
> > you still have this or some other issue:
> >
> > Revision Changes Path
> > 1.56 +3 -3; commitid: ec3e4902630a41a7; xorp/pim/pim_bsr.cc
> >
>
> Thank you Pavlin!
>
> This basically fixes it.
> Just a minor issue: After getting elected, shouldn't _my_bsr_addr/_my_bsr_priority be copied to _bsr_addr/_bsr_priority?
>
> For me, it shows like this:
>
> root at erdinger> show pim bootstrap
> Active zones:
> BSR Pri LocalAddress Pri State Timeout SZTimeout
> 0.0.0.0 0 10.1.3.11 1 Elected 58 -1
> Expiring zones:
> BSR Pri LocalAddress Pri State Timeout SZTimeout
> Configured zones:
> BSR Pri LocalAddress Pri State Timeout SZTimeout
> 0.0.0.0 0 10.1.3.11 1 Init -1 -1
>
> As the machine is elected as Active BSR, shouldn't the Active address be 10.1.3.11 instead of 0.0.0.0? I saw that i_am_bsr() checks whether bsr_addr()==my_bsr_addr(), which would lead to further errors.
Yes, this was a bug that I introduced with my previous fix.
I just committed a fix to CVS:
Revision Changes Path
1.57 +3 -1; commitid: 17bbc49065a0f41a7; xorp/pim/pim_bsr.cc
1.24 +2 -1; commitid: 17bbc49065a0f41a7; xorp/pim/pim_bsr.hh
Yes, you are right about copying _my_bsr_addr and _my_bsr_priority
to _bsr_addr and _bsr_priority.
In one of my earlier emails on the subject I said that _bsr_addr and
_bsr_priority are not used/needed for "configured" BsrZone entries.
While chasing this bug I discovered they are actually used (when
propagating the state to an "active" BsrZone entry), so now
they are set when the Cand-BSR state is updated.
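A minimal sketch of the election-time copy discussed above, assuming a much simplified zone entry. BsrZoneSketch and become_elected() are hypothetical names used only for illustration; they are not the actual XORP classes or methods, and addresses are plain strings here rather than XORP address types. It only shows why i_am_bsr() fails while the elected address is still 0.0.0.0:

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a BsrZone entry (not the XORP class).
struct BsrZoneSketch {
    std::string bsr_addr;      // address of the Elected BSR
    int bsr_priority;          // priority of the Elected BSR
    std::string my_bsr_addr;   // this router's Cand-BSR address
    int my_bsr_priority;       // this router's Cand-BSR priority

    BsrZoneSketch(const std::string& my_addr, int my_priority)
        : bsr_addr("0.0.0.0"), bsr_priority(0),
          my_bsr_addr(my_addr), my_bsr_priority(my_priority) {}

    // Same comparison as described in the thread: we are the BSR only if
    // the elected address equals our own Cand-BSR address.
    bool i_am_bsr() const { return bsr_addr == my_bsr_addr; }

    // Assumed election step: copy our own Cand-BSR state into the
    // Elected BSR state so that i_am_bsr() becomes true.
    void become_elected() {
        bsr_addr = my_bsr_addr;
        bsr_priority = my_bsr_priority;
    }
};
```

With my_bsr_addr set to 10.1.3.11 and priority 1, i_am_bsr() is false until become_elected() runs, which matches the 0.0.0.0 entry shown in the "Active zones" output.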
> Another issue: If the node is Elected as BSR and I add another
> Cand-RP, it triggers bsr_stop() and bsr_start(), which causes it to
> lose Elected state and return to Pending. Although this is not
> wrong, it would be better to just keep the state (the stop()
> causes a Cand-RP-ADV with zero holdtime, which also changes the state
> in remote peers). I was wondering about writing an update method
> that applies the actions in stop() and start() only to the changed
> zones. What do you think?
Coincidentally, a couple of days ago I was thinking about the same
thing when I noticed the bsr_stop()/bsr_start().
If I remember correctly, I used bsr_stop()/bsr_start() because there
was a lot of complexity when I tried to do the incremental update.
Though, I don't remember whether this applied to the first version
of the BSR implementation or to the (almost complete) rewrite I had
to do after I discovered major issues with the first version.
Another reason for the bsr_stop()/bsr_start() was that I wanted an
atomic (i.e., transaction-like) update; otherwise, during
reconfiguration there would be lots of flux with potentially
dangerous results (remember that a consistent Cand-RP set across
all PIM-SM routers is critical).
At that time there was no easy way to apply some transaction-based
mechanism to achieve the atomic update, so the
bsr_stop()/bsr_start() was the simple work-around (e.g., note that
it is in the etc/templates/pimsm*.tp template files).
If you can find a simple way to do the update atomically without
bsr_stop()/bsr_start(), then give it a try and see how well it
works.
Otherwise, please submit a Bugzilla entry about the issue.
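One possible shape for the incremental update discussed above, as a rough sketch only. It assumes the complete new configuration is built off to the side and then compared with the running one in a single pass, so that only zones whose Cand-BSR parameters changed would need the stop()/start() treatment. ZoneConfigSketch, ZoneMap, ZoneDiff and diff_zones() are hypothetical names, not part of XORP, and the assumption that a Cand-RP-only change does not require restarting the BSR election would need to be checked against the real code:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-zone configuration (not the XORP data structures).
struct ZoneConfigSketch {
    std::string my_bsr_addr;          // Cand-BSR address for this zone
    int my_bsr_priority;              // Cand-BSR priority for this zone
    std::set<std::string> cand_rps;   // Cand-RP addresses for this zone
};

// Zones keyed by their scope prefix, e.g. "224.0.0.0/4".
typedef std::map<std::string, ZoneConfigSketch> ZoneMap;

struct ZoneDiff {
    std::vector<std::string> added;            // zones only in the new config
    std::vector<std::string> removed;          // zones only in the old config
    std::vector<std::string> bsr_changed;      // Cand-BSR address/priority changed
    std::vector<std::string> rp_changed_only;  // only the Cand-RP set changed
};

// Compare the running configuration with a fully built new one.
ZoneDiff diff_zones(const ZoneMap& old_cfg, const ZoneMap& new_cfg) {
    ZoneDiff d;
    for (const auto& kv : new_cfg) {
        ZoneMap::const_iterator it = old_cfg.find(kv.first);
        if (it == old_cfg.end()) {
            d.added.push_back(kv.first);
        } else if (it->second.my_bsr_addr != kv.second.my_bsr_addr ||
                   it->second.my_bsr_priority != kv.second.my_bsr_priority) {
            d.bsr_changed.push_back(kv.first);
        } else if (it->second.cand_rps != kv.second.cand_rps) {
            d.rp_changed_only.push_back(kv.first);
        }
    }
    for (const auto& kv : old_cfg) {
        if (new_cfg.find(kv.first) == new_cfg.end())
            d.removed.push_back(kv.first);
    }
    return d;
}
```

In the scenario from this thread (an Elected BSR gaining one more Cand-RP), the zone would land in rp_changed_only, so under the stated assumption its Elected state could be kept while only the Cand-RP advertisements are updated.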
Thanks,
Pavlin