[Bro-Dev] [JIRA] (BIT-1522) Broker listener takes a long time to shut down on cluster stop/restart

Stephen Hosom (JIRA) jira at bro-tracker.atlassian.net
Fri Jan 22 05:13:00 PST 2016


    [ https://bro-tracker.atlassian.net/browse/BIT-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=23920#comment-23920 ] 

Stephen Hosom  commented on BIT-1522:
-------------------------------------

I can see how that would be confusing. 

The attached file should recreate the issue. To observe the issue:

# Add the script to your configuration
# Start Bro as a cluster using broctl start
# Observe listeners with "watch netstat -tulpn" (port 9999 should be in use)
# Stop Bro using broctl stop

At this point, the port 9999 listener lingers for as long as 60-120 seconds before going away. As a result, a broctl deploy or broctl restart issued during that window leaves Broker unable to bind to the same port, and Broker event communication fails.
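
As a rough way to measure the window described above, the following Python sketch polls until a plain TCP bind to the port succeeds. It is not part of Bro, Broker, or broctl; the helper names port_is_free and wait_for_port are hypothetical, and port 9999 is simply the port from this report. It only reports whether a bind succeeds, not why the old listener is still holding the port.

```python
import socket
import time


def port_is_free(port, host="0.0.0.0"):
    """Return True if a plain TCP bind to (host, port) succeeds right now.

    Hypothetical helper for illustration; it binds without SO_REUSEADDR
    and closes the probe socket again immediately.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def wait_for_port(port, timeout=180.0, interval=1.0, host="0.0.0.0"):
    """Poll until the port can be bound or the timeout (seconds) expires.

    Returns True once a bind succeeds, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if port_is_free(port, host):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, calling wait_for_port(9999) right after broctl stop and timing the call would give an estimate of how long the listener lingers on a given machine.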

Oddly enough, I wasn't able to recreate this with a standalone instance. I could only recreate the issue when using broctl to start and stop Bro as a cluster.

> Broker listener takes a long time to shut down on cluster stop/restart
> ----------------------------------------------------------------------
>
>                 Key: BIT-1522
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1522
>             Project: Bro Issue Tracker
>          Issue Type: Problem
>          Components: Broker
>    Affects Versions: 2.4
>         Environment: Ubuntu 14.04, Bro 2.4.1 with Broker
>            Reporter: Stephen Hosom 
>             Fix For: 2.5
>
>         Attachments: broker-shutdown-test.bro
>
>
> It looks like when shutting down Broker, the listener sticks around for an exceptionally long time (as much as a minute or more). Because of this, Broker's listener silently fails to re-bind to the port on the next cluster start, and all Broker communication then silently stops working. It can take a while to notice this failure, since nothing really complains.
> The listener should probably shut down in less than 1 minute, but it might also make sense to add options to have the listener retry binding, or to generate a failure message when it doesn't start. Maybe a failed listener start in bro_init should actually cause Bro to stop, so that the user sees the failure immediately?
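
The two options floated in the description (retry the bind, or fail loudly) can be illustrated with a generic socket sketch in Python. This is not Broker code and makes no claim about how Broker's listener is implemented; listen_with_retry and its attempts and delay parameters are hypothetical names chosen for the example.

```python
import socket
import sys
import time


def listen_with_retry(port, host="0.0.0.0", attempts=5, delay=2.0):
    """Try to bind and listen on (host, port), retrying on failure.

    Hypothetical illustration: every failed attempt is reported on stderr,
    and a final failure raises instead of being silently ignored.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(16)
            return sock
        except OSError as err:
            sock.close()
            last_error = err
            print(
                f"bind to {host}:{port} failed "
                f"(attempt {attempt}/{attempts}): {err}",
                file=sys.stderr,
            )
            if attempt < attempts:
                time.sleep(delay)
    # Fail loudly so the caller sees the problem immediately.
    raise RuntimeError(
        f"could not listen on {host}:{port} after {attempts} attempts"
    ) from last_error
```

Raising on the final failure corresponds to the "cause Bro to stop" idea: the caller has to handle the error explicitly rather than continue without a listener.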



--
This message was sent by Atlassian JIRA
(v7.1.0-OD-05-006#71001)

