[Bro-Dev] [JIRA] (BIT-1306) bro process would get stuck/freeze with myricom drivers

Robin Sommer (JIRA) jira at bro-tracker.atlassian.net
Fri Apr 10 08:10:01 PDT 2015


    [ https://bro-tracker.atlassian.net/browse/BIT-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=20246#comment-20246 ] 

Robin Sommer commented on BIT-1306:
-----------------------------------

Turns out this change was part of the introduction of the iosource_mgr. That manager owns all IOSources registered with it, including the remote serializer. So deleting the iosource_mgr also deletes the remote serializer, which is why the explicit delete went away. (The same applies to other iosources: there used to be deletes for dns_mgr/event_player/etc., which likewise went away.)
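
To make the ownership point concrete, here is a minimal sketch of the pattern. It is not Bro's actual code: the class names, the Register() method, the logging vector, and shutdown_demo() are all hypothetical stand-ins, and the real iosource_mgr may differ in detail. It only shows why a single delete of the manager is enough to tear down every registered source.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; these are not Bro's real classes.
using Log = std::vector<std::string>;

class IOSource {
public:
    IOSource(std::string name, Log* log) : name_(std::move(name)), log_(log) {}
    // Record the destruction so the caller can observe that it happened.
    virtual ~IOSource() { log_->push_back(name_); }

private:
    std::string name_;
    Log* log_;
};

class SourceManager {
public:
    // The manager owns every registered source and deletes it on teardown.
    ~SourceManager() {
        for (IOSource* src : sources_)
            delete src;
    }

    // Registering a source hands ownership to the manager (assumed semantics).
    void Register(IOSource* src) { sources_.push_back(src); }

private:
    std::vector<IOSource*> sources_;
};

// Mimics the shutdown path: no per-source delete, only the manager.
Log shutdown_demo() {
    Log destroyed;
    SourceManager* mgr = new SourceManager();
    mgr->Register(new IOSource("remote_serializer", &destroyed));
    mgr->Register(new IOSource("dns_mgr", &destroyed));
    delete mgr;  // destroys both sources as a side effect
    return destroyed;
}
```

Under this model, an additional explicit delete of a registered source would be a double free, so if such a delete changes behavior, the manager's own cleanup is the first place to look.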

Aashish, did you ever get a chance to check whether Jon's patch made a difference? If it did, then something in the iosource_mgr cleanup is not working.

(Btw, I learned a fun new git option when looking into this: {{git log -L 368,397:main.cc}} shows all commits that contributed to that particular block of code, as it is now).

> bro process would get stuck/freeze with myricom drivers
> -------------------------------------------------------
>
>                 Key: BIT-1306
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1306
>             Project: Bro Issue Tracker
>          Issue Type: Problem
>          Components: Bro
>    Affects Versions: git/master
>         Environment:  OS: FreeBSD 9.3-RELEASE-p5
> bro version 2.3-328
> git log -1 --format="%H"
> 379593c7fded0f9791ae71a52dd78a4c9d5a2c1f
>            Reporter: Aashish Sharma
>            Assignee: Robin Sommer
>              Labels: bro-git, myricom
>             Fix For: 2.4
>
>
> When I stop bro (in cluster mode), one of the bro worker processes (at random) gets stuck and won't shut down, stop, or even be killed using kill -s 9.
> The system ultimately has to be rebooted to remove the stuck bro process.
> On running myri_start_stop I see:
> # /usr/local/opt/snf/sbin/myri_start_stop stop
> Removing myri_snf.ko
> kldunload: can't unload file: Device busy
> It appears that the myri_snf.ko driver cannot be unloaded because of the stuck bro process. That process still has an open descriptor on the Sniffer device/driver, and the bro process freezes.
> More details:
> The bro process is stuck in RNE state
> R       Marks a runnable process.
> N       The process has reduced CPU scheduling priority (see setpriority(2)).
> E       The process is trying to exit.
> Here is an example:
> ### stuck process:
> [bro@01 ~]$ ps auxwww | fgrep 1616
> bro    1616  100.0  0.0 758040 60480 ??  RNE   2:57PM   53:50.04 /usr/local/bro-git/bin/bro -i myri0 -U .status -p broctl -p broctl-live -p local -p worker-1-1 mgr.bro broctl base/frameworks/cluster local-worker.bro broctl/auto
> #### when checking for process in proc:
> [bro@c ~]$ ls -l /proc/1616
> ls: /proc/1616: No such file or directory



--
This message was sent by Atlassian JIRA
(v6.4-OD-16-006#64014)

