[Bro-Dev] Broker has landed in master, please test
Azoff, Justin S
jazoff at illinois.edu
Fri Jun 15 11:25:17 PDT 2018
> On Jun 1, 2018, at 11:37 AM, Azoff, Justin S <jazoff at illinois.edu> wrote:
> I could never figure out what was causing the problem, and it's possible that &synchronized not doing anything anymore is why it's better now. I'm mostly using &synchronized for syncing input files across all the workers and one of them does have 300k entries in it. That file is fairly constant though, only a few k changes every 5 minutes and nothing that should use 20G of ram.
FWIW, I figured out what was causing this problem. While the file wasn't changing that much, I was using something like
curl -o file.new $URL && mv file.new file.csv
to download the file, and it turns out that unless you pass -f (--fail) to curl, it doesn't actually exit with a non-zero status code on HTTP server errors, so the mv after the && still runs.
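For reference, a minimal sketch of the corrected download step (the filenames match the command above; the download_feed wrapper and the URL variable are illustrative):

```shell
# Sketch of the fixed download step. -f / --fail makes curl exit
# non-zero on HTTP errors (4xx/5xx) instead of saving the error page,
# so the && only swaps in the new file when the download succeeded.
download_feed() {
    url="$1"
    curl -fs -o file.new "$url" && mv file.new file.csv
}
```

With -f, a failed download leaves the old file.csv in place instead of replacing it with an HTML error page.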
This was causing a server error page to be written to the csv file every now and then. When this happened:
* the input reader would throw a warning that the file couldn't be parsed, and clear out the set
* bro would then clear the set, triggering a removal of 300k items across all nodes (56 in the case of the test cluster)
* 5 minutes later the next download would work
* bro would then fill the set back in, triggering another 300k items to be synced to all 56 nodes.
so within 5 minutes, 300,000*56*2 updates would be kicked off, which is about 33.6 million updates. This seemed to max out the proxies for 30 minutes.
The raw size of the data is only ~4M, or ~261M across all nodes, which makes it a little crazy that memory usage would blow up by dozens of gigabytes of RAM.
&synchronized not having an effect in master made this problem go away, and adding a -f to curl on our pre-broker clusters fixed those too.
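An extra layer of defense beyond -f would be to sanity-check the downloaded file before swapping it in, since -f only catches HTTP-level errors, not a 200 response carrying bogus content. A sketch, where the "#fields" header check and the filenames are placeholders to adjust for the real feed:

```shell
# Sketch: only replace file.csv when the new download both succeeded
# (-f) and looks like the expected CSV. The '^#fields' header check is
# a placeholder; match whatever the real feed actually starts with.
update_feed() {
    url="$1"
    curl -fs -o file.new "$url" || return 1
    head -n 1 file.new | grep -q '^#fields' || return 1
    mv file.new file.csv
}
```

This way even a garbage page served with status 200 never reaches the input reader, so the set is never cleared.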
All the more reason to port the method of distributing the data off of &synchronized. I think I will just run the curl command on the worker nodes too,
effectively replacing &synchronized with curl.
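For the worker-side replacement, something as simple as a crontab entry per node could do it (the feed path is a placeholder, and the 5-minute interval is an assumption matching the download cycle mentioned above):

```shell
# Hypothetical crontab entry for each worker node; the path is a
# placeholder. Runs every 5 minutes, matching the cycle above.
*/5 * * * * curl -fs -o /path/to/file.new "$URL" && mv /path/to/file.new /path/to/file.csv
```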
More information about the bro-dev mailing list