[Bro] Sanity check - Grabbing platform tokens from browser user agents (was p0f)

Ryan iamreck at gmail.com
Wed Apr 2 06:36:04 PDT 2014


This looks very nice. Have you made any further updates or improvements
to it since?

Ryan Peck

On Mon, Feb 10, 2014 at 12:50 PM, Gary Faulkner <gary at doit.wisc.edu> wrote:

> After running various iterations of the original script against several
> pcaps of our local traffic (and a couple of days of live traffic), I found
> a lot of user agents that matched the desktop/server OS rules but were not
> necessarily desktops or servers. I added matching rules, partly to parse
> these out and partly to detect other things we were interested in.
> Checking for more things seems to incur a performance penalty, so I also
> moved the more common matches earlier in the if/else chain to avoid
> checking all of the less likely items first. The &create_expire attribute
> still doesn't behave as I expected: each match is logged once per log
> rotation rather than once per day. Otherwise the matching seems to work,
> although it doesn't cover every possible user agent. I may also have
> omitted explicit @load statements for scripts that are usually loaded by
> default.
> ======================== Begin Script ========================
> @load base/utils/site
> module BrowserPlatform;
> export
> {
>     # The fully resolved name for this log will be BrowserPlatform::LOG
>     redef enum Log::ID += { LOG };
>     type Info: record {
>         ts:                 time    &log &optional;
>         uid:                string  &log &optional;
>         host:               addr    &log &optional;
>         platform_token:     string  &log &optional;
>         unparsed_version:   string  &log &optional;
>     };
>     # A set of seen IP + OS combinations. Used to prevent logging the same combo repeatedly.
>     global seen_browser_platforms: set[string] &create_expire = 1.0 day &synchronized &redef;
> }
> event bro_init() &priority=5
>     {
>     Log::create_stream(BrowserPlatform::LOG,[$columns=Info]);
>     }
> event http_header(c: connection, is_orig: bool, name: string, value: string)
> {
>     local platform = "Unknown OS";
>     if (!is_orig || name != "USER-AGENT" || !Site::is_local_addr(c$id$orig_h))
>         return;
> # Parse out Apple iOS and Android variants first, as some apps will display as compatible with a desktop OS version
>     if ( /iPhone/ in value )
>         platform = "iPhone";
>     else if ( /iPad/ in value )
>         platform = "iPad";
>     else if ( /iPod/ in value )
>         platform = "iPod";
>     else if ( /Android/ in value )
>         platform = "Android";
> # Once we've parsed out mobiles, move on to desktop/server OS.
> # User agents are listed in order of expected use, or to pre-parse user agents that might otherwise match multiple rules.
>     else if ( /Windows/ in value )
>         {
>         # Dots in version numbers are escaped so that they match a literal "." only.
>         if ( /Xbox/ in value ) # often includes a Windows OS version or identifies as a Mobile browser
>             platform = "Xbox";
>         else if ( /Phone/ in value || /Mobile/ in value ) # often includes Windows OS version
>             platform = "Windows Phone";
>         else if ( /Windows NT 6\.1/ in value )
>             platform = "Windows 7";
>         else if ( /Windows NT 5\.1/ in value )
>             platform = "Windows XP";
>         else if ( /Windows NT 5\.2/ in value && /WOW64/ in value )
>             platform = "Windows XP x64";
>         else if ( /Windows NT 6\.0/ in value )
>             platform = "Windows Vista";
>         else if ( /Windows NT 6\.2/ in value )
>             platform = "Windows 8";
>         else if ( /Windows NT 6\.3/ in value )
>             platform = "Windows 8.1";
>         else if ( /Windows 95/ in value )
>             platform = "Windows 95";
>         else if ( /Windows 98/ in value && /4\.90/ !in value )
>             platform = "Windows 98";
>         else if ( /Win 9x 4\.90/ in value )
>             platform = "Windows Me";
>         else if ( /Windows NT 4\.0/ in value )
>             platform = "Windows NT 4.0";
>         else if ( /Windows NT 5\.0/ in value || /Windows 2000/ in value )
>             platform = "Windows 2000";
> #       Catch-all for identifying less common user-agents. Can be noisy.
> #       else
> #           platform = "Windows Other";
>         }
>     else if ( /Mac OS X/ in value )
>         {
>         if ( /Mac OS X 10_9/ in value || /Mac OS X 10\.9/ in value )
>             platform = "Mac OS X 10.9";
>         else if ( /Mac OS X 10_8/ in value || /Mac OS X 10\.8/ in value )
>             platform = "Mac OS X 10.8";
>         else if ( /Mac OS X 10_7/ in value || /Mac OS X 10\.7/ in value )
>             platform = "Mac OS X 10.7";
>         else if ( /Mac OS X 10_6/ in value || /Mac OS X 10\.6/ in value )
>             platform = "Mac OS X 10.6";
>         else if ( /Mac OS X 10_5/ in value || /Mac OS X 10\.5/ in value )
>             platform = "Mac OS X 10.5";
>         else if ( /Mac OS X 10_4/ in value || /Mac OS X 10\.4/ in value )
>             platform = "Mac OS X 10.4";
> #       Catch-all for identifying less common user-agents. Can be noisy.
> #       else
> #           platform = "Mac OS X Other";
>         }
>     else if ( /Linux/ in value )
>         platform = "Linux";
> # Check whether this IP+OS combo has already been logged; if not, log it and add it to the list of tracked combos.
>     # There is probably a less ugly way to do this than cat, but it seems to work.
>     local saw = cat(c$id$orig_h,platform);
>     if ( platform != "Unknown OS" && saw !in seen_browser_platforms )
>         {
>         local rec: BrowserPlatform::Info = [$ts=network_time(), $uid=c$uid,
>                                             $host=c$id$orig_h, $platform_token=platform,
>                                             $unparsed_version=value];
>         Log::write(BrowserPlatform::LOG, rec);
>         add seen_browser_platforms[saw];
>         }
> }
> ======================== End Script ========================
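
On the cat() key mentioned in the script comments: Bro sets can be indexed
by more than one value, so the (host, platform) pair can be stored directly
instead of being flattened into a string. The following is only a sketch
under assumptions; seen_pairs and first_sighting are hypothetical names that
do not appear in the script above, and it does not address the
once-per-rotation expiry behaviour described earlier.

```bro
module BrowserPlatform;

# Hypothetical replacement for the string-keyed set:
# each element is an (address, platform name) pair.
global seen_pairs: set[addr, string] &create_expire = 1 day;

# Hypothetical helper: returns T the first time a pair is seen
# (and records it), F on every later call until the entry expires.
function first_sighting(host: addr, platform: string): bool
    {
    if ( [host, platform] in seen_pairs )
        return F;

    add seen_pairs[host, platform];
    return T;
    }
```

In the http_header handler, the membership test and the add statement would
then collapse into a single call such as first_sighting(c$id$orig_h, platform).
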
> On 1/31/2014 10:56 PM, Gary Faulkner wrote:
>> Thanks for the suggestions, that cleans that bit up quite nicely. I
>> actually started by trying to deconstruct the various software.bro
>> scripts and work my way backwards through the framework to see what was
>> doing what. I'm still trying to navigate my way through that code, but I
>> agree that it would make more sense to leverage it directly than create
>> a derivative just to pull out a specific bit of the data. I'm not
>> currently running Splunk in any production sense, but that is pretty
>> much what I'm trying to do in Bro. Thanks for sharing it!
>> Regards,
>> Gary
>> On 1/31/2014 6:12 PM, Justin Azoff wrote:
>>> On Wed, Jan 29, 2014 at 05:35:46PM -0600, Gary Faulkner wrote:
>>>> event http_header(c: connection, is_orig: bool, name: string, value: string)
>>>> {
>>>>       local platform = "Unknown OS";
>>>>       if ( is_orig )
>>>>           {
>>>>         if ( name == "USER-AGENT" && /Windows NT 5.1/ in value )
>>>>                 {
>>>>                 platform = "Windows XP";
>>>>                 }
>>>>           else if ( name == "USER-AGENT" && /Windows NT 6.0/ in value )
>>>>                   {
>>>>                 platform = "Windows Vista";
>>>>                   }
>>>>           else if ( name == "USER-AGENT" && /Windows NT 6.1/ in value )
>>>>                   {
>>>>                   platform = "Windows 7";
>>>>                   }
>>> ..
>>> Modifying the http_header event handler as follows will increase
>>> performance:
>>> event http_header(c: connection, is_orig: bool, name: string, value: string)
>>> {
>>>       if(!is_orig || name != "USER-AGENT")
>>>           return;
>>>       if(/Windows NT 5.1/ in value)
>>>           platform = "Windows XP";
>>>       else if ...
>>> FWIW, I used to do this kind of thing outside of bro using splunk:
>>> https://github.com/JustinAzoff/splunk-scripts/blob/master/ua2os.py
>>> One thing you may want to do is, rather than use the http_header event,
>>> handle
>>> event Software::log_software(rec: Software::Info)
>>> {
>>>       ...
>>> }
>>> which will be raised every time a new software version is seen.  The
>>> software framework is already pulling most of the info out that you
>>> might need, so you can piggy back on the work that it is doing.
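
For reference, a handler along the lines Justin describes might look like
the sketch below. It assumes the stock protocols/http/software script is
loaded, which registers client user agents with the software framework as
type HTTP::BROWSER. The print statement is only a placeholder for wherever
the platform matching and logging would go, and none of this has been run
against the script quoted above.

```bro
@load base/frameworks/software
@load protocols/http/software

event Software::log_software(rec: Software::Info)
    {
    # Ignore servers and other software types; only browser entries
    # come from a client User-Agent header.
    if ( rec$software_type != HTTP::BROWSER )
        return;

    # unparsed_version holds the full, unmodified version string.
    if ( ! rec?$unparsed_version )
        return;

    # Placeholder: the platform matching from the script would go here.
    print fmt("%s %s", rec$host, rec$unparsed_version);
    }
```

Because the software framework only logs a given host and software
combination when it is new or has changed, this handler runs far less often
than http_header does.
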