[Bro] another kdd cup question

Oğuz Yarımtepe oguzyarimtepe at gmail.com
Fri Oct 4 11:18:59 PDT 2013


I am looking into generating the KDD Cup 99 attributes from live traffic. I
came across some papers claiming that they reproduce the same attribute
values using Bro-IDS. I am not sure whether all of the values can be
gathered from live traffic, so I am asking whether it is possible to
calculate the attributes below from live Gbit traffic.

Num.  Name                Type     Description
----  ------------------  -------  ------------------------------------------
1     duration            integer  duration of the connection
2     protocol_type       nominal  protocol type of the connection: TCP, UDP
                                   and ICMP
3     service             nominal  http, ftp, smtp, telnet... and other (if
                                   not much used service)
4     flag                nominal  connection status. The possible statuses
                                   are: SF, S0, S1, S2, S3, OTH, REJ, RSTO,
                                   RSTOS0, SH, RSTRH, SHR
5     src_bytes           integer  bytes sent in one connection
6     dst_bytes           integer  bytes received in one connection
7     land                binary   if source and destination IP addresses
                                   and port numbers are equal then this
                                   variable takes value 1 else 0
8     wrong_fragment      integer  sum of bad checksum packets in a
                                   connection
9     urgent              integer  sum of urgent packets in a connection.
                                   Urgent packets are packets with the
                                   urgent bit activated

Here I am not sure about the wrong_fragment and urgent packet counts.
It would be great if someone could enlighten me.
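
A minimal sketch of how features 1 to 7 might be derived, assuming the
records come from Bro's conn.log with its default field names (id.orig_h,
id.orig_p, id.resp_h, id.resp_p, proto, service, duration, orig_bytes,
resp_bytes, conn_state). The function name and the sample record are made
up for illustration, and wrong_fragment and urgent are left out because
conn.log does not obviously provide them.

```python
def basic_features(rec):
    """Map one conn.log record (a dict of strings) to KDD features 1-7."""
    def num(value, cast):
        # Bro writes "-" for unset fields; treat those as zero (assumption).
        return cast(value) if value not in ("-", "") else cast(0)

    same_host = rec["id.orig_h"] == rec["id.resp_h"]
    same_port = rec["id.orig_p"] == rec["id.resp_p"]
    return {
        # The table lists duration as an integer, so keep whole seconds.
        "duration": int(num(rec["duration"], float)),
        "protocol_type": rec["proto"],
        "service": rec["service"] if rec["service"] != "-" else "other",
        "flag": rec["conn_state"],
        "src_bytes": num(rec["orig_bytes"], int),
        "dst_bytes": num(rec["resp_bytes"], int),
        "land": 1 if same_host and same_port else 0,
    }


# Hypothetical record, not taken from real traffic.
sample = {
    "id.orig_h": "10.0.0.1", "id.orig_p": "1234",
    "id.resp_h": "10.0.0.2", "id.resp_p": "80",
    "proto": "tcp", "service": "http", "duration": "1.5",
    "orig_bytes": "200", "resp_bytes": "-", "conn_state": "SF",
}
print(basic_features(sample))
```

The mapping of conn_state to the flag attribute relies on both using the
same state names (SF, S0, REJ, ...), which would need checking against the
Bro version in use.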

Num.  Name                Type     Description
----  ------------------  -------  ------------------------------------------
10    hot                 integer  sum of hot actions in a connection, such
                                   as entering a system directory, creating
                                   programs and executing programs
11    num_failed_logins   integer  number of incorrect logins in a
                                   connection
12    logged_in           integer  if the login is correct then 1 else 0
13    num_compromised     integer  number of times a "not found" error
                                   appears in a connection
14    root_shell          integer  if the root gets the shell then 1 else 0
15    su_attempted        integer  if the su command has been used then 1
                                   else 0
16    num_root            integer  sum of operations performed as root in a
                                   connection
17    num_file_creations  integer  sum of file creations in a connection
18    num_shells          integer  number of logins of normal users
19    num_access_files    integer  sum of operations in control files in a
                                   connection
20    num_outbound_cmds   integer  sum of outbound commands in an ftp
                                   session
21    is_hot_login        integer  if the user is accessing as root or adm
22    is_guest_login      integer  if the user is accessing as guest,
                                   anonymous or visitor




It seems these attributes require payload analysis. I am not sure whether
Bro can detect some of them with its default rules or whether I will need
to write some custom ones.


Num.  Name                Type     Description
----  ------------------  -------  ------------------------------------------
23    count               integer  sum of connections to the same
                                   destination IP address
24    srv_count           integer  sum of connections to the same
                                   destination port number
25    serror_rate         real     the percentage of connections that have
                                   activated the flag (4) S0, S1, S2 or S3,
                                   among the connections aggregated in
                                   count (23)
26    srv_serror_rate     real     the percentage of connections that have
                                   activated the flag (4) S0, S1, S2 or S3,
                                   among the connections aggregated in
                                   srv_count (24)
27    rerror_rate         real     the percentage of connections that have
                                   activated the flag (4) REJ, among the
                                   connections aggregated in count (23)
28    srv_rerror_rate     real     the percentage of connections that have
                                   activated the flag (4) REJ, among the
                                   connections aggregated in srv_count (24)
29    same_srv_rate       real     the percentage of connections that were
                                   to the same service, among the
                                   connections aggregated in count (23)
30    diff_srv_rate       real     the percentage of connections that were
                                   to different services, among the
                                   connections aggregated in count (23)
31    srv_diff_host_rate  real     the percentage of connections that were
                                   to different destination machines, among
                                   the connections aggregated in
                                   srv_count (24)

These are completely unclear to me. I think I will need some extra work to
handle some of these results, but I would rather wait for someone to guide
me first.
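
One possible reading of features 23 to 31, as a minimal sketch. It assumes
that the aggregation runs over a sliding two-second window and that the
current connection counts itself; neither is stated in the table above. The
record keys (ts, dst_ip, dst_port, service, flag), the function names and
the sample connections are hypothetical, and feature 28 is written as
srv_rerror_rate, following the KDD Cup 99 field list.

```python
from collections import deque

SYN_ERROR_FLAGS = {"S0", "S1", "S2", "S3"}
WINDOW_SECONDS = 2.0  # assumption: the window is not given in the table


def rate(part, whole):
    return part / whole if whole else 0.0


def traffic_features(connections, window=WINDOW_SECONDS):
    """Yield features 23-31 for each connection, given in timestamp order."""
    recent = deque()
    for conn in connections:
        # Drop connections that have fallen out of the time window.
        while recent and conn["ts"] - recent[0]["ts"] > window:
            recent.popleft()
        recent.append(conn)  # assumption: the current connection counts

        host = [c for c in recent if c["dst_ip"] == conn["dst_ip"]]
        srv = [c for c in recent if c["dst_port"] == conn["dst_port"]]

        def syn_err(group):
            return sum(c["flag"] in SYN_ERROR_FLAGS for c in group)

        def rej_err(group):
            return sum(c["flag"] == "REJ" for c in group)

        same_srv = sum(c["service"] == conn["service"] for c in host)
        diff_host = sum(c["dst_ip"] != conn["dst_ip"] for c in srv)
        yield {
            "count": len(host),
            "srv_count": len(srv),
            "serror_rate": rate(syn_err(host), len(host)),
            "srv_serror_rate": rate(syn_err(srv), len(srv)),
            "rerror_rate": rate(rej_err(host), len(host)),
            "srv_rerror_rate": rate(rej_err(srv), len(srv)),
            "same_srv_rate": rate(same_srv, len(host)),
            "diff_srv_rate": rate(len(host) - same_srv, len(host)),
            "srv_diff_host_rate": rate(diff_host, len(srv)),
        }


# Hypothetical connections, not taken from real traffic.
conns = [
    {"ts": 0.0, "dst_ip": "10.0.0.2", "dst_port": 80,
     "service": "http", "flag": "S0"},
    {"ts": 0.5, "dst_ip": "10.0.0.2", "dst_port": 80,
     "service": "http", "flag": "SF"},
    {"ts": 1.0, "dst_ip": "10.0.0.3", "dst_port": 80,
     "service": "http", "flag": "REJ"},
    {"ts": 5.0, "dst_ip": "10.0.0.2", "dst_port": 22,
     "service": "ssh", "flag": "SF"},
]
rows = list(traffic_features(conns))
for row in rows:
    print(row)
```

Rescanning the window for every connection is simple but may be too slow
for Gbit traffic; per-host and per-port counters updated on window entry
and exit would be the obvious next step.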


So, if Bro-IDS is enough to calculate the above attributes from live
traffic somehow, whether by saving some attributes to a DB and then
reprocessing them or by other means, any guidance will be appreciated. What
I am trying to do is recreate these attributes for real traffic and test my
algorithm on an up-to-date dataset.

-- 
Oğuz Yarımtepe
http://about.me/oguzy