LMA: Log Mail Analyzer. Maurizio Aiello, maurizio.aiello@ieiit.cnr.it, National Research Council, Institute of Electronics and Telecommunications and Information Engineering (IEIIT). http://sourceforge.net/lma

Post on 13-Jan-2016

Page 1: LMA: Log Mail Analyzer Maurizio Aiello maurizio.aiello@ieiit.cnr.it National Research Council Institute of Electronics and Telecommunications and Information

LMA: Log Mail Analyzer

Maurizio Aiello
maurizio.aiello@ieiit.cnr.it

National Research Council
Institute of Electronics and Telecommunications and Information Engineering (IEIIT)

http://sourceforge.net/lma

Free software project LMA: Log Mail Analyzer

What can be done with log file analysis?
– Answering users’ requests
– Normal debugging operations
– Help with worm detection

Why do we need a tool for mail log analysis?
– Mainly, to avoid headaches
– To speed up operations

Postfix architecture

Why are log files so complex?

– Modularity
– Log = Debug
– …

Interesting fields

What information do we need about an e-mail transaction?
– Timestamp
– Client IP
– Mail From
– Rcpt To
– Status

Using the QID (queue identifier) as a hash key, we retrieve the value of each field above.
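The QID-keyed lookup can be sketched in Python as follows. This is only an illustration, not LMA's own code: the sample log lines follow the usual Postfix syslog layout but all values are made up, and the names `collect_by_qid`, `LINE_RE` and `FIELD_RES` are hypothetical.

```python
import re
from collections import defaultdict

# Illustrative lines in the usual Postfix syslog layout (all values made up).
SAMPLE_LOG = """\
Jan  1 10:00:00 mx postfix/smtpd[1234]: 4F9D1A0B2C: client=pc1.example.org[192.0.2.10]
Jan  1 10:00:00 mx postfix/cleanup[1235]: 4F9D1A0B2C: message-id=<1@example.org>
Jan  1 10:00:01 mx postfix/qmgr[1236]: 4F9D1A0B2C: from=<alice@example.org>, size=512, nrcpt=1 (queue active)
Jan  1 10:00:02 mx postfix/local[1237]: 4F9D1A0B2C: to=<bob@example.org>, relay=local, delay=2, status=sent (delivered to mailbox)
Jan  1 10:00:02 mx postfix/qmgr[1236]: 4F9D1A0B2C: removed
"""

# Timestamp, host, postfix/<daemon>[pid], then the QID and the rest of the line.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s\S+\spostfix/\w+\[\d+\]:\s"
    r"(?P<qid>[0-9A-F]+):\s(?P<rest>.*)$"
)

# One pattern per field of interest.
FIELD_RES = {
    "ip": re.compile(r"client=\S*\[([^\]]+)\]"),
    "mail_from": re.compile(r"from=<([^>]*)>"),
    "rcpt_to": re.compile(r"\bto=<([^>]*)>"),
    "status": re.compile(r"status=(\w+)"),
}


def collect_by_qid(log_text):
    """Return {qid: {field: value}}, keeping the first value seen per field."""
    mails = defaultdict(dict)
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # not a line that carries a QID
        record = mails[m.group("qid")]
        record.setdefault("timestamp", m.group("ts"))
        for name, regex in FIELD_RES.items():
            found = regex.search(m.group("rest"))
            if found:
                record.setdefault(name, found.group(1))
    return dict(mails)


mails = collect_by_qid(SAMPLE_LOG)
print(mails["4F9D1A0B2C"])
```

The sketch keeps only the first recipient of each message; a message with several recipients would need one record per "to=" line.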

Postfix: remote client to local user

E-mail translation

Retrieving info on a mail:
– Find its QID
– Search lines related to that QID
– Reconstruct the transaction (Local-Local, L-Remote, R-L, R-R)

LMA module: Log-Translator

Output: info file (plaintext)

Architectural issue

Customization needs:
– Network architecture
– Antivirus server
– ….

Configuration file:
– Whitelisting
– Network selection
– DB format, server type

Database generation

To store e-mail transactions we support 2 options:

Transactional DB (MySQL):
+ query flexibility
+ engine power
- need to install engine

Berkeley DB:
+ LMA standalone program (no db engine required)
- need to build engine
- engine power and flexibility

Dbgenerator module

With Berkeley DB we have to build the DB engine ourselves:

Database keys and values

Database      Key                                    Value
Mail_db       E-mail_number (progressive integer)    Timestamp, ip, from, to, status
Date_db       Timestamp                              Sequence of e-mail_number
IP_db         IP address                             Sequence of e-mail_number
Receiver_db   “Rcpt to” recipient                    Sequence of e-mail_number
Sender_db     “mail from” sender                     Sequence of e-mail_number
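The key/value layout above can be illustrated with plain Python dictionaries standing in for the Berkeley DB files. This is a minimal sketch under that assumption: the records are made up, and `build_databases` is a hypothetical helper, not part of LMA.

```python
from collections import defaultdict

# (timestamp, ip, mail_from, rcpt_to, status); all values are made up.
records = [
    ("01-Jan-2004", "192.0.2.10", "alice@example.org", "bob@example.org", "250"),
    ("01-Jan-2004", "192.0.2.11", "carol@example.org", "bob@example.org", "250"),
    ("02-Jan-2004", "192.0.2.10", "alice@example.org", "dave@example.net", "550"),
]


def build_databases(records):
    """Build the main table plus the four index tables as dictionaries."""
    mail_db = {}
    date_db = defaultdict(list)
    ip_db = defaultdict(list)
    receiver_db = defaultdict(list)
    sender_db = defaultdict(list)
    # E-mail_number is a progressive integer starting at 1.
    for number, rec in enumerate(records, start=1):
        ts, ip, sender, rcpt, _status = rec
        mail_db[number] = "|".join(rec)
        date_db[ts].append(number)
        ip_db[ip].append(number)
        receiver_db[rcpt].append(number)
        sender_db[sender].append(number)
    return {
        "Mail_db": mail_db,
        "Date_db": dict(date_db),
        "IP_db": dict(ip_db),
        "Receiver_db": dict(receiver_db),
        "Sender_db": dict(sender_db),
    }


dbs = build_databases(records)
print(dbs["Sender_db"]["alice@example.org"])
print(dbs["Mail_db"][3])
```

Each index table maps one field value to the list of e-mail numbers that carry it, so a lookup costs one index access plus one Mail_db access per hit.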

Database schema

Query engine and example

To search through the DB, LMA performs the following steps.

Example: find all e-mails sent from [email protected]:

1. Search for [email protected] in the Sender_db table.
2. Obtain a list of integers, which are keys in the mail table:
   [email protected] -> 27 | 45 | 78 | 3456 | 8960 etc.
3. Retrieve all the data about each e-mail:
   27 -> 01-Jan-2004|xxx.yyy.www.zzz|[email protected]|[email protected]|250

Built-in query

List all e-mails sent with the following characteristics:

– IP: from a particular IP
– FROM: with a given “mail from” field
– TO: to a particular recipient
– DATE: with ts_begin < timestamp < ts_final

System management & debugging: OK.
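One plausible way to combine these filters is to intersect the sets of e-mail numbers returned by each index. The sketch below assumes in-memory dictionaries instead of the real database files; `MAIL_DB`, `build_index` and `query` are hypothetical names and the records are made up.

```python
from datetime import datetime

# e-mail number -> (timestamp, ip, mail_from, rcpt_to, status); made-up data.
MAIL_DB = {
    1: (datetime(2004, 1, 1, 9, 0), "192.0.2.10", "alice@example.org", "bob@example.org", "250"),
    2: (datetime(2004, 1, 1, 9, 5), "192.0.2.11", "carol@example.org", "bob@example.org", "250"),
    3: (datetime(2004, 1, 2, 14, 0), "192.0.2.10", "alice@example.org", "dave@example.net", "550"),
}


def build_index(field):
    """Map each value of one record field to the set of e-mail numbers."""
    index = {}
    for number, rec in MAIL_DB.items():
        index.setdefault(rec[field], set()).add(number)
    return index


IP_DB, SENDER_DB, RECEIVER_DB = build_index(1), build_index(2), build_index(3)


def query(ip=None, sender=None, rcpt=None, ts_begin=None, ts_final=None):
    """Return sorted e-mail numbers matching every filter that is given."""
    hits = set(MAIL_DB)
    if ip is not None:
        hits &= IP_DB.get(ip, set())
    if sender is not None:
        hits &= SENDER_DB.get(sender, set())
    if rcpt is not None:
        hits &= RECEIVER_DB.get(rcpt, set())
    # Strict inequalities, as in ts_begin < timestamp < ts_final.
    if ts_begin is not None:
        hits = {n for n in hits if MAIL_DB[n][0] > ts_begin}
    if ts_final is not None:
        hits = {n for n in hits if MAIL_DB[n][0] < ts_final}
    return sorted(hits)


print(query(sender="alice@example.org"))
print(query(ip="192.0.2.10", rcpt="bob@example.org"))
```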

Security?

What about security?

Worms spread either by a “direct” method, scanning ports and exploiting vulnerabilities, or

by an “indirect” method, for example using their own SMTP engine or the SMTP server taken from the User Agent settings.

Security aspects

When a PC is infected by an indirect worm, we expect:
– Lots of e-mails sent in a given time period;
– Different “mail from” fields used by the same IP;
– An abnormal number of mails rejected by Internet servers.

LMA birth:

awk 'BEGIN { FS="[" } /client=/ { print $3 }' < mail.log | sed 's/]//' | sort | uniq -c | sort -rn
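For comparison, roughly the same count of log lines per client IP can be written in Python. This is a sketch, not LMA code: the sample lines are made up, `count_clients` is a hypothetical name, and unlike the awk one-liner it matches the "client=host[ip]" pattern explicitly rather than taking the third "["-separated field.

```python
import re
from collections import Counter

# Captures the address inside the brackets of "client=host[ip]".
CLIENT_RE = re.compile(r"client=\S*\[([^\]]+)\]")


def count_clients(lines):
    """Return (ip, count) pairs, most frequent client first."""
    counts = Counter()
    for line in lines:
        m = CLIENT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts.most_common()


# Made-up sample lines in the usual Postfix layout.
sample = [
    "Jan  1 10:00:00 mx postfix/smtpd[1234]: 4F9D1A0B2C: client=pc1.example.org[192.0.2.10]",
    "Jan  1 10:00:01 mx postfix/qmgr[1236]: 4F9D1A0B2C: from=<alice@example.org>, size=512, nrcpt=1 (queue active)",
    "Jan  1 10:01:00 mx postfix/smtpd[1234]: 5A1B2C3D4E: client=pc2.example.org[192.0.2.11]",
    "Jan  1 10:02:00 mx postfix/smtpd[1234]: 6B2C3D4E5F: client=pc1.example.org[192.0.2.10]",
]

for ip, n in count_clients(sample):
    print(n, ip)
```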

Another free project: Worm Poacher

Project aiming to:

• Study the behaviour of e-mail clients

• Detect anomalies

• Take the appropriate countermeasures

Statistical data mining

The number of e-mails sent every 5m, 1h, 4h, 8h and 24h is calculated, plotted and analyzed.

[Figure: April 2004. Number of e-mails (y-axis, 0 to 1400) versus time in hours (x-axis, 1 to 641).]
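Counting e-mails per time slice can be sketched as below. The five slice widths come from the slide; everything else (the function name `count_per_slice`, the origin, the sample timestamps) is an assumption made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Slice widths named on the slide.
SLICES = {
    "5m": timedelta(minutes=5),
    "1h": timedelta(hours=1),
    "4h": timedelta(hours=4),
    "8h": timedelta(hours=8),
    "24h": timedelta(hours=24),
}


def count_per_slice(timestamps, width, origin):
    """Return {slice_index: count}; slice 0 starts at `origin`."""
    # Dividing two timedeltas with // yields the integer slice index.
    return dict(Counter((ts - origin) // width for ts in timestamps))


# Made-up send times: 1, 4, 7 and 90 minutes after the origin.
origin = datetime(2004, 4, 1)
sent = [origin + timedelta(minutes=m) for m in (1, 4, 7, 90)]

per_slice = {name: count_per_slice(sent, width, origin) for name, width in SLICES.items()}
print(per_slice["5m"])
print(per_slice["1h"])
```

Slices with no traffic are simply absent from the result, which keeps the output small for long quiet periods.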

Baseline & statistical analysis

Visual inspection, plus baseline threshold analysis and alert raising.

Baseline = [formula not recoverable], calculated by subtracting the “inactivity period”.

Correlation between alerts from different time slices (5m, 1h etc.) to reduce false alarms.
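The baseline formula itself is not available here, so the following sketch swaps in a simple stand-in and should not be read as the authors' method. Assumptions: the baseline is the mean count over slices with traffic (zero-count slices are treated as the “inactivity period”), an alert fires when a slice exceeds `factor` times the baseline, and an alarm is kept only when the 5-minute slice and its enclosing 1-hour slice both alert. All names and numbers are hypothetical.

```python
from statistics import mean


def baseline(counts):
    """Mean count over active slices; zero-count slices are ignored."""
    active = [c for c in counts if c > 0]
    return mean(active) if active else 0.0


def alerts(counts, factor=2.0):
    """Indices of slices whose count exceeds factor * baseline."""
    limit = factor * baseline(counts)
    return {i for i, c in enumerate(counts) if c > limit}


def correlated_alarms(counts_5m, factor=2.0):
    """Keep 5-minute alerts only if the enclosing 1-hour slice also alerts."""
    # Twelve 5-minute slices make up one 1-hour slice.
    counts_1h = [sum(counts_5m[i:i + 12]) for i in range(0, len(counts_5m), 12)]
    hot_5m = alerts(counts_5m, factor)
    hot_1h = alerts(counts_1h, factor)
    return sorted(i for i in hot_5m if i // 12 in hot_1h)


# Made-up traffic: a quiet hour, an idle hour, a quiet hour, a burst hour,
# plus one isolated spike in the first hour.
counts_5m = [2] * 12 + [0] * 12 + [2] * 12 + [40] * 12
counts_5m[5] = 50

print(sorted(alerts(counts_5m)))
print(correlated_alarms(counts_5m))
```

In this made-up data the isolated spike at slice 5 triggers a 5-minute alert but is dropped after correlation, while the sustained burst in the last hour survives.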

Mail from

Normally, a client PC uses only a few “mail from” addresses. Some worms change this field (stealthiness).

Strange behaviour for a PC?

80 different addresses in a day!

As before, a baseline is calculated statistically for each IP.
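Counting distinct “mail from” addresses per IP and per day could look like the sketch below. The fixed `limit` is a simplification standing in for the per-IP statistical baseline, and the names `distinct_senders` and `suspicious`, like the data, are hypothetical.

```python
from collections import defaultdict


def distinct_senders(records):
    """records: iterable of (day, ip, mail_from). Returns {(day, ip): count}."""
    seen = defaultdict(set)
    for day, ip, sender in records:
        seen[(day, ip)].add(sender.lower())
    return {key: len(senders) for key, senders in seen.items()}


def suspicious(records, limit):
    """(day, ip) pairs that used more than `limit` distinct sender addresses."""
    return sorted(key for key, n in distinct_senders(records).items() if n > limit)


# Made-up data: one normal client and one client using 80 different addresses.
records = [("2004-04-01", "192.0.2.10", "alice@example.org")] * 2
records += [("2004-04-01", "192.0.2.66", f"user{i}@example.org") for i in range(80)]

print(distinct_senders(records))
print(suspicious(records, limit=10))
```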

Reject analysis

When a worm tries to spread fast, it sometimes chooses a random list of recipients (like [email protected]).

Probably a lot of these messages are rejected.

Baseline calculation and threshold analysis.
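A per-IP rejection ratio is one simple quantity to feed into such a threshold analysis. The sketch assumes each record carries the SMTP reply code as a string (as in the "250" status shown earlier) and treats 5xx codes, which are permanent failures in SMTP, as rejections; `reject_ratio` and the data are hypothetical.

```python
from collections import Counter


def reject_ratio(records):
    """records: iterable of (ip, smtp_code). Returns {ip: rejected / total}."""
    total, rejected = Counter(), Counter()
    for ip, code in records:
        total[ip] += 1
        if code.startswith("5"):  # 5xx replies are permanent failures
            rejected[ip] += 1
    return {ip: rejected[ip] / total[ip] for ip in total}


# Made-up data: a normal client and a client with mostly rejected mail.
records = [("192.0.2.10", "250")] * 9 + [("192.0.2.10", "550")]
records += [("192.0.2.66", "550")] * 3 + [("192.0.2.66", "250")]

ratios = reject_ratio(records)
print(ratios)
```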

Kinds of analysis performed

                                 Global flow    Single IP flow
Number of e-mails sent                X               X
Different mail from addresses         X               X
Number of rejected mails              X               X

Single ip flow analysis

– Baseline calculated on each IP, instead of on global traffic.
– Single IP flow is useful in big networks (where the signal/noise ratio is low).
– Performance problems and architectural issues (impossible to perform with DHCP, shared PCs etc.).

Results

Worm decision

Future development

Baseline dynamically updated

Alarms generated by a daemon

SMTPsniffer. Reason: the system becomes independent of the log file format and can monitor any server.

Extension to ports other than 25.