analysis of corporate email management system
Indian Institute of Management,
Bangalore
Adarsh Natarajan 2008003
Alok Shukla 2008005
Dharmesh Gandhi 2008019
Narendran Subbaiah 2008038
Ritabrata Bhaumik 2008044
PGSEM 2008 – Section ‘A’ – OM
Group 9
OPERATIONS MANAGEMENT – ANALYSIS OF CORPORATE EMAIL MANAGEMENT SYSTEM
OM – Analysis of Corporate Email System
Table of Contents
1. Analysis of corporate email system
1.1. Definition of Terms
1.2. Block Diagram of existing email system
1.3. Explanation of the existing system
1.4. Assumptions
2. Performance Evaluation of the existing System
3. Proposal for a new system
3.1. Block Diagram of new email system
3.2. Performance Evaluation of the new email system
3.3. Initial Performance evaluation for this system
3.4. Effects of change in the sequencing
1. Analysis of corporate email system
1.1. Definition of Terms
1.1.1. Anti Spam & Phish
The anti-spam and anti-phishing step scans an email for spam signatures based on pre-existing rules.
The same step assigns a phishing score to each individual email. This step handles the bulk
of the load of the email system. It is slow because decomposing an email into its parts involves
several sub-steps. Around 45% of the incoming mails in an average system are classified as
spam or phish.
1.1.2. Anti Virus
The anti-virus test looks for virus signatures, based on MD5 checksums, in an individual email.
This is generally a fast step for most emails. In practice only about 10% of the emails reaching
this scanning step are identified as containing a virus.
1.1.3. File Filter Check
A file filter check verifies that an email's attachments are of the types allowed by a
particular corporation. This step is generally fast, except in the minority of cases where a file
has been renamed, or for outgoing emails, where a data-level protection test is also run.
1.1.4. MCC
This step consists of multiple sub-steps: mail size checks, encrypted email checks and corrupt
email checks. It is usually fast and rejects around 20% of the emails that reach it.
1.1.5. Content Scan
The content scan step parses an email for objectionable and restricted content. As this
involves breaking down every file type into an understandable format, this step is
usually very slow in comparison to the other scanner steps. It also rejects around 20%
of the emails that reach it.
1.2. Block Diagram of existing email system
1.3. Explanation of the existing system
In the existing architecture, the first phase parses and scans the emails. The total number of
resources is 50 threads. The second phase picks up the emails from the buffer. The second
phase, which performs the actual scanning, comprises 5 stages. Each stage in the second phase
has a resource pool of 10 threads.
The 5 stages of the second phase:
• Anti-spam (and anti-phish)
• Mail composite scan
• File filter test
• Content scan
• Anti-virus
Scanner step             % of emails reaching the step that are dropped
Anti Spam (AS)           45
MCC                      20
File Filter Test (FFT)   40
Content Scan (CS)        20
Anti Virus (AV)          10
The rest of this document discusses only the second phase.
1.4. Assumptions
1.4.1. Speed of processing depends on an individual system's configuration. All the data used
in this document is for a specific Intel-based system with 2 GB of RAM.
1.4.2. A clear assumption for this system is that each step's rejection rate is held constant
irrespective of where it is used. A step always rejects the same pre-determined
proportion of the emails reaching it, irrespective of where it is sequenced.
1.4.3. Extraneous factors such as hyper-threading, multiple processors, and policies that
skip some scanning steps are ignored for the discussion in this document.
2. Performance Evaluation of the existing System
The table below outlines the current design for the second phase. Here:
Effective capacity factor = (number of e-mails reaching the stage) / (number of e-mails entering the second phase)
Effective capacity = capacity / effective capacity factor
Stage  Name  Speed         Resources  Capacity      Probability    Effective  Effective capacity
             (emails/min   (threads)  (emails/min)  of email drop  capacity   (emails/min)
             per thread)                                           factor
1      AS     30           10          300          0.45           1           300.00
2      MCC    50           10          500          0.20           0.55        909.09
3      FF     70           10          700          0.40           0.44       1590.91
4      CS     15           10          150          0.20           0.264       568.18
5      AV    111           10         1110          0.10           0.2112     5255.68
The bottleneck, as we can see, is stage 1 (AS), with an effective capacity of 300 emails per
minute. Resources have been distributed equally across all the stages: each stage has a pool
of 10 threads, and each thread is one resource in OM terminology.
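The calculation behind the table can be sketched in a few lines of Python. This is only an illustrative sketch: the stage speeds and drop probabilities are taken from the table, while the names `STAGES`, `effective_capacities` and `bottleneck` are hypothetical helpers introduced here.

```python
# Sketch: effective capacity per stage for the dedicated-thread design.
# Speeds and drop probabilities are copied from the table in this section.

STAGES = [  # (name, speed in emails/min per thread, probability of drop)
    ("AS", 30.0, 0.45),
    ("MCC", 50.0, 0.20),
    ("FF", 70.0, 0.40),
    ("CS", 15.0, 0.20),
    ("AV", 111.0, 0.10),
]


def effective_capacities(stages, threads):
    """Return [(name, effective capacity factor, effective capacity)] per stage."""
    rows = []
    factor = 1.0  # fraction of second-phase arrivals that reach this stage
    for (name, speed, drop), n in zip(stages, threads):
        capacity = speed * n  # raw capacity of the stage in emails/min
        rows.append((name, factor, capacity / factor))
        factor *= 1.0 - drop  # only the survivors move on to the next stage
    return rows


def bottleneck(rows):
    """Return the row of the stage with the smallest effective capacity."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    rows = effective_capacities(STAGES, [10, 10, 10, 10, 10])
    for name, factor, eff in rows:
        print(f"{name:4s} factor={factor:.4f} effective capacity={eff:.2f}")
    print("bottleneck:", bottleneck(rows)[0])
```

With 10 threads per stage, the smallest effective capacity belongs to the AS stage, in line with the table.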
A first level of optimization can be achieved by redistributing the resources across the stages:
Stage  Name  Speed         Resources  Capacity      Probability    Effective  Effective capacity
             (emails/min   (threads)  (emails/min)  of email drop  capacity   (emails/min)
             per thread)                                           factor
1      AS     30           23          690          0.45           1           690.00
2      MCC    50            8          400          0.20           0.55        727.27
3      FF     70            5          350          0.40           0.44        795.45
4      CS     15           12          180          0.20           0.264       681.82
5      AV    111            2          222          0.10           0.2112     1051.14
By a simple redistribution we have managed an improvement of 127.3% (from 300 to 681.82
emails per minute). However, we now hit a bottleneck at the content scan stage. A much higher
increase in capacity should still be possible, because with a dedicated pool per stage some
capacity is always wasted in idle threads. Let us have a look at the waiting times for this
configuration.
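The report does not say how the 23/8/5/12/2 split was found. One simple way to arrive at such a split, shown here purely as an assumption, is a greedy rule: give every stage one thread, then keep handing the next thread to whichever stage is currently the bottleneck. The function name `greedy_allocation` is hypothetical.

```python
# Sketch: greedy distribution of a fixed thread budget across dedicated stages.
# Assumption: this greedy rule is an illustration, not the method used in the report.

STAGES = [  # (name, speed in emails/min per thread, probability of drop)
    ("AS", 30.0, 0.45),
    ("MCC", 50.0, 0.20),
    ("FF", 70.0, 0.40),
    ("CS", 15.0, 0.20),
    ("AV", 111.0, 0.10),
]


def greedy_allocation(stages, total_threads):
    """Return (threads per stage, resulting bottleneck effective capacity)."""
    # Effective capacity factor of each stage: share of arrivals reaching it.
    factors = []
    factor = 1.0
    for _, _, drop in stages:
        factors.append(factor)
        factor *= 1.0 - drop

    threads = [1] * len(stages)  # every stage needs at least one thread

    def effective_capacity(i):
        return threads[i] * stages[i][1] / factors[i]

    # Hand out the remaining threads one at a time to the current bottleneck.
    for _ in range(total_threads - len(stages)):
        worst = min(range(len(stages)), key=effective_capacity)
        threads[worst] += 1

    system_capacity = min(effective_capacity(i) for i in range(len(stages)))
    return threads, system_capacity


if __name__ == "__main__":
    threads, capacity = greedy_allocation(STAGES, 50)
    print(dict(zip((s[0] for s in STAGES), threads)), round(capacity, 2))
```

For a budget of 50 threads this rule ends at the same split as the table, with the content scan stage as the bottleneck.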
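In the waiting-time calculation, ρ = λ_effective / (m µ) and λ_effective = λ × (effective capacity factor),
where m is the number of threads at the stage and µ is the processing speed of one thread.
Assuming Poisson arrivals and exponential service times, the coefficients of variation are Ca = Cs = 1.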
This waiting time is for the 2nd phase only.
[Figure: Total waiting time Ws (min) plotted against arrival rate λ at 300, 400, 500, 600, 650, 675, 677 and 680; the Ws axis runs from 0 to 2.5 min.]
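The waiting-time formula itself did not survive in this copy of the report. As a hedged sketch only, the code below uses a common multi-server queue approximation, Wq ≈ (1/(m µ)) · ρ^(√(2(m+1)) − 1) / (1 − ρ) · (Ca² + Cs²)/2, which reduces to the exact M/M/1 result for m = 1. Both the choice of this formula and the way stage times are added up (weighted by the share of emails reaching each stage) are assumptions, so the numbers need not match the original chart.

```python
import math

# Redistributed configuration from the report:
# (name, threads m, speed mu in emails/min per thread, probability of drop)
STAGES = [
    ("AS", 23, 30.0, 0.45),
    ("MCC", 8, 50.0, 0.20),
    ("FF", 5, 70.0, 0.40),
    ("CS", 12, 15.0, 0.20),
    ("AV", 2, 111.0, 0.10),
]


def queue_wait(lam_eff, m, mu, ca=1.0, cs=1.0):
    """Approximate mean wait in queue (minutes) at a stage with m threads.

    Assumption: a standard multi-server approximation, not necessarily
    the formula used in the original report.
    """
    rho = lam_eff / (m * mu)  # utilisation of the stage
    if rho >= 1.0:
        raise ValueError("stage is overloaded (rho >= 1)")
    exponent = math.sqrt(2 * (m + 1)) - 1
    return (1.0 / (m * mu)) * rho ** exponent / (1.0 - rho) * (ca**2 + cs**2) / 2.0


def total_time(lam):
    """Expected minutes spent in the second phase per entering email.

    Assumption: each stage's queue wait plus service time is weighted by the
    effective capacity factor, i.e. the share of emails that reach the stage.
    """
    total, factor = 0.0, 1.0
    for _, m, mu, drop in STAGES:
        total += factor * (queue_wait(lam * factor, m, mu) + 1.0 / mu)
        factor *= 1.0 - drop
    return total


if __name__ == "__main__":
    for lam in (300, 400, 500, 600, 650, 675, 677, 680):
        print(lam, round(total_time(lam), 3))
```

The sketch raises an error once any stage reaches ρ ≥ 1, which for this configuration happens just below 682 emails per minute, where the content scan stage saturates.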
3. Proposal for a new system
3.1. Block Diagram of new email system
3.2. Performance Evaluation of the new email system
In this system the following changes have been carried out:
• We pool the 10 threads from each stage into a single pool of 50 threads. The
pooling effect is expected to increase the capacity of the system.
• As soon as a thread is available, it picks up an e-mail from the pipe (which acts
like a buffer) and performs all the operations (anti-spam, MCC, FFT, etc.) sequentially.
While these operations are being performed, the thread cannot pick up any
other e-mail. When all these operations are complete, it picks up the next
available e-mail from the pipe.
• Every scanning resource is available as multiple independent instances. This
essentially means that at a given point of time all 50 scan threads could each be using
their own Anti Spam instance, mutually independent of one another.
• The key assumption here is that the rejection rate of each individual step remains the
same, which is consistent with the earlier system.
• This situation is an example of pooling at both ends. There can be many-to-many
combinations between the IPC pipes and the email scan threads. Since each IPC pipe can
be served by the first available thread, the average processing time for an email is the
same whichever IPC pipe it is in.
3.3. Initial Performance evaluation for this system
Here, as before:
Effective capacity factor = (number of e-mails reaching the stage) / (number of e-mails entering the second phase)
Serial  Step  Fraction of     Speed         Time spent  Effective  Average time spent on the
number  name  emails dropped  (emails/min)  per email   capacity   step (min) per email
                                            (min)       factor     entering the 2nd phase
1       AS    0.45             30           0.033333    1          0.033333
2       MCC   0.20             50           0.020000    0.55       0.011
3       FF    0.40             70           0.014286    0.44       0.006285714
4       CS    0.20             15           0.066667    0.264      0.0176
5       AV    0.10            111           0.009009    0.2112     0.001902703
Average total time (min)                                           0.07012175
With 50 scan threads, the capacity of the system is 50 / 0.07012175 = 713.05 emails per minute. This
is a jump of around 5% over the redistributed dedicated-stage configuration (681.82 emails per minute).
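The pooled-capacity figure can be reproduced with a short sketch. Each thread handles one email end to end, so the expected time it spends per email is the sum over steps of (share of emails reaching the step) × (time per email at that step). The name `pooled_capacity` is a hypothetical helper; the step data come from the table.

```python
# Sketch: capacity of the pooled design, where one thread runs all steps in sequence.

STEPS = [  # (name, speed in emails/min, fraction of emails dropped), in scan order
    ("AS", 30.0, 0.45),
    ("MCC", 50.0, 0.20),
    ("FF", 70.0, 0.40),
    ("CS", 15.0, 0.20),
    ("AV", 111.0, 0.10),
]


def average_time_per_email(steps):
    """Expected minutes one thread spends on an email entering the second phase."""
    total, factor = 0.0, 1.0
    for _, speed, drop in steps:
        total += factor / speed  # step is only run for emails that reach it
        factor *= 1.0 - drop     # dropped emails skip all later steps
    return total


def pooled_capacity(steps, threads):
    """Emails per minute the pool can process."""
    return threads / average_time_per_email(steps)


if __name__ == "__main__":
    print(round(average_time_per_email(STEPS), 8))
    print(round(pooled_capacity(STEPS, 50), 2))
```

Running it for 50 threads reproduces the average total time and the capacity quoted in this section.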
3.4. Effects of change in the sequencing
Some further improvement can be obtained by changing the sequence of these steps. We
move earlier in the sequence those steps whose product of speed and fraction of emails
dropped is higher. The following sequence is obtained by using this logic. Please note that this is a
crude logic and ideally linear programming should be used to find the best sequence of steps.
Step sequence number  Step name
1                     FF
2                     AS
3                     AV
4                     MCC
5                     CS
Serial  Step  Fraction of     Speed         Time spent  Effective  Average time spent on the
number  name  emails dropped  (emails/min)  per email   capacity   step (min) per email
                                            (min)       factor     entering the 2nd phase
1       FF    0.40             70           0.014286    1          0.014285714
2       AS    0.45             30           0.033333    0.6        0.02
3       AV    0.10            111           0.009009    0.33       0.002972973
4       MCC   0.20             50           0.020000    0.297      0.00594
5       CS    0.20             15           0.066667    0.2376     0.01584
Average total time (min)                                           0.0590386
Capacity = (1 / average total time) × number of resources
For 50 threads the capacity of the system is 846.90 emails per minute. This is a jump of
around 24% over the redistributed dedicated-stage configuration (681.82 emails per minute).
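With only five steps there are 5! = 120 possible sequences, so the ordering rule can be checked by brute force instead of an optimisation model. The sketch below sorts the steps by speed × fraction dropped and compares the result against an exhaustive search. It relies on the report's assumption that each step's rejection rate is independent of its position; the function names are hypothetical.

```python
from itertools import permutations

# Step data from the report: name -> (speed in emails/min, fraction dropped)
STEPS = {
    "AS": (30.0, 0.45),
    "MCC": (50.0, 0.20),
    "FF": (70.0, 0.40),
    "CS": (15.0, 0.20),
    "AV": (111.0, 0.10),
}


def average_time_per_email(order):
    """Expected minutes one thread spends per entering email for a step order."""
    total, factor = 0.0, 1.0
    for name in order:
        speed, drop = STEPS[name]
        total += factor / speed
        factor *= 1.0 - drop
    return total


def heuristic_order():
    """Steps sorted by speed x fraction dropped, highest first (the report's rule)."""
    return sorted(STEPS, key=lambda s: STEPS[s][0] * STEPS[s][1], reverse=True)


def best_order_exhaustive():
    """Try all 120 sequences and keep the one with the lowest average time."""
    return list(min(permutations(STEPS), key=average_time_per_email))


if __name__ == "__main__":
    order = heuristic_order()
    print(order, round(50 / average_time_per_email(order), 2))
    print(best_order_exhaustive())
```

For these numbers the exhaustive search returns the same sequence as the sorting rule, so under the constant-rejection assumption no other ordering of the five steps gives a higher pooled capacity.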
So we have achieved an overall jump in capacity of 182.3% from the original system.
Original capacity: 300 emails/min
Improved capacity: 846.9 emails/min