
Make Process Plants Inherently Safe Against Cyber Attack

Posted November 8, 2014 by Edward Marszal

Warning... This is going to be a long blog post with a lot of large bits of reference material to go through. Please bear with me, as the message is quite important.

For years, industry pundits have been warning about the massive physical damage and loss of life that can occur as the result of cyber attacks. We have had government agencies prepare case studies demonstrating that cyber attacks can cause physical damage to process plants. The most famous of these cases was the "Aurora" test that was staged (and yes, I mean that in the most pejorative sense of the word "staged") by the Department of Homeland Security. The results of this staged event were widely reported on international news outlets like CNN.

In addition to the Chicken Littles crying out that the sky is falling, we've also had at least one case of a cyber attack that was successful: STUXNET. Of course, the dirty little secret is that STUXNET should have failed too, and only didn't because the design of the equipment that was being attacked was flawed. The government has cranked up the marketing machine related to preventing cyber attacks to a monumental level, culminating with an entire cybersecurity awareness month. Now that cybersecurity awareness month has ended, I would like to confidently report that industry's response has been a resounding... yawn. While most process industry plants have not performed much cybersecurity work at all on their Industrial Control System (ICS) equipment, leaving their IT departments to carry out some basic perimeter guarding, physical damage to process plants caused by cyber attacks is virtually non-existent. Amazing, considering that the number of attempts at hacking into industrial control systems has been reported to exceed several kajillion attempts per hour.

How is this possible? Simple. Process engineers aren't idiots.

The great preponderance of the cyber researcher community has absolutely no idea how process plants operate and how they are designed. One famous cyber researcher once made this asinine statement: "A lot of times the worst thing you can do, for example, is open a valve -- have bad things spew out of a valve." How dumb do you have to be to think that a process engineer created a plant where, if a single valve is opened, a catastrophe will occur? Seriously? In the balance of this blog post I am going to explain a bit about how process plants are designed, assessed for risk, and safeguarded against risks.

Let's start by having a sample plant to look at. We here at Kenexis have created a sample plant that we utilize for training class exercises. It is a small, stripped-down version of a natural gas production facility where gas well fluids are processed in separators to remove natural gas liquids (NGL), the resultant gas stream is compressed and sent to consumers, and the liquid stream is pumped to another industrial user. The following drawings define the facility.

This link will take you to a plot plan drawing, which shows you a general arrangement of the facility: DWG-Plot Plan

This link will take you to a set of drawings that include the process flow diagram (with heat and weight balance information) and the piping and instrumentation diagrams: PFD & P&IDs

Once a process plant design has been created, the information that defines the plant (as shown above) is called the Process Safety Information (PSI). Once the PSI has been generated, the process plant is obligated (if the plant is covered by the OSHA Process Safety Management rule, which most plants where large-scale bad things can happen are) to perform a Process Hazards Analysis. Most operating companies utilize the most comprehensive type of Process Hazards Analysis, the Hazard and Operability Study, or HAZOP. In a HAZOP study, a facility is broken down into "nodes" of similar operating conditions and walked through a set of deviations, such as High Pressure, Low Temperature, and Reverse Flow. For each of these guidewords, a multidisciplinary team determines if there is a cause of that deviation beyond safe operating limits. If so, the team determines the consequence if the deviation were to occur, and then lists all of the safeguards that are available to prevent that deviation from occurring, or at least from escalating to the point where damage can occur. Kenexis has performed a HAZOP on its sample facility. A link to that HAZOP is given below. If you are not familiar with HAZOP studies, I encourage you to look at this document and understand the process.

HAZOP Worksheets

In essence, when a HAZOP is performed, a team of engineers looks at virtually every failure that can possibly occur and makes sure that there are appropriate safeguards to protect against it. I assure you, if there were a single valve that could be opened to cause a catastrophe, it would not make it through this process.

With all of that being said, a little bit more work can be done to make absolutely certain that your plant is inherently safe against cyber attack. That additional step is what I am calling a HAZOP Cyber-Check. As you can see in the HAZOP study, for every deviation that was encountered, a CAUSE for that deviation was identified, and all of the SAFEGUARDS against that cause were also listed. From a cybersecurity perspective, you only have a problem if it is possible, from the ICS equipment, to initiate the CAUSE while simultaneously disabling all of the SAFEGUARDS. This cyber-exposed position is rarely the case, and can virtually always be made untrue. If there are no situations where it is true, then your plant is Inherently Safe against cyber attack.

A little more explanation

With respect to initiating events, or causes, in order to be cyber-exposed, or "hackable," the physical cause discussed in the HAZOP would have to be made possible through a virtual command from the ICS. So, whereas a flow control valve going closed is hackable, the cause "operator inadvertently opens bypass valve" is not, assuming that the bypass valve is hand-cranked and not actuated from the ICS. So, the first step in the HAZOP Cyber-Check is going through each deviation's cause to determine if it is hackable.

The next step is to review the SAFEGUARDS to determine if they are hackable. Any operator response or computer-controlled safety instrumented function is hackable, but many safeguards are not. The following safeguards are mechanical devices that cannot be hacked:

- Pressure relief valves
- Non-return (check) valves
- Mechanical overspeed trips
- Hard-wired overcurrent/undercurrent relays in pump/compressor motor starters
- Analog safety instrumented functions (i.e., analog transmitter, analog current monitor relay, solenoid/interposing relay)

If the deviation in the HAZOP includes any of the above safeguards, then the hazardous outcome cannot be generated through a cyber attack, and the deviation is thus considered not hackable.
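To make this screening concrete, here is a minimal sketch of how the Cyber-Check logic could be run over exported HAZOP worksheet rows. The data structures, the NON_HACKABLE set, and the example rows are my own illustrative assumptions, not part of any Kenexis tool; a real worksheet export would need its own parsing.

```python
# Minimal sketch of the HAZOP Cyber-Check screening logic.
# Assumptions (illustrative only): each worksheet row carries a deviation,
# a cause tagged with whether it can be initiated from the ICS, and a list
# of safeguards tagged by type.

from dataclasses import dataclass, field

# Safeguard types the blog post identifies as mechanical / non-hackable.
NON_HACKABLE = {
    "pressure_relief_valve",
    "check_valve",
    "mechanical_overspeed_trip",
    "hardwired_motor_relay",
    "analog_sif",
}

@dataclass
class HazopRow:
    deviation: str
    cause: str
    cause_ics_actuated: bool          # can the cause be initiated from the ICS?
    safeguards: list = field(default_factory=list)  # safeguard type strings

def is_cyber_exposed(row: HazopRow) -> bool:
    """A row is cyber-exposed only if the cause is hackable AND every
    safeguard is hackable (i.e., none is in the non-hackable set)."""
    if not row.cause_ics_actuated:
        return False
    return not any(s in NON_HACKABLE for s in row.safeguards)

rows = [
    HazopRow("High Pressure", "flow control valve fails closed", True,
             ["pressure_relief_valve", "plc_sif"]),
    HazopRow("High Level", "level control loop manipulated", True,
             ["plc_sif", "operator_response"]),
]

for row in rows:
    status = "CYBER-EXPOSED" if is_cyber_exposed(row) else "inherently safe"
    print(f"{row.deviation}: {status}")
```

In this toy data, the High Pressure row comes back inherently safe because a relief valve cannot be disabled from the ICS, while the High Level row is flagged because every listed safeguard runs through programmable equipment.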

At this point, I'd like to circle back to the famous events where a cyber attack caused physical damage. In the case of STUXNET, a simple, low-cost mechanical overspeed trip on the centrifuges could have prevented the machines from being damaged. Why was a mechanical overspeed trip not included? Overzealous cost-cutting engineers over-relied on programmable systems, and paid the ultimate price. I think that this is an especially important lesson for engineers designing overspeed systems for turbines in the electrical power generation industry. Don't let your hubris get the best of you. A mechanical overspeed trip might save your butt one day.

The other famous case was the Aurora demonstration. The damage was caused to a turbine by generating an overspeed condition. For this to have occurred, the machine would have to not have been equipped with a mechanical overspeed trip, and also not have been equipped with an overcurrent/undercurrent relay on the driver. Is this credible? Let me put it this way: buying an industrial turbine without undercurrent/overcurrent protection on the electric drive and without a mechanical overspeed trip is about as likely as going into a car dealership and walking out with a car that doesn't have seat belts or air bags.

If, when going through the HAZOP Cyber-Check, you get through the CAUSE and SAFEGUARDS and find that everything is hackable, you then need to look at the consequences to determine if that deviation results in a significant consequence. If not, that attack vector is essentially a nuisance that we'll leave to our traditional cybersecurity. If the consequence is significant, then it is incumbent upon the analyst to make a recommendation to add a non-hackable safeguard. This is not required as frequently as one would think, and is not that difficult to do. The worst-case scenario is that you will have to mimic one of the programmable electronic system safety instrumented functions with an analog pathway. For instance, suppose you have high reactor temperature opening a depressuring valve as a safety instrumented function; if it is located in a safety PLC, then in theory it can be hacked. But if you use an analog signal splitter on the 4-20 mA signal from the temperature transmitter, include a current monitor relay on the split 4-20 mA loop, and then connect another solenoid valve to the pneumatic circuit of the depressuring valve -- poof, you've made an analog secondary pathway for that safety instrumented function that is hack-proof.

Going through this process, you can make any process plant Inherently Safe against cyber attack. This doesn't mean that you're engaging in some silly red/blue battle in the cyber-domain; it means that nothing can be done from the cyber-domain that can damage your plant. Forget about a terrorist sneaking in. In this type of plant, you can sit that terrorist down at an operator station with full access to all operator terminals and programming terminals, and even give him ample training on how the plant and control systems work, and he still won't be able to cause any damage.

All of this is possible with a HAZOP Cyber-Check. For your reference, I have included a HAZOP Cyber-Check worksheet that I created as a review of our sample plant. It can be accessed at the following link: GOGOCO Chemical City Plant HAZOP Cyber-Check

I have to admit that when I went through this exercise, I thought there would be more findings, but it turns out that right out of the box, the plant is Inherently Safe against cyber attack. The only recommendations that were made were to ensure that the overcurrent/undercurrent interlocking built into the pump/compressor motor starters was in place and functional. I think that you'll find that this is common in the process industries, and why, even though kajillions of attacks are taking place, nothing is really happening. I highly encourage everyone in the process industries to take on this additional task after each PHA. There's no reason that you can't do it yourself, but of course Kenexis is available to help. I did the above HAZOP Cyber-Check in about two hours, so it is really a cost-effective insurance policy. I'm sure that you can get cyber-checks done on all of your HAZOPs for less than the price of the maintenance agreement of the software running on your network intrusion detection devices.
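Before moving on to the reference material, the full decision flow described above (cause hackable, all safeguards hackable, consequence significant, therefore recommend a non-hackable safeguard) can be captured in a few lines. This is a sketch under stated assumptions: the 1-5 severity scale and the threshold for "significant" are mine, not from the worksheet.

```python
# Sketch of the full HAZOP Cyber-Check decision flow described in the post.
# The 1-5 severity scale and the "significant" threshold are assumptions.

SIGNIFICANT = 4  # assumed threshold on a 1-5 consequence severity scale

def cyber_check(cause_hackable: bool, all_safeguards_hackable: bool,
                severity: int) -> str:
    """Classify one HAZOP deviation per the Cyber-Check logic."""
    if not (cause_hackable and all_safeguards_hackable):
        return "not hackable -- inherently safe"
    if severity < SIGNIFICANT:
        return "nuisance -- leave to traditional cybersecurity"
    # Cause and every safeguard are hackable AND the consequence matters:
    # recommend a mechanical or analog safeguard, e.g., an analog secondary
    # pathway (signal splitter + current monitor relay + solenoid) for a
    # PLC-resident SIF.
    return "RECOMMEND: add a non-hackable safeguard"

# Example: high reactor temperature SIF resident in a safety PLC,
# no mechanical safeguard, severe consequence.
print(cyber_check(cause_hackable=True, all_safeguards_hackable=True, severity=5))
```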

CNN Excerpts

WASHINGTON (CNN) -- Researchers who launched an experimental cyber attack caused a generator to self-destruct, alarming the federal government and electrical industry about what might happen if such an attack were carried out on a larger scale, CNN has learned.

Department of Homeland Security video shows a generator spewing smoke after a staged experiment. Sources familiar with the experiment said the same attack scenario could be used against huge generators that produce the country's electric power. Some experts fear bigger, coordinated attacks could cause widespread damage to electric infrastructure that could take months to fix. CNN has honored a request from the Department of Homeland Security not to divulge certain details about the experiment, dubbed "Aurora," which was conducted in March at the Department of Energy's Idaho lab. In a previously classified video of the test CNN obtained, the generator shakes and smokes, and then stops. DHS acknowledged the experiment involved controlled hacking into a replica of a power plant's control system. Sources familiar with the test said researchers changed the operating cycle of the generator, sending it out of control.

The White House was briefed on the experiment, and DHS officials said they have since been working with the electric industry to devise a way to thwart such an attack. "I can't say it [the vulnerability] has been eliminated. But I can say a lot of risk has been taken off the table," said Robert Jamison, acting undersecretary of DHS's National Protection and Programs Directorate. Government sources said changes are being made to both computer software and physical hardware to protect power generating equipment. And the Nuclear Regulatory Commission said it is conducting inspections to ensure all nuclear plants have made the fix. Industry experts also said the experiment shows large electric systems are vulnerable in ways not previously demonstrated.

"What people had assumed in the past is the worst thing you can do is shut things down. And that's not necessarily the case. A lot of times the worst thing you can do, for example, is open a valve -- have bad things spew out of a valve," said Joe Weiss of Applied Control Solutions. "The point is, it allows you to take control of these very large, very critical pieces of equipment and you can have them do what you want them to do," he said. Adding to the vulnerability of control systems, many of them are manufactured and used overseas. Persons at manufacturing plants overseas have access to control system schematics and even software program passwords, industry experts say. Weiss and others hypothesize that multiple, simultaneous cyber attacks on key electric facilities could knock out power to a large geographic area for months, harming the nation's economy. "For about $5 million and between three to five years of preparation, an organization, whether it be transnational terrorist groups or nation states, could mount a strategic attack against the United States," said O. Sami Saydjari of the nonprofit Professionals for Cyber Defense. Economist Scott Borg, who produces security-related data for the federal government, projects that if a third of the country lost power for three months, the economic price tag would be $700 billion. "It's equivalent to 40 to 50 large hurricanes striking all at once," Borg said. "It's greater economic damage than any modern economy ever suffered.
... It's greater than the Great Depression. It's greater than the damage we did with strategic bombing on Germany in World War II." Computer experts have long warned of the vulnerability of cyber attacks, and many say the government is not devoting enough money or attention to the matter. "We need to get on it, and get on it quickly," said former CIA Director James Woolsey on Tuesday. Woolsey, along with other prominent computer and security experts, signed a 2002 letter to President Bush urging a massive cyber-defense program. "Fast and resolute mitigating action is needed to avoid a national disaster," the letter said. But five years later, there is no such program. Federal spending on electronic security is projected to increase slightly in the coming fiscal year, but spending in the Department of Homeland Security is projected to decrease to less than $100 million, with only $12 million spent to secure power control systems.

Despite all the warnings and worry, there has not been any publicly known successful cyber attack against a power plant's control system. And electric utilities have paid more attention to electronic risks than many other industries, adopting voluntary cyber-standards. "Of all our industries, there are only a couple -- perhaps banking and finance and telecommunications -- that have better cyber-security or better security in general than electric power," Borg said. And DHS notes that it uncovered the vulnerability discovered in March, and is taking steps with industry to address it. While acknowledging some vulnerability, DHS's Jamison said "several conditions have to be in place. ... You first have to gain access to that individual control system. [It] has to be a control system that is vulnerable to this type of attack."

"You have to have overcome or have not enacted basic security protocols that are inherent on many of those systems. And you have to have some basic understanding of what you're doing. How the control system works and what, how the equipment works in order to do damage. But it is, it is a concern we take seriously." "It is a serious concern. But I want to point out that there is no threat, there is no indication that anybody is trying to take advantage of this individual vulnerability," Jamison said.

Managing industrial control system cybersecurity

Proper cybersecurity keeps industry running efficiently

By Jim Gilsinn

The purpose of industrial control system (ICS) cybersecurity is to ensure that the industrial process performs safely and as expected. It should only perform at the right time, for the right people, and for the purposes for which it was designed. Anything outside those conditions is often considered a cybersecurity incident. Small improvements to the system design, network architecture, monitoring strategy, and maintenance policies can solve many problems before they become larger issues.

Reliability and ROI

When broaching the topic of cybersecurity with management, it is important to show some return on investment (ROI). Generally, organizations are not prepared to invest in cybersecurity for cybersecurity's sake. It may be easier to introduce cybersecurity improvements by looking toward the overall reliability and uptime of the system instead. The reliability and uptime of an ICS is a function of safety, security, and performance. A failure in any of those conditions affects the overall reliability of the system, which will affect uptime and production efficiency. For many industrial processes, safety is king. Companies have learned the hard lessons of not responding to safety issues right away through a number of serious incidents. Organizations have also designed their systems to operate safely or with safer processes to lower their potential risk. By reducing the consequences in areas of their systems, organizations can reduce the complexity of the countermeasures that need to be applied to the system. Security can have a negative or positive effect on reliability and uptime, depending on how it is implemented. For example, it can segment the network, reduce the attack surface of legacy systems, and limit the spread of an incident. Performance seems like a natural aspect of reliability and uptime, but the root causes of performance degradation or failures may actually be overlooked. Performance problems often present themselves as inconsistent data delivery, halted human-machine interface screens, or jitter in data values. They may be indicators of network infrastructure problems and not the result of malfunctioning devices.

Understanding risks

Risk management is an integral part of industrial processes. Balancing the process risks with those for production quantity, quality, and safety is important for industrial organizations. When considering how to manage ICS security risks, learn from existing risk management systems. Organizations have often analyzed financial, safety, physical security, and business information technology (IT) security risks. The consequences and risk calculations made during those efforts are similar to those for ICS cybersecurity. Generally, the consequences will be the same for the different risk management systems, although the root causes may be different. When comparing ICS cybersecurity to other risk management systems, consider people, devices, and systems not acting as they should or as they were configured, either through unintentional events or intentional actions. The failure modes associated with ICS security are slightly different as well:
- Loss of view = condition where a device or system is not receiving information from another device or system
- Manipulation of view = actions by an attacker to change the information between devices or systems
- Denial of control = condition where a device or system is not receiving control signals from another device or system
- Manipulation of control = actions by an attacker to change the control signals sent between devices or systems
- Loss of control = actions by an attacker to combine some or all of the above and deny information and control signals from reaching the proper devices or systems correctly

For greenfield (new) ICS, security should be factored in from the start. When designing the control system, organizations should consider the security of components and communication paths. ICS cybersecurity should be included in the normal hazard and operability study, safety instrumented system (SIS) designs, and basic process control system designs. Consider possible single points of failure and systems that require extra protection due to potential consequences or their importance to the process.

For brownfield (retrofit/upgrade) projects, security should be factored into all future designs. The organization should consider adding or modifying security countermeasures during maintenance outages. These upgrades will require more planning, because maintenance outages are limited in duration and resources may not be available. Any improvements should be designed, procured, and tested with enough lead time to initiate them without any delay, possibly months in advance.

Prioritizing countermeasures

In a perfect world, organizations have enough personnel and funding to implement ICS security for all their systems once management approves. In the real world, capital expenditures are limited, personnel are almost always overloaded, and systems cannot be shut down at a moment's notice. Organizations need to prioritize their ICS security countermeasures. One way to do this is by looking at the ability to implement versus the time for planned outages and making three categories: easily actionable improvements, near-term improvements, and long-term plans (see table 1).

Table 1. Countermeasures prioritized by planned outages

Easily actionable improvements are not on the critical path, where a change requires shutting down the main process. An example is removing unused or unauthorized software from operator workstations. Another example is upgrading a network interface on equipment only used periodically. A third example is adding a test system capable of validating patches and updates before they are applied to the production equipment.

Near-term achievable improvements are countermeasures that can be fully developed, procured, tested, and ready to implement before the next planned outage. For example, if a plant is planning a network infrastructure change, it will require downtime to change equipment. The network change can be planned; the equipment can be procured and preconfigured; and the personnel can be trained on the new equipment before the outage. This could possibly require months of preparation.

Long-term plans are countermeasures that may take multiple planned outages to accomplish. They can be done in different ways. One way is to break down the long-term plans into a series of near-term improvements that can then be implemented during multiple planned outages. Another way is to develop the plans in parallel with the existing system. Once finished and tested, the production system can be switched over during a planned outage. Depending on the size of the long-term plans, it may be necessary to use a combination of both methods.

Network segmentation

Network segmentation is one of the biggest factors in ICS network reliability. Properly designed segmentation can be a natural barrier to performance and security issues. A poorly designed network architecture may expose the ICS to unnecessary network traffic, expand the potential attack surface of ICS equipment, and reduce the overall effectiveness of security practices within the organization.

Technology is only part of the solution

Network segmentation is more than just adding technology to a network. It is a process to understand:

- what devices communicate on the network
- how fast or often those devices communicate
- where the information flows throughout the network
- what form that information takes

Understanding how the ICS devices and systems interact is key to designing for robustness and reliability. In the example ICS network architecture (see figure 1), major areas have been segmented by purpose and physical location. Buildings 2 and 3 are tightly coupled, requiring real-time ICS protocol traffic with control cycles in milliseconds. To reduce the ICS core network overhead and improve performance, these networks are joined into a single segment. Building 4 needs access to building 3's information, but it resides on an OPC server in the ICS servers segment.
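To make this zone-and-flow bookkeeping concrete, here is a minimal sketch of how the segments and permitted information flows just described could be modeled and checked. The zone names and the allowed-flow table are illustrative assumptions based on the figure's description, not an exported device configuration.

```python
# Minimal sketch: model network segments (zones) and the information flows
# permitted between them, then check observed flows against that model.
# Zone names and the allowed-flow set are illustrative assumptions.

ALLOWED_FLOWS = {
    ("business", "dmz"),            # business systems reach only the DMZ
    ("dmz", "ics_servers"),         # DMZ exchanges data with ICS servers
    ("ics_servers", "bldg_2_3"),    # servers poll the tightly coupled segment
    ("ics_servers", "bldg_4"),      # building 4 reads building 3 data via OPC
}

def flow_permitted(src_zone: str, dst_zone: str) -> bool:
    """True if traffic between the two zones matches the segmentation design
    (flows are treated as bidirectional conduits here for simplicity)."""
    return (src_zone, dst_zone) in ALLOWED_FLOWS or \
           (dst_zone, src_zone) in ALLOWED_FLOWS

# Example: building 4 talking directly to the building 2/3 segment would
# bypass the OPC server and should be flagged.
for flow in [("bldg_4", "ics_servers"), ("bldg_4", "bldg_2_3"), ("business", "bldg_2_3")]:
    verdict = "ok" if flow_permitted(*flow) else "VIOLATES segmentation design"
    print(f"{flow[0]} -> {flow[1]}: {verdict}")
```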

Figure 1. Example network architecture overlaid with security zones

Security zones are areas within a system that contain similar security requirements. In figure 1, building 1 contains two distinct areas with different sets of security requirements, and area 1 has been assigned its own security zone and network segment. Examples of this situation may be SIS, legacy systems, or vendor-proprietary equipment.

Simplicity of design

Figure 1 may look complex enough with multiple network topology layers and security zones. When overlaying these concepts on an operational ICS network, the architecture only gets more complex. In reality, though, the process of segmenting portions of the network and creating security zones makes things easier in the long run. For networks similar to the one in figure 1, it is common to use a layer 3 switch at the ICS core and separate IP subnets for each of the network segments shown. Improvements to reliability and maintenance will probably overshadow the added initial cost of the network hardware. ICS devices are sensitive to the amount of network traffic they are exposed to on a network. Reducing the amount of network traffic not associated with an ICS device's specific function increases its overall performance. Maintenance reports and network alarms may only include a device's IP address and an error. If the IP address subnet directly relates to a physical area within the facility, maintenance personnel will be able to identify systems and devices more easily. A host-numbering scheme can help further. For example, assign host numbers from 10-29 for all controllers and 30-99 for all I/O devices. Another example is to assign host numbers from 10-49 for process 1 and 50-89 for process 2.

Adding security zones on top of network segments may also seem like extra, unnecessary complexity. In many cases, there will be only three security zones assigned to the ICS. The business/ICS demilitarized zone (DMZ) is responsible for information passed between the business and ICS networks. It needs to restrict direct access through the DMZ, while still allowing information to pass securely in both directions. The ICS core and servers zone generally consolidates and processes information going through the DMZ between the business and ICS systems. The ICS process networks contain the bulk of the systems within the ICS environment. These are the controllers, I/O systems, sensors, actuators, process equipment, and other devices that make up the actual process. Some subnets, systems, or subsystems within the ICS network have higher security requirements than the rest of the network (e.g., SIS, legacy systems, vendor-proprietary systems, and tightly coupled network devices). In these cases, an additional security zone can be created to isolate them from the rest of the network.

Monitoring

If a system is connected to a network and not monitored, then there is no guarantee that it is safe, secure, or performing properly. Monitoring can be done with specialized, purpose-designed systems and services or by observing the behavior of the system. Four main things to monitor are network segmentation devices, ingress/egress filtering, intrusion detection systems (IDSs), and network performance indicators.

Network segmentation device monitoring

Segmenting networks using devices such as layer 3 switches, firewalls, routers, and data diodes should be part of most network designs.
The rule sets, configurations, and logs for the segmentation devices should use strict change management policies and be monitored regularly. Changes to access privileges, management interfaces, or segmentation rule sets should receive prior approval. Even small changes can have drastic effects on the architecture of the network. An automated tool for monitoring these changes is recommended, given the length and complexity of the files.

Ingress/egress filtering

It is also important to monitor the types of traffic flowing in and out of the ICS network. If the configuration of the segmentation device is changed without the knowledge of the ICS network administrator, the first indication may be unknown network traffic going across the business/ICS network boundary. Monitoring the ingress and egress of traffic during factory acceptance testing (FAT), site acceptance testing (SAT), or process startup establishes a baseline of the network traffic. Facilities should monitor traffic periodically to look for new network communication paths to addresses within or outside the organization. If unknown traffic is detected, there may be a problem in a particular ICS or network segmentation device.

Intrusion detection systems

IDSs are valuable tools capable of monitoring traffic continuously to look for known conditions based upon a set of rules. In many cases, additional rules can be generated for known good traffic. IDSs also generate alarm and event data that can be integrated into other systems, like security information and event management systems.

Network performance indicators

Network performance indicators are less well defined than the other types of monitoring and are very process dependent. The network streams that are most sensitive to network anomalies are heavily dependent on the ICS network architecture, devices, protocols, and environmental conditions. Some network performance metrics and methods have been developed to aid organizations. The cyclic jitter on periodic traffic or the latency associated with command/response traffic can measure an ICS network traffic stream. Deciding which metric is more important depends on the ICS being analyzed. Also, simple statistics about these streams may not be useful. In many cases, mean, minimum, maximum, and standard deviation values do not indicate any problems. Network performance indicators may only be observable when looking at time plots for each traffic stream. Tools such as Matlab, Microsoft Excel, and the Kenexis Gemini tool can be used to generate time plots. In the following two examples, the desired cyclic traffic pattern should be 20 ms. In figure 2, the device produced a random pattern at approximately 40 ms delay, which the receiving device interpreted as missing packets. The device produced these with no recognizable pattern. In figure 3, the device was able to transmit at approximately 19.5 and 20.5 ms, but not at 20 ms exactly. In addition, an angular pattern indicated a skew between two or more of the devices' internal clocks.
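Before looking at the figures, here is a rough illustration of this kind of inter-arrival analysis. This is only a toy sketch, not the Gemini tool's implementation (which is not described here); the timestamps and tolerance band are made-up assumptions.

```python
# Rough sketch: compute inter-arrival times for a cyclic ICS traffic stream
# and flag deviations from the expected period. Timestamps (in seconds) would
# come from a packet capture; the values below are made up for illustration.

EXPECTED_PERIOD = 0.020   # 20 ms cyclic traffic
TOLERANCE = 0.002         # assumed +/- 2 ms acceptance band

timestamps = [0.000, 0.020, 0.040, 0.081, 0.100, 0.1195, 0.1405]

intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]

for i, dt in enumerate(intervals, start=1):
    flag = ""
    if abs(dt - EXPECTED_PERIOD) > TOLERANCE:
        # roughly double the period usually means a missed packet
        flag = "  <-- anomaly (possible missed packet or jitter)"
    print(f"packet {i}: {dt * 1000:.1f} ms{flag}")
```

Plotting these intervals over time, rather than summarizing them, is what reveals patterns like the 40 ms dropouts in figure 2 or the 19.5/20.5 ms banding in figure 3.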

Figure 2. Example #1 of Gemini Tool showing device performance

Figure 3. Example #2 of Gemini Tool showing device performance

Both examples came from prototype devices during development. The data captures and traffic graphs gave the vendor valuable information it used to improve its products before production. Had these graphs come from devices in live production, the results may have indicated an anomaly in the device. That anomaly could be from a vendor development issue, but it could also be from a network performance problem or a security incident. Using network performance tools during FAT, SAT, and startup to create baselines is an important way to judge the continuing health of the network. By comparing the baseline diagrams to ones collected periodically, early indications of network performance issues can be detected before they lead to larger problems.

Whitelisting

Whitelisting is not a new technology, but it has not gained much traction in the IT environment. Whitelisting is the process of restricting the applications and libraries that can run on a system to a previously approved list. When an application starts, it is checked against the approved list to determine whether it can run. This is the opposite of blacklisting software, such as antivirus and antimalware applications, which restricts known bad behavior. Whitelisting makes starting an application slower, but the application runs much faster in memory because the checks are much less intrusive. For IT desktop computers, whitelisting is not common, because the software on them changes regularly with the introduction of new operating system patches, software updates, antivirus signatures, etc. After each change, a system administrator would have to verify that the new applications and libraries are valid and whitelist each one of them. For more than a couple of systems, this would be too time consuming to be practical. For systems in the ICS environment that do not change regularly, and where changes have to be approved through a change management process, whitelisting makes sense. An administrator will already be going through the action of approving the specific changes. After each change has been made, the final step in the change management process would be to approve the change in the whitelisting software. (A minimal sketch of the core check appears after the summary below.)

In summary, there are many things an ICS organization can do to manage its cybersecurity. In many cases, improving the reliability and uptime of the systems has much more return on investment for the organization. Security is one aspect of reliability and uptime, as are performance and safety, and many aspects of improving the performance, safety, and security of systems interrelate. Small improvements to design, architecture, monitoring and maintenance policies, and personnel responsiveness can solve many problems before they become large issues.
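As promised in the whitelisting section, here is a minimal sketch of the core check: hash an executable and test membership in an approved set. Real whitelisting products hook process creation in the operating system; the file path and the placeholder digest below are illustrative assumptions.

```python
# Minimal sketch of a hash-based whitelist check. Real whitelisting products
# intercept process creation in the OS; this shows only the membership test.
# The approved-hash set and example path are illustrative assumptions.

import hashlib

APPROVED_HASHES = {
    # SHA-256 digests recorded when changes were approved (placeholder value)
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def may_run(path: str) -> bool:
    """Allow execution only if the file's hash is on the approved list."""
    return sha256_of(path) in APPROVED_HASHES

# Example (hypothetical path): anything not explicitly approved is blocked.
# print(may_run("/opt/hmi/bin/display_server"))
```

This default-deny membership test is also why the change management tie-in matters: every approved software change must end with re-hashing the changed files, or legitimate applications will be blocked.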