An Oracle White Paper June 2010

Driving Datacenter Efficiency Through Server and Application Consolidation

Executive Overview
Introduction
Optimizing the Datacenter
    Server Consolidation
    Application Consolidation
    Putting It All to Work
Improving Datacenter Efficiency and Flexibility
    Dynamic Domains
    Dynamic Reconfiguration
Simplifying Application Consolidation
    Oracle Solaris Containers
    Oracle Solaris Resource Manager
Managing the Virtualized Environment
    eXtended System Controller Facility
    Oracle Enterprise Manager Ops Center
    SNMP Service
Conclusion

Executive Overview

The massive expansion of compute infrastructure over the last few decades has left many organizations facing power and cooling, floor space, and administrative challenges. By embracing application and server consolidation strategies, IT managers can gain real efficiencies that lower costs and streamline operations. Oracle’s Sun SPARC Enterprise M-Series servers offer the scalability required to tackle consolidation projects. These systems also support a choice of virtualization technologies that help organizations match the right level of application and resource isolation to the job at hand.

Introduction

Business processes are increasingly dependent upon technology. After decades of expanding the IT infrastructure, many organizations own a large, complex network of systems. Current challenges within these environments include datacenter floor space limits, excessive energy costs, and constraints on administrative resources. Given these conditions, many IT managers seek a more-reasonable IT infrastructure strategy.

Opportunities exist to gain efficiencies through application and server consolidation. Bringing together applications, databases, and services onto fewer, highly reliable servers can lower costs, increase efficiency, and reduce administration requirements. Sun SPARC Enterprise M-Series servers offer the scalability that can maximize the return on consolidation efforts. In addition, these systems provide a range of virtualization technologies—at no additional cost—that can help companies achieve the right level of application isolation and management flexibility.

With symmetric multiprocessing scalability from 1 to 64 processors, memory subsystems as large as 4 TB, and high-throughput I/O architectures, Sun SPARC Enterprise M-Series servers easily perform the heavy lifting required of consolidated workloads. Also architected to reduce planned and unplanned downtime, these servers provide the increased reliability, availability, and serviceability (RAS) needed by a consolidated IT infrastructure. Sun SPARC Enterprise M-Series servers support a variety of tools to create a virtualized environment and simplify the management of application and resource isolation, including the following:

• Dynamic Domains partition Sun SPARC Enterprise M-Series servers along physical boundaries, providing complete resource isolation between applications.

• Dynamic Reconfiguration (DR) helps support continuous uptime and increases flexibility by allowing the addition and removal of CPU/memory and I/O boards and the transfer of these resources from one Dynamic Domain to another without interrupting application processing.

• Oracle Solaris Containers technology isolates software applications and services using flexible, software-defined boundaries. With this technology, many private execution environments can be created within a single instance of the Oracle Solaris operating system (OS).

Through examples, this white paper provides insight into the virtualization capabilities provided by Sun SPARC Enterprise M-Series servers. Detailed descriptions of Dynamic Domains, DR, and Containers technologies help provide a better understanding of each virtualization tool. In addition, information is included regarding embedded and add-on management tools that help streamline operation and enhance the value of using Sun SPARC Enterprise M-Series servers within a virtualization strategy.

Optimizing the Datacenter

Datacenters containing a large number of outdated and underused servers are cumbersome to manage and expensive to maintain. By reducing server count, consolidation projects can help organizations realize greater efficiencies. An environment with fewer systems lowers datacenter space, power, and cooling requirements and reduces administrative tasks. Consolidating applications and servers can also reduce software licensing, support, and maintenance fees. As a result, companies can achieve significant capital and operating cost reductions. Furthermore, moving software that executes on older hardware and operating systems onto newer, more-powerful platforms often provides the side benefit of improving application performance.

Workload characteristics often drive the consolidation approach. In some cases, applications can easily reside on the same server in the same operating system instance. Figure 1 provides a simple example of consolidating four NFS servers onto a single platform. The homogeneity of the workload keeps resource, management, and tuning conflicts to a minimum. In this example, the server and operating system count both move from four to one, creating an environment that is easier to manage and maintain.

Figure 1. The consolidation of four homogeneous workloads onto a single server.

In contrast to this first example, most consolidation projects are more complex and involve the colocation of multiple types of workloads. Within many IT departments, early attempts to host a mix of applications on a single server uncovered a number of challenges. For instance, an ill-behaved application can starve the processing or I/O resources from other colocated software. Maintenance and tuning requirements might not always align easily. In addition, simply installing a number of applications on a single server can fail to provide the strict boundaries required by applications that process sensitive data. In the past, finding compatible workloads meant creating detailed application profiles, including operating system tuning needs, software library requirements, uptime mandates, and growth predictions. At the end of these efforts, many organizations failed to identify more than a few application consolidation opportunities. Many applications simply required the isolation previously only available by executing them on a separate server.

Today, a number of approaches exist to help isolate software programs running within a consolidated server. As described in the following sections, Sun SPARC Enterprise M-Series servers support a range of technologies to accommodate various flexibility, availability, and security requirements.

Server Consolidation

Server consolidation projects aim at reducing the total number of systems in the datacenter. The security and workload characteristics of some applications can complicate server consolidation efforts. However, hard partitioning technology can simplify the process of reducing the total number of systems while still maintaining complete application isolation. Hard partitioning technology divides the physical resources of a system, providing individual workloads with exclusive access to specific CPU, memory, and I/O components.

Sun SPARC Enterprise M-Series servers offer hard partitioning technology in the form of Dynamic Domains. Instantiating a number of Dynamic Domains on a Sun SPARC Enterprise M-Series server divides the system into multiple electrically isolated partitions. Each Dynamic Domain executes a unique instance of Oracle Solaris. Because isolation is instantiated all the way to the hardware, configurations can be created in which software changes, reboots, and potential faults in one domain do not impact applications running in other domains. Figure 2 provides an example of consolidating a number of resource intensive applications with various security and uptime requirements onto a single platform.

Figure 2. Server consolidation with complete fault and resource isolation and dynamic control of compute capacity within each domain.

A key feature of Sun SPARC Enterprise M-Series servers, DR supports the movement of CPU, memory, and I/O resources from one Dynamic Domain to another—without the need for downtime. As described in the following examples, using Dynamic Domains and DR can help organizations respond more rapidly to changing business conditions and project requirements:

• Consolidation. One Sun SPARC Enterprise M-Series server can replace multiple smaller servers. Consolidated servers are easier to administer, more robust, and offer the flexibility to shift resources freely from one application to another. Increased flexibility is especially important as applications grow, or when demand reaches peak levels, requiring additional resources to be rapidly deployed.

• Development, production, and test environments. In production environments, many sites require separate development and test systems. Isolating these systems with domains helps enable development work to continue without impacting production runs. With the Sun SPARC Enterprise M-Series servers, development and test functions can safely coexist on the same platform.

• Software migration. Dynamic Domains can be used to help migrate systems or application software and associated users to updated versions. New or perhaps more-experienced users can employ the latest versions in one isolated domain, while others waiting to be trained can continue to use older versions in another domain. This approach applies equally well to Oracle Solaris, database applications, new administrative environments, and other applications.

• Special I/O or network functions. A system domain can be established to deal with specific I/O devices or functions isolated within its domain. For example, a high-end tape device can be attached to a dedicated system domain, which can be added to other system domains when there is a need to make use of the device.

• Departmental systems. Multiple projects or departments can share a single Sun SPARC Enterprise M-Series server, increasing economies of scale and easing cost justification and accounting requirements.

• Configuring for special resource requirements or limitations. Projects that have resource requirements that might starve other applications can be isolated to their own system domain. For applications that lack scalability, multiple instances of the application can be run in separate system domains, or in containers within one domain.

• Hardware repairs and rolling upgrades. Because each domain runs its own instance of Oracle Solaris and has its own peripherals and network connections, domains can be reconfigured without interrupting the operation of other domains. Domains can be used to remove and reinstall boards for repair or upgrade, to test new applications, and to perform operating system updates.

Taking advantage of Dynamic Domains can help organizations reduce the number of datacenter platforms, while dramatically increasing the flexibility of the capacity planning and management process. More information on the hard partitioning capabilities of Sun SPARC Enterprise M-Series servers is found in the section titled “Improving Datacenter Efficiency and Flexibility.”

Application Consolidation

Minimizing operating costs is a common goal of consolidation projects. For many organizations, this aim requires application consolidation—reducing the number of operating system instances and applications as well as the number of platforms. Ideally, application consolidation merges the functions of multiple software programs into a single workload. At the least, application consolidation involves hosting more than one software program per operating system instance.

Resource contention between applications and conflicting software settings can become an obstacle to application consolidation. Operating system virtualization technology helps create isolated environments, allowing multiple applications to coexist without interference. Sun SPARC Enterprise M-Series servers support Containers technology to facilitate provisioning the resources of a single server and operating system instance into many private environments.

Containers are the virtual operating system abstractions that control namespace and software fault isolation. Each Container maintains a unique identity that is separate from the underlying hardware and behaves as if it is a single system. Oracle Solaris Resource Manager is an additional component of a Container that further defines resource rights. Oracle Solaris Resource Manager leverages operating system controls to govern the use of CPU, memory, and I/O. For example, system administrators can set and enforce policies that guarantee a share of CPU cycles and virtual memory space to individual zones. Administrators can also set upper limits on process count, number of logins, and connect time for each system user ID. To increase flexibility, Oracle Solaris Resource Manager supports the dynamic allocation of threads to a Container.

Oracle Solaris Containers enable organizations to

• Build customized, isolated environments—each with its own IP address, file system, users, and assigned resources—to safely and easily consolidate systems

• Guarantee sufficient CPU and memory resource allocation to applications while retaining the ability to use idle resources as needed

• Reserve and allocate a specific CPU or group of CPUs for the exclusive use of the container
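The share-based guarantee in the list above can be illustrated with a small model. The following Python sketch is not Oracle Solaris Resource Manager code; the function name `cpu_entitlements`, the container names, and the share values are hypothetical. It assumes a proportional-share policy in which only containers with runnable work compete for CPU, so capacity left idle by one container becomes available to the others.

```python
def cpu_entitlements(shares, active):
    """Return the fraction of CPU each active container is entitled to.

    shares: dict mapping container name -> assigned share count (hypothetical)
    active: set of container names that currently have runnable work
    """
    total = sum(shares[name] for name in active)
    if total == 0:
        return {}
    return {name: shares[name] / total for name in active}


# Hypothetical share assignments for three containers.
shares = {"web": 10, "db": 30, "batch": 10}

# All three containers busy: each gets its shares divided by the total.
print(cpu_entitlements(shares, {"web", "db", "batch"}))

# "batch" is idle: its capacity is redistributed to the busy containers.
print(cpu_entitlements(shares, {"web", "db"}))
```

In this model a container's guaranteed minimum is its share count divided by the total of all assigned shares, and its actual entitlement only grows when other containers are idle.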

Figure 3 illustrates how Containers technology can be used to host multiple workload types within a single instance of Oracle Solaris. In this scenario, administrators can configure separate LAN or virtual LAN connections with exclusive IP stacks for each Container, enabling secure separation of network traffic. In addition, the ability to exert fine-grained control over rights and resources keeps applications from interfering with one another.

Figure 3. Application consolidation with isolation between applications provided by Containers technology.

Utilizing Containers can help organizations create consolidated environments that dramatically decrease the number of operating system instances to manage. Additional information on utilizing Containers with Sun SPARC Enterprise M-Series servers can be found in the section titled “Simplifying Application Consolidation.”

Putting It All to Work

Each type of virtualization technology offers different strengths. Table 1 compares the capabilities of Dynamic Domains and Containers. It is important to note that the technologies discussed in this paper are not mutually exclusive. Using hard partitioning in conjunction with OS virtualization technology and resource management software can help create the best-possible server consolidation environment.

TABLE 1. COMPARISON OF VIRTUALIZATION TECHNOLOGIES FOR SUN SPARC ENTERPRISE M-SERIES SERVERS

| | DYNAMIC DOMAINS | ORACLE SOLARIS CONTAINERS |
| --- | --- | --- |
| Resource isolation | Physically defined boundaries with electrical isolation of hardware resources | • Software-defined boundaries • Sharing of underlying hardware |
| Resource control | • Dynamic adjustment of hardware resource allocation • Granularity limited to the constraints imposed by the specific hardware platform | • Fine-grain, flexible control of system and network resource allocation • Ability to set upper limits on resources for individual processes and users |
| Operating system | • One operating system instance per hard partition • Supports installation of a unique operating system version on each partition • Combines with Containers to support many private execution environments in each domain | • Many private execution environments within one instance of the operating system • Provides runtime support for additional operating environments |
| Identity | Unique instance of the operating system | Unique namespace and identity for each execution environment |
| Security | Complete isolation from other domains | Prevents unauthorized access and unintended intrusions |
| Test environment support | • Dynamic allocation of compute resources to test environments during nonpeak periods | • Supports cost-effective creation of development and test environments that scale on demand • Provides rapid cloning of existing environments to another container within the same system or on another system with the same processor technology • Simplifies the progression from test to production |

Sun SPARC Enterprise M-Series servers deliver the technology to help organizations create an ideal consolidation environment. For example, multiple Containers that implement Oracle Solaris Resource Manager controls can be established within each Dynamic Domain on a Sun SPARC Enterprise M-Series server to create the following advantages:

• Applications that must remain physically isolated from one another or require different versions of the operating system can be placed in separate domains.

• Compute resources can be added and removed from Dynamic Domains on demand to maximize system use.

• By using Containers, multiple applications can reside within a single domain, reducing the number of operating system instances to manage while minimizing application conflicts.

• Oracle Solaris Resource Manager can govern the proper distribution of resources within a given Container, helping avoid the potential for one application to starve another software program of compute power.

Improving Datacenter Efficiency and Flexibility

Given the increasing need for nonstop availability of IT services, organizations often demand service-level agreements (SLAs). These contracts define the services to be provided along with metrics for determining if they have been adequately delivered. Hosting each project or application on a dedicated system can help meet SLAs. However, this approach often creates a proliferation of systems, increasing administration tasks and costs and creating an inflexible, inefficient environment. Consolidating workloads onto fewer, more-powerful servers can foster greater efficiency.

Sun SPARC Enterprise M-Series servers offer powerful features to partition the system’s resources into isolated domains. The assignment of resources to individual domains can be dynamically adjusted to meet changing demands. For example, a single Sun SPARC Enterprise M-Series server, partitioned into domains, each running an instance of Oracle Solaris, can support many applications, including

• File and print services for heterogeneous clients

• Messaging and mail services

• Web services

• Application services

• Mission-critical databases, data warehouses, and analytics

Dynamic Domains

Two key technologies Sun SPARC Enterprise M-Series servers offer to enable consolidation are Dynamic Domains and Dynamic Reconfiguration. Through the power of DR—an enabling technology behind Dynamic Domains—system resources can be added and removed from individual domains without impacting operation. This capability can lead to more-flexible and cost-effective management of IT resources.

A domain is an independent system resource that runs its own copy of Oracle Solaris. Domains divide a system’s total resources into separate units that are not affected by one another’s operations. Domains can be used for different types of processing—for example, one domain can be used to test new applications, while another domain can be used for production purposes.

Each domain uses a separate boot disk with its own instance of Oracle Solaris, as well as I/O interfaces to network and disk resources. CPU/memory boards and I/O boards can be separately added and removed from running domains using DR, providing they are not also assigned to other domains. Domains run applications in strict isolation from applications running in other domains. Security between domains is maintained through role-based access control, assigning unique privileges per domain and restricting the platform and root administrators from domain control and data access. Domains, in conjunction with modular systems like the Sun SPARC Enterprise M-Series servers, can help decrease costs and reduce overhead when employed to consolidate applications. Data from one domain is isolated from other domains at the hardware level. This separation is enforced by the system controller (SC) ASIC, ensuring one domain cannot access data packets from another domain.

How Domains Work

Sun SPARC Enterprise M-Series servers have a unique partitioning feature that can divide one physical system board (PSB) into one logical board or four logical boards. The number of physical system boards in the Sun SPARC Enterprise M-Series servers varies from 1 to 16, depending on the model. The I/O varies with the server, and can include PCI Express (PCIe) slots, PCI-X slots, and built-in I/O. Table 2 provides the characteristics of each Sun SPARC Enterprise M-Series server.

TABLE 2. SUN SPARC ENTERPRISE M-SERIES SERVERS SYSTEM CHARACTERISTICS

| SERVER | PROCESSORS | MEMORY | PHYSICAL SYSTEM BOARDS | I/O BOARDS | DYNAMIC DOMAINS |
| --- | --- | --- | --- | --- | --- |
| Sun SPARC Enterprise M3000 Server | • 1 • SPARC64 VII | Up to 64 GB | 1 | • 4 PCIe slots • External x2 SAS port | 1 |
| Sun SPARC Enterprise M4000 Server | • Up to 4 • SPARC64 VI or SPARC64 VII | Up to 256 GB | 1 | • 1 I/O tray • 4 PCIe slots and 1 PCI-X slot per I/O tray | Up to 2 |
| Sun SPARC Enterprise M5000 Server | • Up to 8 • SPARC64 VI or SPARC64 VII | Up to 512 GB | 2 | • Up to 2 I/O trays • 4 PCIe slots and 1 PCI-X slot per I/O tray | Up to 4 |
| Sun SPARC Enterprise M8000 Server | • Up to 16 • SPARC64 VI or SPARC64 VII | Up to 1 TB | Up to 4 | • Up to 4 I/O units (IOUs) • 8 PCIe slots per IOU | Up to 16 |
| Sun SPARC Enterprise M9000-32 Server | • Up to 32 • SPARC64 VI or SPARC64 VII | Up to 2 TB | Up to 8 | • Up to 8 IOUs • 8 PCIe slots per IOU | Up to 24 |
| Sun SPARC Enterprise M9000-64 Server | • Up to 64 • SPARC64 VI or SPARC64 VII | Up to 4 TB | Up to 16 | • Up to 16 IOUs • 8 PCIe slots per IOU | Up to 24 |

To use a PSB in the system, the hardware resources on the board must be logically divided and reconfigured as eXtended System Boards (XSBs), which support two types: Uni-XSB and Quad-XSB. These XSBs can be combined freely to create domains.

A Uni-XSB is a PSB that is not logically divided and is configured into one XSB. It contains all of the resources on the board (that is, four CPUs, 32 DIMMs, and I/O) and is suitable for domains requiring a large quantity of resources. Because Uni-XSBs constitute physical domaining with boundaries at the board level, a Uni-XSB provides the best fault isolation. If a board fails in a configuration that uses Uni-XSBs, only one domain is affected. Uni-XSBs for the midrange and high-end systems are illustrated in Figure 4 and Figure 5. Uni-XSBs can also be configured for memory mirror mode. In this mode, the PSB has two memory units, one mirroring the other. Saving the same data in a separate memory unit improves data security.

Figure 4. Uni-XSB on a midrange system.

Figure 5. Uni-XSB on a high-end system.

A Quad-XSB is a PSB that is logically divided and configured into four XSBs. Each of the four XSBs contains one-quarter of the total board resources, for example, on high-end systems, one CPU, eight DIMMs, and two PCIe cards (one-quarter of the I/O unit or IOU), as illustrated in Figure 6.

Figure 6. Quad-XSB on a high-end system.

On midrange servers, shown in Figure 7, only two XSBs have I/O. Quad-XSBs enable subboard domain granularity and, therefore, better resource use. However, if a board fails, each domain that uses the board experiences a fault condition. Memory mirror mode can be enabled for Quad-XSBs in the midrange systems only.

Figure 7. Quad-XSB on a midrange system.
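The Uni-XSB and Quad-XSB divisions described above amount to simple arithmetic on the board's resources. The following Python sketch models that arithmetic for a high-end board; the `Resources` class and `divide_psb` function are hypothetical names, not part of any Oracle tool. It assumes an even four-way split, so it does not capture the midrange case in which only two of the four XSBs have I/O.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: int
    dimms: int
    pcie_slots: int


def divide_psb(psb, mode):
    """Split a physical system board (PSB) into eXtended System Boards (XSBs).

    mode "uni"  -> one XSB holding all board resources
    mode "quad" -> four XSBs, each holding one-quarter of the resources
    """
    if mode == "uni":
        return [psb]
    if mode == "quad":
        quarter = Resources(psb.cpus // 4, psb.dimms // 4, psb.pcie_slots // 4)
        return [quarter for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")


# High-end board: four CPUs, 32 DIMMs, and one IOU with eight PCIe slots.
high_end_psb = Resources(cpus=4, dimms=32, pcie_slots=8)

print(divide_psb(high_end_psb, "uni"))
print(divide_psb(high_end_psb, "quad")[0])
```

Each Quad-XSB in this model ends up with one CPU, eight DIMMs, and two PCIe slots, matching the one-quarter figures given in the text.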

A domain consists of one or more XSBs. Each domain runs its own copy of Oracle Solaris and must have a minimum of one CPU, eight DIMMs, and I/O. The number of domains allowed depends on the server model. The default is one domain and the maximum number of domains is 24. The maximum number of XSBs in a domain is 16. Domains can be set up to include both Uni- and Quad-XSBs.
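The domain rules stated above can be expressed as a short set of checks. The following Python sketch is illustrative only: the function names and the dictionary layout used to describe an XSB are hypothetical, and the per-model domain limits are copied from Table 2.

```python
MAX_XSBS_PER_DOMAIN = 16

# Maximum number of Dynamic Domains per server model, taken from Table 2.
MAX_DOMAINS = {
    "M3000": 1, "M4000": 2, "M5000": 4,
    "M8000": 16, "M9000-32": 24, "M9000-64": 24,
}


def check_domain(xsbs):
    """Return a list of rule violations for one proposed domain.

    xsbs: list of dicts with keys "cpus", "dimms", and "has_io" (hypothetical layout)
    """
    problems = []
    if not 1 <= len(xsbs) <= MAX_XSBS_PER_DOMAIN:
        problems.append("a domain needs between 1 and 16 XSBs")
    if sum(x["cpus"] for x in xsbs) < 1:
        problems.append("a domain needs at least one CPU")
    if sum(x["dimms"] for x in xsbs) < 8:
        problems.append("a domain needs at least eight DIMMs")
    if not any(x["has_io"] for x in xsbs):
        problems.append("a domain needs I/O")
    return problems


def check_domain_count(model, domain_count):
    """Return True if the model can host this many domains."""
    return 1 <= domain_count <= MAX_DOMAINS[model]


# A single Quad-XSB with I/O satisfies the minimums.
print(check_domain([{"cpus": 1, "dimms": 8, "has_io": True}]))

# A Quad-XSB without I/O cannot form a domain on its own.
print(check_domain([{"cpus": 1, "dimms": 8, "has_io": False}]))
```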

A domain component list (DCL) identifies the potential resources for a domain. A single XSB can be listed in the DCLs of multiple domains. However, an XSB can be assigned to only one domain at a time. The domain configuration software maps each XSB number to a logical system board (LSB) number.
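The distinction between listing an XSB in a DCL and assigning it to a domain can be sketched as a small data structure. The following Python model is a toy illustration, not the XSCF implementation; the `Platform` class, its method names, and the XSB label "00-0" are assumptions made for the example.

```python
class Platform:
    """Toy model of domain component lists (DCLs) and XSB assignment."""

    def __init__(self):
        self.dcl = {}        # domain id -> {LSB number: XSB label}
        self.assigned = {}   # XSB label -> domain id that currently owns it

    def register(self, domain, lsb, xsb):
        """List an XSB in a domain's DCL under a logical system board number."""
        self.dcl.setdefault(domain, {})[lsb] = xsb

    def assign(self, domain, xsb):
        """Assign an XSB to a domain; it must be in that DCL and not owned elsewhere."""
        if xsb not in self.dcl.get(domain, {}).values():
            raise ValueError(f"{xsb} is not in the DCL of domain {domain}")
        owner = self.assigned.get(xsb)
        if owner is not None and owner != domain:
            raise ValueError(f"{xsb} is already assigned to domain {owner}")
        self.assigned[xsb] = domain


platform = Platform()
platform.register(domain=0, lsb=0, xsb="00-0")
platform.register(domain=1, lsb=3, xsb="00-0")   # same XSB listed in two DCLs
platform.assign(0, "00-0")

try:
    platform.assign(1, "00-0")
except ValueError as err:
    print(err)   # 00-0 is already assigned to domain 0
```

Note that the same XSB maps to a different LSB number in each domain's DCL, which reflects the per-domain mapping described above.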

Figures 8a and 8b illustrate a configuration of four domains using one Uni-XSB and two Quad-XSBs. Note that each domain requires I/O, but that it is not a requirement for each LSB.

Figure 8a. A domain configuration using a Uni-XSB.

Figure 8b. A domain configuration using Quad-XSBs.

Oracle Solaris is installed on a per-domain basis. The operating system image is installed on internal disks. On midrange systems, the disks are available only for the first (top) I/O device and the third (third from top) I/O device. The second and fourth I/O devices do not have the capability to support internal hard disks.

Fault Isolation and Error Management

Domains are protected against software or hardware failures in other domains. Failures in hardware shared between domains cause failures only in the domains that share the hardware. When a domain encounters a fatal error, a domainstop operation occurs that cleanly and quickly shuts down only the domain with the error. Domainstop operates by shutting down the paths in and out of the system address controller and the system data interface ASICs. The shutdown is intended to prevent further corruption of data and to facilitate debugging by not allowing the failure to be masked by continued operation.
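The rule that a failure in shared hardware affects only the domains sharing it can be made concrete with a lookup over board assignments. The following Python sketch is a simplified model; the function name, the domain labels, and the assignment layout are hypothetical.

```python
def affected_domains(failed_psb, assignments):
    """Return the domains that lose hardware when one physical board fails.

    assignments: dict mapping (psb, xsb_index) -> domain label (hypothetical layout)
    """
    return sorted({domain for (psb, _), domain in assignments.items()
                   if psb == failed_psb})


assignments = {
    (0, 0): "A",                                         # PSB 0 as a Uni-XSB
    (1, 0): "B", (1, 1): "B", (1, 2): "C", (1, 3): "D",  # PSB 1 as a Quad-XSB
}

print(affected_domains(0, assignments))  # ['A']
print(affected_domains(1, assignments))  # ['B', 'C', 'D']
```

The example shows the trade-off between the two board modes: a failed Uni-XSB board touches one domain, while a failed Quad-XSB board touches every domain that holds one of its quarters.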

When certain hardware errors occur in a Sun SPARC Enterprise M-Series server, the system controller performs specific diagnosis and domain recovery steps. The following automatic diagnosis engines identify and diagnose hardware errors that affect the availability of the system and its domains:

• eXtended System Control Facility (XSCF) diagnosis engine. Diagnoses hardware errors associated with domainstop operations.

• Oracle Solaris diagnosis engine. Identifies nonfatal domain hardware errors and reports them to the system controller.

• Power-on self-test diagnosis engine. Identifies any hardware test failures that occur when the power-on self-test (POST) is run.

In most situations, hardware failures that cause a domain crash are detected and eliminated from the domain configuration either by POST or OpenBoot PROM during the subsequent automatic recovery boot of the domain. However, there can occasionally be situations where failures are intermittent or the boot-time tests are inadequate to detect the failures that cause repeated domain failures and reboots.

In those situations, the XSCF uses configurations or configuration policies supplied by the domain administrator to eliminate hardware from the domain configuration in an attempt to get a stable domain environment running.

Dynamic Reconfiguration

DR and automated dynamic reconfiguration (ADR) allow resources to be dynamically reallocated, or balanced, between domains. Using this technology enables a physical or logical restructuring of the hardware components of Sun SPARC Enterprise M-Series servers while the system is running and the applications remain available. This high degree of resource flexibility allows the domain or platform administrator to reconfigure the system easily to provision the resources to meet changing workload demands. Domain configurations can be optimized for workloads that are compute intensive, I/O intensive, or both. DR can also be used to remove and replace failed or upgraded hardware components while the system is online. [1]

DR functions of Sun SPARC Enterprise M-Series servers are performed on XSB units and managed through the XSCF. The XSCF security management restricts DR operations to administrators who have the proper access privileges. Three types of system board components can be added or deleted by DR: CPU, memory, and I/O devices.

[1] Sun SPARC Enterprise M4000, Sun SPARC Enterprise M5000, Sun SPARC Enterprise M8000, and Sun SPARC Enterprise M9000 servers can perform DR to logically move system resources between domains. In addition, Sun SPARC Enterprise M8000 and Sun SPARC Enterprise M9000 servers can perform hot-swap operations to physically add or remove boards from the chassis.

DR allows the domain or platform administrator to perform the following functions:

• Display the DCL and domain status

• Display the status and state of system or I/O boards and some components to help prepare for DR operations, including whether the board is Capacity on Demand (COD) or not

• Test live boards

• Register system or I/O boards to the DCLs of domains

• Delete (electrically isolate) system or I/O boards from a domain in preparation for moving to another domain or removal from the system while the domain remains running

• Add system or I/O boards to a domain to add resources or replace a removed board, while the domain remains running

• Configure or unconfigure CPU or memory modules on system boards to control power and capacity of a domain or isolate faulty components

• Enable or disable PCI cards or related components and slots

• Reserve a system board to a domain

For example, on the Sun SPARC Enterprise M8000 and Sun SPARC Enterprise M9000 servers, the IT operator can use DR to delete a faulty system board, then use the system’s hot-plug feature to physically remove it. After plugging in the repaired board or a replacement, DR can be used to add the board into the domain. System or I/O boards can also be associated with multiple domains for load balancing or to provide extra capabilities for specific tasks. However, resources can only be assigned to one domain at a time. In addition, combining the capabilities of DR with network and storage multipathing solutions can foster the creation of redundant network or storage subsystems with automatic failover, load balancing, and DR capabilities.

Basic Dynamic Reconfiguration Functions

All system boards that are targets of DR operations must be registered in the target domain’s DCL through the XSCF. The basic functions of DR are add, delete, move, and replace.

• Add. DR can be used to add a system board to a domain without stopping Oracle Solaris, provided the board is installed in the system and not assigned to another domain. A system board is added in three stages: assign, connect, and configure. The selected system board is first assigned to the target domain and then connected to it. Finally, the board is configured into the Oracle Solaris instance of the domain. At this point, the system board is added to the domain, and its CPU and memory resources can be used by that domain.

• Delete. DR can be used to delete a system board from a domain without stopping Oracle Solaris running in that domain. A system board is deleted in three stages: unconfigure, disconnect, and unassign. The selected system board is first unconfigured from its domain by Oracle Solaris. The board is then disconnected and unassigned from the domain. At this point, the system board is deleted from the domain.

• Move. DR can be used to reassign a system board from one domain to another without stopping Oracle Solaris running in either domain. The move function changes the configurations of both domains without physically removing and remounting the system board. The move operation for a system board is a serial combination of the delete and add operations; in other words, the selected system board is deleted from its domain and then added to the target domain.

• Replace. DR can be used to remove a system board from a domain and either add it back later or replace it with another system board, without stopping Oracle Solaris running on that domain, provided both boards satisfy DR requirements (for example, the board does not make up the entire domain and no processes are running on its CPUs). In the replace operation, the selected system board is deleted from the OS of the domain. The system board is then removed when it is ready to be released from its domain. After the field replacement or other maintenance task is complete, the system board is reinstalled and added. DR cannot be used to replace a system board in a midrange system because replacing a system board replaces a motherboard unit (MBU). To replace a system board in a midrange system, turn off the power of all domains, and then perform the hardware replacement.

In the example shown in Figure 9, system board #2 is deleted from domain A and added to domain B. In this way, the physical configuration of the hardware (mounting locations) is not changed, but the logical configuration is changed for management of the system boards.

Figure 9. An example of a reconfiguration.
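The add, delete, and move operations described above can be pictured as a small state machine. The following Python sketch is only an illustration of the stage ordering given in the text; it is not XSCF or Oracle Solaris code, and the class, function, and state names (SystemBoard, add_board, delete_board, move_board) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class SystemBoard:
    name: str
    domain: Optional[str] = None  # domain the board is assigned to, if any
    state: str = "unassigned"     # unassigned -> assigned -> connected -> configured


def _check_registered(board: SystemBoard, domain: str, dcl: Dict[str, Set[str]]) -> None:
    # A board must be registered in the target domain's DCL before any DR operation.
    if board.name not in dcl.get(domain, set()):
        raise ValueError(f"{board.name} is not registered in the DCL of domain {domain}")


def add_board(board: SystemBoard, domain: str, dcl: Dict[str, Set[str]]) -> None:
    """Add: assign, connect, then configure the board into the domain."""
    _check_registered(board, domain, dcl)
    if board.domain is not None:
        raise ValueError(f"{board.name} is already assigned to domain {board.domain}")
    board.domain = domain
    for stage in ("assigned", "connected", "configured"):
        board.state = stage


def delete_board(board: SystemBoard) -> None:
    """Delete: unconfigure, disconnect, then unassign the board."""
    if board.state != "configured":
        raise ValueError(f"{board.name} is not configured in any domain")
    for stage in ("connected", "assigned", "unassigned"):
        board.state = stage
    board.domain = None


def move_board(board: SystemBoard, target: str, dcl: Dict[str, Set[str]]) -> None:
    """Move: a delete from the current domain followed by an add to the target."""
    _check_registered(board, target, dcl)  # check first so a failed move leaves the board in place
    delete_board(board)
    add_board(board, target, dcl)


# Mirrors the Figure 9 scenario: system board #2 moves from domain A to domain B.
dcl = {"A": {"SB#2"}, "B": {"SB#2"}}
board = SystemBoard("SB#2")
add_board(board, "A", dcl)
move_board(board, "B", dcl)
print(board.domain, board.state)
```

Only the logical assignment changes in this model; nothing represents the physical mounting location, which matches the point that a move does not require removing and remounting the board.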

Using Dynamic Reconfiguration to Change a CPU Configuration

When a CPU is added, Oracle Solaris automatically recognizes it and makes it available for use. A CPU can be deleted only if it meets the following conditions:

• No running process is bound to the CPU to be deleted.

• The CPU to be deleted does not belong to any processor set.

• If the resource pools facility is in use by the domain, the CPU to be deleted must not belong to a resource pool (see “Dynamic Resource Pools” for more information on resource pools and processor sets).
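The three deletion conditions can be expressed as a simple pre-check. The sketch below is a hypothetical model for illustration only: the Cpu record and the deletion_blockers function are assumptions, not an Oracle Solaris interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cpu:
    cpu_id: int
    bound_processes: int = 0             # running processes bound to this CPU
    processor_set: Optional[str] = None  # processor set the CPU belongs to, if any
    resource_pool: Optional[str] = None  # resource pool the CPU belongs to, if any


def deletion_blockers(cpu: Cpu, pools_in_use: bool) -> List[str]:
    """Return the reasons a CPU cannot be deleted; an empty list means it can."""
    reasons = []
    if cpu.bound_processes > 0:
        reasons.append("a running process is bound to the CPU")
    if cpu.processor_set is not None:
        reasons.append("the CPU belongs to a processor set")
    # The pool condition applies only when the domain uses the resource pools facility.
    if pools_in_use and cpu.resource_pool is not None:
        reasons.append("the CPU belongs to a resource pool")
    return reasons
```

Returning the list of blocking reasons, rather than a single Boolean, makes it easier to tell the operator what must be changed before the DR operation is retried.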

A SPARC Enterprise server domain runs in one of the following CPU operational modes:

• SPARC64 VI Compatible Mode. All processors in the domain, which can be SPARC64 VI processors, SPARC64 VII processors, or any combination of them, behave like and are treated by the OS as SPARC64 VI processors. The new capabilities of SPARC64 VII processors are not available in this mode.

• SPARC64 VII Enhanced Mode. All boards in the domain must contain only SPARC64 VII processors. In this mode, the server uses the new features of these processors.

DR operations work normally on domains running in SPARC64 VI Compatible Mode. DR can be used to add, delete, or move boards with either or both processor types, which are all treated as if they are SPARC64 VI processors. DR also operates normally on domains running in SPARC64 VII Enhanced Mode, with one exception: DR cannot be used to add or move into the domain a system board that contains any SPARC64 VI processors. To add a SPARC64 VI processor, the domain must be powered off, changed to SPARC64 VI Compatible Mode, and then rebooted.
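The interaction between the CPU operational mode and DR add or move operations reduces to one rule, sketched below. The mode strings and the can_add_board function are hypothetical labels chosen for this example.

```python
from typing import Set

VI = "SPARC64 VI"
VII = "SPARC64 VII"


def can_add_board(domain_mode: str, board_cpus: Set[str]) -> bool:
    """Decide whether DR may add (or move in) a board with the given processor types."""
    if domain_mode == "vi-compatible":
        # Either processor type, or a mix, is treated as SPARC64 VI.
        return True
    if domain_mode == "vii-enhanced":
        # A board containing any SPARC64 VI processor cannot be added by DR.
        return VI not in board_cpus
    raise ValueError(f"unknown CPU operational mode: {domain_mode}")
```

For example, can_add_board("vii-enhanced", {VI, VII}) is False, which corresponds to the case in which the domain must be powered off, changed to SPARC64 VI Compatible Mode, and rebooted.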

Using Dynamic Reconfiguration to Change Memory Configurations

The DR functions classify system boards by memory usage into two types: kernel memory board and user memory board. A kernel memory board is a system board on which kernel memory (that is, memory internally used by Oracle Solaris and containing an OpenBoot PROM program) is loaded. Kernel memory is allocated in the memory on a single system board as much as possible. If all memory on that system board is already allocated to kernel memory and more kernel memory must be added, the memory on another system board is also used.

DR operations can be performed on kernel memory boards. When a kernel memory board is deleted, the system is suspended and kernel memory on the system board to be deleted is copied into memory on another system board. The copy destination board

• Cannot have any kernel memory

• Must have the same or more memory

• Must have the same memory configuration as the system board to be deleted
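The three requirements on the copy destination board can be applied as a filter over the other boards in the domain. This is a minimal sketch under stated assumptions: the BoardMemory record is hypothetical, and the memory_config field is only a stand-in label for whatever the platform considers "the same memory configuration".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BoardMemory:
    name: str
    size_gb: int
    memory_config: str       # hypothetical label for the board's memory configuration
    has_kernel_memory: bool


def eligible_copy_targets(source: BoardMemory, candidates: List[BoardMemory]) -> List[BoardMemory]:
    """Boards that could receive the kernel memory copied from `source`."""
    return [
        b for b in candidates
        if b.name != source.name
        and not b.has_kernel_memory                  # cannot have any kernel memory
        and b.size_gb >= source.size_gb              # same or more memory
        and b.memory_config == source.memory_config  # same memory configuration
    ]
```

If the returned list is empty, this model has no valid destination for the kernel memory, so the kernel memory board could not be deleted in that configuration.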

Kernel cage memory is a function used to minimize the number of system boards to which kernel memory is allocated. Kernel cage memory is enabled by default in Oracle Solaris 10. If the kernel cage is disabled, the system might run more efficiently, but kernel memory is spread among all boards and DR operations on memory do not work.

A user memory board is a system board on which no kernel memory is loaded. Before deleting user memory, the system attempts to swap out the physical pages to the swap area of disks. Sufficient swap space must be available for this operation to succeed.

Some user pages are locked into memory and cannot be swapped out. These pages receive special treatment by DR. Intimate Shared Memory (ISM) pages, for example, are special user pages that are shared by all processes and are permanently locked; ISM is usually used by database software to achieve better performance. Although locked pages cannot be swapped out, the system automatically moves them to memory on another system board. Deleting user memory fails if the remaining system boards do not have enough free memory to hold the relocated pages.
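The two capacity conditions for deleting user memory (swap space for pageable pages, free memory elsewhere for locked pages) can be summarized in a short feasibility check. This is a deliberately simplified model: the function name and its four parameters are assumptions, and a real system accounts for memory at a much finer granularity.

```python
def can_delete_user_memory(
    pageable_mb: int,
    locked_mb: int,
    free_swap_mb: int,
    free_memory_on_other_boards_mb: int,
) -> bool:
    """Simplified check of whether a user memory board could be deleted."""
    # Pageable pages are swapped out, so enough swap space must be available.
    swap_ok = pageable_mb <= free_swap_mb
    # Locked pages (such as ISM) are relocated to memory on other boards.
    relocation_ok = locked_mb <= free_memory_on_other_boards_mb
    return swap_ok and relocation_ok
```

A database domain with a large ISM segment is the case most likely to fail the second condition, because the locked pages can only go to free memory on the remaining boards.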

Using Dynamic Reconfiguration to Change I/O Configurations

In the domain where DR is performed, all device drivers must support the addition of devices by DR. When DR adds an I/O device, it is reconfigured automatically. An I/O device can be deleted when the device is not in use in the domain where the DR operation is to be performed and the device drivers in the domain support DR. In addition, all PCI cards and I/O device interfaces on a system board must support DR. If not, DR operations cannot be executed on that system board. In this case, the power supply to the domain must be turned off before performing maintenance and installation.

In most cases, the device to be deleted is in use. For example, the root file system or any other file systems required for operation cannot be unmounted. To solve this problem, the system can be configured using redundant configuration software to make the access path to each I/O device redundant. One way to accomplish this for disk drives is to employ software that enables disk mirroring.

PCI slots support hot-plug. Before a PCI card can be removed, it must be unconfigured and disconnected. The XSCF controls DR events; however, because hot-plug is controlled entirely within Oracle Solaris, the XSCF is not aware of hot-plug events, including I/O Box hot-plug events. The service manager feature in Oracle Solaris includes a new daemon, oplhpd, which listens for I/O DR events and sends messages to the XSCF. The XSCF uses this information to keep track of faulty I/O cards and when they are replaced.

Replacing Quad-XSB System Boards

If a domain consists only of XSBs that belong to the PSB to be replaced, DR cannot be used for the replacement, and the domain must be stopped before the PSB is replaced. In the example in Figure 10, domain #1 has a configuration that requires it to be stopped before the system board can be replaced.

Figure 10. Domain #1 must be stopped before the PSB can be replaced.

System Board Pooling

The system board pooling function places a system board in a state in which it does not belong to any domain. This function can be used to move a system board among multiple domains as needed. For example, a system board can be added from the system board pool to a domain when its CPUs or memory experience a high load. When the added system board is no longer needed, it can be returned to the system board pool. A pooled system board can be assigned to a domain only if it is registered in the DCL for that domain.

Reserving Domain Configuration Changes

A domain configuration change is reserved when a system board cannot be added, deleted, or moved immediately for operational reasons. The reserved add, delete, or move of the system board is executed the next time the target domain is powered on, powered off, or rebooted. If a system board used as a floating board is held in the system board pool, a domain configuration change can be reserved to assign the system board to the intended domain in advance, preventing the system board from being acquired by another domain.

Automated Dynamic Reconfiguration

ADR enables an application to execute DR operations without requiring user interaction. This ability is provided by an enhanced DR framework that includes the Reconfiguration Coordination Manager and the system event facility. The Reconfiguration Coordination Manager executes preparatory tasks before a DR operation, error recovery during a DR operation, and cleanup after a DR operation. The ADR framework enables applications to automatically give up resources prior to unconfiguring them, and to capture new resources as they are configured into the domain.

Global Automated Dynamic Reconfiguration

Remote DR and local ADR functions are building blocks for a feature called global automated DR. Global automated DR introduces a framework that can be used to automatically redistribute the system board resources on a Sun SPARC Enterprise M-Series system. This redistribution can be based on factors such as production schedule, domain resource usage, and domain functional priorities. Global automated DR accepts input describing resource use policies and then applies those policies to automatically marshal the Sun SPARC Enterprise M-Series system resources for the most effective use.

Capacity on Demand

Capacity on Demand (COD) is an innovative procurement model enabled by Dynamic Domains. With COD, fully configured systems are shipped with only a portion of their resources enabled—in accordance with current needs. Additional processors and memory are installed but initially disabled with a permit tracking mechanism. Under certain conditions, COD boards can be used before actually purchasing a permit.

When a system encounters a resource constraint, purchasing a COD Hardware Activation Option can quickly enable additional capacity. Processors and memory can then be added to existing or new domains on the system. This approach helps avoid the potentially costly possibility of overburdening critical systems when workload increases, helps reduce system outages, and reduces upgrades that take valuable time to execute.

Simplifying Application Consolidation

Using the power of Dynamic Domains, development, prototype, and production environments can be combined on a single large server, rather than on three separate servers. Still other consolidation projects combine multiple database instances and application servers within a single system, sharing the same operating system instance and providing cost savings in administrative tasks such as data management and archive. Whereas Dynamic Domains support the consolidation of several systems into one Sun SPARC Enterprise M-Series server, Containers technology refines resource control further to simplify the consolidation of several applications into one domain. The Containers functionality in Oracle Solaris 10 enables multiple, software-isolated applications to run on a single server or domain. With this capability, organizations can easily deploy multiple applications on a single server while maintaining software boundaries.

In some cases, operating system version incompatibilities between existing applications and new hardware can stall the overall consolidation initiative. Containers software can help mitigate this issue. These tools capture and transfer the configuration, data, and other operating system elements that surround an existing application. Using this technology, older applications can be deployed within a Container on Oracle Solaris 10, even when they depend on Oracle Solaris 8 or Oracle Solaris 9. By encapsulating the application within a Container, an Oracle Solaris 8 or Oracle Solaris 9 runtime environment is made available to the application as needed.

By taking advantage of Containers, a consolidation project using Oracle Solaris 10 can efficiently address the needs of applications that currently reside on earlier Oracle Solaris releases. Furthermore, these older applications can now reap the benefits of the greater compute power, reduced space, and lower power and cooling requirements offered by Sun SPARC Enterprise M-Series servers.

Oracle Solaris Containers

With Containers, IT operators can gain tight control over the allocation of system and network resources. Configurations can even favor certain users in mixed workload environments. For example, in large brokerage firms, traders intermittently require fast access to execute a query or perform a calculation, while other system users have more-consistent workloads. Using Containers, traders can be granted a proportionately larger number of shares of resources to give them the system resources they require.

Containers establish boundaries for resources, such as CPUs, and can be expanded to adapt to the changing processing requirements of the application or applications running in the Container. A Container is a virtualized operating system environment created within a single instance of Oracle Solaris. Applications within containers are isolated, preventing processes in one container from monitoring or affecting processes running in another container. Even a superuser process from one container cannot view or affect activity in other containers. A Container also provides an abstract layer that separates applications from the physical attributes of the system on which they are deployed. Examples of these attributes include physical device paths.

Containers enable more-efficient use of the system. Dynamic resource reallocation permits unused resources to be shifted to other containers as needed. Fault and security isolation means that poorly behaved applications do not require a dedicated and underused system. With containers, these applications can be consolidated with other applications. Containers also allow the IT operator to delegate some administrative functions while maintaining overall system security.

Containers are designed to provide fine-grained control over the resources that applications use, allowing multiple applications to operate on a single server while maintaining specified quality of service levels. Fixed resources such as processors and memory can be partitioned into pools on multiprocessor systems, with different pools shared by different projects (a specified collection of processes) and isolated application environments. Dynamic resource sharing allows different projects to be assigned different ratios of system resources.

When resources such as CPUs and memory are dynamically allocated, resource capping controls can be used to set limits on the amount of resources a project uses. With all these resource management capabilities, organizations can consolidate many applications onto a single server, as illustrated in Figure 11. This helps to reduce operational and administrative costs while also increasing availability.

Figure 11. Example of Containers in a domain.

Resource management is provided by Oracle Solaris Resource Manager. Every service is represented by a project, which provides a network-wide administrative identifier for related work. All the processes that run in a container have the same project identifier, also known as the project ID. The Oracle Solaris kernel tracks resource use through the project ID. This relationship is depicted in Figure 12. Historical data can be gathered by using extended accounting.

Figure 12. Example of projects in a container.

Oracle Solaris Resource Manager

Modern computing environments have to provide a flexible response to the varying workloads that are generated by different, consolidated applications on a system.

A workload is an aggregation of all processes of an application or a group of applications. Oracle Solaris provides a facility to name workloads as projects once they are identified. For example, one project is named for a sales database and another project is named for a marketing database. If resource management features are not used, Oracle Solaris responds to workload demands by adapting to new application requests dynamically. This default response generally means that all activity on the system is given equal access to resources.

Oracle Solaris Resource Manager enables systems to treat workloads individually by

• Restricting access to a specific resource

• Offering resources to workloads on a preferential basis

• Isolating workloads from one another

• Denying resources or preferring one application over another for a larger set of allocations than otherwise permitted

• Preventing an application from consuming resources indiscriminately

• Changing an application’s priority based on external events

• Balancing resource guarantees to a set of applications against the goal of maximizing system use

These capabilities enable Containers to deliver predictable service levels. Effective resource management is enabled in Oracle Solaris by offering control, notification, and monitoring mechanisms. Many of these capabilities are provided through enhancements to existing mechanisms such as the proc(4) file system, processor sets, scheduling classes, and new mechanisms such as dynamic resource pools.

Dynamic Resource Pools

Resource pools enable the IT operator to separate workloads so that they do not consume overlapping resources. They provide a persistent configuration mechanism for processor sets, and optionally, scheduling classes, as illustrated in Figure 13. Resource pools provide a mechanism for dynamically adjusting each pool’s resource allocation in response to system events and application load changes. Dynamic resource pools simplify and reduce the number of decisions required from the IT operator. Pools are automatically adjusted to preserve system performance goals. The software periodically examines the load on the system and determines whether intervention is required to enable the system to maintain optimal performance.

Figure 13. Resource pools.

Resource Management Control

Oracle Solaris provides three types of control mechanisms to control resource usage:

• Constraint. This is a resource-sharing mechanism that sets bounds on the amount of specific resources a workload can consume. It can also be used to control ill-behaved applications—such as applications with memory leaks—that can otherwise compromise performance or availability through unregulated resource requests.

• Scheduling. This is a resource-sharing mechanism that refers to making a sequence of resource allocation decisions at specific intervals based on a predictable algorithm. An application that does not need its current allocation leaves the resource available for another application’s use. Scheduling-based resource management enables full use of an under-committed configuration, while providing controlled allocations in a critically committed or overcommitted scenario. The algorithm determines the level of control. For example, it might guarantee that all applications have some access to the resource. The fair share scheduler is an example of a scheduling mechanism that manages application access to CPU resources in a controlled manner.

• Partitioning. This is a more-rigid mechanism used to bind a workload to a subset of the system’s available resources. This binding guarantees that a known amount of resources is always available to the workload. Resource pools are a partitioning mechanism that limits workloads to a specific subset of the resources of the system. Partitioning can be used to avoid system-wide overcommitment. However, avoiding overcommitment can reduce the achievable utilization, because resources bound to one pool are not available to a workload in another pool, even when the workload bound to them is idle, unless a policy for dynamic resource pools is employed. Good candidates for this type of control mechanism are transaction processing systems that must be guaranteed a certain amount of resources at all times.

Managing CPU Resources with Resource Pools

The ability to partition a server using processor sets has been available since version 2.6 of Oracle Solaris. Every system contains at least one processor set: the system or default processor set that consists of all of the processors in the system. Additional processor sets can be dynamically created and removed on a running system, provided that at least one CPU remains for the system processor set.

Resource pools enable IT operators to create a processor set by specifying the number of processors required, rather than CPU physical IDs. The definition of a processor set is therefore not tied to any particular type of hardware. It is also possible to specify a minimum and maximum number of processors for a pool. Multiple configurations can be defined to adapt to changing resource requirements, such as different daily, nightly, or seasonal workloads. Resource pools can have different scheduling classes. Scheduling classes work per resource pool. The two most common are the fair share scheduler and the time-sharing scheduler.

Fair Share Scheduler

The fair share scheduler (FSS) allocates CPU resources using CPU shares. The FSS helps ensure that CPU resources are distributed among active zones or projects based on the number of shares each zone or project is allocated. Therefore, more-important workloads should be allocated more CPU shares. A CPU share defines the portion of the CPU resources available to a project in a resource pool. It is important to note that CPU shares are not the same as CPU percentages. Shares define the relative importance of projects with respect to other projects. If project A is twice as important as project B, then project A should be assigned twice as many shares as project B.

The actual number of shares assigned is largely irrelevant—2 shares for project A versus 1 share for project B yields the same results as 18 shares for project A versus 9 shares for project B. Project A is entitled to twice the amount of CPU as project B in both cases. The importance of project A relative to project B can be increased by assigning more shares to project A, while keeping the same number of shares for project B.

The FSS calculates the proportion of CPU allocated to a project by dividing the shares for the project by the total number of shares held by all active projects. An active project is a project with at least one process using the CPU. Shares for idle projects, that is, projects with no active processes, are not used in the calculations. Note that the FSS limits CPU usage only if there is competition for the CPU. A project that is the only active project on the system can use 100 percent of the CPU, regardless of the number of shares it holds. CPU cycles are never wasted: if a project does not use all of the CPU it is entitled to because it has no work to perform, the remaining CPU resources are distributed among the other active projects.
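The share arithmetic can be checked with a few lines of Python. This sketch only reproduces the proportional calculation described in the text (a project's shares divided by the total shares of active projects); it is not the scheduler's implementation, and the function name fss_entitlements is hypothetical.

```python
from fractions import Fraction
from typing import Dict, Set


def fss_entitlements(shares: Dict[str, int], active: Set[str]) -> Dict[str, Fraction]:
    """CPU fraction each active project is entitled to under competition."""
    # Idle projects and projects with zero shares do not take part in the calculation.
    competing = {name: s for name, s in shares.items() if name in active and s > 0}
    total = sum(competing.values())
    return {name: Fraction(s, total) for name, s in competing.items()}


# 2:1 and 18:9 give the same entitlements, as the text notes.
print(fss_entitlements({"A": 2, "B": 1}, {"A", "B"}))
print(fss_entitlements({"A": 18, "B": 9}, {"A", "B"}))
# A project that is the only active one is entitled to the whole CPU.
print(fss_entitlements({"A": 2, "B": 1}, {"B"}))
```

In the first two calls project A is entitled to 2/3 of the CPU and project B to 1/3; in the third, B is entitled to all of it because A is idle.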

Fair Share Scheduler and Processor Sets

The FSS can be used in conjunction with processor sets to provide more fine-grained control over the allocation of CPU resources among projects that run on each processor set than would be available with processor sets alone. When processor sets are present, the FSS treats every processor set as a separate partition. CPU entitlement for a project is based on CPU usage in that processor set only. The CPU allocations of projects running in one processor set are not affected by the CPU shares or activity of projects running in another processor set, because the projects are not competing for the same resources. Projects only compete with each other if they are running within the same processor set, as illustrated in Figure 14.

Figure 14. Example of allocating shares in processor sets.
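Because each processor set is treated as a separate partition, the share calculation is carried out per set. The sketch below illustrates this with made-up project names and share values (they are not taken from Figure 14), and it assumes every listed project is active; the function name entitlements_by_pset is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, Tuple


def entitlements_by_pset(projects: Dict[str, Tuple[str, int]]) -> Dict[str, Fraction]:
    """Map each project to its fraction of the CPUs in its own processor set.

    `projects` maps a project name to (processor_set, shares).
    """
    totals: Dict[str, int] = defaultdict(int)
    for name, (pset, shares) in projects.items():
        if shares <= 0:
            raise ValueError(f"project {name} must hold a positive number of shares")
        totals[pset] += shares
    # Only shares within the same processor set compete with each other.
    return {name: Fraction(shares, totals[pset]) for name, (pset, shares) in projects.items()}


example = {"A": ("pset1", 1), "B": ("pset1", 3), "C": ("pset2", 5)}
print(entitlements_by_pset(example))
```

Here projects A and B split the first processor set 1/4 and 3/4, while project C is entitled to all of the second set no matter how many shares A and B hold.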

Resource Pools and Dynamic Reconfiguration Operations

DR allows the hardware to be reconfigured while the system is running. A DR operation can increase, reduce, or have no effect on a given type of resource. Because DR can affect available resource amounts, the pools facility must be included in these operations. When a DR operation is initiated, the pools framework acts to validate the configuration. If the DR operation can proceed without causing the current pools configuration to become invalid, then the private configuration file is updated. An invalid configuration is one that cannot be supported by the available resources.

If the DR operation causes the pools configuration to be invalid, then the operation fails and the IT operator is notified by a message to the message log. The configuration can be forced to complete using the DR force option. The pools configuration is then modified to comply with the new resource configuration. For information on the DR process and the force option, refer to the Dynamic Reconfiguration User Guide for the particular Oracle hardware on which the pools are running.
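The validation step can be sketched as follows. This is a simplified model, not the pools framework itself: it assumes that a pools configuration is valid when the CPUs remaining after the DR operation cover the sum of each pool's minimum CPU count, and the function name and return values are hypothetical.

```python
from typing import Dict


def dr_remove_cpus(
    pool_min_cpus: Dict[str, int],
    online_cpus: int,
    cpus_to_remove: int,
    force: bool = False,
) -> str:
    """Model the outcome of a DR operation that removes CPUs from a domain using pools."""
    remaining = online_cpus - cpus_to_remove
    if remaining < 1:
        raise ValueError("at least one CPU must remain in the domain")
    required = sum(pool_min_cpus.values())
    if remaining >= required:
        return "proceed"   # configuration stays valid; the private configuration is updated
    if force:
        return "forced"    # pools configuration is modified to fit the new resources
    return "rejected"      # operation fails and a message is written to the message log
```

For example, with two pools that need at least 4 and 2 CPUs, removing 4 of 8 CPUs is rejected unless the force option is used, whereas removing 2 of 8 can proceed.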

Managing the Virtualized Environment

Even in a consolidated environment with a reduced number of platforms, administrators need solutions that simplify the management of an IT infrastructure. Sun SPARC Enterprise M-Series servers offer built-in management capabilities to ease common administrative tasks.

eXtended System Controller Facility

Around-the-clock system operation, disaster recovery hot sites, and geographically dispersed organizations lead to requirements for efficient, remote management of systems. One of the many benefits of Sun SPARC Enterprise M-Series servers is the support for system management capabilities through an onboard service processor. eXtended System Controller Facility (XSCF) firmware is preinstalled on the service processor boards and consists of system management applications and two user interfaces: XSCF Web (a browser-based user interface, or BUI) and XSCF Shell (a terminal-based CLI). XSCF Web uses HTTPS and the Secure Sockets Layer/Transport Layer Security (SSL/TLS) protocols to connect to a server on the network and to provide Web-based server status display, server operation control, and configuration information display. The screen shot in Figure 15 provides an example of the XSCF Web interface.

The XSCF firmware is a single, centralized point for managing hardware configuration, controlling the hardware monitor and cooling system (fan units), monitoring domain status, powering on/off peripheral units, and monitoring errors. The XSCF centrally controls and monitors the server. The XSCF includes a partitioning function to configure and control domains. It has a function to monitor the server through an Ethernet connection to enable remote control. It also reports failure information to the system administrator.

To gain additional management capabilities, Sun SPARC Enterprise M-Series servers can be managed by Oracle Enterprise Manager Ops Center software or third-party management tools. The XSCF supports communication with third-party management tools using built-in SNMP.

Figure 15. Screen shot of the XSCF Web interface.

Domain Creation

Domains can be created either with a CLI command or through the XSCF Web browser user interface (BUI). Figure 16 shows the system board configuration screen in the BUI, where administrators can set boards as Uni-XSBs or Quad-XSBs and specify other attributes, such as memory mirroring. Once the system boards are configured, the BUI screen shown in Figure 17 is used to assign an XSB to a specific domain.

Figure 16. System board configuration screen within the XSCF Web console.

Figure 17. XSCF Web console domain configuration interface.

Oracle Enterprise Manager Ops Center

Oracle Enterprise Manager Ops Center allows IT administrators to actively manage and monitor infrastructure resources from virtually anywhere on the network. Oracle Enterprise Manager Ops Center simplifies the management of Oracle Solaris, Linux, and Windows using an advanced knowledgebase while enabling automated lifecycle processes. It also provides full lifecycle management of virtual guests, including resource management and mobility orchestration. Oracle Enterprise Manager Ops Center helps customers streamline operations and reduce downtime.

Asset Management and Discovery

Oracle Enterprise Manager Ops Center automatically draws out the relationship between servers and their associated service processors to hypervisors and operating system instances. Assets can be grouped manually based on location or business function or a smart groups feature can automatically sort the topology. Assets can be registered through inventory software services from Oracle, which provides details about the product’s lifecycle so the user can take appropriate actions.

Simplified Provisioning Process

After discovering and identifying systems and their components, Oracle Enterprise Manager Ops Center can automatically filter the available operating system images and firmware and present only those that are appropriate to the target system. It automatically creates and maintains the underlying technologies used during OS and firmware provisioning so administrators can focus on more important tasks. Because Oracle Enterprise Manager Ops Center deploys only the relevant firmware and operating system across large and diverse asset groups, guesswork is eliminated. Oracle Enterprise Manager Ops Center also bridges the gap between the embedded Oracle VM Server for SPARC and its controlling domain by automatically verifying that the appropriate firmware is installed. This process simplifies virtual machine creation later on and is completely transparent to the user.

Reduced Administrative Costs

Automating most of the deployment process for physical and virtual systems minimizes user involvement. Oracle Enterprise Manager Ops Center offers a facility to create and store the approved operating system or firmware profile required by business services. JumpStart, JET modules, Kickstart, and YaST customizations can be stored under named profiles so that personnel less familiar with the underlying technology can deploy them more easily later. This allows the business to leverage IT skill sets more efficiently across functional units.

Rapid Deployment

Oracle Enterprise Manager Ops Center deploys a complete stack on a bare-metal server to make the system production ready within a short time. It can create a snapshot of a system catalog and restore an operating system to a previous state. It can compare inventories of multiple systems and make target systems match source inventories. This process can be applied to a single system, multiple systems, or multiple datacenters. Because Oracle Enterprise Manager Ops Center has out-of-the-box automation and in-depth knowledge of Oracle systems, operational staff can spend more time focusing on driving greater business value.

Fault and Event Management

Hardware status and operating system performance are tracked to check the overall health of the system. When a predefined threshold or hardware condition is reached, a notification is sent to the user interface and an e-mail is generated automatically.
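The threshold behavior described above can be pictured with a minimal sketch. This is not Oracle Enterprise Manager Ops Center code: the metric names, the limits, and the `check_thresholds` helper are hypothetical, chosen only to illustrate how a predefined threshold might turn a monitored reading into a notification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """A hypothetical alert rule: fire when the metric reaches the limit."""
    metric: str
    limit: float


def check_thresholds(readings, thresholds):
    """Return one notification string per threshold that has been reached."""
    notifications = []
    for rule in thresholds:
        value = readings.get(rule.metric)
        # Skip metrics that were not reported in this sample.
        if value is not None and value >= rule.limit:
            notifications.append(
                f"{rule.metric} reached {value} (threshold {rule.limit})"
            )
    return notifications


# Illustrative values only; real thresholds are defined by the administrator.
rules = [Threshold("cpu_percent", 90.0), Threshold("inlet_temp_c", 35.0)]
sample = {"cpu_percent": 94.5, "inlet_temp_c": 24.0}
alerts = check_thresholds(sample, rules)
```

In this sketch each returned string stands in for the notification that would be posted to the user interface and sent by e-mail.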

Comprehensive Reports

Oracle Enterprise Manager Ops Center’s rich UI and functionality present information based on user definitions. Monitored information can be presented at a per-system or per-virtual-resource level, or can be aggregated across a group of servers or virtual pools. Historical information for parameters such as CPU, memory, network I/O, and power consumption (in watts) can be monitored and stored for future reference. Moreover, all gathered data can be exported for further analysis or to create custom reports.

Automated Patching Using a Unique Knowledgebase

Oracle’s Knowledge Services is a hosted metadata knowledgebase of Oracle Solaris, Oracle Unbreakable Linux, Red Hat, and SuSE operating systems. This knowledgebase is a very powerful capability unique to Oracle. It is delivered to customers through a web service or in a disconnected mode. Leveraging the knowledgebase metadata improves patch accuracy and reduces downtime. The knowledgebase maintains advanced patch, RPM, and package dependency information that has been discovered through unique methods exclusively owned by Oracle. Oracle Enterprise Manager Ops Center uses this knowledgebase to download only the required patches the first time (not all new patches), saving both network bandwidth and compute resources. It applies those patches and performs appropriate actions (single/multiuser mode, reboot option) as required.

Comprehensive reports cover patch requirements and gaps with other systems, enabling system updates and compliance.

Reduce Downtime

Oracle Enterprise Manager Ops Center helps system administrators meet their maintenance windows in three ways. First, leveraging its unique knowledgebase, the product examines the installed software to see if any broken dependencies exist. Next, it searches against vendor bugs, common vulnerability databases, or customer profiles to discover whether updates are needed. With every action, it automatically takes snapshots of the inventory on the system in case rollbacks or time comparisons are needed, and it automatically resolves required patch trees and groups them correctly during patch installation. Lastly, it caches patch payloads on the agents and simulates installation to ensure that the operating system commands and directories are healthy enough to install additional software. The platform can then be patched with a higher level of confidence that nothing will go wrong. Oracle Enterprise Manager Ops Center also automatically discovers Oracle Solaris Live Upgrade alternate boot environments and displays them for selection during patching, allowing for zero-downtime patching.
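Resolving a patch tree so that prerequisites install first is, in general terms, a topological ordering problem. The sketch below illustrates that idea with Python's standard `graphlib` module (Python 3.9 or later). The patch identifiers and the `install_order` helper are hypothetical, and nothing here is meant to describe how Oracle Enterprise Manager Ops Center implements the step internally.

```python
from graphlib import TopologicalSorter

# Hypothetical patch identifiers, each mapped to the patches it depends on.
dependencies = {
    "patch-C": {"patch-A", "patch-B"},
    "patch-B": {"patch-A"},
    "patch-A": set(),
}


def install_order(deps):
    """Return a list in which every patch follows its prerequisites."""
    # TopologicalSorter treats each mapping value as the node's predecessors,
    # so static_order() yields prerequisites before their dependents.
    return list(TopologicalSorter(deps).static_order())


order = install_order(dependencies)
```

If the dependency data contained a cycle, `static_order()` would raise `graphlib.CycleError`, which is one way such a sketch could flag an inconsistent patch set before any installation is attempted.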

Compliance

Multiple compliance reports are possible with Oracle Enterprise Manager Ops Center.

• Compare all the servers against a business project’s requirements.

• Compare against an older vendor-provided baseline or always test against the latest information from the vendor.

• Test against a government-approved common vulnerability database.

• Schedule recurring reports to help discover server sprawl across the datacenter.

• Compare servers to one another or compare previous snapshots of the same server.

• Report who installed or uninstalled what, when, and where via Oracle Enterprise Manager Ops Center.

Rest assured that any compliance violation can quickly be addressed on demand.
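Comparing a server against a baseline, against another server, or against an earlier snapshot of itself comes down to a difference between two inventories. The following is a minimal sketch of that comparison, assuming each inventory is a simple mapping of package name to version. The package names and the `diff_inventory` helper are hypothetical and do not reflect the product's report format.

```python
def diff_inventory(baseline, current):
    """Compare two {package: version} snapshots and report the gaps."""
    # Packages required by the baseline but absent from the current system.
    missing = sorted(set(baseline) - set(current))
    # Packages present on the current system but not in the baseline.
    extra = sorted(set(current) - set(baseline))
    # Packages present in both snapshots at different versions.
    changed = sorted(
        name for name in set(baseline) & set(current)
        if baseline[name] != current[name]
    )
    return {"missing": missing, "extra": extra, "changed": changed}


# Illustrative snapshots only.
baseline = {"pkg-a": "1.0", "pkg-b": "2.1", "pkg-c": "3.0"}
current = {"pkg-a": "1.0", "pkg-b": "2.0", "pkg-d": "0.9"}
report = diff_inventory(baseline, current)
```

An empty result in all three lists would indicate that the server matches the baseline in this simplified model.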

Manages Oracle Virtualization Technologies

Oracle Enterprise Manager Ops Center manages the lifecycle of Oracle Solaris Containers and Oracle VM Server for SPARC. Their resources are monitored continuously to provide up-to-date information on usage. Based on the dynamic needs of the applications, new Oracle Solaris Containers and Oracle VM Server for SPARC virtual guests can be created, deleted, cloned, or reconfigured.

Centralizes Management of Resources

As the central management console for all relevant infrastructure, Oracle Enterprise Manager Ops Center tracks hardware, virtualization components, and operating systems. It provides the appropriate components to keep physical and virtualization assets up to date. Oracle Enterprise Manager Ops Center ensures that the system using Oracle VM Server for SPARC has the appropriate firmware.

Lifecycle Management—Simple Deployment and Maintenance

With Oracle Enterprise Manager Ops Center, you can install and manage all relevant components in a virtualized stack.

• Asset discovery. Oracle Enterprise Manager Ops Center can discover all assets, such as hardware, firmware, virtual systems, and operating systems. You can view them in a usable format.

• Provisioning. With Oracle Enterprise Manager Ops Center, you can provision firmware and operating systems on bare-metal and virtual systems.

• Patching. With its unique knowledgebase, Oracle Enterprise Manager Ops Center keeps all components (physical as well as virtual) in the stack up to date.

• Monitoring. Oracle Enterprise Manager Ops Center monitors physical and virtual systems to provide end-to-end monitoring of the complete stack. It monitors individual as well as aggregate resources to provide a complete view of the system.

Eco-friendly

Oracle Enterprise Manager Ops Center monitors the power consumption of all servers and aggregates consumption patterns. Based on the results, administrators can balance resources by shutting servers down, migrating workloads, or leveraging power capping capabilities in the servers.

Reduces Resource Management Complexity

By providing access to all assets, such as hardware, virtual systems, and operating systems, through one interface, Oracle Enterprise Manager Ops Center makes managing these resources simple. Assets can be logically grouped, automated through smart groups, tagged for custom grouping, and filtered with multiple options.

Investment Protection

Oracle Solaris 8 and Oracle Solaris 9 servers can easily migrate to Oracle Solaris 10. This allows Oracle Solaris 8 and Oracle Solaris 9 implementations to leverage the latest capabilities in Oracle Solaris 10 and newer servers. Virtual resource management is a seamless extension of physical system management. System management software users can easily adjust to virtual resource management through Oracle Enterprise Manager Ops Center’s rich user interface.

Oracle Enterprise Manager Ops Center delivers comprehensive lifecycle management of physical and virtual resources in an easy-to-use interface.

Scale with Resources

Virtualization makes it easy to deploy resources on demand. A proliferation of resources requires a robust management platform that can not only manage the compute elements in one location, but can also scale across different geographic locations. With its three-tier architecture, Oracle Enterprise Manager Ops Center scales its performance and usability from a single datacenter to distributed datacenters. With proxies deployed closer to the managed systems, performance more easily meets required service levels. Oracle Enterprise Manager Ops Center uses the latest Web technologies and fits into existing datacenter designs, reducing the need for configuration changes.

Efficient Resource Use

Oracle Enterprise Manager Ops Center pools virtual resources to cater to similar applications. Appropriate policies can be applied to virtual resources that are generated on demand and placed in pools of physical machines that meet the application requirements. Once in the pool, resource policies continue to watch over the workload and automatically balance the environment by migrating Oracle VM Server for SPARC guests within the pool to ensure optimal use of physical resources.
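The paper does not describe the balancing policy itself, so the following is only an illustrative greedy heuristic, not the product's algorithm. It assumes each guest has a single numeric load and repeatedly migrates one guest from the busiest host to the least busy host while doing so narrows the gap between them. The host names, guest names, and the `rebalance` helper are hypothetical.

```python
def rebalance(pool, max_moves=10):
    """Greedily migrate guests from the busiest host to the least busy host.

    pool maps host name -> {guest name: load}. Returns the list of
    (guest, source, target) migrations; pool is updated in place.
    """
    moves = []
    for _ in range(max_moves):
        load = {host: sum(guests.values()) for host, guests in pool.items()}
        source = max(load, key=load.get)
        target = min(load, key=load.get)
        gap = load[source] - load[target]
        # Only a guest with a positive load smaller than the gap narrows it.
        candidates = {g: l for g, l in pool[source].items() if 0 < l < gap}
        if not candidates:
            break
        # Prefer the guest whose load is closest to half the gap.
        guest = min(candidates, key=lambda g: abs(candidates[g] - gap / 2))
        pool[target][guest] = pool[source].pop(guest)
        moves.append((guest, source, target))
    return moves


# Illustrative pool: host1 is heavily loaded, host2 is nearly idle.
pool = {
    "host1": {"g1": 40, "g2": 30, "g3": 20},
    "host2": {"g4": 10},
}
moves = rebalance(pool)
```

A real policy would also weigh memory, migration cost, and service levels; the single-load model is kept deliberately simple here.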

Availability

Virtual resources are offered depending on the application needs. Virtualization mobility features such as cold migration for Oracle Solaris Containers and warm migration for Oracle VM Server for SPARC can be used to move virtual resources to other systems to improve utilization with minimal downtime.

Security

Oracle Solaris Container and Oracle VM Server for SPARC technologies ensure isolation between different containers and domains. In addition, role-based mechanisms are implemented to allow multiple users to manage systems with different access controls. Oracle Enterprise Manager Ops Center can patch physical as well as virtual systems that have security vulnerabilities.

SNMP Service

An SNMP agent can be configured and enabled on the service processor, allowing the system to be monitored by third-party tools. The service processor SNMP agent monitors the state of the system hardware and domains, and exports the following information to an SNMP manager:

• System information such as chassis ID, platform type, total number of CPUs, and total memory

• Configuration of the hardware

• DR information, including which domain-configurable units are assigned to which domains

• Domain status

• Power status

• Environmental status

The service processor SNMP agent can supply system information and fault event information using public management information bases (MIBs). SNMP managers, such as a third-party manager application, can use any service processor network interface with the SNMP agent port to communicate with the agent. The SNMP agent supports concurrent access from multiple users through SNMP managers.

The SNMP agent can be configured to use SNMP v1, v2, or v3. The XSCF supports two MIBs for SNMP:

• SP-MIB (XSCF extension MIB). This is used to obtain information on the status and configuration of the platform. If there is a fault, it sends a trap with the basic fault information.

• FM-MIB (Fault Management MIB). This is used only when there is a fault. It sends the fault trap and includes all of the same detailed information as the fault management architecture (FMA) MIB in an Oracle Solaris domain. The information contains the data needed by the service technician when placing a service call. It is also useful if the domain has crashed due to a part failure.
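As one illustration of how a third-party tool might poll the agent, the sketch below builds a command line for the Net-SNMP `snmpwalk` utility. It makes several assumptions: that Net-SNMP is installed on the management host, that the agent is configured for a community-based SNMP version, and that it exposes the standard MIB-2 system subtree. The host name, the community string, and the `build_snmpwalk_argv` helper are hypothetical, and the XSCF-specific OIDs are deliberately not shown.

```python
def build_snmpwalk_argv(host, oid, version="2c", community="public"):
    """Build an argument list for the Net-SNMP `snmpwalk` command."""
    # This sketch covers only community-based versions; SNMP v3 needs
    # user and authentication options that are not modeled here.
    if version not in ("1", "2c"):
        raise ValueError("this sketch covers community-based versions only")
    return ["snmpwalk", "-v", version, "-c", community, host, oid]


# Standard MIB-2 "system" subtree; assumed to be exposed by the agent.
SYSTEM_SUBTREE = "1.3.6.1.2.1.1"

# Hypothetical service processor host name.
argv = build_snmpwalk_argv("xscf.example.com", SYSTEM_SUBTREE)

# To execute against a reachable agent, pass argv to subprocess.run, e.g.:
#   subprocess.run(argv, capture_output=True, text=True, check=True)
```

Building the argument list separately from running it keeps the sketch testable without a live service processor.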

There are two methods of FMA reporting on the XSCF: through SNMP and through the DSCP to the affected domain. To have the XSCF report all platform faults through SNMP using FMA descriptors, enable SNMP on the XSCF.

Conclusion

In today’s highly competitive environment, every IT department operates under the mandate to reduce costs, increase return on investment, and provide a more consistent environment to support compliance initiatives. In addition, IT solutions must be able to adapt quickly to changes in demand and business processes.

The Sun SPARC Enterprise M-Series servers are the most powerful and innovative enterprise-class systems available from Oracle today. With the ability to partition the system into subboard-level domains, isolate applications into containers, and manage resources with fine-grained and dynamic control, the systems are ideally suited for consolidating applications and optimally virtualizing and using resources.

GUI-based tools are included with Oracle’s Sun SPARC Enterprise M-Series servers for administering, monitoring, and managing the hardware, operating system, storage, and applications. These tools streamline and automate many tasks, decreasing complexity and IT operations costs while providing a more consistent environment.

Driving Datacenter Efficiency Through Server and Application Consolidation June 2010 Oracle Corporation World Headquarters 500 Oracle Parkway Redwood Shores, CA 94065 U.S.A. Worldwide Inquiries: Phone: +1.650.506.7000 Fax: +1.650.506.7200 oracle.com

Copyright © 2009, 2010, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. UNIX is a registered trademark licensed through X/Open Company, Ltd. 0110