Upload: samantha-morris

Post on 17-Dec-2015

SGI Confidential 4 - 1

Shelf Hardware

SGI Confidential 4 - 2

Shelf Front View

Canister Status LEDs

Canister 0 Canister 7

Canister Position Number

Canister Release Button

Each shelf contains 8 canisters, and each canister holds up to 14 SATA drives.

SGI Confidential 4 - 3

Shelf Rear View

On/Off Button

48vdc Plug

Shelf Controller 0

Shelf Controller 1

Fan 0/1

Shelf Status LEDs

Fan 2/3

Fan 4/5

Fan 6/7

Protective Fan Cover

DC Tray

ME Tray

E Tray

SGI Confidential 4 - 4

Shelf Architectural Overview

[Block diagram. Labeled components: canisters SDC0 through SDC7; ME Tray; shelf controllers SC0 and SC1; HBA; x8 SAS; x16 PCIe; Cntl; GigE; FC/SCSI; SAS/Cntl/Mngmnt; DC Tray with DC Riser, HPDC0, HPDC1, COOLHEAT, and LED Status; 48vdc; 3.3, 5, 12v; four fans; E Tray.]

SGI Confidential 4 - 5

Canister Assembly

The Canister subassembly consists of a circuit board that makes a blind-mate connection with the mid-plane. These connections provide power and signals to the Canister module.

The circuit board also has 14 serial ATA power and signal connectors that interface with the 14 hard drives. The Canister subassembly houses the drives in the SGI COPAN 400 Series, and 14 drives are installed in each Canister module.

There are two status LEDs on the front of the canister.

The canister assembly by itself is the field replaceable unit.
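The drive counts on the slides above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the "SDC<n> HDD<nn>" label format is borrowed from the architectural figures rather than from any COPAN tool.

```python
# Figures taken from the slides: 8 canisters per shelf, up to 14 drives each.
CANISTERS_PER_SHELF = 8
DRIVES_PER_CANISTER = 14


def max_drives_per_shelf(canisters=CANISTERS_PER_SHELF,
                         drives=DRIVES_PER_CANISTER):
    """Return the maximum number of drives a fully populated shelf holds."""
    return canisters * drives


def drive_labels(canister):
    """Return figure-style labels for every drive slot in one canister.

    The label format is an assumption based on the slide figures
    (SDC0..SDC7 for canisters, HDD00..HDD13 for drive slots).
    """
    if not 0 <= canister < CANISTERS_PER_SHELF:
        raise ValueError("canister must be in the range 0-7")
    return [f"SDC{canister} HDD{slot:02d}"
            for slot in range(DRIVES_PER_CANISTER)]


if __name__ == "__main__":
    print(max_drives_per_shelf())   # 8 * 14
    print(drive_labels(7)[-1])      # last slot of the last canister
```

A fully populated shelf therefore holds 8 x 14 = 112 drives.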

SGI Confidential 4 - 6

Canister VTL Architectural Overview (1 of 2)

[Figure: canister drive layout. Labeled canisters: SDC0, SDC1, SDC2, SDC3, and SDC7, each with drives HDD00 through HDD13.]

RAID SET 1: SDC0 HDD0, SDC1 HDD0, SDC2 HDD0, SDC3 HDD0; LUN 1, 27, 53

RAID SET 3: SDC0 HDD1, SDC1 HDD1, SDC2 HDD1, SDC3 HDD1; LUN 3, 29, 55

SGI Confidential 4 - 7

Canister VTL Architectural Overview (2 of 2)

RAIDSET   Canister:Drive            500GB   1TB     2TB     EVENT LOG HDD#
---------------------------------------------------------------------------------------------
RAID 1    0:0   1:0   2:0   3:0     lun1    lun27   lun53   HDD0    HDD1    HDD2    HDD3
RAID 2    4:0   5:0   6:0   7:0     lun2    lun28   lun54   HDD4    HDD5    HDD6    HDD7
RAID 3    0:1   1:1   2:1   3:1     lun3    lun29   lun55   HDD8    HDD9    HDD10   HDD11
RAID 4    4:1   5:1   6:1   7:1     lun4    lun30   lun56   HDD12   HDD13   HDD14   HDD15
RAID 5    0:2   1:2   2:2   3:2     lun5    lun31   lun57   HDD16   HDD17   HDD18   HDD19
RAID 6    4:2   5:2   6:2   7:2     lun6    lun32   lun58   HDD20   HDD21   HDD22   HDD23
RAID 7    0:3   1:3   2:3   3:3     lun7    lun33   lun59   HDD24   HDD25   HDD26   HDD27
RAID 8    4:3   5:3   6:3   7:3     lun8    lun34   lun60   HDD28   HDD29   HDD30   HDD31
RAID 9    0:4   1:4   2:4   3:4     lun9    lun35   lun61   HDD32   HDD33   HDD34   HDD35
RAID 10   4:4   5:4   6:4   7:4     lun10   lun36   lun62   HDD36   HDD37   HDD38   HDD39
RAID 11   0:5   1:5   2:5   3:5     lun11   lun37   lun63   HDD40   HDD41   HDD42   HDD43
RAID 12   4:5   5:5   6:5   7:5     lun12   lun38   lun64   HDD44   HDD45   HDD46   HDD47
RAID 13   0:6   1:6   2:6   3:6     lun13   lun39   lun65   HDD48   HDD49   HDD50   HDD51
RAID 14   4:6   5:6   6:6   7:6     lun14   lun40   lun66   HDD52   HDD53   HDD54   HDD55
RAID 15   0:7   1:7   2:7   3:7     lun15   lun41   lun67   HDD56   HDD57   HDD58   HDD59
RAID 16   4:7   5:7   6:7   7:7     lun16   lun42   lun68   HDD60   HDD61   HDD62   HDD63
RAID 17   0:8   1:8   2:8   3:8     lun17   lun43   lun69   HDD64   HDD65   HDD66   HDD67
RAID 18   4:8   5:8   6:8   7:8     lun18   lun44   lun70   HDD68   HDD69   HDD70   HDD71
RAID 19   0:9   1:9   2:9   3:9     lun19   lun45   lun71   HDD72   HDD73   HDD74   HDD75
RAID 20   4:9   5:9   6:9   7:9     lun20   lun46   lun72   HDD76   HDD77   HDD78   HDD79
RAID 21   0:10  1:10  2:10  3:10    lun21   lun47   lun73   HDD80   HDD81   HDD82   HDD83
RAID 22   4:10  5:10  6:10  7:10    lun22   lun48   lun74   HDD84   HDD85   HDD86   HDD87
RAID 23   0:11  1:11  2:11  3:11    lun23   lun49   lun75   HDD88   HDD89   HDD90   HDD91
RAID 24   4:11  5:11  6:11  7:11    lun24   lun50   lun76   HDD92   HDD93   HDD94   HDD95
RAID 25   0:12  1:12  2:12  3:12    lun25   lun51   lun77   HDD96   HDD97   HDD98   HDD99
RAID 26   4:12  5:12  6:12  7:12    lun26   lun52   lun78   HDD100  HDD101  HDD102  HDD103
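The RAID set table follows a regular pattern, so its rows can be reproduced from the RAID set number alone. The following is a minimal sketch of that pattern, not a COPAN utility: the names `raid_set` and `locate_event_log_hdd` are hypothetical, and the reverse lookup assumes that the four event-log HDD numbers in a row correspond, left to right, to the four canister:drive entries in the same row.

```python
from typing import List, NamedTuple, Tuple

RAID_SETS = 26        # rows in the table (RAID 1 .. RAID 26)
MEMBERS_PER_SET = 4   # canister:drive entries per row
LUN_STRIDE = 26       # lun1 -> lun27 -> lun53 in the table


class RaidSet(NamedTuple):
    number: int
    members: List[Tuple[int, int]]   # (canister, drive) pairs
    luns: List[int]                  # the three LUN columns, left to right
    event_log_hdds: List[int]        # the EVENT LOG HDD# column


def raid_set(n: int) -> RaidSet:
    """Rebuild one table row from the RAID set number."""
    if not 1 <= n <= RAID_SETS:
        raise ValueError("RAID set number must be in the range 1-26")
    drive = (n - 1) // 2                    # two RAID sets share a drive slot
    first_canister = 0 if n % 2 else 4      # odd sets: 0-3, even sets: 4-7
    members = [(first_canister + i, drive) for i in range(MEMBERS_PER_SET)]
    luns = [n + k * LUN_STRIDE for k in range(3)]
    first_hdd = MEMBERS_PER_SET * (n - 1)
    hdds = [first_hdd + i for i in range(MEMBERS_PER_SET)]
    return RaidSet(n, members, luns, hdds)


def locate_event_log_hdd(hdd: int) -> Tuple[int, int]:
    """Map an event-log HDD number to (canister, drive).

    Assumption: HDD numbers and canister:drive entries line up
    position by position within a row.
    """
    if not 0 <= hdd < RAID_SETS * MEMBERS_PER_SET:
        raise ValueError("event-log HDD number must be in the range 0-103")
    row = raid_set(hdd // MEMBERS_PER_SET + 1)
    return row.members[hdd % MEMBERS_PER_SET]


if __name__ == "__main__":
    print(raid_set(21))
    print(locate_event_log_hdd(25))
```

Which of the three LUN columns are in use depends on the drive capacity column heading in the table; the sketch simply returns all three.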

SGI Confidential 4 - 8

Shelf Controller Overview (1 of 4)

The Shelf Controller (SC) is a series of interconnecting boards that provide an interface between the Host Server or Application Server and the drives. The SC comprises:

• Shelf Controller Mother Board (SCMB): Houses the point-of-load DC/DC voltage converters, microcontroller, SAS controller chip, PCI-E switch, mid-plane connections, and the I/O connectors for the two daughter cards.
• Shelf Controller Compute Engine (SCCE): COM Express™ compliant daughter card.
• Shelf Controller I/O Personality Module (SCIO): PCI-E daughter card that contains the controller circuitry to implement the Shelf Enclosure external interconnect.
• Battery Back-up Unit Module (BBU): Maintains power to the SC and cache during power outages. The BBU is a separate FRU from the shelf controller.

SGI Confidential 4 - 9

Shelf Controller Front View (2 of 4)

SCMB Status LED

Ethernet Port 10/100 BaseT

Service Port (115200bps)

Dual Port Fibre Channel HBA (4 Gb per port, full duplex)

USB Ports

OK to Service LED

COPAN 400 Shelf Controller
• Two SCs are required for dual-path or failover configurations
• Runs the FreeBSD Linux Kernel and contains a firmware image
• Service port provides access to the SCMB user menu (to be used under guidance of COPAN support)
• Ethernet port allows connection to the PSM Service GUI and SCMB user menu
• The SC is the field replaceable unit

SGI Confidential 4 - 10

Shelf Controller Top View (3 of 4)

SCMB PCB

SCCE PCB

SCIO PCB

BBU Connector

BBU

SGI Confidential 4 - 11

Shelf Controller Bottom View (4 of 4)

SCMB Reset Button

This reset button is not connected

SCMB PCB

SCMB Status LED

SGI Confidential 4 - 12

Shelf Controller Serial Cable

SGI Confidential 4 - 13

DC-Tray (1 of 2)

DC Riser Board

LED Status Board

COOLHEAT Board (bottom)

HPDC Board 0 (top)

HPDC Board 1 (middle)

Note: The COOLHEAT and HPDC Boards contain a firmware image.

The DC-Tray is the field replaceable unit that comprises the LED Status Board, the DC Riser Board, the COOLHEAT Board, and two Hi Power DC (HPDC) Boards.

SGI Confidential 4 - 14

DC Tray (2 of 2)

LED Board Indicators

• SE Status LED
• DC0 Status LED
• DC1 Status LED
• Fan 0/1 Status LED
• Fan 2/3 Status LED
• Fan 4/5 Status LED
• Fan 6/7 Status LED
• Fan 0/1 OK to Remove LED
• Fan 2/3 OK to Remove LED
• Fan 4/5 OK to Remove LED
• Fan 6/7 OK to Remove LED

SGI Confidential 4 - 15

Disk Drive

COPAN 400 Disk Drives
• Supported drives are 3.5” SATA II 1TB or 2TB
• Disks are not hot swappable
• Disks are the field replaceable unit
• Each drive contains a user-flashable firmware image

SGI Confidential 4 - 16

ME Tray

The ME Tray routes signals and power to and from all the internal boards except the DC power card. The ME Tray takes power from the DC-DC boards and supplies power to the Canister modules and the shelf controllers.

The midplane is a field replaceable unit.

SGI Confidential 4 - 17

Fan Assembly

COPAN 400 Fan Pack
• Pack contains 2 fans
• Fans are not hot swappable
• Fans are the field replaceable unit
• Fan swapping occurs every 6 hours after power on

SGI Confidential 4 - 18

Shelf Maintenance Procedures (1 of 2)

All shelf FRUs require that the shelf be stopped before any components are replaced. Be sure to record firmware versions for those components that contain an image. Follow the power off procedure described earlier in class.

Once the shelf is stopped from the Service GUI, all FRUs except the fans require that the shelf then be powered off. Following a power off, remove the 48vdc plug.

+ For drive replacement, ensure jumpers are in the correct locations. After the shelf is placed back online, the new drive will be marked as Foreign. Set the drive to Spare and the reconstruct will start automatically. If the rebuild is interrupted, it restarts from the beginning. Drive replacement is necessary when there are more than three failed drives in a shelf.

+ When replacing the Shelf Controller, make a note of the part number and the serial number on both the defective and the new shelf controller. These should be on white stickers on the front panel. If there aren’t stickers on the front panel, or if they are unreadable, there should be similar stickers on the underside of the printed circuit board itself. The serial number will resemble these examples: HPS00000065 or SPS54400065

Fill out the failed part form. Include the serial numbers from the suspect and the replacement controllers. Loosen the thumbscrews that hold the battery cover plate to the shelf controller and remove it. Use care when installing the new SC.

SGI Confidential 4 - 19

Shelf Maintenance Procedures (2 of 2)

Following FRU replacement, follow the power on procedure discussed earlier in class. In some cases you will have to update the FRU ID (UUID) from the Service GUI if the shelf does not start. Once the shelf has been started:

+ verify LED indicators for proper operation
+ examine logs for normal start up
+ flash firmware as necessary

As always:

+ label all cables, if not already labeled, so you know where to reconnect them
+ inspect hardware for bent pins and so on
+ when removing more than one drive, label each drive with its canister and canister location so you put it back in the correct location

Best Practice: Only perform one service action at a time.

SGI Confidential 4 - 20

Module Review Questions

This completes the Copan Shelf module. Answer the review questions before proceeding.

01. List the shelf FRUs. Which are hot pluggable?
02. Where are fans 2/3 located? Where is canister 7 located?
03. In which position is SC0 located (left or right)?
04. The HPDC and COOLHEAT PCBs are part of which FRU?
05. Which components contain a firmware image?
06. The SCMB, SCIO, and SCCE are part of which FRU?
07. List the characteristics of the fan packs.
08. Under which conditions are the RAID sets divided into LUNs not larger than 2TB?
09. Given a VTL shelf w/2TB disks, where is HDD25 located?
10. Given a VTL shelf w/2TB disks, RAID set 21 contains which LUNs?
11. Given a MAID shelf w/2TB disks, RAID set 10 contains which LUNs?
12. Which locations contain the hot spares?
13. How many disks can be failed in a shelf before you must perform a repair action?
14. RAID set 27 is composed of which disks? What are these disks called?
15. Describe the repair action for a disk.
16. Following any repair action, what must you do in the PSM GUI if the shelf fails to start?

SGI Confidential 4 - 21

Lab Project

Using applicable service documentation, you will be given 60 minutes to:

• Perform shelf FRU replacement procedures
• ssh into the shelf to observe the boot process
• Perform a shelf locate from the PSM GUI
• View component firmware versions
• Upgrade shelf component firmware (if this wasn’t done earlier)
• Create and test a serial cable
• Observe LEDs

SGI Confidential 4 - 22