emc 257758

Upload: bahman-mir

Post on 02-Jun-2018


  • 8/10/2019 Emc 257758

    1/7

    "Backend cleanup process for factory re-installation of VNX OE for File (NAS software for VNX File)"

    ID: emc257758

    Usage: 14

    Date Created: 12/10/2010

    Last Modified: 02/03/2012

    STATUS: Approved

    Audience: Support

    Question: Backend cleanup process for factory re-installation of VNX OE for File (NAS software for VNX File)

    Environment: EMC SW: VNX Operating Environment (OE) for File

    Environment: Product: VNX File/Unified

    Environment: Backend cleanup using nas_raid -s cleanup

    Environment: Factory re-installation using Express Installation DVD image on Control Station

    Problem: Requirements to perform a factory re-installation of the Operating Environment for File (that is, NAS code)

    Problem:

    nas_raid script fails if system is part of a multi-domain system:

    Cannot cleanup domain master. Please move master to another array.

    Fix:

    Backend cleanup for factory re-installation of the File O/S

    Cautions:

    During the cleanup process, the Control LUNs are zeroed out to make a fresh re-installation possible. Note that the Control LUNs and the default Storage Group (~filestorage) are now part of the FLARE private LUN space and are no longer directly accessible from the GUI or NaviCLI.

    After the cleanup process, verify that all Control LUNs are owned by SP A on Chain 0, or the installation process will fail.

    The cleanup script may not remove other Storage Groups, Storage Pools, and the like.

    The backend cleanup script does not remove the default ~filestorage HBAUID records; these must be removed manually.

    VNX FILE/UNIFIED BACKEND CLEANUP PROCEDURE

    1. Deconfigure Proxy ARP. The main task here is to get the SPs back on the 128.221.252 and 128.221.253 networks:

    # /nasmcd/sbin/clariion_mgmt -stop

    Note: If you cannot stop Proxy ARP services or clean up the backend, see emc287103 for possible workarounds. LUNs 0 & 1 must be zeroed out in order to perform a fresh reinstall of File OE.

    2. Verify that the storage processors (SPs) are up and running with the default internal network IP addresses:

    # ping 128.221.252.200
    PING 128.221.252.200 (128.221.252.200) 56(84) bytes of data.
    64 bytes from 128.221.252.200: icmp_seq=1 ttl=128 time=0.535 ms
    # ping 128.221.253.201
    PING 128.221.253.201 (128.221.253.201) 56(84) bytes of data.
    64 bytes from 128.221.253.201: icmp_seq=1 ttl=128 time=0.353 ms


    3. Make sure the /tftpboot directory is available at the root of the system; untar it from /nas/tools if required:

    # cd /
    # tar zxvf /nas/tools/tftpboot.tar.gz

    4. Unset the NAS_DB environment variable and stop NAS Services:

    Note: If running dual Control Stations, shut down CS1. If onsite, unplug the power cable from CS1 and leave it offline.

    # unset NAS_DB
    # /sbin/service nas stop

    5. Run the cleanup script (which may take 15-20 minutes to complete):

    # cd /tftpboot/setup_backend
    # ./nas_raid -n ../bin/navicli -a 128.221.252.200 -b 128.221.253.201 -s cleanup

    Do you want to clean up the system [yes or no]?: yes
    Cleaning Storage Group "~filestorage"
    Removing LUN
    PXE boot slot 2...
    Starting NBS on all control LUN
    Zero LUN 1 with dd.
    Finished with LUN 1.
    Zero LUN 0 with dd.
    Finished with LUN 0.
    Removing diskgroup
    The following storage groups still exist:
    ~filestorage
    Removing spares
    Security domain removed

    Done

    Note: If the nas_raid script fails with 'Cannot cleanup domain master', you will need to remove any other systems from the domain before the script will complete.

    # /tftpboot/bin/navicli -h 128.221.252.200 domain -messner -remove 10.241.216.233 [SP IP of other domain to remove from the current domain]
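The outcome of the cleanup run in step 5 can be checked mechanically if the console output was saved to a file. The following is a minimal sketch, not part of the original procedure: the function name cleanup_status is hypothetical, and it only looks for the two strings quoted in this article ("Cannot cleanup domain master" and a final "Done" line).

```shell
# Classify saved nas_raid cleanup output read from stdin.
# Prints one of: domain-master | done | incomplete
# (hypothetical helper; relies only on the messages quoted in this article)
cleanup_status() {
  awk '
    /Cannot cleanup domain master/ { master = 1 }
    /^[ \t]*Done[ \t]*$/           { done_seen = 1 }
    END {
      if (master)         print "domain-master"
      else if (done_seen) print "done"
      else                print "incomplete"
    }'
}

# Example (the log path is an assumption):
#   cleanup_status < /tmp/nas_raid_cleanup.log
```

A result of domain-master corresponds to the failure described in the note above, where other systems must be removed from the domain first.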

    6. Verify that Control LUNs have been properly zeroed out:

    # /sbin/fdisk -l | grep partition
    Disk /dev/nda doesn't contain a valid partition table
    Disk /dev/ndb doesn't contain a valid partition table
    Disk /dev/ndc doesn't contain a valid partition table
    Disk /dev/ndd doesn't contain a valid partition table
    Disk /dev/ndf doesn't contain a valid partition table

    Note: You may not have NBS access to the backend LUNs from your Blades. If so, you must first PXE boot a blade in order to restore backend LUN access. The /dev/nde partition is not zeroed out.

    Optional method for zeroing LUNs 0 & 1:

    # /nas/sbin/t2pxe -force_pxe ALL -->Force PXE boot of all servers; then, if it reports success, try to zero out LUNs 0 & 1
    # dd if=/dev/zero of=/dev/nda bs=1MB count=134
    # dd if=/dev/zero of=/dev/nde bs=1MB count=134
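The fdisk check in step 6 can be reduced to a list of device names for easier comparison against the expected set. This is a sketch under the assumption that fdisk prints the message exactly as shown in this article; zeroed_devices is a hypothetical helper name.

```shell
# Print the devices that fdisk reports as having no valid partition table.
# Reads "fdisk -l" output from stdin (hypothetical helper).
zeroed_devices() {
  sed -n "s|^[ ]*Disk \(/dev/[a-z0-9]*\) doesn't contain a valid partition table.*|\1|p"
}

# Example; 2>&1 captures the message whichever stream fdisk writes it to:
#   /sbin/fdisk -l 2>&1 | zeroed_devices
```

In the example output of step 6, the list would contain nda, ndb, ndc, ndd, and ndf, with nde absent as the note explains.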

    7. Manually remove other Storage Groups, if necessary:

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagegroup -list

    or

    # /tftpboot/bin/navicli -h 128.221.252.200 storagegroup -list

    Storage Group Name: SG_Celerra_c125
    Storage Group UID: E2:12:0B:D6:F5:FC:DF:11:8F:CA:00:60:16:41:67:7D

    HLU/ALU Pairs:

    HLU Number ALU Number
    ---------- ----------
    0          3
    1          1
    2          0
    3          2

    # /tftpboot/bin/navicli -h 128.221.252.200 storagegroup -destroy -gname SG_Celerra_c125

    Destroy Storage Group SG_Celerra_c125 (y/n)? y
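To see at a glance which Storage Groups are left over for step 7, the storagegroup -list output can be filtered down to names other than the default ~filestorage group. This is a sketch only; extra_storage_groups is a hypothetical name and the parsing assumes the "Storage Group Name:" label shown above.

```shell
# Print every storage group name except the default ~filestorage group.
# Reads "storagegroup -list" output from stdin (hypothetical helper).
extra_storage_groups() {
  awk '/Storage Group Name:/ {
    sub(/^.*Storage Group Name:[ \t]*/, "")   # drop the label
    sub(/[ \t]+$/, "")                        # drop trailing blanks
    if ($0 != "~filestorage") print
  }'
}

# Example:
#   /tftpboot/bin/navicli -h 128.221.252.200 storagegroup -list | extra_storage_groups
```

Each name printed is a candidate for the storagegroup -destroy -gname command shown in this step.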

    8. Manually remove Pool luns from the ~filestorage Storage Group if required:

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 lun -destroy -l 13

    Are you sure you want to perform this operation?(y/n): y
    Cannot unbind LUN because its contained in a Storage Group

    Get List of HLU numbers for ~filestorage SG:

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagegroup -list -gname ~filestorage

    Remove HLU Luns from ~filestorage:

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagegroup -removehlu -gname ~filestorage -hlu 18

    Remove HLU 18 from ~filestorage
    The specified operation will potentially affect a File System Storage configuration. Do you want to continue (y/n)? y

    9. Manually destroy Pool LUNs first, if necessary:

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagegroup -removehlu -gname ~filestorage -hlu 25 -->Remove Pool LUN from SG first


    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 lun -list
    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 lun -destroy -l 0 -->Syntax for removing Pool LUNs once the Storage Group has been destroyed

    Are you sure you want to perform this operation?(y/n): y

    10. Destroy the Storage Pool once the Pool LUNs are removed:

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagepool -list
    Pool Name: Pool 0
    Pool ID: 0
    Raid Type: r_10
    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagepool -destroy -id 0

    Are you sure you want to perform this operation?(y/n): y
    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagepool -list

    11. Manually destroy RAID Group LUNs and RAID Groups, if necessary:

    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 getrg -lunlist -->In this example, there are RAID Group LUNs and a RAID Group to destroy
    RaidGroup ID: 1
    List of luns: 7 8
    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 unbind 7
    Unbinding a LUN will cause all data stored on that LUN to be lost.

    Unbind LUN 7 (y/n)? y

    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 unbind 8

    Unbinding a LUN will cause all data stored on that LUN to be lost.

    Unbind LUN 8 (y/n)? y

    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 removerg 1

    MetaLUNs:

    # /tftpboot/bin/navicli -h 128.221.252.200 metalun -list -->There may be metaLUNs [e.g., 8184, 8185, etc.] if layered apps were in use
    # /tftpboot/bin/navicli -h 128.221.252.200 metalun -destroy -metalun 12 -->Select metalun from list, within the 8184 lun
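The order of operations in step 11 (unbind every LUN in a RAID Group, then remove the group) can be derived from the getrg -lunlist output. The sketch below only prints a plan and runs nothing; rg_teardown_plan is a hypothetical name, and the parsing assumes the "RaidGroup ID:" and "List of luns:" labels appear as in the example above.

```shell
# Turn "getrg -lunlist" output (stdin) into an ordered teardown plan:
# one "unbind N" line per LUN, followed by "removerg ID" for its group.
# Hypothetical helper; it prints the plan and executes nothing.
rg_teardown_plan() {
  awk '
    /RaidGroup ID:/ { rg = $NF }
    /List of luns:/ {
      n = split($0, parts, ":")
      m = split(parts[n], luns, " ")
      for (i = 1; i <= m; i++)
        if (luns[i] ~ /^[0-9]+$/) print "unbind " luns[i]
      if (rg != "") print "removerg " rg
    }'
}
```

For the example output in this step, the plan would list the two unbind operations for LUNs 7 and 8 before the removal of RAID Group 1, matching the command sequence shown.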

    12. Verify whether any Control LUNs are trespassed from SP A to SP B:

    # /nasmcd/sbin/t2tty -c 2 "camshowconfig"
    CAM Devices on scsi-0:
    TID 00: 0:d0+ 1:d1+ 2:d2+ 3:d3+ 4:d4- 5:d5- -->d4 & d5 are trespassed to SP B
    CAM Devices on scsi-16:
    TID 00: 0:d6- 1:d7- 2:d8- 3:d9- 4:d10+ 5:d11+
    1291584475: ADMIN: 6: Command succeeded: camshowconfig

    Note: Through the use of - and +, the above output shows that Control LUNs d4 and d5 are NOT on Chain 0 (SP A). These LUNs must be trespassed back to Chain 0 on all servers before a new install can succeed.

    13. Trespass all Control LUNs back to SP A Chain 0 as required, using the following commands:

    # /tftpboot/bin/t2tty -c 2 camtrespass c0t0l4
    # /tftpboot/bin/t2tty -c 2 camtrespass c0t0l5

    # /tftpboot/bin/t2tty -c 3 camtrespass c0t0l4
    # /tftpboot/bin/t2tty -c 3 camtrespass c0t0l5

    Note: In the above example, LUNs 4 & 5 were trespassed back to Chain 0 SP A on each of the two blades present on the system.
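Steps 12 and 13 can be tied together by reading the camshowconfig output and printing the matching camtrespass commands. This is a sketch under stated assumptions: trespass_commands is a hypothetical name, a trailing "-" under "CAM Devices on scsi-0" is taken to mean the LUN is not on Chain 0 (as the note in step 12 describes), and the target is assumed to be TID 00 (c0t0) as in the example. The commands are printed for review, not executed.

```shell
# Print one camtrespass command per Control LUN that is marked "-" on
# scsi-0 (Chain 0). $1 is the blade slot; camshowconfig output on stdin.
# Hypothetical helper; assumes target ID 00 (c0t0) as in the example.
trespass_commands() {
  awk -v slot="$1" '
    /CAM Devices on/ { chain0 = ($0 ~ /scsi-0:/); next }
    chain0 && /TID/ {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9]+:d[0-9]+-$/) {
          split($i, p, ":")
          print "/tftpboot/bin/t2tty -c " slot " camtrespass c0t0l" p[1]
        }
    }'
}

# Example for the blade in slot 2:
#   /nasmcd/sbin/t2tty -c 2 "camshowconfig" | trespass_commands 2
```

Repeat for each blade slot, then rerun camshowconfig to confirm that every entry on scsi-0 shows "+".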

    14. Verify existing Data Mover WWN HBAUID records, remove the HBAUID records, and verify:

    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagegroup -list -gname ~filestorage |head -15

    Storage Group Name: ~filestorage
    Storage Group UID: 60:06:01:60:00:00:00:00:00:00:00:00:00:00:00:04

    HBA/SP Pairs:

    HBA UID                                          SP Name  SPPort
    -------                                          -------  ------
    50:06:01:60:C6:E0:14:97:50:06:01:69:46:E0:14:97  SP B     2
    50:06:01:60:C6:E0:14:97:50:06:01:61:46:E0:14:97  SP B     3
    50:06:01:60:C6:E0:14:97:50:06:01:68:46:E0:14:97  SP A     2
    50:06:01:60:C6:E0:14:97:50:06:01:60:46:E0:14:97  SP A     3

    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 port -removehba -o -hbauid 50:06:01:60:C6:E0:14:97:50:06:01:69:46:E0:14:97
    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 port -removehba -o -hbauid 50:06:01:60:C6:E0:14:97:50:06:01:61:46:E0:14:97
    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 port -removehba -o -hbauid 50:06:01:60:C6:E0:14:97:50:06:01:68:46:E0:14:97
    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 port -removehba -o -hbauid 50:06:01:60:C6:E0:14:97:50:06:01:60:46:E0:14:97
    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 storagegroup -list -gname ~filestorage

    Storage Group Name: ~filestorage
    Storage Group UID: 60:06:01:60:00:00:00:00:00:00:00:00:00:00:00:04
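Because step 14 repeats the same command once per HBA UID, the UIDs can be pulled out of the storagegroup -list output and the removal commands generated rather than retyped. This is a sketch only: hba_uids and hba_remove_commands are hypothetical names, and the parsing assumes each HBA row starts with the colon-separated UID followed by "SP", as in the example above. The commands are printed for review, not run.

```shell
# Print the HBA UIDs from "storagegroup -list -gname ~filestorage" output
# (stdin). The Storage Group UID line is skipped because its first field
# is the word "Storage", not a UID. Hypothetical helper.
hba_uids() {
  awk '$1 ~ /^([0-9A-Fa-f][0-9A-Fa-f]:)+[0-9A-Fa-f][0-9A-Fa-f]$/ && $2 == "SP" { print $1 }'
}

# Print (not run) one removal command per HBA UID, using the SP A
# address and credentials shown in this article.
hba_remove_commands() {
  hba_uids | while read -r uid; do
    echo "/tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 port -removehba -o -hbauid $uid"
  done
}
```

After the removals, the same listing should show no rows under "HBA/SP Pairs", which is what the final output of this step illustrates.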

    15. Verify whether array security was destroyed. If this is not a shared system, manually destroy security.

    # /tftpboot/bin/naviseccli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 domain -list

    Security is not initialized. Security must be initialized before any domain operations can be performed in this system. Create a global administrator to initialize security.

    Note: The above return indicates that no security domain remains (it has been destroyed); no further action is required.

    # /tftpboot/bin/navicli -h 128.221.252.200 domain -list
    Node: c250
    IP Address: 128.221.253.201
    Name: spb
    Port: 80
    Secure Port: 443
    IP Address: 128.221.252.200 (Master)
    Name: spa
    Port: 80
    Secure Port: 443
    IP Address: 10.241.216.235
    Name: c250
    Port: 80
    Secure Port: 443

    Note: The above return indicates that a security domain does exist and must be destroyed.

    # /tftpboot/bin/navicli -h 128.221.252.200 -user sysadmin -password sysadmin -scope 0 domain -messner -destroy

    WARNING: You are about to destroy the local directories on the following systems:
    128.221.252.200
    Please note that this operation will not update the master directory database.
    Proceed?(y/n) y

    # /tftpboot/bin/navicli -h 128.221.253.201 -user sysadmin -password sysadmin -scope 0 domain -messner -destroy

    WARNING: You are about to destroy the local directories on the following systems:
    128.221.253.201
    Please note that this operation will not update the master directory database.
    Proceed?(y/n) y
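The two possible results of domain -list in step 15 can be told apart with a small check on the captured output. This is a sketch, assuming the output looks like the two examples in this article; domain_state is a hypothetical name, and anything that matches neither example is reported as unknown rather than guessed.

```shell
# Classify "domain -list" output read from stdin.
# Prints: absent  (security not initialized, nothing to destroy)
#         present (domain members listed, security must be destroyed)
#         unknown (neither pattern seen)
# Hypothetical helper based only on the sample outputs in this article.
domain_state() {
  awk '
    /Security is not initialized/ { none = 1 }
    /IP Address:/                 { node = 1 }
    END {
      if (none)      print "absent"
      else if (node) print "present"
      else           print "unknown"
    }'
}
```

A result of present means the domain -messner -destroy commands in this step still need to be run against both SPs.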

    16. Using the proper bootable Express Install media, reboot the Linux system and perform a "boot:install". For Dual Control Station environments, make sure that CS1 is powered off during the factory install of CS0. See the "Notes" section below for a representative example of the questions and answers given for a typical Express Installation. Make sure to toggle the option from Yes to No when the screen for setting up the Control Station LAN IP address appears, since you DO NOT want to set the External IP address yet (you will use the VNX Installation Assistant after the installation is completed to set the Control Station name and IP address and to initialize the File/Unified system). Reboot CS0 after the File OE installation completes so as to generate the "Waiting for VIA..." initialization message.

    17. Once CS0 has completed software installation and reboot, perform the factory installation of CS1 using either the DVD media or CD2 media, and keep CS0 powered up during the CS1 installation. At the end of the successful File OE installation on CS1, reboot it and, via the serial console, ensure that it displays the "Waiting for VIA..." initialization message. At this point, the dual CS environment is ready to be initialized using the VNX Installation Assistant.

    18. Before running the VIA, however, perform the following actions, depending on whether the system is a File-only or Unified configuration:

    For File-only VNX Systems:
    a) A File-only installation should not have the -UnisphereBlock enabler installed; use navicli ndu -list to check.
    b) A File-only installation should have the -UnisphereFile enabler installed; use navicli ndu -list to check.
    c) Run the VNX Installation Assistant to complete the system initialization.

    For Unified VNX Systems:
    a) A Unified installation should have both the -UnisphereBlock and -UnisphereFile enablers installed; use navicli ndu -list to check.
    b) Set the Unified flag on the Control Station:

    # /nas/sbin/nas_hw_upgrade -option -enable -clariionfc

    c) Run the VNX Installation Assistant to complete the system initialization.
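The enabler rules in step 18 amount to a small decision table, sketched below. The function name config_from_enablers is hypothetical, and the check only searches the ndu -list output for the enabler names -UnisphereBlock and -UnisphereFile; it makes no assumption about the rest of the output format.

```shell
# Decide which configuration the installed enablers correspond to.
# Reads "navicli ndu -list" output from stdin and prints:
#   unified    (-UnisphereBlock and -UnisphereFile both present)
#   file-only  (-UnisphereFile present, -UnisphereBlock absent)
#   unexpected (-UnisphereFile missing)
# Hypothetical helper; it only searches for the enabler names.
config_from_enablers() {
  awk '
    /-UnisphereBlock/ { block = 1 }
    /-UnisphereFile/  { file = 1 }
    END {
      if (file && block) print "unified"
      else if (file)     print "file-only"
      else               print "unexpected"
    }'
}

# Example:
#   /tftpboot/bin/navicli -h 128.221.252.200 ndu -list | config_from_enablers
```

If the result does not match the intended configuration, correct the enablers before running the VNX Installation Assistant.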

    Notes:

    Typical Express Installation questions, inputs, and/or answers:

    1. Express Installation using DVD or 2-disc CD set:

    boot:install
    -----------------
    Is this a Secondary Control Station (y/n/a)?n
    -----------------
    Is this a Control Station Fresh Install?yes
    ----------------
    Is this a Secondary Control Station? [yes or no]: no
    ----------------
    Accept the defaults for the "Primary Internal Network Setup", "IPMI Network Setup", and "Backup Internal Network Setup" screens
    ----------------
    DO NOT SET UP THE EXTERNAL LAN NETWORKING AT THIS TIME (the Control Station external LAN is set up using the VIA initialization wizard after the File OE reinstallation is completed)
    For the Network Configuration screen, "Do you want to configure LAN (not dialup) networking for your installed system?", Tab to "No" and press Enter.
    ---------------
    Detecting movers in cabinet: 2
    Is this the expected number of movers in the cabinet? [yes or no]: yes
    ---------------
    Pick a NAS Administrator username
    Username [default: nasadmin] : nasadmin
    New UNIX password: nasadmin
    Retype new UNIX password: nasadmin
    ----------------
    Do you wish to enable UNICODE? [yes or no]: yes

    2. At the end of the File OE installation, log into the Control Station as nasadmin, su to root, then reboot the Control Station. When the following message is displayed at the Control Station serial console, initialize the Unified system using the VIA:

    # reboot
    ---------------------------
    Waiting for VNX Installation Assistant to continue.......