lan free


Upload: kiran-kumar-peteru

Post on 18-Dec-2015


TRANSCRIPT

Library Manager : Responsible for media mounts/dismounts; maintains the inventory of the library; performs library audits
Library Client : Owns the volumes and requests mounts from the Library Manager (LM)
Storage Agent : A reduced version of the TSM server, used for LAN-free transfer of data between the client and the TSM storage pools

Connectivity :
- Library device : Must be connected to the Library Manager
- Drive device : Must be connected to both the Library Manager and the Library Client

Lanfree Backup Components
- Library [ Robotic Arm and Storage Slots ] : Def Library
- Drives : Def Devclass , Def Drive
- Media : Holds the data
- HBA :
- Device Driver :
- Switches :
- Zoning :
- BA/TDP Client : Client to do backups
- Storage Agent :
- Library Manager :
- Library Client :

TSM Server Configurations

1. Server to server communication
- Server communication address and testing connectivity
- Servers defined to each other
- SAN media is used for data movement
- TCP/IP is used as the communication protocol
- The library must be shared between the Library Manager and the Library Clients

2. Library definitions
- Library definitions at the Library Manager
- Library definitions at the Library Client
- Paths to the library [ Library Manager only ]
- Paths to the drives [ for the Library Manager + all Library Clients that will use the drives ]
- Path definitions from the Library Manager to its drives should always be online
- Path definitions from a Library Client to the drives may be offline because of a connectivity issue

2.1 Identifying offline paths marked by the system
- Console: libraryManager:run rjpst

3. Validating the Lanfree configuration from the TSM Server
Validate Lanfree node_name Storage_Agent
- Checks every storage pool allocated for backups to the client, taken from the Mgmtclass destination parameter
- Checks for each stgpool whether the library is shared
- If the library is shared, checks whether paths from the STA to the drives are defined
- If paths are defined, checks whether the paths are online
- Also checks whether the STA is pingable from the TSM Server

4. Client Configurations
- Datareadpath : any|lanfree
- Datawritepath : any|lanfree
- Maxnummp : maximum number of drives that can be used by a node

Client Configurations

1. Drive Availability

Drive Definitions

WINDOWS
- Windows Device Manager [ Windows only ]
- You should be able to see the ult3580 drives or physical drives
- Use the SANsurfer or HBAnyware utility to get the WWPN details
- TSMinstallDir\Storage Agent\tsmdlst /detail [ Windows only ]
- You should be able to see the drive serial number and WWPN; if not, there is a connectivity issue or a driver issue
- Drive test
- TSMinstalldir\device\mttest --> set special device file --> 37 ---> should not give any error

HP-UX
- ioscan -fnkCtape | grep -iE 'IBM|^Class' [ HP-UX ]
- For every ULT3580 tape device, test the drive for operability
- tapeutil [ open device: 1 - /dev/rmt/rmtx - open in read-only mode: 2 - query serial number: 3 ]
- If you are able to see the serial number, the drive is operational
- ioscan -fnkCunknown - sometimes the drives end up in the unknown device category; this can be a connectivity issue or a device driver issue
- ioscan -fnkCfc - used to identify which HBA is being used for the drives
- fcmsutil - the argument is the driver device file taken from the ioscan output [ FC status ]; here the driver is /dev/td1

LINUX
- cat /proc/scsi/IBMtape [ shows the serial number; this uses the device driver to read the device and get the serial number ]
- ls -lrt /dev/IBMtape* [ shows device special names ]
- cat /proc/scsi/qla*/[1-9] [ lists FC adapter ports ]
- /opt/hp/hp_fibreutils/hp_rescan -a [ all devices on qlogic , -l on a specific hba ]
- z [ removes the added LUNs and then adds them again; do not use unless required ]

SUN SOLARIS
- ls -alrt /dev/rmt | grep IBM [ shows device special names ]
- /opt/IBMtape/tapelist -l [ gives you the serial number ]
- /opt/IBMtape/tapeutil [ drive test ]
- tapeutil [ open device - /dev/rmtx - open in read-only mode: 2 - query serial number ]
- If you are able to see the serial number, the drive is operational
- ls -l /dev/fc/* [ lists adapters and ports ]
- ls -l /dev/fc/fcp*
- To check WWN specifics: `luxadm -e dump_map /devices/pci@8,600000/SUNW,qlc@1/fp@0,0:devctl`

AIX
- lsdev -Cc tape
- lscfg -vl deviceName [ serial number ]
- lsdev -Cc adapter [ lists all fc adapters ]
- lscfg -vl fc*|rmt*
- grep -p [ grep results display the whole paragraph ]
- IBMtapeutil -f /dev/IBMtape2 inquiry 80

2. Device Driver
- If the device driver software is not running, the device will not work at all

WINDOWS
- After installation the driver starts automatically as part of the OS drivers

HP-UX
- /usr/sbin/swlist | grep atdd [ you should be able to see the atdd device driver ]
- The kernel loads atdd at the end of the boot process, so you can claim/unclaim devices as long as the kernel has loaded atdd

Linux
- lintape [ linux 2.6 and above ]
IBMtape has been replaced by lin_tape, which can be found here

lin_tape is an open source driver, but it is essentially the same driver, as it shares most of its code with IBMtape. Even the kernel module it installs is still called IBMtape.ko.

Checking driver status
- /usr/bin/lin_taped status [ for lin_taped ]
- /usr/bin/IBMtaped status [ for 2.4 and lower ]
- /var/log/lin_tape.errorlog [ error log ]

Sun Solaris
- IBM Tape Device Driver, loaded as part of system initialization

AIX
- IBM Tape Device Driver, loaded as part of system initialization

3. TSM Configurations
- dsm.sys lanfree options
- enablelanfree : if commented out or set to no, lanfree is not enabled
- lanfreetcpserveraddress : hostname/ip of the storage agent
- lanfreetcpport : port of the storage agent
- lanfreecommmethod : tcpip/sharedmem/namedpipe

4. Storage Agent
- Usually installed on the client that needs to do a Lanfree backup
- Installed at
  Unix - /opt/tivoli/tsm/Storage*/bin
  AIX - /usr/tivoli/tsm/Storage*/bin
  Win - C:\progra*\tivoli\tsm\Storage*
- dsmsta.opt [ devconfig ]
- devconfig.out
- setstorageserver myname=abc mypa=secret myhla=hostname servername=node_reg_instance serverpa=secret hla=ip lla=port
- commmethod
  sharedmem -- shmp
  namedpipe -- pipename [ Windows only ]
  tcpip -- tcpport

5. Starting and Stopping the Storage Agent
- Linux :
  /opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc stop
  /opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc start
  If the above does not work, kill the process, remove the /opt/tivoli/tsm/StorageAgent/bin/dsmserv.lock file, and start it using the above utility.
  Verify that the Storage Agent stopped/started [ wait about 5 seconds to allow for the stop/start ]
  ps -ef | grep dsmsta
- HP-UX/AIX
  - Kill the process to stop it
  - Remove the /opt/tivoli/tsm/StorageAgent/bin/dsmserv.lock file
  - Start the process with ./dsmsta > /dev/null &
  Verify that the Storage Agent stopped/started [ wait about 5 seconds to allow for the stop/start ]
  ps -ef | grep dsmsta
- Windows
  - Go to the Services panel and stop the Storage Agent service
  - Go to the Services panel and start the Storage Agent service
- Sun Solaris
  /opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc stop
  /opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc start
  If the above does not work, kill the process, remove the /opt/tivoli/tsm/StorageAgent/bin/dsmserv.lock file, and start it using the above utility.
  Verify that the Storage Agent stopped/started [ wait about 5 seconds to allow for the stop/start ]
  ps -ef | grep dsmsta

6. Testing Connectivity Between the Client Node and the Storage Agent
- open dsmc -se=stanza
- q sess : you should see a storage agent session

Oracle clients
- open dsmc -se=oraclestanza
- The password will be the hostname if the hostname is greater than 8 chars long; otherwise the password will be [hostname01... or hostname12..] to make it 8 chars long
- q sess : you should see a storage agent session

7. RMT Mismatch
- The device special name of a drive changes
- Every drive is uniquely identified by its serial number; a drive may have many device special names but only one serial number

7.1 Identifying the path definitions on the Library Manager
lm:q path storageagent
- You will get paths for physical drives if they are defined
- You will get paths for VTL drives if they are defined

RMAN Backups :
- Oracle PPT

Common Oracle backup issues
- Storage Agent down
- Maximum mount point exceeded
- Server detected internal error
- Device problem
- Device driver problem
- RMT mismatch issue
- Session goes into MediaW state
- Slow backups, timeouts
- Drive I/O error at TSM, VTL issues
- Query for max channels
- Query for current backup stats
- Lanfree or Lanbased

Queries

Max Channels
- q node oraclenode f=d
- lm:q path storageagent
- The number of online drives and maxnummp must match
- If maxnummp=3 and 2 drives are online and one is offline, make the offline drive online
- If maxnummp=3 and 3 drives are online and one is offline, no action is required

Current Backup Stats
- Get a rough start time from the Oracle DBA
- q act orig=client node=oranode begint=-timeestimate
- You will see the completed transactions; these tell you the bytes backed up and the database name
- Tell the DBA the last transaction time
- If the backup is lanfree, tell the DBA the number of active sessions
- Go to the Library Manager and run storageagent:q sess
- Look for client sessions and check the session states

Lanfree or Lanbased Backups
- q node oraclenode f=d
- If datawritepath=lanfree then it is definitely lanfree
- If datawritepath=any:
  - Log in to the client
  - cd /opt/tivoli/tsm/client/oracle/bin*
  - cat tdpo.opt : note down the dsmi_orc_config value
  - cat the opt file named by that parameter
  - Look for the servername in the dsm.sys file and check for the lanfree parameters

Problem Determination

1. Backups fail with error ANS0350E The current client configuration does not comply with the value of the DATAWRITEPATH or DATAREADPATH server option for this node.
- The Storage Agent is down or not responding; recycle the storage agent

2. Backups fail with Maximum mount point exceeded
- Ask the DBA how many channels are being used and the maximum allowed
- Ask the DBA to run backups with an appropriate number of channels

3. Backups fail with Unable to allocate device
- Check the paths and make them online; also check for an RMT mismatch

4. Server detected internal error
- Recycle the storage agent
- Check that the device driver is running
- Ask the VTL team to check for low light
- Check for I/O errors at the TSM server and the Library Manager

5. Backup sessions in MediaW state in the storage agent
- Check for an RMT mismatch
- Check that the device driver is running

Sample Errors

dsmerror.log and rman logs

10/25/08 23:50:18 ANS0278S The transaction will be aborted.
10/25/08 23:50:18 ANS0278S The transaction will be aborted.
10/25/08 23:50:18 ANS0326E This node has exceeded its maximum number of mount points.
10/25/08 23:50:18 ANS0326E This node has exceeded its maximum number of mount points.
08:19:08 ANS0278S The transaction will be aborted.
08:19:08 ANS0278S The transaction will be aborted.
02/23/09 ANS1315W Unexpected retry request. The server found an error while writing the data
02/23/09 21:02:10 Error -50 sending request
02/23/09 21:02:11 ANS1235E An unknown system error has occurred from which TSM cannot recover.
  ---> Communication issues between [sta and client] or [sta and tsm] or [client and tsm]; recycling the sta and retrying usually works
02/23/09 21:02:11 ANS1235E An unknown system error has occurred from which TSM cannot recover.
02/23/09 22:38:49 ANS1301E Server detected system error
  ---> Communication issues between [sta and client] or [sta and tsm] or [client and tsm]; recycling the sta and retrying usually works
02/23/09 22:38:49 ANS1301E Server detected system error
02/23/09 01:40:05 ANS4994S TDP Oracle HP ANU0599 TDP for Oracle: (5536): =>(sd1n0v2_ORACLE_BKUP) ANU2602E The object /adsmorc//c-694282618-20090223-00 was
02/23/09 06:13:18 ANS0278S The transaction will be aborted.
02/23/09 06:13:18 ANS0278S The transaction will be aborted.
02/23/09 06:13:18 ANS1315W Unexpected retry request. The server found an error while writing the data.
02/23/09 06:13:18 ANS1315W Unexpected retry request. The server found an error while writing the data.

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_SBT_TAPE_2 channel at 01/17/2009 14:01:46
ORA-27192: skgfcls: sbtclose2 returned error - failed to close file
ORA-19511: Error received from media manager layer, error text: ANS1235E (RC-72) An unknown system error has occurred from which TSM cannot recover.
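The sample dsmerror.log and RMAN lines above can be triaged faster by counting the message IDs they contain. The following is a minimal Python sketch, not part of the original material: the function names, the regular expression, and the `HINTS` table are assumptions made for illustration, and the hint text only paraphrases the notes in this document rather than any official message reference.

```python
import re
from collections import Counter

# Assumed shape of a message ID: "ANS" or "ANU", four digits, one severity letter.
MSG_ID = re.compile(r"\b(AN[SU]\d{4}[A-Z])\b")

# Hypothetical lookup table; the hints paraphrase this document's own notes.
HINTS = {
    "ANS0326E": "maximum mount points exceeded: compare RMAN channels with maxnummp",
    "ANS1235E": "likely communication issue: recycle the storage agent and retry",
    "ANS1301E": "likely communication issue: recycle the storage agent and retry",
}


def tally_messages(lines):
    """Count every message ID found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(MSG_ID.findall(line))
    return counts


def hint_for(code):
    """Return the note recorded for a message ID, or a fallback string."""
    return HINTS.get(code, "no note recorded for this message")


# Lines copied from the samples above.
SAMPLE_LINES = [
    "10/25/08 23:50:18 ANS0278S The transaction will be aborted.",
    "10/25/08 23:50:18 ANS0326E This node has exceeded its maximum number of mount points.",
    "02/23/09 21:02:11 ANS1235E An unknown system error has occurred from which TSM cannot recover.",
    "02/23/09 22:38:49 ANS1301E Server detected system error",
    "ORA-19511: Error received from media manager layer, error text: ANS1235E (RC-72) "
    "An unknown system error has occurred from which TSM cannot recover.",
]

if __name__ == "__main__":
    for code, count in tally_messages(SAMPLE_LINES).most_common():
        print(f"{code} x{count}: {hint_for(code)}")
```

In practice the lines would be read from the client's dsmerror.log or the RMAN log instead of a hard-coded list; the tally only points at which check from the Problem Determination list to try first.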
The backup failed this time because the STA was not running; lanfreetcpserveraddress was also specified; need to observe.
-------------------------
TSM Act Log :

Checking Logs at the Client
- Oracle config files : /opt/tivoli/tsm/client/oracle/bin*/
- cat tdpo.opt : check for the errorlog path
- tail -100 tdpo.error
- cd /var/opt/oracle/logs
- dbname_backup_status : contains the backup status -----> check this for the backup status
- Dbname_rman_oracle_bkup.timestamp : contains the rman logs -----> check this for the oracle rman logs

TSM Actlog
- q act orig=client node=oranode begint=-estimatedstarttime se=error|unable|fail|terminate
- q act se=error begint=-estimatedstarttime
- librarymanager: q act se=error begint=-estimatedstarttime
- q act se=unable begint=-estimatedstarttime
- q act se=fail begint=-estimatedstarttime
- q act se=terminate begint=-estimatedstarttime
- q act orig=server server=sta begint=-? se=error|unable|fail|terminate|severed|abort|conne
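The activity log searches listed above repeat the same query once per search keyword. A small Python sketch of how those command strings could be generated is shown here. The helper name `actlog_queries` and the example begin time `-02:00` are assumptions for illustration, and the parameter abbreviations (`orig`, `node`, `begint`, `se`) are copied from this document as written, not checked against a particular server level.

```python
def actlog_queries(begin_time, keywords, node=None):
    """Build one 'q act' command string per search keyword.

    begin_time is passed through unchanged (for example a relative time
    supplied by the operator); node, when given, restricts the query to
    client-originated messages for that node, as in the list above.
    """
    base = "q act"
    if node is not None:
        base += f" orig=client node={node}"
    return [f"{base} begint={begin_time} se={keyword}" for keyword in keywords]


# Keywords taken from the searches listed in this document.
KEYWORDS = ["error", "unable", "fail", "terminate"]

if __name__ == "__main__":
    # "-02:00" is a placeholder for the start time estimated with the DBA.
    for command in actlog_queries("-02:00", KEYWORDS, node="oranode"):
        print(command)
```

The generated strings are meant to be pasted into an administrative session on the TSM server or the library manager; the sketch does not connect to a server itself.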