04 Routine Admin Procedures


Metadata Backups

If the namenode’s persistent metadata is lost or damaged, the entire filesystem is rendered unusable, so it is critical that backups are made of these files. You should keep multiple copies of different ages (one hour, one day, one week, and one month, say) to protect against corruption, either in the copies themselves or in the live files running on the namenode.

A straightforward way to make backups is to write a script to periodically archive the secondary namenode's previous.checkpoint subdirectory (under the directory defined by the fs.checkpoint.dir property) to an offsite location.
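A minimal sketch of such a script, assuming fs.checkpoint.dir is /data/secondary and that /backup/namenode-meta is an offsite (for example NFS-mounted) location; both paths are illustrative:

#!/bin/bash
# Archive the secondary namenode's previous.checkpoint directory to an offsite location.
# Assumes fs.checkpoint.dir=/data/secondary and a backup mount at /backup/namenode-meta.
CHECKPOINT_DIR=/data/secondary/previous.checkpoint
BACKUP_DIR=/backup/namenode-meta
STAMP=$(date +%Y%m%d-%H%M)
tar czf "$BACKUP_DIR/checkpoint-$STAMP.tar.gz" -C "$CHECKPOINT_DIR" .
# Keep only the most recent 30 archives so copies of different ages accumulate without filling the disk.
ls -t "$BACKUP_DIR"/checkpoint-*.tar.gz | tail -n +31 | xargs -r rm -f

Scheduling this from cron (hourly or daily) gives you the multiple copies of different ages described above.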

Data Backups

• Data loss can occur.
• Prioritize your data for backups.
• The output of MapReduce jobs is often expensive to regenerate, so factor it into that prioritization.
• distcp is ideal for making backups to other HDFS clusters.
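For example, a backup of a critical directory to a second cluster could look like the following; the namenode hostnames and paths are placeholders:

% hadoop distcp hdfs://namenode1:8020/user/important-data hdfs://namenode2:8020/backups/important-data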


Filesystem Check (fsck)

It is advisable to run HDFS's fsck tool regularly (for example, daily) on the whole filesystem to proactively look for missing or corrupt blocks.
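For example, a daily cron entry on an administration host could run fsck over the whole namespace and keep the report (schedule and log path are illustrative; note that % must be escaped in crontab):

# m h dom mon dow   command
0 1 * * *   hadoop fsck / > /var/log/hadoop/fsck-$(date +\%F).log 2>&1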

Filesystem Balancer

Run the balancer tool regularly to keep the filesystem datanodes evenly balanced.
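For example (the default balancing threshold is 10% of cluster utilization; -threshold changes it):

% start-balancer.sh                # run with the default threshold
% start-balancer.sh -threshold 5   # or demand a tighter 5% spread
% stop-balancer.sh                 # stop a running balancer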


Commissioning and Decommissioning Nodes

• To grow the storage available to a cluster, commission new nodes.
• To shrink the cluster, decommission existing nodes.
• It can sometimes be necessary to decommission a node if it is misbehaving, perhaps because it is failing more often than it should or its performance is noticeably slow.
• Datanodes that are permitted to connect to the namenode are specified in a file whose name is specified by the dfs.hosts property.
• Similarly, tasktrackers that may connect to the jobtracker are specified in a file whose name is specified by the mapred.hosts property.
• In most cases there is one shared file, referred to as the include file, that both dfs.hosts and mapred.hosts refer to.

The file (or files) specified by the dfs.hosts and mapred.hosts properties is different from the slaves file. The former is used by the namenode and jobtracker to determine which worker nodes may connect. The slaves file is used by the Hadoop control scripts to perform cluster-wide operations, such as cluster restarts. It is never used by the Hadoop daemons.
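As a sketch, the include file is simply a list of worker hostnames, one per line, and the two properties point at it; the path /etc/hadoop/conf/include is an assumption for illustration:

# /etc/hadoop/conf/include -- shared include file, one worker hostname per line
datanode1.example.com
datanode2.example.com
datanode3.example.com

# hdfs-site.xml:   dfs.hosts    -> /etc/hadoop/conf/include
# mapred-site.xml: mapred.hosts -> /etc/hadoop/conf/include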


Commissioning new nodes:

1. Add the network addresses of the new nodes to the include file.
2. Update the namenode with the new set of permitted datanodes using this command:
   % hadoop dfsadmin -refreshNodes
3. Update the jobtracker with the new set of permitted tasktrackers using:
   % hadoop mradmin -refreshNodes
4. Update the slaves file with the new nodes, so that they are included in future operations performed by the Hadoop control scripts.
5. Start the new datanodes and tasktrackers.
6. Check that the new datanodes and tasktrackers appear in the web UI.
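Steps 2, 3, and 5 as shell commands; the daemon start commands are run on each new node and assume a standard Hadoop 1.x installation:

# On the namenode/jobtracker machine, after editing the include file:
% hadoop dfsadmin -refreshNodes
% hadoop mradmin -refreshNodes

# On each new node, start the worker daemons:
% hadoop-daemon.sh start datanode
% hadoop-daemon.sh start tasktracker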


Decommissioning nodes:

1. Add the network addresses of the nodes to be decommissioned to the exclude file. Do not update the include file at this point.
2. Update the namenode with the new set of permitted datanodes, with this command:
   % hadoop dfsadmin -refreshNodes
3. Update the jobtracker with the new set of permitted tasktrackers using:
   % hadoop mradmin -refreshNodes
4. Go to the web UI and check whether the admin state has changed to "Decommission In Progress" for the datanodes being decommissioned. They will start copying their blocks to other datanodes in the cluster.
5. When all the datanodes report their state as "Decommissioned," all the blocks have been replicated. Shut down the decommissioned nodes.
6. Remove the nodes from the include file, and run:
   % hadoop dfsadmin -refreshNodes
   % hadoop mradmin -refreshNodes
7. Remove the nodes from the slaves file.
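The exclude file uses the same one-hostname-per-line format as the include file; in Hadoop 1.x its location is normally given by the dfs.hosts.exclude and mapred.hosts.exclude properties (the path below is illustrative):

# /etc/hadoop/conf/exclude -- nodes being retired
datanode3.example.com

# hdfs-site.xml:   dfs.hosts.exclude    -> /etc/hadoop/conf/exclude
# mapred-site.xml: mapred.hosts.exclude -> /etc/hadoop/conf/exclude

# Then, on the namenode/jobtracker machine:
% hadoop dfsadmin -refreshNodes
% hadoop mradmin -refreshNodes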


${fs.checkpoint.dir}/
    current/
        VERSION
        edits
        fsimage
        fstime
    previous.checkpoint/
        VERSION
        edits
        fsimage
        fstime

The layout of these directories is identical on the namenode and the secondary namenode.

Namenode recovery scenarios:

1) If the namenode simply crashed but dfs.name.dir is not corrupted, just restart it.

2) If fsimage is corrupted: replace the namenode machine with a new one, create dfs.name.dir, copy the previous.checkpoint directory into the namenode's current directory, and start the namenode.

3) If fsimage is corrupted and the secondary namenode is taking over as the new primary namenode: use the -importCheckpoint option while starting the namenode daemon. The -importCheckpoint option loads the namenode metadata from the latest checkpoint in the directory defined by the fs.checkpoint.dir property.

4) If fsimage is corrupted: replace the namenode machine with a new one, create dfs.name.dir on the new machine (it must be empty), point fs.checkpoint.dir to NFS (if you have taken a backup on NFS), and start the namenode with the -importCheckpoint option.
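A condensed sketch of scenarios 3 and 4, assuming dfs.name.dir on the replacement machine is empty and fs.checkpoint.dir points at the copied or NFS-mounted checkpoint directory (the path is illustrative):

# On the replacement namenode machine:
% mkdir -p /data/dfs/name                 # must match dfs.name.dir and be empty
% hadoop namenode -importCheckpoint       # loads metadata from the latest checkpoint in fs.checkpoint.dir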


Job Schedulers

FIFO Scheduler: the default scheduler, which processes all tasks of a job before it starts executing the next job's tasks.

Fair Scheduler: gives multiple users of the cluster a fair share simultaneously. Each job is assigned to a pool, and each pool is assigned an even share of the available task slots.

Capacity Scheduler: supports multiple queues. Queues are guaranteed a fraction of the capacity of the cluster (their "guaranteed capacity") in the sense that a certain capacity of resources will be at their disposal. All jobs submitted to a queue have access to the capacity guaranteed to the queue.

Demo on Fair Scheduler configuration and allocating resources to users.
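As a sketch of what that configuration looks like in Hadoop 1.x, the Fair Scheduler is enabled via mapred-site.xml and pools are defined in an allocation file; the pool name and limits below are made up for illustration:

# mapred-site.xml (shown as property -> value):
#   mapred.jobtracker.taskScheduler      -> org.apache.hadoop.mapred.FairScheduler
#   mapred.fairscheduler.allocation.file -> /etc/hadoop/conf/fair-scheduler.xml

# Create an allocation file with a per-team pool and a default per-user job limit:
cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <pool name="analytics">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <weight>2.0</weight>
  </pool>
  <userMaxJobsDefault>5</userMaxJobsDefault>
</allocations>
EOF

The jobtracker must be restarted to pick up the scheduler change; the allocation file itself is reloaded periodically.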


Upgrade procedure:

1. Make sure that any previous upgrade is finalized before proceeding with another upgrade.
2. Shut down MapReduce and kill any orphaned task processes on the tasktrackers.
3. Shut down HDFS and back up the namenode directories.
4. Install new versions of Hadoop HDFS and MapReduce on the cluster and on clients.
5. Start HDFS with the -upgrade option.
6. Wait until the upgrade is complete.
7. Perform some sanity checks on HDFS.
8. Start MapReduce.
9. Roll back or finalize the upgrade (optional).

Define OLD_HADOOP_INSTALL and NEW_HADOOP_INSTALL as environment variables pointing to the old and new installation directories.

Commands:

% $NEW_HADOOP_INSTALL/bin/start-dfs.sh -upgrade
% $NEW_HADOOP_INSTALL/bin/hadoop dfsadmin -upgradeProgress status

Run fsck for the sanity checks in step 7 (see Filesystem Check below).

To roll back:
% $NEW_HADOOP_INSTALL/bin/stop-dfs.sh
% $OLD_HADOOP_INSTALL/bin/start-dfs.sh -rollback

To finalize:
% $NEW_HADOOP_INSTALL/bin/hadoop dfsadmin -finalizeUpgrade


DFSADMIN
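The dfsadmin command provides a range of HDFS administration operations; a few commonly used subcommands in Hadoop 1.x:

% hadoop dfsadmin -report                      # usage statistics for HDFS and each datanode
% hadoop dfsadmin -safemode get                # query safe mode (also: enter, leave, wait)
% hadoop dfsadmin -metasave meta.log           # dump block and replication details to a file in the log directory
% hadoop dfsadmin -setQuota 1000 /user/tom     # limit the number of names under a directory
% hadoop dfsadmin -setSpaceQuota 1t /user/tom  # limit the space consumed under a directory
% hadoop dfsadmin -refreshNodes                # re-read the include and exclude files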


Filesystem Check - fsck

Over-replication is not normally a problem, and HDFS will automatically delete excess replicas.


Filesystem Check - fsck

Corrupt or missing blocks are the biggest cause for concern, as it means data has been lost. By default, fsck leaves files with corrupt or missing blocks, but you can tell it to perform one of the following actions on them:

• Move the affected files to the /lost+found directory in HDFS, using the -move option. Files are broken into chains of contiguous blocks to aid any salvaging efforts you may attempt.
• Delete the affected files, using the -delete option. Files cannot be recovered after being deleted.
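For example:

% hadoop fsck / -move      # move files with corrupt/missing blocks to /lost+found
% hadoop fsck / -delete    # or delete them permanently (unrecoverable)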

Finding the blocks for a file:

% hadoop fsck /user/tom/part-00007 -files -blocks -racks

• The -files option shows the line with the filename, size, number of blocks, and its health (whether there are any missing blocks).
• The -blocks option shows information about each block in the file, one line per block.
• The -racks option displays the rack location and the datanode addresses for each block.


Datanode block scanner

Blocks are periodically verified every three weeks to guard against disk errors over time (this is controlled by the dfs.datanode.scan.period.hours property, which defaults to 504 hours). Corrupt blocks are reported to the namenode to be fixed.

http://datanode:50075/blockScannerReport

http://datanode:50075/blockScannerReport?listblocks
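The same report can be fetched from the command line; replace datanode with an actual hostname:

% curl 'http://datanode:50075/blockScannerReport?listblocks'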

Balancer

% start-balancer.sh

Over time, the distribution of blocks across datanodes can become unbalanced. An unbalanced cluster can affect locality for MapReduce, and it puts a greater strain on the highly utilized datanodes, so it's best avoided.
