Checking the Health of Your Active Directory Environment

Stanley Lopez, Senior Premier Field Engineer
February 24, 2012

Upload: spiffy

Post on 12-Jan-2015


Page 1: Checking the health of your active directory environment

Checking the Health of Your Active Directory Environment
Stanley Lopez, Senior Premier Field Engineer
February 24, 2012

Page 2: Checking the health of your active directory environment

Overview of PFE

Premier Field Engineering (PFE) provides technical leadership for Microsoft’s Premier customers around the world to promote health in their IT environments through onsite, remote and dedicated support services.

[Map labels: Canada, US, Latam, UK, WE, France, Germany, CEE, Japan, India, APAC, MEA, GCR]

[Lifecycle diagram: Envision, Project Planning, Build, Stabilize, Deploy, Operate]

Page 3: Checking the health of your active directory environment

Driving Operations Excellence

Microsoft Confidential

Assess, Plan, Stabilize, Educate, Prevent, Optimize

Get Healthy / Stay Healthy

• Active Directory, Exchange & Windows Server Risk Assessment & Health Check Program - ADRAP
• Operations RAP
• Operation Strategic Review
• Messaging Service Map
• ADRAP Remediation
• Dedicated Support Engineer for Exchange & Windows Servers
• Troubleshooting & Disaster Recovery Workshop
• Roles & Knowledge Management
• Desired Configuration Management
• Proactive Monitoring Management
• Software Update Management
• Monthly Hot Fix
• Change and Configuration Management
• Service Level Management
• Service Catalog Design
• Capacity Management

Ready for Business & Mission Critical Support

Page 4: Checking the health of your active directory environment

Is Your AD Healthy?

Major Components of Active Directory
– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
Why disaster recovery (DR) is important for AD

Page 5: Checking the health of your active directory environment

Major Components of Active Directory

– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
– Disaster Recovery

Page 6: Checking the health of your active directory environment

Active Directory Replication

– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
– Disaster Recovery

Page 7: Checking the health of your active directory environment

Active Directory Replication 101

Active Directory Replication
– Synchronizes changes between domain controllers in a multi-master environment
– Ensures data stored on all domain controllers is consistent

Replication Model and Benefits
Multi-master
– Scalability, reliability and high availability
Store and forward
– Reduces communication over WAN links
Pull replication
– Request-pull
– The request describes the data the destination has already received
State-based and attribute-level replication
– Minimizes replication traffic
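
The pull model above can be illustrated with a small sketch. It is a deliberately simplified toy, not how Active Directory is implemented: the `DomainController` and `Change` classes are hypothetical, and the sketch ignores store-and-forward of replicated changes and conflict resolution. It only shows the idea that the destination tells the source what it has already received, so the source sends just the remainder.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    usn: int      # update sequence number assigned by the source DC
    obj: str      # object the change applies to
    attr: str     # attribute that changed
    value: str    # new attribute value


@dataclass
class DomainController:
    name: str
    next_usn: int = 1
    log: list = field(default_factory=list)        # local changes, ordered by USN
    data: dict = field(default_factory=dict)       # (obj, attr) -> value
    watermark: dict = field(default_factory=dict)  # source name -> highest USN received

    def write(self, obj, attr, value):
        """Originating write: record the change locally with a fresh USN."""
        self.log.append(Change(self.next_usn, obj, attr, value))
        self.data[(obj, attr)] = value
        self.next_usn += 1

    def changes_since(self, usn):
        """Source side: return only the changes the requester has not seen."""
        return [c for c in self.log if c.usn > usn]

    def pull_from(self, source):
        """Destination side: report what we already have, then apply the rest."""
        already_have = self.watermark.get(source.name, 0)
        received = source.changes_since(already_have)
        for change in received:
            # Attribute-level: only the changed attribute is written.
            self.data[(change.obj, change.attr)] = change.value
            self.watermark[source.name] = change.usn
        return len(received)
```

A second pull with no new writes on the source transfers nothing, which is the traffic-saving property the slide refers to.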

Page 8: Checking the health of your active directory environment

Directory Partition Replicas

Replication occurs at partition level
Note: a directory partition is sometimes called an NC (Naming Context)

[Diagram: Active Directory database (NTDS.DIT) holding the directory partitions]
– Forest-wide replication: Schema, Configuration, Forest DNS Zone
– Domain-wide replication: Domain, Domain DNS Zone
– Other labels: Domain Y, Global Catalogue

Page 9: Checking the health of your active directory environment

Replication Topology

[Diagram: Site A, Site B and Site C with their subnets, connection objects, ISTGs and bridgehead servers]
– Site Link A-B: Cost 100 / Interval 15
– Site Link A-C: Cost 100 / Interval 180

Page 10: Checking the health of your active directory environment

Inter-site Replication Topology

Connections
– A one-way, inbound route from one DC (the source) to another DC (the destination)
Site
– Defines a set of DCs that are well connected in terms of speed and cost
– A site contains one or more subnets
– A site can contain more than one domain, and one domain can span more than one site
– Within a site, the replication topology is generated automatically by the KCC
Site Links
– Between sites, site links have to be established for the KCC (ISTG) to generate the topology across the sites
– A site link carries the schedule that determines when replication can take place, as well as an assigned ‘cost’
Site Link Bridge
– When more than two sites are linked for replication and use the same transport, all of the site links are ‘bridged’
– Site link bridges are ‘transitive’
Bridgehead Server
– Designated server that performs site-to-site replication for each directory partition
– Bridgehead servers can be designated by the administrator or automatically assigned by the KCC
Inter-Site Topology Generator (ISTG)
– Within a site, the KCC runs on each DC to generate the topology for the site
– Between sites, one DC is designated as the ISTG to generate the topology for inter-site replication
– The first DC in the site automatically becomes the ISTG
– The ISTG need not necessarily be a bridgehead server
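
To make the effect of link costs and transitive bridging concrete, here is a minimal sketch that finds the cheapest route between two sites. It is not the KCC/ISTG algorithm, only an ordinary shortest-path search over site links; the function name `cheapest_route` is hypothetical, and the link costs are the ones from the Replication Topology slide.

```python
import heapq


def cheapest_route(site_links, start, goal):
    """Return (total_cost, path) for the lowest-cost route, or None.

    site_links maps a (site, site) tuple to the cost of that link.
    Links are treated as transitive, as with bridged site links.
    """
    neighbours = {}
    for (a, b), cost in site_links.items():
        neighbours.setdefault(a, []).append((b, cost))
        neighbours.setdefault(b, []).append((a, cost))

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, site, path = heapq.heappop(queue)
        if site == goal:
            return cost, path
        if site in visited:
            continue
        visited.add(site)
        for nxt, link_cost in neighbours.get(site, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    return None  # no chain of site links connects the two sites


# Costs taken from the topology slide: A-B = 100, A-C = 100.
links = {
    ("Site A", "Site B"): 100,
    ("Site A", "Site C"): 100,
}
```

With these two links, Site B reaches Site C only through Site A, so the bridged cost is the sum of both links.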

Page 11: Checking the health of your active directory environment

Things to note…

KCC vs. manually created connection objects
– No automatic fail-over for manually created connection objects
Directory partition connection
– One for Schema and Configuration, one for Domain
Global Catalog replication
– Connection required for the ISTG to create the inter-site topology
Bridgehead servers
– Windows 2000: one per domain per site
– Windows Server 2003 and above: more than one may be selected
Subnet-to-site mapping
– Ensures that clients communicate with the ‘closest’ DC
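
The subnet-to-site mapping can be sketched as a longest-prefix match: a client belongs to the site of the most specific subnet that contains its address. The function `site_for_client` and the sample subnets in the usage note are hypothetical illustrations, not the actual DC locator code.

```python
import ipaddress


def site_for_client(client_ip, subnet_to_site):
    """Return the site of the most specific subnet containing client_ip.

    subnet_to_site maps CIDR strings (e.g. "10.1.0.0/16") to site names.
    Returns None when no subnet matches, which is the case where a client
    cannot be steered to a 'closest' DC.
    """
    address = ipaddress.ip_address(client_ip)
    best = None
    for cidr, site in subnet_to_site.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            # Prefer the longest prefix (most specific subnet).
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, site)
    return best[1] if best else None
```

For example, with `{"10.1.0.0/16": "Site A", "10.1.5.0/24": "Site B"}`, the client `10.1.5.20` maps to Site B because the /24 is more specific than the /16.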

Page 12: Checking the health of your active directory environment

Checking Replication

Repadmin
Active Directory Sites and Services
Event Viewer
DCDiag
Replmon
Active Directory Topology Diagrammer (ADTD)
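
As one way to use Repadmin for a regular check, the sketch below filters the CSV that `repadmin /showrepl * /csv` produces and lists replication links with a non-zero failure count. The column names are written as I recall them from the tool's output and should be verified against your own export; the `SAMPLE` rows are invented for illustration, and `failing_links` is a hypothetical helper.

```python
import csv
import io


def failing_links(showrepl_csv):
    """Return (source, destination, naming context, failures) per failing link."""
    reader = csv.DictReader(io.StringIO(showrepl_csv))
    failures = []
    for row in reader:
        count = (row.get("Number of Failures") or "0").strip()
        if count.isdigit() and int(count) > 0:
            failures.append((
                row["Source DSA"],
                row["Destination DSA"],
                row["Naming Context"],
                int(count),
            ))
    return failures


# Invented sample data in the assumed column layout.
SAMPLE = (
    "showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,"
    "Source DSA Site,Source DSA,Transport Type,Number of Failures,"
    "Last Failure Time,Last Success Time,Last Failure Status\n"
    'showrepl_INFO,SiteA,DC1,"DC=example,DC=com",SiteB,DC2,RPC,0,0,'
    "2012-02-24 09:00:00,0\n"
    'showrepl_INFO,SiteA,DC1,"CN=Configuration,DC=example,DC=com",SiteB,DC3,'
    "RPC,5,2012-02-24 08:45:00,2012-02-20 10:00:00,1722\n"
)
```

Calling `failing_links(SAMPLE)` returns only the DC3 to DC1 link for the Configuration partition, since it is the only row with failures.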

Page 13: Checking the health of your active directory environment

AD Replication Best Practices

Verify forest-wide replication status at least once a week, and prior to making major changes that rely on directory replication
Monitor ISTGs and bridgehead servers more frequently
DO NOT
– Try to fix a DC that has not replicated for longer than the tombstone lifetime (TSL)
– Restore backups that are older than the TSL
– Decrease the TSL without a proper understanding of the impact, unless there is a strong justification for it
– Create manual connection objects unnecessarily
– Assign preferred bridgehead servers without both a compelling reason and a thorough understanding of the expected results
– Change default settings without a proper understanding of the implications
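
The tombstone lifetime rules above come down to a date comparison, sketched here. The helper names are hypothetical, and `tsl_days` is left as a parameter because the effective value depends on the forest; read it from your environment instead of assuming a default.

```python
from datetime import datetime, timedelta


def exceeds_tsl(last_success, now, tsl_days):
    """True if the last successful replication (or backup) is older than the TSL."""
    return now - last_success > timedelta(days=tsl_days)


def dcs_past_tsl(last_success_by_dc, now, tsl_days):
    """Return the DC names that must not simply be brought back into replication."""
    return sorted(
        dc for dc, last in last_success_by_dc.items()
        if exceeds_tsl(last, now, tsl_days)
    )
```

The same `exceeds_tsl` check applies to a backup's creation date before a restore is attempted.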

Page 14: Checking the health of your active directory environment

SYSVOL Replication

– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
– Disaster Recovery

Page 15: Checking the health of your active directory environment

SYSVOL Replication

File Replication Service (FRS)
Distributed File System Replication (DFSR)

Page 16: Checking the health of your active directory environment

Checking SYSVOL replication

Verify dependent services are functioning
– Name Resolution
– AD Replication
Review FRS status
– SONAR
– Event Logs
– FRSDiag
Review DFSR status
– DFS Replication has an in-box diagnostic report for the replication backlog, replication efficiency, and the number of files and folders in a given replication group
– Dfsrdiag.exe is a command-line tool that can generate a backlog count or trigger a propagation test. Both show the state of replication.
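
A backlog count has a direction (sending member to receiving member), so checking a set of DCs means one Dfsrdiag call per ordered pair. The sketch below only builds the argument lists and does not run anything. The helper names are hypothetical, and the defaults assume the usual SYSVOL replication group and folder names ("Domain System Volume" and "SYSVOL Share"); confirm both, and the switches, against `dfsrdiag backlog /?` in your environment.

```python
def backlog_command(sending_dc, receiving_dc,
                    group="Domain System Volume", folder="SYSVOL Share"):
    """Build the argument list for one dfsrdiag backlog query."""
    return [
        "dfsrdiag", "backlog",
        f"/rgname:{group}",         # replication group name (assumed default)
        f"/rfname:{folder}",        # replicated folder name (assumed default)
        f"/smem:{sending_dc}",      # sending member
        f"/rmem:{receiving_dc}",    # receiving member
    ]


def backlog_checks(dcs):
    """One command per ordered pair of DCs, since backlog is directional."""
    return [backlog_command(a, b) for a in dcs for b in dcs if a != b]
```

Each list can be handed to `subprocess.run` on a machine where the DFSR management tools are installed.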

Page 17: Checking the health of your active directory environment

Common pitfalls for FRS

Replication/FRS failures undetected
– Journal wrap failures
– FRS service not running
Improper decommissioning of domain controllers
SYSVOL partition running out of disk space
Storing non-group policy files in SYSVOL
Configuring inappropriate permissions on SYSVOL folders
Manual copying/deleting of files
Improper use of D2/D4
Excessive replication
– File system policy
– Anti-virus software
– Defragmenter
Sharing violations
– Files held open by applications

Page 18: Checking the health of your active directory environment

FRS best practices

Proactively monitor AD and FRS replication
Monitor the FRS event logs regularly for FRS errors, sharing violations and excessive replication
Clean up the metadata of improperly decommissioned DCs
Do not stop the FRS service for an extended period of time
Never copy files that live in SYSVOL between DCs; always try to troubleshoot why files aren’t replicating
Use D2 (non-authoritative) and D4 (authoritative) with care
Do not configure file system policies on SYSVOL
Do not scan or defrag SYSVOL
Do not store non-group policy files in SYSVOL

Page 19: Checking the health of your active directory environment

DFSR Best Practices

DFS Replication is a multi-master replication engine, which means that changes can be made at any location. Do not change the same document in two locations at the same time: the changes will not be merged, and the conflict is resolved by last writer wins.

Sharing violations, where users open files and hold exclusive WRITE locks in order to modify their data, prevent DFSR from replicating the modified file. Periodically the application writes those changes to NTFS and the USN Change Journal is updated. DFSR monitors that journal and attempts to replicate the file, only to find that it cannot because the file is still open.

An event is logged if DFSR repeatedly has trouble replicating open files: entries for events 4302 and 4304 appear in the DFS Replication event log.

The option to adjust the replication schedule in DFSR management is greyed out, because SYSVOL replication follows the same replication path and schedule as Active Directory. If the time window is open, DFSR replicates almost instantly. If the schedule does not permit replication, replication starts when the time window opens. For example, if AD replication is not permitted between 6:00 am and 10:00 am, DFS Replication will not replicate either. As soon as the schedule allows replication, the changed files are replicated.
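
The schedule behaviour described above can be sketched as a small function: a change made inside a blocked window waits until the window ends, and a change made outside it replicates right away. The function names are hypothetical, and the sketch assumes windows that do not cross midnight.

```python
from datetime import time


def replication_allowed(at, blocked_windows):
    """True if the schedule permits replication at the given time of day."""
    return not any(start <= at < end for start, end in blocked_windows)


def next_replication_time(changed_at, blocked_windows):
    """Return when a change made at changed_at can start replicating."""
    for start, end in blocked_windows:
        if start <= changed_at < end:
            return end      # wait for the window to open
    return changed_at       # window already open: replicate almost instantly


# The example from the slide: no replication between 6:00 am and 10:00 am.
blocked = [(time(6, 0), time(10, 0))]
```

With this schedule, a SYSVOL change made at 7:30 am is held until 10:00 am, while one made at 11:00 am goes out immediately.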

Page 20: Checking the health of your active directory environment

Name Resolution

– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
– Disaster Recovery

Page 21: Checking the health of your active directory environment

DNS 101

Domain Name System
– Provides name resolution service
Used by
– Clients and applications: for locating DCs as well as the ‘services’ provided by DCs
– Domain controllers: for Active Directory replication and the File Replication Service

Page 22: Checking the health of your active directory environment

What needs to be in place for AD to function properly

TCP/IP configuration
– Domain controllers must be configured with a proper IP address and must point to valid DNS servers
DNS records
– Required records must be registered properly on DNS servers
– Servers must be functioning properly
– Forwarders, delegations, secondaries, etc. must be configured properly and be valid

Page 23: Checking the health of your active directory environment

Records Registered by DCs

Host (A) record
– IP address of the domain controller
– Registered by the DHCP Client service
– Registered by the DNS Client service on Windows Server 2008
Service (SRV) resource records
– Registered by the Netlogon service on the DC
– Used by clients and services to locate the various types of services provided by domain controllers
GUID (CNAME) record
– Required for AD replication
– Registered only on the forest root DNS server
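
To show what those records look like, the sketch below builds a few of the well-known names a DC registers. It covers only a subset of what Netlogon registers (the full list is in the DC's `netlogon.dns` file), and the function name, domain names, site name and GUID in the usage note are made-up placeholders.

```python
def expected_dc_records(domain, forest_root, site, dsa_guid):
    """Return a subset of the DNS names a domain controller registers.

    domain      - DNS name of the DC's domain
    forest_root - DNS name of the forest root domain
    site        - AD site the DC belongs to
    dsa_guid    - GUID of the DC's NTDS Settings object, as a string
    """
    srv = [
        f"_ldap._tcp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
    ]
    # The GUID alias lives under the forest root, which is why replication
    # partners in any domain of the forest can resolve it.
    cname = [f"{dsa_guid}._msdcs.{forest_root}"]
    return {"SRV": srv, "CNAME": cname}
```

For instance, `expected_dc_records("child.example.com", "example.com", "Site-A", "<guid>")` yields a CNAME under `_msdcs.example.com`, not under the child domain.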

Page 24: Checking the health of your active directory environment

Checking your DNS

Verify TCP/IP configuration
– IPConfig
Verify DNS server functionality
– NSLookup
– DCDiag /test:DNS
– DNS server console
– Event Logs
Verify GUID and glue records
– DNSLint
Re-register records
– Cycle (restart) the Netlogon service
– Cycle the DHCP Client/DNS Client service, or run IPConfig /RegisterDNS
Capture a network trace
– Netmon

Page 25: Checking the health of your active directory environment

Common Pitfalls

Administrators not familiar with or aware of the name resolution design
Invalid (stale) TCP/IP, forwarder, delegation and similar settings
DCs pointing to external (invalid) DNS servers
Single point of failure configurations
DNS forwarder loop
Zone transfer not secured
Dynamic update not enabled
DNS scavenging not enabled
Multi-homed domain controllers

Page 26: Checking the health of your active directory environment

DNS Best Practices

Audit the DNS entries used for DC replication on a monthly basis
Ensure that disconnected NICs are disabled
Adopt a standardized configuration for domain controllers and DNS servers
Allow zone transfers to specific servers only
Allow only secure dynamic updates
Configure DNS scavenging to remove stale records
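
As a rough illustration of what "stale" means for scavenging, the sketch below applies the aging rule: a dynamically registered record becomes eligible for scavenging once its timestamp is older than the no-refresh interval plus the refresh interval. The function name is hypothetical, the 7-day values are the commonly seen defaults and are assumptions here, and static records are modelled as having no timestamp.

```python
from datetime import datetime, timedelta


def is_stale(timestamp, now,
             no_refresh=timedelta(days=7), refresh=timedelta(days=7)):
    """True if a record is old enough to be scavenged.

    timestamp is the record's last registration/refresh time, or None for a
    static record, which is never scavenged.
    """
    if timestamp is None:
        return False
    return now > timestamp + no_refresh + refresh
```

A record refreshed 20 days ago is eligible under these intervals, while one refreshed 10 days ago is still inside its refresh window.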

Page 27: Checking the health of your active directory environment

Major Components of Active Directory

– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
– Disaster Recovery

Page 28: Checking the health of your active directory environment

Domain Controller Health

Service pack level
When was the last time your DC was restarted?
Event logs
– How often do you review the logs for errors or warnings?
Is time synchronization configured properly in the environment (W32tm)?

Page 29: Checking the health of your active directory environment

Common Pitfalls

Potential failures not detected
– Service failing
– DC experiencing a bottleneck
– System running low on disk space
No proper management of event logs
DCs running on an outdated service pack
DCs not patched with security updates
Time synchronization improperly configured

Page 30: Checking the health of your active directory environment

Best Practices

Run DCDiag on a weekly basis to verify the overall well-being of domain controllers
Review event logs on domain controllers regularly to uncover problems at an early stage
Perform baselining and regular monitoring of domain controllers to uncover any potential resource bottleneck
Configure only the forest root domain PDC emulator (PDCe) with the NTP time source type; other DCs should follow the domain hierarchy
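
A weekly DCDiag run is easier to review if the failed tests are pulled out of the text output. The sketch below assumes the English-language result lines of the form "... DC1 failed test Replications" and uses invented sample output; `failed_tests` is a hypothetical helper, not part of DCDiag.

```python
import re

# Matches result lines such as "......... DC1 passed test Connectivity".
RESULT_LINE = re.compile(r"\.{3,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)")


def failed_tests(dcdiag_output):
    """Return (server, test name) for every failed test in DCDiag output."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_LINE.search(line)
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


# Invented sample output for illustration only.
SAMPLE_OUTPUT = """
   Testing server: SiteA\\DC1
      Starting test: Connectivity
         ......................... DC1 passed test Connectivity
      Starting test: Replications
         ......................... DC1 failed test Replications
      Starting test: SystemLog
         ......................... DC1 failed test SystemLog
"""
```

Feeding the captured output of each weekly run through `failed_tests` gives a short list to compare week over week.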

Page 31: Checking the health of your active directory environment

Major Components of Active Directory

– Active Directory Replication
– SYSVOL Replication
– Name Resolution
– Domain Controller Health
– Disaster Recovery

Page 32: Checking the health of your active directory environment

Disaster Recovery

Loss of DCs
Loss of data
Re-introduction of lingering objects
Loss of configuration partition data

Page 33: Checking the health of your active directory environment

Questions?