planning server deployments

Planning Server DeploymentsPlanning Server DeploymentsLesson 20

Skills MatrixSkills Matrix

ReplicationReplication• You use replication to put copies of the same

data at different locations throughout the enterprise.

• Common reasons to replicate are:– To move data closer to the user.– To reduce locking conflicts when multiple sites

want to work with the same data.– To allow site autonomy so each location can

set up its own rules and procedures for working with its copy of the data.

– To preclude the impact of read-intensive operations, such as report generation and ad hoc query processing from the OLTP database.

ReplicationReplication

• SQL Server uses two strategies for replication: replication itself and distributed transactions managed by the distributed transaction coordinator. Whichever strategy you use, the copies of the data are current and consistent.

• You can also use both strategies in the same environment.

ReplicationReplication

• The timing describes the main difference between replication and distributed transactions. With distributed transactions, SQL Server maintains your data one hundred percent synchronized one hundred percent of the time. Replication involves some latency.

Publisher/SubscriberPublisher/Subscriber

• The publisher is the source database where replication begins. – It makes data available for

replication.

• The subscriber is the destination database where replication ends. – It either receives a snapshot of all the

published data or applies transactions that have been replicated to itself.

Publisher/SubscriberPublisher/Subscriber• The distributor is the intermediary between the

publisher and subscriber. – It receives published transactions or snapshots

from the publisher and then stores and forwards these publications to the subscribers.

• The publication is the storage container for different articles. – A subscriber can subscribe to an individual

article or an entire publication.• An article is the data, transactions, or stored

procedures that are stored within a publication. – This is the actual information being replicated.

New Publication WizardNew Publication Wizard

Two-phase CommitTwo-phase Commit• Two-phase commit (sometimes referred to

as 2PC) is a form of realtime distribution in which modifications are made to all involved databases at the same time.

• Distributed transactions handles this. • As with any transaction, either all statements

commit successfully or all modifications roll back.

• Two-phase commit uses the Microsoft DTC to accomplish its tasks.

• The DTC implements the functionality of a portion of the Microsoft Transaction Server.

Replication FactorsReplication Factors• Autonomy: This refers to how much

independence you want to give each subscriber with regard to the replicated data.

• Latency: Latency refers to the time lag between updates on the subscriber.

• Transactional consistency: Although several types of replication exist, the most common method moves transactions from the publisher through the distributor and on to the subscriber.

Types of ReplicationTypes of Replication• Snapshot replication - Snapshot replication

distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data.

• Transactional replication - A type of replication that typically starts with a snapshot of the publication database objects and data.

• Merge replication - A type of replication that allows sites to make autonomous changes to replicated data, and at a later time, merge changes and resolve conflicts when necessary.

Distribution TypesDistribution Types

• Distributed transactions• Transactional replication• Transactional replication with

immediate updating subscribers• Snapshot replication• Snapshot replication with immediate

updating subscribers• Merge replication• Queued updating

Distribution TypesDistribution Types

Queue UpdatingQueue Updating• With transactional and snapshot replication,

you can also configure queued updating. • Like the immediate updating subscribers

option, this gives your users the ability to make changes to the subscription database.

• But unlike immediate updating subscribers, queued updating will store changes until the publisher can be contacted.

• This can be extremely useful in networks where you have subscribers who are not always connected or the connection is unreliable.

SubscriptionsSubscriptions

• When you set up your subscribers, you can create either pull or push subscriptions.

Push SubscriptionsPush Subscriptions• Push subscriptions help centralize your

administrative duties because the subscription itself is stored on the distribution server.

• In other words, the data can be pushed to the subscribers based on the publisher’s schedule.

• Push subscriptions are most useful if a subscriber needs to be updated whenever a change occurs at the publisher.

• The publisher knows when the modification takes place, so it can immediately push those changes to the subscribers.

Pull SubscriptionsPull Subscriptions• Pull subscriptions are configured and

maintained at each subscriber. • The subscribers will administer the

synchronization schedules and can pull changes whenever they consider it necessary.

• This type of subscriber also relieves the distribution server of some of the overhead of processing.

• Pull subscriptions are also useful in situations in which security is not a primary issue.

Replication AgentsReplication Agents

• Five replication agents handle the tasks of moving data from the publisher to the distributor on to the subscribers. – Logreader agent– Distribution agent– Snapshot agent– Merge agent– Queue reader agent

Merge ReplicationMerge Replication

• When you use merge replication, the merge agent can be centrally located on the distributor, or it can reside on every subscriber involved in the merge replication process.

• When you have implemented push replication, the merge agent will reside on the distributor.

• In a pull scenario, the merge agent is on every subscriber.

Merge ReplicationMerge Replication

Conflict Resolution in Merge Conflict Resolution in Merge ReplicationReplication• Performing updates to the same records at

multiple locations causes conflicts. • To resolve these conflicts, SQL Server uses the

MSmerge_contents table and some settings from the publication itself.

• When you first create a merge publication, you can use the conflict resolver with three levels of resolution tracking in a merge publication:– Row-level tracking– Column-level tracking– Logical record-level tracking

Snapshot ReplicationSnapshot Replication• When you use snapshot replication, an entire

copy of the publication moves from the publisher to the subscriber.

• Everything on the subscriber database is overwritten, allowing for autonomy, as well as transactional consistency because all changes are made at once.

• Latency can be high for this type of replication if you want it to be.

• When you use snapshot replication, there is no merge agent. Snapshot replication uses the distribution agent.

Snapshot ReplicationSnapshot Replication

Transactional ReplicationTransactional Replication• When you use transactional replication, only the

changes (transactions) made to the data are moved.

• Before these transactions can be applied at a subscriber, however, the subscriber must have a copy of the data as a base.

• Because of its speed and relatively low overhead on the distribution server, transactional replication is currently the most often-used form of replication.

• Generally, data on the subscriber is treated as read-only, unless you are implementing transactional replication with immediate updating subscribers.

Transactional ReplicationTransactional Replication

Publication IssuesPublication Issues• Before you start your replication process, you

should consider a few more topics, including data definition issues, IDENTITY column issues, and some general rules involved when publishing.

• Keep the following data definition items in mind when you are preparing to publish data:– Timestamp data types– Identity values– User-defined data types– Not for replication

Tips for Distribution ServersTips for Distribution Servers• Here are some tips to keep in mind when

selecting a machine to be the distributor:– Ensure you have enough hard disk

space for the Distribution working folder and the distribution database.

– You must manage the distribution database’s transaction log carefully.

– The distribution database will store all transactions from the publisher to the subscriber.

Tips for Distribution ServersTips for Distribution Servers– Snapshots and merge data are stored in

the Distribution working folder.– Be aware of the size and number of

articles being published.– Text, ntext, and image datatypes are

replicated only when you use a snapshot.

– A higher degree of latency can significantly increase your storage space requirements.

– Know how many transactions per synchronization cycle there are.

Replication ModelsReplication Models

• You can use one of several models for each replication process that you implement:– Central publisher/central distributor– Remote distribution– Central subscriber/multiple publishers– Multiple publishers/multiple

subscribers

Central Publisher ModelCentral Publisher Model

Remote DistributionRemote Distribution

Central Subscriber/Multiple PublishersCentral Subscriber/Multiple Publishers

Multiple Publishers/Multiple Multiple Publishers/Multiple SubscribersSubscribers

• Use this model when you need to maintain a single table on multiple servers.

• Each server subscribes to the table and also publishes the table to other servers.

• This model can be particularly useful in the following business situations:– Reservations systems– Regional order-processing systems– Multiple warehouse implementations

Multiple Publishers/Multiple Multiple Publishers/Multiple SubscribersSubscribers

Heterogeneous ReplicationHeterogeneous Replication

• Heterogeneous database replication allow you to replicate data to non-Microsoft database servers including replicate to databases across the Internet.

• Heterogeneous replication occurs when you publish to other databases through an OLE DB connection.

Heterogeneous ReplicationHeterogeneous Replication

• When you publish to these non–SQL Server subscribers, you need to keep the following rules in mind:– Only push subscriptions are supported.– You can publish index views as tables;

they cannot be replicated as an indexed view.

– Snapshot data will be sent using bulk copy’s character format.

– Datatypes will be mapped as closely as possible.

Heterogeneous ReplicationHeterogeneous Replication– The account under which the distribution

agent runs must have read access to the install directory of the OLE DB provider.

– If an article is added to or deleted from a publication, subscriptions to non–SQL Server subscribers must be reinitialized.

– NULL and NOT NULL are the only constraints supported for all non–SQL Server subscribers.

– Primary key constraints are replicated as unique indexes.

Replication over the InternetReplication over the Internet

• Replicating data over the Internet allows remote, disconnected users to access data stored or “parked”, temporarily on an FTP site when they need it using a connection to the Internet. Replicate data over the Internet using: – A Virtual Private Network (VPN).– The Web synchronization option for

merge replication.

Installing and Using ReplicationInstalling and Using Replication

• To successfully install and enable replication, you must install a distribution server, create your publications, and then subscribe to them.

• Before any of this can take place, you must first configure SQL Server.

• To install your replication scenario, you must be a member of the sysadmins fixed server role.

Installing and Using ReplicationInstalling and Using Replication• Before you can configure your SQL Server for

replication, the computer itself must meet the following requirements:– All servers involved with replication must

be registered in Management Studio.– If the servers are from different domains,

Active Directory trust relationships must be established before replication can occur.

– Any account you use must have access rights to the Distribution working folder on the distribution server.

Installing and Using ReplicationInstalling and Using Replication

• Use a single Windows domain user account for all your SQL Server Agents. – Do not use a LocalSystem account

because this account has no network capabilities and will not, therefore, allow replication.

– Also, you need to make the account a member of the Domain Administrators group because only administrators have access to the system ($) shares.

Installing a Distribution ServerInstalling a Distribution Server

• Before you can enable a publication database, you must be a member of the sysadmin fixed server role.

• Once you have enabled publishing, any member of that database’s db_owner role can create and manage publications.

Adding a PublicationAdding a Publication• The Create Publication Wizard allows you to

specify the following options:– Number of articles– Schedule for the snapshot agent– Whether to maintain the snapshot on the

distributor– Tables and stored procedures you want to

publish– Publications that will share agents– Whether to allow updating subscribers– Whether to allow pull subscriptions

Creating a SubscriptionCreating a Subscription

• As part of the process of creating a subscription, you will be able to specify the publishers you want to subscribe to and a destination database to receive the published data, verify your security credentials, and set up a default schedule.

Testing ReplicationTesting Replication

• You can now verify that replication is running properly.

Replication MonitorReplication Monitor

• The Replication Monitor gathers replication information about the different replication agents. This includes the agent history, with information about inserts, updates, deletes, and any other transactions that were processed.

• Through the Replication Monitor, you can also edit the various schedules and properties of the replication agents.

Replication ScriptsReplication Scripts• Now that you have replication set up and

working properly, you may want to save all your hard work in the form of a replication script.

Replication ScriptsReplication Scripts• Scripting your replication scenario has the

following advantages:– You can use the scripts to track different

versions of your replication implementation.– You can use the scripts (with some minor

tweaking) to create additional subscribers and publishers with the same basic options.

– You can quickly customize your environment by modifying the script and then rerunning it.

– You can use the scripts as part of your database recovery process.

Replication ScriptsReplication Scripts

• From here you can script the distributor and publications for the various replication items stored with this distribution server.

• You can also script the options for any subscribers and even the replication jobs.

• When you have made your choices, just click the Script to File button and save the script wherever you like.

Replication ResourcesReplication Resources• Replication requires considerable memory

and processor resources. • You can perform a number of tweaks to

increase the performance of your replication scheme:– Set a minimum memory allocation limit.– Use a separate hard disk for all the

databases used in replication.

Replication ResourcesReplication Resources– Use multiple processors.– Publish only the amount of data

required.– Place the snapshot folder on a drive that

does not have database or log files.– Be sparing with horizontal partitioning.– Use a fast network.– Run agents continuously instead of

frequently.

SummarySummary

• Replication is a powerful tool used to distribute data to other database engines in your enterprise, which you need to do so your data will be closer to your users and, therefore, faster and easier for them to access.

SummarySummary• Microsoft uses a publisher/subscriber

metaphor to explain replication. • The publisher contains the data that needs

to be copied. • The subscribers get a copy of the data from

the publisher, and the distributor moves the data from the publisher to the subscribers.

• The data are published in groups called publications; a publication can contain several articles, which are the actual data being replicated.

SummarySummary• You can choose from three main types of

replication: merge, transactional, and snapshot. • Each has pros and cons, but you should consider

three main issues when picking a replication type: autonomy, latency, and consistency.

• In other words, you need to know whether the data has to be replicated right away or whether it can be late (latency).

• You need to know whether subscribers can update the data (autonomy); and you need to know whether the transactions need to be applied all at the same time and in a specific order (consistency).

SummarySummary

• When you have picked the right type of replication, you have a number of physical models to choose from:– Central publisher/central distributor– Remote distribution– Central subscriber/multiple publishers– Multiple publishers/multiple

subscribers– Heterogeneous replication

SummarySummary• Once you have implemented a replication

solution, you need to back it up. – You should back up all the databases

involved in replication, but especially the distributor, because if you do not, the transaction log in the distribution database will fill up and replication will stop.

• You should also generate replication scripts so that if your server ever suffers a catastrophic failure, you will be able to rebuild the replication solution much faster.

SummarySummary

• Also keep in mind all the points for enhancing replication performance.

• Once you have implemented replication, your users will come to depend on it, and if it doesn’t move fast enough, it will not be dependable, and your users will not be happy.

• If you keep it in top shape, though, users will be able to take full advantage of the power of replication.

Summary for Certification ExaminationSummary for Certification Examination

• Know the publisher/subscriber metaphor: Publishers contain the original copy of the data where changes are made.

• Subscribers receive copies of the data from the publishers.

• The data are disseminated to the subscribers through the distributor.

Summary for Certification ExaminationSummary for Certification Examination• Know the types of replication: Three basic types of

replication exist: snapshot, transactional, and merge. • In transactional replication, transactions are read right

from the transaction log and copied from the publisher to the subscribers.

• In snapshot replication, the entire publication is copied every time the publication is replicated.

• In merge replication, data from the publisher is merged with data from the subscribers, which are allowed to update.

• With the immediate updating subscribers and queued updating options, subscribers can make changes to data that has been replicated with transactional and snapshot data as well.

Summary for Certification ExaminationSummary for Certification Examination• Know the replication models: You need to

be familiar with the various models—that is, who publishes, who subscribes, and who distributes.

• In the central publisher/central distributor model, a single server is both the publisher and distributor, and there are multiple subscribers.

• The remote distribution model has one publishing server, a separate distributor, and multiple subscribers.

Summary for Certification ExaminationSummary for Certification Examination• In the central subscriber/multiple

publishers model, multiple publishers all publish to a single subscribing server.

• The multiple publishers/multiple subscribers model contains multiple publishing servers and multiple subscribing servers. The number of distributors is undefined.

• Heterogeneous replication describes replication to a third-party database engine, such as DB2 or Oracle.

Summary for Certification ExaminationSummary for Certification Examination

• Understand how publications and articles work: A publication comprises articles that contain the data being replicated. An article is actually a representation of a table. The article can be partitioned either vertically or horizontally, and it can be transformed.

planning server deployments

Documents

transactional replication

type of replication

merge replication

replication ends

replication factorsautonomy

replicated data

published data

data available