Merging data (1)

Post on 15-Jul-2015

Category:

Software

TRANSCRIPT

<ul><li><p> Merging Data and Passing Tables 10-1 </p>
<p>Module 10</p>
<p>Merging Data and Passing Tables</p>
<p>Contents:</p>
<p>Lesson 1: Using the MERGE Statement</p>
<p>Lesson 2: Implementing Table Types</p>
<p>Lesson 3: Using TABLE Types As Parameters</p>
<p>Lab 10: Passing Tables and Merging Data</p></li>
<li><p>10-2 Implementing a Microsoft SQL Server 2008 R2 Database </p>
<p>Module Overview</p>
<p>Each time a client application makes a call to a SQL Server system, a considerable delay is encountered at the network layer. The basic delay is unrelated to the amount of data being passed; it relates to the latency of the network. For this reason, it is important to minimize the number of times that a client needs to call a server for a given amount of data that must be passed between them. Each call is termed a "roundtrip".</p>
<p>In this module, you will review the techniques that provide the ability to process sets of data rather than individual rows. You will then see how these techniques can be used in combination with TABLE parameter types to minimize the number of required stored procedure calls in typical applications.</p>
<p>Objectives</p>
<p>After completing this module, you will be able to:</p>
<ul><li>Use the MERGE statement</li>
<li>Implement table types</li>
<li>Use TABLE types as parameters</li></ul></li>
<li><p>Lesson 1</p>
<p>Using the MERGE Statement</p>
<p>A very common requirement when coding in T-SQL is to update a row if it exists but to insert the row if it does not already exist. SQL Server 2008 introduced the MERGE statement, which provides this ability plus the ability to process entire sets of data rather than processing row by row or in several separate set-based statements. This leads to much more efficient execution and simplifies the required coding. 
In this lesson, you will investigate the use of the MERGE statement and the use of the most common options associated with the statement.</p>
<p>Objectives</p>
<p>After completing this lesson, you will be able to:</p>
<ul><li>Explain the role of the MERGE statement</li>
<li>Describe how to use the WHEN MATCHED clause</li>
<li>Describe how to use the WHEN NOT MATCHED BY TARGET clause</li>
<li>Describe how to use the WHEN NOT MATCHED BY SOURCE clause</li>
<li>Explain the role of the OUTPUT clause and $action</li>
<li>Describe MERGE determinism and performance</li></ul></li>
<li><p>MERGE Statement</p>
<p>Key Points</p>
<p>The MERGE statement is most commonly used to insert data that does not already exist but to update the data if it does exist. It can operate on entire sets of data rather than just on single rows and can perform alternate actions such as deletes.</p>
<p>MERGE</p>
<p>It is a common requirement to need to update data if it already exists but to insert it if it does not already exist. Some other database engines (not SQL Server) provide an UPSERT statement for this purpose. The MERGE statement provided by SQL Server is a more capable replacement for such statements in other database engines and is based on the ANSI SQL standard together with some Microsoft extensions to the standard.</p>
<p>A typical situation where the need for the MERGE statement arises is in the population of data warehouses from data in source transactional systems. For example, consider a data warehouse holding details of a customer. When a customer row is received from the transactional system, it needs to be inserted into the data warehouse. When later updates to the customer are made, the data warehouse would then need to be updated. 
</p>
<p>Atomicity</p>
<p>Where statements in other languages typically operate on single rows, the MERGE statement in SQL Server can operate on entire sets of data in a single statement execution. It is important to realize that the MERGE statement functions as an atomic operation in that all inserts, updates or deletes occur or none occur.</p>
<p>Source and Target</p>
<p>The MERGE statement uses two table data sources. The target table is the table that is being modified and is specified first in the MERGE statement. Any inserts, updates or deletes are applied only to the target table.</p></li>
<li><p>The source table provides the rows that need to be matched to the rows in the target table. You can think of the source table as the incoming data. It is specified in a USING clause. The source table does not have to be an actual table but can be other types of expressions that return a table such as:</p>
<ul><li>A view</li>
<li>A sub-select (or derived table) with an alias</li>
<li>A common table expression (CTE)</li>
<li>A VALUES clause with an alias</li></ul>
<p>The source and target are matched together as the result of an ON clause. This can involve one or more columns from both tables.</p></li>
<li><p>WHEN MATCHED</p>
<p>Key Points</p>
<p>The WHEN MATCHED clause defines the action to be taken when a row in the source is matched to a row in the target.</p>
<p>WHEN MATCHED</p>
<p>The ON clause is used to match source rows to target rows. The WHEN MATCHED clause specifies the action that needs to occur when a source row matches a target row. In most cases, this will involve an UPDATE statement but it could alternately involve a DELETE statement.</p>
<p>In the example shown in the slide, rows in the EmployeeUpdate table are being matched to rows in the Employee table based upon the EmployeeID. 
When a source row matches a target row, the FullName and EmploymentStatus columns in the target table are updated with the values of those columns in the source.</p>
<p>Note that only the target table can be updated. If an attempt is made to modify any other table, a syntax error is returned.</p>
<p>Multiple Clauses</p>
<p>It is also possible to include two WHEN MATCHED clauses such as shown in the following code block:</p>
<p>WHEN MATCHED AND s.Quantity &gt; 0 </p>
<p>... </p>
<p>WHEN MATCHED </p>
<p>... </p>
<p>No more than two WHEN MATCHED clauses can be present. When two clauses are used, the first clause must have an AND condition. If the source row matches the target and also satisfies the AND condition, then the action specified in the first WHEN MATCHED clause is performed. Otherwise, if the source row matches the target but does not satisfy the AND condition, the condition in the second WHEN MATCHED clause is evaluated instead.</p>
<p>When two WHEN MATCHED clauses are present, one action must specify an UPDATE and the other action must specify a DELETE.</p>
<p>Question: What is different about the UPDATE statement in the example shown, compared to a normal UPDATE statement?</p></li>
<li><p>WHEN NOT MATCHED BY TARGET</p>
<p>Key Points</p>
<p>The WHEN NOT MATCHED BY TARGET clause specifies the action that needs to be taken when a row in the source cannot be matched to a row in the target.</p>
<p>WHEN NOT MATCHED</p>
<p>The next clause in the MERGE statement that you will consider is the WHEN NOT MATCHED BY TARGET clause. It was mentioned in the last topic that the most common action performed by a WHEN MATCHED clause is to update the existing row in the target table. 
The most common action performed by a WHEN NOT MATCHED BY TARGET clause is to insert a new row into the target table.</p>
<p>In the example shown in the slide, when a row from the EmployeeUpdate table cannot be found in the Employee table, a new employee row would be added into the Employee table.</p>
<p>With a standard INSERT statement in T-SQL, the inclusion of a column list is considered a best practice and avoids issues related to changes to the underlying table such as the reordering of columns or the addition of new columns. The same recommendation applies to an INSERT action within a MERGE statement. While a column list is optional, best practice suggests including one.</p>
<p>Syntax</p>
<p>The words BY TARGET are optional and are often omitted. The clause is then just written as WHEN NOT MATCHED. Note again that no table name is included in the action statement (INSERT statement) as modifications may only be made to the target table.</p>
<p>The WHEN NOT MATCHED BY TARGET clause is part of the ANSI SQL standard.</p></li>
<li><p>WHEN NOT MATCHED BY SOURCE</p>
<p>Key Points</p>
<p>The WHEN NOT MATCHED BY SOURCE clause is used to specify an action to be taken for rows in the target that were not matched by rows from the source.</p>
<p>WHEN NOT MATCHED BY SOURCE</p>
<p>While much less commonly used than the clauses discussed in the previous topics, you can also take an action for rows in the target that did not match any incoming rows from the source.</p>
<p>Generally, this will involve deleting the unmatched rows in the target table but UPDATE actions are also permitted.</p>
<p>Note the format of the DELETE statement in the example on the slide. At first glance, it might seem quite odd as it has no table or predicate specified. 
In this example, all rows in the Employee table that were not matched by an incoming source row from the EmployeeUpdate table would be deleted.</p>
<p>Question: What would the DELETE statement look like if it only deleted rows where the date in a column called LastModified were older than a year?</p></li>
<li><p>OUTPUT Clause and $action</p>
<p>Key Points</p>
<p>The OUTPUT clause was added in SQL Server 2005 and allows the return of a set of rows when performing data modifications. In 2005, this applied to INSERT, DELETE and UPDATE. In SQL Server 2008 and later, this clause can also be used with the MERGE statement.</p>
<p>OUTPUT Clause</p>
<p>The OUTPUT clause was a useful addition to the INSERT, UPDATE and DELETE statements in SQL Server 2005. For example, consider the following code:</p>
<p>DELETE FROM HumanResources.Employee </p>
<p> OUTPUT deleted.BusinessEntityID, deleted.NationalIDNumber </p>
<p> WHERE ModifiedDate &lt; DATEADD(YEAR,-10,SYSDATETIME()); </p>
<p>In this example, employees are deleted when their rows have not been modified within the last ten years. As part of this modification, a set of rows is returned that provides details of the BusinessEntityID and NationalIDNumber for each row deleted.</p>
<p>As well as returning rows to the client application, the OUTPUT clause can include an INTO sub-clause that causes the rows to be inserted into another existing table instead. Consider the following example:</p>
<p>DELETE FROM HumanResources.Employee </p>
<p> OUTPUT deleted.BusinessEntityID, deleted.NationalIDNumber </p>
<p> INTO Audit.EmployeeDelete </p>
<p> WHERE ModifiedDate &lt; DATEADD(YEAR,-10,SYSDATETIME()); </p>
<p>In this example, details of the employees being deleted are inserted into the Audit.EmployeeDelete table instead of being returned to the client. 
</p></li>
<li><p>OUTPUT and MERGE</p>
<p>The OUTPUT clause can also be used with the MERGE statement. When an INSERT is performed, rows can be returned from the inserted virtual table. When a DELETE is performed, rows can be returned from the deleted virtual table. When an UPDATE is performed, values will be available in both the inserted and deleted virtual tables.</p>
<p>Because a single MERGE statement can perform INSERT, UPDATE and DELETE actions, it can be useful to know which action was performed for each row returned by the OUTPUT clause. To make this possible, the OUTPUT clause also supports a $action virtual column that returns details of the action performed on each row. It returns the words "INSERT", "UPDATE" or "DELETE".</p>
<p>Composable SQL</p>
<p>In SQL Server 2008 and later, it is now possible to consume the rowset returned by the OUTPUT clause more directly. The rowset cannot be used as a general purpose table source but can be used as a table source for an INSERT SELECT statement. Consider the following example:</p>
<p>INSERT INTO Audit.EmployeeDelete </p>
<p>SELECT Mods.EmployeeID </p>
<p>FROM (MERGE INTO dbo.Employee AS e </p>
<p> USING dbo.EmployeeUpdate AS eu </p>
<p> ON e.EmployeeID = eu.EmployeeID </p>
<p> WHEN MATCHED THEN </p>
<p> UPDATE SET e.FullName = eu.FullName, </p>
<p> e.EmploymentStatus = eu.EmploymentStatus </p>
<p> WHEN NOT MATCHED THEN </p>
<p> INSERT (EmployeeID,FullName,EmploymentStatus) </p>
<p> VALUES </p>
<p> (eu.EmployeeID,eu.FullName,eu.EmploymentStatus) </p>
<p> WHEN NOT MATCHED BY SOURCE THEN </p>
<p> DELETE </p>
<p> OUTPUT $action AS Action,deleted.EmployeeID) AS Mods </p>
<p>WHERE Mods.Action = 'DELETE'; </p>
<p>In this example, the OUTPUT clause is being used with the MERGE statement. A row would be returned for each row inserted, updated or deleted. However, you wish to only audit the deletion. 
You can treat the MERGE statement with an OUTPUT clause as a table source for an INSERT SELECT statement. The enclosed statement must be given an alias. In this case, the alias "Mods" has been assigned.</p>
<p>The power of being able to SELECT from a MERGE statement is that you can then apply a WHERE clause. In this example, only the DELETE actions have been selected.</p>
<p>Note that from SQL Server 2008 onwards, this level of query composability also applies to the OUTPUT clause when used in standard T-SQL INSERT, UPDATE and DELETE statements.</p>
<p>Question: How could the OUTPUT clause be useful in a DELETE statement?</p></li>
<li><p>MERGE Determinism and Performance</p>
<p>Key Points</p>
<p>The actions performed by a MERGE statement are not identical to those that would be performed by separate INSERT, UPDATE or DELETE statements.</p>
<p>Determinism</p>
<p>When an UPDATE statement is executed with a join, if more than one source row matches a target row, no error is thrown. This is not permitted for an UPDATE action performed within a MERGE statement. Each target row may be matched by at most one source row. If more than a single source row matches the same target row, an error occurs and all actions performed by the MERGE statement are rolled back.</p>
<p>Performance of MERGE</p>
<p>The MERGE statement will often outperform code constructed from separate INSERT, UPDATE and DELETE statements and conditional logic. In particular, the MERGE statement only ever makes a single pass through the data.</p></li></ul>
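<p>The clauses covered in this lesson can be combined in a single statement. The following is a minimal sketch rather than the example from the slides: it assumes temporary tables named #Employee and #EmployeeUpdate that mirror the Employee and EmployeeUpdate tables referred to in the lesson. The data types and the sample rows are assumptions made for illustration only.</p>
<pre><code class="language-sql">-- Hypothetical schemas: column names come from the lesson, data types are assumed.
CREATE TABLE #Employee
( EmployeeID       int           NOT NULL PRIMARY KEY,
  FullName         nvarchar(100) NOT NULL,
  EmploymentStatus nvarchar(20)  NOT NULL );

-- The primary key on the source means that no two source rows can share an
-- EmployeeID, so each target row is matched by at most one source row.
CREATE TABLE #EmployeeUpdate
( EmployeeID       int           NOT NULL PRIMARY KEY,
  FullName         nvarchar(100) NOT NULL,
  EmploymentStatus nvarchar(20)  NOT NULL );

-- Made-up sample data.
INSERT INTO #Employee (EmployeeID, FullName, EmploymentStatus)
VALUES (1, N'Terry Smith', N'Active'),
       (2, N'Lee Jones',   N'Active');

INSERT INTO #EmployeeUpdate (EmployeeID, FullName, EmploymentStatus)
VALUES (2, N'Lee Jones', N'On Leave'),  -- matches target row 2
       (3, N'Kim Brown', N'Active');    -- has no matching target row

MERGE INTO #Employee AS e               -- target: the only table modified
USING #EmployeeUpdate AS eu             -- source: the incoming rows
   ON e.EmployeeID = eu.EmployeeID
WHEN MATCHED THEN                       -- row exists in both tables
    UPDATE SET e.FullName = eu.FullName,
               e.EmploymentStatus = eu.EmploymentStatus
WHEN NOT MATCHED BY TARGET THEN         -- row exists only in the source
    INSERT (EmployeeID, FullName, EmploymentStatus)
    VALUES (eu.EmployeeID, eu.FullName, eu.EmploymentStatus)
WHEN NOT MATCHED BY SOURCE THEN         -- row exists only in the target
    DELETE
OUTPUT $action             AS MergeAction,
       inserted.EmployeeID AS NewEmployeeID,
       deleted.EmployeeID  AS OldEmployeeID;

DROP TABLE #EmployeeUpdate;
DROP TABLE #Employee;
</code></pre>
<p>With this sample data, the OUTPUT clause returns three rows, in no guaranteed order: DELETE for EmployeeID 1, UPDATE for EmployeeID 2 and INSERT for EmployeeID 3. Removing the primary key from #EmployeeUpdate and adding a second source row for EmployeeID 2 would be one way to observe the error described under Determinism.</p>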