sql clr demystified

62
SQL CLR Demystified A look at the integration of the CLR into SQL Server Matt Whitfield Atlantis Interactive UK Ltd

Upload: silver

Post on 30-Jan-2016

43 views

Category:

Documents


0 download

DESCRIPTION

SQL CLR Demystified. A look at the integration of the CLR into SQL Server Matt Whitfield Atlantis Interactive UK Ltd. Overview of CLR integration SqlContext Stored Procedures Scalar & Table Functions CLR DML Triggers CLR DDL Triggers Aggregate Functions. CLR Types - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: SQL CLR Demystified

SQL CLR Demystified

A look at the integration of the CLR into SQL Server

Matt WhitfieldAtlantis Interactive UK Ltd

Page 2: SQL CLR Demystified

What will be covered

• Overview of CLR integration

• SqlContext• Stored Procedures• Scalar & Table

Functions• CLR DML Triggers• CLR DDL Triggers• Aggregate Functions

• CLR Types• Visual Studio

Database Projects• A look at permission

sets• Examples:

– Data maniplation for performance

– Environment manipulation for maintenance

Page 3: SQL CLR Demystified

Overview

• CLR integration into SQL Server allows us to run CLR code that conforms to certain restrictions

• CLR integration means that we can use common business rule validation code in the middle tier and the database

• For some operations, CLR integration allows us to extract maximum performance

• CLR integration allows us to perform maintenance tasks that would have been very difficult or impossible before

Page 4: SQL CLR Demystified

CLR Disabled by Default

• CLR integration is disabled by default in SQL Server, and therefore must be enabled:

EXEC sp_configure ‘clr enabled’, 1;

RECONFIGURE WITH OVERRIDE;GO

• This can also be done using the Surface Area Configuration tool.

Page 5: SQL CLR Demystified

What Framework version?

• The CLR in SQL Server is always loaded as the 2.0 runtime

• Until .NET 4.0, a single CLR host process could only load a single version of the .NET runtime

• It is possible future versions of SQL Server may allow more up-to-date runtime versions to be loaded

• We can confirm the loaded version with:select * from sys.dm_clr_properties

• However, we can use types from .NET 3.0 and 3.5, as these releases did not update mscorlib

Page 6: SQL CLR Demystified

Some Things To Note

• When SQL Server loads assemblies, they are cached in memory. When the O/S signals memory pressure to SQL Server, explicit garbage collection may be run, and the assemblies may be unloaded. This can cause performance issues if it happens frequently.

• SQL CLR code cannot be costed accurately by the query optimiser, as it does not look at what the code actually does – this can affect execution plans.

• SQL CLR code can sometimes prevent parallelism, and SQL CLR procedures are usually single threaded. Sometimes this can hurt performance, other times you may find you achieve the same amount of work in the same time using 1 thread instead of 8.

Page 7: SQL CLR Demystified

SqlContext

• SqlContext is a class that gives us access to four members:– IsAvailable– Pipe– TriggerContext– WindowsIdentity

• IsAvailable indicates whether the other members can be used

Page 8: SQL CLR Demystified

SqlContext.Pipe

• Pipe is the main point of interaction with the SQL Server when returning result sets or messages

• Messages can be sent to the client using the Send(string) function:SqlContext.Pipe.Send(“Hello World”);

• Single row result sets can be sent using the Send(SqlDataRecord) function.

Page 9: SQL CLR Demystified

Multi-row Result Sets

• SendResultsStart is used to specify the schema of the result set– No data is sent by the

SendResultsStart call

• 0 or more calls to SendResultsRow are then sent with data– Empty result sets can be

sent by making no calls to SendResultsRow

• SendResultsEnd marks the end of the result set

SendResultsStart(SqlDataRecord)

SendResultsRow(SqlDataRecord)

SendResultsEnd()

Page 10: SQL CLR Demystified

SqlContext.WindowsIdentity

• This returns a standard System.Security.Principal.WindowsIdentity object

• The property can only be used under EXTERNAL_ACCESS or UNSAFE permission sets

• For 99% of CLR work, using this property is not required

Page 11: SQL CLR Demystified

SqlContext.TriggerContext

• TriggerContext gives you access to:– ColumnCount – number of columns in the target table– EventData – SqlXml that returns the same as the

EVENTDATA T-SQL function– TriggerAction – a TriggerAction enumeration value

that specifies which action caused the trigger to fire– IsUpdatedColumn – method that determines if a

column was updated – same as the T-SQL UPDATE() function

– …but what about the inserted and deleted virtual tables that are available in normal triggers?

Page 12: SQL CLR Demystified

Context Connection

• To access data from within CLR code, the special ‘context connection’ connection string is used…using (SqlConnection connection = new SqlConnection("context connection=true")){ connection.Open(); ...}

• We can then run any SQL statement as we would normally from .NET

• The inserted and deleted virtual tables are available through this method when in a DML trigger

Page 13: SQL CLR Demystified

Stored Procedures

• What is a stored procedure?– A procedure that contains a sequence of

operations, which may affect existing data, return result sets or both

– CLR Stored Procedures are no different– CLR Stored Procedures are implemented as

static members of a class– The containing class is irrelevant to the

execution of the procedure, and is an organisational unit only

Page 14: SQL CLR Demystified

Hello Worldpublic partial class StoredProcedures{ [Microsoft.SqlServer.Server.SqlProcedure] public static void HelloWorld() { SqlContext.Pipe.Send("Hello World!"); }}

• Procedure is just a method, decorated with the SqlProcedure attribute.• Procedure is registered in SQL Server as follows:

CREATE PROCEDURE [dbo].[HelloWorld]AS EXTERNAL NAME [SQLAssembly].[StoredProcedures].[HelloWorld]GO

Page 15: SQL CLR Demystified

Simple Data Access

• In part 1:– we will open a context connection– we will execute a command– we will read the results, and use those to populate a Dictionary

object

• In part 2:– We will create record metadata– We will create a record set– We will populate the record set from the Dictionary

• While a very mundane example, and not something you would usually use the CLR for, it shows the basic mechanics of moving data into and out of the CLR

Page 16: SQL CLR Demystified

Simple Data Access (1/2)public static void RowCounter(){ Dictionary<string, int> typeDictionary = new Dictionary<string, int>(); using (SqlConnection connection = new SqlConnection("context connection=true")) { connection.Open(); using (SqlCommand command = new SqlCommand("SELECT [type] from [sys].[objects];",

connection)) { using (SqlDataReader reader = command.ExecuteReader()) { while (reader.Read()) { string type = reader.GetString(0); if (typeDictionary.ContainsKey(type)) { typeDictionary[type] = typeDictionary[type] + 1; } else { typeDictionary.Add(type, 1); } } } } }...

Page 17: SQL CLR Demystified

Simple Data Access (2/2)... SqlDataRecord record = new SqlDataRecord(new SqlMetaData[] { new SqlMetaData("Type", SqlDbType.NChar,

2), new SqlMetaData("Count", SqlDbType.Int) }); SqlContext.Pipe.SendResultsStart(record); foreach (KeyValuePair<string, int> kvp in

typeDictionary) { record.SetString(0, kvp.Key); record.SetInt32(1, kvp.Value); SqlContext.Pipe.SendResultsRow(record); } SqlContext.Pipe.SendResultsEnd();}

Page 18: SQL CLR Demystified

Things to note

• That we did not create a new SqlDataRecord object for every row, we just populated the existing object

• That the name exposed to SQL Server does not have to match the name of the implementing method

• The return type of the stored procedure can be int to return a value, as with T-SQL stored procedures

Page 19: SQL CLR Demystified

Scalar Functions

• What is a scalar function?– A statement or set of statements that return a

single value of a specific type based on zero or more parameters

– CLR Scalar Functions are no different– CLR Scalar Functions are implemented much

the same as stored procedures – a single static method of a class

Page 20: SQL CLR Demystified

String Length Scalar Function

[Microsoft.SqlServer.Server.SqlFunction]public static long fn_LengthTestCLR(string input){ return input.Length;}

CREATE FUNCTION [dbo].[fn_LengthTestCLR] (@input nvarchar (MAX))RETURNS bigint AS EXTERNAL NAME [SQLAssembly].

[UserDefinedFunctions].[fn_LengthTestCLR]GO

Page 21: SQL CLR Demystified

Performance Comparison• Consider the equivalent T-SQL function, and the following SQL

Statement:

CREATE FUNCTION dbo.fn_LengthTest(@input varchar(MAX))

RETURNS [bigint]AS BEGIN RETURN LEN(@input)END

SELECT SUM(dbo.fn_LengthTest(o.[name])) FROM [sys].[objects] [o] CROSS JOIN [sys].[all_columns] [ac] CROSS JOIN [sys].[all_columns] [ac1]

Page 22: SQL CLR Demystified

CLR would be slower, right?

• No!

• On a dual core 2GHz CPU –– SQL Function – 9.036 seconds– SQL CLR Function – 3.918 seconds

• Overhead of calling a SQL CLR function is lower than that of calling a T-SQL function

Page 23: SQL CLR Demystified

Table Functions

• What is a table function?– A statement or set of statements that return a single

result set with a defined schema based on zero or more parameters

– CLR Table Functions can only be multi-statement equivalents, there is no concept of an ‘inline CLR table function’

– Implemented as a pair of methods – one that returns an IEnumerable, and one that sets outbound parameters for each object – both are static members

Page 24: SQL CLR Demystified

Regular Expression Function

• In Part 1:– We will implement an object that we can return to hold

the data required to fill each row• In Part 2:

– We will create a list of our storage object– We will run a regular expression match based on the

parameters– We will populate our list with the results of the regular

expression– We will return the list

• In Part 3:– We will set the values to be stored in each row

Page 25: SQL CLR Demystified

A Regular Expression Function (1/3)

private struct _match{ public readonly int MatchNumber; public readonly int GroupNumber; public readonly string CaptureValue; public _match(int matchNumber, int groupNumber, string captureValue) { MatchNumber = matchNumber; GroupNumber = groupNumber; CaptureValue = captureValue; }}

Page 26: SQL CLR Demystified

A Regular Expression Function (2/3)

[SqlFunction(FillRowMethodName = "FillRow", TableDefinition = "MatchNumber int, " + "GroupNumber int, " + "CaptureValue nvarchar(MAX)")]public static IEnumerable GetCaptureGroupValues(string input, string pattern){ List<_match> matchList = new List<_match>(); int matchIndex = 0; foreach (Match m in Regex.Matches(input, pattern)) { int groupIndex = 0; foreach (Group g in m.Groups) { matchList.Add(new _match(matchIndex, groupIndex++, g.Value)); } matchIndex++; } return matchList;}

Page 27: SQL CLR Demystified

A Regular Expression Function (3/3)

public static void FillRow(Object obj, out int MatchNumber, out int GroupNumber, out SqlString

CaptureValue){ _match match = (_match)obj; MatchNumber = match.MatchNumber; GroupNumber = match.GroupNumber; CaptureValue = match.CaptureValue;}

Page 28: SQL CLR Demystified

Creation SQLCREATE FUNCTION [dbo].[GetCaptureGroupValues] (@input [nvarchar] (MAX), @pattern [nvarchar] (MAX))RETURNS TABLE ([MatchNumber] [int] NULL, [GroupNumber] [int] NULL, [CaptureValue] [nvarchar] (MAX) NULL)AS EXTERNAL NAME [SQLAssembly].[UserDefinedFunctions].

[GetCaptureGroupValues]GO

• Again, the name exposed to SQL Server does not have to match the name of the main body function

• The SQL definition can specify different column names and parameter names to the main method and fill row method

Page 29: SQL CLR Demystified

Things to note

• That we used a struct for the storage object, rather than a class, because of reduced instantiation costs

• That our function was decorated with a SqlFunction attribute that named the fill row method and specified the shape of the output table

• That the fill row method took an object as it’s first parameter, and had an outbound parameter for each of the columns we wanted to fill

Page 30: SQL CLR Demystified

CLR Triggers

• Known bug in Visual Studio 2005 and 2008 that means SqlTrigger attribute does not allow you to specify the schema of the target table

• Fixed in Visual Studio 2010

• Workaround is to comment out the SqlTrigger attribute, and deploy manually

Page 31: SQL CLR Demystified

DML Triggers

• What is a DML Trigger?– A statement or set of statements run when

data in a target table is modified– CLR DML Triggers are no different– CLR DML Triggers still have access to the

inserted and deleted virtual tables– CLR DML Triggers are implemented as static

methods in a class

Page 32: SQL CLR Demystified

A Simple DML Trigger

• We will specify the event type and target object using the SqlTrigger attribute

• We will connect to the database using the context connection

• We will select the rows from the inserted virtual table, and count them

• We will send a message to the SqlContext pipe stating the number of rows

Page 33: SQL CLR Demystified

A Simple DML Trigger[Microsoft.SqlServer.Server.SqlTrigger(Target="CallLog", Event="FOR UPDATE")]public static void TestDMLTrigger(){ int i = 0; using (SqlConnection connection = new SqlConnection("context connection=true")) { connection.Open(); using (SqlCommand command = new SqlCommand("SELECT * from inserted;",

connection)) { using (SqlDataReader reader = command.ExecuteReader()) { while (reader.Read()) { i++; } } } } SqlContext.Pipe.Send(i.ToString() + " rows in inserted");}

Page 34: SQL CLR Demystified

DDL Triggers

• What is a DDL Trigger?– A statement or set of statements run when a

schema modification event occurs– CLR DDL Triggers are no different– CLR DDL Triggers have access to the event

data via SqlContext.TriggerContext– CLR DDL Triggers are implemented as static

methods in a class

Page 35: SQL CLR Demystified

A Simple DDL Trigger

• We will specify the target and the event type using the SqlTrigger attribute

• We will create an XmlDocument based on the TriggerContext object from the SqlContext object

• We will then use XPath to find the created object’s schema and name

• We will send a message to the SqlContext pipe stating the name of the created object

Page 36: SQL CLR Demystified

A Simple DDL Trigger[Microsoft.SqlServer.Server.SqlTrigger(Target="DATABASE", Event="FOR CREATE_TABLE")]public static void TestDDLTrigger(){ XmlDocument document = new XmlDocument(); document.LoadXml(SqlContext.TriggerContext.EventData.Value);

string objectName = document.SelectSingleNode("//SchemaName").InnerText +

"." +

document.SelectSingleNode("//ObjectName").InnerText;

SqlContext.Pipe.Send("Table " + objectName + " was created.");}

Page 37: SQL CLR Demystified

Aggregate Functions

• What is an aggregate function?– A function that takes a series of values and

accumulates them into a single, aggregated result– CLR Aggregate functions are no different– Implemented as a class with four methods:

• Init• Accumulate• Merge• Terminate

– Aggregate functions in SQL Server 2005 can take only one parameter, in SQL Server 2008 this restriction is lifted

Page 38: SQL CLR Demystified

Aggregate Function Methods

• Init method is called first, here we initialise our member fields

• Accumulate is called to add a value to the aggregate

• Merge is called to add together two aggregates (for example, as a result of parallelism)

• Terminate is called to return our value

Page 39: SQL CLR Demystified

A Simple Aggregate

• We will use the SqlUserDefinedAggregate attribute to specify the serialisation format of our aggregate

• We will track the minimum value in Accumulate, and also track that we have seen values

• In Merge, we will check if the other aggregate class has seen values - if it has, then we take the minimum value seen

• In Terminate we will return the minimum value we have seen if we have seen any values, or NULL if we have not seen any values

Page 40: SQL CLR Demystified

A Simple Aggregate (1/2)[Serializable][Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.Native)]public struct MinValue{ int minValue; bool hasValues;

public void Init() { minValue = Int32.MaxValue; hasValues = false; }

public void Accumulate(SqlInt32 Value) { hasValues = true; if (Value.Value < minValue) { minValue = Value.Value; } }...

Page 41: SQL CLR Demystified

A Simple Aggregate (2/2)... public void Merge(MinValue Group) { if (Group.hasValues) { if (Group.minValue < minValue) { minValue = Group.minValue; } hasValues = true; } }

public SqlInt32 Terminate() { if (hasValues) { return minValue; } else { return SqlInt32.Null; } }}

Page 42: SQL CLR Demystified

SqlUserDefinedAggregate

• This attribute contains parameters which can affect the query optimiser, and can cause incorrect results if set wrong:– IsInvariantToDuplicates – should only be true if

aggregating the same value many times does not affect the result

– IsInvariantToNulls – should only be true if aggregating NULL values does not affect the result

– IsInvariantToOrder – should only be true if the result is not dependent on the order of the aggregated values

– IsNullIfEmpty – should only be true if the aggregate of 0 values is NULL

Page 43: SQL CLR Demystified

Native and UserDefined Serialisation

• If your aggregate contains reference types then you must implement custom serialisation, and specify Format.UserDefined in the SqlUserDefinedAggregate attribute

• When specifying UserDefined format, your aggregate must implement the IBinarySerialize interface, and specify it’s maximum size in the SqlUserDefinedAggregate attribute

• Maximum byte size can be 1 to 8000 in SQL Server 2005, or -1 for any value between 8001 bytes and 2 GB in SQL Server 2008.

• The same serialisation interface applies to both Aggregates and User Defined Types

Page 44: SQL CLR Demystified

CLR Types

• What is a user defined type?– A type based on a system type, which may be

bound to old-style rules and defaults– CLR Types are different – CLR Types can be complex, and can have

properties and methods– CLR Types can have static methods (and the

built in CLR Types in SQL Server 2008 have some)

Page 45: SQL CLR Demystified

A Simple CLR Type

• Part 1:– We will define the serialisation format of the type– We will provide the IsNull property and the Null static property to

return a NULL instance

• Part 2:– We will define methods to convert the type to and from strings

• Part 3:– We will define a property for the symbol, a method to change the

symbol and a static method to create a new currency with a defined symbol and value

• Part 4:– We will declare our member fields and provide implementation of

the IBinarySerialize interface to load and store our type

Page 46: SQL CLR Demystified

A Simple CLR Type (1/4)[Serializable][Microsoft.SqlServer.Server.SqlUserDefinedType(Format.UserDefined, MaxByteSize=24)]public struct Currency : INullable, IBinarySerialize{ public bool IsNull { get { return _null; } }

public static Currency Null { get { Currency h = new Currency(); h._null = true; return h; } }

Page 47: SQL CLR Demystified

A Simple CLR Type (2/4) public override string ToString() { return _symbol + _value.ToString(); }

public static Currency Parse(SqlString s) { if (s.IsNull || string.IsNullOrEmpty(s.Value) || s.Value.Length < 2 || string.Equals(s.Value, "NULL", StringComparison.OrdinalIgnoreCase)) { return Currency.Null; } Currency u = new Currency(); u._symbol = s.Value[0]; u._value = Decimal.Parse(s.Value.Substring(1)); return u; }

Page 48: SQL CLR Demystified

A Simple CLR Type (3/4) public char Symbol { get { return _symbol; } }

[SqlMethod(IsMutator=true)] public void ChangeSymbol(char newSymbol) { _symbol = newSymbol; }

public static Currency CreateCurrency(char symbol, Decimal value) { Currency c = new Currency(); c._symbol = symbol; c._value = value; return c; }

Page 49: SQL CLR Demystified

A Simple CLR Type (4/4) private char _symbol; private Decimal _value; private bool _null;

public void Read(BinaryReader r) { _null = r.ReadBoolean(); _symbol = r.ReadChar(); _value = r.ReadDecimal(); }

public void Write(BinaryWriter w) { w.Write(_null); w.Write(_symbol); w.Write(_value); }}

Page 50: SQL CLR Demystified

The CLR Type in use• Using our static method to create an instance:DECLARE @i [Currency]SET @i = [Currency]::[CreateCurrency]('$', 0.53)PRINT CONVERT([varchar], @i)

• Using our instance mutator method to change the symbol:DECLARE @i [Currency]SET @i = [Currency]::[CreateCurrency]('$', 0.53)SET @i.ChangeSymbol('£')PRINT CONVERT([varchar], @i)

• Retrieving the value of a property:DECLARE @i [Currency]SET @i = [Currency]::[CreateCurrency]('$', 0.53)PRINT @i.Symbol

Page 51: SQL CLR Demystified

Things to note

• We can call static methods and properties on CLR types with [type-name]::[member-name]

• IBinarySerialize must be used to serialise types that cannot be serialised natively

• A CLR type should always be able to parse it’s own string representation

• Methods that change the state of the type must be marked as mutator methods

• Although in .NET we can define operator overloads to specify how operators (e.g. +, -, *, /) are applied to our classes, SQL Server does not use these.

Page 52: SQL CLR Demystified

Permission Sets

• Each assembly in SQL Server has an associated ‘permission set’ – one of:– SAFE– EXTENAL_ACCESS– UNSAFE

• The permission set assigned to the assembly determines how limited the CLR functionality is

• This, in turn, determines how much damage we can do to the stability of the host process (i.e. SQL Server itself)

Page 53: SQL CLR Demystified

Allowed operationsSAFE EXTERNAL_ACCESS UNSAFE

Code access security permissions

Execute only Execute + access to external resources

Unrestricted

Programming model restrictions

Yes Yes No

Verifiability requirement

Yes Yes No

Local data access Yes Yes Yes

Ability to call native code

No No Yes

Page 54: SQL CLR Demystified

Programming model restrictions

• A few of the programming model restrictions for SAFE and EXTERNAL_ACCESS assemblies are as follows:– No static fields may be used to store information– PEVerify type safety checking test is passed– No synchronization may be used– Finalizer methods may not be used– No threading may be used– No self-affecting code may be used

• There are more requirements that must be met under the above two permission sets – the above is just some of the more common restrictions.

Page 55: SQL CLR Demystified

Visual Studio Database Projects

• Visual Studio offers us the ability to create a ‘CLR Database Project’ which enables simplified deployment and debugging

• Some issues present, though these are mostly solved in VS2010

• Database projects handle the deployment of the objects that we create simply, which can be really helpful during implementation

Page 56: SQL CLR Demystified

Performance Example

Phil Factor Speed Phreak Competition 4

The Log Parsing Problem• Producing aggregated results from an IIS log file• Made difficult by the need to determine

‘sessions’ – i.e. groups of rows from the same IP within a set time-span

• Requirement to produce three aggregated result sets – visitor summary by day, visitor summary and page summary by day

Page 57: SQL CLR Demystified

How did the CLR help?

• CLR methods can yield excellent results when doing forward-only navigation through result sets, performing aggregation and calculation along the way

• Iteration over the log allowed the data to be parsed into ‘visits’ by applying the session timeout

• Iteration over collection of visits allowed the data to be aggregated into country and page summaries

Page 58: SQL CLR Demystified

What was the result?

• Execution in 301ms, as opposed to the closest T-SQL approach which took 1779ms.

• Efficiency gains through the fact that the CLR procedure would only occupy 1 thread

• Maintenance benefits in readability and simplicity

Page 59: SQL CLR Demystified

Maintenance Example

• CLR Procedure to retrieve the free disk space on any specified server – by Tara Kizer

• Uses System.Diagnostics namespace to interface with WMI to retrieve results

• Requires UNSAFE assembly permission set because PerformanceCounterCategory is synchronised

• Simple use of framework classes to retrieve the information required

Page 60: SQL CLR Demystified

When should we use the CLR?

• To improve performance:– In situations where a forward-only scan over a result set requires

row by row processing to produce an aggregated result set– In situations where complex logic cannot simply be expressed in

T-SQL• To improve maintenance:

– In situations where information required can be simply obtained from any .NET process

– In situations where interaction with the O/S is required (will require elevated permission set)

• For functionality:– In situations where commonly available .NET Framework

classes provide desired functionality (classic example is Regular Expression handling

– To bring business logic from the middle tier into the database

Page 61: SQL CLR Demystified

And when should we not?

• When the functionality presented offers little over it’s T-SQL equivalent

• Instead of regular DML statements (i.e. a CLR procedure to insert a row is not the way to go!)

• To replicate row by row update logic from a cursor – conversion to set-based T-SQL yields far higher performance

• When the desired operation really does not belong in the database

• When the database has to be portable to other types of database server

Page 62: SQL CLR Demystified

Summary

• We learned:– How to implement each type of CLR object– How to get data out of and back into SQL Server– That the only SQL CLR object type which differs

significantly from it’s T-SQL equivalent is the CLR User Defined Type

– That the CLR can offer significant performance improvements, particularly for forward-only scans and scalar functions

– That we are restricted as to what we can achieve in a CLR method based on the permission set of the assembly