inb343 week2 sql server intro
DESCRIPTION
TRANSCRIPT
04/10/23 1
INB(N)343 Advanced Data Mining and Data
Warehousing
Introducing SQL Server 2008
04/10/23 Y Li: INB(N)343 week 2 2
Objectives
• What is SQL Server 2008• The key components of SQL server• SQL server 2008 databases• SQL server 2008 storage architecture
What’s SQL Server 2008• It is primarily thought of as a Relational Database
Management System (RDBMS).• It can be more accurately described as an Enterprise Data
Platform.• It builds on many of the features that had first been
incorporated in SQL server 2005.• It also provides rich reporting capabilities, powerful data
analysis, and data mining.• It also has features that support asynchronous data
applications, data-driven event notification and more.
04/10/23 Yuefeng Li :INB(N)343 week 2 3
The key components of SQL server• Database Engine• Integration Services• Analysis Services• Reporting Services• Service Broker• Data Tier Web Services• Replication Services• Replication Services• Multiple Instances• Database Mail
04/10/23 Yuefeng Li :INB343 week 2 4
Database Engine• It is the primary component of SQL Server 2008 for the
efficient storage, retrieval, and manipulation of relational and XML formatted data.
• It is highly optimized for transaction processing.• The Database Engine is also responsible for the controlled
access and modification of data through its security subsystem.
04/10/23 Yuefeng Li :INB343 week 2 5
Integration Services• SQL Server Integration Services (SSIS) is Microsoft’s
enterprise class data Extract, Transform, and Load (ETL) tool, – Where the ETL process begins by defining a data sources.
From the data source, you define which data you are interested in copying to a new destination.
• Integration services is part of a suit of tools included in the Business Intelligent Development Studio (BIDS), – which includes add-ins for designing solutions for integration
services, analysis services, and reporting services.
04/10/23 Yuefeng Li :INB343 week 2 6
Analysis Services• It delivers Online Analytical Processing (OLAP) and Data
Mining functionality for Business Intelligence application.• It provides a robust environment for the detailed analysis
of data, through– User-created, multidimensional data structures, such as
relational databases, spreadsheets, flat files, and other multidimensional sources.
• The Data Mining component allows the analysis of large quantities of data by mining hidden relationships and patterns that may be of interest to an organization’s data analyst.
04/10/23 Yuefeng Li :INB343 week 2 7
Reporting Services• It is a Web Service-based solution for designing,
deploying, and managing flexible, dynamic web-based reports, as well as traditional paper reports.– These reports can contain information from virtually any data
source.
• It also allows for more savvy users to create ad hoc reports based report models.
• Report models provide an abstraction of the data in the form of entities so that users are not bothered with having to know the intricate details of the actual data sources.
04/10/23 Yuefeng Li :INB343 week 2 8
Notice: Compatibility Level• In the new Database window, we can use change the
compatibility level using options.• Normally, we should have
– Collation: <server default>– Recovery model: Full– Compatibility level: SQL Server 2008 (100)
• SQL Server 2008 allows three compatibility levels: 80, 90, and 100 that correlates to SQL Server 2000, SQL Server 2005, and SQL Server 2008.
• The following code will set the compatibility level of the AdventureWorks2008 database to SQL Server 2000:ALTER DATABASE AdventureWorks2008
SET COMPATIBILITY_LEVEL = 80
04/10/23 Yuefeng Li :INB(N)343 week 2 9
SQL Server 2008 Database Objects
• Server– The server scope encompasses all the objects that exist on the
instance of SQL Server, regardless of their respective database of namespace.
– The term server actually refers to the SQL Server 2008 instance.• Database
– The database scope defines all the objects within a database catalog. Schemas exist in the database scope.
• Schema– A schema is a namespace for database object. Each database can
contain one or more schemas. All data objects in a database reside in a specific schema.
04/10/23 Yuefeng Li :INB343 week 2 10
SQL Server 2008 Database Objects cont.
• Object Names– Every object is identified by a 4-part, fully qualified name, with the
form server.database.schema.object– The default schema is dbo.– The fully qualified name can be abbreviated– E.g., Fred’s query “select * from Person” may resolve to
FITPA00034343.AdventureWorks.dbo.Person; or
FITPA00034343.AdventureWorks.Person.Person– If we use the user-defined schema Person, the query will have to
be rewritten to reference the appropriate schema scope: “select * from Person.Person”; or Fred’s default schema can be changed to the Person schema
use AdventureWorks;
GO
Alter user Fred with DEFAULT_SCHEMA=Person;
04/10/23 Yuefeng Li :INB343 week 2 11
The most basic object: Table• A table can be thought of as equating to an accountant’s
ledger or an Excel spreadsheet. • It is made up of what is called domain data (columns) and
entity data (rows). The actual data for the database is stored in the tables.
• Each table definition also contains the metadata (descriptive information about data) that describes the nature of the data it is to contain.
• Each column has its own set of rules about what can be stored in that column.
• A violation of the rules of any one column can cause the system to reject an inserted row, an update to an existing row, or the deletion of a row.
04/10/23 Yuefeng Li :INB343 week 2 12
Example of a table
04/10/23 Yuefeng Li :INB343 week 1 13
Example of a table cont.• The table in Figure 1-1 is made up of five columns of data.
The number of columns remains constant regardless of how much data (even zero) is in the table.
• Currently, the table has fourteen records. The number of records will go up and down as we add or delete data, but the nature of the data in each record (or row) is described and restricted by the data type of the column.
04/10/23 Yuefeng Li :INB343 week 1 14
Partitioning Tables• All data pages in logical units are stored in partitions.• Normally, tables are stored in a single partition defined on
a single filegroup.• SQL server also provides the ability to separate large table
into smaller units by horizontally partitioning the table across multiple files managed by filegroup definitions.
04/10/23 Yuefeng Li :INB(N)343 week 2 15
Indexes• An index is an object that exists only within the framework
of a particular table or view. • An index works much like the index does in the back of an
encyclopedia. • There is some sort of lookup (or “key”) value that is sorted
in a particular way and, once you have that, you are provided another key with which you can look up the actual information.
• An index provides us ways of speeding the lookup of our information.
04/10/23 Yuefeng Li :INB343 week 2 16
Two Indexes categories ❑ Clustered — You can have only one of these per table.
– If an index is clustered, it means that the table on which the clustered index is based is physically sorted according to that index.
– If you were indexing an encyclopedia, the clustered index would be the page numbers (the information in the encyclopedia is stored in the order of the page numbers).
❑ Non-clustered — You can have many of these for every table. – This is more along the lines of what you probably think of when you
hear the word “index.” – This kind of index points to some other value that will let you find the
data. – For our encyclopedia, this would be the keyword index at the back of
the book.
04/10/23 Yuefeng Li :INB343 week 2 17
Triggers• A trigger is an object that exists only within the framework
of a table. • Triggers are pieces of logical code that are automatically
executed when certain things (such as inserts, updates, or deletes) happen to your table.
• Triggers can be used for a great variety of things, but are mainly used for either copying data as it is entered, or checking the update to make sure that it meets some criteria.
04/10/23 Yuefeng Li :INB(N)343 week 2 18
Triggers: An Example
• E.g. whenever a row is deleted from particular table, we might want to execute an INSERT statement:
CREATE TRIGGER MyTrigger AFTER DELETE ON MyTable REFERENCING OLD ROW AS MyOldRow FOR EACH ROW INSERT INTO MyLog VALUES (‘Deleted’, MyOldRow.MyColumn1, MyOldRow.MyColumn2,
CURRENT_TIMESTAMP);
(Because a DELETE statement may delete more than one row, we need to say FOR EACH ROW here, so that we get one row inserted for each row deleted.)
Yuefeng Li :INB(N)343 week 2
Triggers: Why use triggers?
• People use triggers for:– logging/audit-trail (for security and/or statistical analysis);– archiving to another table (for keeping historical data);– enforce validation-rules/constraints that cannot easily be
achieved through SQL constraints (CHECK, CREATE ASSERTION, etc.).
• Triggers cannot be applied to SELECT statements, although this may change in future SQL standards and implementations (to enforce better security & privacy policies).
Yuefeng Li :INB(N)343 week 2
What are the parts of a trigger?
• Every trigger must be created with 5 parts:1. a name;
2. a triggering event, which determines when the trigger is activated this specifies:• the event type (Insert, Update or Delete);
• for Updates: which columns are involved;
• the table involved;
3. whether the trigger is fired immediately before or after the triggering event (see later slide);
Yuefeng Li :INB(N)343 week 2
What are the parts of a trigger? cont.
4. a trigger level, either:– row-level: the trigger is fired once for each row
inserted/updated/deleted; or
– statement-level: the trigger is fired once for each SQL Insert/Update/Delete statement (regardless of the number of rows affected).
5. a trigger body, i.e. the statements executed when the trigger is activated.
Yuefeng Li :INB(N)343 week 2
Why BEFORE and AFTER triggers?
• An AFTER trigger fires after the action has completed. Both the old and new row values are "visible" to the trigger (where appropriate).
• So, an AFTER trigger is good for logging and/or archiving, as shown in the earlier example.
Yuefeng Li :INB(N)343 week 2
Why BEFORE and AFTER triggers?
• On the other hand, a BEFORE trigger fires before the action (Insert, Update or Delete) is attempted.
• What use is this? Such a trigger has the ability to:– skip the action for one or more rows (i.e. prevent certain
rows being inserted/updated/deleted); or– effectively change the action, (e.g. a Delete action could be
turned into an Update that sets a “logically-deleted” flag in each of the rows affected); or
– change the row being inserted/updated (for INSERT and UPDATE actions only) – as shown by example on next slide.
• So, a BEFORE trigger is good for enforcing rules/constraints.
Yuefeng Li :INB(N)343 week 2
Example of a BEFORE trigger
CREATE TRIGGER MyInsertTrigger BEFORE INSERT ON MyTable REFERENCING NEW ROW AS NewRow FOR EACH ROW SET NewRow.ContactPhoneNumber = COALESCE (NewRow.ContactPhoneNumber, NewRow.HomePhoneNumber);
• As you can see, this trigger tries to ensure that each newly-inserted row has a (non-null) ContactPhoneNumber. – If the new row has a non-null ContactPhoneNumber already, that
value will be used (i.e. unchanged). – But if the ContactPhoneNumber is null, then the
HomePhoneNumber will be used (which won’t be null, we hope).
Yuefeng Li :INB(N)343 week 2
Constraints• A constraint is yet another object that exists only within the
confines of a table. • Constraints are much like they sound; they confine the
data in your table to meet certain conditions. • Constraints, in a way, compete with triggers as possible
solutions to data integrity issues. They are not, however, the same thing: Each has its own distinct advantages.
04/10/23 Yuefeng Li :INB(N)343 week 2 26
Triggers vs. Constraints
• Remember that CHECK constraints can be specified within CREATE TABLE (and within CREATE ASSERTION).
• Some DBMSs use (system-generated) triggers for handling Referential Integrity constraints, e.g. cascaded deletes/updates.– PostgreSQL does this.
– It’s relatively easy to implement, but is not the most machine-efficient way to do things.
Yuefeng Li :INB(N)343 week 2
Triggers vs. Constraints (cont.)
• Triggers are more flexible. This can be an advantage: a trigger can execute a complex SQL routine.
• Constraints are more rigid. But this rigidity can be an advantage in some situations:– It’s easier for humans to see what the constraint does.
– It’s easier for the DBMS to see what the constraint does, and to optimise accordingly.
– Constraints can be deferred (i.e. checked at the end of a transaction, rather than during it) – triggers can’t.
Yuefeng Li :INB(N)343 week 2
Diagrams• A database diagram is a visual representation of the database design, including the
various tables, the column names in each table, and the relationships between tables.
04/10/23 Yuefeng Li :INB343 week 2 29
Views
• A view is, at its core, really nothing more than a stored query, to reduce the complexity for the end user, or hiding sensitive data.
• The basic formCREATE VIEW <view name>
AS
<SELECT statement>
04/10/23 Yuefeng Li :INB343 week 2 30
ExampleUSE Oly2000a;
GO
CREATE view Name_vw
AS
SELECT GivenName, CountryCode
FROM dbo.Competitors
WHERE CompetitorNum <=9000
• Notice it does not return any rows. It just lets us know that the view has been created: Command(s) completed successfully. We can run a SELECT statement against this view
SELECT * FROM dbo.Name_vw
04/10/23 Yuefeng Li :INB343 week 2 31
SQL Server 2008 Database• Two types of databases: system databases and user
databases.• The system databases are used to store system-wide data
and metadata, including master, model. msdb, and tempdb databases , as well as the hidden resources database.
• User databases are created by users who have the appropriate level of permission to store application data.
04/10/23 Yuefeng Li :INB343 week 2 32
SQL Server 2008 Database Storage• All system and user databases are stored in files.• There is always a minimum of two files: one data file (e.g.,
“mydb.mdf”) and one transaction log file (e.g., mydb_log.ldf).
• When a user database is created, it must contain at least a data file, or called primary data file, which is a member of the default Primary filegroup.
• More filegroups can aslo be defined.• Upon initial creation of database, one transaction log must
be defined, which is used to record all modifications to the database.
04/10/23 Yuefeng Li :INB343 week 2 33
SQL Server Data Types• We are going to assume that you already understand how data types
work.• Normal Numerics: Integer, Decimal and Float
04/10/23 Yuefeng Li :INB343 week 2 34
Data Type Name
Class Size in Bytes
Nature of the data
Bit Integer 1 The first bit data type in a table takes up 1 byte; the next 7 make use of the same byte.
Bigint Integer 8 -263 to 263-1
Int Integer 4 -2,147,483,648 to 2,147,483,647
SmallInt Integer 2 -32,768 to 32,767
TinyInt Integer 1 0 to 255
Decimal or Numeric
Decimal/ Numeric
Varies Fixed precison and scale from -1038-1 to 1038-1. The two names are synonymous,
Float Approximate Numerics
Varies -1.79E+308 to 1.79E+308
SQL Server Data Types: Money, Date/Time
04/10/23 Yuefeng Li :INB343 week 2 35
Data Type Name Class Size in Bytes
Nature of the data
Money Money 8 From -263 to 263 plus precision to four decimal places.
SmallMoney Money 4 -214, 748.3648 to +214,748.3647
DateTime Date/Time 8 from January 1, 1753 to December 31, 9999 with an accuracy of three hundredths of a second.
DateTime2 Date/Time Varies (6-8)
Supports larger dateranges and large time-fraction precision
SamllDateTime Date/Time 4 from January 1, 1900, to June 6, 2079, with an accuracy of one minute
DateTimeOffset Date/Time Varies (8-10)
Time is stored internally as UTC time
Date Date/Time 3 from January 1, 0001 to December 31, 9999, as defined by the Gregorian calendar.
Time Date/Time Varies (3-5)
precisions as granular as 100 nano econds
SQL Server Data Types: Character, Unicode and Binary
04/10/23 Yuefeng Li :INB343 week 2 36
Data Type Name
Class Size in Bytes
Nature of the data
Char Character Varies Fixed-length character data
VarChar Character Varies Variable-length character data
Text Character Varies Legacy support as of SQL Server 2005. Use varchar(max) instead!
XML Character Varies Defines a character field as being for XML data
Nchar Unicode Varies Fixed-length Unicode character data
NVarChar Unicode Varies Variable-length Unicode character data
Ntext Unicode Varies Variable-length Unicode character data. It is legacy support only
Binary Binary Varies Fixed-length binary data with a maximum length of 8,000 bytes
VarBinary Binary Varies Variable-length binary data with a maximum specified length of 8,000 bytes
Image Binary Varies Legacy support only as of SQL Server 2005. Use varbinary(max) instead!
T-SQL• The most fundamental Transact-SQL (T-SQL) statements. • T-SQL is SQL Server’s own dialect of Structured Query
Language (SQL). • The T-SQL statements that we will learn in this lecture are:
❑ SELECT
❑ INSERT
❑ UPDATE
❑ DELETE
04/10/23 Yuefeng Li :INB343 week 2 37
The Basic Syntax Rules of a SELECT Statement
SELECT [ALL|DISTINCT] [TOP (<expression>) [PERCENT] [WITH TIES]] <column list>
[FROM <source table(s)/view(s)>]
[WHERE <restrictive condition>]
[GROUP BY <column name or expression using a column in the SELECT list>]
[HAVING <restrictive condition based on the GROUP BY results>]
[ORDER BY <column list>]
[[FOR XML {RAW | AUTO | EXPLICIT | PATH [(<element>)]} [,XMLDATA][,ELEMENTS][,BINARY base 64]]
[OPTION (<query hint>, [, ...n])]
04/10/23 Yuefeng Li :INB343 week 2 38
Example
USE Oly2000a
SELECT GivenName, CountryCode, COUNT(*) AS Count
FROM dbo.Competitors
WHERE CompetitorNum <= 9000
GROUP BY GivenName, CountryCode
ORDER BY GivenName, CountryCode;
04/10/23 Yuefeng Li :INB343 week 2 39
Result
04/10/23 Yuefeng Li :INB343 week 2 40
The full syntax for INSERTINSERT [TOP ( <expression> ) [PERCENT] ] [INTO] <tabular object>
[(<column list>)]
[ OUTPUT <output clause> ]
{ VALUES (<data values>) [,(<data values>)] [, …n]
| <table source>
| EXEC <prodecure>
| DEFAULT VALUES
The more basic syntax for an INSERT statement looks like this:
INSERT [INTO] <table>
[(<column list>)]
VALUES (<data values>) [,(<data values>)] [, …n]
04/10/23 Yuefeng Li :INB343 week 2 41
Examples of InsertINSERT INTO Sales
(StoreCode, OrderNumber, OrderDate, Quantity, Terms, TitleID)
VALUES
(‘TEST’, ‘TESTORDER’, ‘01/01/1999’, 10, ‘NET 30’, 1234567);
USE AdventureWorks2008;
DECLARE @MyTable Table
(
SalesOrderID int,
CustomerID char(5)
);
INSERT INTO @MyTable
SELECT SalesOrderID, CustomerID
FROM AdventureWorks2008.Sales.SalesOrderHeader
WHERE SalesOrderID BETWEEN 44000 AND 44010;
04/10/23 Yuefeng Li :INB343 week 2 42
Update StatementUPDATE [TOP ( <expression> ) [PERCENT] ] <tabular object>
SET <column> = <value>[.WRITE(<expression>, <offset>, <length>)]
[,<column> = <value>[.WRITE(<expression>, <offset>, <length>)]]
[ OUTPUT <output clause> ]
[FROM <source table(s)>]
[WHERE <restrictive condition>]
Let’s look at the more basic syntax:
UPDATE <table name>
SET <column> = <value> [,<column> = <value>]
[FROM <source table(s)>]
[WHERE <restrictive condition>]
04/10/23 Yuefeng Li :INB343 week 2 43
DeleteDELETE [TOP ( <expression> ) [PERCENT] ] [FROM] <tabular object>
[ OUTPUT <output clause> ]
[FROM <table or join condition>]
[WHERE <search condition> | CURRENT OF [GLOBAL] <cursor name>]
The basic syntax couldn’t be much easier:
DELETE <table name>
[WHERE <condition>]
04/10/23 Yuefeng Li :INB343 week 2 44
Joins• A JOIN does just what it sounds like — it puts the
information from two tables together into one result set.• We can think of a result set as being a “virtual” table. It has
both columns and rows, and the columns have data types.• The various forms of the JOIN clause. These will include:
❑ INNER JOIN
❑ OUTER JOIN (both LEFT and RIGHT)
❑ FULL JOIN
❑ CROSS JOIN
04/10/23 Yuefeng Li :INB343 week 2 45
INNER JOIN• The preferred code for an INNER JOIN looks something
like this:
SELECT <select list>
FROM <first_table>
<join_type> <second_table>
[ON <join_condition>]
04/10/23 Yuefeng Li :INB343 week 2 46
Examples of INNER JOINS SELECT *
FROM dbo.Events
INNER JOIN dbo.Results
ON dbo.Events.EventId = dbo.Results.EventId
SELECT pbe.EventId, pbe.Style, hre.CompetitorNum, hre.ElapsedTime, pp.CountryCode, pp.FamilyName
FROM dbo.Events pbe
INNER JOIN dbo.Results hre
ON pbe.EventId = hre.EventId
INNER JOIN dbo.Competitors pp
ON hre.CompetitorNum = pp.CompetitorNum
04/10/23 Yuefeng Li :INB343 week 2 47
Results of the first join
04/10/23 Y Li :ITB(N)740 week 2 48
The result of the second join
04/10/23 Yuefeng Li :INB343 week 2 49
The result of the second join cont.
04/10/23 Yuefeng Li :INB343 week 2 50
(208 row(s) affected)
OUTER JOINsSELECT <SELECT list>
FROM <the table you want to be the “LEFT” table>
<LEFT|RIGHT> [OUTER] JOIN <table you want to be the “RIGHT” table>
ON <join condition>
• OUTER JOINs are, as I’ve said, inclusive in nature. What specifically gets included depends on which side of the join you have emphasized.
• A LEFT OUTER JOIN includes all the information from the table on the left, and a RIGHT OUTER JOIN includes all the information from the table on the right.
04/10/23 Yuefeng Li :INB343 week 2 51
Examples of OUTER JOINsSELECT *
FROM dbo.Events
LEFT OUTER JOIN dbo.Results
ON dbo.Events.EventId = dbo.Results.EventId
SELECT pbe.EventId, pbe.Style, hre.CompetitorNum, hre.ElapsedTime, pp.CountryCode, pp.FamilyName
FROM dbo.Events pbe
LEFT OUTER JOIN dbo.Results hre
ON pbe.EventId = hre.EventId
LEFT OUTER JOIN dbo.Competitors pp
ON hre.CompetitorNum = pp.CompetitorNum
04/10/23 Yuefeng Li :INB343 week 2 52
The result of the first outer join
04/10/23 Y Li :ITB(N)740 week 2 53
Left Right
The result of the second outer join
(214 row(s) affected)
04/10/23 Yuefeng Li :INB343 week 2 54
OUTER vs. INNER• The outer join yields similar results, but with a rather
important difference.• If we are joining and we don’t have a matching record for
the table, then what happens? • Since it is not on the inclusive side of the JOIN (in this
case, the LEFT side), SQL Server will fill in a NULL for any value that comes from the opposite side of the join if there is no match with the inclusive side of the JOIN.
• In this case, all but 3 of our rows have EventIDs. What we can discern from that is that all of our Events are associated with at least one result except three (SWM411, SWM412, SWM451).
04/10/23 Yuefeng Li :INB343 week 1 55
FULL JOIN• Like many things in SQL, a FULL JOIN (also known as a
FULL OUTER JOIN) is basically what it sounds like — it is a matching up of data on both sides of the JOIN with everything included, no matter what side of the JOIN it is on.
• A FULL JOIN is perhaps best described as what you would get if you could do a LEFT JOIN and a RIGHT JOIN in the same JOIN. You get all the records that match, based on the JOIN field(s).
• You also get any records that exist only on the left side, with NULLs being returned for columns from the right side. Finally, you get any records that exist only on the right side, with NULLs being returned for columns from the left side.
04/10/23 Yuefeng Li :INB343 week 1 56
Countries Competitors
Inner Join
Countries Competitors
Full Outer Join
Countries Competitors
Left Outer Join
Countries Competitors
Right Outer Join
Yuefeng Li :INB343 week 2
ExamplesSELECT ev.EventId, re.CompetitorNum, re.ElapsedTime
FROM dbo.Events ev
FULL JOIN dbo.Results re
ON ev.EventId = re.EventId
WHERE re.CompetitorNum is NULL
GO
SELECT ev.EventId, re.CompetitorNum, re.ElapsedTime
FROM dbo.Events ev
LEFT OUTER JOIN dbo.Results re
ON ev.EventId = re.EventId
WHERE re.CompetitorNum is NULL
GO
SELECT ev.EventId, re.CompetitorNum, re.ElapsedTime
FROM dbo.Events ev
RIGHT OUTER JOIN dbo.Results re
ON ev.EventId = re.EventId
04/10/23 Yuefeng Li :INB343 week 2 58
CROSS JOIN• CROSS JOINs are very strange critters indeed. • A CROSS JOIN differs from other JOINs in that there is no
ON operator and that it joins every record on one side of the JOIN with every record on the other side of the JOIN.
• In short, you wind up with a Cartesian product of all the records on both sides of the JOIN.
• The syntax is the same as any other JOIN, except that it uses the keyword CROSS (instead of INNER, OUTER, or FULL) and that it has no ON operator.
04/10/23 Yuefeng Li :INB(N)343 week 2 59
04/10/23 Y Li: INB(N)343 week 2 60
References
1. Vieira, Robert Author, Beginning Microsoft SQL Server 2008 Programming, (QUT Library E book).
2. Chris Leiter et al., Beginning Microsoft SQL Server 2008 Administration, Wrox, Hoboken. (QUT Library E book)