Microsoft TechDays 2007 - Lisboa 22/03/2007 5:12 PM
2007 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary. 1
DAT002
Enterprise Data Warehousing with SQL Server Integration Services & Reporting Services
Grant [email protected] Manager, SQL Server Integration Services
Microsoft
Prerequisites
Assumptions
Experience with SSIS and SSAS
Goals
Discuss design, performance, and scalability for building ETL packages and reports
Best practices
Common mistakes
Agenda
Integration Services
Quick overview of IS
Principles of Good Package Design
Performance Tuning
Reporting Services
Reporting Services Architecture
Setup and configuration
Securing your report server
Report and model management
Scheduling and subscriptions
Scale out
BPA = Best Practice Analyzer
Utility that scans your SQL Server metadata and recommends best practices
Best practices from dev team and Customer Support Services
What’s new:
Support for SQL Server 2005
Support for Analysis Services and Integration Services
Scan scheduling
Auto update framework
CTP available now, RTM April
http://www.microsoft.com/downloads/details.aspx?FamilyId=DA0531E4-E94C-4991-82FA-F0E3FBD05E63&displaylang=en
A DW architecture
[Diagram: sources (SQL/Oracle, SAP/Dynamics, Legacy, Text, XML) feed Integration Services, which loads the data warehouse (SQL Server, Oracle, DB2, Teradata). Analysis Services (DAT001) sits on the warehouse and serves Reports, Dashboards, Scorecards, Excel, and BI tools.]
What is SQL Server Integration Services?
Introduced in SQL Server 2005
The successor to Data Transformation Services
The platform for a new generation of high-performance data integration technologies
ETL Objective: Before SSIS
[Diagram: call center data (semi-structured), legacy data (binary files), and an application database pass through separate hand coding, text mining, cleansing, staging, and ETL steps before reaching the warehouse, which feeds reports, mobile data, data mining, and alerts & escalation.]
Integration and warehousing require separate, staged operations.
Preparation of data requires different, often incompatible, tools.
Reporting and escalation is a slow process, delaying smart responses.
Heavy data volumes make this scenario increasingly unworkable.
Changing the Game with SSIS
[Diagram: call center data (semi-structured), legacy data (binary files), and an application database flow into SQL Server Integration Services (standard sources, custom source, text mining components, data-cleansing components, merges, data mining components), which feeds the warehouse, reports, mobile data, and alerts & escalation.]
Integration and warehousing are a seamless, manageable operation.
Source, prepare, and load data in a single, auditable process.
Reporting and escalation can be parallelized with the warehouse load.
Scales to handle heavy and complex data requirements.
SSIS Architecture
Control Flow (Runtime)
A parallel workflow engine
Executes containers and tasks
Data Flow (“Pipeline”)
A special runtime task
A high-performance data pipeline
Applies graphs of components to data movement
Components can be sources, transformations, or destinations
Highly parallel operations possible
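To make the source, transformation, and destination roles concrete, here is a minimal Python sketch of a row pipeline. It is only an analogy for how a data flow chains components; it is not SSIS code, and the function names (`source`, `derive`, `destination`) and the sample columns are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

Row = dict


def source(rows: Iterable[Row]) -> Iterator[Row]:
    """Source component: emits rows into the pipeline."""
    for row in rows:
        yield dict(row)  # copy so the caller's data is not modified


def derive(rows: Iterator[Row], name: str, fn: Callable[[Row], object]) -> Iterator[Row]:
    """Row-by-row transformation: adds one computed column to each row."""
    for row in rows:
        row[name] = fn(row)
        yield row


def destination(rows: Iterator[Row]) -> List[Row]:
    """Destination component: consumes the pipeline and collects the rows."""
    return list(rows)


def run_example() -> List[Row]:
    data = [{"qty": 2, "price": 5.0}, {"qty": 1, "price": 3.5}]
    return destination(derive(source(data), "total", lambda r: r["qty"] * r["price"]))
```

Because each stage hands rows to the next as they arrive, nothing is staged between components, which is the property the slide attributes to the pipeline.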
Principles of Good Package Design - General
Follow Microsoft Development Guidelines
Iterative design, development & testing
Understand the Business
Understanding the people & processes is critical for success
Kimball’s “Data Warehouse ETL Toolkit” book is an excellent reference
Get the big picture
Resource contention, processing windows, …
SSIS does not forgive bad database design
Old principles still apply – e.g. load with/without indexes?
Platform considerations
Will this run on IA64 / X64?
No BIDS on IA64 – how will I debug?
Is OLE-DB driver XXX available on IA64?
Memory and resource usage on different platforms
Principles of Good Package Design - Architecture
Process Modularity
Break complex ETL into logically distinct packages (vs. monolithic design)
Improves development & debug experience
Package Modularity
Separate sub-processes within package into separate Containers
More elegant, easier to develop
Simple to disable whole Containers when debugging
Component Modularity
Use Script Task/Transform for one-off problems
Build custom components for maximum re-use
[Figures: Bad Modularity vs. Good Modularity]
Principles of Good Package Design - Infrastructure
Use Package Configurations
Build it in from the start
Will make things easier later on
Simplify deployment across Dev, QA, and Production
Use Package Logging
Performance & debugging
Build in Security from the start
Credentials and other sensitive info
Package & Process IP
Configurations & Parameters
Principles of Good Package Design - Development
SSIS is visual programming!
Use source code control system
Undo is not as simple in a GUI environment!
Improved experience for multi-developer environment
Comment your packages and scripts
In 2 weeks even you may forget a subtlety of your design
Someone else has to maintain your code
Use error-handling
Use the correct precedence constraints on tasks
Use the error outputs on transforms – store them in a table for processing later, or use downstream if the error can be handled in the package
Try…Catch in your scripts
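The error-output advice above can be illustrated outside SSIS. The following Python sketch, with a hypothetical function name and row layout, converts rows and redirects failures to a separate collection instead of failing the whole load, which is roughly what an error output feeding an error table does.

```python
def convert_with_error_output(rows):
    """Return (good_rows, error_rows); bad rows are redirected, not fatal."""
    good, errors = [], []
    for row in rows:
        try:
            # The conversion that may fail for dirty input.
            good.append({"id": row["id"], "amount": float(row["amount"])})
        except (KeyError, TypeError, ValueError) as exc:
            # Keep the original row plus the reason, for processing later.
            errors.append({"row": row, "error": type(exc).__name__})
    return good, errors
```

The error rows keep the original values and the failure reason, so they can be stored and reprocessed later or handled downstream in the same run.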
Component Drilldown - Tasks & Transforms
Avoid over-design
Too many moving parts is inelegant and likely slow
But don’t be afraid to experiment – there are many ways to solve a problem
Maximize Parallelism
Allocate enough threads
EngineThreads property on DataFlow Task
“Rule of thumb” - # of datasources + # of async components
Minimize blocking
Synchronous vs. Asynchronous components
Memcopy is expensive – reduce the number of asynchronous components in a flow if possible – example coming up later
Minimize ancillary data
For example, minimize data retrieved by LookupTx
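The EngineThreads rule of thumb is simple arithmetic. As a small sketch (the helper name is hypothetical and the result is only a starting point to test against, not a guaranteed optimum):

```python
def suggested_engine_threads(num_sources: int, num_async_components: int) -> int:
    """Rule of thumb from the slide: data sources plus asynchronous components."""
    if num_sources < 0 or num_async_components < 0:
        raise ValueError("counts must be non-negative")
    return num_sources + num_async_components
```

For example, a data flow with three sources and two asynchronous transformations (say, a Sort and an Aggregate) would start from `suggested_engine_threads(3, 2)`, which is 5.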
Debugging & Performance Tuning - General
Leverage the logging and auditing features
MsgBox is your friend
Pipeline debuggers are your friend
Use the throughput component from Project REAL
Experiment with different techniques
Use source code control system
Focus on the bottlenecks – methodology discussed later
Test on different platforms
32bit, IA64, x64
Local Storage, SAN
Memory considerations
Network & topology considerations
Debugging & Performance Tuning - Volume
Remove redundant columns
Use SELECT statements as opposed to tables
SELECT * is your enemy
Also remove redundant columns after every async component!
Filter rows
WHERE clause is your friend
Conditional Split in SSIS
Concatenate or re-route unneeded columns
Parallel loading
Have the source system split source data into multiple chunks
Flat Files – multiple files
Relational – via key fields and indexes
Multiple Destination components all loading same table
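The volume advice above (name the columns, filter at the source, split the extract by key) can be sketched with Python's built-in `sqlite3` module standing in for the source system. The `orders` table, its columns, and the helper names are hypothetical.

```python
import sqlite3


def key_ranges(min_key, max_key, chunks):
    """Split an inclusive key range into contiguous, non-overlapping ranges."""
    total = max_key - min_key + 1
    size, extra = divmod(total, chunks)
    ranges, start = [], min_key
    for i in range(chunks):
        end = start + size + (1 if i < extra else 0) - 1
        if end >= start:  # skip empty ranges when chunks > total
            ranges.append((start, end))
        start = end + 1
    return ranges


def extract_chunk(conn, lo, hi):
    """Named columns instead of SELECT *, and a WHERE clause to filter early."""
    sql = (
        "SELECT order_id, amount FROM orders "
        "WHERE order_id BETWEEN ? AND ? AND amount > 0 "
        "ORDER BY order_id"
    )
    return conn.execute(sql, (lo, hi)).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL, notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, float(i % 4), "unused") for i in range(1, 11)],
    )
    # Each range could be handed to its own reader to load in parallel.
    return [extract_chunk(conn, lo, hi) for lo, hi in key_ranges(1, 10, 3)]
```

The `notes` column never leaves the source, and rows with a zero amount are filtered before they reach the pipeline, so less data is moved through every later component.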
Debugging & Performance Tuning - Application
Is BCP good enough?
Overhead of starting up an SSIS package may offset any performance gain over BCP for small data sets
Is the greater manageability and control of SSIS needed?
Which pattern?
Many Lookup patterns possible – which one is most suitable?
See Project REAL for examples of patterns:
http://www.microsoft.com/sql/solutions/bi/projectreal.mspx
Which component?
Bulk Insert Task vs. Data Flow
Bulk Insert might give better performance if there are no transformations or filtering required, and the destination is SQL Server.
Lookup vs. MergeJoin (LeftJoin) vs. set based statements in SQL
MergeJoin might be required if you’re not able to populate the lookup cache.
Set based SQL statements might provide a way to persist lookup cache misses and apply a set based operation for higher performance.
Script vs. custom component
Script might be good enough for small transforms that are typically not reused
Case Study - Patterns
105 seconds 83 seconds
Use Error Output for handling Lookup miss
Ignore lookup errors and check for null looked up values in Derived Column
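The two lookup-miss patterns named in this case study can be contrasted in a short Python sketch. It only illustrates the logic of each pattern; it says nothing about their relative speed in SSIS. The column names and the `unknown_key` default are hypothetical.

```python
def lookup_with_error_output(rows, dim):
    """Pattern 1: misses are redirected to a separate error output."""
    matched, missed = [], []
    for row in rows:
        if row["customer_code"] in dim:
            matched.append({**row, "customer_key": dim[row["customer_code"]]})
        else:
            missed.append(row)
    return matched, missed


def lookup_ignore_then_derive(rows, dim, unknown_key=-1):
    """Pattern 2: ignore the miss, then replace the null key in a later step."""
    out = []
    for row in rows:
        key = dim.get(row["customer_code"])  # None plays the role of NULL
        # Equivalent of a Derived Column that checks for a null looked-up value.
        out.append({**row, "customer_key": unknown_key if key is None else key})
    return out
```

The first pattern splits the flow in two; the second keeps a single flow and tags unmatched rows with a placeholder key.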
Debugging & Performance Tuning – A methodology
Optimize and Stabilize the basics
Minimize staging (else use RawFiles if possible)
Make sure you have enough Memory
Windows, Disk, Network, …
SQL FileGroups, Indexing, Partitioning
Get Baseline
Replace destinations with RowCount
Source->RowCount throughput
Source->Destination throughput
Incrementally add/change components to see effect
This could include the DB layer
Use source code control!
Optimize slow components for resources available
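The baseline step of this methodology amounts to timing the source against a sink that does no work, then against the real destination, and comparing the two rates. A minimal Python sketch of that measurement, with hypothetical helper names:

```python
import time


def row_count_sink(_row):
    """Stand-in for a RowCount destination: discards the row."""
    return None


def rows_per_second(source, sink):
    """Push every row from source into sink; return (row_count, rows/sec)."""
    start = time.perf_counter()
    count = 0
    for row in source:
        sink(row)
        count += 1
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate
```

Running it once with `row_count_sink` and once with the real write function shows whether the source or the destination is the slower side, which tells you where to focus next.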
Case Study - Parallelism
Focus on critical path
Utilize available resources
Memory Constrained Reader and CPU Constrained
Let it rip! Optimize the slowest
Package Design Principles
Integration Services Summary
Follow best practice development methods
Understand how SSIS architecture influences performance
Buffers, component types
Design Patterns
Learn the new features
But do not forget the existing principles
Use the native functionality
But do not be afraid to extend
Measure performance
Focus on the bottlenecks
Maximize parallelism and memory use where appropriate
Be aware of different platforms’ capabilities (64bit RAM)
Testing is key
Useful Resources
Overview videos
http://www.microsoft.com/sql/technologies/integration/tours.mspx
SQL Server Integration Services site – links to blogs, training, partners, etc.: http://msdn.microsoft.com/SQL/sqlwarehouse/SSIS/default.aspx
http://www.microsoft.com/sql/solutions/bi/projectreal.mspx
SSIS MSDN Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=80&SiteID=1
SSIS MVP community site: http://www.sqlis.com
Microsoft BI homepage:
http://www.microsoft.com/bi
Enterprise Reporting with SQL Reporting Services
- Management and Scalability –
Agenda
Reporting Services Architecture
Setup and configuration
Securing your report server
Report and model management
Scheduling and subscriptions
Scale out
Reporting Services Components: Reporting Services Native Mode
[Diagram: the browser, design tools, Management Studio, and the Configuration Tool reach the report server through URL access, SOAP endpoints, WMI, and Report Manager. The Web service (IIS / ASP.NET) and the Windows service use shared components for data, rendering, delivery, and security, on top of the ReportServer and ReportServerTempDB databases (SQL Server Database / SQL Server Agent).]
SQL Server Reporting Services and SharePoint Integration
SQL Server 2005 Reporting Services SP2 integrates with Windows SharePoint Services to enable publishing, viewing, and management of rich reports
Office SharePoint Server 2007 “Lights Up”
Report library integration of Reporting Services functionality
Rich reports in Dashboards with filter Web Parts
Integration Benefits
New services for WSS and Office SharePoint 2007 Servers
Integrated User Experience for Reporting Services users
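SharePoint Integration Architecture: Reporting Services SharePoint Mode
[Diagram: on the WSS / Office SharePoint Server side, the Reporting Services Add-in provides the RS Viewer Web Part and the report management UI on top of the WSS object model and the SharePoint content database. On the report server side, the SP2 Report Server Web service uses a security extension and catalog synchronization through the WSS object model, backed by the report server database.]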
Reporting Services 2005 Setup
Reporting Services setup in Microsoft SQL Server 2005 has two modes:
Default Configuration
Files Only Installation
Default configuration assumes you want to:
Install on default web site (will create new App Pool in Microsoft Windows 2003)
Install relational database in same instance
Use service accounts for database connection
Configurations requiring files only setup:
Remote catalog database
Scale out deployment (a.k.a. Web farm) installation
SharePoint Integration mode
Client Setup includes the Microsoft Visual Studio 2005 shell (Business Intelligence Development Studio)
Upgrading from Microsoft SQL Server 2000 Reporting Services (SSRS)
Setup supports upgrade of “default” SQL 2000 Reporting Services installations
No changes to virtual directories, custom extensions
SQL 2005 Reporting Services supports use of SQL Server 2000 relational database
Caveat: Setup upgrades all components in default instance
Existing reports will continue to work
Published reports and snapshots will continue to work on upgraded Report Server
SQL 2000 reports can be published to SQL 2005 Report Server
Opening reports in the Report Designer will upgrade them to the new RDL schema
Reporting Services Web Service supports existing SOAP endpoint
New endpoints for management and report execution
WMI object model has changed
Reporting Services 2005 Configuration Tool
Configuration Tool Features
Virtual directories
Supports non-default Web sites
Service identities
Database settings
Creation and upgrade
Scripts can be saved to be applied later
SharePoint integration (SP2)
Key management
Scale-out initialization
Does not sync settings across machines
E-mail delivery settings
Report processing account
Server Configuration
Management & Configuration Tools
Report Manager
Web-based viewing and management application
SQL Server 2005 Management Studio
Superset of Report Manager functionality
Reporting Services Configuration Tool
Windows-based tool for local or remote configuration of service
Client utilities
Script Host
Encryption Key Management
Custom applications
Windows SharePoint Services / Microsoft Office SharePoint Server 2007
Enabled in SP2
Report Management
Reports, data sources, and report models are published to SharePoint document libraries
When a report is selected in WSS, the report viewer Web Part calls the report server API to process and render the report
Users can manage properties and subscribe to reports through WSS UI (calls RS SOAP API)
UI includes ability to launch Report Builder to create / edit reports
New report server delivery extension allows for rendered reports to be delivered to WSS document libraries (including Report Center)
Design tools (Report Designer, Report Builder, Model Designer) are updated to work with WSS
Report Manager is not supported in SharePoint Integration Mode
Reporting Services SharePoint Mode
Report Viewer Web Part
Used in full page view or on Web Part Pages
Wraps the ReportViewer ASP.NET Viewer Control
Handles report rendering calls to report server
Web Part Properties
Report: ReportPath, HyperlinkTarget
View: AutoGenerateTitle, AutoGenerateDetailLink, ToolBarMode, ParametersMode, ParametersAreaWidth, DocumentMapMode, DocumentMapAreaWidth
Parameter Default Values
Supports Filter Consumer and Row Consumer interfaces for specifying report parameter values via filter Web Parts
Can slice Excel Workbooks and Reports on a single Web Part page
Reporting Services SharePoint Mode
Management APIs
Web Service
For managing content in Report Server
SQL Server 2005 splits API into Management and Execution endpoints
Backward compatibility endpoint for existing applications
Full SOAP API implementation (includes WSDL) w/complex types
Add service reference in Visual Studio
Supports SSL and scripting
WMI
Used for managing service configuration
Enumerate instances of Report Server
Supports remote configuration and works even if Web service is not available
No WMI events (configuration only)
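As a rough orientation for the split into management, execution, and backward-compatibility endpoints, the Python sketch below builds the WSDL address of each one. The `.asmx` file names are the ones SQL Server 2005 Reporting Services ships with; the host name and the `ReportServer` virtual directory are assumptions that depend on your installation, and the helper itself is hypothetical.

```python
def wsdl_urls(server, vdir="ReportServer"):
    """Build WSDL URLs for the three SOAP endpoints of a native-mode server.

    `server` and `vdir` are assumptions about the installation, not fixed values.
    """
    base = f"http://{server}/{vdir}"
    return {
        "management": f"{base}/ReportService2005.asmx?wsdl",
        "execution": f"{base}/ReportExecution2005.asmx?wsdl",
        "backward_compatible": f"{base}/ReportService.asmx?wsdl",
    }
```

Any SOAP client, or Visual Studio's service reference dialog, can be pointed at these addresses to generate a proxy.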
Role-Based Security Model
Tasks
Sets of low-level operations
Item-level (for example, create report) or system-level (for example, manage jobs)
Not customizable
Roles
Sets of tasks
Default roles are installed (Browser, Publisher)
Default roles can be customized, new ones created
Roles identified by name, localized
Groups/users
Windows/Active Directory or custom authentication users
Role assignments
Associates groups/users with Roles
Inherited from parent in namespace
SharePoint Integrated Mode in SP2 maps to WSS permissions
[Diagram: an Item has a Role Assignment, which links a Group or User to a Role; a Role contains Tasks; a Task contains Operations.]
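The relationship between tasks, roles, and role assignments, including inheritance from the parent folder, can be modeled in a few lines of Python. This is a simplified sketch: the task lists are an illustrative subset, the accounts and paths are hypothetical, and it assumes that an item with its own assignments no longer inherits from its parent.

```python
import posixpath

# Role -> set of tasks (illustrative subset only).
ROLES = {
    "Browser": {"View reports", "View folders"},
    "Publisher": {"Manage reports", "Manage folders"},
}

# Item path -> {group or user -> set of role names}.
ASSIGNMENTS = {
    "/": {"DOMAIN\\Analysts": {"Browser"}},
    "/Sales": {"DOMAIN\\SalesAuthors": {"Publisher"}},
}


def effective_assignments(path):
    """Walk up the namespace until an item with its own assignments is found."""
    while True:
        if path in ASSIGNMENTS:
            return ASSIGNMENTS[path]
        if path == "/":
            return {}
        path = posixpath.dirname(path)


def is_allowed(principal, path, task):
    """True if any role assigned to the principal on this item contains the task."""
    roles = effective_assignments(path).get(principal, set())
    return any(task in ROLES[role] for role in roles)
```

In this model `/Sales/Q1 Report` inherits from `/Sales`, so only `DOMAIN\SalesAuthors` has rights there, while items elsewhere fall back to the root assignment.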
Securing Items
Report Data Sources
Administrator can set connection type and connection string after publishing
Credential Options
Prompt for Windows or database credentials
Securely stored Windows or database credentials
Integrated Security (Requires Kerberos delegation; can be disabled in SAC)
None (uses report execution account; must be enabled in Configuration Tool)
Shared Data Sources
Connection and credential information stored as a secured object in the namespace
Single point of management for multiple reports
SharePoint Integration Mode in SP2 can use .RSDS or .ODC files
Report Caching
Execution sessions
Automatically created for each report execution
Keeps consistency between server round trips (images, paging, exporting)
Session timeout set in server properties
Cache snapshots
On-demand reports can be cached between users
Cache index is based on parameter values
Cache valid for a specified time after execution or cleared on schedule
Limitations – User-specific expressions (User ID, Language), stored credentials
Tip: Use Null Delivery Provider to deliver reports to cache
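The cache behaviour described above, one cached copy per combination of parameter values that stays valid for a fixed time after execution, can be sketched in Python as follows. The class and its method names are hypothetical and only model the idea; they are not the report server's implementation.

```python
import time


class ReportCache:
    """Cache rendered reports per (report, parameter values) for a fixed time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._entries = {}

    @staticmethod
    def _key(report_path, params):
        # The cache index is based on the parameter values.
        return (report_path, tuple(sorted(params.items())))

    def get_or_render(self, report_path, params, render):
        """Return (result, cache_hit)."""
        key = self._key(report_path, params)
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1], True
        result = render(report_path, params)
        self._entries[key] = (now, result)
        return result, False
```

Two users asking for the same report with the same parameter values share one execution; a different parameter value, or an expired entry, triggers a fresh one.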
Snapshots and History
Execution snapshot
Report execution is scheduled, all users get same data
Single instance of processed report
Limitations: No query parameters or user-specific expressions, stored credentials
History snapshots
Multiple instances of report snapshots for archiving, auditing purposes
Stored independently of data source, report definition
System and report-specific retention policy
Managing Report Execution
Configure cache and snapshots via Report Manager or SQL Management Studio
Set execution timeouts on a system-wide or per-report basis
Long running reports can be stopped manually
Report Execution Log enables analysis of server usage
Optionally, executions are logged to Report Server database
Includes report, format, user, start, end, cache hit, size
Setup includes SSIS package and sample reports
Scheduling
Management events can be scheduled on the report server
Caching, Subscriptions, History
Schedules are stored in database and integrated with SQL Agent
When triggered, Agent adds entry to queue
Scheduled events are queued in database and polled by Windows service
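The hand-off described above, where the agent only adds a queue entry and the Windows service polls the queue and does the work, follows a common producer/consumer pattern. A minimal in-memory Python sketch, with hypothetical class, function, and event names:

```python
from collections import deque


class EventQueue:
    """In-memory stand-in for the event queue kept in the database."""

    def __init__(self):
        self._rows = deque()

    def enqueue(self, event_type, item):
        self._rows.append((event_type, item))

    def poll(self, max_events=10):
        """Remove and return up to max_events queued entries, oldest first."""
        batch = []
        while self._rows and len(batch) < max_events:
            batch.append(self._rows.popleft())
        return batch


def agent_job_fires(queue, event_type, item):
    """When a schedule triggers, the agent only adds an entry to the queue."""
    queue.enqueue(event_type, item)


def service_poll(queue, handlers):
    """The service polls the queue and dispatches each event to its handler."""
    return [handlers[event_type](item) for event_type, item in queue.poll()]
```

Keeping the trigger and the processing separate means a schedule firing is cheap, and the service can process events at its own pace.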
Shared Schedules
Manage shared schedules independently of reports, subscriptions, or snapshots
Change shared schedule properties
Name
Days, times, or frequencies
Start and end dates
Pause and resume shared schedule
Expire a shared schedule
Delete shared schedule
Subscriptions
Subscription triggered by an event (schedule, snapshot creation, external)
Delivery extension (e-mail, file share) specifies how report is delivered
E-mail delivery requires an SMTP server
Extensible delivery architecture
Can specify output format (HTML, XLS)
Can deliver links as well as rendered reports
Two types of subscriptions
Standard
Data Driven
Standard Subscriptions
Single report sent to a fixed set of addresses
End user wants to customize his/her own report delivery
How it works
Set up by a user with “Manage Individual Subscriptions” permission
User creates a standing request to run a report at a specific time and deliver it in a certain format
Can be triggered based on a schedule or snapshot generation
Specify report, execution conditions, parameters, rendering format, delivery location, etc.
In SQL Server 2005, users can subscribe to reports with User!UserID and User!Language
Data Driven Subscriptions
When to use
Delivery of a report to a dynamic list of destinations with customized content for each destination
How it works
Set up by a user with “Manage all subscriptions” permission
Define delivery query to return list of destinations and parameters
Specify delivery settings and parameter values as a static or field from delivery query
Set to run according to a defined schedule or trigger from snapshot
Data driven subscriptions require SQL Server Enterprise Edition!
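The core of a data-driven subscription is the delivery query: each returned row becomes one delivery, with some settings fixed and others taken from the row. The Python sketch below models that with `sqlite3`; the `subscribers` table, its columns, and the `Region` parameter are hypothetical.

```python
import sqlite3


def build_deliveries(conn, delivery_query, report_path, render_format="PDF"):
    """Turn each row of the delivery query into one delivery instruction."""
    cur = conn.execute(delivery_query)
    cols = [c[0] for c in cur.description]
    deliveries = []
    for values in cur:
        row = dict(zip(cols, values))
        deliveries.append({
            "to": row["email"],                       # field from the query
            "report": report_path,                    # static setting
            "format": render_format,                  # static setting
            "parameters": {"Region": row["region"]},  # field from the query
        })
    return deliveries


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subscribers (email TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO subscribers VALUES (?, ?)",
        [("ana@example.com", "EU"), ("bob@example.com", "US")],
    )
    query = "SELECT email, region FROM subscribers ORDER BY email"
    return build_deliveries(conn, query, "/Sales/Regional Summary")
```

Adding a recipient then only requires a new row in the table behind the delivery query, not a change to the subscription itself.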
Scale Out Deployment
Reporting Services Scale Out Deployment
[Diagram: clients reach several report servers (each on Windows Server with IIS) through NLB. The report servers share the report metadata and cache held in SQL Server on a Windows Server failover cluster, and read from data sources such as flat files, OLE DB, ODBC, Oracle, SQL Server, and DB2.]
Scale out requires SQL Server Enterprise Edition!
Scale Out Setup
Run setup (files only) to install first report server instance
Run setup (files only) to install second report server instance
Use configuration tool to create report server database and configure first report server instance
Use configuration tool to configure second report server instance
Install and configure load balancing functionality (NLB, switch)
Server Configuration Files
Unique per Report Server – not transferable
Configuration (including extensions) should be same per machine in scale-out deployment
Specific areas of interest
Report Server database connection
Report Execution account and password
Extension Configuration (including E-mail Delivery)
Use Configuration Tool, text editor or command line utilities to modify
File monitoring updates server settings
Code Access Security (CAS) for extensions stored in separate file
Data Encryption
When data source connections and credentials are stored, they are encrypted in Report Server database
Stored symmetric key encrypted with instance-based private key
In SQL Server 2005, only Windows service has encryption/decryption logic
Shared by all machines in scale-out deployment
Restore key when machine name, installation or Windows service account changes
Manage keys with RSKEYMGMT or Configuration Tool
Extract a copy of the encryption key
Apply stored encryption key
Remove encrypted data on machine
Always back up your symmetric key!
Reporting Services Summary
General “Care and Feeding” of your Report Server is easy!
Once initial configuration has been finished, many users can manage content themselves
Complex configurations will require planning
Network infrastructure
Security architecture
Deployment policies
Scalability requirements
SharePoint integrated mode in SP2 requires understanding of WSS management as well
Useful Resources
Reporting Services Product Site
http://www.microsoft.com/sql/reporting
Technical Chats and Webcasts
http://www.microsoft.com/communities/chats/default.mspx
http://www.microsoft.com/usa/webcasts/default.asp
MSDN & TechNet
http://microsoft.com/msdn
http://microsoft.com/technet
Virtual Labs
http://www.microsoft.com/technet/traincert/virtuallab/rms.mspx
Newsgroups
http://communities2.microsoft.com/communities/newsgroups/en-us/default.aspx
Technical Community Sites
http://www.microsoft.com/communities/default.mspx
User Groups
http://www.microsoft.com/communities/usergroups/default.mspx
Attend Other Sessions
DAT001 – Analysis Services
DAT003 – Office BI
DAT004 – Performance Point 2007
DAT006 – Quad Core for SQL Databases
Other Resources
For IT Professionals
TechNet Plus
2 free professional support incidents
exclusive software: Capacity Planner
Microsoft software for evaluation
security updates and service packs
privileged access to the knowledge base
free training
and much more.
www.microsoft.com/portugal/technet/subscricoes
Evaluation Questionnaire
Contest!
Complete the evaluation questionnaire and return it at the reception desk.
Get a chance to win an Xbox 360 every day!
DAT002
Integration Services & Reporting Services