sql/med and more
SQL/MED AND MORE
Database Seminar HS11/12
Management of External Data
Overview
Introduction
SQL/MED
Linking PostgreSQL & MSSQL
Further Information (about SQL/MED)
Conclusion
Introduction (1/2)
Different Database Management Systems
- Each system has different benefits
Possible scenarios
… …
Introduction (2/2)
SQL/MED gives new opportunities
- Use other systems as needed
That's possible? Really?
SQL/MED (1/3)
SQL/MED defined in ISO/IEC 9075-9:2003
- Management of External Data
Two concepts
- Foreign Data Wrappers
- Datalinks
At least 10 years old
- Not very widespread
- Most "googled" information is PostgreSQL-related
SQL/MED (2/3)
Foreign Data Wrappers
API routines:
AdvanceInitRequest, AllocDescriptor, AllocQueryContext, AllocWrapperEnv, Close, ConnectServer,
FreeDescriptor, FreeExecutionHandle, FreeFSConnection, FreeQueryContext, FreeReplyHandle, FreeWrapperEnv,
GetAuthorizationId, GetBoolVE, GetDescriptor, GetDiagnostics, GetDistinct, GetNextReply,
GetNumBoolVE, GetNumChildren, GetNumOrderByElems, GetNumReplyBoolVE, GetNumReplyOrderBy,
GetNumReplySelectElems, GetNumReplyTableRefs, GetNumRoutMapOpts, GetNumSelectElems, GetNumServerOpts,
GetNumTableColOpts, GetNumTableOpts, GetNumTableRefElems, GetNumUserOpts, GetNumWrapperOpts,
GetOpts, GetOrderByElem, GetReplyBoolVE, GetReplyCardinality, GetReplyDistinct, GetReplyExecCost,
GetReplyFirstCost, GetReplyOrderElem, GetReplyReExecCost, GetReplySelectElem, GetReplyTableRef,
GetRoutineMapping, GetRoutMapOpt, GetRoutMapOptName, GetSelectElem, GetSelectElemType,
GetServerName, GetServerOpt, GetServerOptByName, GetServerType, GetServerVersion,
GetSPDHandle, GetSQLString, GetSRDHandle, GetStatistics,
GetTableColOpt, GetTableColOptByName, GetTableOpt, GetTableOptByName,
GetTableRefElem, GetTableRefElemType, GetTableRefTableName, GetTableServerName, GetTRDHandle,
GetUserOpt, GetUserOptByName, GetValExprColName, GetValueExpDesc, GetValueExpKind,
GetValueExpName, GetValueExpTable, GetVEChild, GetWPDHandle,
GetWrapperLibraryName, GetWrapperName, GetWrapperOpt, GetWrapperOptByName, GetWRDHandle,
InitRequest, Iterate, Open, ReOpen, SetDescriptor, TransmitRequest
Access external data
FDW is a library
- Programming language neutral
- Compile for different operating systems
Good idea, but a breakthrough?
- API
- Existing technologies
SQL/MED (3/3)
Datalinks
Link files like cell values
DBMS becomes "manager"
- Only process allowed to change the file
- Integrity mechanism
Good idea, but a breakthrough?
- Very OS heavy
- Existing technologies
Linking PostgreSQL & MSSQL (1/4)
Microsoft Linked Servers
SQL/MED: Foreign Data Wrappers
Linking PostgreSQL & MSSQL (2/4)
Microsoft Linked Servers
OLE DB
- Very similar to Foreign Data Wrappers
- Connection to "wrappers" via interface
Related to ODBC
- Not limited to SQL
- C++ instead of C
Widespread
- Many OLE DB providers available
- Supports ODBC
Linking PostgreSQL & MSSQL (3/4)
PostgreSQL Foreign Data Wrappers (1/2)
Using the ODBC_FDW extension
One time
    CREATE FOREIGN DATA WRAPPER odbc_fdw LIBRARY 'odbc_fdw.so';
    CREATE EXTENSION odbc_fdw;
Each time
    CREATE SERVER odbc_server
        FOREIGN DATA WRAPPER odbc_fdw
        OPTIONS (dsn '…DSN…');
    CREATE FOREIGN TABLE odbc_table (
        db_id integer,
        db_name varchar(255)
    )
    SERVER odbc_server
    OPTIONS (… sql_query 'select id, name from `dbo`.`table`' …);
[Diagram: SELECT is passed to the FDW (odbc_fdw.so)]
PostgreSQL proprietary API for FDWs
- C code
- Method pointers in header
Linking PostgreSQL & MSSQL (4/4)
PostgreSQL Foreign Data Wrappers (2/2)
    Datum
    odbc_fdw_handler(PG_FUNCTION_ARGS)
    {
        /* Allocate the callback table and register the scan callbacks */
        FdwRoutine *fdwroutine = makeNode(FdwRoutine);

        fdwroutine->PlanForeignScan = odbcPlanForeignScan;
        fdwroutine->ExplainForeignScan = odbcExplainForeignScan;
        fdwroutine->BeginForeignScan = odbcBeginForeignScan;
        fdwroutine->IterateForeignScan = odbcIterateForeignScan;
        fdwroutine->ReScanForeignScan = odbcReScanForeignScan;
        fdwroutine->EndForeignScan = odbcEndForeignScan;

        PG_RETURN_POINTER(fdwroutine);
    }
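The handler only fills a table of function pointers; the server later calls through that table without knowing which wrapper is behind it. A minimal standalone sketch of this pattern follows. It does not use PostgreSQL headers: ScanRoutine, CounterState, counter_handler and run_scan are hypothetical names invented for illustration, and the toy "foreign source" simply yields integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a callback table such as FdwRoutine.
 * None of these names are part of PostgreSQL. */
typedef struct ScanRoutine {
    void (*begin)(void *state);
    int  (*iterate)(void *state, int *row_out);  /* returns 0 when exhausted */
    void (*end)(void *state);
} ScanRoutine;

/* Toy "foreign source": yields the integers 0 .. limit-1. */
typedef struct CounterState {
    int next;
    int limit;
} CounterState;

static void counter_begin(void *state)
{
    ((CounterState *)state)->next = 0;
}

static int counter_iterate(void *state, int *row_out)
{
    CounterState *s = state;
    if (s->next >= s->limit)
        return 0;
    *row_out = s->next++;
    return 1;
}

static void counter_end(void *state)
{
    (void)state;  /* nothing to release in this toy example */
}

/* The handler fills in the function pointers, similar in spirit to odbc_fdw_handler. */
static ScanRoutine counter_handler(void)
{
    ScanRoutine r = { counter_begin, counter_iterate, counter_end };
    return r;
}

/* The "executor" only knows the table of pointers, not the wrapper behind it. */
static int run_scan(const ScanRoutine *r, void *state)
{
    int row;
    int count = 0;

    assert(r != NULL && r->begin && r->iterate && r->end);
    r->begin(state);
    while (r->iterate(state, &row))
        count++;
    r->end(state);
    return count;  /* number of rows produced by the scan */
}
```

Calling run_scan with the table returned by counter_handler and a CounterState whose limit is 3 drives begin, three successful iterate calls, and end.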
Further Information about SQL/MED (1/4)
Query costs
Interesting applications
Further Information about SQL/MED
Query costs (1/2)
Consider the following tables
City (200 rows)
- ID
- Name
- Latitude
- Longitude
Employee (500 rows)
- ID
- Name
- Gender
Relationship: * to 1..* (many-to-many)
Row count of a JOIN statement (all employees)
- Best case: 500 rows
- Worst case: 100'000 rows
Best execution strategy
- External system performs JOIN?
- Perform JOIN locally?
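The best and worst case row counts follow directly from the table sizes. A small sketch, assuming every employee is linked to at least one city; JoinBounds and join_bounds are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Bounds on the row count of Employee JOIN City (all employees).
 * Assumption: every employee is linked to at least one city. */
typedef struct JoinBounds {
    long best;   /* each employee linked to exactly one city */
    long worst;  /* each employee linked to every city */
} JoinBounds;

static JoinBounds join_bounds(long employees, long cities)
{
    JoinBounds b;

    assert(employees >= 0 && cities >= 1);
    b.best = employees;
    b.worst = employees * cities;
    return b;
}
```

With 500 employees and 200 cities this gives 500 rows in the best case and 100'000 rows in the worst case.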
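Further Information about SQL/MED
Query costs (2/2)
Worst Case Scenario
Costs
- 1.0 $ per transferred row
- 0.1 $ per local join operation
Strategy #A (external system performs JOIN)
- SELECT * FROM Employee JOIN City
- 100'000 transferred rows = 100'000$
Strategy #B (JOIN performed locally)
- SELECT * FROM Employee: 500$
- SELECT * FROM City: 200$
- Local JOIN: 500 x 200 x 0.1 $ = 10'000$
- Total: 10'700$
Clear win for #B in the worst case (10'700$ vs. 100'000$); in the best case (500 rows) #A wins with 500$
Important to implement PlanForeignScan
[Diagram: SQL Server, FDW (PlanForeignScan), External System]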
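The cost figures can be checked with a few lines of arithmetic. The sketch below uses the prices from the slide (1.0 $ per transferred row, 0.1 $ per local join operation) and assumes a naive local join that performs employees x cities operations. Amounts are kept in cents to avoid floating-point rounding; the function names are hypothetical.

```c
#include <assert.h>

/* Cost model from the slide, expressed in cents:
 * 1.0 $ per transferred row, 0.1 $ per local join operation. */
enum { CENTS_PER_ROW = 100, CENTS_PER_JOIN_OP = 10 };

/* Strategy #A: the external system performs the JOIN and ships the result. */
static long cost_remote_join_cents(long result_rows)
{
    assert(result_rows >= 0);
    return result_rows * CENTS_PER_ROW;
}

/* Strategy #B: ship both tables, then join locally.
 * Assumption: a naive join with employees * cities operations. */
static long cost_local_join_cents(long employees, long cities)
{
    long transfer;
    long join_ops;

    assert(employees >= 0 && cities >= 0);
    transfer = (employees + cities) * CENTS_PER_ROW;
    join_ops = employees * cities * CENTS_PER_JOIN_OP;
    return transfer + join_ops;
}
```

Under these assumptions strategy #B costs 10'700$ regardless of the result size, while strategy #A costs 100'000$ for 100'000 result rows and 500$ for 500 result rows. Which strategy is cheaper therefore depends on the estimated size of the join result.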
Further Information about SQL/MED
Interesting applications
Extension www_fdw to query all RESTful web services
    CREATE SERVER google_server
        FOREIGN DATA WRAPPER www_fdw
        OPTIONS (uri 'https://ajax.googleapis.com/search/web?v=1.0');
    CREATE FOREIGN TABLE google_table (
        title text,
        snippet text,
        link text,
        q text
    ) SERVER google_server;
Field legend
- Response: title, snippet, link
- Request: q
    select * from google_table where q = 'cat dog' limit 1;

     title              | snippet                  | link
    --------------------+--------------------------+---------------------
     CatDog – Wikipedia | CatDog is an American... | http://en.wikipedia...
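The q column acts as a request field: its value from the WHERE clause has to end up in the web service request. The following sketch shows one way such a mapping could work, by percent-encoding the value and appending it to the server URI. It is an illustration only and not taken from the www_fdw source; url_encode and build_request_url are hypothetical helpers, and appending "&q=..." assumes the base URI already contains a query string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Percent-encode `in` into `out`, keeping RFC 3986 unreserved characters.
 * Returns 0 on success, -1 if `out` is too small. */
static int url_encode(const char *in, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            if (n + 1 >= out_size)
                return -1;
            out[n++] = (char)c;
        } else {
            if (n + 3 >= out_size)
                return -1;
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        }
    }
    out[n] = '\0';
    return 0;
}

/* Hypothetical mapping of `WHERE q = <value>` to a request URL:
 * appends "&q=<encoded value>" to a base URI that already has a query string. */
static int build_request_url(const char *base_uri, const char *q,
                             char *out, size_t out_size)
{
    char encoded[256];
    int written;

    if (url_encode(q, encoded, sizeof encoded) != 0)
        return -1;
    written = snprintf(out, out_size, "%s&q=%s", base_uri, encoded);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

For the query on this slide, the value 'cat dog' would be encoded as cat%20dog and appended to the uri option of google_server.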
Conclusion
Great concepts
- FDW: accessing external data via standard interfaces
- Datalink: create secure links from tuples to files
Drawbacks which prevent the breakthrough
- Far too complex API
- Existing technologies (Microsoft, Oracle)
- Documentation
Do we really need it?
- Most environments are based on one server technology
- Use built-in "MED" (Linked Servers, DBLink)
- Other ways to solve problems
Many years to stable release
Outlook
Relies on community
- Stable wrappers needed
- Other DBMSs need to push it
Uncertain future
The End
Questions?